Pipeline should handle manifest files with OSX/Windows line endings #119
Comments
Hmmm... I've just tested a similar manifest file in the ASYNC_UPLOAD pipeline, and got the same result (minus the harvesting, which that pipeline explicitly does not do).
The keys actually do have a newline character on the end, which is valid in S3 but confuses S3 client applications such as s3fuse and our bucket browser script:
01:34:15 ~$ aws s3api list-objects-v2 --no-sign --bucket imos-test-data --prefix IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/
{
"Contents": [
{
"LastModified": "2018-09-06T02:01:49.000Z",
"ETag": "\"c18c09d3f737a88bfbcfbdea9d52fa4d-12\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-189_END-20100601T040000Z_C-20120201T063245Z.nc\n",
"Size": 100566012
},
{
"LastModified": "2018-09-06T02:01:52.000Z",
"ETag": "\"27fb05f929e127e0754b27b0d2bcbfb1-12\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20091120T045000Z_GBRMYR_FV01_GBRMYR-0911-WQM-19_END-20100601T040000Z_C-20120201T063238Z.nc\n",
"Size": 93945740
},
{
"LastModified": "2018-09-06T02:01:54.000Z",
"ETag": "\"825f276178715c4ec8e61c9714197560-11\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20101103T090500Z_GBRMYR_FV01_GBRMYR-1010-WQM-188_END-20110414T212900Z_C-20120129T141746Z.nc\n",
"Size": 84451780
},
{
"LastModified": "2018-09-06T02:01:56.000Z",
"ETag": "\"6d0a812979510f78b64ba1383a460281-13\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-187_END-20120412T221800Z_C-20121112T033903Z.nc\n",
"Size": 108330448
},
{
"LastModified": "2018-09-06T02:01:58.000Z",
"ETag": "\"9d6fe256ca93bf2a3c634b6f253e726e-9\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/IMOS_ANMN-QLD_CTPSOKUE_20111017T062000Z_GBRMYR_FV01_GBRMYR-1110-WQM-19_END-20120412T221800Z_C-20121112T033855Z.nc\n",
"Size": 75105704
},
{
"LastModified": "2018-09-06T02:02:01.000Z",
"ETag": "\"56e74a75ab329970f388200a721e75a6-13\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/non-QC/IMOS_ANMN-QLD_RCTPSOKUE_20111017T062000Z_GBRMYR_FV00_GBRMYR-1110-WQM-187_END-20120412T221800Z_C-20121112T033903Z.nc\n",
"Size": 108329076
},
{
"LastModified": "2018-09-06T02:02:03.000Z",
"ETag": "\"398c2d0d047c16dd315c1e70f6448717-9\"",
"StorageClass": "STANDARD",
"Key": "IMOS/ANMN/QLD/GBRMYR/Biogeochem_timeseries/non-QC/IMOS_ANMN-QLD_RCTPSOKUE_20111017T062000Z_GBRMYR_FV00_GBRMYR-1110-WQM-19_END-20120412T221800Z_C-20121112T033855Z.nc\n",
"Size": 75104268
}
]
}
Do you have the exact input file still? I'm sure this could be picked up in unit tests...
Yes, here it is (just added the ".txt" so GitHub accepts it). Could the problem be with the input file? It seems to have just standard CRLF line terminators.
Yes, that's the problem... CRLF is the DOS/Windows line ending, which I'd say is what tripped it up. We might need to handle that scenario when parsing the text input files, because they could come from OSX or Windows via pasting etc., so the parser should handle line endings from all 3 platforms "just in case".
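For illustration, here is a minimal sketch of line-ending-agnostic manifest parsing. The function name `parse_manifest_lines` and the assumption that a manifest is simply one path per line are hypothetical; this is not the pipeline's actual parsing code.

```python
from typing import List


def parse_manifest_lines(raw: bytes, encoding: str = "utf-8") -> List[str]:
    """Return the non-empty entries of a manifest, whatever its line endings.

    Hypothetical helper: assumes the manifest holds one path per line.
    """
    text = raw.decode(encoding)
    # str.splitlines() treats "\n" (Unix), "\r\n" (DOS/Windows) and "\r"
    # (classic Mac) as line boundaries and drops the terminators.
    # strip() then removes any stray whitespace left around an entry.
    entries = (line.strip() for line in text.splitlines())
    return [entry for entry in entries if entry]
```

Note that `str.splitlines()` also splits on a few less common separators (form feed, U+2028 and so on), which should be harmless for a list of file paths. A unit test could feed the same entries with each of the three terminators and assert that the parsed result is identical.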
I've just tried uploading a .map_manifest file into the new AODN_moorings_nocheck pipeline (see https://github.com/aodn/chef-private/pull/2984) on 4-nec-hob. The manifest file looked like this:
The pipeline sort of pretended to process the file and run a harvester, and eventually reported SUCCESS in the log (see 4-nec-hob:/mnt/ebs/log/pipeline/process/tasks.AODN_moorings_nocheck.log, task id aa617f77-4c7a-46a9-828f-964672c52a8e), but in fact it failed completely:
Trying to list the uploaded files on S3 gives weird results:
Looks like maybe some characters got added on to the end of each file name somewhere along the way?
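One way to check for stray trailing characters is to scan a bucket listing for keys that change when whitespace is stripped. The sketch below assumes the listing has the same shape as the JSON printed by `aws s3api list-objects-v2` (a `Contents` list of objects with a `Key` field); the helper name `keys_with_stray_whitespace` is made up for this example.

```python
import json
from typing import List


def keys_with_stray_whitespace(listing: dict) -> List[str]:
    """Return object keys that start or end with whitespace (e.g. "\\n" or "\\r")."""
    keys = [obj["Key"] for obj in listing.get("Contents", [])]
    return [key for key in keys if key != key.strip()]


# Hypothetical sample listing; in practice this would be the CLI's JSON output.
sample = json.loads('{"Contents": [{"Key": "a/b.nc\\n"}, {"Key": "a/c.nc"}]}')
for key in keys_with_stray_whitespace(sample):
    # repr() makes the otherwise invisible trailing character visible.
    print(repr(key))
```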