This repository has been archived by the owner on Nov 14, 2023. It is now read-only.
I wanted to propose this change to the other developers. There are a couple of little problems that emerge when using FastqLoader and FastqStorer to manipulate fastq files whose id lines don't follow the Illumina format:
As an example of where this is an issue in practice: I wanted to convert base quality values from Illumina to Sanger encoding in a few gigabytes of fastq files provided by a sequencing center. The id lines were not in the "structured" Illumina format, so I had no way to manipulate the data while keeping the same read ids. With this patch, that is now possible.
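For readers unfamiliar with the conversion mentioned above, here is a minimal sketch of the quality re-encoding step on its own. It assumes the input uses the Phred+64 Illumina encoding (Illumina 1.3 and later) and that the target is Sanger Phred+33. The class and method names are hypothetical and are not part of this patch.

```java
// Hypothetical helper, not part of this patch: re-encodes one quality string.
public final class QualityConverter {
    // Assumption: input is Phred+64 (Illumina 1.3+), output is Phred+33 (Sanger).
    private static final int ILLUMINA_OFFSET = 64;
    private static final int SANGER_OFFSET = 33;

    private QualityConverter() {}

    public static String illuminaToSanger(String illuminaQuality) {
        StringBuilder sanger = new StringBuilder(illuminaQuality.length());
        for (int i = 0; i < illuminaQuality.length(); i++) {
            // Recover the Phred score, then re-encode it with the Sanger offset.
            int phred = illuminaQuality.charAt(i) - ILLUMINA_OFFSET;
            if (phred < 0) {
                throw new IllegalArgumentException(
                    "Quality character below Phred+64 range at position " + i);
            }
            sanger.append((char) (phred + SANGER_OFFSET));
        }
        return sanger.toString();
    }
}
```

The conversion touches only the quality line, so the id line has to survive the load/store round trip unchanged, and that round trip is what this patch addresses.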
For the most part, this change is backwards compatible. It adds a new field, "id", to the tuples generated by FastqLoader, which can easily be ignored. There is one small thing that might cause some surprises, though: if the "id" field exists in the tuple passed to FastqStorer, it will be used as the id line rather than constructing one from the metadata ("id" will be passed to FastqOutputFormat as the key, rather than passing null). I suspect this isn't a big issue, but I wanted to get some feedback before committing.
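To make the behaviour change concrete, here is a rough sketch of the key selection described above. It is not the actual FastqStorer code: a plain `Map` stands in for the Pig tuple, and the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a Map stands in for the tuple handed to FastqStorer.
public final class IdKeySketch {
    private IdKeySketch() {}

    // Returns the key to hand to the output format: the "id" field when
    // present, otherwise null so that the id line is built from the metadata
    // fields, as it was before this change.
    public static String chooseOutputKey(Map<String, Object> fields) {
        Object id = fields.get("id");
        return id == null ? null : id.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> withId = new HashMap<>();
        withId.put("id", "read_0001/1");
        System.out.println(chooseOutputKey(withId));

        Map<String, Object> withoutId = new HashMap<>();
        System.out.println(chooseOutputKey(withoutId));
    }
}
```

Scripts that want the old behaviour can drop the "id" field before storing, so that the id line is again built from the metadata.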