Add new features and Improvement to nf-core/mag: Trimmomatic, contigs param for binning, new concoct default params #744

Open
Pranjal-Bioinfo opened this issue Jan 16, 2025 · 1 comment · May be fixed by #745
Labels
enhancement New feature or request

Comments

@Pranjal-Bioinfo

Description of feature

1. Include Trimmomatic as an Additional Preprocessing Tool
Currently, fastp is the only quality-trimming tool supported in the pipeline. While fastp is very efficient, Trimmomatic can offer advantages for specific datasets, such as more robust handling of paired-end reads or finer control over trimming parameters.
Adding Trimmomatic would let users choose the tool that best fits their dataset's requirements.
Offering Trimmomatic as an optional step may also attract users who are used to this tool or prefer it for preprocessing.

2. Contigs param for binning
Add a parameter that lets binning use contigs instead of scaffolds. Scaffolds may carry errors from misassemblies introduced when contigs are linked.
Because contigs are the raw output of assemblers, they may give a more realistic representation for binning, which relies on both sequence composition and coverage.
Scaffolding is based on assumptions that may not hold for complex metagenomes, potentially biasing binning results.

3. New concoct default params
Currently, the cut_up_fasta.py script within CONCOCT is set to chunk contigs with the parameters -c 1999 -o 1900.
These settings yield small chunks of 1,999 bases with a large overlap of 1,900 bases, which increases the number of fragments and, in turn, the runtime.
A potential improvement is to use -c 10000 -o 0, which creates larger chunks (10,000 bases) with no overlap.
This approach is faster because it generates fewer fragments, and it matches the settings used in the official GitHub example for CONCOCT.
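
To give a rough sense of the difference in fragment counts, here is a minimal Python sketch. It assumes a plain sliding window in which each new chunk starts `chunk_size - overlap` bases after the previous one. The function names are hypothetical, and this is not the actual cut_up_fasta.py code, which may treat short contigs and the final chunk differently.

```python
def chunk_starts(contig_length, chunk_size, overlap):
    """Start offsets of sliding windows over a contig.

    Simplified model (an assumption, not CONCOCT's exact logic):
    consecutive windows start chunk_size - overlap bases apart.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return list(range(0, contig_length, step))


def count_chunks(contig_length, chunk_size, overlap):
    """Number of windows produced for one contig under the model above."""
    return len(chunk_starts(contig_length, chunk_size, overlap))


if __name__ == "__main__":
    contig_length = 100_000  # hypothetical contig, for illustration only
    for chunk_size, overlap in [(1999, 1900), (10000, 0)]:
        n = count_chunks(contig_length, chunk_size, overlap)
        print(f"-c {chunk_size} -o {overlap}: {n} chunks")
```

Under this simplified model, a 100,000-base contig yields 1,011 windows with -c 1999 -o 1900 and 10 windows with -c 10000 -o 0, so the first setting produces about a hundred times more fragments to process.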

@Pranjal-Bioinfo Pranjal-Bioinfo added the enhancement New feature or request label Jan 16, 2025
@amizeranschi
Contributor

I am also interested in point 2 above. It would be great to have a pipeline parameter that tells it to use contigs instead of scaffolds for all subsequent stages of the analysis (binning, taxonomy, etc.) when assembling with SPAdes.
