RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. Tasks include:
- Homology-based misassembly correction
- Homology-based assembly scaffolding and patching
- Scaffold merging
RagTag also provides command line utilities for working with common genome assembly file formats.
# install with conda
conda install -c bioconda ragtag
# correct a query assembly
ragtag.py correct ref.fasta query.fasta
# scaffold a query assembly
ragtag.py scaffold ref.fasta query.fasta
# scaffold with multiple references/maps
ragtag.py scaffold -o out_1 ref1.fasta query.fasta
ragtag.py scaffold -o out_2 ref2.fasta query.fasta
ragtag.py merge query.fasta out_*/*.agp other.map.agp
# use Hi-C to resolve conflicts
ragtag.py merge -b hic.bam query.fasta out_*/*.agp other.map.agp
# make joins and fill gaps in target.fa using sequences from query.fa
ragtag.py patch target.fa query.fa
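Since `merge` takes AGP files as input, it can help to inspect them before merging. The following is a minimal sketch, not part of RagTag: `agp_summary` is a hypothetical helper that assumes the standard tab-separated AGP layout, where column 1 is the scaffold (object) name and column 5 is the component type, with `N` and `U` marking gaps. It prints one line per scaffold with the number of sequence components and the number of gaps.

```shell
# Hypothetical helper (not a RagTag command): summarize an AGP file.
# Assumes the tab-separated AGP layout: column 1 = object (scaffold) name,
# column 5 = component type, where N and U denote gaps.
agp_summary() {
    awk -F'\t' '
        /^#/    { next }                  # skip header/comment lines
        NF == 0 { next }                  # skip blank lines
        $5 == "N" || $5 == "U" { gaps[$1]++; seen[$1] = 1; next }
        { parts[$1]++; seen[$1] = 1 }     # any other type is a sequence component
        END {
            # output columns: object, sequence components, gaps
            for (obj in seen)
                printf "%s\t%d\t%d\n", obj, parts[obj], gaps[obj]
        }
    ' "$1" | sort
}
```

For example, `agp_summary out_1/ragtag.scaffold.agp` would summarize the first scaffolding run above, assuming the AGP file in `out_1` carries that name.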
Please see the Wiki for detailed documentation.
Dependencies:
- Minimap2, Unimap, or Nucmer
- Python 3 (with the following auto-installed packages)
  - numpy
  - intervaltree
  - pysam
  - networkx
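The aligner is the one dependency that is not listed as auto-installed, so a quick check that one is on the `PATH` can save a failed run. This is a minimal sketch, not part of RagTag: `first_available` is a hypothetical helper, and the executable names `minimap2`, `unimap`, and `nucmer` are assumed to match the tools listed above.

```shell
# Hypothetical helper: print the first command from the argument list
# that is found on PATH. Returns non-zero if none is available.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Report which supported aligner is available (executable names are assumptions).
if aligner=$(first_available minimap2 unimap nucmer); then
    echo "aligner found: $aligner"
else
    echo "no supported aligner found on PATH" >&2
fi
```

The Python packages can be checked in a similar spirit with `python3 -c 'import numpy, intervaltree, pysam, networkx'`, which exits with an error if any of them is missing.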
Citation:
- Alonge, Michael, et al. "Automated assembly scaffolding elevates a new tomato system for high-throughput genome editing." bioRxiv (2021).
https://doi.org/10.1101/2021.11.18.469135
RagTag supersedes RaGOO:
- Alonge, Michael, et al. "RaGOO: fast and accurate reference-guided scaffolding of draft genomes." Genome Biology 20.1 (2019): 1-17.
https://doi.org/10.1186/s13059-019-1829-6
Many of the major algorithmic improvements relative to RaGOO's first release were provided by Aleksey Zimin, lead developer of the MaSuRCA assembler. Luca Venturini suggested and initially implemented many feature enhancements, such as pysam integration. RagTag "merge" was inspired by CAMSA. The developer of CAMSA, Sergey Aganezov, helped review relevant RagTag code. RagTag "patch" was inspired by Grafter, a scaffolding tool written by Melanie Kirsche. Melanie provided guidance for the RagTag implementation. Michael Schatz has provided guidance for the whole project.