From aec90c7cd0d6179081274bb44f58caf0f9cdbf26 Mon Sep 17 00:00:00 2001
From: Nils Homer
Date: Wed, 29 May 2024 23:27:10 -0400
Subject: [PATCH 1/2] Add TemplateCoordinate sort order to the usage of SortBam

---
 src/main/scala/com/fulcrumgenomics/bam/SortBam.scala | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala b/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
index 4e9a59360..02f5a8f63 100644
--- a/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
+++ b/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
@@ -41,6 +41,10 @@ import com.fulcrumgenomics.util.Io
  |and several
  |4. **RandomQuery**: sorts the reads into a random order but keeps reads with the same
  | queryname together. The ordering is deterministic for any given input.
+ |5. **TemplateCoordinate**: sorts the reads The sort order used by `GroupReadByUmi`. Sorts reads by
+ | the earlier unclipped 5' coordinate of the read pair, the higher unclipped 5' coordinate of the
+ | read pair, library, the molecular identifier (MI tag), read name, and if R1 has the lower
+ | coordinates of the pair.
  |
  |Uses a temporary directory to buffer sets of sorted reads to disk. The number of reads kept in memory
  |affects memory use and can be changed with the `--max-records-in-ram` option. The temporary directory

From 0c13e841155337a868a5e2f45b87f470f65edf40 Mon Sep 17 00:00:00 2001
From: Nils Homer
Date: Thu, 30 May 2024 19:07:13 -0400
Subject: [PATCH 2/2] Update src/main/scala/com/fulcrumgenomics/bam/SortBam.scala

Co-authored-by: Matt Stone
---
 src/main/scala/com/fulcrumgenomics/bam/SortBam.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala b/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
index 02f5a8f63..ad5d63212 100644
--- a/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
+++ b/src/main/scala/com/fulcrumgenomics/bam/SortBam.scala
@@ -41,7 +41,7 @@ import com.fulcrumgenomics.util.Io
  |and several
  |4. **RandomQuery**: sorts the reads into a random order but keeps reads with the same
  | queryname together. The ordering is deterministic for any given input.
- |5. **TemplateCoordinate**: sorts the reads The sort order used by `GroupReadByUmi`. Sorts reads by
+ |5. **TemplateCoordinate**: The sort order used by `GroupReadByUmi`. Sorts reads by
  | the earlier unclipped 5' coordinate of the read pair, the higher unclipped 5' coordinate of the
  | read pair, library, the molecular identifier (MI tag), read name, and if R1 has the lower
  | coordinates of the pair.
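
The comparison order that the new usage text lists for TemplateCoordinate can be sketched as a plain Scala `Ordering`. This is an illustrative sketch only, not the fgbio implementation: `TemplateKey`, its field names, and `TemplateKeyDemo` are hypothetical, the direction of the final R1 tie-break is an assumption, and the real sort may compare fields that the usage text does not mention.

```scala
// Hypothetical key type for illustration; names are not taken from fgbio.
final case class TemplateKey(
  lower5Prime: Int,    // earlier unclipped 5' coordinate of the read pair
  upper5Prime: Int,    // higher unclipped 5' coordinate of the read pair
  library: String,     // library the reads belong to
  mi: String,          // molecular identifier (MI tag); compared as a string here
  name: String,        // read name
  r1IsUpper: Boolean   // false when R1 has the lower coordinates of the pair
)

object TemplateKey {
  // Compare field by field, in the order the usage text lists them.
  // Assumption: pairs where R1 has the lower coordinates sort first (false < true).
  implicit val ordering: Ordering[TemplateKey] =
    Ordering.by((k: TemplateKey) =>
      (k.lower5Prime, k.upper5Prime, k.library, k.mi, k.name, k.r1IsUpper))
}

object TemplateKeyDemo {
  val keys: Seq[TemplateKey] = Seq(
    TemplateKey(100, 250, "libA", "2", "q2", r1IsUpper = false),
    TemplateKey(100, 200, "libA", "1", "q1", r1IsUpper = true),
    TemplateKey(100, 200, "libA", "1", "q1", r1IsUpper = false)
  )

  // Uses the implicit ordering from the companion object.
  val sorted: Seq[TemplateKey] = keys.sorted
}
```

Building the ordering from a tuple keeps the field precedence visible in one place, so a change to the documented order would be a one-line change in the sketch.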