Releases: ablab/IsoQuant
IsoQuant 3.6.3
- Fix penalty score for terminal exon elongation when selecting similar isoforms for inconsistent reads #270, thanks to @biosalt-cc
- Fix `transcript_model_grouped_counts` output format #275, thanks to @ljwharbers
IsoQuant 3.6.2
Important bug-fix release!
Fixes linear grouped counts output #258, big thanks to @qsonehara!
IsoQuant 3.6.1
IsoQuant 3.6.0
Fixes #236 by resolving duplicated `noninformative` and `intergenic` read assignments.
As a result, also fixes duplicated novel transcripts. Thanks @jamestwebber for the report!
IsoQuant 3.5.2
IsoQuant 3.5.1
- Fix YAML support in visualization #222
- Fix transcript naming when IsoQuant-generated GTF is provided as input #219
- Fix `exons` attribute duplication #219
- Exon ids are now consistent between output and input annotations if present
- New `--count_format` option for setting desired grouped counts format (matrix/linear/both), fixes #223
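The difference between the two grouped counts layouts can be pictured with a small sketch. It assumes that the linear format stores one (feature, group, count) record per line and that the matrix format stores one row per feature with one column per group; the exact column layout of IsoQuant's files is not given in these notes, and the function name `linear_to_matrix` and the toy data are hypothetical.

```python
from collections import defaultdict


def linear_to_matrix(records):
    """Convert (feature, group, count) records into a dense matrix.

    Returns (columns, rows): columns is the sorted list of group names,
    rows maps each feature to a list of counts, one per group.
    Missing (feature, group) pairs are filled with 0.
    """
    counts = defaultdict(dict)
    groups = set()
    for feature, group, count in records:
        counts[feature][group] = counts[feature].get(group, 0) + count
        groups.add(group)
    columns = sorted(groups)
    rows = {
        feature: [by_group.get(group, 0) for group in columns]
        for feature, by_group in sorted(counts.items())
    }
    return columns, rows


# Hypothetical toy data: transcript id, group label, read count.
linear = [
    ("tx1", "cellA", 5),
    ("tx1", "cellB", 2),
    ("tx2", "cellB", 7),
]
columns, matrix = linear_to_matrix(linear)
```

With many groups and few non-zero counts, the linear layout stores only the pairs that actually occur, whereas the matrix layout stores a value for every feature and group combination.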
IsoQuant 3.5.0
- New visualization software developed by @jackfreeman88. See more here.
- Dramatically reduced RAM consumption for grouped counts, about 10-20x decrease on datasets with large number of groups. Important fix for single-cell data processing. Should fix #189.
- Fixed #195: output GTF contained very similar isoforms and estimated their expression as 0.
- New documentation is now available at ablab.github.io/IsoQuant.
IsoQuant 3.4.2
- Dramatically reduce RAM consumption. Should fix #209.
  IsoQuant 3.4.2 was tested on a simulated ONT dataset with 30M reads using 12 threads. In the default mode, RAM consumption decreased from 280GB to 12GB when using the reference annotation and from 230GB to 6GB in the reference-free mode. Running time in the default mode increased by approximately 20-25%. When the `--high_memory` option is used, running time remains the same as in 3.4.1, while RAM consumption is 46GB in the reference-based mode and 36GB in the reference-free mode. Note that in general RAM consumption depends on the particular data being used and the number of threads.
  In brief, in 3.4.0 and 3.4.1 excessive RAM consumption was caused by this commit. Apparently, adding a couple of `int` fields to the `BasicReadAssignment` class caused the default pickle serialization not to free used memory (possibly a leak). Since some large lists of `BasicReadAssignment` objects were sent between processes, the main process consumed unnecessary RAM. When new processes were later created for GTF construction, total RAM consumption exploded because of the way Python multiprocessing works. This release implements two ways of fixing the issue: sending objects via disk (default) and using custom pickle serialization (when `--high_memory` is used).
- Transcript and exon ids are now identical between runs, including ones with different numbers of threads.
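The two strategies named above (custom pickle serialization and passing objects via disk) can be sketched in plain Python. This is only an illustration under stated assumptions, not IsoQuant's actual code: the class `ReadAssignmentLike`, its fields, and the helpers `dump_to_disk` and `load_from_disk` are all hypothetical.

```python
import os
import pickle
import tempfile


class ReadAssignmentLike:
    """Hypothetical stand-in for a small per-read record (not IsoQuant's class)."""

    __slots__ = ("read_id", "gene_id", "start", "end")

    def __init__(self, read_id, gene_id, start, end):
        self.read_id = read_id
        self.gene_id = gene_id
        self.start = start
        self.end = end

    # Custom serialization: pickle a compact tuple instead of the default state.
    def __getstate__(self):
        return (self.read_id, self.gene_id, self.start, self.end)

    def __setstate__(self, state):
        self.read_id, self.gene_id, self.start, self.end = state


def dump_to_disk(objects, directory):
    """Write objects to a temporary pickle file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".pkl", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(objects, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_from_disk(path):
    """Read objects back and remove the temporary file."""
    with open(path, "rb") as handle:
        objects = pickle.load(handle)
    os.remove(path)
    return objects


with tempfile.TemporaryDirectory() as tmp_dir:
    batch = [
        ReadAssignmentLike(f"read{i}", "geneA", 100 * i, 100 * i + 50)
        for i in range(3)
    ]
    path = dump_to_disk(batch, tmp_dir)   # a worker process would do this
    restored = load_from_disk(path)       # the main process would do this
```

In the disk-based variant only a short file path has to travel between processes, so large lists never pass through the inter-process pipe; the price is extra I/O, which is in line with the longer running time reported for the default mode.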
IsoQuant 3.4.1
IsoQuant 3.4.0
Major novelties and improvements:
- Significant speed-up on datasets containing regions with extremely high coverage, often encountered on mitochondrial chromosomes (#97).
- Added support for Illumina reads for spliced alignment correction (thanks to @rkpfeil).
- Added support for YAML files (thanks to @rkpfeil). Old options `--bam_list` and `--fastq_list` are still available, but deprecated since this version.
Transcript discovery and GTF processing:
- Fixed missing genes in extended GTF (#140, #147, #151, #175).
- Fixed strand detection and output of transcripts with `.` strand (#107).
- Added `--report_canonical` and `--polya_requirement` options that control the level of filtering of output transcripts based on canonical splice sites and the presence of poly-A tails (#128).
- Added check for input GTFs (#155).
- Extract CDS, other features and attributes from reference GTF to the output GTFs (#176).
- Reworked novel gene merging procedure (#164).
- Revamped algorithm for assigning reads to novel transcripts and their quantification (#127).
Read assignment and quantification:
- Optimized read-to-isoform assignment algorithm.
- Added `gene_assignment_type` attribute to read assignments.
- Fixed duplicated records in `read_assignments.tsv` (#168).
- Improved gene and transcript quantification. Only unique assignments are now used for transcript quantification. Added more options for quantification strategies (`--gene_quantification` and `--transcript_quantification`).
- New option to control TPM computing (`--normalization_method`).
- Improved consistency between `transcript_counts.tsv` and `transcript_model_counts.tsv` (#137).
- Introduced mapping quality filtering: `--min_mapq`, `--inconsistent_mapq_cutoff` and `--simple_alignments_mapq_cutoff` (#110).
Minor fixes and improvements:
- Added `--bam_tags` option to import additional information from BAM files to read assignments output.
- Large output files are now gzipped by default, `--no_gzip` can be used to keep uncompressed output (#154).
- BAM stats are now printed to the log (#139).
- Various minor fixes and requests: #106, #141, #143, #146, #179.
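Since large outputs are gzipped by default, downstream scripts may need to read both compressed and uncompressed files. A minimal sketch of one way to do that in Python is shown below; the helper names, the demo file name, and the column names are hypothetical and are not taken from IsoQuant.

```python
import csv
import gzip
import os
import tempfile


def open_maybe_gzipped(path):
    """Open a text file, handling gzip compression based on the file suffix."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_tsv(path):
    """Return non-empty, non-comment rows of a tab-separated file."""
    with open_maybe_gzipped(path) as handle:
        return [
            row
            for row in csv.reader(handle, delimiter="\t")
            if row and not row[0].startswith("#")
        ]


# Demo on a temporary gzipped file with made-up content.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.read_assignments.tsv.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("#read_id\tgene_id\n")
        out.write("read1\tgeneA\n")
    rows = read_tsv(demo_path)
```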
Special acknowledgement to @almiheenko for testing and reviewing PRs, and to @alexandrutomescu for supporting the project.