- biobambam2
- Added submodule for `bamsormadup` tool
- Totally cheating - it uses Picard MarkDuplicates but with a custom search pattern and naming
- HiC Explorer
- Fixed bug where module tries to parse QC_table.txt, a new log file in hicexplorer v2.2.
- RSeQC
- Fixed bug where Junction Saturation plot for a single sample was mislabelling the lines.
- Samtools
- Utilize in-built `read_count_multiplier` functionality to plot `flagstat` results more nicely
- SnpEff
- Increased the default summary csv file-size limit from 1MB to 5MB.
- VCFTools
- Fixed a bug where `tstv_by_qual.py` produced invalid JSON from infinity values.
- Added some installation docs for windows
- Added some docs about using MultiQC in bioinformatics pipelines
- Rewrote Docker image
- New base image `czentye/matplotlib-minimal` reduces image size from ~200MB to ~80MB
- Proper installation method ensures latest version of the code
- New entrypoint allows easier command-line usage
- MultiQC now ignores all `.md5` files
- Use `SafeLoader` for PyYaml load calls, avoiding recent warning messages.
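To illustrate the VCFTools fix listed above: Python's `json` module writes infinite floats as the bare token `Infinity`, which strict JSON parsers reject. The sketch below is not the code used in `tstv_by_qual.py`. The helper `sanitise_for_json` and the sample data are hypothetical, and replacing non-finite values with `null` is only one possible approach.

```python
import json
import math


def sanitise_for_json(value):
    """Recursively replace non-finite floats (inf, -inf, nan) with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {k: sanitise_for_json(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitise_for_json(v) for v in value]
    return value


# Hypothetical data: a Ts/Tv ratio is infinite when there are no transversions
tstv = {"sample_1": {"30": 2.1, "40": float("inf")}}

# Default serialisation emits the bare token Infinity, which is not valid JSON
loose = json.dumps(tstv)

# After sanitising, strict serialisation succeeds and the value becomes null
strict = json.dumps(sanitise_for_json(tstv), allow_nan=False)
```

Passing `allow_nan=False` makes `json.dumps` raise a `ValueError` on any remaining non-finite value, so problems surface when the report is written rather than when it is loaded.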
MultiQC v1.7 - 2018-12-21
- BISCUIT
- BISulfite-seq CUI Toolkit
- Module written by @zwdzwd
- DamageProfiler
- A tool to determine ancient DNA misincorporation rates.
- Module written by @apeltzer
- FLASh
- FLASH (Fast Length Adjustment of SHort reads)
- Module written by @pooranis
- MinIONQC
- QC of reads from ONT long-read sequencing
- Module written by @ManavalanG
- phantompeakqualtools
- A tool for informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data.
- Module written by @chuan-wang
- Stacks
- A software for analyzing restriction enzyme-based data (e.g. RAD-seq). Support for Stacks >= 2.1 only.
- Module written by @remiolsen
- AdapterRemoval
- Handle error when zero bases are trimmed. See #838.
- Bcl2fastq
- New plot showing the top twenty of undetermined barcodes by lane.
- Information for R1/R2 is now separated in the General Statistics table.
- SampleID is concatenated with SampleName because in Chromium experiments several samples have the same SampleName.
- deepTools
- New PCA plots from the `plotPCA` function (written by @chuan-wang)
- New fragment size distribution plots from `bamPEFragmentSize --outRawFragmentLengths` (written by @chuan-wang)
- New correlation heatmaps from the `plotCorrelation` function (written by @chuan-wang)
- New sequence distribution profiles around genes, from the `plotProfile` function (written by @chuan-wang)
- Reordered sections
- Fastp
- Fixed bug in parsing of empty histogram data. See #845.
- FastQC
- Refactored Per Base Sequence Content plots to show original underlying data, instead of calculating it from the page contents. Now shows original FastQC base-ranges and fixes 100% GC bug in final few pixels. See #812.
- When including a FastQC section multiple times in one report, the summary progress bars now behave as you would expect.
- FastQ Screen
- Don't hide genomes in the simple plot, even if they have zero unique hits. See #829.
- InterOp
- Fixed bug where read counts and base pair yields were not displaying in tables correctly.
- Number formatting for these fields can now be customised in the same way as with other modules, as described in the docs
- Picard
- InsertSizeMetrics: You can now configure to what degree the insert size plot should be smoothed.
- CollectRnaSeqMetrics: Add warning about missing rRNA annotation.
- CollectRnaSeqMetrics: Add chart for counts/percentage of reads mapped to the correct strand.
- Now parses VariantCallingMetrics reports. (Similar to GATK module's VariantEval.)
- phantompeakqualtools
- Properly clean sample names
- Trimmomatic
- Updated Trimmomatic module documentation to be more helpful
- New option to use filenames instead of relying on the command line used. See #864.
- Embed your custom images with a new Custom Content feature! Just add `_mqc` to the end of the filename for `.png`, `.jpg` or `.jpeg` files.
- Documentation for Custom Content reordered to make it a little more sane
- You can now add or override any config parameter for any MultiQC plot! See the documentation for more info.
- Allow `table_columns_placement` config to work with table IDs as well as column namespaces. See #841.
- Improved visual spacing between grouped bar plots
- Custom content no longer clobbers `col1_header` table configs
- The option `--file-list` that refers to a text file with file paths to analyse will no longer ignore directory paths
- Sample name directory prefixes are now added after cleanup.
- If a module is run multiple times in one report, its CSS and JS files will only be included once (`default` template)
MultiQC v1.6 - 2018-08-04
Some of these updates are thanks to the efforts of people who attended the NASPM 2018 MultiQC hackathon session. Thanks to everyone who attended!
- fastp
- An ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...)
- Module started by @florianduclot and completed by @ewels
- hap.py
- Hap.py is a set of programs based on htslib to benchmark variant calls against gold standard truth datasets
- Module written by @tsnowlan
- Long Ranger
- Works with data from the 10X Genomics Chromium. Performs sample demultiplexing, barcode processing, alignment, quality control, variant calling, phasing, and structural variant calling.
- Module written by @remiolsen
- miRTrace
- A quality control software for small RNA sequencing data.
- Module written by @chuan-wang
- BCFtools
- New plot showing SNP statistics versus quality of call from bcftools stats (@MaxUlysse and @Rotholandus)
- BBMap
- Support added for BBDuk kmer-based adapter/contaminant filtering summary stats (@boulund)
- FastQC
- New read count plot, split into unique and duplicate reads if possible.
- Help text added for all sections, mostly copied from the excellent FastQC help.
- Sequence duplication plot rescaled
- FastQ Screen
- Samples in large-sample-number plot are now sorted alphabetically (@hassanfa)
- MACS2
- Output is now more tolerant of missing data (no plot if no data)
- Peddy
- Picard
- New submodule to handle `ValidateSamFile` reports (@cpavanrun)
- WGSMetrics now add the mean and standard-deviation coverage to the general stats table (hidden) (@cpavanrun)
- Preseq
- New config option to plot preseq plots with unique old coverage on the y axis instead of read count
- Code refactoring by @vladsaveliev
- QUAST
- Null values (`-`) in reports now handled properly. Bargraphs always shown despite varying thresholds. (@vladsaveliev)
- RNA-SeQC
- Don't create the report section for Gene Body Coverage if no data is given
- Samtools
- Fixed edge case bug where MultiQC could crash if a sample had zero count coverage with idxstats.
- Adds % proper pairs to general stats table
- Skewer
- Read length plot rescaled
- Tophat
- Fixed bug where some samples could be given a blank sample name (@lparsons)
- VerifyBamID
- Change column header help text for contamination to match percentage output (@chapmanb)
- New config option `remove_sections` to skip specific report sections from modules
- Add `path_filters_exclude` to exclude certain files when running modules multiple times. You could previously only include certain files.
- New `exclude_*` keys for file search patterns
- Have a subset of patterns to exclude otherwise detected files with, by filename or contents
- Command line options all now use mid-word hyphens (not a mix of hyphens and underscores)
- Old underscore terms still maintained for backwards compatibility
- Flag `--view-tags` now works without requiring an "analysis directory".
- Removed Python dependency for `enum34` (@boulund)
- Columns can be added to `General Stats` table for custom content/module.
- New `--ignore-symlinks` flag which will ignore symlinked directories and files.
- New `--no-megaqc-upload` flag which disables automatically uploading data to MegaQC
- Fix path_filters for top_modules/module_order configuration only selecting if all globs match. It now filters searches that match any glob.
- Empty sample names from cleaning are now no longer allowed
- Stop prepend_dirs set in the config from getting clobbered by an unpassed CLI option (@tsnowlan)
- Modules running multiple times now have multiple sets of columns in the General Statistics table again, instead of overwriting one another.
- Prevent tables from clobbering sorted row orders.
- Fix linegraph and scatter plots data conversion (sporadically the incorrect `ymax` was used to drop data points) (@cpavanrun)
- Adjusted behavior of ceiling and floor axis limits
- Adjusted multiple file search patterns to make them more specific
- Prevents the wrong module from accidentally slurping up output from a different tool. By @cpavanrun (see PR #727)
- Fixed broken report bar plots when `-p`/`--export-plots` was specified (see issue #801)
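The `path_filters` fix in this release (a file is selected if it matches any glob, not only if it matches all of them) and the new `path_filters_exclude` option can be pictured with the following sketch. It is not MultiQC's implementation: the function names are hypothetical and the matching is assumed to behave like Python's `fnmatch`.

```python
from fnmatch import fnmatch


def passes_path_filters(path, path_filters):
    """Return True if the path matches at least one glob, or if no filters are set."""
    if not path_filters:
        return True
    return any(fnmatch(path, pattern) for pattern in path_filters)


def passes_exclude_filters(path, path_filters_exclude):
    """Return True if the path matches none of the exclusion globs."""
    return not any(fnmatch(path, pattern) for pattern in path_filters_exclude)


# Hypothetical file names for a module that is run twice in one report
filters = ["*_trimmed_fastqc.zip", "*_val_1_fastqc.zip"]
keep_trimmed = passes_path_filters("run1/sample_trimmed_fastqc.zip", filters)
keep_raw = passes_path_filters("run1/sample_fastqc.zip", filters)
```

Here `keep_trimmed` is true because one of the two globs matches, while `keep_raw` is false because neither does.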
MultiQC v1.5 - 2018-03-15
- HiCPro - New module!
- HiCPro: Quality controls and processing of Hi-C
- Module written by @nservant,
- DeDup - New module!
- DeDup: Improved Duplicate Removal for merged/collapsed reads in ancient DNA analysis
- Module written by @apeltzer,
- Clip&Merge - New module!
- Clip&Merge: Adapter clipping and read merging for ancient DNA analysis
- Module written by @apeltzer,
- bcl2fastq
- BUSCO
- Fixed configuration bug that made all sample names become `'short'`
- Custom Content
- Parsed tables now exported to `multiqc_data` files
- Cutadapt
- Refactor parsing code to collect all length trimming plots
- FastQC
- Fixed starting y-axis label for GC-content lineplot being incorrect.
- HiCExplorer
- Updated to work with v2.0 release.
- Homer
- Made parsing of `tagInfo.txt` file more resilient to variations in file format so that it works with new versions of Homer.
- Kept order of chromosomes in coverage plot consistent.
- Peddy
- Switch `Sex error` logic to `Correct sex` for better highlighting (@aledj2)
- Picard
- Updated module and search patterns to recognise new output format from Picard version >= 2.16 and GATK output.
- Qualimap BamQC
- Fixed bug where start of Genome Fraction could have a step if target is 100% covered.
- RNA-SeQC
- Added rRNA alignment stats to summary table @Rolandde
- RSeQC
- Fixed read distribution plot by adding category for `other_intergenic` (thanks to @moxgreen)
- Fixed a dodgy plot title (Read GC content)
- Supernova
- Added support for Supernova 2.0 reports. Fixed a TypeError bug when using txt reports only. Also a bug when parsing empty histogram files.
- Invalid choices for `--module` or `--exclude` now list the available modules alphabetically.
- Linting now checks for presence in `config.module_order` and tags.
- Excluding modules now works in combination with using module tags.
- Fixed edge-case bug where certain combinations of `output_fn_name` and `data_dir_name` could trigger a crash
- Conditional formatting - values are no longer double-labelled
- Made config option `extra_series` work in scatter plots the same way that it works for line plots
- Locked the `matplotlib` version to `v2.1.0` and below
MultiQC v1.4 - 2018-01-11
A slightly earlier-than-expected release due to a new problem with dependency packages that is breaking MultiQC installations since 2018-01-11.
- Sargasso
- Parses output from Sargasso - a tool to separate mixed-species RNA-seq reads according to their species of origin
- Module written by @hxin
- VerifyBAMID
- Parses output from VerifyBAMID - a tool to detect contamination in BAM files.
- Adds the `CHIPMIX` and `FREEMIX` columns to the general statistics table.
- Module written by @aledj2
- MACS2
- Updated to work with output from older versions of MACS2 by @avilella
- Peddy
- Add het check plot to suggest potential contamination by @aledj2
- Picard
- Picard HsMetrics `HS_PENALTY` plot now has correct axis labels
- InsertSizeMetrics switches commas for points if it can't convert floats. Should help some European users.
- QoRTs
- Added support for new style of output generated in the v1.3.0 release
- Qualimap
- QUAST
- New option to customise the default display of contig count and length (eg. `bp` instead of `Mbp`).
- See documentation. Written by @ewels and @Cashalow
- RSeQC
- Removed normalisation in Junction Saturation plot. Now raw counts instead of % of total junctions.
- Conditional formatting / highlighting of cell contents in tables
- If you want to make values that match certain criteria stand out more, you can now write custom rules and formatting instructions for tables.
- For instructions, see the documentation
- New `--lint` option which is strict about best-practices for writing new modules
- Useful when writing new modules and code as it throws warnings
- Currently only implemented for bar plots and a few other places. More linting coming soon...
- If MultiQC breaks and shows an error message, it now reports the filename of the last log it found
- Hopefully this will help with debugging / finding dodgy input data
- Addressed new dependency error with conflicting package requirements
- There was a conflict between the `networkx`, `colormath` and `spectra` releases.
- I previously forced certain software versions to get around this, but `spectra` has now updated with the unfortunate effect of introducing a new dependency clash that halts installation.
- Fixed newly introduced bug where Custom Content MultiQC config file search patterns had been broken
- Updated pandoc command used in `--pdf` to work with new releases of Pandoc
- Made config `table_columns_visible` module name key matching case insensitive to make less frustrating
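The Picard InsertSizeMetrics change in this release (switching commas for points when a value cannot be converted to a float) follows a simple fallback pattern. The sketch below shows the general idea only. `parse_float` is a hypothetical helper, not the function used in the Picard module.

```python
def parse_float(text):
    """Convert a string to float, retrying with ',' swapped for '.' on failure."""
    try:
        return float(text)
    except ValueError:
        # Some locales write decimals as "123,4". A second failure still raises ValueError.
        return float(text.replace(",", "."))


# Hypothetical values as they might appear in a metrics file
standard = parse_float("123.4")
with_comma = parse_float("123,4")
```

Trying the plain conversion first means files written with a point as the decimal separator are parsed exactly as before.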
MultiQC v1.3 - 2017-11-03
Only for users with custom search patterns for the `bowtie` or `star` modules: you will need to update your config files - the `bowtie` search key is now `bowtie1`, `star_genecounts` is now `star/genecounts`.
For users with custom modules - search patterns must now conform to the search pattern naming convention: `modulename` or `modulename/anything` (the search pattern string beginning with the name of your module, anything you like after the first `/`).
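The naming convention described above can be expressed as a small check. This is only a sketch: `is_valid_search_key` is a hypothetical helper, not part of MultiQC, and it assumes that something must follow the `/` when one is present.

```python
def is_valid_search_key(key, module_name):
    """Check that a search key is 'modulename' or 'modulename/anything'."""
    if key == module_name:
        return True
    prefix = module_name + "/"
    # Require at least one character after the first slash
    return key.startswith(prefix) and len(key) > len(prefix)


# Keys taken from the notes above
new_bowtie = is_valid_search_key("bowtie1", "bowtie1")
new_star = is_valid_search_key("star/genecounts", "star")
old_star = is_valid_search_key("star_genecounts", "star")
```

With this check the old key `star_genecounts` is rejected, while `bowtie1` and `star/genecounts` both pass.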
- 10X Supernova
- Parses statistics from the de-novo Supernova software.
- Module written by @remiolsen
- BBMap
- deepTools - new module!
- Parse text output from `bamPEFragmentSize`, `estimateReadFiltering`, `plotCoverage`, `plotEnrichment`, and `plotFingerprint`
- Module written by @dpryan79
- Homer Tag Directory - new submodule!
- Module written by @rdali
- Illumina InterOp
- Module to parse metrics from Illumina sequencing runs and demultiplexing, generated by the InterOp package
- Module written by @matthdsm
- RSEM - new module!
- Parse `.cnt` file coming from rsem-calculate-expression and plot read repartitions (Unalignable, Unique, Multi ...)
- Module written by @noirot
- HiCExplorer
- New module to parse the log files of `hicBuildMatrix`.
- Module written by @joachimwolff
- AfterQC
- Handle new output format where JSON summary key changed names.
- bcl2fastq
- Clusters per sample plot now has tab where counts are categorised by lane.
- GATK
- New submodule to handle Base Recalibrator stats, written by @winni2k
- HiSAT2
- Fixed bug where plot title was incorrect if both SE and PE bargraphs were in one report
- Picard HsMetrics
- Parsing code can now handle commas for decimal places
- Preseq
- Updated odd file-search pattern that limited input files to 500kb
- QoRTs
- Added new plots, new helptext and updated the module to produce a lot more output.
- Qualimap BamQC
- Fixed edge-case bug where the refactored coverage plot code could raise an error from the `range` call.
- Documentation and link fixes for Slamdunk, GATK, bcl2fastq, Adapter Removal, FastQC and main docs
- Many of these spotted and fixed by @juliangehring
- Went through all modules and standardised plot titles
- All plots should now have a title with the format Module name: Plot name
- New MultiQC docker image
- Ready to use docker image now available at https://hub.docker.com/r/ewels/multiqc/ (200 MB)
- Uses automated builds - pull `:latest` to get the development version, future releases will have stable tags.
- Written by @MaxUlysse
- New `module_order` config options allow modules to be run multiple times
- Filters mean that a module can be run twice with different sets of files (eg. before and after trimming)
- Custom module config parameters can be passed to module for each run
- File search refactored to only search for running modules
- Makes search much faster when running with lots of files and limited modules
- For example, if using `-m star` to only use the STAR module, all other file searches now skipped
- File search now warns if an unrecognised search type is given
- MultiQC now saves nearly all parsed data to a structured output file by default
- See `multiqc_data/multiqc_data.json`
- This can be turned off by setting `config.data_dump_file: false`
- Verbose logging when no log files found standardised. Less duplication in code and logs easier to read!
- New documentation section describing how to use MultiQC with Galaxy
- Using `shared_key: 'read_counts'` in table header configs now applies relevant defaults
- Installation problem caused by changes in upstream dependencies solved by stricter installation requirements
- Minor `default_dev` directory creation bug squashed
- Don't prepend the directory separator (`|`) to sample names with `-d` when there are no subdirs
- `yPlotLines` now works even if you don't set `width`
MultiQC v1.2 - 2017-08-16
We had a fantastic group effort on MultiQC at the 2017 BOSC CodeFest. Many thanks to those involved!
- AfterQC - New module!
- Added parsing of the AfterQC json file data, with a plot of filtered reads.
- Work by @raonyguimaraes
- bcl2fastq
- leeHom
- leeHom is a program for the Bayesian reconstruction of ancient DNA
- VCFTools
- Added initial support for VCFTools `relatedness2`
- Added support for VCFTools `TsTv-by-count`, `TsTv-by-qual` and `TsTv-summary`
- Module written by @mwhamgenomics
- FastQ Screen
- Gracefully handle missing data from very old FastQ Screen versions.
- RNA-SeQC
- Add new transcript-associated reads plot.
- Picard
- New submodule to handle output from `TargetedPcrMetrics`
- Prokka
- Added parsing of the `# CRISPR arrays` data from Prokka when available (@asetGem)
- Qualimap
- Some code refactoring to radically improve performance and run times, especially with high coverage datasets.
- Fixed bug where Cumulative coverage genome fraction plot could be truncated.
- New module help text
- Lots of additional help text was written to make MultiQC report plots easier to interpret.
- Updated modules:
- Bowtie
- Bowtie 2
- Prokka
- Qualimap
- SnpEff
- Elite team of help-writers:
- New config option `section_comments` allows you to add custom comments above specific sections in the report
- New `--tags` and `--view_tags` command line options
- Modules can now be given tags (keywords) and filtered by those. So running `--tags RNA` will only run MultiQC modules related to RNA analysis.
- Work by @Hammarn
- Back-end configuration options to specify the order of table columns
- Modules and user configs can set priorities for columns to customise where they are displayed
- Work by @tbooth
- Added framework for proper unit testing
- Previous start on unit tests tidied up, new blank template and tests for the `clean_sample_name` functionality.
- Added to Travis and Appveyor for continuous integration testing.
- Work by @tbooth
- Bug fixes and refactoring of report configuration saving / loading
- Discovered and fixed a bug where a report config could only be loaded once
- Work by @DennisSchwartz
- Table column row headers (sample names) can now be numeric-only.
- Work by @iimog
- Improved sample name cleaning functionality
- Added option `regex_keep` to clean filenames by keeping the matching part of a pattern
- Work by @robinandeer
- Handle error when invalid regexes are given in reports
- Now have a nice toast error warning you and the invalid regexes are highlighted
- Previously this just crashed the whole report without any warning
- Work by @robinandeer
- Command line option `--dirs-depth` now sets `-d` to `True` (so now works even if `-d` isn't also specified).
- New config option `config.data_dump_file` to export as much data as possible to `multiqc_data/multiqc_data.json`
- New code to send exported JSON data to a web server
- This is in preparation for the upcoming MegaQC project. Stay tuned!
- Specifying multiple config files with `-c`/`--config` now works as expected
- Previously this would only read the last specified
- Fixed table rendering bug that affected Chrome v60 and IE7-11
- Table cell background bars weren't showing up. Updated CSS to get around this rendering error.
- HTML ID cleanup now properly cleans strings so that they work with jQuery as expected.
- Made bar graph sample highlighting work properly again
- Config `custom_logo` paths can now be relative to the config file (or absolute as before)
- Report doesn't keep annoyingly telling you that toolbox changes haven't been applied
- Now uses more subtle toasts and only when you close the toolbox (not every click).
- Switching report toolbox options to regex mode now enables the Apply button as it should.
- Sorting table columns with certain suffixes (eg. `13X`) now works properly (numerically)
- Fixed minor bug in line plot data smoothing (now works with unsorted keys)
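The `regex_keep` sample name cleaning option added in this release keeps the part of a name that matches a pattern. A rough sketch of that idea follows. `clean_with_regex_keep` is a hypothetical helper rather than MultiQC's own code, and returning the name unchanged when nothing matches is an assumption.

```python
import re


def clean_with_regex_keep(sample_name, pattern):
    """Keep only the part of the name matched by pattern."""
    match = re.search(pattern, sample_name)
    # Assumption: a name that does not match is left as it is
    return match.group() if match else sample_name


# Hypothetical file stem and pattern
cleaned = clean_with_regex_keep("run42_P1234_1001_trimmed", r"P\d+_\d+")
untouched = clean_with_regex_keep("control_sample", r"P\d+_\d+")
```

This is the reverse of the usual cleaning approach, which removes matched text instead of keeping it.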
MultiQC v1.1 - 2017-07-18
- BioBloom Tools
- Create Bloom filters for a given reference and then categorize sequences
- Conpair
- Concordance and contamination estimator for tumor–normal pairs
- Disambiguate
- Bargraph displaying the percentage of reads aligning to two different reference genomes.
- Flexbar
- Flexbar is a tool for flexible barcode and adapter removal.
- HISAT2
- New module for the HISAT2 aligner.
- Made possible by updates to HISAT2 logging by @infphilo (requires `--new-summary` HISAT2 flag).
- HOMER
- Support for summary statistics from the `findPeaks` tool.
- Jellyfish
- Histograms to estimate library complexity and coverage from k-mer content.
- Module written by @vezzi
- MACS2
- Summary of redundant rate from MACS2 peak calling.
- QoRTs
- QoRTs is a toolkit for analysis, QC and data management of RNA-Seq datasets.
- THetA2
- THetA2 (Tumor Heterogeneity Analysis) estimates tumour purity and clonal / subclonal copy number.
- BCFtools
- Option to collapse complementary changes in substitutions plot, useful for non-strand specific experiments (thanks to @vladsaveliev)
- Bismark
- M-Bias plots no longer show read 2 for single-end data.
- Custom Content
- New option to print raw HTML content to the report.
- FastQ Screen
- Fixed edge-case bug where many-sample plot broke if total number of reads was less than the subsample number.
- Fixed incorrect logic of config option `fastqscreen_simpleplot` (thanks to @daler)
- Organisms now alphabetically sorted in fancy plot so that order is nonrandom (thanks to @daler)
- Fixed bug where `%No Hits` was missed in logs from recent versions of FastQ Screen.
- HTSeq Counts
- Fixed bug so that module still works when `--additional-attr` is specified in HTSeq v0.8 and above (thanks to @nalcala)
- Picard
- CollectInsertSize: Fixed bug that could make the General Statistics Median Insert Size value incorrect.
- Fixed error in sample name regex that left trailing `]` characters and was generally broken (thanks to @jyh1 for spotting this)
- Preseq
- Improved plots display (thanks to @vladsaveliev)
- Qualimap
- Only calculate bases over target coverage for values in General Statistics. Should give a speed increase for very high coverage datasets.
- QUAST
- Module is now compatible with runs from MetaQUAST (thanks to @vladsaveliev)
- RSeQC
- Changed default order of sections
- Added config option to reorder and hide module report sections
- If a report already exists, execution is no longer halted.
- `_1` is appended to the filename, iterating if this also exists.
- `-f`/`--force` still overwrites existing reports as before
- Feature written by @Hammarn
- New ability to run modules multiple times in a single report
- Each run can be given different configuration options, including filters for input files
- For example, have FastQC after trimming as well as FastQC before trimming.
- See the relevant documentation for more instructions.
- New option to customise the order of report sections
- This is in addition / alternative to changing the order of module execution
- Allows one module to have sections in multiple places (eg. Custom Content)
- Tables have new column options `floor`, `ceiling` and `minRange`.
- Reports show warning if JavaScript is disabled
- Config option `custom_logo` now works with file paths relative to config file directory and cwd.
- Table headers now sort columns again after scrolling the table
- Fixed buggy table header tooltips
- Base `clean_s_name` function now strips excess whitespace.
- Line graphs don't smooth lines if not needed (number of points < maximum number allowed)
- PDF output now respects custom output directory.
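The new behaviour for existing reports in this release (append `_1` to the filename and keep iterating while that name is also taken) could look roughly like this. `next_free_filename` is a hypothetical helper, and placing the suffix before the file extension is an assumption.

```python
import os
import tempfile


def next_free_filename(path):
    """Return path if unused, otherwise add _1, _2, ... before the extension."""
    if not os.path.exists(path):
        return path
    base, ext = os.path.splitext(path)
    counter = 1
    while os.path.exists(f"{base}_{counter}{ext}"):
        counter += 1
    return f"{base}_{counter}{ext}"


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    report = os.path.join(tmp, "multiqc_report.html")
    first = next_free_filename(report)   # nothing exists yet
    open(first, "w").close()
    second = next_free_filename(report)  # original name is taken
    open(second, "w").close()
    third = next_free_filename(report)   # _1 is taken as well
    names = [os.path.basename(p) for p in (first, second, third)]
```

In this demonstration `names` ends up as `multiqc_report.html`, `multiqc_report_1.html` and `multiqc_report_2.html`.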
MultiQC v1.0 - 2017-05-17
Version 1.0! This release has been a long time coming and brings with it some fairly major improvements in speed, report filesize and report performance. There's also a bunch of new modules, more options, features and a whole lot of bug fixes.
The version number is being bumped up to 1.0 for a couple of reasons:
- MultiQC is now (hopefully) relatively stable. A number of facilities and users are now using it in a production setting and it's published. It feels like it probably deserves v1 status now somehow.
- This update brings some fairly major changes which will break backwards compatibility for plugins. As such, semantic versioning suggests a change in major version number.
For most people, you shouldn't have any problems upgrading. There are two scenarios where you may need to make changes with this update:
Search patterns have been flattened and may no longer have arbitrary depth. For example, you may need to change the following:

    fastqc:
        data:
            fn: 'fastqc_data.txt'
        zip:
            fn: '*_fastqc.zip'

to this:

    fastqc/data:
        fn: 'fastqc_data.txt'
    fastqc/zip:
        fn: '*_fastqc.zip'

See the documentation for instructions on how to write the new file search syntax.
See `search_patterns.yaml` for the new module search keys and more examples.
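If you have many old-style patterns, the flattening shown above can be done mechanically. The sketch below is not part of MultiQC. `flatten_search_patterns` is a hypothetical helper that assumes a single level of nesting and works on a config that has already been loaded into a Python dictionary.

```python
def flatten_search_patterns(nested):
    """Turn {'fastqc': {'data': {...}}} into {'fastqc/data': {...}}."""
    flat = {}
    for module, patterns in nested.items():
        # A dict whose values are all dicts is treated as a group of named sub-patterns
        if patterns and all(isinstance(v, dict) for v in patterns.values()):
            for sub_key, pattern in patterns.items():
                flat[f"{module}/{sub_key}"] = pattern
        else:
            # Already a single pattern such as {'fn': '...'}, so keep it as it is
            flat[module] = patterns
    return flat


# The FastQC example from the notes above
old_style = {
    "fastqc": {
        "data": {"fn": "fastqc_data.txt"},
        "zip": {"fn": "*_fastqc.zip"},
    }
}
new_style = flatten_search_patterns(old_style)
```

For the FastQC example this gives the keys `fastqc/data` and `fastqc/zip`, matching the new syntax.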
To see what changes need to be applied to your custom plugin code, please see the MultiQC docs.
- Adapter Removal
- AdapterRemoval v2 - rapid adapter trimming, identification, and read merging
- BUSCO
- New module for the `BUSCO v2` tool, used for assessing genome assembly and annotation completeness.
- Cluster Flow
- Cluster Flow is a workflow tool for bioinformatics pipelines. The new module parses executed tool commands.
- RNA-SeQC
- New module to parse output from RNA-SeQC, a java program which computes a series of quality control metrics for RNA-seq data.
- goleft indexcov
- goleft indexcov uses the PED and ROC data files to create diagnostic plots of coverage per sample, helping to identify sample gender and coverage issues.
- Thanks to @chapmanb and @brentp
- SortMeRNA
- New module for `SortMeRNA`, commonly used for removing rRNA contamination from datasets.
- Written by @bschiffthaler
- Bcftools
- Fixed bug with display of indels when only one sample
- Cutadapt
- Now takes the filename if the sample name is `-` (stdin). Thanks to @tdido
- FastQC
- Data for the Sequence content plot can now be downloaded from reports as a JSON file.
- FastQ Screen
- Rewritten plotting method for high sample numbers plot (~ > 20 samples)
- Now shows counts for single-species hits and bins all multi-species hits
- Allows plot to show proper percentage view for each sample, much easier to interpret.
- HTSeq
- Fix bug where header lines caused module to crash
- Picard
- New `RrbsSummaryMetrics` Submodule!
- New `WgsMetrics` Submodule!
- `CollectGcBiasMetrics` module now prints summary statistics to `multiqc_data` if found. Thanks to @ahvigil
- Preseq
- Now trims the x axis to the point that meets 90% of `min(unique molecules)`. Hopefully prevents ridiculous x axes without sacrificing too much useful information.
- Allows to show estimated depth of coverage instead of less informative molecule counts (see details).
- Plots dots with externally calculated real read counts (see details).
- Qualimap
- RNASeq Transcript Profile now has correct axis units. Thanks to @roryk
- BamQC module now doesn't crash if reports don't have genome gc distributions
- RSeQC
- Fixed Python3 error in Junction Saturation code
- Fixed JS error for Junction Saturation that made the single-sample combined plot only show All Junctions
- Change in module structure and import statements (see details).
- Module file search has been rewritten (see above changes to configs)
- Significant improvement in search speed (test dataset runs in approximately half the time)
- More options for modules to find their logs, eg. filename and contents matching regexes (see the docs)
- Report plot data is now compressed, significantly reducing report filesizes.
- New `--ignore-samples` option to skip samples based on parsed sample name
- Alternative to filtering by input filename, which doesn't always work
- Also can use config vars `sample_names_ignore` (glob patterns) and `sample_names_ignore_re` (regex patterns).
- New `--sample-names` command line option to give file with alternative sample names
- Allows one-click batch renaming in reports
- New `--cl_config` option to supply MultiQC config YAML directly on the command line.
- New config option to change numeric multiplier in General Stats
- For example, if reports have few reads, can show `Thousands of Reads` instead of `Millions of Reads`
- Set config options `read_count_multiplier`, `read_count_prefix` and `read_count_desc`
- Config options `decimalPoint_format` and `thousandsSep_format` now apply to tables as well as plots
- By default, thousands will now be separated with a space and `.` used for decimal places.
- Tables now have a maximum-height by default and scroll within this.
- Speeds up report rendering in the web browser and makes report less stupidly long with lots of samples
- Button beneath table toggles full length if you want a zoomed-out view
- Refactored and removed previous code to make the table header "float"
- Set `config.collapse_tables` to `False` to disable table maximum-heights
- Bar graphs and heatmaps can now be zoomed in on
- Interactive plots sometimes hide labels due to lack of space. These can now be zoomed in on to see specific samples in more detail.
- Report plots now load sequentially instead of all at once
- Prevents the browser from locking up when large reports load
- Report plot and section HTML IDs are now sanitised and checked for duplicates
- New template available (called sections) which has faster loading
- Only shows results from one module at a time
- Makes big reports load in the browser much more quickly, but requires more clicking
- Try it out by specifying `-t sections`
- Module sections tidied and refactored
- New helper function `self.add_section()`
- Sections hidden in nav if no title (no more need for the hacky `self.intro +=`)
- Content broken into `description`, `help` and `plot`, with automatic formatting
- Empty module sections are now skipped in reports. No need to check if a plot function returns `None`!
- Changes should be backwards-compatible
- Report plot data export code refactored
- Now doesn't export hidden samples (uses HighCharts export-csv plugin)
- Handle error when `git` isn't installed on the system.
- Refactored colouring of table cells
- Docs updates (thanks to @varemo)
- Previously hidden log file `.multiqc.log` renamed to `multiqc.log` in `multiqc_data`
- Added option to load MultiQC config file from a path specified in the environment variable `MULTIQC_CONFIG_PATH`
- New table configuration options
- `sortRows: False` prevents table rows from being sorted alphabetically
- `col1_header` allows the default first column header to be changed from "Sample Name"
- Tables no longer show Configure Columns and Plot buttons if they only have a single column
- Custom content updates
- New `custom_content`/`order` config option to specify order of Custom Content sections
- Tables now use the header for the first column instead of always having `Sample Name`
- JSON + YAML tables now remember order of table columns
- Many minor bugfixes
- Line graphs and scatter graphs axis limits
- If limits are specified, data exceeding this is no longer saved in report
- Visually identical, but can make report file sizes considerably smaller in some cases
- Creating multiple plots without a config dict now works (previously just gave grey boxes in report)
- All changes are now tested on a Windows system, using AppVeyor
- Fixed rare error where some reports could get empty General Statistics tables when no data present.
- Fixed minor bug where config option `force: true` didn't work. Now you don't have to always specify `-f`!
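As a rough illustration of the `read_count_multiplier`, `read_count_prefix` and `read_count_desc` options listed above, the sketch below scales raw read counts for display. It is not MultiQC's own code: the `config` dictionary, both helper functions and the example values (`0.001`, `K`, `thousands`) are assumptions chosen to mirror the "Thousands of Reads" example.

```python
# Hypothetical stand-in for the three config options named in the changelog.
# The values are illustrative assumptions, not confirmed MultiQC defaults.
config = {
    "read_count_multiplier": 0.001,
    "read_count_prefix": "K",
    "read_count_desc": "thousands",
}


def format_read_count(raw_count, cfg):
    """Scale a raw read count and append the configured unit prefix."""
    scaled = raw_count * cfg["read_count_multiplier"]
    return "{:.1f} {}".format(scaled, cfg["read_count_prefix"])


def column_title(cfg):
    """Build a column title such as 'Thousands of Reads'."""
    return "{} of Reads".format(cfg["read_count_desc"].capitalize())


print(column_title(config), "->", format_read_count(25300, config))
```

Swapping the three values for a multiplier of `0.000001`, a prefix of `M` and a description of `millions` would give the "Millions of Reads" view instead.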
MultiQC v0.9 - 2016-12-21
A major new feature is released in v0.9 - support for custom content. This means that MultiQC can now easily include output from custom scripts within reports without the need for a new module or plugin. For more information, please see the MultiQC documentation.
- HTSeq
- New module for the `htseq-count` tool, often used in RNA-seq analysis.
- Prokka
- Prokka is a software tool for the rapid annotation of prokaryotic genomes.
- Slamdunk
- Slamdunk is a software tool to analyze SLAMSeq data.
- Peddy
- Peddy calculates genotype :: pedigree correspondence checks, ancestry checks and sex checks using VCF files.
- Cutadapt
- Fixed bug in General Stats table number for old versions of cutadapt (pre v1.7)
- Added support for really old cutadapt logs (eg. v.1.2)
- FastQC
- New plot showing total overrepresented sequence percentages.
- New option to parse a file containing a theoretical GC curve to display in the background.
- Human & Mouse Genome / Transcriptome curves bundled, or make your own using fastqcTheoreticalGC. See the MultiQC docs for more information.
- featureCounts
- Added parsing checks and catch failures for when non-featureCounts files are picked up by accident
- GATK
- Fixed logger error in VariantEval module.
- Picard
- Fixed missing sample overwriting bug in `RnaSeqMetrics`
- New feature to customise coverage shown from `HsMetrics` in General Statistics table (see the docs for info).
- Fixed compatibility problem with output from `CollectMultipleMetrics` for `CollectAlignmentSummaryMetrics`
- Preseq
- Module now recognises output from `c_curve` mode.
- RSeQC
- Made the gene body coverage plot show the percentage view by default
- Made gene body coverage properly handle sample names
- Samtools
- New module to show duplicate stats from `rmdup` logs
- Fixed a couple of niggles in the idxstats plot
- SnpEff
- Fixed swapped axis labels in the Variant Quality plot
- STAR
- Fixed crash when there are 0 unmapped reads.
- Sample name now taken from the directory name if no file prefix found.
- Qualimap BamQC
- Add a line for pre-calculated reference genome GC content
- Plot cumulative coverage for values above 50x, align with the coverage histogram.
- New ability to customise coverage thresholds shown in General Statistics table (see the docs for info).
- Support for custom content (see top of release notes).
- New ninja report tool: make scatter plots of any two table columns!
- Plot data now saved in `multiqc_data` when 'flat' image plots are created
- Allows you to easily re-plot the data (eg. in Excel) for further downstream investigation
- Added 'Apply' button to Highlight / Rename / Hide.
- These tools can become slow with large reports. This means that you can enter several things without having to wait for the report to replot each change.
- Report heatmaps can now be sorted by highlight
- New config options `decimalPoint_format` and `thousandsSep_format`
- Allows you to change the default `1 234.56` number formatting for plots.
- New config option `top_modules` allows you to specify modules that should come at the top of the report
- Fixed bar plot bug where missing categories could shift data between samples
- Report title now printed in the side navigation
- Missing plot IDs added for easier plot exporting
- Stopped giving warnings about skipping directories (now a debug message)
- Added warnings in report about missing functionality for flat plots (exporting and toolbox)
- Export button has contextual text for images / data
- Fixed a bug where user config files were loaded twice
- Fixed bug where module order was random if `--module` or `--exclude` was used.
- Refactored code so that the order of modules can be changed in the user config
- Beefed up code + docs in scatter plots back end and multiple bar plots.
- Fixed a few back end nasties for Tables
- Shared-key columns are no longer forced to share colour schemes
- Fixed bug in lambda modified values when format string breaks
- Supplying just data with no header information now works as advertised
- Improvements to back end code for bar plots
- New `tt_decimals` and `tt_suffix` options for bar plots
- Bar plots now support `yCeiling`, `yFloor` and `yMinRange`, as with line plots.
- New option `hide_zero_cats:False` to force legends to be shown even when all data is 0
- General Stats Showing x of y columns count is fixed on page load.
- Big code whitespace cleanup
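The `decimalPoint_format` and `thousandsSep_format` options above control how numbers such as `1 234.56` are rendered. The following is a minimal sketch of that kind of formatting in plain Python. The function name and its parameters are hypothetical and only stand in for the two config options; MultiQC's reports may apply the separators differently.

```python
def format_number(value, decimals=2, thousands_sep=" ", decimal_point="."):
    """Format a number with custom thousands and decimal separators."""
    # Start from Python's built-in grouping, e.g. 1234.56 -> "1,234.56"
    plain = "{:,.{d}f}".format(value, d=decimals)
    integer_part, _, fraction_part = plain.partition(".")
    # Swap in the requested thousands separator
    integer_part = integer_part.replace(",", thousands_sep)
    if fraction_part:
        return integer_part + decimal_point + fraction_part
    return integer_part


print(format_number(1234.56))
print(format_number(1234.56, thousands_sep=".", decimal_point=","))
```

The first call uses the space and `.` combination described in the release notes. The second shows a European-style alternative.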
MultiQC v0.8 - 2016-09-26
- GATK
- Added support for VariantEval reports, only parsing a little of the information in there so far, but it's a start.
- Module originally written by @robinandeer at the OBF Codefest, finished off by @ewels
- Bcftools
- QUAST
- QUAST is a tool for assessing de novo assemblies against reference genomes.
- Bismark now supports reports from `bam2nuc`, giving Cytosine coverage in General Stats.
- Bowtie1
- Updated to try to find bowtie command before log, handle multiple logs in one file. Same as bowtie2.
- FastQC
- Sample pass/warn/fail lists now display properly even with large numbers of samples
- Sequence content heatmap display is better with many samples
- Kallisto
- Now supports logs from SE data.
- Picard
- `BaseDistributionByCycle` - new submodule! Written by @mlusignan
- `RnaSeqMetrics` - new submodule! This one by @ewels ;)
- `AlignmentSummaryMetrics` - another new submodule!
- Fixed truncated files crash bug for Python 3 (#306)
- Qualimap RNASeqQC
- Fixed parsing bug affecting counts in Genomic Origin plot.
- Module now works with European style thousand separators (`1.234,56` instead of `1,234.56`)
- RSeQC
- `infer_experiment` - new submodule! Written by @Hammarn
- Samtools
- `stats` submodule now has separate bar graph showing alignment scores
- `flagstat` - new submodule! Written by @HLWiencko
- `idxstats` - new submodule! This one by @ewels again
- New `--export`/`-p` option to generate static plot images in `multiqc_plots` (`.png`, `.svg` and `.pdf`)
- Configurable with `export_plots`, `plots_dir_name` and `export_plot_formats` config options
- `--flat` option no longer saves plots in `multiqc_data/multiqc_plots`
- New `--comment`/`-b` flag to add a comment to the top of reports.
- New `--dirs-depth`/`-dd` flag to specify how many directories to prepend with `--dirs`/`-d`
- Specifying a positive number will take that many directories from the end of the path
- A negative number will take directories from the start of the path.
- Directory paths now appended before cleaning, so `fn_clean_exts` will now affect these names.
- New `custom_logo` attributes to add your own logo to reports.
- New `report_header_info` config option to add arbitrary information to the top of reports.
- New `--pdf` option to create a PDF report
- Depends on Pandoc being installed and is in a beta-stage currently.
- Note that specifying this will make MultiQC use the `simple` template, giving a HTML report with much reduced functionality.
- New `fn_clean_sample_names` config option to turn off sample name cleaning
- This will print the full filename for samples. Less pretty reports and rows on the General Statistics table won't line up, but can prevent overwriting.
- Table header defaults can now be set easily
- General Statistics table now hidden if empty.
- Some new defaults in the sample name cleaning
- Updated the `simple` template.
- Now has no toolbox or nav, no JavaScript and is better suited for printing / PDFs.
- New `config.simple_output` config flag so code knows when we're trying to avoid JS.
- Fixed some bugs with config settings (eg. template) being overwritten.
- NFS log file deletion bug fixed by @brainstorm (#265)
- Fixed bug in `--ignore` behaviour with directory names.
- Fixed nasty bug in beeswarm dot plots where sample names were mixed up (#278)
- Beeswarm header text is now more informative (sample count with more info on a tooltip)
- Beeswarm plots now work when reports have > 1000 samples
- Fixed some buggy behaviour in saving / loading report highlighting + renaming configs (#354)
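The `--dirs-depth` rule described in the list above (positive values take directories from the end of the path, negative values from the start) can be sketched as follows. This is an illustration only: the function name, the ` | ` separator and the treatment of a depth of `0` as "keep every directory" are assumptions, not confirmed MultiQC behaviour.

```python
from pathlib import PurePosixPath


def prepend_dirs(file_path, sample_name, depth=0, sep=" | "):
    """Prefix a sample name with some of the directories it was found in."""
    # Directory components of the file's parent, ignoring a leading "/"
    dirs = [p for p in PurePosixPath(file_path).parent.parts if p != "/"]
    if depth > 0:
        # Positive depth: keep that many directories from the end of the path
        dirs = dirs[-depth:]
    elif depth < 0:
        # Negative depth: keep that many directories from the start of the path
        dirs = dirs[:-depth]
    return sep.join(dirs + [sample_name])


path = "data/run1/laneA/sample.log"
print(prepend_dirs(path, "sample", depth=1))
print(prepend_dirs(path, "sample", depth=-2))
```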
Many thanks to those at the OpenBio Codefest 2016 who worked on MultiQC projects.
MultiQC v0.7 - 2016-07-04
- Kallisto - new module!
- Picard
- Code refactored to make maintenance and additions easier.
- Big update to `HsMetrics` parsing - more results shown in report, new plots (by @lpantano)
- Updated `InsertSizeMetrics` to understand logs generated by `CollectMultipleMetrics` (#215)
- Newlines in picard output. Fixed by @dakl
- Samtools
- Code refactored
- Rewrote the `samtools stats` code to display more stats in report with a beeswarm plot.
- Qualimap
- Rewritten to use latest methods and fix bugs.
- Added Percentage Aligned column to general stats for `BamQC` module.
- Extra table thresholds added by @avilella (hidden by default)
- General Statistics
- Some tweaks to the display defaults (FastQC, Bismark, Qualimap, SnpEff)
- Now possible to skip the General Statistics section of the report with `--exclude general_stats`
- Cutadapt module updated to recognise logs from old versions of cutadapt (<= v1.6)
- Trimmomatic
- Now handles `,` decimal places in percentage values.
- Can cope with line breaks in log files (see issue #212)
- FastQC refactored
- Now skips zip files if the sample name has already been found. Speeds up MultiQC execution.
- Code cleaned up. Parsing and data-structures standardised.
- New popovers on Pass / Warn / Fail status bars showing sample names. Fast highlighting and hiding.
- New column in General Stats (hidden by default) showing percentage of FastQC modules that failed.
- SnpEff
- Search pattern now more generic, should match reports from others.
- Counts by Effect plot removed (had hundreds of categories, was fairly unusable).
- `KeyError` bug fixed.
- Samblaster now gets sample name from `ID` instead of `SM` (@dakl)
- Bowtie 2
- Now parses overall alignment rate as intended.
- Now depends on even less log contents to work with more inputs.
- MethylQA now handles variable spacing in logs
- featureCounts now splits columns on tabs instead of whitespace, can handle filenames with spaces
- Galaxy: MultiQC now available in Galaxy! Work by @devengineson / @yvanlebras / @cmonjeau
- See it in the Galaxy Toolshed
- Heatmap: New plot type!
- Scatter Plot: New plot type!
- Download raw data behind plots in reports! Available in the Export toolbox.
- Choose from tab-separated, comma-separated and the complete JSON.
- Table columns can be hidden on page load (shown through Configure Columns)
- Defaults are configurable using the `table_columns_visible` config option.
- Beeswarm plot: Added missing rename / highlight / hiding functionality.
- New `-l`/`--file-list` option: specify a file containing a list of files to search.
- Updated HighCharts to v4.2.5. Added option to export to JPEG.
- Can now cancel execution with a single `ctrl+c` rather than having to button mash
- More granular control of skipping files during scan (filename, dirname, path matching)
- Fixed `--exclude` so that it works with directories as well as files
- New Clear button in toolbox to bulk remove highlighting / renaming / hiding filters.
- Improved documentation about behaviour for large sample numbers.
- Handle YAML parsing errors for the config file more gracefully
- Removed empty columns from tables again
- Fixed bug in changing module search patterns, reported by @lweasel
- Added timeout parameter to version check to prevent hang on systems with long defaults
- Fixed table display bug in Firefox
- Fixed bug related to order in which config files are loaded
- Fixed bug that broke the "Show only" toolbox feature with multiple names.
- Numerous other small bugs.
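The featureCounts change noted above (splitting columns on tabs instead of whitespace) matters when file names contain spaces. The sketch below shows the difference on a made-up two-line summary. The sample data and helper names are hypothetical, and the real module's parsing is more involved.

```python
# Made-up header and data row in a tab-separated summary layout
header = "Status\t/data/my sample 1.bam\t/data/sample2.bam"
row = "Assigned\t1200\t3400"


def split_on_whitespace(line):
    """Naive split: breaks file names that contain spaces."""
    return line.split()


def split_on_tabs(line):
    """Tab split: keeps each column intact, spaces included."""
    return line.rstrip("\n").split("\t")


def parse_counts(header_line, data_line):
    """Map each file name in the header to its integer count."""
    names = split_on_tabs(header_line)[1:]
    values = split_on_tabs(data_line)[1:]
    return {name: int(value) for name, value in zip(names, values)}


print(len(split_on_whitespace(header)), "columns when splitting on whitespace")
print(len(split_on_tabs(header)), "columns when splitting on tabs")
print(parse_counts(header, row))
```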
MultiQC v0.6 - 2016-04-29
- New Salmon module.
- New Trimmomatic module.
- New Bamtools stats module.
- New beeswarm plot type. General Stats table replaced with this when many samples in report.
- New RSeQC module: Actually a suite of 8 new modules supporting various outputs from RSeQC
- Rewrote bowtie2 module: Now better at parsing logs and tries to scrape input from wrapper logs.
- Made cutadapt show counts by default instead of obs/exp
- Added percentage view to Picard insert size plot
- Dynamic plots now update their labels properly when changing datasets and to percentages
- Config files now loaded from working directory if present
- Started new docs describing how each module works
- Refactored featureCounts module. Now handles summaries describing multiple samples.
- Stopped using so many hidden files. `.multiqc.log` now called `multiqc.log`
- New `-c`/`--config` command line option to specify a MultiQC configuration file
- Can now load run-specific config files called `multiqc_config.yaml` in working directory
- Large code refactoring - moved plotting code out of `BaseModule` and into new `multiqc.plots` submodules
- Generalised code used to generate the General Stats table so that it can be used by modules
- Removed interactive report tour, replaced with a link to a youtube tutorial
- Made it possible to permanently hide the blue welcome message for all future reports
- New option to smooth data for line plots. Avoids mega-huge plots. Applied to SnpEff, RSeQC, Picard.
Bugfixes:
- Qualimap handles infinity symbol (thanks @chapmanb )
- Made SnpEff less fussy about required fields for making plots
- UTF-8 file paths handled properly in Py2.7+
- Extending two config variables wasn't working. Now fixed.
- Dragging the height bar of plots now works again.
- Plots now properly change y axis limits and labels when changing datasets
- Flat plots now have correct path in `default_dev` template
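The release notes above mention a new option to smooth data for line plots so that plots do not become huge. The changelog does not say how the smoothing is done, so the following is only one plausible approach (averaging consecutive values into a fixed number of bins) and should not be read as MultiQC's algorithm. The function name and parameters are hypothetical.

```python
def smooth_points(values, max_points):
    """Reduce a series to at most max_points values by averaging bins."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(values)
    if n <= max_points:
        # Already small enough: return an unchanged copy
        return list(values)
    smoothed = []
    for i in range(max_points):
        # Integer arithmetic spreads the n values evenly over the bins
        start = i * n // max_points
        end = (i + 1) * n // max_points
        chunk = values[start:end]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


print(smooth_points([1, 2, 3, 4, 5, 6], 3))
print(len(smooth_points(list(range(1000)), 100)))
```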
MultiQC v0.5 - 2016-03-29
- New Skewer module, written by @dakl
- New Samblaster module, written by @dakl
- New Samtools stats module, written by @lpantano
- New HiCUP module
- New SnpEff module
- New methylQA module
- New "Flat" image plots, rendered at run time with MatPlotLib
- By default, will use image plots if > 50 samples (set in config as `plots_flat_numseries`)
- Means that very large numbers of samples can be viewed in reports. eg. single cell data.
- Templates can now specify their own plotting functions
- Use `--flat` and `--interactive` to override this behaviour
- MultiQC added to `bioconda` (with help from @dakl)
- New plugin hook: `config_loaded`
- Plugins can now add new command line options (thanks to @robinandeer)
- Changed default data directory name from `multiqc_report_data` to `multiqc_data`
- Removed support for deprecated MultiQC_OSXApp
- Updated logging so that a verbose `multiqc_data/.multiqc.log` file is always written
- Now logs more stuff in verbose mode - command used, user configs and so on.
- Added a call to multiqc.info to check for new versions. Disable with config `no_version_check`
- Removed general stats manual row sorting.
- Made filename matching use glob unix style filename match patterns
- Everything (including the data directory) is now created in a temporary directory and moved when MultiQC is complete.
- A handful of performance updates for large analysis directories
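The choice between flat and interactive plots described above (a threshold of 50 samples set by `plots_flat_numseries`, overridable with `--flat` and `--interactive`) can be pictured as a small decision function. This is a sketch under assumptions: the function and argument names are hypothetical, and rejecting both overrides at once is a choice made here, not documented behaviour.

```python
def choose_plot_mode(num_series, flat_threshold=50,
                     force_flat=False, force_interactive=False):
    """Return 'flat' or 'interactive' for a plot with num_series samples."""
    if force_flat and force_interactive:
        # Assumption: treat conflicting overrides as an error
        raise ValueError("choose only one of force_flat / force_interactive")
    if force_flat:
        return "flat"
    if force_interactive:
        return "interactive"
    # Default rule from the release notes: image plots if more than 50 samples
    return "flat" if num_series > flat_threshold else "interactive"


print(choose_plot_mode(12))
print(choose_plot_mode(400))
print(choose_plot_mode(400, force_interactive=True))
```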
MultiQC v0.4 - 2016-02-16
- New `multiqc_sources.txt` which identifies the paths used to collect all report data for each sample
- Export parsed data as tab-delimited text, `JSON` or `YAML` using the new `-k`/`--data-format` command line option
- Updated HighCharts from `v4.2.2` to `v4.2.3`, fixes tooltip hover bug.
- Nicer export button. Now tied to the export toolbox, hopefully more intuitive.
- FastQC: Per base sequence content heatmap can now be clicked to show line graph for single sample
- FastQC: No longer show adapter contamination datasets with <= 0.1% contamination.
- Picard: Added support for `CollectOxoGMetrics` reports.
- Changed command line option `--name` to `--filename`
- `--name` also used for filename if `--filename` not specified.
- Hide samples toolbox now has switch to show only matching samples
- New regex help box with examples added to report
- New button to copy general stats table to the clipboard
- General Stats table 'floating' header now sorts properly when scrolling
- Bugfix: MultiQC default_dev template now copies module assets properly
- Bugfix: General Stats table floating header now resizes properly when page width changes
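To give a feel for the parsed-data export formats mentioned above, here is a small sketch that writes the same per-sample dictionary as tab-delimited text or JSON using only the standard library. The function name, the `Sample` column header and the example metrics are hypothetical, and the layout of MultiQC's actual output files may differ. YAML is left out because it needs a third-party package.

```python
import json


def serialise(data, fmt="tsv"):
    """Serialise {sample: {metric: value}} as 'tsv' or 'json' text."""
    if fmt == "json":
        return json.dumps(data, indent=4, sort_keys=True)
    if fmt == "tsv":
        # Collect every metric seen in any sample, in a stable order
        columns = sorted({key for row in data.values() for key in row})
        lines = ["\t".join(["Sample"] + columns)]
        for sample in sorted(data):
            # Missing metrics become empty cells
            cells = [str(data[sample].get(col, "")) for col in columns]
            lines.append("\t".join([sample] + cells))
        return "\n".join(lines) + "\n"
    raise ValueError("unsupported format: {}".format(fmt))


parsed = {"s1": {"reads": 10, "dups": 2}, "s2": {"reads": 20}}
print(serialise(parsed, "tsv"))
print(serialise(parsed, "json"))
```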
MultiQC v0.3.2 - 2016-02-08
- All modules now load their log file search parameters from a config file, allowing you to overwrite them using your user config file
- This is useful if your analysis pipeline renames program outputs
- New Picard (sub)modules - Insert Size, GC Bias & HsMetrics
- New Qualimap (sub)module - RNA-Seq QC
- Made Picard MarkDups show percent by default instead of counts
- Added M-Bias plot to Bismark
- New option to stream report HTML to `stdout`
- Files can now be specified as well as directories
- New options to specify whether the parsed data directory should be created
- command line flags: `--data`/`--no-data`
- config option name: `make_data_dir`
- Fixed bug with incorrect path to installation dir config YAML file
- New toolbox drawer for bulk-exporting graph images
- Report side navigation can now be hidden to maximise horizontal space
- Mobile styling improved for narrow screen
- More vibrant colours in the general stats table
- General stats table numbers now left aligned
- Settings now saved and loaded to named localstorage locations
- Simplified interface - no longer global / single report saving
- Removed static file config. Solves JS error, no-one was doing this since we have standalone reports anyway.
- Added support for Python 3.5
- Fixed bug with module specific CSS / JS includes in some templates
- Made the 'ignore files' config use unix style file pattern matching
- Fixed some bugs in the FastQ Screen module
- Fixed some bugs in the FastQC module
- Fixed occasional general stats table bug
- Table sorting on sample names now works after renaming
- Bismark module restructure
- Each report type now handled independently (alignment / dedup / meth extraction)
- M-Bias plot now shows R1 and R2
- FastQC GC content plot now has option for counts or percentages
- Allows comparison between samples with very different read counts
- Bugfix for reports javascript
- Caused by an update to the remotely loaded HighCharts export script
- Export script now bundled with multiqc, so does not depend on internet connection
- Other JS errors fixed in this work
- Bugfix for older FastQC reports - handle old style sequence dup data
- Bugfix for varying Tophat alignment report formats
- Bugfix for Qualimap RNA Seq reports with paired end data
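The note above about the 'ignore files' config using unix style file pattern matching can be illustrated with Python's standard `fnmatch` module. This is a sketch only: the helper name and the example patterns are hypothetical, and whether MultiQC matches against the bare file name or the full path is not stated in these notes.

```python
import fnmatch


def is_ignored(filename, ignore_patterns):
    """Return True if filename matches any unix-style glob pattern."""
    # fnmatchcase keeps matching case-sensitive on every platform
    return any(fnmatch.fnmatchcase(filename, pattern)
               for pattern in ignore_patterns)


patterns = ["*.bam", "*.fastq.gz", "run?.tmp"]
for name in ["sample.bam", "sample_fastqc.zip", "run1.tmp"]:
    print(name, is_ignored(name, patterns))
```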
MultiQC v0.3.1 - 2015-11-04
- Hotfix patch to fix broken FastQC module (wasn't finding `.zip` files properly)
- General Stats table colours now flat. Should improve browser speed.
- Empty rows now hidden if appear due to column removal in general stats
- FastQC Kmer plot removed until we have something better to show.
MultiQC v0.3 - 2015-11-04
- Lots of lovely new documentation!
- Child templates - easily customise specific parts of the default report template
- Plugin hooks - allow other tools to execute custom code during MultiQC execution
- New Preseq module
- New design for general statistics table (snazzy new background bars)
- Further development of toolbox
- New button to clear all filters
- Warnings when samples are hidden, plus empty plots and table cols are hidden
- Active toolbar tab buttons are highlighted
- Lots of refactoring by @moonso to please the Pythonic gods
- Switched to click instead of argparse to handle command line arguments
- Code generally conforms to best practices better now.
- Now able to supply multiple directories to search for reports
- Logging output improved (now controlled by `-q` and `-v` for quiet and verbose)
- More HTML output dealt with by the base module, less left to the modules
- Module introduction text
- General statistics table now much easier to add to (new helper functions)
- Images, CSS and Javascript now included in HTML, meaning that there is a single report file to make sharing easier
- More accessible scrolling in the report - styled scrollbars and 'to top' button.
- Modules and templates now use setuptools entry points, facilitating plugins by other packages. Allows niche extensions whilst keeping the core codebase clean.
- The general stats table now has a sticky header row when scrolling, thanks to some new javascript wizardry...
- General stats columns can have a shared key which allows common colour schemes and data ranges. For instance, all columns describing a read count will now share their scale across modules.
- General stats columns can be hidden and reordered with a new modal window.
- Plotting code refactored, reports with many samples (>50 by default) don't automatically render to avoid freezing the browser.
- Plots with highlighted and renamed samples now honour this when exporting to different file types.
MultiQC v0.2 - 2015-09-18
- Code restructuring for nearly all modules. Common base module functions now handle many more functions (plots, config, file import)
- See the contributing notes for instructions on how to use these new helpers to make your own module
- New report toolbox - sample highlighting, renaming, hiding
- Config is autosaved by default, can also export to a file for sharing
- Interactive tour to help users find their way around
- New Tophat, Bowtie 2 and QualiMap modules
- Thanks to @guillermo-carrasco for the QualiMap module
- Bowtie module now works
- New command line parameter `-d` prefixes sample names with the directory that they were found in. Allows duplicate filenames without being overwritten.
- Introduction walkthrough helps show what can be done in the report
- Now compatible with both Python 2 and Python 3
- Software version number now printed on command line properly, and in reports.
- Bugfix: FastQC doesn't break when only one report found
- Bugfix: FastQC seq content heatmap highlighting
- Many, many small bugfixes
MultiQC v0.1 - 2015-09-01
- The first public release of MultiQC, after a month of development. Basic structure in place and modules for FastQC, FastQ Screen, Cutadapt, Bismark, STAR, Bowtie, Subread featureCounts and Picard MarkDuplicates. Approaching stability, though still under fairly heavy development.