MultiQC
D. Denisko
Introduction
Description
Installation and
general usage
Configuration
YAML file
Plots
Customize
Integrate with
pipelines
Modules
FastQC
deepTools
Custom content
Example
Coding with
MultiQC
Examples
Plots in
publications
Conclusion
Of interest
MultiQC: summarize analysis results for multiple tools and samples in a single report
Danielle Denisko
Tech Talk
April 17, 2019
Template from: www.overleaf.com
Outline
Introduction
Description
Installation and general usage
Configuration
YAML file
Plots
Customize
Integrate with pipelines
Modules
FastQC
deepTools
Custom content
Example
Coding with MultiQC
Examples
Plots in publications
Conclusion
Of interest
Introduction
Summary:
flexibly integrates quality control (QC) metrics from different tools into one HTML report
currently supports 73 tools
Feedback:
"Amazingly, MultiQC just works."
"MultiQC just blew my mind."
Description
Motivation:
most QC tools produce reports on a per-sample basis
batch effects can be subtle and difficult to detect
combining log files is time-consuming and repetitive
Some supported tools:
Pre-alignment: Cutadapt, FastQC, fastp, Trimmomatic
Alignment: Bismark, Bowtie 2, HiCUP, HISAT2, STAR, TopHat
Post-alignment: Bamtools, deepTools, GATK, HOMER, Picard, Samtools
Installation and general usage
To use:
command line
through Galaxy (usegalaxy.org)
Implementation:
Python 2.7+, 3.4+ or 3.5+
Jinja2 package to render final report
Figure 1: MultiQC homepage (multiqc.info).
Installation and general usage
To install:
pip install multiqc
conda install -c bioconda multiqc
git clone https://github.com/ewels/MultiQC.git
cd MultiQC
python setup.py install
Installation and general usage
Usage:
multiqc . # scans recursively for log files
Options:
ignore directories and samples (glob expansion)
specify file paths to scan
rename report or specify output directory
generate PDF report (via Pandoc)
save plots as standalone files
select modules to run (or to not run)
... and even more customization through the configuration file
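The options above can be combined in one call. The following is a minimal sketch, not part of the original talk, that assembles such a command line from Python. The helper name `build_multiqc_command` and the example directory and report names are hypothetical; the flags used (`--outdir`, `--filename`, `--ignore`, `--ignore-samples`, `--module`, `--exclude`, `--export`, `--pdf`) are assumed to match the MultiQC version you have installed, so check `multiqc --help` before relying on them.

```python
import subprocess


def build_multiqc_command(scan_dir=".", outdir=None, filename=None,
                          ignore=(), ignore_samples=(),
                          modules=(), exclude=(),
                          export_plots=False, pdf=False):
    """Assemble a MultiQC command line as a list of arguments (hypothetical helper)."""
    cmd = ["multiqc", scan_dir]
    if outdir:
        cmd += ["--outdir", outdir]          # specify output directory
    if filename:
        cmd += ["--filename", filename]      # rename report
    for pattern in ignore:
        cmd += ["--ignore", pattern]         # ignore directories/files (glob)
    for pattern in ignore_samples:
        cmd += ["--ignore-samples", pattern] # ignore sample names (glob)
    for name in modules:
        cmd += ["--module", name]            # only run these modules
    for name in exclude:
        cmd += ["--exclude", name]           # do not run these modules
    if export_plots:
        cmd.append("--export")               # save plots as standalone files
    if pdf:
        cmd.append("--pdf")                  # PDF report (needs Pandoc)
    return cmd


def run_multiqc(cmd):
    """Run the assembled command; raises if MultiQC exits with an error."""
    return subprocess.run(cmd, check=True)


# Example values below are made up for illustration.
cmd = build_multiqc_command(".", outdir="qc", filename="chipseq_report",
                            ignore=["tmp/*"], modules=["fastqc"])
print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to log or inspect the exact command before a pipeline executes it.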
Installation and general usage
Output:
multiqc_report.html
contains navigation menu and toolbox to edit report
Format:
general statistics table
interactive plots for various modules (Highcharts JavaScript library)
Customize with toolbox:
regex mode
highlight/colour, rename, hide samples
save settings in browser local storage
Configuration
Users can create a YAML configuration file (multiqc_config.yaml) and provide its path. Settings in this file may also be specified on the command line via --cl_config.
Configuration: multiqc_config.yaml
Clean sample names:
fn_clean_exts:
    - '.gz'
    - '.fastq'
Default type is truncate, but you can specify another type instead, e.g. regex or remove:
extra_fn_clean_exts:
    - type: regex
      pattern: '^processed\.'
extra_fn_clean_exts:
    - type: remove
      pattern: .sorted
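To make the cleaning rule types concrete, here is a small illustrative sketch of how the rules above could act on a file name. It is a re-implementation for explanation only, not MultiQC's own code: the function name `clean_sample_name` and the example file name are hypothetical, and the assumed semantics are that truncate cuts the name at the first match, remove deletes the matched string, and regex deletes whatever the pattern matches.

```python
import re


def clean_sample_name(name, rules):
    """Apply cleaning rules in order (illustrative, assumed semantics)."""
    for rule in rules:
        if isinstance(rule, str):
            # A bare string is treated as the default type: truncate.
            rule = {"type": "truncate", "pattern": rule}
        kind = rule.get("type", "truncate")
        pattern = rule["pattern"]
        if kind == "truncate":
            # Keep everything before the first occurrence of the pattern.
            name = name.split(pattern, 1)[0]
        elif kind == "remove":
            # Delete the literal string wherever it occurs.
            name = name.replace(pattern, "")
        elif kind == "regex":
            # Delete whatever the regular expression matches.
            name = re.sub(pattern, "", name)
        else:
            raise ValueError(f"unknown rule type: {kind}")
    return name


rules = [
    ".gz",
    ".fastq",
    {"type": "regex", "pattern": r"^processed\."},
    {"type": "remove", "pattern": ".sorted"},
]
cleaned = clean_sample_name("processed.sampleA.sorted.fastq.gz", rules)
print(cleaned)
```

With these assumed rules the hypothetical file name is reduced step by step to the bare sample identifier, which is the goal of sample name cleaning: log files from different tools end up under the same sample name.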
Configuration: multiqc_config.yaml
Ignore samples by name:
sample_names_ignore:
    - 'SRR*'
sample_names_ignore_re:
    - '^SR{2}\d{7}_1$'
Search for output files:
default search patterns are found in search_patterns.yaml
edit in config file to override
sp:
    mqc_module:
        fn: _mysearch.txt
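The difference between the glob-style and regex-style ignore lists above can be sketched in a few lines of Python. This is an illustration using the standard library, not MultiQC's implementation; the function name `is_ignored` and the sample names are hypothetical, and the regular expression is written with an escaped `\d`, which is assumed to be the intended pattern.

```python
import fnmatch
import re


def is_ignored(sample_name, globs=(), regexes=()):
    """Return True if the sample name matches any glob or any regex."""
    if any(fnmatch.fnmatchcase(sample_name, g) for g in globs):
        return True
    return any(re.search(rx, sample_name) for rx in regexes)


glob_rules = ["SRR*"]                 # as in sample_names_ignore
regex_rules = [r"^SR{2}\d{7}_1$"]     # as in sample_names_ignore_re

for name in ["SRR1234567_1", "SRR1234567_2", "ERR1234567_1"]:
    print(name,
          is_ignored(name, globs=glob_rules),
          is_ignored(name, regexes=regex_rules))
```

The glob is the blunter tool: it drops every sample starting with `SRR`, while the regex only drops the first mate of a run with exactly seven digits in its accession.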
Configuration
Plotting when there are many samples:
disable on-load plotting (automatic for > 50 samples)
flat plots (Matplotlib) rather than interactive plots (automatic for > 100 samples)
beeswarm plot instead of table (automatic for > 500 samples/rows)
Image source: http://www.cbs.dtu.dk/~eklund/beeswarm/.
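The three automatic thresholds can be summarized as a simple decision function. The sketch below only restates the numbers from this slide (50, 100, 500); the function name `choose_plot_settings` and the returned keys are hypothetical and do not correspond to MultiQC configuration options.

```python
def choose_plot_settings(n_samples, n_table_rows=None):
    """Mirror the automatic behaviour described on the slide (illustrative)."""
    if n_table_rows is None:
        n_table_rows = n_samples
    return {
        # Plots are drawn on page load only for up to 50 samples.
        "render_on_load": n_samples <= 50,
        # Above 100 samples, static (flat) images replace interactive plots.
        "plot_style": "flat" if n_samples > 100 else "interactive",
        # Above 500 rows, the table is shown as a beeswarm plot.
        "general_stats": "beeswarm" if n_table_rows > 500 else "table",
    }


for n in (20, 80, 200, 1000):
    print(n, choose_plot_settings(n))
```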
Customize reports
titles and description
bulk renaming of samples
comments above module sections
module order and section order
tables (column visibility, order, colour)
plots (number formatting, e.g. decimalPoint_format: ',')
Workflows
Easily integrates with:
Nextflow
Snakemake
Modules
FastQC
Description: Java tool for quality control of high-throughput sequencing data. Requires Picard SAM/BAM libraries. Input can be FASTQ, SAM, or BAM files.
per base sequence quality/content
per sequence quality scores/GC content/length
adapter and k-mer content, overrepresented sequences
Modules
FastQC
ATF4 wild-type ChIP-seq sample:
Modules
FastQC
Multiple ChIP-seq samples (MultiQC output):
Modules
deepTools
Description: suite of Python tools for analysis of high-throughput sequencing data.
Figure 2: deepTools flowchart (deeptools.readthedocs.io/).
Modules
deepTools
quality checks
format conversion and normalization
plotting
plotCoverage: Assess sequencing depth.
plotFingerprint: Can ChIP signal be separated from background signal?
Modules
deepTools plotCoverage
sample 1 million base pairs
count number of overlapping reads
plot frequency of found read coverages
report mean coverage per base pair (top right)
Figure 3: Example from deeptools.readthedocs.io.
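The steps above can be illustrated on toy data. The following sketch is not deepTools code: it assumes reads are given as half-open `(start, end)` intervals on a single made-up chromosome, and the function name `coverage_histogram` is hypothetical. It counts the reads overlapping each sampled position, tallies how often each coverage value occurs, and reports the mean coverage.

```python
import random
from collections import Counter


def coverage_histogram(reads, positions):
    """Count reads overlapping each sampled position (toy version).

    reads: iterable of (start, end) half-open intervals.
    positions: iterable of sampled base-pair positions.
    Returns (Counter mapping coverage -> number of positions, mean coverage).
    """
    depths = [sum(1 for start, end in reads if start <= pos < end)
              for pos in positions]
    histogram = Counter(depths)
    mean_depth = sum(depths) / len(depths)
    return histogram, mean_depth


# Deterministic toy example with hand-picked positions.
toy_reads = [(0, 10), (5, 15), (8, 12)]
toy_hist, toy_mean = coverage_histogram(toy_reads, [0, 5, 9, 12, 20])
print(dict(toy_hist), toy_mean)

# Random sampling, in the spirit of "sample 1 million base pairs"
# (far fewer positions here; genome length is made up).
rng = random.Random(0)
sampled = rng.sample(range(30), k=10)
hist, mean_depth = coverage_histogram(toy_reads, sampled)
```

The histogram is what the plot shows on its x/y axes (coverage versus fraction of sampled bases), and the mean is the summary value reported alongside it.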
Modules
deepTools plotCoverage
Figure 4: Only ATF4 samples. Figure 5: MultiQC.
Modules
deepTools plotFingerprint
randomly sample genomic regions of particular length
sum per-base coverage of BAM that overlaps with regions
sort according to rank
plot cumulative sum
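The ranking and cumulative-sum steps can be sketched as follows. This is an illustration on made-up per-region coverage sums, not the deepTools implementation, and the function name `fingerprint_curve` is hypothetical. The sampling and BAM-reading steps are assumed to have already produced one coverage sum per region.

```python
from itertools import accumulate


def fingerprint_curve(region_coverage):
    """Return the cumulative fraction of total coverage over ranked regions."""
    ranked = sorted(region_coverage)           # sort according to rank
    total = sum(ranked)
    if total == 0:
        raise ValueError("no coverage in any sampled region")
    return [running / total for running in accumulate(ranked)]


# Made-up coverage sums for five sampled regions.
curve = fingerprint_curve([6, 0, 2, 1, 1])
print(curve)
```

In this toy input one region holds most of the reads, so the curve stays low and then jumps at the end, which is the shape expected for a strongly enriched ChIP sample. Evenly distributed coverage would instead give a curve close to the diagonal.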
Modules
deepTools plotFingerprint
Figure 6: Examples of possible fingerprints for various histone marks (deeptools.readthedocs.io).
Modules
deepTools plotFingerprint
Figure 7: Only ATF4 samples. Figure 8: MultiQC.
Custom content
more restricted than standard modules
one plot per section
limited customization
not intended to be used with data from released tools
Custom content
Figure 9: YAML config for "example files" module.
Figure 10: Data file "example files Sample 1.txt".
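Since the figures are not reproduced here, the sketch below shows a different, simpler route to custom content than the YAML-config approach in the figures: a data file that carries its own configuration in commented header lines. It is written under assumptions. The `_mqc.tsv` file suffix and the header keys `id`, `section_name` and `plot_type` are believed to follow MultiQC's custom content conventions, but verify them against the documentation for your version. The helper name `write_custom_content`, the file name, and the data values are all made up.

```python
import tempfile
from pathlib import Path


def write_custom_content(directory, values):
    """Write a tab-separated custom content file with a commented header.

    values: mapping of sample name -> number (made-up data).
    """
    lines = [
        "# id: 'example_counts'",              # assumed header key
        "# section_name: 'Example counts'",    # assumed header key
        "# plot_type: 'bargraph'",             # one plot per section
    ]
    lines += [f"{name}\t{value}" for name, value in values.items()]
    path = Path(directory) / "example_counts_mqc.tsv"
    path.write_text("\n".join(lines) + "\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    out = write_custom_content(tmp, {"Sample_1": 374, "Sample_2": 229})
    written_name = out.name
    written_lines = out.read_text().splitlines()

print(written_name)
print("\n".join(written_lines))
```

Placing a file like this in the directory scanned by `multiqc .` is the intended usage; whether it is picked up depends on the installed version's search patterns.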
Coding with MultiQC
write in main MultiQC vs. stand-alone plugin
publicly available tool: add to main and contribute back via pull request
lint flag (--lint) gives warnings about things that are not optimally configured
choose from various plot types:
Example plots in publications
MultiQC has 244 citations.
Figure 11: Kim MS et al. (2018). Global gene expression profiling for fruit organs and pathogen infections in the pepper, Capsicum annuum L. Scientific Data, 5, 180103.
Conclusion
Pros:
simple command-line tool for generating quick plots
can summarize (and visualize) quality control metrics for thousands of samples at once
can customize all aspects of report both through the config file and the toolbox
plots are interactive
simple to add support for new modules (51 new modules supported since publication in 2016)
Cons:
some configuration adjustments needed depending on your versions of modules
Of interest
"collects and visualises data parsed by MultiQC across multiple runs"

More Related Content

PPTX
Dissertation defense
PDF
DockerとKubernetesをかけめぐる
ODT
Cross-compilation native sous android
PPTX
PHP Development Tools 2.0 - Success Story
PDF
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
PDF
Startup Containers in Lightning Speed with Lazy Image Distribution
PPTX
Docker @ FOSS4G 2016, Bonn
PDF
Automated Multiplatform Compilation and Validation of a Collaborative Reposit...
Dissertation defense
DockerとKubernetesをかけめぐる
Cross-compilation native sous android
PHP Development Tools 2.0 - Success Story
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
Startup Containers in Lightning Speed with Lazy Image Distribution
Docker @ FOSS4G 2016, Bonn
Automated Multiplatform Compilation and Validation of a Collaborative Reposit...

Similar to MultiQC: summarize analysis results for multiple tools and samples in a single report (20)

PPTX
Docker training
PDF
Usersguide
PDF
Tutorial contributing to nf-core
PPT
Bdd Net Frameworks
PDF
Accordion Pipelines - A Cloud-native declarative Pipelines and Dynamic workfl...
PPTX
ABCs of docker
PDF
Best Practices for Developing & Deploying Java Applications with Docker
PDF
Cssdoc
PPTX
Jenkins' shared libraries in action
PDF
Kubernetes Operability Tooling (GOTO Chicago 2019)
PDF
PDF
PDF
Anaconda Python KNIME & Orange Installation
PDF
Pharo GitLab Example: This is a simple Pharo Smalltalk pipeline example
PPT
Code Documentation. That ugly thing...
PDF
Deploying WSO2 Middleware on Containers
PDF
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
PDF
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
PPTX
Seattle Cassandra Users: An OSS Java Abstraction Layer for Cassandra
PPTX
Node js meetup
Docker training
Usersguide
Tutorial contributing to nf-core
Bdd Net Frameworks
Accordion Pipelines - A Cloud-native declarative Pipelines and Dynamic workfl...
ABCs of docker
Best Practices for Developing & Deploying Java Applications with Docker
Cssdoc
Jenkins' shared libraries in action
Kubernetes Operability Tooling (GOTO Chicago 2019)
Anaconda Python KNIME & Orange Installation
Pharo GitLab Example: This is a simple Pharo Smalltalk pipeline example
Code Documentation. That ugly thing...
Deploying WSO2 Middleware on Containers
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Seattle Cassandra Users: An OSS Java Abstraction Layer for Cassandra
Node js meetup
Ad

More from Hoffman Lab (20)

PPTX
Miller: A command-line tool for querying, shaping, and reformatting data files
PDF
GNU Parallel: Lab meeting—technical talk
PDF
TCRpower
PPTX
Efficient querying of genomic reference databases with gget
PPTX
WashU Epigenome Browser
PPTX
Wireguard: A Virtual Private Network Tunnel
PPTX
Plotting heatmap with matplotlib/seaborn
PPTX
Go Get Data (GGD)
PPTX
fastp: the FASTQ pre-processor
PPTX
R markdown and Rmdformats
PPTX
File searching tools
PPTX
Better BibTeX (BBT) for Zotero
PPTX
Awk primer and Bioawk
PPTX
Terminals and Shells
PPTX
BioRender & Glossary/Acronym
PPTX
Linters in R
PPTX
BioSyntax: syntax highlighting for computational biology
PPTX
Get Good With Git
PDF
Tech Talk: UCSC Genome Browser
PPTX
dreamRs: interactive ggplot2
Miller: A command-line tool for querying, shaping, and reformatting data files
GNU Parallel: Lab meeting—technical talk
TCRpower
Efficient querying of genomic reference databases with gget
WashU Epigenome Browser
Wireguard: A Virtual Private Network Tunnel
Plotting heatmap with matplotlib/seaborn
Go Get Data (GGD)
fastp: the FASTQ pre-processor
R markdown and Rmdformats
File searching tools
Better BibTeX (BBT) for Zotero
Awk primer and Bioawk
Terminals and Shells
BioRender & Glossary/Acronym
Linters in R
BioSyntax: syntax highlighting for computational biology
Get Good With Git
Tech Talk: UCSC Genome Browser
dreamRs: interactive ggplot2
Ad

Recently uploaded (20)

PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PDF
KodekX | Application Modernization Development
PDF
Unlocking AI with Model Context Protocol (MCP)
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
cuic standard and advanced reporting.pdf
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Encapsulation theory and applications.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
Spectroscopy.pptx food analysis technology
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
20250228 LYD VKU AI Blended-Learning.pptx
KodekX | Application Modernization Development
Unlocking AI with Model Context Protocol (MCP)
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Programs and apps: productivity, graphics, security and other tools
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
cuic standard and advanced reporting.pdf
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Per capita expenditure prediction using model stacking based on satellite ima...
Digital-Transformation-Roadmap-for-Companies.pptx
Mobile App Security Testing_ A Comprehensive Guide.pdf
Advanced methodologies resolving dimensionality complications for autism neur...
Encapsulation theory and applications.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Spectroscopy.pptx food analysis technology
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...

MultiQC: summarize analysis results for multiple tools and samples in a single report

  • 1. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest MultiQC: summarize analysis results for multiple tools and samples in a single report Danielle Denisko Tech Talk April 17, 2019 Template from: www.overleaf.com
  • 2. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Outline Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest
  • 3. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Introduction Summary: flexibly integrates quality control (QC) metrics from different tools into one HTML report currently supports 73 tools
  • 4. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Introduction Summary: flexibly integrates quality control (QC) metrics from different tools into one HTML report currently supports 73 tools Feedback: ”Amazingly, MultiQC just works.” ”MultiQC just blew my mind.”
  • 5. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Description Motivation: most QC tools produce reports on a per-sample basis batch effects can be subtle and difficult to detect combining log files is time consuming and repetitive
  • 6. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Description Motivation: most QC tools produce reports on a per-sample basis batch effects can be subtle and difficult to detect combining log files is time consuming and repetitive Some supported tools: Pre-alignment: Cutadapt, FastQC, Fastp, Trimmomatic Alignment: Bismark, Bowtie 2, HiCUP, HISAT2, STAR, TopHat Post-alignment: Bamtools, deepTools, GATK, HOMER, Picard, Samtools
  • 7. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage To use: command line through Galaxy (usegalaxy.org)
  • 8. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage To use: command line through Galaxy (usegalaxy.org) Implementation: Python 2.7+, 3.4+ or 3.5+ Jinja2 package to render final report Figure 1: MultiQC homepage (multiqc.info).
  • 9. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage To install: pip install multiqc conda install -c bioconda multiqc git clone https://github.com/ewels/MultiQC.git cd MultiQC python setup.py install
  • 10. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage Usage: multiqc . # scans recursively for log files Options: ignore directories and samples (glob expansion) specify file paths to scan rename report or specify output directory generate PDF report (via Pandoc) save plots as stand alone files select modules to run (or to not run) ... and even more customization through the configuration file
  • 11. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage Output: multiqc report.html contains navigation menu and toolbox to edit report
  • 12. MultiQC D. Denisko Introduction Description Installation and general usage Configuration YAML file Plots Customize Integrate with pipelines Modules FastQC deepTools Custom content Example Coding with MultiQC Examples Plots in publications Conclusion Of interest Installation and general usage Output: multiqc report.html contains navigation menu and toolbox to edit report Format: general statistics table interactive plots for various modules (HighCharts JavaScript library)
  • 13. Installation and general usage. Output: multiqc_report.html, which contains a navigation menu and a toolbox to edit the report. Format: general statistics table; interactive plots for various modules (HighCharts JavaScript library). Customize with toolbox: regex mode; highlight/colour, rename, hide samples; save settings in browser local storage.
  • 14. Configuration. Users can create a YAML configuration file (multiqc_config.yaml) and provide its path. Arguments in this file may also be specified via --cl_config.
  • 16. Configuration: multiqc_config.yaml. Clean sample names:
        fn_clean_exts:
          - '.gz'
          - '.fastq'
    Default is truncate, but you can specify replace instead.
        extra_fn_clean_exts:
          - type: regex
            pattern: '^processed\.'
        extra_fn_clean_exts:
          - type: remove
            pattern: .sorted
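To make the three cleaning types on this slide concrete, here is a small sketch of how such rules could act on a file name. It is an illustration under assumptions, not MultiQC's own code: the function `clean_sample_name` and the example file name are hypothetical, and the semantics assumed are that truncate cuts the name at the first occurrence of the pattern, remove deletes the literal pattern, and regex deletes whatever the expression matches.

```python
import re


def clean_sample_name(name, rules):
    """Apply cleaning rules in order (hypothetical helper, not MultiQC's API).

    Each rule is a dict with a 'pattern' and an optional 'type';
    the type defaults to 'truncate', as on the slide.
    """
    for rule in rules:
        kind = rule.get("type", "truncate")
        pattern = rule["pattern"]
        if kind == "truncate":
            # Keep only the text before the first occurrence of the pattern.
            name = name.split(pattern, 1)[0]
        elif kind == "remove":
            # Delete the literal pattern wherever it occurs.
            name = name.replace(pattern, "")
        elif kind == "regex":
            # Delete whatever the regular expression matches.
            name = re.sub(pattern, "", name)
        else:
            raise ValueError(f"unknown rule type: {kind}")
    return name


rules = [
    {"pattern": ".gz"},
    {"pattern": ".fastq"},
    {"type": "regex", "pattern": r"^processed\."},
    {"type": "remove", "pattern": ".sorted"},
]
print(clean_sample_name("processed.sample1.sorted.fastq.gz", rules))  # sample1
```

With these assumed semantics, truncation handles suffixes such as `.fastq.gz`, while the regex and remove types handle prefixes and mid-name tags that truncation would cut too aggressively.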
  • 18. Configuration: multiqc_config.yaml. Ignore files:
        sample_names_ignore:
          - 'SRR*'
        sample_names_ignore_re:
          - '^SR{2}\d{7}_1$'
    Search for output files: default search patterns are found in search_patterns.yaml; edit in config file to override:
        sp:
          mqc_module:
            fn: _mysearch.txt
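The difference between the glob pattern and the regular expression above is easiest to see on a few names. The sketch below uses Python's standard `fnmatch` and `re` modules to mimic the two kinds of rule; the function `matching_rules` and the sample names are hypothetical, and how MultiQC applies the patterns internally is an assumption here.

```python
import fnmatch
import re


def matching_rules(sample_name, globs=(), regexes=()):
    """Return the ignore patterns that match a sample name (hypothetical helper)."""
    hits = [g for g in globs if fnmatch.fnmatchcase(sample_name, g)]
    hits += [r for r in regexes if re.search(r, sample_name)]
    return hits


globs = ["SRR*"]
regexes = [r"^SR{2}\d{7}_1$"]  # 'S', two 'R's, seven digits, then '_1'

for name in ["SRR1234567_1", "SRR1234567_2", "ERR1234567_1"]:
    by_glob = bool(matching_rules(name, globs=globs))
    by_regex = bool(matching_rules(name, regexes=regexes))
    print(f"{name}: glob={by_glob} regex={by_regex}")
```

The glob catches every name that starts with `SRR`, whereas the regular expression only catches first-mate files with exactly seven digits, so the regex form is the one to reach for when only part of a run should be dropped.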
  • 19. Configuration. Plotting when there are many samples: disable on-load plotting (automatic for > 50 samples); flat plots (Matplotlib) rather than interactive plots (automatic for > 100 samples); beeswarm plot instead of table (automatic for > 500 samples/rows). Image source: http://www.cbs.dtu.dk/~eklund/beeswarm/.
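The three automatic switches on this slide amount to a simple threshold rule. As a rough sketch, using only the thresholds quoted above (50, 100 and 500), the decision could be written as follows; the function `plot_settings` and the keys of the returned dictionary are hypothetical names, not MultiQC configuration options.

```python
def plot_settings(n_samples, n_table_rows=None):
    """Pick report rendering defaults from the sample count (hypothetical helper).

    Thresholds are the ones quoted on the slide: more than 50, 100 and 500.
    """
    rows = n_samples if n_table_rows is None else n_table_rows
    return {
        # Plots are drawn when the page loads only for small reports.
        "render_on_load": n_samples <= 50,
        # Flat (static image) plots replace interactive ones for large reports.
        "plot_style": "flat" if n_samples > 100 else "interactive",
        # Very long tables are swapped for a beeswarm plot.
        "table_style": "beeswarm" if rows > 500 else "table",
    }


for n in (10, 75, 150, 600):
    print(n, plot_settings(n))
```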
  • 20. Customize reports: titles and description; rename samples in bulk; comments above module sections; module order and section order; tables (column visibility, order, colour); plots (number formatting, e.g. decimalPoint_format: ',').
  • 21. Workflows. Easily integrates with: Nextflow; Snakemake.
  • 22. Modules: FastQC. Description: Java tool for quality control in high-throughput sequencing data. Requires Picard SAM/BAM libraries. Input can be FastQ, SAM, or BAM files. Reports: per base sequence quality/content; per sequence quality scores/GC content/length; adapter and k-mer content, overrepresented sequences.
  • 23. Modules: FastQC. ATF4 wild-type ChIP-seq sample:
  • 24. Modules: FastQC. Multiple ChIP-seq samples (MultiQC output):
  • 25. Modules: deepTools. Description: suite of Python tools for analysis of high-throughput sequencing data. Figure 2: deepTools flowchart (deeptools.readthedocs.io/).
  • 26. Modules: deepTools. Quality checks; format conversion and normalization; plotting. plotCoverage: Assess sequencing depth. plotFingerprint: Can ChIP signal be separated from background signal?
  • 27. Modules: deepTools plotCoverage. Sample 1 million base pairs; count number of overlapping reads; plot frequency of found read coverages; report mean coverage per base pair (top right). Figure 3: Example from deeptools.readthedocs.io.
  • 28. Modules: deepTools plotCoverage. Figure 4: Only ATF4 samples. Figure 5: MultiQC.
  • 29. Modules: deepTools plotFingerprint. Randomly sample genomic regions of particular length; sum per-base coverage of BAM that overlaps with regions; sort according to rank; plot cumulative sum.
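The last two plotFingerprint steps, ranking and cumulative summing, are what give the curve its shape. The sketch below assumes the per-region coverage sums have already been computed and only shows that final transformation; the function `fingerprint` and the toy numbers are hypothetical and this is not the deepTools implementation.

```python
from itertools import accumulate


def fingerprint(region_coverages):
    """Turn per-region coverage sums into fingerprint points (toy sketch).

    Returns (rank fraction, cumulative fraction of total coverage) pairs.
    """
    ranked = sorted(region_coverages)      # sort regions by coverage
    total = sum(ranked)
    if total == 0:
        raise ValueError("no coverage in any sampled region")
    cumulative = accumulate(ranked)        # running sum over the ranked regions
    n = len(ranked)
    return [((i + 1) / n, c / total) for i, c in enumerate(cumulative)]


uniform = fingerprint([5, 5, 5, 5])    # input-like: reads spread evenly
enriched = fingerprint([0, 0, 0, 10])  # ChIP-like: reads piled into one region
print(uniform)
print(enriched)
```

Evenly spread coverage gives points on the diagonal, while coverage concentrated in a few regions keeps the curve flat until the highest ranks, which is the visual separation between ChIP signal and background that the plot is meant to show.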
  • 30. Modules: deepTools plotFingerprint. Figure 6: Examples of possible fingerprints for various histone marks (deeptools.readthedocs.io).
  • 31. Modules: deepTools plotFingerprint. Figure 7: Only ATF4 samples. Figure 8: MultiQC.
  • 32. Custom content: more restricted than standard modules; one plot per section; limited customization; not intended to be used with data from released tools.
  • 33. Custom content. Figure 9: YAML config for "example files" module. Figure 10: Data file "example files Sample 1.txt".
  • 34. Coding with MultiQC: write in main MultiQC vs. stand-alone plugin; for a publicly available tool, add to main and contribute back via pull request; the lint flag (--lint) gives warnings about things that are not optimally configured; choose from various plot types:
  • 35. Example plots in publications. MultiQC has 244 citations. Figure 11: Kim MS et al. (2018). Global gene expression profiling for fruit organs and pathogen infections in the pepper, Capsicum annuum L. Scientific Data, 5, 180103.
  • 42. Conclusion. Pros: simple command line tool for generating quick plots; can summarize (and visualize) quality control metrics for thousands of samples at once; can customize all aspects of the report through both the config file and the toolbox; plots are interactive; simple to add support for new modules (51 new modules supported since publication in 2016). Cons: some configuration adjustments needed depending on your versions of modules.
  • 43. Of interest: "collects and visualises data parsed by MultiQC across multiple runs"