João André Carriço,
Microbiology Institute and Instituto de Medicina Molecular,
Faculty of Medicine, University of Lisbon
jcarrico@fm.ul.pt twitter: @jacarrico
ME081 – Meet-The-Expert Session
26th ECCMID, Amsterdam, Netherlands
7-12 April 2016
• This presentation is not intended to cover all available software or databases (we would need several weeks or months to do that)
• I'll present what I use or intend to use in the near future
• I gladly accept any suggestions to include in similar presentations in the future
• It is supposed to be interactive, so ask away during the presentation
• Available Databases
  - Virulence Factors and AMR DBs
  - Sequence-based typing databases: Pubmlst.org / Enterobase
• High-Throughput Sequencing data analysis (freeware)
  - Prokka
  - Roary
  - Nullarbor
  - Microreact.org
  - PHYLOViZ
• Commercial Solutions
  - Bionumerics 7.5
  - CLC Genomics Workbench (CLC Bio)
  - Ridom Seqsphere+
Virulence Factor Databases
• VFDB (http://www.mgc.ac.cn/VFs/main.htm)
• Pathosystems Resource Integration Center (PATRIC) VF (https://www.patricbrc.org/)
• Victors (http://www.phidias.us/victors/)
• PHI-Base (http://www.phi-base.org/)
• MvirDB (http://mvirdb.llnl.gov/)
To know more:
- Presentation in the session "Controversies in interpreting whole genome sequence data":
  http://eccmidlive.org/#resources/how-can-we-design-actionable-virulome-databases
• Comprehensive Antibiotic Resistance Database (CARD) (https://card.mcmaster.ca/)
• Repository of Antibiotic resistance Cassettes (RAC) (http://rac.aihi.mq.edu.au/rac/)
• INTEGRALL: The Integron Database (http://integrall.bio.ua.pt/)
(…)
To know more:
http://www.slideshare.net/nickloman/eccmid-2015-so-i-have-sequenced-my-genome-what-now
[Workflow diagram: reads (fastq files) -> de novo assembler -> contigs (fasta files) -> Prokka -> annotated contigs (gbk/gff files) -> Roary (pan-genome analysis). Other tools shown: Enterobase, BIGSdb, Nullarbor, PHYLOViZ (tree + metadata visualization), Microreact.org (tree + metadata + visualization).]
http://www.pubmlst.org
http://bigsdb.web.pasteur.fr/
slide by @happy_khan
Martin Sergeant
Mark Achtman
Nabil-Fareed Alikhan
Zhemin Zhou
• Genome annotation made easy, by Torsten Seemann (slides by Torsten)
• Genome annotation: adding biological information to the sequence by describing features
To know more:
http://www.slideshare.net/torstenseemann/prokka-rapid-bacterial-genome-annotation-abphm-2013
Available at: https://github.com/tseemann/prokka
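To make "describing features" concrete, here is a minimal sketch of how one annotated feature can be read from a GFF3 file, the format that appears as "gff files" elsewhere in this deck. It relies only on the standard nine tab-separated GFF3 columns. The function name `parse_gff_line` and the example line are hypothetical and are not taken from Prokka or its output.

```python
from urllib.parse import unquote

# Names of the first eight GFF3 columns; the ninth holds key=value attributes.
GFF_COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def parse_gff_line(line):
    """Parse one GFF3 feature line into a dict; return None for comments or blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    feature = dict(zip(GFF_COLUMNS, fields[:8]))
    feature["start"] = int(feature["start"])  # 1-based, inclusive
    feature["end"] = int(feature["end"])
    attributes = {}
    for pair in fields[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = unquote(value)  # GFF3 percent-encodes reserved characters
    feature["attributes"] = attributes
    return feature


# Made-up feature line, for illustration only.
example = "contig_1\tExample\tCDS\t100\t400\t.\t+\t0\tID=GENE_00001;product=hypothetical protein"
feature = parse_gff_line(example)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["product"])
```

Each parsed record carries the coordinates, strand and attributes that together describe one feature on a contig. The sketch handles single feature lines only, not whole files.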
• Pan-genome analysis, by Andrew Page
• Available at: https://sangerpathogens.github.io/Roary/
[Diagram: the pan-genome comprises the core genome and the accessory genome]
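The relationship between the three terms can be sketched with plain set operations. This is a toy illustration with a strict definition of "core" (present in every strain), not how Roary itself clusters genes. The strain names, gene names and the function `pan_genome_partition` are made up.

```python
def pan_genome_partition(genes_by_strain):
    """Return (pan, core, accessory) gene sets for a dict of strain -> gene names."""
    gene_sets = [set(genes) for genes in genes_by_strain.values()]
    pan = set().union(*gene_sets)  # every gene seen in any strain
    core = set.intersection(*gene_sets) if gene_sets else set()  # genes in all strains
    accessory = pan - core  # genes missing from at least one strain
    return pan, core, accessory


# Hypothetical strains and gene names.
strains = {
    "strainA": {"gyrB", "recA", "toxX"},
    "strainB": {"gyrB", "recA", "resY"},
    "strainC": {"gyrB", "recA"},
}
pan, core, accessory = pan_genome_partition(strains)
print(sorted(pan), sorted(core), sorted(accessory))
```

In practice genes first have to be grouped into clusters of equivalent sequences across strains. The sketch assumes that grouping has already been done and that gene names identify the clusters.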
• Inputs: annotated de novo assemblies (GFF files)
  • Typically from the annotation pipeline
• Outputs:
  • Spreadsheet with presence and absence of genes
  • Multi-FASTA alignment of core genes, so you can build a tree without a reference
  • Multi-FASTA alignments for each gene
  • Plots for the open/closed genome, unique genes
  • Integrates with iCANDY so you can visualise all structural variation
  • QC report from Kraken to help identify suspect samples
(Slide by Andrew Page)
Core (n or n-1 strains)
Soft-Core (n-2 or n-3 strains)
Shell (8(?) to n-3 strains)
Cloud (<8(?) strains)
Core genome: Core + Soft-Core
Accessory genome: Shell + Cloud
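The categories on this slide can be written as a small classification rule. The sketch below follows the thresholds as shown, with two assumptions that are not in the slide: the cloud/shell boundary of 8 strains (marked "(?)" on the slide) is kept as an adjustable parameter, and a gene present in exactly n-3 strains, which the slide lists under both Soft-Core and Shell, is assigned to Soft-Core. The function names are hypothetical and this is not Roary's implementation.

```python
def classify_gene(n_present, n_strains, cloud_below=8):
    """Assign a gene to core / soft-core / shell / cloud by how many strains carry it."""
    if not 0 < n_present <= n_strains:
        raise ValueError("n_present must be between 1 and n_strains")
    if n_present >= n_strains - 1:
        return "core"        # n or n-1 strains
    if n_present >= n_strains - 3:
        return "soft-core"   # n-2 or n-3 strains (n-3 overlap resolved here: assumption)
    if n_present >= cloud_below:
        return "shell"       # from cloud_below up to just under soft-core
    return "cloud"           # fewer than cloud_below strains


def genome_part(category):
    """Map a category to the two-way split used on the slide."""
    return "core genome" if category in ("core", "soft-core") else "accessory genome"


# Example with a hypothetical collection of 100 strains.
for count in (100, 98, 50, 3):
    category = classify_gene(count, 100)
    print(count, category, genome_part(category))
```

With very small collections the fixed cut-offs leave little or no room for the shell category, which is one reason to treat the numbers as tunable.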
iCANDY output of presence and absence of genes in the accessory genome:
S. Weltevreden & public S. enterica genomes
(Slide by Andrew Page)
• Complete pipeline from reads to reports, by Torsten Seemann
• The objective is to automate analysis for everyday use in public health labs and research settings
• Uses and distills the outputs of many software tools
• Available at: https://github.com/tseemann/nullarbor
Slide by Torsten Seemann
From: https://github.com/tseemann/nullarbor
Slides by Torsten Seemann
www.phyloviz.net
Inputs:
- Tab-separated txt (profiles)
- FASTA files
- Automatic database retrieval (MLST)
Outputs:
• goeBURST and goeBURST MST
• Link quality assessment
• High-quality images
Can be easily applied to:
- MLST / cgMLST / wgMLST
- MLVA
- SNP data*
- Gene presence/absence
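As a rough illustration of what happens between a tab-separated profile file and a tree, the sketch below reads allelic profiles, measures the distance between two profiles as the number of differing loci (Hamming distance), and builds a minimum spanning tree with Prim's algorithm. This is a plain MST with alphabetical tie-breaking, not goeBURST, which applies its own tie-break rules. The column layout (identifier first, loci after), the data and the function names are assumptions made for the example.

```python
import csv
import io


def read_profiles(tsv_text):
    """Read tab-separated profiles: header row, then ID followed by one allele per locus."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # skip the header row
    return {row[0]: tuple(row[1:]) for row in reader if row}


def hamming(a, b):
    """Number of loci at which two profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns edges as (distance, from_id, to_id)."""
    ids = sorted(profiles)
    if not ids:
        return []
    in_tree = {ids[0]}
    edges = []
    while len(in_tree) < len(ids):
        # Cheapest link from the tree to any node outside it; ties broken by ID order.
        best = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in ids
            if v not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


# Made-up four-locus profiles.
tsv = "ST\tlocus1\tlocus2\tlocus3\tlocus4\nST1\t1\t1\t1\t1\nST2\t1\t1\t1\t2\nST3\t1\t1\t3\t2\nST4\t4\t4\t1\t1\n"
profiles = read_profiles(tsv)
mst_edges = minimum_spanning_tree(profiles)
print(mst_edges)
```

The same distance works for any categorical profile (MLST, cgMLST/wgMLST, MLVA, gene presence/absence), which is why one tree-building approach can serve all of these data types. The nested search is quadratic per step and only suitable for small examples.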
New features:
• Hierarchical clustering
• Neighbor-Joining
• Project Saving
• Available at http://online.phyloviz.net
• Web-based version of PHYLOViZ
• Allows users to create their own datasets, save them, and share their data (privately or publicly)
• REST API available
• Scalable to thousands of nodes
• Tree analysis tools:
  - Interactive distance matrix
  - NLV graph
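The idea behind an NLV graph can be sketched as follows, assuming NLV stands for "n-locus variant": instead of keeping only the tree links, every pair of profiles that differ at no more than n loci is connected. The function name `nlv_graph` and the data are hypothetical, and this is not PHYLOViZ Online's code.

```python
from itertools import combinations


def nlv_graph(profiles, max_diff):
    """Return edges (id_a, id_b, distance) for all pairs differing at <= max_diff loci."""
    edges = []
    for u, v in combinations(sorted(profiles), 2):
        if len(profiles[u]) != len(profiles[v]):
            raise ValueError("profiles must have the same number of loci")
        distance = sum(a != b for a, b in zip(profiles[u], profiles[v]))
        if distance <= max_diff:
            edges.append((u, v, distance))
    return edges


# Made-up four-locus profiles.
profiles = {
    "ST1": (1, 1, 1, 1),
    "ST2": (1, 1, 1, 2),
    "ST3": (1, 1, 3, 2),
    "ST4": (4, 4, 1, 1),
}
slv_edges = nlv_graph(profiles, 1)  # single-locus variants only
dlv_edges = nlv_graph(profiles, 2)  # up to double-locus variants
print(slv_edges)
print(dlv_edges)
```

Raising `max_diff` adds links that a spanning tree would hide, which helps to judge how well supported each tree link is among near-identical alternatives.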
Slide by @happy_khan
Computational Resources In Infectious Disease
[Screenshot labels: NLV graph, tree cut-off, full MST]
[Screenshot labels: create selections, change tree options]
• Available at http://microreact.org/
• Presentation in the session "Harnessing whole genome sequence data for public health applications": Novel open access tools for WGS-based pathogen surveillance and the identification of high-risk clones
• http://eccmidlive.org/#resources/novel-open-access-tools-for-wgs-based-pathogen-surveillance-and-the-identification-of-high-risk-clones
• Ridom Seqsphere+: http://www.ridom.de/seqsphere/
• Applied Maths Bionumerics 7.6: http://www.applied-maths.com/bionumerics
• CLC bio Genomics Workbench: http://www.clcbio.com/blog/clc-genomics-workbench-7-5/
• Huge variety of software and database solutions
• There is no single one-size-fits-all solution (job security for bioinformaticians)
• Different questions require different approaches
• Always question the results and data provenance
• ECCMID 2015 Meet-the-Expert session on "What bioinformatic tools should I use for analysis of High-Throughput Sequencing data for molecular diagnostics?"
  - Nick Loman: http://www.slideshare.net/nickloman/eccmid-2015-meettheexpert-bioinformatics-tools
  - João André Carriço: http://www.slideshare.net/joaoandrecarrico/eccmid-meet-theexpert2015
• UMMI Members
  - Bruno Gonçalves
  - Mário Ramirez
  - José Melo-Cristino
• INESC-ID
  - Alexandre Francisco
  - Cátia Vaz
  - Marta Nascimento
• EFSA INNUENDO Project (https://sites.google.com/site/innuendocon/)
  - Mirko Rossi
• FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/):
  - Dag Harmsen (Univ. Muenster)
  - Stefan Niemann (Research Center Borstel)
  - Keith Jolley, James Bray and Martin Maiden (Univ. Oxford)
  - Joerg Rothganger (RIDOM)
  - Hannes Pouseele (Applied Maths)
• Genome Canada IRIDA (Integrated Rapid Infectious Disease Analysis) project (www.irida.ca)
  - Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC)
  - Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC)
  - Fiona Brinkman (SFU)
  - William Hsiao (BCCDC)
