Reframing Phylogenetics
Unbiased comparative methods for environmental metagenomics sampling
Joe Parker
Outline
• My background & track record
• Environmental metagenomics – existing problems
• Phylogenomics adds maximum value to datasets
• Illustrative study
Joe Parker: Novel methods for cutting-edge science
High-throughput phylogenomics
Parallelised analyses
Bayesian statistics
Information-theoretic measures
NGS datasets
Integrating clinical, genetic & molecular data
Machine-learning and antigen modeling
BaTS software
>100 citations
‘000s downloads
HADPACK framework
in silico HIV vaccine design
Clinical trial
ABCDet API
First genomic convergent evolution demonstration
Nature Oct 2013
Public alpha
Throughput & Access
[Diagram labels: Users; Developers]
Mason et al. (2014) Metagenomics reveals sediment microbial community response to Deepwater Horizon oil spill.
The ISME J (epub ahead of print; 23rd Jan 2014; retrieved 1st Mar 2014): doi:10.1038/ismej.2013.254
Metagenomics of an environmental disaster
Comparative approaches
Homology
Reframing Phylogenomics
Deepwater Horizon, revisited
Continuous analyses with immediate results
Iterative sample collection / analysis; rapid cline detection
Exploit phylogenetic methods
Detect: population dynamics, adaptive evolution, migration
Facilitate NGS
Gene functions and ecosystem services
Explicitly model errors
Account for paralogy & horizontal transfer
Reduce ascertainment bias
Unbiased taxon / gene discovery
Dr. Joe Parker
Dr. Elizabeth Clare: Environmental metagenomics
Dr. Steve Rossiter: Phylogenomics
Prof. Richard Nichols: Population genetics
Prof. Steve Lloyd: Parallel computing
Prof. Mark Trimmer: Biogeochemistry
Dr. Jon Grey: Aquatic ecology
Prof. Alfried Vogler (NHM): Metagenomics & turbotaxonomy
Mr. Tim Booth (NEBC): Bio-Linux & virtual machines
Prof. Jonathan Eisen (US): Microbial phylogenomics
Prof. Alexei Drummond (NZ): Bayesian phylogenetics, Geneious CSO
Dr. Matthew Hahn (US): Genomics
Dr. Aris Katzourakis (Oxford): Phylodynamics modelling
GridPP HTC: 3,000+ cores
MidPlus HPC: 2,000+ cores
Genome Centre: Sequencing expertise
Year | Activity | Goal | Publication and/or software release
Y1 | Port existing tools | Build framework | Phylogenomics tools; Runtime visualisation; Taxonomic assignment; Sitewise diversity
Y2 | I/O & GUI, review | Develop framework | Visualisation tools; Review literature, develop theory; Non-standard genetic codes; Raft-aligned reads (RAR) demonstration
Y3 | RARs, phylo stress | Anhomologous phylogenetics | Asynchronous phylogenomics; RARs MPI implementation
Y4 | Core method | Integrate research | Core method demonstration, including agent-based computation; Core method released
Y5 | Alpha releases | Deploy | Stable releases ported to Geneious, CLC, Galaxy; Final report



Editor's Notes

  • #2: RB: more explanation of basic ideas. RK: not here – arctic microbes slide. RB: ok.
  • #3: Me, problem, solution: my track record and why I can take this field forward; current analyses in environmental metagenomics are falling short; why phylogenetics adds value; demonstration.
  • #4: Throughout my career: track record of novel models, implemented in apps for others, doing cutting-edge science. BaTS: >100 cites, thousands of downloads, weekly/daily user contact. HADPACK: initiated entirely novel HIV analysis / vaccine design with machine learning, phylogenetics, GUI. Current work: package for high-throughput phylogenomics, detected convergent evolution (Nature).
  • #5: Throughput: usually thought of in terms of sequencing. Analysis: not limited by CPU; the intersection of able developers who are also users is very small. Access drives impact. Fundamental to my goals. Distributed / cloud infrastructures: no bar to entry. MinION etc. exacerbate this.
  • #6: 100s I could pick, this is one. Typical example of an environmental metagenomics question: what are the oil spill's effects on marine microbes? Sediment cores, 50 sites, single gene, handful of genomes. MDS could distinguish some signal with geochemical variables; found some taxa, some new. How many more new? Similarity-based. Slow. Sequences embody information, including important information on adaptation etc., and it is wasted. *** Deepwater Horizon (DWH) oil spill, spring 2010: ~4.1 million barrels of oil to the Gulf of Mexico; >22% of this oil is unaccounted for. 64 sites by targeted sequencing of 16S rRNA genes; shotgun metagenomic sequencing of 14 samples. 16S rRNA: most heavily oil-impacted sediments enriched in an uncultured Gammaproteobacterium and a Colwellia species, both of which were highly similar to sequences in the DWH deep-sea hydrocarbon plume. The primary drivers in structuring the microbial community were nitrogen and hydrocarbons. Annotation of unassembled metagenomic data revealed the most abundant hydrocarbon degradation pathway encoded genes involved in degrading aliphatic and simple aromatics via butane monooxygenase. Further, analysis of metagenomic sequence data revealed an increase in abundance of genes involved in denitrification pathways in samples that exceeded the Environmental Protection Agency (EPA)’s benchmarks for polycyclic aromatic hydrocarbons (PAHs) compared with those that did not. Importantly, these data demonstrate that the indigenous sediment microbiota contributed an important ecosystem service for remediation of oil in the Gulf. However, PAHs were more recalcitrant to degradation, and their persistence could have deleterious impacts on the sediment ecosystem.
  • #7: Given observed microbial diversity, phylogeny reveals evolutionary history: was a trait acquired once, or multiple times? Biologically significant…
  • #8: Why isn't there more phylogenetics in environmental microbiology? Orthology assumptions from classical phylogenetics. Simple case: genes are defined as orthologous when gene and species histories are identical; genes = taxa, and vice versa. Gene duplications give rise to paralogous copies, which may confuse, especially with similarity matching. Secondary copies or deletions screw things up more. In microbial communities: horizontal transfer!
  • #9: This is a COMPLETELY NOVEL approach. Continuous analysis, agent-based; outputs instantly, with increasing resolution. How I envisage it working:
    [1] Collection of short-read environmental metagenomic sequences, low complexity.
    [2] Reads are tiled into pseudo-assemblies by similarity clustering; these may be chimeric, and may be orthologous or paralogous. I CALL THESE RAFT-ALIGNED READS, and this step CRYSTALLISATION. Each raft is handled by an agent: increased local order, still globally disordered.
    [3] We can compute phylogenetic measures along sliding windows within a raft. These measure the coherence of the evolutionary signal along the raft.
    [4] Areas of great incoherence I CALL PHYLOGENETIC STRESS; these might correspond to chimeric reads, e.g. other taxa, paralogues, horizontal transfer.
    [5] Agents can compare stress values and attempt to exchange reads, in proportion to stress. I CALL THIS DISLOCATION.
    [6] Iteration towards a maximally globally ordered state.
  • #10: More taxa / genes. Full evolutionary information extracted. Explicit modelling. NGS-ready. Fast / instant.
  • #11: Compute / sequencing resources. QM experts, collaborators & mentors. International collaborators.
  • #12: Leave it there for questions. More taxa / genes. Full evolutionary information extracted. Explicit modelling. NGS-ready. Fast / instant.
  • #13: Research programme, not an engineering project
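
Steps [3] and [4] of note #9 (sliding-window measures along a raft, with incoherent regions flagged as phylogenetic stress) can be illustrated with a minimal Python sketch. The notes do not say which phylogenetic measure is intended, so this sketch substitutes a deliberately crude stand-in: the mean pairwise p-distance between reads in each window. The raft representation (equal-length aligned strings, with "-" marking positions a read does not cover) and the names `p_distance`, `window_stress` and `most_stressed_window` are all hypothetical, chosen for illustration only.

```python
from itertools import combinations

GAP = "-"  # assumed marker for positions a read does not cover


def p_distance(a, b):
    """Proportion of differing sites among positions covered by both reads.

    Returns None when the two reads share no covered positions.
    """
    shared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)


def window_stress(raft, window, step):
    """Slide a window along the raft; score each window by mean pairwise p-distance.

    This is a stand-in for a real phylogenetic coherence measure:
    a high score means the reads in that window disagree strongly.
    """
    length = len(raft[0])
    scores = []
    for start in range(0, length - window + 1, step):
        dists = []
        for a, b in combinations(raft, 2):
            d = p_distance(a[start:start + window], b[start:start + window])
            if d is not None:
                dists.append(d)
        mean = sum(dists) / len(dists) if dists else 0.0
        scores.append((start, mean))
    return scores


def most_stressed_window(raft, window, step):
    """Return (start, score) for the window with the highest stand-in stress."""
    return max(window_stress(raft, window, step), key=lambda item: item[1])


if __name__ == "__main__":
    # Toy raft: the third read diverges in its last four positions,
    # as a chimeric or paralogous read might.
    raft = [
        "ACGTACGTACGT",
        "ACGTACGTACGT",
        "ACGTACGTTTTT",
    ]
    for start, score in window_stress(raft, window=4, step=4):
        print(start, score)
```

In the scheme the notes describe, a score like this would be the quantity agents compare before attempting to exchange reads (step [5]); a real implementation would presumably replace the p-distance with a tree-based or likelihood-based measure.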