Genome annotation with open source
software: Apollo, JBrowse, and GO in Galaxy
Nathan A. Dunn (1), Colin Diesh (2), Helena Rasche (3), Anthony Bretaudeau (4), Rob Buels (2), Ian Holmes (2)
(1) Lawrence Berkeley National Laboratory, Berkeley, CA, (2) Department of Bioengineering, Berkeley, CA, (3) University of Freiburg, (4) INRA, BIPAA/GenOuest, France
https://github.com/GMOD/Apollo/
@bbop_apollo
nathandunn@lbl.gov
http://bit.ly/genomes-in-action-nathan-dunn-2020
What is Genome Annotation?
Structural Annotation (what they look like):
● position and strand of genomic elements
● isoforms, exons, introns, coding region / UTR, repeats, etc.
Functional Annotation (what they do):
● expression
● pathways
● gene family
● evolution
● Gene Ontology (molecular function, biological process, cellular component)
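As a rough illustration of what structural annotation records, the sketch below models one transcript as a strand plus exon coordinates and derives the introns between consecutive exons. The class name, field names, and coordinates are hypothetical and chosen only for illustration; this is not Apollo's data model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transcript:
    """Hypothetical minimal record of a structural annotation."""
    seq_id: str
    strand: str                    # "+" or "-"
    exons: List[Tuple[int, int]]   # 1-based, inclusive (start, end)

    def introns(self) -> List[Tuple[int, int]]:
        """Return the gaps between consecutive exons, in genomic order."""
        ordered = sorted(self.exons)
        return [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
            if next_start - prev_end > 1   # skip exons that touch each other
        ]


# Made-up coordinates for a three-exon transcript.
tx = Transcript("chr1", "+", [(100, 200), (300, 400), (500, 650)])
print(tx.introns())  # [(201, 299), (401, 499)]
```

Functional annotation would then attach terms (for example GO identifiers) to a record like this rather than change its coordinates.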
Sequence to
structure
Structure to
function
http://geneontology.org/
What is Genome Annotation?
@cmungall
http://genomearchitect.org/
Example Genome Annotation Pipeline
Experimental design,
sampling
Comparative analyses
Curated gene
set
Manual
annotation
Sequencing
Synthesis &
dissemination
Create
assembly
FGENESH
Automated
annotation
Synthesis &
dissemination
Experimental design,
sampling
GMOD and Genome Annotation
Comparative analyses
Curated gene
set
Manual
annotation
Sequencing
Create
assembly
FGENESH
Automated
annotation
*
*
*
What is Galaxy?
• Workflow engine
• Wrap bioinformatics tools
• Reproducible
• Shared workflows
• Scalable
• Examples of doing Genome Annotation with
Galaxy:
– Galaxy Genome Annotation
– GEP / G-Onramp**
Wilson Leung / Washu
Sub Workflows
Standard interface
● Provides many resources
● Available on Galaxy EU:
https://annotation.usegalaxy.eu/
● Provides annotation trainings:
https://galaxyproject.github.io/training-material/topics/genome-annotation/
● Provides Docker images and Python libraries:
https://galaxy-genome-annotation.github.io/
● More examples (arthropods, algae, phages):
http://bit.ly/galaxycommunityconference-rasche-2019
@hexylena
Galaxy Genome Annotation
@abretaud
@bguerning
Undergrad Bacteriophage
Annotation Course
Genome sequence to
publication
Two Tracks:
- Isolating phage from
environment
- Novel genome, de novo
annotation
Structural Annotation
Output: Genome
loaded into Apollo
with evidence tracks,
ready to annotate
Input: Fasta
Genome
Functional Annotation
Output: Updated
genome in Apollo
with new evidence
tracks from Blast,
Interproscan, etc.
Input: Select
organism from Apollo
Genomics Education Partnership (GEP)**
• 1300+ Undergraduates, 100+ schools, 100+ faculty
• Looking for new members**:
– https://gep.wustl.edu/contact_us
• Built on G-OnRamp: http://g-onramp.org
– Built on GMOD: Galaxy, JBrowse, Apollo, etc.
• More here (Wilson Leung / WashU):
https://wustl.app.box.com/v/pag2020-g-onramp-presentation
GEP / G-Onramp
Why are
all the cool
kids using
Apollo?
Automated Annotation is not Perfect
Automated
Annotation
• Assembly errors can cause fragmented annotations
• Limited coverage makes precise identification difficult
• Good to verify “numbers” visually
Manual
Annotation
Extensive Error in the Number of Genes
Inferred from Draft Genome Assemblies
Denton et al. 2014, PLoS Computational Biology
Human Analysis
Automated
Annotation
Manual Curation Refines Genome Annotations
Experimental Evidence
cDNAs, HMM domain searches, RNAseq,
genes from other species.
Manual
Annotation
• Human visualization finds problems numbers can’t
• Include additional analysis as needed
• Make use of the researcher’s expertise
• Integrate all underlying evidence
What are Apollo and JBrowse?
JBrowse
• Visualize genomic data
• Read-only
• Serverless
• Can run on the desktop
Apollo
• Built on JBrowse
• Write genomic data
• Used for refining genome annotations
• Server + database
• Authenticated users
Run Apollo
Start a project
Add biological evidence
Bring in the biologists
Refine annotations
Publish refined annotations
Start on the next project
Plug Apollo into your workflow
docker run -it \
  -v /jbrowse/directories:/data \
  -v empty_db_directory:/var/lib/postgresql \
  -p 8888:8080 quay.io/gmod/apollo:latest
Upload FASTA for new genomes
- OR -
Add hand-processed JBrowse directory
Share “Public” organisms
creates search index if present
Evidence
Transcripts
(GFF3, GBK)
BAM Reads
BigWig XY &
HeatMap
Themes
(dark/light)
Color CDS Frame
Automated
Annotation
Manual Annotation
Dynamically Open
Configure Multiple Tracks
addStores={"url":{"type":"JBrowse/Store/SeqFeature/GFF3","urlTemplate":"http://host/genes.gff"}}&addTracks=[{"label":"genes","type":"JBrowse/View/Track/CanvasFeatures","store":"url"}]
Append via URL
Statically Configure
https://gmod.github.io/jbrowse-registry/
Customizable Views
• BAM
• BigWig
• GFF
• GTF
• GBK
• VCF
• FASTA
• FASTAi
• SPARQL
• custom types
(e.g., REST end-
point like
mygene.info)
JBrowse Supports Diverse Data
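The addStores / addTracks query string shown on this slide is easy to get wrong by hand because the JSON has to be URL-encoded. A minimal sketch of building it in Python follows. The store and track types are copied from the slide; the function name and the base URL `http://host/jbrowse/index.html` are placeholders assumed for illustration.

```python
import json
from urllib.parse import urlencode


def jbrowse_url_with_track(base_url: str, gff_url: str, label: str = "genes") -> str:
    """Append addStores/addTracks parameters that open one remote GFF3 track."""
    stores = {
        "url": {
            "type": "JBrowse/Store/SeqFeature/GFF3",
            "urlTemplate": gff_url,
        }
    }
    tracks = [
        {
            "label": label,
            "type": "JBrowse/View/Track/CanvasFeatures",
            "store": "url",   # refers to the key used in `stores`
        }
    ]
    # Compact JSON keeps the URL short; urlencode percent-escapes it.
    query = urlencode({
        "addStores": json.dumps(stores, separators=(",", ":")),
        "addTracks": json.dumps(tracks, separators=(",", ":")),
    })
    return f"{base_url}?{query}"


# Hypothetical host; replace with a real JBrowse instance.
print(jbrowse_url_with_track("http://host/jbrowse/index.html",
                             "http://host/genes.gff"))
```

Decoding the query string gives back the same two JSON documents that appear on the slide, so the result can be checked before sharing the link.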
Expansive JBrowse Plugin Registry
https://gmod.github.io/jbrowse-registry/
50+ registered plugins
@cmdcolin
Track Panel - Just JBrowse tracks
Search
Categories
Add Evidence from Track Panel
Upload evidence
directly and share
without
configuration
Upload only locally
from multiple sources
Annotators
Annotators
Annotators
Biology is a team sport
Manual tasks require more hands
Researcher #1
Researcher #2
Researcher #3
Real-time collaboration
Add / Search Users
Edit User
Permission
Use Groups to
Manage Bulk
Permissions
• Edit user permissions
• Create / edit
organisms
Added Instructor Role to Manage Organisms
Reports and Config
Predefine Curation Terms
Reports
Genomic
Editing
Workspace
Editing Area Refined Genomic
Elements
Alignment edges
shown in red
Annotate other genomic
types with drop-down
Create New Refined Annotation
Add annotation by
dragging a genomic element
Indicate non-canonical
splice site
Search
View / Edit Details
List / Navigate Vertically
List of Created Annotations
Edit Annotation Structure
Adjust exon by dragging
Editing Annotations
Edit Additional
Structural Data
(right-click popup)
Edit Associations
• PubMed / dbxref
• GO
• Metadata
• key/value
• status
• comments
Change Annotation
Type
History of
Structural Edits
Annotation History
Revertible History of All Operations
Current position
Highlighted row
shown
Select any version
to set to
Editing Annotation Panel
Define and
select status Predefined
comments
Add and verify
PubMed
Can select predefined keys /
values per organism and type
Search Panel
Create Annotation
from hit or
download
Click and
highlight
region
Search directly
from annotated
sequence
Annotate Reference Assembly Corrections
Alteration reflected as if correction merged
Variant Annotation
Add Variant Annotation by
Dragging a Genomic Element
Copy / Edit Properties
Export VCF
View Variant Effect
Gene Ontology (GO) Functional Annotations
• Typeahead lookups
• Utilize GO BioLink API
• Export GPAD2, GPI
http://api.geneontology.org/api/
What is an Ontology?
● A curated set of logical dictionaries
○ You can use the same word!
○ You can infer!
○ You can define exact relationships!
The Gene Ontology is
represented as a directed graph
Image courtesy of Chris Mungall
Ontology
- 45k terms
- 106k edges
Biological
Process
Cellular
Component
Molecular
Function
Ontology terms for annotation of genes
Molecular Function (11k terms): The molecular activity of the gene product
Cellular Component (4k terms): Where the gene product performs its activity
Biological Process (30k terms): The evolved biological program which the activity is a part of
Example: CDC9 SGD:S000002323
Molecular Function: DNA ligase activity GO:0003909
❏ Catalytic activity
  ❏ Ligase activity
    ❏ DNA ligase activity
Cellular Component: Nucleus GO:0005634
❏ Cell part
  ❏ Nucleus
    ❏ Nucleoplasm
    ❏ Nuclear membrane
Biological Process: DNA-dependent DNA replication GO:0006261
❏ Biosynthetic process
  ❏ DNA replication
    ❏ DNA-dependent DNA replication
@cmungall
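Because the ontology is a directed graph, annotating a gene with a specific term also implies its more general ancestors ("You can infer!"). The sketch below walks parent links upward for the CDC9 terms on this slide. The parent map is reconstructed from the slide's indented lists, so the exact nesting is an assumption, and it uses term names rather than real GO identifiers or relationship types.

```python
# Child -> parents, as read from the indented lists on the slide (assumed nesting).
PARENTS = {
    "DNA ligase activity": ["Ligase activity"],
    "Ligase activity": ["Catalytic activity"],
    "DNA-dependent DNA replication": ["DNA replication"],
    "DNA replication": ["Biosynthetic process"],
    "Nucleoplasm": ["Nucleus"],
    "Nuclear membrane": ["Nucleus"],
    "Nucleus": ["Cell part"],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


# Direct annotations for CDC9 as listed on the slide.
cdc9 = ["DNA ligase activity", "Nucleus", "DNA-dependent DNA replication"]

# Everything CDC9 is implicitly annotated with via the graph.
implied = set().union(*(ancestors(term) for term in cdc9))
print(sorted(implied))
```

A set-based traversal is used so that a term with several parents, which is common in the real ontology, is visited only once.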
id: GO:0043570
name: maintenance of DNA repeat elements
id: GO:0006915
name: apoptosis
id: GO:0016446
name: somatic hypermutation of immunoglobulin genes
Curation of
ancestral genes
(PANTHER
families)
Suzanna
Lewis
EGSB
Anushya
Muruganujan
USC
Use homology to help annotate, but
requires human annotation
Genes in chicken
responsible for
lactation
Image courtesy of Chris Mungall
Search
Navigation
Export Annotations
All annotators listed per annotation
Duplicate and
Obsolete Genomes
Plug Apollo into your workflow
https://pypi.org/project/apollo/
@erasche
@luke-c-sargent @Yating-L
http://gonramp.wustl.edu/
@abretaud
# Create a group, then add every user whose username contains @tamu.edu
$ arrow groups create_group university
$ arrow users get_users |
    jq '.[] | select(.username | contains("@tamu.edu")) | .username' |
    xargs -n1 arrow users add_to_group university
# Install the Python client from PyPI
pip install apollo
# Connect to an Apollo server and register an organism
from apollo import ApolloInstance
wa = ApolloInstance('https://fqdn/apollo',
                    'jane.doe@fqdn.edu', 'password')
orgs = wa.organisms.add_organism(
    "Yeast",
    "/path/to/jbrowse/data")
https://toolshed.g2.bx.psu.edu/view/gga/suite_apollo/
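The arrow / jq / xargs pipeline on this slide could also be written in Python against the same client. The sketch below is hedged: it assumes the client object exposes `groups.create_group`, `users.get_users`, and `users.add_to_group(group, user)`, mirroring the arrow subcommands shown above, and that `get_users` returns a list of dictionaries with a `username` key, as the jq filter implies. The helper names are hypothetical.

```python
def usernames_in_domain(users, domain):
    """Keep the usernames that contain `domain` (same test as jq's contains)."""
    return [u["username"] for u in users if domain in u.get("username", "")]


def add_domain_users_to_group(wa, domain, group):
    """Create `group`, then add every user whose username contains `domain`.

    `wa` is expected to behave like an ApolloInstance; the three method
    names used here are assumed to mirror the arrow subcommands.
    """
    wa.groups.create_group(group)
    names = usernames_in_domain(wa.users.get_users(), domain)
    for name in names:
        wa.users.add_to_group(group, name)
    return names
```

With a live server this might be called as `add_domain_users_to_group(wa, "@tamu.edu", "university")`, where `wa` is the `ApolloInstance` created above. Keeping the filter in its own function makes it possible to check the selection logic without a server.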
Track and Variant Services
https://github.com/GMOD/GenomeFeatureComponent
Adding inline
variants
Summary
Annotators
Collaborative annotation
Curators refine genome annotations
Integrates within Galaxy
Visual evidence and feedback
• Ian Holmes (JBrowse and Apollo PI)
Biomedical Engineering, UC Berkeley.
• Berkeley Bioinformatics Open-source
Projects (BBOP), Berkeley Lab: Apollo and
Gene Ontology teams.
• Apollo was supported by NIH grants 5R01GM080203 from
NIGMS, and 5R01HG004483 from NHGRI.
• Thanks to you and the Apollo and GMOD
Communities
JBrowse
Robert Buels
Colin Diesh*
Galaxy Genome
Annotation
Helena Rasche
Anthony Bretaudeau
G-OnRamp
Luke Sargent
Jeremy Goecks
Wilson Leung
Sally Elgin
Apollo: http://genomearchitect.readthedocs.io
https://github.com/GMOD/Apollo/
Thank You
Apollo
Nathan Dunn
GO
Deepak Unni*
Monica Munoz-Torres*
Chris Mungall
Nomi Harris
Seth Carbon
Suzanna Lewis*
* Apollo Alumnus
GGA
QUESTIONS?
• https://annotation.usegalaxy.eu
• GMOD/Apollo
• AWS Marketplace
• Docker
Getting Apollo
docker run -it \
  -v /jbrowse/directories:/data \
  -v empty_db_directory:/var/lib/postgresql \
  -p 8888:8080 quay.io/gmod/apollo:latest
@bbop_apollo
• http://bit.ly/genome-architect
• apollo@lbl.gov (list)
• GMOD/Apollo/issues
• nathandunn@lbl.gov
Apollo comm
At poster session
● https://usegalaxy.org/ (US)
● https://usegalaxy.eu (EU)
● https://github.com/galaxyproject
● https://galaxyproject.org/
Getting Galaxy
http://bit.ly/genomes-in-action-nathan-dunn-2020
https://gep.wustl.edu/contact_us
http://geneontology.org/
http://jbrowse.org
https://galaxy-genome-annotation.github.io/
Extra Slides
“A beginner’s guide to eukaryotic genome
annotation” Yandell 2012, Nature Reviews Genetics
• apollo@lbl.gov
• GMOD/Apollo/issues
• https://gitter.im/GMOD/jbrowse
Contacting
Future
• “Official” track (last published genome)
• Community need?
• JBrowse2: circos, synteny support
Scriptable Web Services
• Examples: Groovy, Perl, shell, Python
• Auto-generated REST API doc in Apollo
curl -d "{ 'operation': 'get_features',
'track':'Group1.10','username':'ndunn@me.com',
'password':'demo'}" \
http://localhost:8080/apollo/AnnotationEditorService
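The same web-service call can be prepared from Python, one of the languages the slide lists. The sketch below only builds the request so it can be inspected without a server. The endpoint path and payload fields are taken from the curl example; the function name and the `Content-Type: application/json` header are assumptions, and the credentials are the slide's demo values.

```python
import json
from urllib.request import Request


def get_features_request(base_url, track, username, password):
    """Build (but do not send) a POST equivalent to the curl example."""
    payload = {
        "operation": "get_features",
        "track": track,
        "username": username,
        "password": password,
    }
    return Request(
        url=f"{base_url.rstrip('/')}/AnnotationEditorService",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = get_features_request("http://localhost:8080/apollo",
                           "Group1.10", "ndunn@me.com", "demo")
print(req.get_method(), req.full_url)

# To send it against a running Apollo server:
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       print(json.load(resp))
```

Unlike the single-quoted body in the curl call, `json.dumps` emits strict JSON, which avoids relying on a lenient parser on the server side.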
Create Organisms and Tracks on the Fly
@erasche
@deepakunni3
Evidence
Transcripts
(GFF3, GBK)
BAM Reads
Transcripts
(GFF3, GBK)
BigWig XY
BigWig
HeatMap
Themes
(dark/light)
Color CDS Frame
Automated
Annotation
Manual Annotation
Evidence Area (Genome Browser)
Quick and Easy Genome
Annotation Editing with Apollo
Precise descriptions of annotated genomes are vital for modeling the biological function of genomic elements. A researcher's ability to visually
identify and review diverse sets of information, such as genomic and transcriptome alignments, predictive models based on sequence profiles, and
predicted regulatory elements and repeat regions, is essential for iteratively improving models of genomic elements. During analysis,
researchers also perform functional enrichment analysis (such as GO) and need to update functional annotations. Furthermore, as projects increasingly
include annotations of a growing number of organisms and geographically dispersed researchers, the ability to quickly integrate multiple
genomes, sources of evidence, annotations, and researchers is essential. The Apollo genome annotation editor fills these needs by providing a
graphical platform for researchers to collaboratively review and revise the predicted features on a genome in real time, similar to Google Docs.
Several features make refinement of genome annotations efficient: drag-and-drop editing, a large suite of automated structural
edit operations, the ability to predefine curator comments and annotation status to maintain consistency, attribution of annotation authors, and a
visual history of revertible edits.
Here, we describe recent improvements that make refining genome annotations more efficient. The first is the automated processing of
genomic evidence for annotation, which reduces the need for command-line processing. Annotation projects
can be created by simply uploading the genome's FASTA file. Similarly, in most cases genomic annotation evidence can be provided by uploading
GFF3, VCF, BigWig, and BAM files directly. The second is the ability to associate GO annotations with genome annotations and export
them in formats such as GPAD or GPI. The third is the ability to predict the effect of individual variants to aid in the annotation of variants. Finally, we
demonstrate numerous UI improvements that make annotation editing faster and easier, as well as the simplicity of launching Apollo with a single
Docker command or via a preconfigured Community AMI on Amazon cloud instances.
In addition to the simplified installation process, Apollo provides extensive web services that allow it to be integrated with other web-based
environments. Apollo and its associated libraries allow numerous customizations, both within Apollo itself and via JBrowse, the genome browser
on which Apollo is built (http://jbrowse.org), which has a large library of plugins (https://gmod.github.io/jbrowse-registry/).
Apollo is used in hundreds of genome annotation projects around the world, ranging from the annotation of a single species to lineage-specific
efforts supporting the annotation of dozens of genomes.
Source: https://github.com/GMOD/Apollo/
Documentation: http://genomearchitect.readthedocs.io/en/latest/
License: Open Source - 3-clause BSD License

More Related Content

PDF
Apollo provides collaborative genome annotation editing with the power of jbr...
PPTX
Light Intro to the Gene Ontology
PPTX
Ontology Development Kit: Bio-Ontologies 2019
PPTX
Experiences in the biosciences with the open biological ontologies foundry an...
PPTX
The Gene Ontology & Gene Ontology Annotation resources
PPTX
US2TS presentation on Gene Ontology
PDF
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
PPTX
Gene Ontology WormBase Workshop International Worm Meeting 2015
Apollo provides collaborative genome annotation editing with the power of jbr...
Light Intro to the Gene Ontology
Ontology Development Kit: Bio-Ontologies 2019
Experiences in the biosciences with the open biological ontologies foundry an...
The Gene Ontology & Gene Ontology Annotation resources
US2TS presentation on Gene Ontology
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Gene Ontology WormBase Workshop International Worm Meeting 2015

What's hot (20)

PPT
The Language of the Gene Ontology
PDF
An introduction to Web Apollo for the Biomphalaria glabatra research community.
PPTX
Plant Pathogen Genome Data: My Life In Sequences
PPT
What makes the enterobacterial plant pathogen Pectobacterium atrosepticum dif...
PDF
Genome resources at EMBL-EBI: Ensembl and Ensembl Genomes
 
PPTX
Web Apollo: Lessons learned from community-based biocuration efforts.
PDF
Web Apollo Workshop University of Exeter
PDF
Apollo Workshop at KSU 2015
PDF
Ensembl Browser Workshop
PPTX
All together now: piecing together the knowledge graph of life
PPTX
Munoz torres web-apollo-workshop_exeter-2014_ss
PPTX
TAIR -Using biological ontologies to accelerate progress in plant biology res...
PDF
GoTermsAnalysisWithR
PDF
ICAR 2015 Workshop - Nick Provart
PPTX
Ensembl annotation
PPTX
Computing on Phenotypes AMP 2015
PPT
Analysis and visualization of microarray experiment data integrating Pipeline...
PDF
Pathogen Genome Data
PDF
BM405 Lecture Slides 21/11/2014 University of Strathclyde
The Language of the Gene Ontology
An introduction to Web Apollo for the Biomphalaria glabatra research community.
Plant Pathogen Genome Data: My Life In Sequences
What makes the enterobacterial plant pathogen Pectobacterium atrosepticum dif...
Genome resources at EMBL-EBI: Ensembl and Ensembl Genomes
 
Web Apollo: Lessons learned from community-based biocuration efforts.
Web Apollo Workshop University of Exeter
Apollo Workshop at KSU 2015
Ensembl Browser Workshop
All together now: piecing together the knowledge graph of life
Munoz torres web-apollo-workshop_exeter-2014_ss
TAIR -Using biological ontologies to accelerate progress in plant biology res...
GoTermsAnalysisWithR
ICAR 2015 Workshop - Nick Provart
Ensembl annotation
Computing on Phenotypes AMP 2015
Analysis and visualization of microarray experiment data integrating Pipeline...
Pathogen Genome Data
BM405 Lecture Slides 21/11/2014 University of Strathclyde
Ad

Similar to Genome annotation with open source software: Apollo, Jbrowse and the GO in Galaxy (20)

PPTX
Collaboratively Creating the Knowledge Graph of Life
PDF
Web Apollo Workshop UIUC
PPTX
Three's a crowd-source: Observations on Collaborative Genome Annotation
PPTX
Web Apollo Tutorial for the i5K copepod research community.
PDF
Web Apollo at Genome Informatics 2014
PDF
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
PPTX
Introduction to Web Apollo for the i5K pilot species.
PPTX
An introduction to Web Apollo for i5K Pilot Species Projects - Hemiptera
PPT
Introduction to Ontologies for Environmental Biology
PDF
Web Apollo Tutorial for Medfly Research Community
PDF
Apollo: Scalable & collaborative curation of genomes - Biocuration 2015
PDF
Functional annotation of invertebrate genomes
PDF
Essential Requirements for Community Annotation Tools
PDF
Apollo Collaborative genome annotation editing
PPTX
Welch Wordifier Bosc2009
PDF
Apollo Workshop AGS2017 Introduction
PDF
Advanced Bioinformatics for Genomics and BioData Driven Research
PPT
Gene Ontology Project
PDF
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
PPTX
Computing on the shoulders of giants
Collaboratively Creating the Knowledge Graph of Life
Web Apollo Workshop UIUC
Three's a crowd-source: Observations on Collaborative Genome Annotation
Web Apollo Tutorial for the i5K copepod research community.
Web Apollo at Genome Informatics 2014
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Introduction to Web Apollo for the i5K pilot species.
An introduction to Web Apollo for i5K Pilot Species Projects - Hemiptera
Introduction to Ontologies for Environmental Biology
Web Apollo Tutorial for Medfly Research Community
Apollo: Scalable & collaborative curation of genomes - Biocuration 2015
Functional annotation of invertebrate genomes
Essential Requirements for Community Annotation Tools
Apollo Collaborative genome annotation editing
Welch Wordifier Bosc2009
Apollo Workshop AGS2017 Introduction
Advanced Bioinformatics for Genomics and BioData Driven Research
Gene Ontology Project
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...
Computing on the shoulders of giants
Ad

Recently uploaded (20)

PDF
Phytochemical Investigation of Miliusa longipes.pdf
PPTX
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
PDF
AlphaEarth Foundations and the Satellite Embedding dataset
PPTX
neck nodes and dissection types and lymph nodes levels
PPTX
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
PPTX
ognitive-behavioral therapy, mindfulness-based approaches, coping skills trai...
PDF
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
PPTX
ECG_Course_Presentation د.محمد صقران ppt
PPTX
Comparative Structure of Integument in Vertebrates.pptx
PDF
bbec55_b34400a7914c42429908233dbd381773.pdf
PPTX
famous lake in india and its disturibution and importance
PPTX
TOTAL hIP ARTHROPLASTY Presentation.pptx
PDF
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
PPTX
Classification Systems_TAXONOMY_SCIENCE8.pptx
PPT
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
PDF
Placing the Near-Earth Object Impact Probability in Context
PDF
The scientific heritage No 166 (166) (2025)
PPTX
Introduction to Cardiovascular system_structure and functions-1
PDF
Sciences of Europe No 170 (2025)
PPTX
cpcsea ppt.pptxssssssssssssssjjdjdndndddd
Phytochemical Investigation of Miliusa longipes.pdf
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
AlphaEarth Foundations and the Satellite Embedding dataset
neck nodes and dissection types and lymph nodes levels
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
ognitive-behavioral therapy, mindfulness-based approaches, coping skills trai...
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
ECG_Course_Presentation د.محمد صقران ppt
Comparative Structure of Integument in Vertebrates.pptx
bbec55_b34400a7914c42429908233dbd381773.pdf
famous lake in india and its disturibution and importance
TOTAL hIP ARTHROPLASTY Presentation.pptx
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
Classification Systems_TAXONOMY_SCIENCE8.pptx
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
Placing the Near-Earth Object Impact Probability in Context
The scientific heritage No 166 (166) (2025)
Introduction to Cardiovascular system_structure and functions-1
Sciences of Europe No 170 (2025)
cpcsea ppt.pptxssssssssssssssjjdjdndndddd

Genome annotation with open source software: Apollo, Jbrowse and the GO in Galaxy

  • 1. Genome annotation with open source software: Apollo, JBrowse, and GO in Galaxy Nathan A. Dunn1, Colin Diesh2, Helena Rasche3, Anthony Bretaudeau4, Rob Buels2, Ian Holmes2 (1) Lawrence Berkeley National Laboratory, Berkeley, CA, (2) Department of Bioengineering, Berkeley, CA, (3) University of Freiburg (4) INRA, BIPAA/GenOuest, France https://github.org/GMOD/Apollo/ @bbop_apollo nathandunn@lbl.gov http://bit.ly/genomes-in-action-nathan-dunn-2020
  • 2. What is Genome Annotation? Structural Annotation (what they look like): ● position and strand of genomic elements ● isoforms, exons, introns, coding region / UTR, repeats, etc. Functional Annotation (what they do): ● expression ● pathways ● gene family ● evolution ● Gene Ontology (molecular function, biological process, cellular components)
  • 3. Sequence to structure Structure to function http://geneontology.org/ What is Genome Annotation? @cmungall http://genomearchitect.org/
  • 4. Example Genome Annotation Pipeline 4 Experimental design, sampling Comparative analyses Curated gene set Manual annotation Sequencing Synthesis & dissemination Create assembly FGENESH Automated annotation
  • 5. Synthesis & dissemination Experimental design, sampling GMOD and Genome Annotation Comparative analyses Curated gene set Manual annotation Sequencing Create assembly FGENESH Automated annotation * * *
  • 6. What is Galaxy? • Workflow engine • Wrap bioinformatics tools • Reproducible • Shared workflows • Scalable • Examples of doing Genome Annotation with Galaxy: – Galaxy Genome Annotation – GEP / G-Onramp** Wilson Leung / Washu Sub Workflows Standard interface
  • 7. ● Provides many resources ● Available on Galaxy EU: https://annotation.usegalaxy.eu/ ● Provides annotation trainings: https://galaxyproject.github.io/training- material/topics/genome-annotation/ ● Provides docker images and python libraries: https://galaxy-genome-annotation.github.io/ ● More examples (arthropods, algae, phages): http://bit.ly/galaxycommunityconference- rasche-2019 @hexylena Galaxy Genome Annotation @abretaud @bguerning
  • 8. Undergrad Bacteriophage Annotation Course Genome sequence to publication Two Tracks: - Isolating phage from environment - Novel genome, de novo annotation
  • 9. Structural Annotation Output: Genome loaded into Apollo with evidence tracks, ready to annotate Input: Fasta Genome
  • 10. Functional Annotation Output: Updated genome in Apollo with new evidence tracks from Blast, Interproscan, etc. Input: Select organism from Apollo
  • 11. Genomics Education Partnership (GEP)** • 1300+ Undergraduates, 100+ schools, 100+ faculty • Looking for new members**: – https://gep.wustl.edu/contact_us • Built on G-OnRamp: http://g-onramp.org – Built on GMOD: Galaxy, JBrowse, Apollo, etc. • More here (Wilson Leung / Washu): https://wustl.app.box.com/v/pag2020-g-onramp-presentation
  • 14. Why are all the cool kids using Apollo?
  • 15. Automated Annotation is not Perfect 1 5 Automated Annotation • Assembly errors can cause fragmented annotations • Limited coverage makes precise identification difficult • Good to verify “numbers” visually Manual Annotation Extensive Error in the Number of Genes Inferred from Draft Genome Assemblies Denton et al 2014 Plos Comp Biology
  • 16. Human Analysis Automated Annotation Manual Curation Refines Genome Annotations 1 6 Experimental Evidence cDNAs, HMM domain searches, RNAseq, genes from other species. Manual Annotation • Human visualization finds problems numbers can’t • Include additional analysis as needed • Make use of the researcher’s expertise • Integrate all underlying evidence
  • 17. What are Apollo and JBrowse • Visualize genomic data • Read-only • Serverless • Can run in desktop • Built on JBrowse • Write genomic data • Used for refining genome annotations • Server + database • Authenticated users
  • 18. Run Apollo Start a project Add biological evidence Bring in the biologists Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow docker run -it -v /jbrowse/directories:/data -v empty_db_directory:/var/lib/postgresql -p 8888:8080 quay.io/gmod/apollo:latest
  • 19. Run Apollo Start a project Add biological evidence Bring in the biologists Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow Upload FASTA for new GenomesAdd hand-processed JBrowse directory Share “Public” organisms - OR - creates search index if present
  • 20. Add biological evidence Bring in the biologists Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow Evidence Transcripts (GFF3, GBK) BAM Reads BigWig XY & HeatMap Themes (dark/light) Color CDS Frame Automated Annotation Manual Annotation
  • 21. 21 Dynamically Open Configure Multiple Tracks addStores={"url":{"type":"JBrowse/Store/SeqFeature/GFF3","urlTe mplate":"http://host/genes.gff"}}&addTracks=[{"label":"genes","ty pe":"JBrowse/View/Track/CanvasFeatures","store":"url"}] Append via URL Statically Configure https://gmod.github.io/jbrowse-registry/Customizable Views • BAM • BigWig • GFF • GTF • GBK • VCF • FASTA • FASTAi • SPARQL • custom types (e.g., REST end- point like mygene.info) JBrowse Supports Diverse Data
  • 22. Expansive JBrowse Plugin Registry https://gmod.github.io/jbrowse-registry/ +50 registered plugins @cmdcolin
  • 23. Track Panel - Just JBrowse tracks 23 Search Categories
  • 24. Add Evidence from Track Panel 24 Upload evidence directly and share without configuration Upload only locally from multiple sources
  • 25. Add biological evidence Bring in the biologists Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow Annotators Annotators Annotators Biology is a team sport Manual tasks require more hands Researcher #1 Researcher #2 Researcher #3 Real-time collaboration
  • 26. Add biological evidence Bring in the biologists Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow Add / Search Users Edit User Permission Use Groups to Manage Bulk Permissions • Edit user permissions • Create / edit organisms Added Instructor Role to Manage Organisms
  • 27. Reports and Config 27 Predefine Curation Terms Reports
  • 28. Refine annotations Publish refined annotations Start on the next project Plug Apollo into your workflow Genomic Editing Workspace Editing Area Refined Genomic Elements
  • 29. 29 Alignments edges shown in red Annotate other genomic types with drop-down Create New Refined Annotation Add annotation by dragging a genomic element Indicate non-canonical splice site
  • 30. 30 Search View / Edit Details List / Navigate Vertically List of Created Annotations
  • 32. Editing Annotations 32 Edit Additional Structural Data (right-click popup) Edit Associations • PubMed / dbxref • GO • Metadata • key/value • status • comments Change Annotation Type History of Structural Edits
  • 33. Annotation History 33 Revertible History of All Operations Current position Highlighted row shown Select any version to set to
  • 34. Editing Annotation Panel Define and select status Predefined comments Add and verify pubmed Can select predefined keys / values per organism and type
  • 35. 35 Search Panel Create Annotation from hit or download Click and highlight region Search directly from annotated sequence
  • 36. Annotate Reference Assembly Corrections 36 Alteration reflected as if correction merged
  • 37. 37 Variant Annotation Add Variant Annotation by Dragging a Genomic Element Copy / Edit Properties Export VCF View Variant Effect
  • 38. Gene Ontology (GO) Func Annotations • Typeahead lookups • Utilize GO BioLink API • Export GPAD2, GPI http://api.geneontology.org/api/
  • 39. What is an Ontology? ● A curated set of logical dictionaries ○ You can use the same word! ○ You can infer! ○ You can define exact relationships!
  • 40. The Gene Ontology is represented as a directed graph Image courtesy of Chris Mungall Ontology - 45k terms - 106k edges Biological Process Cellular Component Molecular Function
  • 41. Ontology terms for annotation of genes. Example gene: CDC9 (SGD:S000002323). ● Molecular Function (11k terms): the molecular activity of the gene product. CDC9: DNA ligase activity (GO:0003909). ❏ Catalytic activity ❏ Ligase activity ❏ DNA ligase activity ● Cellular Component (4k terms): where the gene product performs its activity. CDC9: Nucleus (GO:0005634). ❏ Cell part ❏ Nucleus ❏ Nucleoplasm ❏ Nuclear membrane ● Biological Process (30k terms): the evolved biological program which the activity is a part of. CDC9: DNA-dependent DNA replication (GO:0006261). ❏ Biosynthetic process ❏ DNA replication ❏ DNA-dependent DNA replication @cmungall
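Because the ontology is a directed graph, an annotation to a specific term also implies its more general ancestors. The sketch below uses only the term names from the CDC9 slide. The child-to-parent edges are assumptions read off the order of the checklists (the real GO relations and their types may differ), and `PARENTS`, `ancestors`, and `infer_annotations` are hypothetical names.

```python
from typing import Dict, Iterable, Set

# Child -> parents. Edges are assumed from the slide's checklists, not taken from GO.
PARENTS: Dict[str, Set[str]] = {
    "DNA ligase activity": {"Ligase activity"},
    "Ligase activity": {"Catalytic activity"},
    "Catalytic activity": set(),
    "DNA-dependent DNA replication": {"DNA replication"},
    "DNA replication": {"Biosynthetic process"},
    "Biosynthetic process": set(),
    "Nucleoplasm": {"Nucleus"},
    "Nuclear membrane": {"Nucleus"},
    "Nucleus": {"Cell part"},
    "Cell part": set(),
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every term reachable by following parent edges from `term`."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def infer_annotations(direct: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Direct annotations plus everything they imply through the graph."""
    inferred = set(direct)
    for term in list(inferred):
        inferred |= ancestors(term, parents)
    return inferred


# Direct annotations for CDC9 as listed on the slide.
cdc9_direct = {"DNA ligase activity", "Nucleus", "DNA-dependent DNA replication"}
cdc9_inferred = infer_annotations(cdc9_direct, PARENTS)
print(sorted(cdc9_inferred - cdc9_direct))
```

Sibling terms such as Nucleoplasm are not inferred, because inference only runs from a term toward its ancestors.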
  • 42. id: GO:0043570 name: maintenance of DNA repeat elements id: GO:0006915 name: apoptosis id: GO:0016446 name: somatic hypermutation of immunoglobulin genes Curation of ancestral genes (PANTHER families) Suzanna Lewis EGSB Anushya Muruganujan USC Use homology to help annotate, but requires human annotation Genes in chicken responsible for lactation Image courtesy of Chris Mungall
  • 43. Publish refined annotations Search Navigation Export Annotations All annotators listed per annotation
  • 44. Start on the next project Duplicate and Obsolete Genomes
  • 45. Plug Apollo into your workflow https://pypi.org/project/apollo/ @erasche @luke-c-sargent @Yating-L http://gonramp.wustl.edu/ @abretaud $ arrow groups create_group university $ arrow users get_users | jq '.[] | select(.username | contains("@tamu.edu")) | .username' | xargs -n1 arrow users add_to_group university pip install apollo from apollo import ApolloInstance wa = ApolloInstance('https://fqdn/apollo', 'jane.doe@fqdn.edu', 'password') orgs = wa.organisms.add_organism("Yeast", "/path/to/jbrowse/data") https://toolshed.g2.bx.psu.edu/view/gga/suite_apollo/
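The shell pipeline on this slide (create a group, list users, filter by e-mail domain, add each match to the group) can also be written in Python. This is a minimal sketch, not the library's documented usage. It assumes the client exposes `groups.create_group`, `users.get_users`, and `users.add_to_group` in the same way as the `arrow` subcommands shown above, and that each user record has a `username` key, as the `jq` filter implies. The `FakeClient` classes are hypothetical in-memory stand-ins so the example runs without an Apollo server.

```python
from typing import Any, Dict, Iterable, List


def add_domain_users_to_group(client: Any, group: str, domain: str) -> List[str]:
    """Create `group`, then add every user whose username contains `domain`."""
    client.groups.create_group(group)
    matched = [
        user["username"]
        for user in client.users.get_users()
        if domain in user.get("username", "")
    ]
    for username in matched:
        # Argument order (group, user) follows `arrow users add_to_group university <user>`.
        client.users.add_to_group(group, username)
    return matched


# --- Hypothetical in-memory stand-ins, for demonstration only ---
class FakeUsers:
    def __init__(self, usernames: Iterable[str]) -> None:
        self._usernames = list(usernames)
        self.memberships: List[tuple] = []

    def get_users(self) -> List[Dict[str, str]]:
        return [{"username": name} for name in self._usernames]

    def add_to_group(self, group: str, user: str) -> None:
        self.memberships.append((group, user))


class FakeGroups:
    def __init__(self) -> None:
        self.created: List[str] = []

    def create_group(self, name: str) -> None:
        self.created.append(name)


class FakeClient:
    def __init__(self, usernames: Iterable[str]) -> None:
        self.users = FakeUsers(usernames)
        self.groups = FakeGroups()


demo = FakeClient(["alice@tamu.edu", "bob@example.org", "carol@tamu.edu"])
matched = add_domain_users_to_group(demo, "university", "@tamu.edu")
print(matched)
```

Against a real server, the same function would presumably be called with the `ApolloInstance` object from the slide in place of `FakeClient`, provided the method names match the assumption above.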
  • 46. Track and Variant Services https://github.com/GMOD/GenomeFeatureComponent Adding inline variants
  • 47. Summary Annotators Collaborative annotation Curators refine genome annotations Integrates within Galaxy Visual evidence and feedback
  • 48. • Ian Holmes (JBrowse and Apollo PI) Biomedical Engineering, UC Berkeley. • Berkeley Bioinformatics Open-source Projects (BBOP), Berkeley Lab: Apollo and Gene Ontology teams. • Apollo was supported by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from NHGRI. • Thanks to you and the Apollo and GMOD Communities JBrowse Robert Buels Colin Diesh* Galaxy Genome Annotation Helena Rasche Anthony Bretaudeau G-OnRamp Luke Sargent Jeremy Goecks Wilson Leung Sally Elgin Apollo: http://genomearchitect.readthedocs.io https://github.org/GMOD/Apollo/ Thank You Apollo Nathan Dunn GO Deepak Unni* Monica Munoz-Torres* Chris Mungall Nomi Harris Seth Carbon Suzanna Lewis* * Apollo Alumnus GGA
  • 49. QUESTIONS? • https://annotation.usegalaxy.eu • GMOD/Apollo • AWS Marketplace • Docker Getting Apollo docker run -it -v /jbrowse/directories:/data -v empty_db_directory:/var/lib/postgresql -p 8888:8080 quay.io/gmod/apollo:latest @bbop_apollo • http://bit.ly/genome-architect • apollo@lbl.gov (list) • GMOD/Apollo/issues • nathandunn@lbl.gov Apollo comm At poster session ● https://usegalaxy.org/ (US) ● https://usegalaxy.eu (EU) ● https://github.com/galaxyproject ● https://galaxyproject.org/ Getting Galaxy http://bit.ly/genomes-in-action-nathan-dunn-2020 https://gep.wustl.edu/contact_us http://geneontology.org/ http://jbrowse.org https://galaxy-genome-annotation.github.io/
  • 51. “A beginner’s guide to eukaryotic genome annotation” Yandell 2012 Nature Reviews Genetics
  • 52. QUESTIONS? • https://annotation.usegalaxy.eu • GMOD/Apollo • AWS Marketplace • Docker Getting Apollo docker run -it -v /jbrowse/directories:/data -v empty_db_directory:/var/lib/postgresql -p 8888:8080 quay.io/gmod/apollo:latest @bbop_apollo • apollo@lbl.gov • GMOD/Apollo/issues • https://gitter.im/GMOD/jbrowse Contacting Future • “Official” track (last published genome) • Community need? • JBrowse2: circos, synteny support
  • 53. Scriptable Web Services • Examples: Groovy, Perl, shell, Python • Auto-generated REST API doc in Apollo curl -d '{"operation": "get_features", "track": "Group1.10", "username": "ndunn@me.com", "password": "demo"}' http://localhost:8080/apollo/AnnotationEditorService Create Organisms and Tracks on the Fly @erasche @deepakunni3
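The curl call on this slide can be reproduced from Python with only the standard library. The sketch below builds the same POST request and stops short of sending it, so it runs without an Apollo server. The endpoint path, the `get_features` operation, and the demo credentials come from the slide; the function name `build_get_features_request` and the explicit `Content-Type` header are assumptions.

```python
import json
from urllib.request import Request


def build_get_features_request(base_url: str, track: str,
                               username: str, password: str) -> Request:
    """Build (but do not send) the POST request shown in the slide's curl example."""
    payload = {
        "operation": "get_features",
        "track": track,
        "username": username,
        "password": password,
    }
    return Request(
        url=base_url.rstrip("/") + "/AnnotationEditorService",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the server accepts a JSON body labeled as such.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_get_features_request(
    "http://localhost:8080/apollo", "Group1.10", "ndunn@me.com", "demo"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send it against a running instance, pass `request` to `urllib.request.urlopen` and decode the JSON response.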
  • 54. Evidence Transcripts (GFF3, GBK) BAM Reads Transcripts (GFF3, GBK) BigWig XY BigWig HeatMap Themes (dark/light) Color CDS Frame Automated Annotation Manual Annotation Evidence Area (Genome Browser)
  • 55. Quick and Easy Genome Annotation Editing with Apollo Precise descriptions of annotated genomes are vital for modeling the biological function of genomic elements. The ability of a researcher to visually identify and review diverse sets of information, such as genomic and transcriptome alignments, predictive models based on sequence profiles, and predicted regulatory elements and repeat regions, is essential for the iterative improvement of the modeling of genomic elements. During analysis, researchers also perform functional enrichment analysis (such as GO) and need to update functional annotations. Furthermore, as projects increasingly include annotations of a growing number of organisms as well as geographically dispersed researchers, the ability to quickly integrate multiple genomes, sources of evidence, annotations and researchers is essential. The Apollo genome annotation editor fills these needs by providing a graphical platform for researchers to collaboratively review and revise the predicted features on a genome in real time, similar to Google Docs. Refinement of genome annotations is made efficient through several features including drag-and-drop editing, a large suite of automated structural edit operations, the ability to pre-define curator comments and annotation status to maintain consistency, attribution of annotation authors, and a visual history of revertible edits. Here, we describe recent improvements that make the refinement of genome annotations more efficient. The first is the automated processing of genomic evidence for annotation, reducing the need for command-line processing of genome annotation evidence. Creating annotation projects can be done by simply uploading the genome's FASTA file. Similarly, genomic annotation evidence can be provided in most cases by uploading GFF3, VCF, BigWig, and BAM files directly.
The second is the ability to associate GO annotations with genome annotations and export them in formats such as GPAD or GPI. The third is the ability to predict the effect of individual variants to aid in the annotation of variants. Finally, we demonstrate numerous UI improvements that make annotation editing faster and easier, and show how Apollo can be launched with a single Docker command or via a preconfigured Community AMI on Amazon cloud instances. In addition to the simplified installation process, Apollo provides extensive web services that allow it to be integrated with other web-based environments. Apollo and its associated libraries allow numerous customizations, both within Apollo itself and via JBrowse, the genome browser Apollo is built upon (http://jbrowse.org), which has a large library of plugins (https://gmod.github.io/jbrowse-registry/). Apollo is used in hundreds of genome annotation projects around the world, ranging from the annotation of a single species to lineage-specific efforts supporting the annotation of dozens of genomes. Source: https://github.com/GMOD/Apollo/ Documentation: http://genomearchitect.readthedocs.io/en/latest/ License: Open Source - 3-clause BSD License