An introduction to Web Apollo.
A webinar for the Biomphalaria glabrata research
community.
Monica Munoz-Torres, PhD | @monimunozto
Berkeley Bioinformatics Open-Source Projects (BBOP)
Genomics Division, Lawrence Berkeley National Laboratory
18 June, 2014
UNIVERSITY OF
CALIFORNIA
Outline
1.  What is Web Apollo?
• Definition & working concept.
2.  Our Experience With Community-Based Curation.
3.  The Manual Annotation Process.
4.  Becoming Acquainted with Web Apollo.
What is Web Apollo?
•  Web Apollo is a web-based, collaborative genomic
annotation editing platform.
We need annotation editing tools to modify and refine the precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
1. What is Web Apollo?
Find more about Web Apollo at http://GenomeArchitect.org and in Genome Biol 14:R93 (2013).
Brief history of Apollo*:
a. Desktop:
one person at a time editing a
specific region, annotations
saved in local files; slowed down
collaboration.
b. Java Web Start:
users saved annotations directly
to a centralized database;
potential issues with stale
annotation data remained.
Biologists could finally visualize computational analyses and
experimental evidence from genomic features and build
manually-curated consensus gene structures. Apollo became a
very popular, open source tool (insects, fish, mammals, birds, etc.).
Web Apollo
•  Browser-based tool integrated with JBrowse.
•  Two new tracks: “Annotation” and “DNA Sequence”
•  Allows for intuitive annotation creation and editing, with gestures and pull-down menus to create and modify transcript and exon structures, insert comments (CV, freeform text), etc.
•  Customizable look & feel.
•  Edits in one client are
instantly pushed to all other
clients: Collaborative!
Working
Concept
In the context of manual gene annotation, curation tries to find the best examples and/or eliminate most errors.
To conduct manual annotation efforts:
Gather and evaluate all available evidence
using quality-control metrics to
corroborate or modify automated
annotation predictions.
Perform sequence similarity searches
(phylogenetic framework) and use
literature and public databases to:
• Predict functional assignments from
experimental data.
• Distinguish orthologs from paralogs,
and classify gene membership in
families and networks.
2. In our experience.
Automated gene models
Evidence:
cDNAs, HMM domain searches,
alignments with assemblies or
genes from other species.
Manual annotation & curation
Dispersed, community-based manual gene annotation efforts.
We continuously train and support
hundreds of geographically dispersed
scientists from many research
communities, to perform biologically
supported manual annotations using
Web Apollo.
– Gate keepers and monitoring.
– Written tutorials.
– Training workshops and geneborees.
– Personalized user support.
What we have learned.
By harvesting expertise from dispersed researchers who assigned functions to predicted and curated peptides, we have developed more interactive and responsive tools, as well as better visualization, editing, and analysis capabilities.
http://people.csail.mit.edu/fredo/PUBLI/Drawing/
Collaborative Efforts Improved
Automated Annotations*
In many cases, automated annotations have been improved (e.g., Apis mellifera; Elsik et al. BMC Genomics 2014, 15:86).
We also learned of the challenges of newer sequencing technologies, e.g.:
– Frameshifts and indel errors
– Split genes across scaffolds
– Highly repetitive sequences
To face these challenges, we train annotators in
recovering coding sequences in agreement with all
available biological evidence.
It is helpful to work together.
Scientific community efforts bring together domain-specific and natural history expertise that would otherwise remain disconnected.
Breaking down large amounts of data into
manageable portions and mobilizing groups
of researchers to extract the most accurate
representation of the biology from all
available data distills invaluable
knowledge from genome analysis.
Understanding the evolution of sociality
Comparing the genomes of 7 species of ants
contributed to a better understanding of the
evolution and organization of insect societies
at the molecular level.
Insights drawn mainly from six core aspects of
ant biology:
1.  Alternative morphological castes
2.  Division of labor
3.  Chemical communication
4.  Alternative social organization
5.  Social immunity
6.  Mutualism
Libbrecht et al. Genome Biology 2013, 14:212
Atta cephalotes (above) and Harpegnathos saltator.
©alexanderwild.com
Groups of
communities
continue to guide
our efforts.
A little training goes a long way!
With the right tools, wet lab scientists make exceptional
curators who can easily learn to maximize the
generation of accurate, biologically supported gene
models.
Manual Annotation: How do we get there?
In a genome sequencing project: assembly, automated annotation, manual annotation, experimental validation.
3. How do we get there?
Gene Prediction
Identification of protein-coding genes, tRNAs, rRNAs,
regulatory motifs, repetitive elements (masked), etc.
- Ab initio (DNA composition): Augustus, GENSCAN,
geneid, fgenesh
- Homology-based: e.g., SGP2, fgenesh++
Nucleic Acids Research 2003, vol. 31, no. 13, 3738-3741
Gene Annotation
Integration of data from prediction tools to generate a
consensus set of predictions or gene models.
•  Models may be organized using:
-  automatic integration of predicted sets; e.g., GLEAN
-  packaging necessary tools into a pipeline; e.g., MAKER
•  All available biological evidence (e.g. transcriptomes) further
informs the annotation process.
In some cases algorithms and metrics used to generate
consensus sets may actually reduce the accuracy of the
gene’s representation; in such cases it is usually better to
use an ab initio model to create a new annotation.
Manual Genome Annotation
•  Identifies elements that best represent the underlying
biology.
•  Eliminates elements that reflect the systemic errors of
automated genome analyses.
•  Determines functional roles through comparative
analysis of well-studied, phylogenetically similar
genome elements using literature, databases, and
the researcher’s experience.
Curation Process is Necessary
1.  A computationally predicted consensus gene set is
generated using multiple lines of evidence.
2.  Manual annotation takes place.
3.  Ideally, consensus computational predictions will be integrated with manual annotations to produce an updated Official Gene Set (OGS).
Otherwise, “incorrect and incomplete genome annotations
will poison every experiment that uses them”.
- M. Yandell.
Web Apollo
The Sequence Selection Window
4. Becoming Acquainted with Web Apollo.
Graphical User Interface (GUI) for editing annotations
•  Navigation tools: pan and zoom.
•  Search box: go to a scaffold or a gene model.
•  Grey bar of coordinates: indicates location. You can also select here in order to zoom to a sub-region.
•  ‘View’: change color by CDS, toggle strands, set highlight.
•  ‘File’: upload your own evidence (GFF3, BAM, BigWig, VCF*); add combination and sequence search tracks.
•  ‘Tools’: use BLAT to query the genome with a protein or DNA sequence.
•  Available Tracks
•  Evidence Tracks Area
•  ‘User-created Annotations’ Track
•  Login
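The ‘File’ menu accepts evidence in GFF3, among other formats. As a rough illustration of what such a file holds, the sketch below parses one tab-separated GFF3 line into its nine standard columns. The function name `parse_gff3_line` and the example record (scaffold name, coordinates, IDs) are made up for illustration; this is not Web Apollo's own loader.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for comment or blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # GFF3 coordinates are 1-based and inclusive.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs, percent-encoded.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            tag, value = pair.split("=", 1)
            attributes[unquote(tag)] = unquote(value)
    record["attributes"] = attributes
    return record


# Hypothetical record, for illustration only.
example = "scaffold_1\texample\texon\t1200\t1450\t.\t+\t.\tID=exon1;Parent=mRNA1"
feature = parse_gff3_line(example)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["Parent"])
```

Checking a few lines of your own file this way before uploading can catch missing columns or malformed attributes early.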
Flags non-canonical splice sites.
Selection of features and
sub-features
Edge-matching
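Edge-matching, as we understand it, highlights evidence features whose boundaries coincide with those of the selected feature. A minimal sketch of that comparison follows, using made-up exon coordinates and a hypothetical `matching_edges` helper; it is not Web Apollo's implementation.

```python
def matching_edges(selected, evidence):
    """Return (evidence_index, edge_name, coordinate) for every evidence edge
    that coincides with the same kind of edge in the selected exons.

    Exons are (start, end) tuples in one shared coordinate system.
    """
    selected_starts = {start for start, _ in selected}
    selected_ends = {end for _, end in selected}
    matches = []
    for index, (start, end) in enumerate(evidence):
        if start in selected_starts:
            matches.append((index, "start", start))
        if end in selected_ends:
            matches.append((index, "end", end))
    return matches


# Made-up coordinates, for illustration only.
selected_exons = [(100, 250), (400, 520)]
evidence_exons = [(100, 240), (400, 520), (600, 700)]
print(matching_edges(selected_exons, evidence_exons))
```

Here the first evidence exon shares only its start with the selection, the second shares both edges, and the third shares none.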
Evidence Tracks Area
‘User-created Annotations’ Track
The editing logic in the server:
•  selects longest ORF as CDS
•  flags non-canonical splice sites
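The two server-side rules above can be sketched as follows. This is an illustration under simplifying assumptions, not Web Apollo's code: it looks at the forward strand only, uses the standard stop codons, treats GT...AG as the only canonical intron boundary, and the function names and sequences are invented for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(mrna):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    Returns None when no complete ORF exists.
    """
    mrna = mrna.upper()
    best = None
    for frame in range(3):
        start = None
        for pos in range(frame, len(mrna) - 2, 3):
            codon = mrna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if best is None or pos + 3 - start > best[1] - best[0]:
                    best = (start, pos + 3)
                start = None
    return best


def noncanonical_introns(genome, introns):
    """Return the introns whose ends are not the canonical GT...AG pair.

    Introns are (start, end) tuples, 0-based and end-exclusive, forward strand.
    """
    genome = genome.upper()
    flagged = []
    for start, end in introns:
        donor = genome[start:start + 2]
        acceptor = genome[end - 2:end]
        if (donor, acceptor) != ("GT", "AG"):
            flagged.append((start, end, donor, acceptor))
    return flagged


# Invented sequences, for illustration only.
transcript = "ATGTAAATGAAACCCTGA"
orf = longest_orf(transcript)

toy_genome = "AAA" + "GTCCCCAG" + "TTT" + "GCAAAAAG" + "CC"
flagged = noncanonical_introns(toy_genome, [(3, 11), (14, 22)])
print(orf, flagged)
```

In the toy transcript there are two complete ORFs in the same frame and the longer one is chosen; in the toy genome only the second intron, which begins with GC rather than GT, is flagged.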
DNA Track
‘User-created Annotations’ Track
There are two new kinds of tracks, for:
•  annotation editing
•  sequence alteration editing
Annotations, annotation edits, and their history are stored in a centralized database.
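To make the idea of a shared, append-only edit history concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout, column names, feature IDs, and curator names are all hypothetical and chosen for illustration; the slides do not describe Web Apollo's actual database schema.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE edit_history (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           feature_id TEXT NOT NULL,
           editor TEXT NOT NULL,
           operation TEXT NOT NULL,
           detail TEXT
       )"""
)


def record_edit(feature_id, editor, operation, detail=""):
    """Append one edit to the shared history; earlier rows are never overwritten."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO edit_history (feature_id, editor, operation, detail) "
            "VALUES (?, ?, ?, ?)",
            (feature_id, editor, operation, detail),
        )


def history_for(feature_id):
    """Return the edits made to one feature, oldest first."""
    rows = conn.execute(
        "SELECT editor, operation, detail FROM edit_history "
        "WHERE feature_id = ? ORDER BY id",
        (feature_id,),
    )
    return rows.fetchall()


record_edit("gene-001", "curator_a", "create", "dragged from evidence track")
record_edit("gene-001", "curator_b", "set_exon_boundary", "exon 2 end moved")
record_edit("gene-002", "curator_a", "create")
print(history_for("gene-001"))
```

Because every curator writes to the same store, any of them can review who changed a gene model and in what order.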
The Information Editor
•  DBXRefs
•  PubMed IDs
•  GO terms
•  Comments
Additional Functionality
In addition to the protein-coding gene annotation that you know and love:
•  Non-coding genes: ncRNAs, miRNAs, repeat regions, and TEs
•  Sequence alterations (less coverage = more fragmentation)
•  Visualization of stage and cell-type specific transcription data as
coverage plots, heat maps, and alignments
General Process of Curation
1.  Select a chromosomal region of interest, e.g. scaffold.
2.  Select appropriate evidence tracks.
3.  Determine whether a feature in an existing evidence track will provide a reasonable gene model to start working.
-  If yes: select and drag the feature to the ‘User-created Annotations’ area, creating an initial gene model. If necessary, use editing functions to adjust the gene model.
-  If not: let’s talk.
4.  Check your edited gene model for integrity and accuracy by comparing it with available homologs.
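Before comparing a model with homologs, a few mechanical integrity checks on the coding sequence can be run on an exported CDS. The sketch below is one possible set of checks, not a Web Apollo feature: `check_cds` is a hypothetical helper, the sequences are invented, and it assumes the standard genetic code (mitochondrial genes, for example, use different codon tables).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds):
    """Return a list of problems found in a coding sequence (empty list = none found)."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3 (possible frameshift)")
    if not cds.startswith("ATG"):
        problems.append("does not start with ATG")
    # Only complete codons are examined; a trailing partial codon is ignored here.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an in-frame internal stop codon")
    return problems


# Invented sequences, for illustration only.
print(check_cds("ATGAAATGA"))
print(check_cds("ATGTAAAAATGA"))
print(check_cds("ATGAAATG"))
```

A model that passes these checks can still be wrong, so the comparison with homologs in step 4 remains the real test.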
Always remember: when annotating gene models using Web
Apollo, you are looking at a ‘frozen’ version of the genome
assembly and you will not be able to modify the assembly itself.
Example: NADH dehydrogenase subunit 5
Live Demonstration using the Apis mellifera and Biomphalaria
glabrata genomes.
A public Honey Bee Web Apollo Demo is available at
http://genomearchitect.org/WebApolloDemo
Arthropod-centric Thanks!
AgriPest Base
FlyBase
Hymenoptera Genome Database
VectorBase
Acromyrmex echinatior
Acyrthosiphon pisum
Apis mellifera
Atta cephalotes
Bombus terrestris
Camponotus floridanus
Helicoverpa armigera
Linepithema humile
Manduca sexta
Mayetiola destructor
Nasonia vitripennis
Pogonomyrmex barbatus
Solenopsis invicta
Tribolium castaneum…and many more!
Thank you.
Thanks!
•  Berkeley Bioinformatics Open-source Projects
(BBOP), Berkeley Lab: Web Apollo and Gene
Ontology teams. Suzanna E. Lewis (PI).
•  Elsik Lab. § University of Missouri. Christine G.
Elsik (PI).
•  Ian Holmes (PI). * University of California Berkeley.
•  Arthropod genomics community, i5K (http://www.arthropodgenomes.org/wiki/i5K) Steering Committee, teams at USDA/NAL, HGSC-BCM, BGI, and 1KITE (http://www.1kite.org/).
•  Web Apollo is supported by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from NHGRI, and by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
•  Insect images used with permission: http://AlexanderWild.com
•  For your attention, thank you!
Web Apollo
Ed Lee
Gregg Helt
Colin Diesh §
Deepak Unni §
Rob Buels *
Gene Ontology
Chris Mungall
Seth Carbon
Heiko Dietze
BBOP
Web Apollo: http://GenomeArchitect.org
GO: http://GeneOntology.org
i5K: http://arthropodgenomes.org/wiki/i5K
