An introduction to Web Apollo.
A webinar for the i5K Pilot Species Projects - Hemiptera
Monica Munoz-Torres, PhD
Biocurator & Bioinformatics Analyst | @monimunozto
Genomics Division, Lawrence Berkeley National Laboratory
12+1 May, 2014
UNIVERSITY OF
CALIFORNIA
Outline
1. What is Web Apollo?:
• Definition & working concept.
2. Community based curation from our
experience. Lessons Learned.
3. Manual Annotation at i5K: how do we
get there?
4. Becoming acquainted with Web
Apollo.
What is Web Apollo?
• Web Apollo is a web-based, collaborative genomic
annotation editing platform.
We need annotation editing tools to modify and refine the
precise location and structure of the genome elements that
predictive algorithms cannot yet resolve automatically.
Find more about Web Apollo at
http://GenomeArchitect.org
and
Genome Biol 14:R93. (2013).
Brief history of Apollo*:
a. Desktop:
one person at a time editing a
specific region, annotations
saved in local files; slowed down
collaboration.
b. Java Web Start:
users saved annotations directly
to a centralized database;
potential issues with stale
annotation data remained.
* Biologists could finally visualize computational analyses and
experimental evidence from genomic features and build
manually curated consensus gene structures. Apollo became a
very popular, open source tool (insects, fish, mammals, birds, etc.).
Web Apollo
• Browser-based; plugin for JBrowse.
• Allows for intuitive annotation creation and editing,
with gestures and pull-down menus to create
transcripts, add/delete/resize exons, merge/split
exons or transcripts, insert comments
(CV, freeform text), etc.
• Customizable rules and
appearance.
• Edits in one client are
instantly pushed to all other
clients: Collaborative!
Working
Concept
In the context of manual gene annotation,
curation tries to find the best examples
and/or eliminate (most) errors.
To conduct manual annotation efforts:
Gather and evaluate all available evidence
using quality-control metrics to
corroborate or modify automated
annotation predictions.
Perform sequence similarity searches
(phylogenetic framework) and use
literature and public databases to:
• Predict functional assignments from
experimental data.
• Distinguish orthologs from paralogs,
and classify gene membership in
families and networks.
Automated gene models
Evidence:
cDNAs, HMM domain searches,
alignments with assemblies or
genes from other species.
Manual annotation & curation
Dispersed, community-based manual gene
annotation efforts.
Using Web Apollo, we* have trained
geographically dispersed scientific
communities to perform biologically
supported manual annotations, and
monitored their findings: ~80 institutions,
14 countries, hundreds of scientists, and
gatekeepers.
– Training workshops and geneborees.
– Tutorials with detailed instructions.
– Personalized user support.
*Collaboration with Elsik Lab,
Hymenoptera Genome
Database.
What have we learned?
Harvesting expertise from dispersed researchers who
assigned functions to predicted and curated peptides,
we have developed more interactive and responsive
tools, as well as better visualization, editing, and
analysis capabilities.
It is helpful to work together.
Scientific community efforts bring together domain-specific
and natural history expertise that would otherwise
have remained disconnected.
Improved Automated Annotations
In many cases, automated annotations have been
improved (e.g., Apis mellifera: Elsik et al. BMC Genomics 2014, 15:86).
We also learned of the challenges of newer sequencing
technologies, e.g.:
– Frameshifts and indel errors
– Split genes across scaffolds
– Highly repetitive sequences
To face these challenges, we train annotators in
recovering coding sequences in agreement with all
available biological evidence.
Understanding the evolution of sociality.
Comparison of the genomes of 7 species of
ants contributed to a better understanding
of the evolution and organization of insect
societies at the molecular level.
Insights drawn mainly from six core aspects of
ant biology:
1. Alternative morphological castes
2. Division of labor
3. Chemical Communication
4. Alternative social organization
5. Social immunity
6. Mutualism
… groups of
communities
have taught us a
lot!
Libbrecht et al. Genome Biology 2013, 14:212
A little training goes a long way!
With the right tools, wet lab scientists make exceptional
curators who can easily learn to maximize the
generation of accurate, biologically supported gene
models.
Manual annotation at i5K
How do we get there?
In a genome sequencing project…
Assembly → Automated Annotation → Manual annotation → Experimental validation
Gene Prediction:
Identification of protein-coding genes, tRNAs, rRNAs,
regulatory motifs, repetitive elements (masked), etc.
Ab initio or homology-based; e.g., fgenesh, Augustus,
geneid, SGP2.
Nucleic Acids Res. 2003, 31(13):3738-3741
Gene Annotation:
Integration of data from prediction tools to generate a
consensus set of predictions (gene models).
• Models may be organized by:
- automatic integration of predicted sets; e.g., GLEAN
- packaging the necessary tools into a pipeline; e.g., MAKER
• Transcriptomes are used to further inform the annotation
process.
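The idea of integrating several predicted sets can be illustrated with a deliberately naive majority vote over exon coordinates. This is not how GLEAN or MAKER work internally; the tool names are reused only as labels, and the coordinates, the `consensus_exons` function, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical exon predictions (start, end) for one locus from three tools.
predictions = {
    "augustus": [(100, 200), (300, 400), (500, 650)],
    "fgenesh": [(100, 200), (300, 400), (520, 650)],
    "geneid": [(100, 200), (310, 400), (500, 650)],
}


def consensus_exons(predicted_sets, min_support=2):
    """Keep exons whose exact coordinates are reported by at least min_support tools."""
    # Each tool votes at most once per exon, hence the set().
    votes = Counter(exon for exons in predicted_sets.values() for exon in set(exons))
    return sorted(exon for exon, count in votes.items() if count >= min_support)


print(consensus_exons(predictions))
```

Exons supported by only one tool are dropped here, which is exactly the kind of decision a curator may later need to revisit against the evidence tracks.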
The Collaborative Curation Process at i5K
1) A computationally predicted consensus gene set has
been generated using multiple lines of evidence; e.g.,
CLEC_v0.5.3-Models.
2) i5K Projects will integrate consensus computational
predictions with manual annotations to produce an updated
Official Gene Set (OGS):
» If it’s not on either track, it won’t make the OGS!
» If it’s there and it shouldn’t be, it will still make the OGS!
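The two rules above can be sketched as a small merge function. The slides do not describe how i5K actually merges the tracks, so the logic here is an assumption: a manual annotation supersedes every consensus model it overlaps, and a manual annotation labelled 'Delete' removes the model without replacing it. The model IDs, the scaffold name, and the `build_ogs` helper are hypothetical.

```python
def overlaps(a, b):
    """True if two models on the same scaffold share at least one base (1-based, inclusive)."""
    return a["scaffold"] == b["scaffold"] and a["start"] <= b["end"] and b["start"] <= a["end"]


def build_ogs(consensus, manual):
    """Merge the consensus track and the manual track into an illustrative gene set.

    Assumption: manual annotations win over overlapping consensus models, and
    a 'Delete' label drops the model entirely.
    """
    kept_manual = [m for m in manual if m.get("status") != "Delete"]
    untouched = [c for c in consensus if not any(overlaps(c, m) for m in manual)]
    return sorted(kept_manual + untouched, key=lambda g: (g["scaffold"], g["start"]))


# Hypothetical models; only what sits on one of the two tracks can reach the result.
consensus = [
    {"id": "CLEC000001", "scaffold": "Scaffold1", "start": 100, "end": 900},
    {"id": "CLEC000002", "scaffold": "Scaffold1", "start": 1500, "end": 2400},
    {"id": "CLEC000003", "scaffold": "Scaffold1", "start": 3000, "end": 3800},
]
manual = [
    {"id": "curated-A", "scaffold": "Scaffold1", "start": 120, "end": 950},
    {"id": "CLEC000003", "scaffold": "Scaffold1", "start": 3000, "end": 3800, "status": "Delete"},
]

ogs_ids = [g["id"] for g in build_ogs(consensus, manual)]
print(ogs_ids)
```

In this sketch, a wrong consensus model that nobody touches (here `CLEC000002`, if it were wrong) still reaches the output, which is the point of the second rule.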
Consensus set: reference and start point
• In some cases, the algorithms and metrics used to generate
consensus sets may actually reduce the accuracy of a gene’s
representation; in such cases, use another model (e.g., the Augustus
model) to create a new annotation.
• Isoforms: drag the original and the alternatively spliced form to the
‘User-created Annotations’ area.
• If an annotation needs to be removed from the consensus set,
drag it to the ‘User-created Annotations’ area and label it as
‘Delete’ in the Information Editor.
• Overlapping interests? Collaborate to reach agreement.
• Follow the guidelines for i5K Pilot Species Projects as shown at
http://goo.gl/LRu1VY
4. Becoming Acquainted with Web Apollo.
Web Apollo
Graphical User Interface (GUI) for editing annotations
• Navigation tools: pan and zoom.
• Search box: go to a scaffold or a gene model.
• Grey bar of coordinates: indicates location. You can also select here in order to zoom to a sub-region.
• ‘View’: change color by CDS, toggle strands, set highlight.
• ‘File’: upload your own evidence (GFF3, BAM, BigWig, VCF*); add combination and sequence search tracks.
• ‘Tools’: use BLAT to query the genome with a protein or DNA sequence.
• Available Tracks
• Evidence Tracks Area
• ‘User-created Annotations’ Track
• Login
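For readers preparing their own evidence for the ‘File’ menu, it may help to see what a GFF3 feature line contains. The sketch below parses the nine tab-separated GFF3 columns with the Python standard library only; the example line (scaffold name, source, IDs) is made up, and `parse_gff3_line` is a hypothetical helper, not part of Web Apollo.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # GFF3 coordinates are 1-based and inclusive.
    record["start"], record["end"] = int(record["start"]), int(record["end"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = unquote(value)  # attribute values may be URL-escaped
    record["attributes"] = attributes
    return record


# Hypothetical evidence line: one exon of an RNA-seq based transcript.
example = "Scaffold1\tmy_rnaseq\texon\t1200\t1450\t.\t+\t.\tID=exon1;Parent=mRNA1"
rec = parse_gff3_line(example)
print(rec["type"], rec["start"], rec["end"], rec["attributes"]["Parent"])
```

A quick check like this before uploading can catch lines with missing columns or space-separated fields, which are common causes of tracks that fail to display.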
Web Apollo
• Selection of features and sub-features.
• Edge-matching.
• Evidence Tracks Area
• ‘User-created Annotations’ Track
• The editing logic (server):
  – selects longest ORF as CDS
  – flags non-canonical splice sites
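The two server-side checks named above can be sketched in a few lines. This is not Web Apollo's implementation: it assumes forward-strand sequence only, an ORF defined as ATG through the first in-frame stop codon, and treats any intron that does not start with GT and end with AG as non-canonical. The function names and the toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(mrna):
    """Return (start, end) of the longest ATG..stop ORF, 0-based with end exclusive."""
    mrna = mrna.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None
    return best


def noncanonical_introns(genome, exons):
    """List introns between consecutive exons that are not GT...AG.

    Exons are 0-based, half-open (start, end) pairs on the forward strand.
    """
    flagged = []
    for (_, donor_exon_end), (acceptor_exon_start, _) in zip(exons, exons[1:]):
        donor = genome[donor_exon_end:donor_exon_end + 2].upper()
        acceptor = genome[acceptor_exon_start - 2:acceptor_exon_start].upper()
        if (donor, acceptor) != ("GT", "AG"):
            flagged.append((donor_exon_end, acceptor_exon_start, donor, acceptor))
    return flagged


# Toy transcript: the ORF ATG AAA TGG TTT TAA sits at offset 2.
orf = longest_orf("CCATGAAATGGTTTTAACC")

# Toy gene: the first intron is GT...AG, the second is GC...AG.
exon1, intron1, exon2, intron2, exon3 = "ATGGCC", "GTAAGTTTCAG", "GATTCA", "GCAAGTTTCAG", "GGTTAA"
genome = exon1 + intron1 + exon2 + intron2 + exon3
exons = [(0, 6), (17, 23), (34, 40)]
flags = noncanonical_introns(genome, exons)
print(orf, flags)
```

A flagged intron is a prompt to look at the evidence, not proof of an error; some genuine introns use other dinucleotides.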
Web Apollo
• DNA Track
• ‘User-created Annotations’ Track
• Two new kinds of tracks:
  – annotation editing
  – sequence alteration editing
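To make the idea of a sequence alteration concrete, the sketch below applies insertions, deletions, and substitutions to a string. It is a minimal illustration under stated assumptions, not Web Apollo's data model: the dictionary keys (`kind`, `pos`, `bases`, `length`) and the `apply_alterations` function are hypothetical, and positions are 0-based offsets into the unedited sequence.

```python
def apply_alterations(sequence, alterations):
    """Return a copy of sequence with insertions, deletions and substitutions applied.

    Edits are applied from right to left so that the coordinates of edits
    further upstream are not shifted by edits already made.
    """
    seq = list(sequence)
    for alt in sorted(alterations, key=lambda a: a["pos"], reverse=True):
        pos = alt["pos"]
        if alt["kind"] == "insertion":
            seq[pos:pos] = alt["bases"]
        elif alt["kind"] == "deletion":
            del seq[pos:pos + alt["length"]]
        elif alt["kind"] == "substitution":
            seq[pos:pos + len(alt["bases"])] = alt["bases"]
        else:
            raise ValueError(f"unknown alteration kind: {alt['kind']}")
    return "".join(seq)


# Toy assembly with one extra C at offset 6 that breaks the reading frame.
assembly = "ATGGCCCAATTGA"
fixed = apply_alterations(assembly, [{"kind": "deletion", "pos": 6, "length": 1}])
print(fixed, len(fixed) % 3)
```

After the deletion, the toy sequence reads ATG GCC AAT TGA, so its length is again a multiple of three.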
Web Apollo
• Annotations, annotation edits, and History: stored in a centralized database.
• Annotation Information Editor
[Some of the] Functionality:
• Protein-coding gene annotation (that you know and love)
• Sequence alterations (less coverage = more fragmentation)
• Visualization of stage and cell-type specific transcription data as
coverage plots, heat maps, and alignments
Example: ORCO
Live Demonstration using the Cimex lectularius genome
Arthropodcentric Thanks!
AgriPest Base
FlyBase
Hymenoptera Genome Database
VectorBase
Apis mellifera
Tribolium castaneum
Pogonomyrmex barbatus
Manduca sexta
Bombus terrestris
Helicoverpa armigera
Nasonia vitripennis
Acyrthosiphon pisum
Mayetiola destructor
Atta cephalotes
Linepithema humile
Camponotus floridanus
Solenopsis invicta
Acromyrmex echinatior
Thanks!
• Berkeley Bioinformatics Open-source Projects
(BBOP), Berkeley Lab: Web Apollo and Gene
Ontology teams. Suzanna E. Lewis (PI).
• Elsik Lab. § University of Missouri. Christine G.
Elsik (PI).
• Ian Holmes (PI). * University of California Berkeley.
• Arthropod genomics community, i5K
http://www.arthropodgenomes.org/wiki/i5K Steering
Committee, USDA/NAL, HGSC-BCM, BGI, and
1KITE http://www.1kite.org/.
• Web Apollo is supported by NIH grants 5R01GM080203
from NIGMS, and 5R01HG004483 from NHGRI, and by the
Director, Office of Science, Office of Basic Energy
Sciences, of the U.S. Department of Energy under Contract
No. DE-AC02-05CH11231.
• Insect images used with permission:
http://AlexanderWild.com
• For your attention, thank you!
Web Apollo
Ed Lee
Gregg Helt
Colin Diesh §
Deepak Unni §
Rob Buels *
Gene Ontology
Chris Mungall
Seth Carbon
Heiko Dietze
BBOP
Web Apollo: http://GenomeArchitect.org
GO: http://GeneOntology.org
i5K: http://arthropodgenomes.org/wiki/i5K
