Go pathway-interaction-integration

Integration of GO, Pathway data and Interaction data
Chris Mungall, Peter D’Eustachio

The GO was originally intended to integrate databases
How are we doing?
“Interoperability of genomic databases is limited by this lack of progress, and it is this major obstacle that the Gene Ontology (GO) Consortium was formed to address”
Gene Ontology: Tool for the Unification of Biology. Nat Genet 2000
Diagram labels: SGD, FB, GOA

The GO was originally intended to integrate databases
How are we doing? Not as well as we could!
Diagram labels: GO; SGD, FB, GOA; Pathway Commons, IMEX, Reactome, Cyc…, BioGRID, IntAct…

Integration enhances analyses and reduces workload
- Division of labor
  - leave specialized curation to specialized systems biology databases
  - but data needs to be re-combined to prevent siloing
- GO is an invaluable one-stop shop for term enrichment etc.
- Can we quantify how integrating with systems biology databases helps users?
- Yes! We can do the experiment: GO term enrichment analysis on all MolSigDB gene sets
  - with Reactome annotations (also including Reactome inputs/outputs, not currently in GOA)
  - without Reactome annotations
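The with/without comparison described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the analysis actually run for the talk: it assumes a one-sided hypergeometric test as the enrichment statistic, and the gene names, the term label `T` and both annotation tables are made-up toy data.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn from `population` genes,
    of which `annotated` carry the term (one-sided hypergeometric test)."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


def merge(a, b):
    """Union two gene -> set-of-terms annotation tables."""
    merged = {gene: set(terms) for gene, terms in a.items()}
    for gene, terms in b.items():
        merged.setdefault(gene, set()).update(terms)
    return merged


def enrichment(gene_set, annotations, background):
    """Return {term: p-value} for every term annotated to the gene set."""
    gene_set = set(gene_set) & set(background)
    terms = {t for g in gene_set for t in annotations.get(g, ())}
    results = {}
    for term in terms:
        annotated = {g for g in background if term in annotations.get(g, ())}
        hits = len(gene_set & annotated)
        results[term] = hypergeom_pvalue(
            len(background), len(annotated), len(gene_set), hits
        )
    return results


# Toy data (hypothetical): 20 background genes, one term "T".
background = {f"g{i}" for i in range(20)}
goa_only = {"g0": {"T"}, "g1": {"T"}}      # stand-in for GOA annotations
reactome = {"g2": {"T"}, "g3": {"T"}}      # stand-in for Reactome-derived annotations
query = {"g0", "g2", "g3", "g10"}          # stand-in for one gene set

p_without = enrichment(query, goa_only, background)["T"]
p_with = enrichment(query, merge(goa_only, reactome), background)["T"]
print(f"without Reactome: {p_without:.4f}  with Reactome: {p_with:.4f}")
```

In this toy case the merged annotations give a smaller p-value for the same gene set, which is the kind of improvement the experiment is meant to measure. A real run would also need multiple-testing correction, which the sketch omits.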

Integration enhances analyses
- GOA+R: many p-values improved significantly
- Recapitulated biologically valid results that would have been suppressed had only a single resource been used
- Examples: genes down-regulated in Alzheimer's

How are we currently integrating systems biology datasets?
- Interaction data
  - Currently IntAct, soon IMEX
  - “protein binding” and “self-protein binding” only (+with)
- Pathway data
  - Currently Reactome only
  - Loses much of what is in Reactome, e.g. inputs and outputs
  - Manually curated GO<->Reactome links
    - incomplete
    - not always to the most specific term
    - labor-intensive
    - become stale over time
  - other pathway databases?
- This can be improved!

Automating integration using cross-product definitions: pathway databases
    [Term]
    id: GO:0015871
    name: choline transport
    intersection_of: GO:0006810 ! transport
    intersection_of: results_in_transport_of CHEBI:15354 ! choline
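To show how a logical definition like the one on this slide could drive automatic mapping, here is a minimal Python sketch. It assumes the pathway event has already been reduced to a (genus, relation, participant) triple, and it uses exact lookup rather than an OWL reasoner, so it would miss matches that depend on subsumption. The event record and its `id` are hypothetical.

```python
from typing import NamedTuple, Optional


class LogicalDef(NamedTuple):
    genus: str       # the GO parent process, e.g. transport
    relation: str    # the differentia relation
    filler: str      # the entity the relation points to, e.g. a CHEBI ID


# Cross-product definition taken from the stanza on this slide.
LOGICAL_DEFS = {
    "GO:0015871": LogicalDef("GO:0006810", "results_in_transport_of", "CHEBI:15354"),
}


def build_index(defs):
    """Invert {GO ID: definition} into {definition: GO ID} for lookup."""
    return {definition: go_id for go_id, definition in defs.items()}


def map_event(event, index) -> Optional[str]:
    """Return the GO term whose definition matches the event, or None."""
    key = LogicalDef(event["genus"], event["relation"], event["participant"])
    return index.get(key)


index = build_index(LOGICAL_DEFS)

# Hypothetical pathway event: something transports choline.
choline_event = {
    "id": "EXAMPLE-EVENT-1",  # placeholder, not a real Reactome identifier
    "genus": "GO:0006810",
    "relation": "results_in_transport_of",
    "participant": "CHEBI:15354",
}
# Hypothetical event with a participant that has no cross-product definition here.
other_event = dict(choline_event, id="EXAMPLE-EVENT-2", participant="CHEBI:0000000")

mapped = map_event(choline_event, index)
unmapped = map_event(other_event, index)
print(mapped, unmapped)
```

The second event returns `None` because no definition with that filler exists in the toy table, which mirrors why incomplete cross-product files limit what can be mapped.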

Automating integration using cross-products: pathway databases
- We can also automatically map:
  - catalysis terms [165*]
  - transport [373]
  - binding [133]
  - phosphorylation and other modifications
  - metabolism [278]
  - signaling…
- All this relies on different cross-product files
- Any pathway database that exports BioPAX-OWL can be used, e.g. humancyc, mousecyc, pathwaycommons, …
*Numbers for Reactome-human

Automating integration using cross-products: interaction databases
Diagram labels: FIGF, VEGFR, binds, has_function, is_a
    [Term]
    id: GO:0043184
    name: vascular endothelial growth factor receptor 2 binding
    intersection_of: GO:0005488 ! binding
    intersection_of: results_in_binding_of PRO:000002112 ! VEGFR 2
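The same idea applied to interaction data can be sketched as follows. This is an illustration under stated assumptions, not the integration tool itself: interactions are taken to be simple pairs of identifiers, `FIGF` is used as a bare placeholder for a gene product ID, and the lookup table holds only the one binding definition from this slide.

```python
# Binding cross-product from the stanza on this slide:
# GO:0043184 = binding (GO:0005488) that results_in_binding_of PRO:000002112.
# Stored here as {bound entity: GO binding term}.
BINDING_DEFS = {"PRO:000002112": "GO:0043184"}


def infer_binding_annotations(interactions, binding_defs):
    """For each interacting pair, annotate a participant with the specific
    'X binding' term whenever its partner X has a binding cross-product.
    Returns a set of (subject, GO term, with) triples."""
    annotations = set()
    for a, b in interactions:
        # An interaction is symmetric, so test both directions.
        for subject, partner in ((a, b), (b, a)):
            term = binding_defs.get(partner)
            if term is not None:
                annotations.add((subject, term, partner))
    return annotations


# Toy interaction list (hypothetical record): FIGF binds VEGFR 2.
interactions = [("FIGF", "PRO:000002112")]
inferred = infer_binding_annotations(interactions, BINDING_DEFS)
print(inferred)
```

The third element of each triple plays the role of the "with" value mentioned for the current “protein binding” annotations, so the inferred annotation is more specific without dropping the partner.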

Automated Integration: Results
- Reactome
  - Evaluation in progress
  - Many manually assigned equivalencies recapitulated
  - Inferred equivalencies differed in some cases
    - sometimes better than manually assigned
    - sometimes required info not in BioPAX export
    - ongoing discussions
- BioGRID
  - not evaluated (all trivial)
  - inferred annotations improve some enrichment results
  - E.g. Brentani angiogenesis gene sets, increased enrichment for VEGFR binding
  - Obvious but useful as proof of concept

Conclusions and future work
- We can be more efficient:
  - Coordinate with systems biology databases to divide labor
  - Prevent siloing through semi-automated integration
  - GO acts as a high-level ‘window’ on systems biology databases
- Still to be done:
  - Make integration tool production-ready
  - Reconcile existing misalignments, particularly signaling, which is highly inconsistent between GO and Reactome
  - Explore open questions, e.g. auto-generate terms?
  - Finish cross-products, they are vital, particularly PRO and CHEBI
