STRING Prediction of a functional association network for the yeast mitochondrial system Lars Juhl Jensen EMBL Heidelberg
Overview
  Prediction of functional associations between proteins
    What is STRING?
    Genomic context methods
    Integration of large-scale experimental data
    Combination and cross-species transfer of evidence
  (Coffee break)
  The yeast mitochondrial system
    Prediction of mitochondrial proteins
    A functional association network for mitochondria
    Mapping and correlating features of mitochondrial proteins
Part 1 Prediction of functional association between proteins Lars Juhl Jensen EMBL Heidelberg
What is STRING?
  Genomic neighborhood
  Species co-occurrence
  Gene fusions
  Database imports
  Experimental interaction data
  Microarray expression data
  Literature co-mentioning
Let the data speak for themselves ...
  Classification schemes are obviously difficult to predict if they are not supported by the data – there are no obvious features separating:
    Presidents vs. non-presidents
    Actors vs. non-actors
  Unsupervised methods may discover a more meaningful classification:
    Holding your pinky to your mouth is a clear sign of evil
    Wearing a bowtie is a sign of good
    So is consumption of alcoholic drinks
Inferring functional modules from gene presence/absence patterns
  [Figure: the “Cellulosome”. Labels: cell, cell wall, anchoring proteins, cellulosomes, resting protuberances, protracted protuberance, cellulose. © Trends Microbiol, 1999]
Genomic context methods © Nature Biotechnology, 2004
Score calibration against a common reference
  Many diverse types of evidence
    The quality of each is judged by very different raw scores
    Quality differences exist among data sets of the same type
  Solved by calibrating all scores against a common reference
    Scores are directly comparable
    Probabilistic scores allow evidence to be combined
  Requirements for the reference
    Must represent a compromise between all types of evidence
    Broad species coverage
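The slide states that probabilistic scores allow evidence to be combined, but it does not give the formula. A minimal sketch, assuming the calibrated scores are independent probabilities and that a "noisy-OR" combination is acceptable; the function name `combine_scores` is hypothetical:

```python
from math import prod


def combine_scores(scores):
    """Combine calibrated scores as 1 - prod(1 - p_i).

    Assumption: each score is a probability in [0, 1] and the evidence
    types are independent. The slides do not specify the exact formula.
    """
    for p in scores:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"score out of range: {p}")
    # The association is missed only if every evidence type misses it.
    return 1.0 - prod(1.0 - p for p in scores)
```

With this rule, several weak but concordant lines of evidence add up to a stronger combined score, and a single certain score dominates the result.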
Integrating physical interaction screens
  Make binary representation of complexes
  Yeast two-hybrid data sets are inherently binary
  Calculate score from number of (co-)occurrences
  Calculate score from non-shared partners
  Calibrate against KEGG maps
  Infer associations in other species
  Combine evidence from experiments
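One way to picture the "binary representation of complexes" and the co-occurrence count is sketched below. It assumes an all-pairs expansion of each purified complex, which the slide does not confirm; the function name `co_occurrence_counts` and the input layout are hypothetical:

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(complexes):
    """Count, for every protein pair, the number of complexes containing both.

    `complexes` is an iterable of member lists. Each complex is expanded
    into all unordered pairs (an assumption; other expansions exist).
    """
    counts = Counter()
    for members in complexes:
        # Sort and de-duplicate so each pair has one canonical key.
        for a, b in combinations(sorted(set(members)), 2):
            counts[(a, b)] += 1
    return counts
```

The resulting raw counts would still need to be turned into a score and calibrated against the reference, as the later steps on the slide describe.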
Mining microarray expression databases
  Re-normalize arrays by modern method to remove biases
  Build expression matrix
  Combine similar arrays by PCA
  Construct predictor by Gaussian kernel density estimation
  Calibrate against KEGG maps
  Infer associations in other species
Evidence transfer based on “fuzzy orthology”
  Orthology transfer is tricky
    Correct assignment of orthology is difficult for distant species
    Functional equivalence cannot be guaranteed for in-paralogs
  These problems are addressed by our “fuzzy orthology” scheme
    Confidence scores for functional equivalence are calculated from all-against-all alignment
    In the case of many-to-many relationships, evidence is distributed across possible pairs according to confidence scores
  [Figure: evidence transfer from source species to target species]
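The distribution step for many-to-many relationships could look roughly like the sketch below. It assumes that evidence is split in proportion to the confidence scores; the slide only says "according to confidence scores", so the proportional rule, the function name `transfer_evidence`, and the dictionary layout are all illustrative assumptions:

```python
def transfer_evidence(source_score, pair_confidence):
    """Distribute one source-species score over candidate target pairs.

    `pair_confidence` maps (target_a, target_b) to a confidence that the
    pair is functionally equivalent to the source pair. Each target pair
    receives a share proportional to its confidence (an assumption).
    """
    total = sum(pair_confidence.values())
    if total <= 0:
        return {}
    return {
        pair: source_score * confidence / total
        for pair, confidence in pair_confidence.items()
    }
```

For example, a source score of 0.8 spread over two candidate pairs with confidences 3 and 1 gives them 0.6 and 0.2 respectively.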
Multiple evidence types from several species
Predicting and defining metabolic pathways and other functional modules
  Defined manually: cutting metabolic maps into pathways
  Defined objectively: standard clustering of genome-scale data
  [Figure labels: metabolism overview, purine biosynthesis, histidine biosynthesis. Image: Molecular Biology of the Cell, 3rd edition]
Part 2 The yeast mitochondrial system Lars Juhl Jensen EMBL Heidelberg
Yeast mitochondria – why it should work well
  Because it is metabolism
    STRING was developed using KEGG pathways as a reference
    This may have caused STRING to function best on metabolism
  Because it is yeast
    By far the best covered organism in terms of physical interactions
    Many microarray gene expression studies
    Literature mining works well due to standardization of gene names
  Because it is prokaryotic
    Evolutionarily, mitochondria are of bacterial origin
    The genomic context methods in STRING are very powerful, but can only provide evidence for proteins with prokaryotic orthologs
Strategy for extracting a functional association network of the mitochondrial system
  Starting point:
    Reference set of proteins known to be mitochondrial
    A large, diverse set of experiments relevant for predicting mitochondrial proteins
    The global STRING network for yeast
  Predict mitochondrial candidate genes
    Use reference set to train neural networks for predicting candidate genes based on experimental data
    Use very high-confidence STRING links to suggest additional candidates based on interactions with reference and candidate genes
  Extract network that includes lower confidence interactions and identify functional modules by clustering
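The step that uses very high-confidence links to suggest additional candidates can be illustrated with a one-pass neighbor expansion. This is a sketch under stated assumptions: the threshold of 0.9, the `(protein_a, protein_b, score)` tuple layout, and the function name `expand_candidates` are hypothetical and not taken from the slides:

```python
def expand_candidates(seeds, links, threshold=0.9):
    """Return proteins linked to any seed by a high-confidence association.

    `seeds` is the set of reference and predicted mitochondrial genes;
    `links` is an iterable of (protein_a, protein_b, score) tuples.
    The 0.9 threshold is an illustrative assumption.
    """
    seeds = set(seeds)
    suggested = set()
    for a, b, score in links:
        if score < threshold:
            continue
        # A link suggests whichever endpoint is not already a seed.
        if a in seeds and b not in seeds:
            suggested.add(b)
        elif b in seeds and a not in seeds:
            suggested.add(a)
    return suggested
```

Lower-confidence links are ignored here on purpose; according to the slide they only come back in when the final network is extracted for clustering.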
Predicting mitochondrial proteins
  Training was done with 5-fold cross validation
    Reference set used as positive examples
    All other genes used as negative examples
  Top 800 contains more than 90% of known mitochondrial genes
  Surprising performance of the linear model
    As good as NN with 250 hidden neurons
    Better than MitoP2
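For readers unfamiliar with 5-fold cross validation, the splitting step can be sketched as follows. This shows only how genes might be partitioned into training and test sets; the function name `k_fold_indices` and the fixed random seed are assumptions, and the actual classifiers from the talk are not reproduced:

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Items are shuffled once with a fixed seed (an assumption, chosen for
    reproducibility) and dealt into k folds; each fold serves as the
    test set exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each gene is scored only by a model that never saw it during training, which is what makes a ranking such as the "top 800" a fair estimate of performance.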
[Figure, shown on three consecutive slides: clustered functional association network of the mitochondrial system. The second slide is labeled “Proteobacterial orthologs” and the third “Human disease orthologs”. Cluster labels include: TOM, TIM, MRPL, MRPS, Ribosome Related, Cytosolic Ribosome, Vacuolar Acidification, Fatty Acid Biosynth., RCCI, RCCII, RCCIII, RCCIV, RCCV, RCC_Asy, HAP Complex, Arg Biosynth., Leu/Val/Ile Biosynth., PDH/KGD/GCV, TCA Cycle, Cell Wall & pH Reg., DNA Repair, Replication/DNA Repair, Glucose sensing and CH remodelling, APC, Fission/Fusion, rRNA Processing, mRNA Processing, tRNA Splicing, TFIIIC Complex, m-AAA Complex, Iron Homeostasis/Chaperone Activity, GARP Complex, Actin, NUP]
Composition and interconnectivity of clusters
  A network of clusters
  Most probable path between clusters used as score
  Interacting clusters are preferentially within the same compartment
  Proteobacterial clusters typically localize to the mitochondria
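A most probable path can be found with Dijkstra's algorithm on edge costs of -log(p), since maximizing a product of probabilities is the same as minimizing the sum of their negative logarithms. The sketch below works on individual nodes and assumes edge scores are independent probabilities; how the talk aggregates this over whole clusters is not stated, and the function name `most_probable_path_score` is hypothetical:

```python
import heapq
from math import exp, log


def most_probable_path_score(edges, source, target):
    """Return the highest product of edge probabilities from source to target.

    `edges` maps a node to a list of (neighbor, probability) pairs.
    Returns 0.0 when the target cannot be reached.
    """
    best = {source: 0.0}          # lowest -log(probability) found so far
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return exp(-cost)
        if cost > best.get(node, float("inf")):
            continue              # stale queue entry
        for neighbor, p in edges.get(node, []):
            if p <= 0.0:
                continue          # an impossible edge contributes no path
            new_cost = cost - log(p)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return 0.0
```

Note that an indirect route through two strong links (0.9 × 0.9 = 0.81) can outscore a single weak direct link (0.5).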
Correlations among gene features
  Expression data agree with data on NF-specific growth defects
  Genes with detectable human orthologs are more conserved among yeasts
  Disease orthologs are often proteobacterial
  Knockout of disease orthologs causes less severe growth defects
Can human disease genes be predicted?
  Mitochondrial genes are already enriched in disease genes
  Previous slide showed that mitochondrial genes of proteobacterial origin are further enriched in disease gene orthologs
  Disease gene orthologs show less growth defect than other mitochondrial genes with human orthologs
Getting more specific – generally speaking
  Benchmarking against one common reference allows integration of heterogeneous data
  The different types of data do not all tell us about the same kind of functional associations
  It should be possible to assign likely interaction types from supporting evidence types
  An accurate model of the yeast mitotic cell cycle
    Approach
      High confidence set of physical interactions
      Custom analysis of cell cycle expression data
    Observations
      Dynamic assembly of cell cycle complexes
      Temporal regulation of Cdk specificity
  Ulrik de Lichtenberg, Lars Juhl Jensen, Søren Brunak and Peer Bork, “Dynamic complex formation during the yeast cell cycle”, to appear in Science
Conclusions
  Genomic context methods are able to infer the function of many prokaryotic proteins from genome sequences alone
  New genomic context methods are still being developed
  Integration of large-scale experimental data allows similar predictions to be made for eukaryotic proteins
  Successful data integration requires benchmarking and cross-species transfer of information
  Protein networks are useful for the analysis of large, complex biological systems
Acknowledgments
  The STRING team: Christian von Mering, Berend Snel, Martijn Huynen, Daniel Jaeggi, Steffen Schmidt, Mathilde Foglierini, Peer Bork
  New genomic context methods: Jan Korbel, Christian von Mering, Peer Bork
  ArrayProspector web service: Julien Lagarde, Chris Workman
  NetView visualization tool: Sean Hooper
  Study of yeast mitochondria: Fabiana Perocchi, Lars Steinmetz
  Analysis of yeast cell cycle: Ulrik de Lichtenberg, Thomas Skøt, Anders Fausbøll, Søren Brunak
  Web resources: string.embl.de, www.bork.embl.de/ArrayProspector, www.bork.embl.de/synonyms
Thank you!
