STRING - Modeling of biological systems through cross-species data integration
Lars Juhl Jensen
 
 
promoter analysis
Jensen et al., Bioinformatics, 2000
genome visualization
Pedersen et al., Journal of Molecular Biology, 2000
protein function prediction
 
 
 
STRING
 
integrate diverse evidence
functional interactions
Bork et al., Current Opinion in Structural Biology, 2005
179 proteomes
genomic context methods
phylogenetic profiles
 
 
 
 
[Figure labels: Cell, Cellulosomes, Cellulose]
anti-correlated profiles
 
analogous enzymes
Morett et al., Nature Biotechnology, 2003
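The phylogenetic-profile idea above can be sketched in a few lines. This is only an illustration: the slides do not say which similarity measure is used, so Pearson correlation is an assumption here, and `profile_correlation` and the toy genes are hypothetical names. Genes that co-occur across genomes score near +1, while genes that displace each other (as analogous enzymes may) score near -1.

```python
from math import sqrt

def profile_correlation(a, b):
    """Pearson correlation between two equal-length 0/1 presence profiles."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A gene present in every genome (or in none) carries no signal.
        return 0.0
    return cov / sqrt(var_a * var_b)

# Toy profiles over six hypothetical genomes (1 = gene present, 0 = absent).
gene_x = [1, 1, 0, 0, 1, 0]
gene_y = [1, 1, 0, 0, 1, 0]  # same pattern as gene_x
gene_z = [0, 0, 1, 1, 0, 1]  # exact complement of gene_x

print(profile_correlation(gene_x, gene_y))  # correlated profiles
print(profile_correlation(gene_x, gene_z))  # anti-correlated profiles
```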
gene neighborhood
 
bidirectional promoters
 
Korbel et al., Nature Biotechnology, 2004
gene fusion
 
evolution
 
 
statistics
(the original sin)
scoring and benchmarking
raw quality scores
gene neighborhood
sum of intergenic distances
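A raw neighborhood score of this kind could be computed roughly as follows. The exact definition is not given on the slides, so this is a sketch under stated assumptions: genes are `(start, end)` coordinates on one strand of one genome, overlapping genes contribute a distance of zero, and `sum_intergenic_distances` is a hypothetical helper name.

```python
def sum_intergenic_distances(genes):
    """Total gap length between consecutive genes in a run.

    genes: iterable of (start, end) coordinates, in any order.
    """
    ordered = sorted(genes)  # sort by start coordinate
    total = 0
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Overlapping neighbors are treated as having no gap.
        total += max(0, next_start - prev_end)
    return total

# Toy run of three genes; the last two overlap by 10 bases.
run = [(100, 400), (450, 900), (890, 1500)]
print(sum_intergenic_distances(run))
```

A smaller sum suggests a tighter gene cluster, but the number has no meaning on its own until it is calibrated.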
 
many types of evidence
not directly comparable
calibrate vs. gold standard
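Calibration against a gold standard can be illustrated by binning raw scores and measuring, per bin, the fraction of protein pairs that the gold standard confirms. This is a simplified sketch, not the procedure from the talk: the binning scheme, the function name `calibrate`, and the toy data are assumptions, and every pair absent from the gold set is counted as a negative.

```python
def calibrate(scored_pairs, gold_pairs, bin_edges):
    """Return (low, high, fraction_confirmed) for each score bin.

    scored_pairs: list of ((protein_a, protein_b), raw_score)
    gold_pairs:   set of frozensets, each a known functional link
    bin_edges:    ascending edges; bin i covers [edges[i], edges[i+1])
    """
    n_bins = len(bin_edges) - 1
    counts = [[0, 0] for _ in range(n_bins)]  # [confirmed, total] per bin
    for (a, b), score in scored_pairs:
        for i in range(n_bins):
            if bin_edges[i] <= score < bin_edges[i + 1]:
                counts[i][1] += 1
                if frozenset((a, b)) in gold_pairs:
                    counts[i][0] += 1
                break  # scores outside all bins are ignored
    return [
        (bin_edges[i], bin_edges[i + 1], (hit / total if total else None))
        for i, (hit, total) in enumerate(counts)
    ]

# Toy gold standard and raw scores for hypothetical proteins A-D.
gold = {frozenset(("A", "B")), frozenset(("C", "D"))}
scored = [
    (("A", "B"), 0.9), (("C", "D"), 0.8), (("A", "C"), 0.7),
    (("A", "D"), 0.2), (("B", "C"), 0.1), (("D", "C"), 0.3),
]
calibration = calibrate(scored, gold, [0.0, 0.5, 1.0])
print(calibration)
```

The per-bin fractions put different evidence types on a shared probability-like scale, which is what makes them comparable.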
 
curated knowledge
KEGG Kyoto Encyclopedia of Genes and Genomes
STKE Signal Transduction Knowledge Environment
Reactome
MIPS Munich Information center for Protein Sequences
primary experimental data
Jensen et al., Drug Discovery Today: Targets, 2004
microarray expression data
GEO Gene Expression Omnibus
physical protein interactions
BIND Biomolecular Interaction Network Database
MINT Molecular Interactions Database
GRID General Repository for Interaction Datasets
DIP Database of Interacting Proteins
HPRD Human Protein Reference Database
von Mering et al., Nucleic Acids Research, 2005
literature mining
MEDLINE
SGD Saccharomyces Genome Database
The Interactive Fly
OMIM Online Mendelian Inheritance in Man
co-mentioning
different gene names
curated synonyms lists
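Co-mention counting with a synonym list can be sketched as below. This is a toy illustration, not the text-mining pipeline from the talk: the tokenizer is deliberately crude, and the gene names, the `synonyms` table, and `count_comentions` are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

def count_comentions(abstracts, synonyms):
    """Count abstracts in which two genes are mentioned together.

    synonyms: dict mapping a canonical gene name to all of its names.
    Returns a Counter keyed by sorted (gene, gene) tuples.
    """
    # Map every lowercased synonym back to its canonical name.
    lookup = {
        name.lower(): canonical
        for canonical, names in synonyms.items()
        for name in names
    }
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        # Each gene counts once per abstract, whichever name was used.
        found = sorted({lookup[t] for t in tokens if t in lookup})
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts

# Hypothetical synonym list and abstracts.
synonyms = {"GENE1": ["GENE1", "G1P"], "GENE2": ["GENE2"], "GENE3": ["GENE3"]}
abstracts = [
    "G1P binds GENE2 in vitro.",
    "GENE1 and GENE2 are co-expressed.",
    "GENE3 is unrelated.",
]
comentions = count_comentions(abstracts, synonyms)
print(comentions)
```

Without the synonym table, the first abstract would be missed because it uses a different name for the same gene.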
NLP Natural Language Processing
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxgene The GAL4 gene]
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
Jensen et al., Nature Reviews Genetics, 2006
combine all evidence
naïve Bayesian scheme
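One common way to combine evidence under a naive (independence) assumption is to multiply the probabilities that each channel is wrong. The slides do not spell out the formula, so the sketch below is an assumption: `combine_scores` is a hypothetical name, the inputs are taken to be calibrated probabilities, and any correction for prior probability is left out.

```python
def combine_scores(scores):
    """Combine per-channel probabilities assuming independent evidence.

    Returns 1 - product(1 - p) over all channels.
    """
    prob_all_wrong = 1.0
    for p in scores:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each score must be a probability in [0, 1]")
        prob_all_wrong *= 1.0 - p
    return 1.0 - prob_all_wrong

# Toy example: neighborhood, co-expression and text-mining scores
# for one hypothetical protein pair.
print(combine_scores([0.9, 0.6, 0.0]))
```

Under this scheme, several weak but independent lines of evidence can add up to a confident link, and a channel with no evidence (score 0) changes nothing.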
spread over many species
transfer based on orthology
[Figure labels: Source species, Target species, ?]
 
 
 
 
 
 
defining functional modules
 
 
qualitative modeling
the mitochondrial system
 
RCCs
predicting “mode of action”
Jensen et al., Drug Discovery Today: Targets, 2004
Acknowledgments
The STRING team (EMBL): Christian von Mering, Berend Snel, Martijn Huynen, Sean Hooper, Mathilde Foglierini, Julien Lagarde, Peer Bork
Literature mining project (EML Research): Jasmin Saric, Rossitza Ouzounova, Isabel Rojas
New genomic context methods (EMBL): Jan Korbel, Peer Bork
Modeling of yeast mitochondria (EMBL): Fabiana Perocchi, Lars Steinmetz
Inspiration for presentation: Dick Clarence Hardt, Anders Gorm Pedersen
Thank you!
