Prediction of protein networks through data integration
Lars Juhl Jensen
EMBL Heidelberg
prediction of interactions
STRING
 
functional interactions
373 genomes
model organism databases
Ensembl
Genome Reviews
RefSeq
genomic context methods
gene neighborhood
 
gene fusion
 
phylogenetic profiles
 
 
 
 
Cell Cellulosomes Cellulose
correct interactions
wrong associations
phylogenetic profiles
 
SVD Singular Value Decomposition
Euclidean distance
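A minimal sketch of comparing phylogenetic profiles by Euclidean distance. The slides name SVD followed by Euclidean distance; this sketch skips the SVD step and works on raw presence/absence profiles. The gene names and profiles are hypothetical.

```python
from math import sqrt

# Hypothetical presence (1) / absence (0) profiles across five genomes.
profiles = {
    "geneA": [1, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 1, 0],
    "geneC": [0, 1, 1, 0, 1],
}

def euclidean_distance(p, q):
    """Euclidean distance between two equal-length profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def rank_pairs(profiles):
    """Return all gene pairs sorted by increasing profile distance."""
    names = sorted(profiles)
    pairs = [
        (euclidean_distance(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return sorted(pairs)

ranked = rank_pairs(profiles)
```

Pairs with the smallest distance have the most similar profiles; the distance itself is only a raw score and still needs calibration before it can be compared with other evidence.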
gene neighborhood
 
sum of intergenic distances
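A minimal sketch of a sum-of-intergenic-distances score for two genes in the same run. It assumes a single genome, genes on one strand, and a list sorted by start coordinate; the gene names and coordinates are hypothetical.

```python
# Hypothetical genes on one strand of one genome: (name, start, end),
# sorted by start coordinate.
genes = [
    ("g1", 100, 900),
    ("g2", 950, 1800),
    ("g3", 1810, 2500),
    ("g4", 4000, 5200),
]

def intergenic_sum(genes, first, second):
    """Sum the gaps between consecutive genes from `first` to `second`."""
    order = [name for name, _, _ in genes]
    i, j = sorted((order.index(first), order.index(second)))
    total = 0
    for (_, _, end), (_, next_start, _) in zip(genes[i:j], genes[i + 1:j + 1]):
        total += max(0, next_start - end)  # overlapping genes count as zero gap
    return total
```

Here a smaller sum means a tighter neighborhood, so the score runs in the opposite direction to most other raw scores.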
raw quality scores
rank by reliability
not comparable
Euclidean distance
sum of intergenic distances
benchmarking
calibrate vs. gold standard
 
raw quality scores
probabilistic scores
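A minimal sketch of calibrating raw scores against a gold standard. It assumes higher raw scores are better, bins pairs by rank, and treats every pair missing from the gold standard as wrong, which is a simplification. The pairs, scores, and gold standard are hypothetical.

```python
def calibrate(scored_pairs, gold_standard, bin_size):
    """Map raw scores to the fraction of gold-standard pairs in each rank bin.

    scored_pairs: list of (raw_score, pair); higher raw_score is better.
    Returns a list of (lowest_raw_score_in_bin, fraction_correct) tuples,
    ordered from the best bin to the worst.
    """
    ranked = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    curve = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        hits = sum(1 for _, pair in chunk if pair in gold_standard)
        curve.append((chunk[-1][0], hits / len(chunk)))
    return curve

def to_probability(raw_score, curve):
    """Look up the calibrated value for a raw score (step function)."""
    for threshold, fraction in curve:
        if raw_score >= threshold:
            return fraction
    return curve[-1][1]  # below every bin: use the worst bin's value

# Hypothetical raw scores and gold standard.
scored = [
    (0.9, ("a", "b")), (0.8, ("a", "c")), (0.5, ("b", "c")),
    (0.4, ("c", "d")), (0.2, ("a", "d")), (0.1, ("b", "d")),
]
gold = {("a", "b"), ("a", "c"), ("c", "d")}
curve = calibrate(scored, gold, bin_size=2)
```

Once every evidence type is mapped onto the same scale in this way, scores from different methods become comparable.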
curated knowledge
many sources
KEGG Kyoto Encyclopedia of Genes and Genomes
Reactome
PID NCI-Nature Pathway Interaction Database
STKE Signal Transduction Knowledge Environment
MIPS Munich Information center for Protein Sequences
Gene Ontology
different gene identifiers
synonyms list
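A minimal sketch of resolving different gene identifiers through a synonyms list, so that the same interaction reported under different names collapses to one record. All names and identifiers below are hypothetical placeholders.

```python
# Hypothetical synonyms list: every known name maps to one canonical identifier.
synonyms = {
    "ABC1": "ORF0001",
    "XYZ9": "ORF0001",
    "ORF0001": "ORF0001",
    "DEF2": "ORF0002",
    "ORF0002": "ORF0002",
}

def normalize_pair(name_a, name_b, synonyms):
    """Return an order-independent pair of canonical IDs, or None if unmapped."""
    try:
        a = synonyms[name_a.upper()]
        b = synonyms[name_b.upper()]
    except KeyError:
        return None
    return tuple(sorted((a, b)))

# Two sources report the same interaction under different names.
records = [("abc1", "DEF2"), ("ORF0002", "XYZ9")]
unique_pairs = {normalize_pair(a, b, synonyms) for a, b in records}
```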
literature mining
MEDLINE
SGD Saccharomyces Genome Database
The Interactive Fly
OMIM Online Mendelian Inheritance in Man
co-mentioning
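A minimal sketch of co-mentioning: count how many abstracts mention both members of a gene pair. It assumes a fixed dictionary of gene names and naive whitespace tokenization; the second abstract is made up for illustration.

```python
from collections import Counter
from itertools import combinations

def comention_counts(abstracts, gene_names):
    """Count, for each gene pair, the abstracts that mention both names."""
    counts = Counter()
    for text in abstracts:
        tokens = set(text.replace(",", " ").replace(".", " ").split())
        found = sorted(tokens & gene_names)  # each gene counted once per abstract
        counts.update(combinations(found, 2))
    return counts

gene_names = {"CYC1", "CYC7", "HAP1", "GAL4"}
abstracts = [
    "The expression of the cytochrome genes CYC1 and CYC7 is controlled by HAP1.",
    "HAP1 binds upstream of CYC1.",  # hypothetical second abstract
]
counts = comention_counts(abstracts, gene_names)
```

The counts are raw scores: frequently mentioned genes co-occur by chance more often, so they need the same calibration against a gold standard as the other evidence types.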
NLP Natural Language Processing
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxgene The GAL4 gene]
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
calibrate vs. gold standard
 
primary experimental data
gene expression
GEO Gene Expression Omnibus
expression compendia
protein interactions
BIND Biomolecular Interaction Network Database
BioGRID General Repository for Interaction Datasets
DIP Database of Interacting Proteins
IntAct
MINT Molecular Interactions Database
HPRD Human Protein Reference Database
many sources
different gene identifiers
redundancy
not comparable
merge data by publication
raw quality scores
calibrate vs. gold standard
 
combine all evidence
spread over many species
transfer by orthology
naïve Bayesian scoring
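A minimal sketch of naïve Bayesian combination of evidence, assuming each evidence type has already been calibrated to a probability and that the types are independent. Any prior correction is omitted, and the function name and example values are hypothetical.

```python
from math import prod

def combine_scores(probabilities):
    """Combine independent evidence: 1 - product of (1 - p_i)."""
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical calibrated scores for one protein pair,
# e.g. from gene neighborhood and from co-mentioning.
combined = combine_scores([0.5, 0.5])
```

The combined score is never lower than the strongest single piece of evidence, and several weak but independent signals can add up to a confident association.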
 
prediction of interactions
NetworKIN
 
the idea
phosphoproteomics
mass spectrometry
 
phosphorylation sites
Phospho.ELM
in vivo
kinases are unknown
computational methods
NetPhosK
Scansite
sequence motifs
 
kinase families
overprediction
no context
what a kinase could do
not what it actually does
context
co-activators
scaffolders
protein networks
 
the algorithm
NetworKIN
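The slides give no detail on the algorithm beyond combining sequence motifs with network context. One possible reading, sketched under that assumption: the motif assigns a phosphorylation site to a kinase family, and the network then selects the family member best connected to the substrate. All names, scores, and the selection rule are hypothetical.

```python
def pick_kinase(substrate, family, family_members, network_score):
    """Within the motif-predicted family, pick the kinase best connected
    to the substrate; return None if no member has any network link."""
    candidates = [
        (network_score.get(frozenset((kinase, substrate)), 0.0), kinase)
        for kinase in family_members.get(family, [])
    ]
    if not candidates:
        return None
    best_score, best_kinase = max(candidates)
    return best_kinase if best_score > 0.0 else None

# Hypothetical kinase family and network association scores.
family_members = {"familyX": ["kinase1", "kinase2"]}
network_score = {
    frozenset(("kinase1", "substrateS")): 0.3,
    frozenset(("kinase2", "substrateS")): 0.8,
}
chosen = pick_kinase("substrateS", "familyX", family_members, network_score)
```

The motif alone says what a kinase family could do; the network term is what narrows this to the kinase that plausibly acts on the substrate.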
 
benchmarking
Phospho.ELM
 
2.5-fold better accuracy
context is crucial
global statistics
 
visualization
 
ATM signaling
 
experimental validation
summary
reanalysis
benchmarking
integration
complementary data types
computational methods
reproduce what is known
biological discoveries
testable hypotheses
Acknowledgments
The STRING database: Christian von Mering, Michael Kuhn, Berend Snel, Martijn Huynen, Sean Hooper, Samuel Chaffron, Julien Lagarde, Mathilde Foglierini, Peer Bork
Literature mining: Jasmin Saric, Rossitza Ouzounova, Isabel Rojas
The NetworKIN method: Rune Linding, Gerard Ostheimer, Francesca Diella, Karen Colwill, Jing Jin, Pavel Metalnikov, Vivian Nguyen, Adrian Pasculescu, Jin Gyoon Park, Leona D. Samson, Rob Russell, Peer Bork, Michael Yaffe, Tony Pawson
