Network biology
Large-scale data and text mining
Lars Juhl Jensen
guilt by association
protein networks
STRING
computational predictions
gene fusion
Korbel et al., Nature Biotechnology, 2004
gene neighborhood
Korbel et al., Nature Biotechnology, 2004
phylogenetic profiles
Korbel et al., Nature Biotechnology, 2004
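The idea behind phylogenetic profiles can be sketched in a few lines: genes that are present and absent in the same set of genomes are candidates for a functional association. The toy data, the function names, and the use of Jaccard similarity are assumptions made for illustration only; this is not the scoring scheme used by STRING.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of genomes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def profile_pairs(profiles):
    """Score every gene pair by the similarity of its presence/absence profile."""
    return {
        (g1, g2): jaccard(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Hypothetical toy data: gene -> genomes in which an ortholog is present
profiles = {
    "geneA": {"sp1", "sp2", "sp4"},
    "geneB": {"sp1", "sp2", "sp4"},
    "geneC": {"sp3"},
}

scores = profile_pairs(profiles)
```

Here `geneA` and `geneB` share an identical profile and get the highest score, whereas `geneC` never co-occurs with either.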
experimental data
gene coexpression
protein interactions
Jensen & Bork, Science, 2008
curated knowledge
complexes
pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
many databases
different formats
different identifiers
variable quality
not comparable
not same species
hard work
quality scores
von Mering et al., Nucleic Acids Research, 2005
calibrate vs. gold standard
von Mering et al., Nucleic Acids Research, 2005
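One way to picture calibration against a gold standard: sort the pairs by raw score, split them into bins, and use the fraction of gold-standard positives in each bin as the calibrated score. The sketch below assumes that every pair is covered by the gold standard and uses equal-sized bins; the data and the names `calibrate` and `confidence` are hypothetical, and the published method fits a smooth curve rather than a step function.

```python
def calibrate(raw_scores, gold_positive, n_bins=3):
    """Return (lowest raw score in bin, fraction of gold positives) per bin."""
    ranked = sorted(raw_scores, key=raw_scores.get)
    size = -(-len(ranked) // n_bins)  # ceiling division
    table = []
    for start in range(0, len(ranked), size):
        chunk = ranked[start:start + size]
        hits = sum(1 for pair in chunk if pair in gold_positive)
        table.append((raw_scores[chunk[0]], hits / len(chunk)))
    return table


def confidence(raw, table):
    """Calibrated score of the highest bin whose lower bound is <= raw."""
    result = table[0][1]
    for lower, precision in table:
        if raw >= lower:
            result = precision
    return result


# Hypothetical raw scores from one evidence channel
raw = {
    ("a", "b"): 0.1, ("a", "c"): 0.2, ("b", "c"): 0.3,
    ("a", "d"): 0.4, ("b", "d"): 0.5, ("c", "d"): 0.6,
}
# Hypothetical gold standard: pairs known to belong together
gold = {("b", "c"), ("b", "d"), ("c", "d")}

table = calibrate(raw, gold)
```

Because every evidence channel is mapped onto the same scale (the estimated fraction of correct pairs), scores from otherwise incomparable sources become comparable.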
homology-based transfer
Franceschini et al., Nucleic Acids Research, 2013
missing most of the data
text mining
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
CDC2
cyclin dependent kinase 1
expansion rules
hCdc2
CDC2
flexible matching
cyclin-dependent kinase 1
cyclin dependent kinase 1
“black list”
SDS
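The ingredients listed above (lexicon, expansion rules, flexible matching, black list) can be combined into a minimal dictionary-based tagger. This is only a sketch under stated assumptions: the mini-lexicon, the identifiers, the single "h" prefix rule, and the function names are all illustrative and are not taken from the actual tagger.

```python
import re

# Hypothetical mini-lexicon: name -> identifier (identifiers are illustrative)
LEXICON = {"CDC2": "CDK1", "cyclin dependent kinase 1": "CDK1", "SDS": "SDS"}
# Names that usually mean something else in running text (SDS the detergent)
BLACKLIST = {"sds"}


def normalize(name):
    """Flexible matching: ignore case and treat hyphens like spaces."""
    return re.sub(r"[-\s]+", " ", name.lower()).strip()


def build_index(lexicon, blacklist):
    """Map normalized names to identifiers, skipping black-listed names."""
    index = {}
    for name, ident in lexicon.items():
        key = normalize(name)
        if key in blacklist:
            continue
        index[key] = ident
        # Assumed expansion rule: an "h" (human) prefix on single-word names
        if " " not in key:
            index["h" + key] = ident
    return index


def tag(text, index, max_words=4):
    """Greedy longest-match lookup over alphanumeric tokens."""
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"[A-Za-z0-9]+", text)]
    matches, i = [], 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            key = " ".join(t[0] for t in tokens[i:i + n])
            if key in index:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                matches.append((text[start:end], index[key]))
                i += n
                break
        else:
            i += 1  # no name starts at this token
    return matches


index = build_index(LEXICON, BLACKLIST)
matches = tag("hCdc2 (cyclin-dependent kinase 1) was run on an SDS gel.", index)
```

In this example `hCdc2` and `cyclin-dependent kinase 1` both resolve to the same identifier, while `SDS` is ignored because of the black list.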
augmented browsing
Reflect
browser add-on
real-time text mining
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
O’Donoghue et al., Journal of Web Semantics, 2010
information extraction
co-mentioning
within documents
within paragraphs
within sentences
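Co-mentioning at the three levels can be sketched as a weighted count: a pair of names scores more when it shares a sentence than when it only shares a paragraph or a document. The weights, the naive substring matching, and the paragraph and sentence splitting below are assumptions for illustration, not the published scoring scheme.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed, illustrative weights for the three levels of co-mentioning
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}


def comention_scores(documents, names):
    """Sum weighted co-mentions of name pairs over all documents."""
    scores = Counter()

    def add(text, weight):
        # Naive substring lookup; a real system would use tagged entities
        found = sorted(n for n in names if n in text)
        for pair in combinations(found, 2):
            scores[pair] += weight

    for doc in documents:
        add(doc, WEIGHTS["document"])
        for paragraph in doc.split("\n\n"):
            add(paragraph, WEIGHTS["paragraph"])
            for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
                add(sentence, WEIGHTS["sentence"])
    return scores


docs = ["CDK1 binds CCNB1. WEE1 is a kinase.\n\nWEE1 was studied."]
scores = comention_scores(docs, {"CDK1", "CCNB1", "WEE1"})
```

With this toy document, CDK1 and CCNB1 share a sentence and score 7.0, while the pairs involving WEE1 only share the paragraph and document and score 3.0.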
text corpus
~22 million abstracts
no access
millions of full-text articles
localization and disease
general approach
COMPARTMENTS
TISSUES
DISEASES
curated knowledge
experimental data
text mining
computational predictions
common identifiers
quality scores
visualization
compartments.jensenlab.org
tissues.jensenlab.org
dissemination
web interfaces
web services
diseases.jensenlab.org
bulk download
Acknowledgments
STRING
Christian von Mering
Damian Szklarczyk
Michael Kuhn
Manuel Stark
Samuel Chaffron
Chris Creevey
Jean Muller
Tobias Doerks
Philippe Julien
Alexander Roth
Milan Simonovic
Jan Korbel
Berend Snel
Martijn Huynen
Peer Bork
Text mining
Sune Frankild
Evangelos Pafilis
Kalliopi Tsafou
Alberto Santos
Janos Binder
Heiko Horn
Michael Kuhn
Nigel Brown
Reinhard Schneider
Sean O’Donoghue
