Mining biomedical texts


Lars Juhl Jensen

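This presentation discusses mining biomedical texts to extract relevant information from a literature that is growing too fast to read. Named entity recognition, backed by a comprehensive lexicon, identifies concepts such as small molecules, proteins, cellular components, organisms, and diseases in text. The tools covered include Reflect.ws, an augmented-browsing service available as a browser add-on, and STITCH, which combines curated knowledge on drug targets and pathways with experimental data and text mining. Text-mining techniques such as co-mentioning and natural language processing are applied to abstracts and full texts, and the talk closes with mining Danish electronic patient journals for diagnoses, disease comorbidity, and adverse drug events.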

Mining biomedical texts Lars Juhl Jensen >10 km

exponential growth

some things are constant

~45 seconds per paper

information retrieval

find the relevant texts

still too much to read

computer

as smart as a dog

teach it specific tricks

named entity recognition

identify the concepts

comprehensive lexicon

small molecules

proteins

cellular components

organisms

diseases

orthographic variation

“black list”
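The preceding slides (comprehensive lexicon, orthographic variation, “black list”) outline dictionary-based named entity recognition. A minimal sketch of that idea follows; it is not the Reflect implementation. The lexicon entries, identifiers, and black-list contents are hypothetical placeholders, and the normalization rule (ignore case, hyphens, and spaces) is only one assumed way to absorb orthographic variation.

```python
import re

# Hypothetical toy lexicon: name -> (entity type, made-up identifier).
LEXICON = {
    "cdc2": ("protein", "PROT:0001"),
    "aspirin": ("small molecule", "CHEM:0001"),
    "nucleus": ("cellular component", "COMP:0001"),
    "sds": ("small molecule", "CHEM:0002"),
}

# Hypothetical black list: names considered too ambiguous to tag.
BLACK_LIST = {"sds"}


def normalize(name):
    """Collapse orthographic variation: case, hyphens, slashes, spaces."""
    return re.sub(r"[\s\-_/]+", "", name.lower())


BLOCKED = {normalize(name) for name in BLACK_LIST}
INDEX = {
    normalize(name): entry
    for name, entry in LEXICON.items()
    if normalize(name) not in BLOCKED
}


def tag(text, max_tokens=3):
    """Return (start, end, matched text, type, identifier) for each hit.

    Candidate spans of up to max_tokens tokens are tried, longest first,
    so that variants such as "CDC 2" are matched as one name.
    """
    tokens = list(re.finditer(r"[A-Za-z0-9]+", text))
    hits = []
    i = 0
    while i < len(tokens):
        step = 1
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i].start(), tokens[i + n - 1].end()
            entry = INDEX.get(normalize(text[start:end]))
            if entry is not None:
                hits.append((start, end, text[start:end], *entry))
                step = n
                break
        i += step
    return hits


hits = tag("Cdc-2 and CDC 2 localize to the nucleus; SDS was used.")
```

Both spellings of the protein name map to the same lexicon entry, while the black-listed name is skipped even though it is in the lexicon.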

Reflect.ws

augmented browsing

browser add-on

Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009

O’Donoghue et al., Journal of Web Semantics, 2010

Firefox

Internet Explorer

Google Chrome

Safari

Utopia Documents

web services

~150 years of publishing

dead wood

dead e-wood

added value

collaboration

SciVerse application

STITCH

Kuhn et al., Nucleic Acids Research, 2010

curated knowledge

drug targets

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

experimental data

physical interactions

Jensen & Bork, Science, 2008

text mining

co-mentioning
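Co-mentioning can be illustrated by counting how often two tagged entities appear in the same text. The sketch below assumes that each abstract has already been reduced to a set of entity names; the example names and counts are illustrative only, and any scoring or normalization applied on top of the raw counts is left out.

```python
from collections import Counter
from itertools import combinations


def comention_counts(docs):
    """Count, for each unordered entity pair, the documents mentioning both.

    docs is an iterable of collections of entity names, one per abstract.
    """
    pair_counts = Counter()
    for entities in docs:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(entities)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Illustrative input: entities found in three hypothetical abstracts.
docs = [
    {"aspirin", "COX1"},
    {"aspirin", "COX1", "COX2"},
    {"COX2"},
]
counts = comention_counts(docs)
```

Pairs that co-occur in many documents are candidate associations; a single co-mention is weak evidence on its own.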

NLP (Natural Language Processing)

abstracts

full text

restricted access

collaboration

electronic patient journals

a hard problem

in Danish

no lexicon

by busy doctors

acronyms

typos

about psychiatric patients

delusions

domain specific system

F20 F200 Negation Family
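One way to read this slide is that diagnosis terms are mapped to codes such as F20 and F200, and that mentions which are negated or which refer to a family member are kept apart from the patient's own diagnoses. The sketch below illustrates that reading with a simple trigger-word window. Everything in it is an assumption made for illustration: the English terms stand in for Danish ones, and the trigger lists and window size are hypothetical rather than taken from the system described in the talk.

```python
import re

# Hypothetical English stand-ins for Danish diagnosis terms.
DIAGNOSIS_TERMS = {
    "schizophrenia": "F20",
    "paranoid schizophrenia": "F200",
}
# Hypothetical trigger words; a real system would need a curated list.
NEGATION_TRIGGERS = {"no", "not", "without", "denies"}
FAMILY_TRIGGERS = {"mother", "father", "brother", "sister", "family"}

# Try longer terms first so "paranoid schizophrenia" wins over "schizophrenia".
TERMS_LONGEST_FIRST = sorted(DIAGNOSIS_TERMS, key=lambda t: -len(t.split()))


def classify_mentions(sentence, window=4):
    """Return (code, status) pairs; status is negated, family, or patient."""
    words = re.findall(r"[a-z]+", sentence.lower())
    results = []
    i = 0
    while i < len(words):
        for term in TERMS_LONGEST_FIRST:
            parts = term.split()
            if words[i:i + len(parts)] == parts:
                # Look at the few words immediately before the mention.
                context = set(words[max(0, i - window):i])
                if context & NEGATION_TRIGGERS:
                    status = "negated"
                elif context & FAMILY_TRIGGERS:
                    status = "family"
                else:
                    status = "patient"
                results.append((DIAGNOSIS_TERMS[term], status))
                i += len(parts)
                break
        else:
            i += 1
    return results


negated = classify_mentions("No signs of schizophrenia.")
family = classify_mentions("Mother has paranoid schizophrenia.")
patient = classify_mentions("Patient diagnosed with paranoid schizophrenia.")
```

Only mentions with status "patient" would count toward the patient's own diagnoses in this sketch.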

diagnoses

patient stratification

Roque et al., PLoS Computational Biology, 2011

disease comorbidity

Roque et al., PLoS Computational Biology, 2011

medication

adverse drug events

pharmacovigilance

phenotype

genotype

Thank you!

Reflect.ws: Sune Frankild, Heiko Horn, Evangelos Pafilis, Michael Kuhn, Reinhardt Schneider, Sean O’Donoghue

SciVerse app: Juan-Carlos Silla-Castro, Sean O’Donoghue

EPJ-mining: Francisco S Roque, Peter B Jensen, Robert Eriksson, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Thomas Werge, Søren Brunak

larsjuhljensen

