Named Entity Recognition and
Semantic Relation Extraction
from Cochrane Review Documents.
Motivation & Approach
• Make Cochrane content (the evidence) more accessible
  • Cross-referencing information
  • Finding relevant passages in documents of 200+ pages
• Support discovery and search in Cochrane review documents
• Build a foundation for apps to be used in "point of care" situations
• Extract an ontology based on information contained in the Cochrane Library
  • "Discover" Cochrane content relevant to a given patient
• Use semantic models (IBM's System T) to extract entities and relations
  • Diseases, diagnoses, treatments, interventions, medication, drugs, symptoms, complications
  • "… prolonged treatment with vitamin K antagonists reduces the risk of recurrent venous thromboembolism …"
L. Chiticariu, R. Krishnamurthy, Y. Li, F. Reiss, and S. Vaithyanathan, "Domain adaptation of rule-based annotators for named-entity recognition tasks," in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010, pp. 1002–1012.
A. Nagesh, G. Ramakrishnan, L. Chiticariu, R. Krishnamurthy, A. Dharkar, and P. Bhattacharyya, "Towards efficient named-entity rule induction for customizability," in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012, pp. 128–138.
Semantic models using System T / AQL
Dictionary Learning
• Extract candidate features
• Apply filters
• Post-processing
Named Entity Recognition
• Extract basic features
• Combine features
• Annotate and derive canonical form
Relation Identification
• Combine modifiers with entities
• Combine Relation Hints with extended spans
• Normalize
Named Entity Recognition / Build Dictionaries
Extract Candidate Features – Context Hints
:treatment <is> <more> effective than :treatment
/(administer|apply)ing/ :treatment
/tak(e|ing)/ :drug
:drug <consumption>
/doses used in/ :disease
/(health )?consequences? of/ :disease
/(medication|therapy|treatment) for/ :disease
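The context hints above can be approximated with plain regular expressions. The following is a minimal Python sketch, not the AQL used in the project: the stop-word list, the three-word limit and the names `following_term` and `extract_candidates` are assumptions made for illustration. The term boundary is deliberately crude, so over-long candidates are expected and are left to the postprocessing step.

```python
import re
from collections import Counter

# Hypothetical stop-word list used to cut a candidate term short.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "is", "are",
              "was", "were", "with", "for", "to", "by"}

# Words, or any single non-letter character such as punctuation.
TOKEN = re.compile(r"[a-z][a-z'-]*|[^a-z\s]")

# Some of the slide's context hints; the typed term is expected to follow.
CONTEXT_HINTS = [
    ("treatment", re.compile(r"\b(?:administer|apply)ing\s+")),
    ("drug", re.compile(r"\btak(?:e|ing)\s+")),
    ("disease", re.compile(r"\bdoses used in\s+")),
    ("disease", re.compile(r"\b(?:health )?consequences? of\s+")),
    ("disease", re.compile(r"\b(?:medication|therapy|treatment) for\s+")),
]


def following_term(text, start, max_words=3):
    """Collect up to max_words words after position start (assumed heuristic)."""
    words = []
    for token in TOKEN.findall(text[start:]):
        is_word = "a" <= token[0] <= "z"
        if not is_word or token in STOP_WORDS or len(words) == max_words:
            break
        words.append(token)
    return " ".join(words)


def extract_candidates(text):
    """Count (entity type, candidate term) pairs found next to a context hint."""
    lowered = text.lower()
    counts = Counter()
    for entity_type, hint in CONTEXT_HINTS:
        for match in hint.finditer(lowered):
            term = following_term(lowered, match.end())
            if term:
                counts[(entity_type, term)] += 1
    return counts
```

The counts returned here play the role of the "original matches" that the postprocessing statistics rely on.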
Dictionary Learning - Postprocessing
• Shorten by intrinsic hierarchy
  • "long-term compression therapy" → "compression therapy"
  • "probable ischaemic stroke" → "ischaemic stroke"
    (use match statistics as heuristic measures)
• Filter using statistics
  • Remove "exploded terms" (few original matches, many dictionary matches)
  • Remove "weak terms" (few matches, but many matches for parent)
• Type deduction
  • Type candidates from source extractor modules
  • "Same list" (entities mentioned within a list)
  • Comparison patterns: "compare with", "in combination with"
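The shortening and filtering steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the `TermStats` record, the thresholds `parent_ratio` and `explode_ratio`, and the rule that a parent is obtained by dropping the leading word are all hypothetical. Weak-term removal is folded into the shortening step here, since a weak term is replaced by its much more frequent parent.

```python
from dataclasses import dataclass


@dataclass
class TermStats:
    original_matches: int    # hits found through context hints
    dictionary_matches: int  # hits when the term is used as a dictionary entry


def parent_of(term):
    """Drop the leading word: 'probable ischaemic stroke' -> 'ischaemic stroke'."""
    _, _, rest = term.partition(" ")
    return rest or None


def shorten(term, stats, parent_ratio=5):
    """Walk up the hierarchy while the parent is much more frequent (assumed rule)."""
    current = term
    while True:
        parent = parent_of(current)
        if parent is None or parent not in stats:
            return current
        needed = parent_ratio * stats[current].original_matches
        if stats[parent].original_matches < needed:
            return current
        current = parent


def is_exploded(term, stats, explode_ratio=50):
    """Few original matches but very many dictionary matches."""
    s = stats[term]
    return s.dictionary_matches > explode_ratio * max(s.original_matches, 1)


def build_dictionary(stats):
    """Shorten every candidate, then drop exploded terms."""
    shortened = {shorten(term, stats) for term in stats}
    return {term for term in shortened if not is_exploded(term, stats)}
```

With made-up counts in which "compression therapy" is far more frequent than "long-term compression therapy", the longer term collapses onto the shorter one, while a generic term such as "therapy" is dropped as exploded.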
Semantic Relation Extraction
Building Blocks for Relation Extraction
Named Entities
("calcium channel blockers", "Parkinson's disease")
Relation Hints
(A was caused by B; A has positive effects on B …)
→ tag with relation type: CAUSE, PREVENT, INCREASE, …
Modifier Hints
(use of A; risk of A; developing A …)
→ tag with modifier type: RISK, USE, INFECTION, GROWTH, …
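A small Python sketch of how hint phrases might be tagged with their types. The phrase-to-tag tables are assumptions assembled from the examples in this deck ("was associated with" as CAUSE, "a reduction in the risk of developing" as REDUCE.RISK.GROWTH); treating REDUCE as a modifier tag, and the `Hint` record and `tag_hints` function, are hypothetical choices for this illustration.

```python
import re
from typing import NamedTuple


class Hint(NamedTuple):
    start: int
    end: int
    kind: str  # "RELATION" or "MODIFIER"
    tag: str


# Assumed phrase tables; a real system would hold many more entries.
RELATION_HINTS = {
    r"\bwas caused by\b": "CAUSE",
    r"\bwas associated with\b": "CAUSE",
    r"\b(?:protect|help)s against\b": "PREVENT",
}
MODIFIER_HINTS = {
    r"\buse of\b": "USE",
    r"\b(?:a )?reduction in\b": "REDUCE",
    r"\b(?:the )?risk of\b": "RISK",
    r"\bdeveloping\b": "GROWTH",
}


def tag_hints(text):
    """Return all relation and modifier hints in text, ordered by position."""
    hints = []
    tables = (("RELATION", RELATION_HINTS), ("MODIFIER", MODIFIER_HINTS))
    for kind, table in tables:
        for pattern, tag in table.items():
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                hints.append(Hint(match.start(), match.end(), kind, tag))
    return sorted(hints)
```

The tagged spans, together with the named entities, are the inputs that the later combination steps operate on.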
Illustrated cases
• List and bracket processing
  "Other drugs include carbamazepine and newer antiepileptics (lamotrigine, topiramate and zonisamide) and the atypical antipsychotics (clozapine, aripiprazole and ziprasidone)"
• Special cases (for "is a")
  "atypical antipsychotics (clozapine, aripiprazole and ziprasidone)"
  "infections, such as malaria and hookworm"
  "selenium, vitamin C and other antioxidants"
• Simple direct relations
  @PREVENT :entity /(protect|help)s against/ :entity
  @PREVENT :entity <can> <adverb>? prevent :entity
  @CAUSE :entity <is> <adverb>? followed by :entity
  @CAUSE :entity <can> <adverb>? result in :entity
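The three "is a" special cases (brackets, "such as", "and other") can be illustrated with a short Python sketch. It covers only these cases, not the direct relation patterns, and it is a simplification: the regular expressions and the function names are assumptions, and the parent term is guessed from at most two words of surface text, whereas the actual system works on annotated entity spans.

```python
import re

# Parent followed by a bracketed list: "X (a, b and c)".
BRACKET = re.compile(r"\b(?P<parent>\w+(?: \w+)?) \((?P<children>[^)]+)\)")
# "X, such as a and b".
SUCH_AS = re.compile(
    r"\b(?P<parent>\w+),? such as (?P<children>\w+(?:(?:, | and )\w+)*)")
# "a, b and other X".
AND_OTHER = re.compile(
    r"(?P<children>[\w ]+(?:, [\w ]+)*) and other (?P<parent>\w+)")

LIST_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def split_list(items):
    """Split 'a, b and c' into ['a', 'b', 'c']."""
    return [item for item in LIST_SEPARATOR.split(items.strip()) if item]


def extract_is_a(text):
    """Return (child, parent) pairs for the three 'is a' special cases."""
    pairs = []
    for pattern in (BRACKET, SUCH_AS, AND_OTHER):
        for match in pattern.finditer(text):
            parent = match.group("parent")
            for child in split_list(match.group("children")):
                pairs.append((child, parent))
    return pairs
```

Applied to the slide's example phrases, each listed item is paired with the term introducing or closing the list, for instance `("malaria", "infections")`.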
Relation Postprocessing
• Combine consecutive modifiers and Named Entities:
  "a reduction in the risk of developing A"
  → REDUCE.RISK.GROWTH
• Combine Relation Hints and extended entity spans:
  "use of calcium channel blockers was associated with a reduction in the risk of developing Parkinson's disease"
  → USE A CAUSE REDUCE.RISK.GROWTH B
• Simplify ("translate") to create the final Semantic Relation
  → A REDUCE RISK B
  → calcium channel blockers REDUCE RISK Parkinson's disease
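The three postprocessing steps can be traced in a short Python sketch. It assumes the sentence has already been reduced to a sequence of tagged items (modifier hints, entities, relation hints); the tuple layout, the function names and the one-entry `SIMPLIFY` table are hypothetical and only reproduce the example on this slide.

```python
def extend_entities(tokens):
    """Attach each run of consecutive modifiers to the entity that follows it."""
    spans, modifiers = [], []
    for kind, value in tokens:
        if kind == "MODIFIER":
            modifiers.append(value)
        elif kind == "ENTITY":
            spans.append(("ENTITY", ".".join(modifiers), value))
            modifiers = []
        else:  # relation hint: keep it, discard any dangling modifiers
            spans.append((kind, value, None))
            modifiers = []
    return spans


# Assumed translation table: (left modifiers, relation, right modifiers) -> relation.
SIMPLIFY = {("USE", "CAUSE", "REDUCE.RISK.GROWTH"): "REDUCE RISK"}


def extract_relation(tokens):
    """Return (entity, relation, entity) for the first translatable pattern."""
    spans = extend_entities(tokens)
    for left, hint, right in zip(spans, spans[1:], spans[2:]):
        if (left[0], hint[0], right[0]) != ("ENTITY", "RELATION", "ENTITY"):
            continue
        key = (left[1], hint[1], right[1])
        if key in SIMPLIFY:
            return (left[2], SIMPLIFY[key], right[2])
    return None


# The slide's example sentence as a pre-tagged sequence.
example = [
    ("MODIFIER", "USE"), ("ENTITY", "calcium channel blockers"),
    ("RELATION", "CAUSE"),
    ("MODIFIER", "REDUCE"), ("MODIFIER", "RISK"), ("MODIFIER", "GROWTH"),
    ("ENTITY", "Parkinson's disease"),
]
```

Calling `extract_relation(example)` walks through the same intermediate form as the slide, USE A CAUSE REDUCE.RISK.GROWTH B, before the table lookup produces the final relation.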
Cochrane Ontology Viewer
Text passage supporting a relation
Document context of the text passage
Cochrane Ontology Viewer – other relations in Chen …
Effectiveness of drugs …
Evidence for drugs reducing risk of thrombosis
What else do we know about "ethinylestradiol"?
Relations found in several documents
Observations and insights gained …
• Rule-based system adequate for medical reports
  • Statistical approaches require larger corpora
  • Grammatical parsers alone are not sufficiently specific
  • Domain-specific language aids semantic modelling
• Problems encountered, responses (POS)
  • Adjective contamination
    • Some antiepileptic drugs are marketed specifically for migraine prophylaxis.
  • Delimiting entities and relations
    • Drug therapy for migraine falls into two categories.
    • Patients were likely to reduce the number of their migraine headaches by 50%.
• Efforts commensurate with the text corpus
  • Continuous improvement process inherent in our approach
  • Building on top of the existing dictionaries and patterns
What's next?
• Improve AQL extraction results
• Improve entity normalization and types (e.g. make better use of entity components: "endothelin receptor antagonist")
• Identify most relevant relations
• Extraction of structured context
• Use ontology for point-of-care situations
• Introduce deep learning technology
• User interface (mobile systems of engagement) for "point of care" situations
• Combine with patient data to guide the discovery process in Cochrane reviews

Using IBM Watson to construct semantic models to extract relevant point-of-care information from Cochrane Reviews
