Extraction of Drug-Drug Interactions from Biomedical Texts

Isabel Segura-Bedmar, Paloma Martínez, María Herrero-Zazo
Universidad Carlos III de Madrid, SPAIN
SemEval-2013 Task 9: Extraction of Drug-Drug Interactions from Biomedical Texts

Outline
- Motivation
- Previous Work: DDIExtraction 2011
- New in DDIExtraction 2013
- The DDI corpus
- Tasks
- Task 9.1: Drug Name Recognition and Classification
- Task 9.2: Drug-Drug Interaction Extraction
- Conclusions

What is a Drug-Drug Interaction (DDI)?
Motivation
- A DDI occurs when one drug influences the level or the activity of another drug.
- A DDI can be beneficial, but in most cases DDIs are dangerous for patients and can increase healthcare costs.
- Medical literature is the most effective source for the detection of DDIs.

Information Extraction
We thank the team at the Humboldt-Universitaet zu Berlin for making available a visualization of the DDI corpus using Stav:
http://corpora.informatik.hu-berlin.de/, https://github.com/TsujiiLaboratory/stav

Previous Work: DDIExtraction 2011
- Automatic extraction of drug-drug interactions from texts.
- Dataset: a collection of 579 documents from DrugBank.
  - DDIs annotated by a pharmacist.
  - Drugs automatically annotated.
- F1 ranged between 0.16 and 0.66.

New in SemEval Task 9
- Task 9.1: Drug Name Recognition and Classification.
- Task 9.2: DDI Detection and Classification.
- The DDI corpus:
  - About twice the size of the 2011 dataset: 1,025 annotated documents, 18,502 pharmacological substances and 5,028 DDIs.
  - Drugs and DDIs were manually annotated by two pharmacists.
  - Annotation guidelines and inter-annotator agreement are available.
  - Two different text sources: MedLine and DrugBank.

Tasks
- Task 9.1: Drug Name Recognition and Classification.
- Task 9.2: Drug-Drug Interaction Extraction

Task 9.1 - Drug Classification
- drug type for generic drugs (e.g. Heparin, ibuprofen, methotrexate).
- brand type for trade drugs (e.g. Espidifen, aspirin).
- group type for groups of drugs (e.g. Analgesics, anticoagulants).
- drug_n type for active substances not approved for human use (e.g. Picrotoxin, heroin).
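As a minimal sketch, the four entity types above could be represented as follows. The names `EntityType`, `DrugMention` and `find_mention`, the half-open character-offset convention, and the example sentence are all illustrative assumptions; they are not taken from the corpus or its guidelines.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """The four Task 9.1 entity types named on the slide."""
    DRUG = "drug"      # generic drug names
    BRAND = "brand"    # trade names
    GROUP = "group"    # groups of drugs
    DRUG_N = "drug_n"  # active substances not approved for human use


@dataclass(frozen=True)
class DrugMention:
    """A typed mention; offsets follow an assumed [start, end) convention."""
    text: str
    start: int
    end: int
    type: EntityType


def find_mention(sentence: str, surface: str, entity_type: EntityType) -> DrugMention:
    """Locate the first occurrence of `surface` and wrap it as a typed mention."""
    start = sentence.index(surface)  # raises ValueError if the surface form is absent
    return DrugMention(surface, start, start + len(surface), entity_type)


# Made-up sentence, used only to show the data structure.
sentence = "Patients received ibuprofen and anticoagulants."
mentions = [
    find_mention(sentence, "ibuprofen", EntityType.DRUG),
    find_mention(sentence, "anticoagulants", EntityType.GROUP),
]
```

Keeping the type as an enum rather than a free string makes a misspelled label fail early instead of silently creating a fifth class.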

Task 9.1 - Teams
Team       Affiliation                                           Approach
LASIGE     Lisbon University                                     Conditional Random Fields
UEM_UC3M   European University, Carlos III University of Madrid  Ontology-based approach
UMCC_DLSI  Matanzas University, Alicante University              J48 classifier
Uturku     Turku University                                      SVM classifier (TEES system)
WBI        Humboldt University of Berlin                         Conditional Random Fields

Task 9.1 Evaluation
- Recognition (regardless of type):
  - Exact-boundary matching (EXACT).
  - Partial-boundary matching (PARTIAL).
- Recognition and classification:
  - Exact-boundary + type matching (STRICT).
  - Partial-boundary + type matching (TYPE).
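The four criteria above can be sketched as predicates over a predicted and a gold mention. This is an illustration under stated assumptions: offsets are half-open `[start, end)`, and "partial" is taken to mean any character overlap. The official scorer may define or weight partial matches differently, and the names `Span` and `match_levels` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    type: str   # e.g. "drug", "brand", "group", "drug_n"


def exact(pred: Span, gold: Span) -> bool:
    """EXACT: identical boundaries, type ignored."""
    return (pred.start, pred.end) == (gold.start, gold.end)


def partial(pred: Span, gold: Span) -> bool:
    """PARTIAL: boundaries overlap by at least one character (assumed definition)."""
    return pred.start < gold.end and gold.start < pred.end


def strict(pred: Span, gold: Span) -> bool:
    """STRICT: identical boundaries and identical type."""
    return exact(pred, gold) and pred.type == gold.type


def type_match(pred: Span, gold: Span) -> bool:
    """TYPE: overlapping boundaries and identical type."""
    return partial(pred, gold) and pred.type == gold.type


def match_levels(pred: Span, gold: Span) -> dict:
    """Report which of the four criteria a prediction satisfies."""
    return {
        "EXACT": exact(pred, gold),
        "PARTIAL": partial(pred, gold),
        "STRICT": strict(pred, gold),
        "TYPE": type_match(pred, gold),
    }


# Gold covers "oral anticoagulants" (offsets 0-19); the system found only "anticoagulants".
gold = Span(0, 19, "group")
pred = Span(5, 19, "group")
levels = match_levels(pred, gold)
```

Here the prediction satisfies PARTIAL and TYPE but neither EXACT nor STRICT, which is the situation the two boundary settings are meant to distinguish.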

Task 9.1- Overview of the results
- Groups and non-approved substances are more difficult than drugs and brands:
  - Brand names: short and unique.
  - Generic names: no ambiguity, because they are simplified chemical names.
  - Group names can be ambiguous (e.g. anticoagulant, anti-retroviral).
  - Group names: many variants and abbreviations.

- The drug_n type was the most difficult type:
  - very scarce in DrugBank (less than 1%).
  - less clearly defined in the guidelines.
  - Systems are able to identify these mentions but fail to classify them.

Tasks
- Task 9.1: Drug Name Recognition and Classification.
- Task 9.2: Drug-Drug Interaction Extraction

Task 9.2: Drug-Drug Interaction (DDI) Extraction
- Gold annotations for drugs are provided to teams for both the training and test datasets.
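Because the drug mentions are given, one common way to frame the extraction step is to treat every unordered pair of drug mentions in a sentence as a candidate interaction and classify each pair. The slides do not state that participants used this framing, so the sketch below is an assumption; `candidate_pairs` and the placeholder mention names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def candidate_pairs(mentions: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair of gold drug mentions in one sentence.

    A sentence with n mentions yields n * (n - 1) / 2 candidates, each of
    which a downstream classifier would label as interacting or not.
    """
    return list(combinations(mentions, 2))


# Placeholder mention names, standing in for the gold annotations of one sentence.
gold_mentions = ["drug_A", "drug_B", "drug_C"]
pairs = candidate_pairs(gold_mentions)
```

With three mentions this produces three candidates: (drug_A, drug_B), (drug_A, drug_C) and (drug_B, drug_C).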

- Detect DDIs and classify them.

[Figure: annotated example sentence with interactions labelled EFFECT, EFFECT and MECHANISM.]

DDI Classification
- mechanism type for interactions describing the way the interaction occurs.
  Example: Lansoprazole may decrease the absorption of enoxacin.
- effect type for interactions describing the consequence of the interaction.
  Example: Additive CNS depression may occur when antihistamines are administered with barbiturates.
- advice type for interactions describing a recommendation or advice.
  Example: Patients taking isoniazid and disulfiram concomitantly should be closely monitored.
- int type for mentions of interactions without any additional information.
  Example: Clopidogrel interacts with omeprazole.
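The four interaction types and their example sentences can be written down as labelled instances, as in the sketch below. The names `DDIType`, `LabelledPair` and `is_interaction`, and the use of `None` for a pair with no interaction, are assumptions made for illustration; the sentences and labels are the ones on the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DDIType(Enum):
    MECHANISM = "mechanism"  # how the interaction occurs
    EFFECT = "effect"        # consequence of the interaction
    ADVICE = "advice"        # recommendation or advice
    INT = "int"              # interaction stated without further information


@dataclass(frozen=True)
class LabelledPair:
    sentence: str
    drug_1: str
    drug_2: str
    label: Optional[DDIType]  # None is used here for "no interaction"


EXAMPLES = [
    LabelledPair("Lansoprazole may decrease the absorption of enoxacin.",
                 "Lansoprazole", "enoxacin", DDIType.MECHANISM),
    LabelledPair("Additive CNS depression may occur when antihistamines are "
                 "administered with barbiturates.",
                 "antihistamines", "barbiturates", DDIType.EFFECT),
    LabelledPair("Patients taking isoniazid and disulfiram concomitantly "
                 "should be closely monitored.",
                 "isoniazid", "disulfiram", DDIType.ADVICE),
    LabelledPair("Clopidogrel interacts with omeprazole.",
                 "Clopidogrel", "omeprazole", DDIType.INT),
]


def is_interaction(label: Optional[DDIType]) -> bool:
    """Collapse the typed label to the binary detection decision."""
    return label is not None
```

Collapsing any of the four types to a single positive class gives the detection setting, while keeping the type gives the detection plus classification setting.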

Task 9.2 Teams
Team           Affiliation                                      Approach
FBK-irst       FBK-irst, Italy                                  Hybrid kernel + scope of negations and semantic roles
NIL_UCM        Complutense University of Madrid, Spain          SVM classifier
SCAI           Fraunhofer SCAI, Germany                         SVM classifier
UC3M           Carlos III University of Madrid, Spain           Shallow Linguistic Kernel
UCOLORADO_SOM  University of Colorado, School of Medicine, USA  SVM classifier
Uturku         Turku University, Finland                        SVM classifier (TEES system)
UWM_TRIADS     University of Wisconsin, USA                     Two-stage SVM
WBI_DDI        Humboldt University of Berlin                    Ensemble of kernels

Task 9.2- Results
[Figure: bar charts of F1 scores on a 0 to 1 axis. DrugBank: 0.827 and 0.676. MedLine: 0.53 and 0.42.]

- Detection: significant improvement over 2011: 66% F1 (2011) vs. 82% F1 (2013).
- In DrugBank:
  - The int type is the most difficult (54% F1).
  - The mechanism, effect and advice types show similar F1 (70%).
- In MedLine, results for the effect and mechanism types are considerably lower due to the complexity of the sentences describing these DDIs.
- Non-linear kernel-based methods outperform linear SVMs.
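The gap between detection and classification scores follows from how a true positive is counted. The sketch below computes both from the same predictions; it assumes micro-averaging over the interaction types and uses `None` for "no interaction". The slides do not state the averaging scheme, and the toy labels are invented for illustration.

```python
from typing import Optional, Sequence

Label = Optional[str]  # "mechanism", "effect", "advice", "int", or None


def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_f1(gold: Sequence[Label], pred: Sequence[Label]) -> float:
    """A predicted interaction is correct if the pair interacts, whatever its type."""
    tp = sum(g is not None and p is not None for g, p in zip(gold, pred))
    fp = sum(g is None and p is not None for g, p in zip(gold, pred))
    fn = sum(g is not None and p is None for g, p in zip(gold, pred))
    return f1(tp, fp, fn)


def classification_f1(gold: Sequence[Label], pred: Sequence[Label]) -> float:
    """A predicted interaction is correct only if its type also matches (micro-average)."""
    tp = sum(g is not None and g == p for g, p in zip(gold, pred))
    fp = sum(p is not None and p != g for g, p in zip(gold, pred))
    fn = sum(g is not None and p != g for g, p in zip(gold, pred))
    return f1(tp, fp, fn)


# Invented toy labels for six candidate pairs.
gold = ["effect", "mechanism", None, "advice", "int", None]
pred = ["effect", "effect", None, "advice", None, "int"]

det = detection_f1(gold, pred)
cla = classification_f1(gold, pred)
```

On this toy data a pair predicted with the wrong type still counts for detection but becomes both a false positive and a false negative for classification, so the classification score can only be equal to or lower than the detection score.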

Conclusion
- 13 teams from 7 different countries.
- In both tasks, the results on DrugBank are considerably better than the ones on MedLine.
- Best F1:

          Task 9.1 Drug NERC                         Task 9.2 Extraction of DDIs
          Recognition  Recognition + Classification  Detection  Detection + Classification
DrugBank  90%          87%                           82%        53%
MedLine   80%          58%                           67%        42%

Conclusion
- Task 9.1:
  - Best system (WBI): conditional random fields, with the training dataset extended with the test dataset for Task 9.2.
  - Most difficult: groups and drug_n.
- Task 9.2:
  - There is still much room for improvement.

Future of the task
- Include new types of texts:
  - prescription drug documents,
  - health records,
  - texts from social media about DDIs and adverse drug events.
- No plans for annotating new documents.
- Goal of the next DDIExtraction:
  - Create a silver standard DDI corpus.
  - Annotate effect, mechanism, drug dosages, etc.
  - Similar to the CALBC challenge.

Acknowledgments
- This work was supported by the Regional Government of Madrid under the Research Network MA2VICMR [S2009/TIC-1542] and by the Spanish Ministry of Education under the project MULTIMEDICA [TIN2010-20644-C03-01].
- We thank all participants for their efforts and congratulate them on their interesting work.
- We thank the Uturku team, who provided TEES analyses for the training and test datasets.
- We thank the WBI team, who made available a visualization of the DDI corpus using Stav.
