© 2016. 8 Path Solutions LLC.© 2016. 8 Path Solutions LLC.
Data, Decisions, and MongoDB
Tuesday, November 29, 2016
© 2015 8 Path Solutions LLC. All Rights Reserved.
	
Validating an Open Society
Jennifer Shin
Founder, 8 Path Solutions
Senior Principal Data Scientist, Nielsen
Introduction
●  Background
○  Native New Yorker
○  Undergraduate degree in Economics, Mathematics & Creative Writing from Columbia University
○  Graduate degree in Statistics from Columbia University
Introduction
●  Professional Experience
○  Founder & Chief Data Scientist at 8 Path Solutions
○  Senior Principal Data Scientist at Nielsen
○  Management consultant at Fortune 100 companies
○  Top Contributor for IBM Data Magazine
○  Faculty in the MIDS Graduate Program at UC Berkeley
Introduction
●  Recent Talks & Presentations
○  Institute of Computational & Experimental Research in Mathematics (ICERM) at Brown University
○  TDWI Accelerate 2016
○  Data Dialogs Conference – UC Berkeley
○  IBM World of Watson 2016
Today’s Talk
●  Adverse Drug Reactions (ADR)
●  FDA Adverse Event Reporting System (FAERS)
●  openFDA API
●  openFDA + MongoDB
Pharmacovigilance
●  Monitoring the safety of medicinal products
●  Adverse Drug Reactions (ADR): unwanted, uncomfortable, or dangerous effects that a drug may have
●  In the US, 3 to 7% of all hospitalizations are due to ADRs [1]
DATA SOURCE
●  FDA Adverse Event Reporting System (FAERS)
○  A computerized information database designed to support the FDA's post-marketing safety surveillance program for all approved drug & therapeutic biologic products
○  Used to monitor for new adverse events and medication errors that might occur with these marketed products
DATA SOURCE
Option 1: Quarterly FAERS Data Files
○  Available each quarter from the FDA
○  Data from 2004 to 2012 available in ASCII/SGML
○  Data after 2012 available in ASCII/XML
DATA CHALLENGES
○  Requires downloading the quarterly reports and consolidating them in a database.
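As a rough illustration of the consolidation step, the sketch below parses quarterly ASCII extracts and tags each row with its quarter. It assumes the files are "$"-delimited with a header row, and it uses tiny hypothetical two-column samples (ISR, DRUGNAME) rather than real FAERS files. The talk itself consolidated the data in SQL Server, so this is only an analogous sketch, not the method used.

```python
import csv
import io


def read_faers_ascii(text):
    """Parse one '$'-delimited quarterly ASCII extract into a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="$", quoting=csv.QUOTE_NONE
    )
    return [dict(row) for row in reader]


def consolidate(quarters):
    """Combine rows from several quarters, tagging each row with its quarter."""
    combined = []
    for quarter, text in quarters.items():
        for row in read_faers_ascii(text):
            row["quarter"] = quarter
            combined.append(row)
    return combined


# Hypothetical extracts standing in for two quarterly DRUG files.
SAMPLE_QUARTERS = {
    "2011Q1": "ISR$DRUGNAME\n1001$YAZ\n1002$YAZ (24)\n",
    "2011Q2": "ISR$DRUGNAME\n1003$YAZ BAYER HEALTHCARE\n",
}

rows = consolidate(SAMPLE_QUARTERS)
for row in rows:
    print(row)
```

In practice each quarter ships several related files, so a real pipeline would repeat this per file type and join on the report identifier.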
DATA ISSUES
○  Duplicate Reports
○  Spelling errors
○  Inaccurate information
○  One field for all drug names (e.g. Brand Name & Generic) and
active ingredients
○  Multiple drugs included in a single report
openFDA API
Food & Drug Administration openFDA API
●  Beta launch: June 2014
●  New website
API OBJECTIVES
●  Facilitate access to and use of large, important FDA public datasets by developers, researchers, and the public through harmonization of data across disparate FDA datasets, provided via application programming interfaces (APIs)
API DATA
●  Drug adverse events: Reports of drug side effects, product use errors, product quality problems, and therapeutic failures.
●  Drug product labeling: Structured product information, including prescribing information, for approved drug products.
●  Drug recall enforcement reports: Drug product recall enforcement reports.
API DATA
○  Access to FAERS database using API calls
API DATA
Three ways to download data from openFDA:
●  Download manually. There is a downloads section on each endpoint's openFDA page.
●  Write code to download the data automatically. Use a special API query to get a list of all the current data files for each endpoint.
●  Synchronize with the openFDA S3 bucket.
DATA ISSUE SOLUTION
Drug Name Harmonization Process
○  Benefits
•  Harmonizes the FAERS data on drug identifiers using other
data sources, such as NDC & RxNorm
•  Separate data fields for brand names, generic names, and
active ingredients
DATA ISSUE SOLUTION
Drug Name Harmonization Process
○  Limitations
•  Cannot harmonize misspelled drug names
•  Validation process requires using FAERS data files
- Not necessarily easier
CASE STUDY: DROSPIRENONE
FAERS vs. OPENFDA
FAERS Data Files
Data From 2004 Q1 to 2012 Q3
2 out of the 7 Reports:
DEMO, DRUG, REAC, RPSR, THER, OUTC, INDI
Consolidated using SQL Server 2012
FAERS vs. OPENFDA
Drug Name Mapping
DRUGNAME
YAZ
DROSPIRENONE W/ETHINYLESTRADIOL (YAZ)
YAZ /06358701/
YAZ BAYER HEALTHCARE
YAZ N/A BAYER HEALTHCARE
YAZ (24)
YAZ (DROSPIRENONE + ETHINYLESTRADIOL 20!G (24+4) [YAZ]
YAZ (DROSPIRENONE/ESTRADIOL)
YAZ (ORAL CONTRACEPTATIVE NOS)
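A minimal sketch of how free-text DRUGNAME values like these might be mapped to a brand: a whole-word, case-insensitive match on the brand name. The function name and the matching rule are illustrative assumptions, not the mapping logic used in the talk or by openFDA. The last two checks show the limitation that a misspelled or purely generic entry is not caught.

```python
import re

# DRUGNAME values as listed on the slide (reproduced verbatim, typos included).
DRUGNAME_VARIANTS = [
    "YAZ",
    "DROSPIRENONE W/ETHINYLESTRADIOL (YAZ)",
    "YAZ /06358701/",
    "YAZ BAYER HEALTHCARE",
    "YAZ N/A BAYER HEALTHCARE",
    "YAZ (24)",
    "YAZ (DROSPIRENONE + ETHINYLESTRADIOL 20!G (24+4) [YAZ]",
    "YAZ (DROSPIRENONE/ESTRADIOL)",
    "YAZ (ORAL CONTRACEPTATIVE NOS)",
]


def mentions_brand(drugname, brand="YAZ"):
    """True when the brand appears as a whole word in a DRUGNAME value."""
    pattern = r"\b" + re.escape(brand.upper()) + r"\b"
    return re.search(pattern, drugname.upper()) is not None


matched = [name for name in DRUGNAME_VARIANTS if mentions_brand(name)]
print(len(matched), "of", len(DRUGNAME_VARIANTS), "variants mention YAZ")

# Hypothetical inputs showing what a simple match misses.
print(mentions_brand("YAZZ"))                                # misspelling
print(mentions_brand("DROSPIRENONE AND ETHINYL ESTRADIOL"))  # generic only
```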
  
FAERS vs. OPENFDA
Data Fields
Field                                            FAERS Data Files   openFDA API
Brand Name (Yaz)                                 DRUGNAME           patient.drug.openfda.brand_name
Generic Name (Drospirenone Ethinyl Estradiol)    DRUGNAME           patient.drug.openfda.generic_name
Case Report Identifier                           ISR                safetyreportid
  
FAERS vs. OPENFDA
○  Inconsistent Query Data
•  Running the same query on 8/03/14 and 8/10/14 produced different results
•  No information as to the cause of these changes could be found on the FDA website
•  According to the GitHub records, there were no updates made between these two dates
•  For our brand name data analysis, the most recent results were selected
DATA ISSUES SOLUTION?
Drug Name Harmonization Process
○  FAERS Data Files vs. openFDA API Query
•  Cannot harmonize misspelled drug names
•  Validation process requires using FAERS data files
- Not necessarily easier
RESULTS I: BRAND NAME
Comparing Reports for Yaz from Q1 2004 to Q3 2012
Year    openFDA    FAERS
2004          1        0
2005          0        0
2006         20       19
2007        105      215
2008      2,502    2,498
2009      2,321    2,261
2010      1,472    1,365
2011      8,857    7,881
2012      6,750    5,551
RESULTS I: BRAND NAME
openFDA API query results for safetyreportid “4990905-5”
DRUGNAME for ISR number “4990905” only includes DROSPIRENONE AND ETHINYL ESTRADIOL
RESULTS II: GENERIC NAME
Initial Query
https://api.fda.gov/drug/event.json?search=receivedate:[20040101+TO+20120930]+AND+patient.drug.openfda.generic_name:"DROSPIRENONE+ETHINYL+ESTRADIOL"&count=patient.drug.openfda.brand_name
openFDA Results
Total: 714
Yaz: 107
Revised Query
https://api.fda.gov/drug/event.json?search=receivedate:[20040101+TO+20120930]+AND+patient.drug.openfda.generic_name:"DROSPIRENONE"+AND+patient.drug.openfda.generic_name:"ETHINYL"+AND+patient.drug.openfda.generic_name:"ESTRADIOL"&count=patient.drug.openfda.brand_name
openFDA Results
Total: 31,051
Yaz: 22,028
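The difference between the two queries is easier to see when they are built programmatically. The sketch below assembles both URL forms from the same inputs: one quoted phrase versus one clause per word joined with AND. The helper names are hypothetical. The field names and URL layout are taken from the slide.

```python
BASE_URL = "https://api.fda.gov/drug/event.json"
GENERIC_FIELD = "patient.drug.openfda.generic_name"
BRAND_FIELD = "patient.drug.openfda.brand_name"


def date_range(start, end):
    """Restrict results to reports received between two YYYYMMDD dates."""
    return f"receivedate:[{start}+TO+{end}]"


def phrase_query(start, end, generic_name):
    """Initial form: the generic name searched as a single quoted phrase."""
    quoted = generic_name.replace(" ", "+")
    search = f'{date_range(start, end)}+AND+{GENERIC_FIELD}:"{quoted}"'
    return f"{BASE_URL}?search={search}&count={BRAND_FIELD}"


def all_terms_query(start, end, generic_name):
    """Revised form: each word of the generic name required separately."""
    clauses = [f'{GENERIC_FIELD}:"{word}"' for word in generic_name.split()]
    search = "+AND+".join([date_range(start, end)] + clauses)
    return f"{BASE_URL}?search={search}&count={BRAND_FIELD}"


initial = phrase_query("20040101", "20120930", "DROSPIRENONE ETHINYL ESTRADIOL")
revised = all_terms_query("20040101", "20120930", "DROSPIRENONE ETHINYL ESTRADIOL")
print(initial)
print(revised)
```

Keeping both builders side by side makes it simple to rerun the pair for other generic names and compare the totals each form returns.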
  
Results
The drug harmonization process incorrectly associated reports for Drospirenone Ethinyl Estradiol with the drug Yaz.
●  Raises concerns about the drug harmonization process for Yaz as well as other drugs
●  Further study is needed to validate the accuracy of the openFDA data
RESULTS II: GENERIC NAME
The reported cases for Yaz from the openFDA API and the FAERS Data Files varied widely when compared by year of report.
●  For 2007, the API included only 105 cases, 51% fewer than the 215 cases in FAERS.
●  For 2011, the API included 8,857 cases, 12.4% more than the 7,881 cases in FAERS.
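The percentages quoted here follow from a simple relative difference against the FAERS count. The sketch below recomputes them from the yearly counts shown in the brand-name chart, assuming the counts pair with the years 2004 to 2012 in the order they appear on the chart axis. Under that pairing the 105 vs. 215 comparison falls in 2007.

```python
YEARS = list(range(2004, 2013))
# Yearly report counts for Yaz as read from the chart, in axis order (assumed).
OPENFDA = [1, 0, 20, 105, 2502, 2321, 1472, 8857, 6750]
FAERS = [0, 0, 19, 215, 2498, 2261, 1365, 7881, 5551]


def percent_difference(api_count, faers_count):
    """Signed difference of the API count relative to FAERS, in percent."""
    if faers_count == 0:
        return None  # undefined when FAERS has no reports for the year
    return 100.0 * (api_count - faers_count) / faers_count


comparison = {
    year: percent_difference(api, faers)
    for year, api, faers in zip(YEARS, OPENFDA, FAERS)
}

for year, diff in comparison.items():
    label = "n/a" if diff is None else f"{diff:+.1f}%"
    print(year, label)
```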
Data, Trust, and Reproducibility
Which is better?
●  Traditional methods vs. newer approaches
●  Data processing & data validation
●  Access via API vs. database
●  Implications for pharmaceutical research, data science, data technology & development
Data Dependence
●  Risk of relying on API data
EX: http://download.open.fda.gov/
openFDA Data
Three ways to download data from openFDA:
●  Download manually. There is a downloads section on each endpoint's openFDA page.
●  Write code to download the data automatically. Use a special API query to get a list of all the current data files for each endpoint.
●  Synchronize with the openFDA S3 bucket.
MongoDB
●  Collecting query records
●  Storing query results
●  Setting up data environment
LIVE DEMO
openFDA API website
https://open.fda.gov/index.html
FDA’s Example Query
https://open.fda.gov/api/reference/#example-query
LIVE DEMO
FDA’s Example Query: https://open.fda.gov/api/reference/#example-query
Original Query
https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:"nonsteroidal+anti-inflammatory+drug"&count=patient.reaction.reactionmeddrapt.exact
Our Query
https://api.fda.gov/drug/event.json?search=receivedate:[20040101+TO+20120930]+AND+patient.drug.openfda.brand_name:"Yaz"
LIVE DEMO
https://api.fda.gov/drug/event.json?api_key=83k0zAKbRQbk5rCbedjpDs8DdKwoagWvojeW2ATf&search=receivedate:[20040101+TO+20120930]+AND+receiptdate:[20040101+TO+20120930]+AND+patient.drug.medicinalproduct:%22YAZ%22&count=patient.drug.openfda.brand_name.exact
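A count query of this kind returns a list of term and count pairs under a `results` key. The sketch below turns such a payload into a dictionary and ranks the terms. The sample payload and its numbers are hypothetical stand-ins for a live response, and the helper names are illustrative only.

```python
def counts_by_term(payload):
    """Map each counted term to its count from a count-query response."""
    return {item["term"]: item["count"] for item in payload.get("results", [])}


def top_terms(counts, n=3):
    """Return the n most frequent terms, largest count first."""
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical payload in the shape of a count-query response.
SAMPLE_PAYLOAD = {
    "results": [
        {"term": "YAZ", "count": 50},
        {"term": "YASMIN", "count": 12},
        {"term": "OCELLA", "count": 7},
        {"term": "GIANVI", "count": 3},
    ]
}

counts = counts_by_term(SAMPLE_PAYLOAD)
print(top_terms(counts))
```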
THANK YOU
JENNIFER SHIN
jshin@8pathsolutions.com
@8pathsolutions
Additional Resources
○  http://www.ibmbigdatahub.com/blog/exploring-public-open-data-project-part-1
○  http://www.ibmbigdatahub.com/blog/exploring-public-open-data-project-part-2
○  http://www.ibmbigdatahub.com/blog/exploring-public-open-data-project-part-3
○  http://www.ibmbigdatahub.com/blog/exploring-public-open-data-project-part-4
Footnotes
1. http://www.merckmanuals.com/professional/clinical-pharmacology/adverse-drug-reactions/adverse-drug-reactions
