Baker Tilly refers to Baker Tilly Virchow Krause, LLP,
an independently owned and managed member of Baker Tilly International. © 2012 Baker Tilly Virchow Krause, LLP
Baker Tilly
Management Consulting
Realizing Business Value from Unstructured Data
THE VALUE OF UNSTRUCTURED DATA
ANALYTICS
There is a tremendous opportunity to gain a competitive advantage by
analyzing unstructured data
Industries continue to struggle with integrating unstructured-data analytics into their business models. It is time-consuming to identify all of the relevant data sources and technically challenging to ingest the data into an analytics environment, where additional processing must occur before the data can be analyzed.
Leveraging a Broad Variety of Data:
Companies must be able to transform and parse data from multiple sources and in multiple formats: databases, text files, scientific devices, transactions, and even social media postings. End users also need easy, consistent access to all of this data to create a 360-degree view of their customers, their products, or their brand.
The Value of Unstructured Data Analytics
Unlocking the True Potential of Big Data
Mapping data sources to use-cases
Business use cases that can benefit from an analysis of unstructured data include:
• Clinical Trial Development and Analysis: Expedite the analysis of patient diary and Patient Reported Outcome (PRO) data to reduce time-to-market (and potentially uncover unanticipated benefits in early-stage trials)
• Clinical Trial PRO Development: Analyzing publicly available discussion forum data can accelerate the development of Patient Reported Outcome measures and streamline the FDA’s protocol review process
• Active Market Surveillance (Pharmacovigilance): Are patients using and experiencing your product in a manner that is consistent with your clinical trial data?
• Market Intelligence: Understanding how your customers describe their experiences with specific medications can inform market positioning and facilitate targeted messaging
• Labelling Claim Expansion: Are customers articulating unanticipated applications and benefits that can be used to inform programmatic expansion of an existing compound?
Data Sources Include:
Clinical Research
- PubMed
- www.clinicaltrials.gov
- FDA.gov
Patient Support Sites
- patientslikeme.com
- dailystrength.org
- askapatient.com
Social Media Platforms
- Reddit
- Twitter
- Facebook
Internal
- Clinical Trial Data (e.g., Patient Diaries)
- Call Center Notes
- Documents
The abundance of data provides tremendous opportunity…
And an overwhelming amount of data points:
We can help separate the meaningful from the meaningless
MAPPING DATA SOURCES TO USE CASES
Availability of Data
Relevant data is readily available:
CASE STUDY: PHARMACOVIGILANCE
Case-Study: Pharmacovigilance
Data Source: Reddit
• 234M unique users
• 853,824 subreddits
• 11,464 active communities
• 217 countries
• 8 billion page views monthly
• 13+ minutes spent on average
Sample Use-Case: Pharmacovigilance
Data Source: www.reddit.com
Analyzing all of the post titles can yield value…
But analyzing the conversations people are having, along with associated metadata such as post date and number of comments, can be far more powerful
The Challenges with Analyzing Externally Sourced Unstructured Data:
• There are thousands of posts and tens of thousands (or more) of comments
• Without technology and a methodical text-mining process, gaining insight would require manual review and data collection
• Mining insights from the data manually would take on the order of months, making it difficult to impact business decisions
• To source the data, we wrote a Python script to crawl the site and scrape the data
• We ran a query on Reddit using ‘Lipitor’ as the search term and analyzed the results using Python and Oracle Big Data Discovery
• The following are some of the visualizations and insights we were able to glean from the data
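The deck does not show the script itself. A minimal sketch of the sourcing step, assuming Reddit's public JSON search listing (`https://www.reddit.com/search.json`) and its usual `data.children[].data` layout, might look like the following. The function names, the chosen fields, and the user-agent string are illustrative assumptions, not the authors' code.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

# Assumption: Reddit's public JSON search listing; not confirmed by the deck.
SEARCH_URL = "https://www.reddit.com/search.json"


def build_search_url(term, limit=100):
    """Return the search URL for a term (a single page is capped at 100 results here)."""
    query = urllib.parse.urlencode({"q": term, "limit": min(limit, 100)})
    return f"{SEARCH_URL}?{query}"


def parse_listing(listing):
    """Flatten a Reddit 'Listing' payload into one small dict per post."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        data = child.get("data", {})
        created = datetime.fromtimestamp(data.get("created_utc", 0), tz=timezone.utc)
        posts.append({
            "id": data.get("id"),
            "title": data.get("title", ""),
            "subreddit": data.get("subreddit", ""),
            "num_comments": data.get("num_comments", 0),
            "posted": created.date().isoformat(),
        })
    return posts


def fetch_posts(term, user_agent="unstructured-data-demo/0.1"):
    """Fetch one page of search results (performs a network call)."""
    request = urllib.request.Request(
        build_search_url(term), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_listing(json.load(response))
```

Calling `fetch_posts("Lipitor")` would return one page of posts. Keeping the parsing separate from the network call lets the same flattening logic be reused on saved payloads.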
We can view a quick top-line summary of the data set and its KPIs:
And a distribution of where posts have been submitted:
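The slide's figures came from Oracle Big Data Discovery and are not reproduced here. As a rough sketch of the same kind of top-line summary in plain Python, assuming each post is a dict with hypothetical `subreddit` and `num_comments` keys and that "where" means the subreddit:

```python
from collections import Counter


def summarize(posts):
    """Compute simple KPIs and the distribution of posts by subreddit."""
    total_comments = sum(post["num_comments"] for post in posts)
    by_subreddit = Counter(post["subreddit"] for post in posts)
    return {
        "posts": len(posts),
        "comments": total_comments,
        # Guard against an empty result set.
        "avg_comments_per_post": round(total_comments / len(posts), 1) if posts else 0.0,
        "subreddits": len(by_subreddit),
        "top_subreddits": by_subreddit.most_common(3),
    }
```

The KPI names are illustrative; a real dashboard would pick whichever measures matter for the use case.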
Symptoms that people discuss, buried in the comments section of the posts, have been tagged, aggregated, and visualized in a tag cloud:
And we can see how the volume of comments about the symptoms has changed over time:
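The deck does not describe how symptoms were tagged. One simple way to approximate the tagging and aggregation is a dictionary-based matcher, sketched below. The lexicon, the function names, and the `(iso_date, text)` input shape are all hypothetical, and a production system would more likely rely on a medical vocabulary than a hand-written list.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexicon: canonical symptom -> phrases that signal it.
SYMPTOM_LEXICON = {
    "muscle pain": ["muscle pain", "muscle ache", "myalgia"],
    "fatigue": ["fatigue", "tired", "exhausted"],
    "memory loss": ["memory loss", "forgetful", "brain fog"],
}


def _compile(phrases):
    """Build one case-insensitive, whole-word pattern for a list of phrases."""
    alternatives = "|".join(re.escape(phrase) for phrase in phrases)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


PATTERNS = {symptom: _compile(phrases) for symptom, phrases in SYMPTOM_LEXICON.items()}


def tag_symptoms(text):
    """Return the set of canonical symptoms mentioned in one comment."""
    return {symptom for symptom, pattern in PATTERNS.items() if pattern.search(text)}


def aggregate(comments):
    """Count tagged comments overall (tag-cloud weights) and per month (trend)."""
    overall = Counter()
    by_month = defaultdict(Counter)
    for iso_date, text in comments:
        tags = tag_symptoms(text)
        overall.update(tags)
        by_month[iso_date[:7]].update(tags)  # 'YYYY-MM' bucket
    return overall, dict(by_month)
```

The overall counts can feed the tag-cloud weights, and the monthly counts give the volume-over-time series. Each comment counts at most once per symptom, so repeated mentions in one comment do not inflate the totals.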
We can see the distribution of comments by location…
And limit our analysis to a geographic location of interest. Our summary data updates automatically based on this refinement:
We can set up alerts that tell us when Pfizer products are mentioned:
And configure the alerts to show us the terms that were used to flag them:
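The alerting shown on the slide was configured in the analytics tool. The underlying idea can be sketched as a watch-list match that records which terms triggered each alert. The watch list below is a hypothetical example (Lipitor and its generic name, atorvastatin), and the `id` and `text` keys are assumed field names.

```python
import re

# Hypothetical watch list; a real one would cover the full product portfolio.
WATCHLIST = ["Lipitor", "atorvastatin"]


def build_matcher(terms):
    """Compile a case-insensitive, whole-word pattern for the watch-list terms."""
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def find_alerts(posts, terms=WATCHLIST):
    """Return one alert per matching post, listing the terms that flagged it."""
    matcher = build_matcher(terms)
    alerts = []
    for post in posts:
        hits = sorted({match.group(1).lower() for match in matcher.finditer(post["text"])})
        if hits:
            alerts.append({"id": post["id"], "matched_terms": hits})
    return alerts
```

Returning the matched terms alongside the post id mirrors the slide's second point: a reviewer can see why an item was flagged without rereading the whole thread.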
Users have complete visibility into the source data, and powerful search technology makes it easy to find key words and phrases:
Summary
• Robust publicly available unstructured data provides opportunities to inform multiple use-cases, including:
- Pharmacovigilance
- Competitor Analysis
- Market Research
- Expedited Clinical Trial End-Point Development
• For most companies, these data points remain an ‘aspirational’ data source that is difficult to include in business processes
• Barriers include:
- Identifying the relevant publicly available data sources
- Technical challenges associated with sourcing the data
- Methodology/Technical approach to generating insights (Text Analytics)
- Integrating insights into Business Processes
• Baker Tilly can help!
Proposed next steps
• Custom Demo Development
- Conduct a half-day onsite discovery working session
- Define a high-value use case for the demo
- Identify 2-3 high-value unstructured sources for inclusion in the demo
- Develop 4-5 visualizations to demonstrate value and surface insights
Interested in learning more? Contact Andrew Malinow, PhD

Life Science Analytics


Editor's Notes

  • #4: https://www.informatica.com/content/dam/informatica-com/global/amer/us/collateral/executive-brief/big-data-pharmaceutical-industry_ebook_2341.pdf