Diabetes Data Science
Philip E. Bourne PhD, FACMI
Stephenson Chair of Data Science
Director, Data Science Institute
Professor of Biomedical Engineering
peb6a@virginia.edu
https://www.slideshare.net/pebourne
1
@pebourne
American Diabetes Association, June 23, 2018, Orlando
I declare no conflicts of interest …
I am an open science advocate and you can take
all the photos you want (being courteous to others) ...
The slides are all on SlideShare in any case
2
I am not a diabetes researcher...
I am a computational biologist turned data
scientist interested in helping address diabetes,
where I see lots of opportunities
3
So What is Data Science?
4
http://vadlo.com/cartoons.php?id=357
Data science is like the Internet…
If I asked you to define it, you would all say
something different, yet you use it every day…
So What do I Mean by Data Science?
• Use of the ever-increasing amount of open, complex, diverse digital
data
• Finding ways to ask and then answer relevant questions by
combining such diverse data sets
• Arriving at statistically significant conclusions not otherwise
obtainable
• Sharing such findings in a useful way
• Translating such findings into actions that improve the human
condition
5
If you don’t listen to me, listen to:
The NIH Strategic Plan for Data Science
• Support a Highly Efficient and Effective Biomedical Research Data
Infrastructure
• Promote Modernization of the Data-Resources Ecosystem
• Support the Development and Dissemination of Advanced Data
Management, Analytics, and Visualization Tools
• Enhance Workforce Development for Biomedical Data Science
• Enact Appropriate Policies to Promote Stewardship and
Sustainability
6
https://grants.nih.gov/grants/rfi/NIH-Strategic-Plan-for-Data-Science.pdf
Why Now? Drivers of Change
• Generic
• There are ~2.7 zettabytes (2.7 × 10^6 PB) of digital data
• Training data is doubling every two years
• Robust and reusable tools in Python and R
• More advanced tools e.g., Deep Artificial Neural Networks (DNNs)
• New computing power e.g., GPUs, the cloud
• Advances coming from the private sector NOT academia
• Successful integration into workflows & lifestyles – analytics companies
• Diabetes specific
• $1000 genome
• Wearable sensors
• Mandatory EHRs
• “Success” in predictive modelling
7
Pastur-Romay et al. 2016 doi:10.3390/ijms17081313
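
A rough worked example of the two size bullets on this slide. The unit conversion follows from the SI prefixes (1 ZB = 10^21 bytes, 1 PB = 10^15 bytes). The projection assumes, purely for illustration, that a data volume keeps doubling every two years; the function names are hypothetical and not from the talk.

```python
BYTES_PER_PB = 10**15  # petabyte (SI prefix)
BYTES_PER_ZB = 10**21  # zettabyte (SI prefix)


def zb_to_pb(zettabytes):
    """Convert zettabytes to petabytes."""
    return zettabytes * (BYTES_PER_ZB // BYTES_PER_PB)


def projected_size(current, years, doubling_period=2.0):
    """Size after `years`, assuming one doubling every `doubling_period` years."""
    return current * 2 ** (years / doubling_period)


if __name__ == "__main__":
    print(zb_to_pb(2.7))            # 2.7 ZB expressed in PB
    print(projected_size(2.7, 10))  # illustrative size in ZB after ten years
```

Ten years at this assumed rate is five doublings, a 32-fold increase, which is why storage and tooling appear among the drivers listed above.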
Mapping Diabetes to the 5 Pillars of Data Science
8
• Data Acquisition
• Data Integration & Engineering
• Machine Learning & Analytics
• Visualization & Dissemination
• Ethics, Law, Policy, Social Implications
Global Treatment Ecosystem
Virtual Image of the Patient (VIP)
Patient Profile; Analytics
Treatment & Control
Predictive Analysis
Database
Add Genotype, Medical Record
Local Treatment Ecosystem: Real-time data; Predictive analytics; Artificial Pancreas
[Adapted from Boris Kovatchev]
• Screening
• Hypoglycemia
• Insulin-associated weight gain
• Retinopathy
• Neuropathy
• Nephropathy
• Heart disease
Cichosz et al. 2016 J Diabetes Sci & Tech 10(1) 27-34
10
Prediction – Image Recognition
• Google Diabetic Retinopathy – Prediction based on training from 120,000
images classified by 54 ophthalmologists
• Prediction maps inputs (image of the retina) to outputs (a diagnosis of
retinopathy) in a closed system – does not consider confounders, e.g., if the
retina had been operated on
• All the required information is in the data
• Researchers concluded that the algorithm’s performance was in line with
board-certified ophthalmologists and retinal specialists
11
Krause et al. https://doi.org/10.1016/j.ophtha.2018.01.034
Image Recognition - Convolutional Neural Networks
Convolutional Layers
• Convolution detects a feature wherever it resides in the image
Max Pooling Layers
• Downsampling while maintaining key features
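
The two layer types on this slide can be sketched in plain Python. This is a toy illustration, not the network from the retinopathy study: the 5×5 "images", the hand-written edge kernel, and the function names are all assumptions made for the example (a real CNN learns its kernels from data).

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (valid positions only) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Hand-written kernel that responds to a bright-to-dark vertical edge.
KERNEL = [[1, -1],
          [1, -1]]

# The same edge placed at two different horizontal positions.
left = [[1, 1, 0, 0, 0] for _ in range(5)]
right = [[1, 1, 1, 1, 0] for _ in range(5)]

if __name__ == "__main__":
    print(max_pool(conv2d(left, KERNEL)))   # [[2, 0], [2, 0]]
    print(max_pool(conv2d(right, KERNEL)))  # [[0, 2], [0, 2]]
```

The same kernel fires on the edge in both images, only at a different location, and pooling shrinks each 4×4 feature map to 2×2 while keeping the strong response.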
12
Prediction: Comorbidity Network for 6.2M Danes Over 14.9 Years
Jensen et al 2014 Nat Comm 5:4022
13
A Note of Caution
14
Predictive ability overemphasizes what is possible in
healthcare …
There are many confounders …
Does building enough expert knowledge (itself biased) about a
complex system into the algorithm provide accurate outcomes?
The Birthweight Paradox
• What is the causal effect of smoking during pregnancy?
• Confounders – alcohol consumption, diet, prenatal care
• Need to adjust for confounders e.g. birth weight
• BUT birth weight is associated with both infant mortality and
maternal smoking – adjusting for it introduces bias
• Paradox: among low birth weight babies, those whose mothers
smoked during pregnancy show lower mortality
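
The bias described on this slide can be reproduced with a small exact calculation. The sketch below is not from the talk. It assumes a hypothetical unmeasured condition (for example, a birth defect) that lowers birth weight and raises mortality, and every probability in it is invented for illustration. Smoking is harmful by construction, yet it looks protective once the analysis is restricted to low birth weight babies.

```python
from itertools import product

# All numbers below are invented for illustration only.
P_DEFECT = 0.05  # P(unmeasured condition), independent of smoking

# P(low birth weight | smoke, defect): both causes lower birth weight.
P_LOW_BW = {(0, 0): 0.05, (1, 0): 0.30, (0, 1): 0.60, (1, 1): 0.70}


def p_death(smoke, defect):
    """P(infant death | smoke, defect); smoking always adds risk here."""
    return 0.005 + 0.005 * smoke + 0.10 * defect


def mortality(smoke, low_bw=None):
    """Exact P(death | smoke[, low_bw]), summing over the unmeasured defect."""
    num = den = 0.0
    for defect, lbw in product((0, 1), repeat=2):
        if low_bw is not None and lbw != low_bw:
            continue  # restrict to the requested birth weight stratum
        p_u = P_DEFECT if defect else 1 - P_DEFECT
        p_l = P_LOW_BW[(smoke, defect)]
        weight = p_u * (p_l if lbw else 1 - p_l)
        num += weight * p_death(smoke, defect)
        den += weight
    return num / den


if __name__ == "__main__":
    print("all babies:       smokers", mortality(1), "non-smokers", mortality(0))
    print("low birth weight: smokers", mortality(1, low_bw=1),
          "non-smokers", mortality(0, low_bw=1))
```

In this toy model a low birth weight baby of a non-smoker is more likely to have the unmeasured condition, because smoking is not available as the explanation, so that stratum carries the higher mortality even though smoking raises risk overall.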
15
16
http://cartertoons.com/
Diabetes Platform
Research
Students
Healthcare
Patients
Insightful Care
Rapid Innovation
17
[Adapted from Omar Khurshid]
Should biomedical research be like Airbnb?
doi: 10.1371/journal.pbio.2001818
In Summary
• Data science will have an increasing impact on diabetes
research
• Data scientists & experts need to work together
• Acceptance begins with getting clinicians on board at the
start of the study
• Education in these new approaches is desperately needed
• Bioethical data science training is part of that education even
though policy and law are not keeping pace
18

Editor's Notes

  • #13: A CNN takes small regions and condenses each into one value
  • #14: 16 million hospital inpatient events (24.5% of total), 35 million outpatient clinic events (53.6% of total), and 14 million emergency department events (21.9% of total)