Big Data in Life Sciences
Dr. Matthieu-P. Schapranow
CMS Global Life Sciences Forum, Frankfurt, Germany
Nov 9, 2015
What is the Hasso Plattner Institute, Potsdam, Germany?
Schapranow, CMS Global Life Sciences, Frankfurt, Nov 9, 2015
Big Data in Life Sciences
What are the Trends?
Google Trends comparison of the search terms Big Data, Life Sciences, and Precision Medicine
https://www.google.com/trends/explore#q=Big data%2C Life sciences%2C Precision medicine&cmpt=q @ Nov 9, 2015
IT Challenges in Life Sciences
Distributed Heterogeneous Data Sources
■  Human genome/biological data: >750 GB per complete human genome; >15 PB in databases of leading institutes
■  Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
■  Clinical trials: >30k recruiting trials on ClinicalTrials.gov
■  Human proteome: 160M data points (2.4 GB) per sample; >3 TB raw proteome data in ProteomicsDB
■  PubMed database: >24M publications containing unstructured data
■  Hospital information systems: >50 GB structured relational data
■  Medical sensor data: scan of a single organ creates 10 GB of raw data within 1 s
■  Cancer patient records: >160k records at NCT alone
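To put the volumes on the IT challenges slide into perspective, the following is a minimal back-of-the-envelope sketch using only the figures quoted there. It assumes decimal units (1 GB = 10^9 bytes), which the slide does not specify, and it treats the lower bounds (">750 GB", ">15 PB") as exact values and the scanner rate as sustained, purely for illustration. All variable and function names are hypothetical.

```python
# Back-of-the-envelope arithmetic on the figures quoted on the slide.
# Assumption (not stated in the source): decimal units, 1 GB = 10**9 bytes.
GB = 10**9
TB = 10**12
PB = 10**15


def average_bytes(total_bytes: float, item_count: float) -> float:
    """Average storage footprint of a single item."""
    return total_bytes / item_count


# Prescription data: 1.5B records in 100 GB.
prescription_bytes_per_record = average_bytes(100 * GB, 1.5e9)

# Human proteome: 160M data points in 2.4 GB per sample.
proteome_bytes_per_point = average_bytes(2.4 * GB, 160e6)

# How many complete genomes (750 GB each, lower bound) fit into 15 PB?
genomes_in_15_pb = (15 * PB) / (750 * GB)

# Assumption: the scanner sustains 10 GB per second; expressed in TB per hour.
sensor_tb_per_hour = (10 * GB * 3600) / TB

print(f"Prescription record: about {prescription_bytes_per_record:.0f} bytes")
print(f"Proteome data point: about {proteome_bytes_per_point:.0f} bytes")
print(f"Genomes in 15 PB:    about {genomes_in_15_pb:,.0f}")
print(f"Sensor output:       about {sensor_tb_per_hour:.0f} TB per hour")
```

Under these assumptions a prescription record averages roughly 67 bytes and a proteome data point 15 bytes, while a single sequenced genome is about ten orders of magnitude larger. The heterogeneity lies in the granularity of the sources as much as in their total size.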
Healthcare Interactions in the 21st Century
■  Interaction types shown in the diagram: direct interaction, indirect interaction
■  Actors: Clinician, Patient, Researcher, Pharmaceutical Company, Healthcare Providers, Hospital, Research Center, Laboratory, Patient Advocacy Group
Use Case: Precision Medicine in Oncology
Identification of Best Treatment Option for Cancer Patient
■  Patient: 48 years, female, non-smoker, smoke-free environment
■  Diagnosis: Non-Small Cell Lung Cancer (NSCLC), stage IV
1.  Surgery to remove tumor
2.  Tumor sample is sent to laboratory to extract DNA
3.  DNA is sequenced resulting in 750 GB of raw data per sample
4.  Processing of raw data to perform analysis
5.  Identification of relevant driver mutations using international medical knowledge
6.  Informed decision making
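Step 5 above can be pictured as a lookup of the patient's genetic variants in a curated knowledge base. The following is a minimal sketch of that idea, not the method used by the presented platform: the `Variant` type, the `find_driver_candidates` function, and every gene name, variant label, and annotation are hypothetical placeholders rather than real medical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A genetic variant, identified by gene and change (placeholder labels)."""
    gene: str
    change: str


# Hypothetical knowledge base: known variants mapped to free-text annotations.
KNOWLEDGE_BASE = {
    Variant("GENE_A", "p.X100Y"): "placeholder annotation from source 1",
    Variant("GENE_B", "p.Z200W"): "placeholder annotation from source 2",
}


def find_driver_candidates(patient_variants, knowledge_base):
    """Return the patient variants that have an entry in the knowledge base."""
    return {v: knowledge_base[v] for v in patient_variants if v in knowledge_base}


# Hypothetical output of step 4 (processed sequencing data) for one patient.
patient_variants = [
    Variant("GENE_A", "p.X100Y"),
    Variant("GENE_C", "p.Q300R"),
]

matches = find_driver_candidates(patient_variants, KNOWLEDGE_BASE)
for variant, note in matches.items():
    print(f"{variant.gene} {variant.change}: {note}")
```

A frozen dataclass is hashable, so variants can serve directly as dictionary keys and each lookup takes constant time on average. A real system would also have to handle differing variant nomenclatures and the provenance of each annotation.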
we.analyzegenomes.com
Real-time Analysis of Big Medical Data
■  Applications: Drug Response Analysis, Pathway Topology Analysis, Medical Knowledge Cockpit, Oncolyzer, Clinical Trial Recruitment, Cohort Analysis, ...
■  In-Memory Database
■  Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
■  Combined and Linked Data: Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
■  Indexed Sources
Real-time Data Analysis and
Interactive Exploration
Drug Response Analysis
Data Sources
■  Patient-specific Data: smoking status, tumor classification and age (1 MB - 100 MB)
■  Tumor-specific Data: raw DNA data and genetic variants (100 MB - 1 TB)
■  Compound Interaction Data: medication efficiency and wet lab results (10 MB - 1 GB)
Showcase
Calculating Drug Response…Predict Drug Response
Cetuximab might be more beneficial for the current case
Where do you find additional information?
■  Online: Visit we.analyzegenomes.com for the latest research results, slides, videos, tools, and publications
■  Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014
■  In Person: Join us for “Festival of Genomics” Jan 19-21, 2016 in London, UK
Keep in contact with us!
Hasso Plattner Institute
Enterprise Platform & Integration Concepts (EPIC)
Program Manager E-Health
August-Bebel-Str. 88
14482 Potsdam, Germany
Dr. Matthieu-P. Schapranow
schapranow@hpi.de
http://we.analyzegenomes.com/