Data Patterns
Life Sciences / Healthcare
Chris Dwan (chris@dwan.org)
https://dwan.org
Take-home messages
Data challenges are large and growing
– Not just volume
– Also variety, velocity, quality
There is no one single perfect solution
– Requirements are diverse
– Real world solutions will be hybrid
Metadata management is a huge challenge
– Even the basics are beyond most small organizations
– We need federated systems to transform medicine
2018 10 igneous
Geek Cred: My First Petabyte, 2008
My first Petabyte: 2008
The evolution of data transfer …
Genomic Data Production in Context
Genomic data production @ Broad
I did research computing at Broad from 2014 to 2017
Geek Cred: My First Petabyte, 2008
My first Exabyte: 2014
Data: The new oil*
Data Base: Structure, queries
Data Warehouse: All the data in one place. Limited integration.
Data Mart: Serve up warehoused data to users (Shiny counts)
Big Data: Volume, Variety, Velocity
Data Lake: Data warehouse, but designed for in-situ analytics
Data Ocean: A data lake, for the cromulently embiggened!
Data Commons: When the benefits of sharing data outweigh the competitive instinct to hoard it
Data Biosphere: A data commons, but for the cool kids
An immature tyrant flycatcher. Needs a data mart, because it doesn’t know R or Linux yet.
Hype-o-meter Impact-o-meter
Primary Data Production
Data are produced on instruments … transformed and distilled … delivered to downstream processes … and archived for various purposes (FDA, HIPAA, Intellectual property, …).
Sequencer / Mass Spec / …
Analysis Systems
High Performance Storage
Customer facing storage
Durable, cost effective storage
I recommend an ‘archive first’ approach,
EMR / ELN / LIMS / LIS
Metadata management is still a massive challenge
Lab_Sample_tracker.xls
Filename_as_metadata_for_eric_v2
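The two file names above are the familiar failure mode: metadata lives in a spreadsheet or in the file name itself. As a minimal sketch, assuming a hypothetical naming convention of `<sample_id>_<assay>_<YYYYMMDD>_v<version>.<extension>` (the convention, field names, and helper functions are all invented for illustration, not taken from the talk), a name can be parsed once into a structured record and kept alongside the file as JSON:

```python
import json
import re
from dataclasses import asdict, dataclass

# Hypothetical convention, for illustration only:
#   <sample_id>_<assay>_<YYYYMMDD>_v<version>.<extension>
FILENAME_PATTERN = re.compile(
    r"^(?P<sample_id>[A-Za-z0-9-]+)_(?P<assay>[A-Za-z0-9]+)_"
    r"(?P<run_date>\d{8})_v(?P<version>\d+)\.(?P<extension>[A-Za-z0-9.]+)$"
)


@dataclass
class FileMetadata:
    sample_id: str
    assay: str
    run_date: str
    version: int
    extension: str


def parse_filename(name: str) -> FileMetadata:
    """Turn a conforming file name into a structured record, or fail loudly."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {name!r}")
    fields = match.groupdict()
    fields["version"] = int(fields["version"])
    return FileMetadata(**fields)


def to_sidecar_json(meta: FileMetadata) -> str:
    """Serialize the record so it can be stored next to the data file."""
    return json.dumps(asdict(meta), sort_keys=True)
```

A name like `Filename_as_metadata_for_eric_v2` is rejected rather than silently accepted, which is the point: the convention becomes something a machine can check.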
Quality Matters
Ask a computational biologist / data scientist what fraction of their time is spent fighting data quality, formatting, and similar issues.
Multiply that by an entire industry.
They deserve better.
Machine Learning (ML)
Algorithms that optimize and tune based on large amounts of data
These have been around for a very long time (KNN and Linear Regression are totally ML).
Algorithm innovations (deep neural nets), plus ubiquitous big data, plus improvements in computing, storage, network, and software.
Killer apps everywhere in image recognition, natural language processing, clustering, categorization
Hype-o-meter Impact-o-meter
A ‘swan pink yellow’ columbine flower. Identifying objects in images is machine work now.
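To make the "KNN is totally ML" remark concrete, here is a minimal k-nearest-neighbors classifier in plain Python. It is a sketch on invented toy data, not a production implementation: the function name `knn_predict` and the `TRAIN` points are assumptions for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort labeled points by Euclidean distance to the query, keep the closest k.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated clusters.
TRAIN = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
```

There is no training step at all: the "model" is the data, which is one reason data quality and volume matter so much to these methods.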
Data for Analytics / ML / AI
A large and growing set of data is curated … and mined for insights.
Insights take both short and long paths back into the system.
Commercial / outsource labs
Public or licensed datasets
In-house labs
Curation
Analysis Systems
High Performance Storage
Analyst
Durable, cost effective storage
• What does “backup” mean, exactly?
• How do we capture provenance without massive duplication?
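One possible answer to the provenance question is content addressing: store each distinct blob once under its hash, and record derivations as small entries that point at hashes instead of copying data. The sketch below is an assumption-laden illustration, not the approach from the talk; `ContentStore`, its methods, and the tool names are hypothetical, and everything is held in memory.

```python
import hashlib


class ContentStore:
    """In-memory sketch: blobs keyed by SHA-256, provenance as hash references."""

    def __init__(self):
        self.blobs = {}       # digest -> bytes; identical content is stored once
        self.provenance = []  # derivation records that reference digests only

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        return digest

    def record(self, inputs, tool, output):
        """Note that `output` was derived from `inputs` by `tool`."""
        entry = {"inputs": sorted(inputs), "tool": tool, "output": output}
        self.provenance.append(entry)
        return entry

    def lineage(self, digest):
        """Return the digests of everything upstream of `digest`."""
        upstream = set()
        stack = [digest]
        while stack:
            current = stack.pop()
            for entry in self.provenance:
                if entry["output"] == current:
                    for parent in entry["inputs"]:
                        if parent not in upstream:
                            upstream.add(parent)
                            stack.append(parent)
        return upstream
```

Storing the same bytes twice costs nothing extra, and the lineage of any result can be walked without keeping a private copy of each input.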
Artificial Intelligence (AI)
Distinguished (for me) by autonomous behavior and clever-looking behavior in the face of unanticipated situations.
No requirement that “intelligent” mean “like a human.”
Machine learning algorithms are a great (but not the only) way to create AI systems.
Beware “bread machine AI.”
Hype-o-meter Impact-o-meter
Getting there!
My cat shows surprising intelligence despite having a brain the size of a walnut
Incredible opportunities here, and rapidly developing data silos
The Clinical Data Ecosystem
There is an incredible wealth of data available to support both clinical care and research
Patient Journals
Consumer products
Unfortunately, it is carved up and isolated
Longitudinal Data from other providers …
Electronic Medical Records
Possibility of a self-normal (N of 1) over time
Diagnostic Imaging
Natural language processing has strong potential
Clinical Notes
Innovations in the basics of clinical observation
Hospital Telemetry
Pressure to avoid incidental findings prevent bias
Primary Lab Data
There are both good and bad reasons for this
Personal Data Impacts Behavior
I use a commercial service that combines labwork with wearable data
They provide insights and coaching
I have, personally, found this transformational in how I approach my health.
Why are we here?
Social Mission
• Improved health outcomes
• Quality-adjusted life-years
• Increased therapeutic effectiveness
• Reduced barriers to access
Scientific / Business Goals
• Publications / Patents / Druggable leads
• Accelerated innovation cycle
• Reduced time to market
Technology / Infrastructure
• Speeds & Feeds
• Improved performance on benchmarks
• Lower cost per unit
• Infrastructure agility
Maslow’s Hierarchy of Needs (top to bottom)
Creativity, Purpose
Confidence, achievement
Friendship, connectedness, belonging
Safety, physical and economic stability
Air, food, shelter, sleep
Wireless Internet, Fully charged battery
If you lack this … you don’t get to engage here
IT Hierarchy of Needs (top to bottom)
“Thought Partner”
Automation and compliance
Productivity and Security, Applications, disaster preparedness
Files, formats, naming conventions, access controls
Phones, Projectors, Internet, Email, Chat
Power, Building Access, Laptops, Wifi, Identity
If you lack this … you don’t get to engage here
Data Visibility Saves Money
Private Data Holdings
Public Data
Backups
…
Private copy of public data
$$ !!
Lack of data visibility leads to increased costs and engineering challenges.
It is depressingly common to see multiple representations of the same data, all being archived together.
BAM BCL FASTQ
This is also a metadata challenge
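The simplest slice of this problem, byte-identical copies such as a private copy of public data, can be found by hashing file contents. The following is a minimal sketch under that assumption; the function names are invented for illustration, and it walks a local directory tree only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict:
    """Map each content digest to the files sharing it, keeping only repeats."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Hashing will not notice that a BAM, a BCL run folder, and a FASTQ file all represent the same experiment, because their bytes differ. Catching that case needs sample-level metadata, which is why the slide calls this a metadata challenge as well.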
Challenge Architecture: The data DMZ
• An architecture to support data creation, delivery, and use
• … for seamless collaboration between organizations …
• … without sacrificing security, appropriate usage, or privacy …
• … and that delivers on the potential of modern analytic capabilities.
Blockchain
“The clown car of our industry in 2018”
• Distributed ledger: trustworthy data / records without a central authority.
• Self-executing contracts: shared, trustworthy code to operate on that data.
• Initial Coin Offerings: massively accelerated (and deregulated) way to set monetary value on a data ecosystem.
Amazing possibilities in permission / consent management.
When I make snarky comments on LinkedIn, people ask if they can invest.
Hype-o-meter Impact-o-meter
The angel weeps because there are some really compelling use cases for blockchain, but the hype is deafening.
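Underneath the hype, the ledger idea rests on a small mechanism: each record commits to the hash of the one before it, so any later edit is detectable. The sketch below shows only that hash chain, applied to made-up consent records. It is a single in-memory list with no distribution, consensus, or contracts, and the function and field names are assumptions for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(index, payload, prev_hash):
    """Hash a record's contents together with the previous record's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record that commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(index, payload, prev_hash),
    })
    return chain


def verify(chain):
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, record in enumerate(chain):
        if record["index"] != index or record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(index, record["payload"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True
```

What this does not provide is the "without a central authority" part: whoever holds the list can rewrite all of it. That gap is what distributed consensus is meant to close.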
Take-home messages
Data challenges are large and growing
– Not just volume
– Also variety, velocity, quality
There is no one single perfect solution
– Requirements are diverse
– Real world solutions will be hybrid
Metadata management is a huge challenge
– Even the basics are beyond most small organizations
– We need federated systems in order to transform medicine.
Questions?
chris@dwan.org
