data competitive
Dr June Andrews
March 26, 2019
Delphi Data
outline
Data X
∙ Data Informed
∙ Data Enabled
∙ Data Driven
∙ Data Competitive
Predicting Data ROI Challenges
∙ Metrics
∙ Repetition in DS
∙ Replication
∙ Systematic Building with ML
∙ Uncanny Valley of ML
data x
data informed use case
Data is an input into a decision system based on expert knowledge.
The task could be done without cloud-based data; it would just take longer.
Process map for operations of a plant. [Siemens]
data informed requirements
Checklist:
∙ Existing application already in production
∙ Known quantity to measure and how to measure it
∙ Ability to stream measurement to people at the right time
∙ Known policy on how to react to different measured values
data informed roi
Low Variance on Expected ROI
Predictable costs and benefits.
Cost:
∙ Infrastructure to stream measurements
∙ Portal to view measurements
∙ Data storage for compliance
Benefit:
∙ Saves time and manpower over manually collecting measurements
∙ Reliable auditing of historical values
∙ Centralized monitoring
data enabled use case
Data enables a product that can be designed with only product knowledge.
Pfizer converts from Data Informed to Data Enabled [HBS 2015]
data enabled requirements
Checklist:
∙ Product vision of what needs to be built.
∙ Clear specs on what data is needed.
∙ Clear specs on how to use the data.
∙ Reliable serving and maintenance of the data.
data enabled roi
Conditional Impact on Expected ROI
If data enables the use case, then traditional product ROI.
Cost:
∙ All costs of Data Informed (infrastructure, serving & storage)
∙ New: data acquisition
∙ New: data processing (may require ETL & some ML)
∙ New: higher personnel costs given the above
Benefit:
∙ New product opportunities.
∙ Better customization in a global marketplace.
data driven use case
Given the goals & constraints of the company, data decides the outcome.
Netflix has a 2019 budget of $15B for content creation, spent with a mix of
traditional & data driven approaches.
data driven requirements
Checklist:
∙ Clear goals & constraints
∙ Data for mining
∙ Infrastructure for data exploration
∙ People
∙ A conduit for results to be acted on
data driven roi
Uncertain ROI
Similar to R&D costs and benefits.
Expensive to start & hard to predict.
Cost:
∙ All costs of Data Enabled
∙ New: additional infrastructure & personnel for exploration
∙ New: discovery rate (how many attempts it takes to produce a finding)
Benefit:
∙ Make rapid decisions without becoming an expert
∙ Create products that couldn’t have existed otherwise
∙ Optimize product market fit
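The discovery-rate cost can be made concrete with a back-of-the-envelope calculation. The sketch below assumes every exploration attempt costs the same and independently yields a finding with a fixed probability; the function name and the figures are illustrative and do not come from the talk.

```python
def expected_cost_per_finding(cost_per_attempt: float, discovery_rate: float) -> float:
    """Expected spend needed to obtain one finding.

    Assumption (hypothetical): attempts are independent and each one yields a
    finding with probability `discovery_rate`, so the expected number of
    attempts per finding is 1 / discovery_rate.
    """
    if not 0 < discovery_rate <= 1:
        raise ValueError("discovery_rate must be in (0, 1]")
    return cost_per_attempt / discovery_rate


# Illustrative numbers only: a $50k exploration with a 1-in-5 hit rate.
print(expected_cost_per_finding(50_000, 0.2))
```

Under these assumptions, halving the discovery rate doubles the expected cost of each finding, which is one reason the ROI is hard to predict up front.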
data driven doesn’t always pay off
New is always interesting, not always better.
Forbes & HBS have many articles on the failures of companies to benefit
from their attempts to become data driven.
data competitive use case
Data is used systematically over the course of years to optimize the
competitive advantage components of a company.
Combination of data informed (Like Button), data enabled (Messages), and
data driven (News Feed)
data competitive requirements
Checklist:
∙ No checklist is going to work - need to evolve in the face of
changing markets and technological abilities.
new mindset for creating a data competitive company
The goal should be to have enough system support for everything to work,
and then to invest heavily in what contributes to competitive
advantage.
components of a data competitive company
Traditional areas of a company
∙ People
∙ Vision
∙ Strategy
∙ Market
∙ Product
∙ Communication
Then Data
Treat data as an additional component of a functioning company
that needs to integrate with existing components.
investments in data can then be focused on what matters
Usain Bolt’s taste buds work at the level of guaranteeing he can eat
enough safe food for the rest of his body. He does not have the best
sense of taste in the world.
data competitive roi
Start with Low ROI, then iterate.
Learn from what has worked to predict what will work & adjust.
Cost:
∙ Combination of {Data Informed, Data Enabled, Data Driven}
∙ Continuous integration of data into all aspects of the company
Benefit:
∙ Ability to adjust to market demands
∙ Flexibility to integrate latest advantages of AI/ML/DS
predicting data roi challenges
challenge - metric design
Metrics are the translation layer between what we want and what we tell
the machines we want.
example - fire ring model
Simple case - ask ML to sign up as many users as possible.
Fire Ring model in network effects demonstrates how the number of
users on a site can exponentially expand and then quickly disappear.
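The gap between "sign-ups" and what we actually want can be seen in a toy simulation. This is a minimal sketch using a simple epidemic-style model, not the Fire Ring model itself; the function name, parameters, and values are all assumptions chosen for illustration.

```python
def simulate_signups(population=10_000, seed_users=10, invite_rate=0.5,
                     churn_rate=0.1, steps=100):
    """Toy growth model (an assumption, not the Fire Ring model from the talk).

    Each step, active users recruit from the people who have not yet signed
    up, and a fixed fraction of active users leaves for good.
    Returns (cumulative_signups, active_users), one value per step.
    """
    untouched = population - seed_users   # people who have never signed up
    active = float(seed_users)            # users currently on the site
    total = float(seed_users)             # cumulative sign-ups
    signups, actives = [total], [active]
    for _ in range(steps):
        new = invite_rate * active * untouched / population
        churned = churn_rate * active
        untouched -= new
        active += new - churned
        total += new
        signups.append(total)
        actives.append(active)
    return signups, actives


signups, actives = simulate_signups()
peak = max(range(len(actives)), key=actives.__getitem__)
print(f"sign-ups at end: {signups[-1]:.0f}")
print(f"active users peak at step {peak}, then fall to {actives[-1]:.0f}")
```

In this sketch the cumulative sign-up count never decreases, so a system optimized for sign-ups alone would report success even while the number of active users collapses.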
solution - metric design principles
challenge - longevity of data science results
Analysis             Tools                     People
Funnel Analysis      Logging                   Recruiting
User Targeting       Data Quality              Organization
Forecasting          Experiments               Data Literacy
Opportunity Sizing   Metrics & Dashboarding    Ladders
Spam Minimization    Productionizing Models    Decision Processes
Deep Dives           Central Repositories
Sample of Common Data Science Focuses
solution - repetition in data science
Roughly every 2 years, work in these areas will be redone because of
∙ Upgrades
∙ Changing Landscapes
∙ Ownership Bias (’Not Analyzed by Me’)
∙ Forgetfulness
Solutions
∙ ’Bow on Top’ time dedicated at the end of every project
∙ Document Tribal History
∙ Invite previous employees back for reviews
challenge - trusting data science
solution - trusting data science
Bottom line
It matters which data scientist does an analysis
Solutions
∙ Keep track of data scientists’ success rates
∙ Identify skill gaps and train folks
∙ Identify critical and chaotic decisions - have multiple data
experts produce solutions
challenge - systematic innovation
challenge - uncanny valley of ai
Progression of breakthrough CGI characters in movies: Toy Story (1995),
Final Fantasy: The Spirits Within (2001), and Terminator Genisys (2015)
ideal outcome of ml
Ideally, we want to design systems that are only improved with ML
finding the uncanny valley of ml
Mechanical Turk UI for labeling the number of coffee mugs in an image.
Note that adding the extra question ‘Is the Suggestion Correct’ and the
phrase ‘If NOT correct’ was necessary. Without those UI components,
Mechanical Turk workers labeled images on autopilot and ignored the
suggested number of coffee mugs, so labeling accuracy was the same
regardless of the ML accuracy. Additionally, workers who completed
fewer than 3 labels were filtered out as noise.
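The noise filter described above can be sketched in a few lines. The record layout (worker id, image id, mug count) and the function name are hypothetical; only the threshold of 3 labels comes from the text.

```python
from collections import Counter

MIN_LABELS = 3  # threshold stated in the text


def filter_low_volume_workers(labels, min_labels=MIN_LABELS):
    """Drop every label from workers who submitted fewer than `min_labels`.

    `labels` is a list of (worker_id, image_id, mug_count) tuples; this
    record layout is a hypothetical stand-in for the real export format.
    """
    per_worker = Counter(worker_id for worker_id, _, _ in labels)
    return [row for row in labels if per_worker[row[0]] >= min_labels]


labels = [
    ("w1", "img1", 2), ("w1", "img2", 0), ("w1", "img3", 1),
    ("w2", "img1", 5),                      # one label: treated as noise
    ("w3", "img2", 0), ("w3", "img3", 1),   # two labels: also filtered out
]
kept = filter_low_volume_workers(labels)
print(kept)
```

Filtering by worker rather than by individual label keeps the remaining data consistent: a worker's labels are either all trusted or all discarded.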
uncanny valley of ml
conclusion
Data Science can be systematic, principled, and foundational.
First it must take its own advice & measure what it wants to improve.
After looking in the mirror, iterate, improve & compete.
