Analytics - Lessons Learnt
Dr. Venkata Pingali
April 1, 2016
Basic Process
Conceptual Process
[Diagram: three roles (Biz, Analytics Team, Data Engg) linked by flows labeled Qtns/Context, Data Req, Datasets, Model Results, Story Telling]
All three roles could be in a single team!
Process in Reality
[Diagram: the same three roles (Biz, Analytics Team, Data Engg) and flows (Qtns/Context, Data Req, Datasets, Model Results, Story Telling), annotated: Iterative, Uncertain, Expensive, Laborious]
http://fortune.com/2016/02/05/why-big-data-isnt-paying-off-for-companies-yet/
"80% of .. companies strategic decision go haywire.. “flawed” data
Nature of Domain
Sense-making with Purpose
● Goal is impact - real change in the real world
○ Not mathematical machismo
○ Not blogs, presentations, etc.
● Model + Delivery = Impact
● Model - an approximation to real world
○ Three levels - Question, Domain, Process
○ The real world has (unknown) complexity
○ Not an end in itself
● Delivery - Facilitation of incremental change
○ Multiple levels - Mindsets, technology, processes
Closing the loop is a reality check
An imperfect search process
● Imperfect questions, data, and process
○ Complexity discovered over time
○ Iterative refinement
● Laborious, error prone, and always incomplete
○ Data preparation (60-80% of work) is error prone
○ Questions -> Answers -> Questions
● Initial framing is just the beginning
○ Story will reveal itself over time
Design for uncertainty
Successful Analytics Shifts Power
● There are winners and losers
○ Change is always painful
○ Efficiencies have to come from somewhere
● Mostly through power to contradict
○ Upsets conventional wisdom
● Sometimes through new paths forward
Analytics is serious business
Trust is #1 requirement
● Change requires trust in the output (evidence and path forward)
● Gaining trust is hard work
○ Delivered by what you do and how
○ All the time and everything you do
● Integrity required through the entire lifecycle
○ Data
○ Process
○ Interpretation
Design for trust
Math is either correct or not
● Sense-making may be qualitative, but data and transformations are not
○ Every step is a mathematical step
● Correct math is the basis for trust
○ Process is laborious
○ Work should not be trusted by default!
● “Hidden” transformations are risky
○ Excel changes
○ Filtering rules
Mathematical indiscipline will be punished
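
One way to keep filtering rules from becoming "hidden" transformations is to name each rule and log what it did. The sketch below is illustrative only: `AuditedPipeline`, the rule names, and the sample rows are hypothetical and not from the talk.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict  # hypothetical row format: one dict per record


@dataclass
class AuditedPipeline:
    """Applies named filtering rules and records the row counts around each one."""
    log: list = field(default_factory=list)

    def apply_filter(self, name: str, rows: list, keep: Callable[[Row], bool]) -> list:
        kept = [r for r in rows if keep(r)]
        # Record the rule so a reviewer can see every transformation that ran.
        self.log.append({"rule": name, "rows_in": len(rows), "rows_out": len(kept)})
        return kept


# Made-up sample data, for illustration only.
rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": -5.0},
    {"id": 3, "amount": None},
    {"id": 4, "amount": 40.0},
]

pipeline = AuditedPipeline()
rows = pipeline.apply_filter("drop_missing_amount", rows, lambda r: r["amount"] is not None)
rows = pipeline.apply_filter("drop_negative_amount", rows, lambda r: r["amount"] >= 0)

for entry in pipeline.log:
    print(entry)
```

The log gives a reviewer a row-count trail to check, so the work does not have to be trusted by default.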
Efficiency is #2 requirement
● Data science getting out of the lab environment
● Decision makers have realized that they could be wrong (often?)
○ Need to be contradicted only once - happening frequently
○ Now they are asking for input in all areas
● Sea change in last 4 years
● Growing combinations - #decisions x #scope x #frequency x #depth
○ Growing much faster than people & process can cope
Process efficiency is essential to scaling
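
A rough arithmetic sketch of why the product #decisions x #scope x #frequency x #depth outruns people and process. The function name and all numbers are made up for illustration; the slide gives no figures.

```python
from math import prod


def analysis_load(decisions: int, scope: int, frequency: int, depth: int) -> int:
    """Number of combinations, read as a simple product of the four factors."""
    return prod((decisions, scope, frequency, depth))


# Hypothetical values: every factor merely doubles between the two snapshots.
before = analysis_load(decisions=10, scope=2, frequency=4, depth=2)
after = analysis_load(decisions=20, scope=4, frequency=8, depth=4)

print(before, after, after / before)
```

Under these assumed numbers, doubling each of the four factors multiplies the load by 16, while a team that doubles in size only doubles its capacity.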
Team Character determines Quality
● Fundamentally about collaborative reasoning under uncertainty
○ Need a creative group of people
● Balanced skill along multiple dimensions
○ Domain (technology, business, individual)
○ Approach (model, experiment, field work)
○ Engagement (presentation, tech delivery, ops)
● Balanced process
○ Increased curiosity bandwidth will give people mastery, purpose
● Sense of purpose
Look to build a strong team
Surviving the Insight Ladder
● Step 1 Wrangling - Get to facts at summary level
● Step 2 Discovery - Frame initial questions & iterate to get to real questions
● Step 3 Relevance - Meaningful, imprecise answers
● Step 4 Accuracy - Meaningful, precise answers
● Step 5 Robustness - Meaningful, precise, robust answers
Continuously increase curiosity bandwidth
[Figure annotation: Time spent here = Curiosity bandwidth]
Business
Has to be a shared organizational experience
● Mistakes are frequent
○ Through the entire lifecycle
● Domain knowledge is discovered
○ More important than math
Make analytics a collective experience
Costs are front-loaded
● Data preparation/wrangling
○ Takes an arbitrary amount of time
○ Time/Effort ~ (#elements)^2
● Errors in model development and operation
● Data version updates
● Changes in narratives
Budgeting and expectation setting should be realistic
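
Read literally, the scaling claim above can be written as a short worked relation. Here $n$ stands for the slide's "#elements" and $c$ is an unspecified constant; both symbols are introduced only for illustration.

```latex
\[
T(n) \approx c\,n^{2}
\quad\Longrightarrow\quad
\frac{T(2n)}{T(n)} \approx \frac{c\,(2n)^{2}}{c\,n^{2}} = 4
\]
```

Under this reading, doubling the number of elements roughly quadruples the preparation effort, which is one reason budgets set by linear extrapolation tend to fall short.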
Empathetic delivery
● Analytics has collateral damage
○ People get fired, budgets are cut, new responsibilities get added
● Empathetic positioning and language
○ Understand that everybody wants to do their job well
○ People are not dumb
● Incremental actionables
○ Show the way forward in bite-sized chunks
Plan the delivery carefully
Analytics work is risky
● Over-hyped context
○ Bigger, better examples everywhere - real or imagined
● Burden of expectations/magic from customer
● Things go wrong
○ Underwhelming/no results, methodological issues, wrong data
● Crisis as a teaching moment
○ Culture of learning, understanding and continuous refinement
Enable team to take risks and have honest conversations
Individual
Don't be Pygmalion
● Don't fall in love with data
○ It is imperfect like everything else
● Even simple data is too rich
○ You see what you want to see
● Be deeply skeptical
● Explore without judgment, detached
Develop non-judgmental curiosity
Extra
Decision-maker Questions
1. Where did the numbers come from? (Correctness, Lineage)
a. Assumptions, models, datasets
2. Is this an accident? Does it hold now? (Reproducibility, Retargetability)
a. Model, dataset, and question revisions
3. Can you get the results faster? (Efficiency)
a. Time, effort, cost
4. Can you also analyze X? (Extensibility)
a. Different dataset, question
5. Could we try X? (Dataset generation - synthetic and real)
a. What-if scenarios, field experiments
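
Questions 1 and 2 (lineage and reproducibility) can be supported by storing a small lineage record next to every result. The sketch below is one possible shape, not the speaker's method: `fingerprint`, `run_analysis`, the record fields, and the sample data are all hypothetical.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 hash of a JSON-serializable object (keys sorted)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_analysis(dataset: list, assumptions: dict) -> dict:
    """Toy analysis (a mean) that returns the result together with its lineage."""
    total = sum(row["amount"] for row in dataset)
    return {
        "result": {"mean_amount": total / len(dataset)},
        "lineage": {
            "dataset_sha256": fingerprint(dataset),  # which data produced the number
            "assumptions": assumptions,              # what was assumed
            "model": "arithmetic mean",              # how it was computed
        },
    }


# Made-up data, for illustration only.
dataset = [{"id": 1, "amount": 120.0}, {"id": 2, "amount": 40.0}]
first = run_analysis(dataset, {"currency": "USD"})
second = run_analysis(dataset, {"currency": "USD"})

# A revised dataset yields a different fingerprint, so stale results are detectable.
revised = run_analysis(dataset + [{"id": 3, "amount": 10.0}], {"currency": "USD"})

print(first == second)
print(first["lineage"]["dataset_sha256"] != revised["lineage"]["dataset_sha256"])
```

Rerunning on the same inputs reproduces the same record, and any dataset revision changes the fingerprint, which answers "where did the numbers come from?" and "does it hold now?" with something checkable.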
