Berkeley DS Webinar
June 1, 2016
COMPANY CONFIDENTIAL2
How business gets involved in the modeling
process (and the challenges involved)
• CPG (consumer packaged goods)
• One of the first things I learned in the DS biz is that the biz problem is not far from the DS:
biz wants to be involved at all stages
– They want to pose the problem
– Give perspective on solutions
– Review what DS is finding
– Refine the process and make suggestions
– Understand and critique the results
– Porous layer between biz and DS teams
• Can be a very positive thing: ideas on what should be included, validating whether the results are
meaningful, biz context needed to build good models
• Downside: biz will often lead you down paths that are not productive or defensible + anecdotes!
• Having biz involved forces you to build models that are explanatory and not just predictive, which
means they are meaningful
• If you focus only on prediction, this will lead to overfitting
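The overfitting point can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not material from the talk: the synthetic data (a noisy straight line), the helper names, and the choice of a 1-nearest-neighbor "memorizer" as the prediction-only model. The memorizer reproduces the training data perfectly, so its training error says nothing about how it behaves on held-out data, while the simple line stays explainable (one slope, one intercept).

```python
import random


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form, one feature)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def fit_memorizer(xs, ys):
    """1-nearest-neighbor: return the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Hypothetical data: y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


train_x, train_y = make_data(50)
test_x, test_y = make_data(200)

line = fit_line(train_x, train_y)
memo = fit_memorizer(train_x, train_y)

memo_train_mse = mse(memo, train_x, train_y)
memo_test_mse = mse(memo, test_x, test_y)
line_train_mse = mse(line, train_x, train_y)
line_test_mse = mse(line, test_x, test_y)

print("memorizer train/test MSE:", memo_train_mse, memo_test_mse)
print("line      train/test MSE:", line_train_mse, line_test_mse)
```

Comparing the train and held-out errors of the two models side by side is the check that matters; a model judged only on how well it fits the data it has already seen will look better than it is.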
It’s all about the data!
• Morgan Stanley: we sell AA, but many people do basic stuff with data
• Means that you don’t spend that much time doing algo stuff; it is mostly about
feature generation and data prep
• In SV w/ internet companies, the data science approach is to throw all the data at an
algorithm
• If you can be more intelligent with feature gen, you will get better
performance
• Nevertheless, the more data you can get, the better
• So acquisition of data is very important and part of the process (overlooked)
• Traditional world: what data to use, which transforms VERSUS throwing
data in an algorithm and hoping for the best
– This is overlooked
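A small sketch of why feature generation can matter more than the algorithm. The numbers, the column names (`spend`, `visits`, `loyalty_score`) and the relationship between them are all made up for illustration; none of it comes from the talk. The same one-feature linear fit is run twice: once on a raw column, once on an engineered ratio of two raw columns.

```python
def r_squared(xs, ys):
    """R^2 of a one-feature least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical raw columns for six customers.
spend = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
visits = [10.0, 5.0, 30.0, 8.0, 50.0, 12.0]

# Engineered feature: average spend per visit.
spend_per_visit = [s / v for s, v in zip(spend, visits)]

# Hypothetical target, assumed here to be driven by spend per visit.
loyalty_score = [5.0 + 3.0 * r for r in spend_per_visit]

r2_raw = r_squared(spend, loyalty_score)
r2_engineered = r_squared(spend_per_visit, loyalty_score)

print("R^2 using raw spend:          ", round(r2_raw, 3))
print("R^2 using spend per visit:    ", round(r2_engineered, 3))
```

The algorithm is identical in both runs; only the input changes. In this toy setup the raw column explains little of the target, while the engineered ratio explains essentially all of it, because the ratio is what the target was built from.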
It’s not about the algorithm!
• eviCore example
• In a very short period of time, just using the straightforward approach, we
found a way to save tens of millions of dollars
• By contrast, a company like VMware is obsessed with applying
advanced algorithms to small amounts of data, not rich data, and is not
making an impact on the biz
• More important than the algo is finding an important biz problem
and getting to a solution in a meaningful time period
• Also more important is operationalizing the analytics result
• You can have a perfect model, but if it is not in production it is just an insight that can die
on the vine
• A simple model can give you lift in customer acquisition and an impact on
fraud that’s immediate
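For readers unfamiliar with the term, "lift" can be computed as the response rate among the customers a model ranks highest, divided by the overall response rate. The sketch below assumes that definition; the scores, outcomes and the function name `lift_at` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def lift_at(scores, outcomes, fraction):
    """Lift in the top `fraction` of customers ranked by model score.

    Lift = (conversion rate in the top slice) / (overall conversion rate).
    """
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    base_rate = sum(outcomes) / len(outcomes)
    return top_rate / base_rate


# Hypothetical model scores and actual conversions (1 = converted).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
converted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top_20 = lift_at(scores, converted, 0.2)
print("lift in top 20%:", round(lift_top_20, 2))
```

Here 3 of 10 customers convert overall, and both customers in the top 20% convert, so targeting the top slice reaches converters at a bit more than three times the base rate. A number like this is easy to explain to the business and to act on, which is the point of the slide.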
How to become a data scientist!
• Personal experience and what you see during hiring
• Recruiting stuff
• Plug for Alpine!
• Internships are the most important! Then courses and
such
• All about connections
• Meetups
