Agile development of data
science projects | Part 1
Anubhav Dhiman | July 18, 2018 | Berlin
What is data science?
Data science focuses on predicting something,
prescribing something, or in some cases explaining
something, making it distinct from Business Intelligence
(BI), which focuses on backward-looking factual
reporting (describing something that happened).
It is also distinct from big data storage and processing
technologies like Hadoop and Spark. These tools are
valuable inputs into the quantitative research process
but are insufficient to realise the full potential of data
science.
Successful organizations coordinate all three areas
(data science, BI, and big data) to achieve maximum
value.
Broadly, data science encompasses
quantitative research, advanced analytics,
predictive modelling and machine learning.
How reliably and
sustainably can
a data science team
deliver value for
organizations?
Source: Domino Data Lab
Data Science Readiness Levels
Delivery
9. System proven in operational environment
8. System complete and qualified
7. Prototype demonstrated in operational environment
6. Algorithm integrated in development
5. Algorithm validated against production data
Discovery
4. Algorithm validated against sample data
3. Experimental proof of concept
2. Data explored and described
1. Algorithm design and development
Source: Emily Gorcenski
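The readiness ladder above can be written down as a small data structure. This is a minimal sketch only: the names `ReadinessLevel` and `phase` are hypothetical and do not come from the slides, and the Discovery/Delivery split (levels 1-4 versus 5-9) is read directly off the list.

```python
from enum import IntEnum


class ReadinessLevel(IntEnum):
    """The nine readiness levels from the slide, lowest to highest.

    Member names are shortened for illustration (an assumption of this sketch).
    """
    ALGORITHM_DESIGN_AND_DEVELOPMENT = 1
    DATA_EXPLORED_AND_DESCRIBED = 2
    EXPERIMENTAL_PROOF_OF_CONCEPT = 3
    VALIDATED_AGAINST_SAMPLE_DATA = 4
    VALIDATED_AGAINST_PRODUCTION_DATA = 5
    INTEGRATED_IN_DEVELOPMENT = 6
    PROTOTYPE_DEMONSTRATED_IN_OPERATION = 7
    SYSTEM_COMPLETE_AND_QUALIFIED = 8
    SYSTEM_PROVEN_IN_OPERATION = 9


def phase(level: ReadinessLevel) -> str:
    """Return the phase a level belongs to: 1-4 Discovery, 5-9 Delivery."""
    if level <= ReadinessLevel.VALIDATED_AGAINST_SAMPLE_DATA:
        return "Discovery"
    return "Delivery"


if __name__ == "__main__":
    # Print the ladder top-down, the same way the slide shows it.
    for level in sorted(ReadinessLevel, reverse=True):
        print(f"{level.value}. {level.name} ({phase(level)})")
```

Using an ordered enum keeps comparisons such as "has this project passed sample-data validation?" to a single `>=` check.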
1. Can we solve the problem as stated?
Data Scientists, Data Engineers
2. What does an MVP look like?
Data Scientists, Data Engineers
+Designers, Product Managers
3. How do we build the MVP?
Data Scientists, Data Engineers
+Designers, Product Managers
+Infra, Backend, Frontend
4. How do we ship the MVP?
Data Scientists, Data Engineers
+Designers, Product Managers
+Infra, Backend, Frontend
+QA, Legal
5. How do we improve the MVP?
Data Scientists, Data Engineers
+Designers, Product Managers
+Infra, Backend, Frontend
+QA, Legal
+CR, Analytics
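The way the team grows with each question can be sketched as a lookup. The questions and roles are taken from the slides. The stage numbering (1 to 5, in slide order) and the names `STAGES` and `team_at` are assumptions made for illustration.

```python
# Each entry: (question asked at that stage, roles that join at that stage).
# Stage order follows the slide order; this numbering is an assumption.
STAGES = [
    ("Can we solve the problem as stated?", ["Data Scientists", "Data Engineers"]),
    ("What does an MVP look like?", ["Designers", "Product Managers"]),
    ("How do we build the MVP?", ["Infra", "Backend", "Frontend"]),
    ("How do we ship the MVP?", ["QA", "Legal"]),
    ("How do we improve the MVP?", ["CR", "Analytics"]),
]


def team_at(stage_number: int) -> list[str]:
    """Return every role involved up to and including the given stage (1-based)."""
    if not 1 <= stage_number <= len(STAGES):
        raise ValueError(f"stage_number must be between 1 and {len(STAGES)}")
    team: list[str] = []
    for _question, joiners in STAGES[:stage_number]:
        team.extend(joiners)  # roles accumulate; nobody leaves
    return team


if __name__ == "__main__":
    for number, (question, _joiners) in enumerate(STAGES, start=1):
        print(f"{number}. {question} -> {', '.join(team_at(number))}")
```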
How to make
collaboration
easier across
the organization?
Source: Louis Dorard
From:
1. background to specifics
2. domain integration to predictive engine
Source: Louis Dorard
Up Next … Part 2
- Data Science Lifecycle
- Developing and Deploying
AI solutions

