Data-Driven?
Data Science?
Lecture (part 1)	

@AaltoBIZ, Feb 2, 2015	

by Johan Himberg	

Data Scientist, 	

@ReaktorNow
Data contains information
• Data may contain information
• Information gives the capacity for making
beneficial decisions
• compare with “energy gives the capacity for doing
work” or “information is the currency of decision
making”
• …yes, this is not the formal definition of “information”
Lecture @AaltoBIZ, Johan Himberg, 2015
Data
• You can’t work with “only (big) data”
• You need prior assumptions and models to
gain information from data.
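• Think of the famous example of a spurious correlation:
• ice cream sales
• drownings, and
• temperature (both rise in warm weather, so they correlate without one causing the other)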
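The ice cream example can be made concrete with a small simulation. The sketch below is only an illustration under assumed numbers: the linear relations, noise levels and sample size are invented, and `pearson` and `residuals` are helper names introduced here. It generates ice cream sales and drownings that both depend on temperature but not on each other, then compares the raw correlation with the correlation left after the temperature effect is removed.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(0)
temperature = [rng.uniform(5, 30) for _ in range(2000)]
# Assumed toy model: both quantities depend on temperature only.
ice_cream = [10 * t + rng.gauss(0, 40) for t in temperature]
drownings = [0.2 * t + rng.gauss(0, 1.5) for t in temperature]

raw = pearson(ice_cream, drownings)
partial = pearson(
    residuals(ice_cream, temperature), residuals(drownings, temperature)
)
print(f"raw correlation: {raw:.2f}")
print(f"correlation after controlling for temperature: {partial:.2f}")
```

The raw correlation is clearly positive, while the second number stays close to zero. The data alone does not tell which variable to control for; that comes from the prior model.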
Data-Driven: Probability & Empiricism
“Data driven means that progress in an activity is compelled by data
rather than by intuition or personal experience. It is often labeled as the
business jargon for what scientists call evidence based decision
making” 

Wikipedia 2015-02-02

“I take risks, sometimes patients die. But not taking risks causes more
patients to die, so I guess my biggest problem is I've been cursed with
the ability to do the math.” 

fictional character Dr. House in Fox television series “House”
Data-Driven
• Being “data-driven” is an old concept … and very trendy
• “Digitalisation”
• “Big Data”
• Data Science
• Business acumen [“what for”]
• Operations Research [“optimal decisions”]
• Probability theory [“how to handle uncertainties”]
• Analytics [“insight”]
• Computer Science [“how to implement all that”]
Ideals of being Data-Driven
• be curious (seek evidence)
• be active (test, don’t just observe and analyse)
• be Bayesian (understand uncertainties)
• be courageous (act on the evidence)
• be agile (learn, fail fast… but not too fast: collect enough evidence)
• be transparent and helpful (show and share information, co-operate)
• be truthful and non-political (don’t abuse data, work across silos)
• be wise (there is a time to be data-driven and a time to be intuitive)
Culture eats strategy for breakfast

attributed to P. Drucker, popularised by M. Fields
Why
Why?
• Data business
• e.g. Google, Facebook
• Operational and Strategic aspects
• case Ford
Data business
• Sell
• audiences (Google, Facebook, media, …)
• information (credit rating, car register,…)
Case Ford
• Ford closed 2006 with a $12.6 billion loss, the largest in the company’s
history. Alan Mulally became CEO in 2006.
• “top-down data-driven culture, innovative data science techniques”
• profitable again in 3 years
• INFORMS Prize Winner, 2013, Best Company of the Year in Analytics,
Operations Research: “Analytical tools and the operations research team
supported many decisions in this period, and a number of critical applications
were developed:
• a dealer vehicle recommendation system
• a detailed econometric model enabling the study of what-if analyses of
inventory, production, pricing, and sales
• a strategic sourcing model to restructure the Ford auto interiors division”
Brynjolfsson et al. (2011) on Data-Driven
• survey data on the business practices and IT investments
of 179 large, publicly traded companies
• Firms that emphasise “data driven decision making”
• have output and productivity that is 5-6% higher than
what would be expected given other investments and IT
usage.
• relationship also appears in asset utilisation, return on
equity and market value
• statistical analysis suggests that this does not appear to be
due to reverse causality (!)
Operations
• Favour beneficial events: target the operations and marketing, cross-sell, up-sell, …
• Avoid non-beneficial events: churn, people leaving, waste, credit
loss, fraud, system failures
• Optimize: work force, schedules, prices, stocks, relevancy, production
quality
• Rationalise: process efficiency, lead times, handle complexity, search
time …
• Understand: master data, transactions, processes
• internally: ERP, CRM, HR, sales systems, production, …
• externally: location, routes, weather, demographics, estates, …
Strategic
• Efficiency and competition
• react faster, streamlined decision making, risk awareness
• more financial efficiency
• not giving the information advantage to competitors (cf. efficient markets, warfare)
• innovations
• Well-informed strategic decisions
• understanding and predicting customer groups, behaviour, and experience; product and service
development
• understanding and predicting world events, economics, demographics, …: react to market fluctuations,
changes in financial environment
• Brand aspects
• transparency, objectivity, personalisation as a part of company culture and brand
From business
problem to
predictive /
prescriptive model
CRISP-DM and Plan-Do-Check-Act
• CRISP-DM (EU consortium ~1996-2008) 

• Compare to PDCA popularised by W. E. Deming

• The quote “In God we trust; all others must bring data.”
is attributed to Deming

• Comments
• Think first how to deploy, not last

• Don’t plan too big things up-front (the processes & data &
test results might ruin your plan)

• Keep a backlog and communicate, but don’t get stuck in
“understanding” and “insights” if they are not the main task

• You can’t make final evaluation on data before “deployment”
(remember: be empirical)

• You should deploy several tests in the field before “final”
evaluation

• Do not silo yourselves according to the boxes in the cycle!

• Learn continuously. Be truthful and curious.
[Figure: the Plan-Do-Check-Act (PDCA) cycle]
Think & plan from deployment to data
• Business drivers: challenge 1, challenge 2, challenge 3, challenge 4, challenge 5. Pick a challenge!
• Action: optimize, decide, deploy. For example
• automatised decisions; recommendation, targeting
• simulation
• prescriptive, predictive modelling
• Information: report, visualize, model. For example
• documentation on meaning of the data
• KPIs, profiles, segments, factors, DW dashboards
• descriptive, diagnostic, predictive modelling
• Data: big, small, open, local, web, meta, … For example
• source integrations
• Extract - Load - Transform
• metadata
• modelling for cleansing & consistency
• Along the information path: modelling (what are the actions, what are the insights), wrangling (what data means), testing (what is the impact)
Small note: This is not a guideline for IT or enterprise architecture, which is an important question, too.
Architects may benefit from the observations collected during data-driven work, though.
Data-Driven is inherently iterative and benefits from agility
• Business drivers: challenge 1, challenge 2 (start from here!), challenge 3, challenge 4, challenge 5
• Action, for example
• Business: we need to optimise for customer retention
• Marketing: we could start with a special offer by SMS
• Data Scientist: we’ll set up test & control groups!
• Information, for example
• Marketing: some past campaign results & execution…
• Solution expert: field ZPOR means revenue per unit and it is calculated based on …
• DBA: source X in the DW is aggregated on a monthly level
• Data Scientist: let’s get historical data on X and validate the model
• Data, for example
• DBA: we have X for 1M users for 1 yr, fields a, b, c
• Data Scientist: field c seems suspicious, we’ll try to correct it
• Along the information path: modelling (what are the actions, what are the insights), wrangling (what data means), testing (what is the impact)
Data and processes are often not as assumed.
Be curious, keep a backlog, inspect, adapt.
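Setting up test and control groups, as the data scientist proposes in the example above, can be sketched in a few lines. This is one possible approach, not necessarily the one used in the lecture: the function name `assign_group`, the campaign label and the customer ids are all hypothetical. Each customer id is hashed together with the campaign name, so the assignment is effectively random across customers but reproducible later, when the responses are joined back to the groups.

```python
import hashlib


def assign_group(customer_id: str, campaign: str, test_share: float = 0.5) -> str:
    """Deterministically assign a customer to 'test' or 'control'."""
    digest = hashlib.sha256(f"{campaign}:{customer_id}".encode("utf-8")).hexdigest()
    # The first 8 hex digits form a 32-bit integer; scale it to [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_share else "control"


# Hypothetical customer ids and campaign name, for illustration only.
customers = [f"customer-{i}" for i in range(10_000)]
groups = {c: assign_group(c, "sms-retention-offer") for c in customers}
n_test = sum(g == "test" for g in groups.values())
print(f"test: {n_test}, control: {len(customers) - n_test}")
```

Including the campaign name in the hash means a customer's group in one campaign says nothing about their group in the next, which keeps repeated tests from always hitting the same people.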
Execute based on model, collect data results
• Business drivers: challenge 1, challenge 2, challenge 3, challenge 4, challenge 5
• Action, for example
• deploy campaign, collect responses
• Information, for example
• calibrate & apply model
• Data, for example
• get data for modeling
• store results
• Along the information path: modelling (what are the actions, what are the insights), wrangling (what data means), testing (what is the impact)
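Once the campaign responses are collected, the question “what is the impact” can be answered with the uncertainty attached, in the spirit of “be Bayesian”. The sketch below is a minimal illustration, not the lecture's method: the response counts are made up, `prob_test_beats_control` is a name introduced here, and a uniform Beta(1, 1) prior on each response rate is an assumption. It estimates by simulation the probability that the test group's true rate is higher than the control group's.

```python
import random


def prob_test_beats_control(test_success, test_n, control_success, control_n,
                            draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_test > rate_control) under Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior under a uniform prior: Beta(1 + successes, 1 + failures).
        p_test = rng.betavariate(1 + test_success, 1 + test_n - test_success)
        p_control = rng.betavariate(1 + control_success,
                                    1 + control_n - control_success)
        wins += p_test > p_control
    return wins / draws


# Made-up campaign result: 460 of 5000 responded in test, 400 of 5000 in control.
p = prob_test_beats_control(460, 5000, 400, 5000)
print(f"P(test rate > control rate) is about {p:.2f}")
```

A probability near 1 supports acting on the result; a value near 0.5 says the evidence collected so far does not separate the groups, which is the “fail fast, but not too fast” point.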
Information path focused backlog
• Business drivers: challenge 1, challenge 2, challenge 3, challenge 4, challenge 5
• Action, backlog example
• test & control group handling in marketing automation
• involve N.N. in the process
• Information, backlog example
• define new information source
• look for a new data source for determining income in zip code areas
• correct documentation
• automatization for the campaign modelling
• Data, backlog example
• better system configuration & architecture
• automatization for the campaign process…
• new data: record information on all campaigns
• Along the information path: modelling (what are the actions, what are the insights), wrangling (what data means), testing (what is the impact)
Aim - Explore - Exploit
• Röntgen and Fleming (Nobel laureates)
• their great findings were “accidental”, but
• they were skilled scientists doing disciplined research for some
other aim
• Aim - explore - exploit
• Always aim at something specific but be open-minded and
curious; insights come along with the process.
• Explore occasionally “from data to insights”. But don’t overdo
exploration.
• If you find something interesting, make a disciplined test and
exploit the finding.
Tech
Technology?
• Variation is big: a combination of
• business
• data and information
• decision type
• deployment (actions)
• Things evolve rapidly
Technology?
• Prefer systems
• that give mass-access to historical, transactional data on
individual level instead of just aggregates (avoid being blinded
by averages)
• from which you’ll get the data, transformations, and results out
to another system (avoid being “data hostage”)
• where you see what the analytics actually does, at least at the
module level (avoid being “method hostage”). Prefer being able
to see the actual implementation (open source).
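Why individual-level data matters can be shown with a toy example. The numbers below are invented for illustration: four hypothetical customers whose spend is recorded in two consecutive months. The average is identical in both months, yet half of the customers have almost stopped buying.

```python
from statistics import mean

# Hypothetical per-customer spend in two consecutive months.
before = {"a": 100, "b": 100, "c": 100, "d": 100}
after = {"a": 190, "b": 200, "c": 10, "d": 0}

avg_before = mean(before.values())
avg_after = mean(after.values())

# Individual-level view: change per customer, flagging large drops.
changes = {c: after[c] - before[c] for c in before}
at_risk = sorted(c for c, delta in changes.items() if delta <= -50)

print(f"average before: {avg_before}, average after: {avg_after}")
print(f"customers with a large drop: {at_risk}")
```

A system that only exposes the monthly average would report that nothing happened; the churn signal is visible only in the transaction-level data.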
Technology?
• We have used
• R, ggplot2, Shiny, …
• Apache Spark
• Python
• cloud based solutions
• proprietary products, if required for a specific task (don’t reinvent all wheels)
• dedicated hardware, if critical or confidential…
• How should the process of data transformation be documented? That’s a question!
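One lightweight way to approach the documentation question is to let the transformation code produce its own record. The sketch below is an assumption about what that could look like, not a tool named in the lecture: `run_pipeline` and the two step functions are hypothetical, and the sample rows are invented. Each step is a plain function with a docstring, and the runner logs the step name, its description and the row counts before and after.

```python
def drop_missing_revenue(rows):
    """Remove rows where revenue is missing."""
    return [r for r in rows if r.get("revenue") is not None]


def add_revenue_per_unit(rows):
    """Add revenue_per_unit = revenue / units."""
    return [{**r, "revenue_per_unit": r["revenue"] / r["units"]} for r in rows]


def run_pipeline(rows, steps):
    """Apply the steps in order and return (result, log of what each step did)."""
    log = []
    for step in steps:
        rows_in = len(rows)
        rows = step(rows)
        log.append({
            "step": step.__name__,
            "doc": step.__doc__,
            "rows_in": rows_in,
            "rows_out": len(rows),
        })
    return rows, log


# Invented sample data for illustration.
raw_rows = [
    {"revenue": 100.0, "units": 4},
    {"revenue": None, "units": 2},
    {"revenue": 30.0, "units": 3},
]
clean, log = run_pipeline(raw_rows, [drop_missing_revenue, add_revenue_per_unit])
for entry in log:
    print(entry)
```

The log can be stored next to the results, so anyone along the information path can see which rows were dropped and how a derived field was calculated.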
Culture
Organisation
Data Science is cross-functional
• Data scientists’ main tasks are in methods, but also in
processes and machinery of
• making evidence based decisions (automated if
possible)
• finding out confidence on the outcome (by active tests if
possible)
• getting insights based on models and data
• Data science / data scientists also act as “glue”
Cross-functional teams
• Doing data-driven work and data science in any
organisation model boils down to
	 	 “Involve everyone along the information path”
	 [that was the red, crooked line in one of the previous slides]
Team & skills
• A change of culture; information is everybody’s business
• Business / Marketing / Finance specialists
• Project / Process / Solution owners
• Research
• Data Stewards / DB administration
• Developers
• Visualization / UX experts
• …
• One data scientist can’t excel at all of that but should be knowledgeable enough to work with
everyone along the information path
Data Science skills
• There is no “one” definition for Data Science or the skills
• Data science is a combination of business acumen, statistics, data
mining, DBs, big data, machine learning, computer science, etc. See for
example:
• http://www.oralytics.com/2012/06/data-science-is-multidisciplinary.html
• http://www.oralytics.com/2013/03/type-i-and-type-ii-data-scientists.html
• You should team up anyway, the more the merrier, to find all relevant
skills. A view on this:
• http://www.accenture.com/SiteCollectionDocuments/PDF/Accenture-Team-Solution-Data-Scientist-Shortage.pdf
Ideals of being Data-Driven
• be curious (seek evidence)
• be active (test, don’t just observe and analyse)
• be Bayesian (understand uncertainties)
• be courageous (act on the evidence)
• be agile (learn, fail fast… but not too fast: collect enough evidence)
• be transparent and helpful (show and share information, co-operate)
• be truthful and non-political (don’t abuse data, work across silos)
• be wise (there is a time to be data-driven and a time to be intuitive)
Culture eats strategy for breakfast

attributed to P. Drucker, popularised by M. Fields
Read more
References & Suggested reading
• Brynjolfsson, Erik and Hitt, Lorin M. and Kim, Heekyung Hellen, Strength in Numbers: How Does
Data-Driven Decisionmaking Affect Firm Performance? (April 22, 2011). Available at SSRN:
http://ssrn.com/abstract=1819486 or http://dx.doi.org/10.2139/ssrn.1819486
• T. Davenport, J. G. Harris, R. Morison. Analytics at Work – Smarter Decisions, Better Results
• I disagree with some cultural issues, but a good overview
• A. Croll, B. Yoskovitz. Lean Analytics – Use Data to Build a Better Startup Faster
• data-driven thinking is crucial for start-ups
• Ford case
• http://blog.revolutionanalytics.com/2014/11/ford-uses-r-for-data-driven-decision-making.html
• http://dataconomy.com/how-ford-uses-data-science-past-present-and-future/
• https://www.informs.org/About-INFORMS/News-Room/Press-Releases/INFORMS-Prize-2013-Ford
• http://adage.com/article/datadriven-marketing/ford-names-chief-data-analytics-officer/296383/


Lecture notes on being Data-Driven and doing Data Science
