CAUSAL INFERENCE
for FUN and PROFIT
@mcfunley Data Natives Berlin 2018
DN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
xkcd.com/208

please give him all your money
Trying science:
Previously, at
datadriven.club
iterative.club
experimentcalculator.com
xkcd.com/208

please give him all your money
Trying science:
Kind of complicated, tbh

I have now done this twice
Experiments?
[Diagram: a user is assigned to one of variants A, B, or C by ✨ MAGIC ✨]
Sources of chaos:
1. It is tricky to do any of this correctly, turns out

2. You are introducing the possibility that you will
unlaunch things
The idea had been spoken. And
the words wouldn’t go back after
they’d been uttered aloud.
Things to worry about
Correctness
Effectiveness
Adoption
Use a vendor
Confidence, if constrained
and directed tastefully, is not
a bad thing.
Things that are hard:
(non-exhaustive)
Counting things
(that arrive quickly or in large numbers or both)
Bucketing
The Stats Model
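
Bucketing, from the list above, is the piece that fits in a few lines. A minimal sketch, assuming a hash-based scheme; the slides do not say how bucket() is implemented, and the function signature, experiment name, and variant labels here are all hypothetical:

```python
import hashlib


def bucket(user_id: str, experiment: str, variants=("A", "B", "C")) -> str:
    """Deterministically map a user to one variant of one experiment.

    Hypothetical helper: hashing the experiment name together with the
    user id keeps a user's assignment stable across requests while making
    assignments in different experiments independent of each other.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Read the first 8 bytes as an integer and reduce it to a variant index.
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]


# The same user always lands in the same variant of a given experiment.
print(bucket("user-42", "new-checkout"))
```

Even with this much in place, the hard parts remain: deciding which identifier counts as "the user", and making sure every code path asks the same question in the same order.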
IMPORTANT CAVEAT
javascript testing is bad
IMPORTANT CAVEAT
vendors are independent
humans with motives
cool-testing-platform.io
Advanced settings you probably don’t need to worry about lol
90%
Significance level
It’s cool trust me
This is fine | Cancel
VENDOR MOTIVES:
align with high
experiment volume
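
Why a loose default plus high volume matters, as arithmetic. This sketch reads the mock-up's 90% as a 10% false positive rate per test and assumes every experiment is independent and has no real effect; both assumptions, and the function name, are illustrative and not from the talk:

```python
def false_positive_risk(n_experiments: int, significance: float = 0.90):
    """Return (expected false positives, P(at least one false positive)).

    Assumes n independent experiments, none of which has a real effect,
    each declared a "win" with probability alpha = 1 - significance.
    """
    alpha = 1.0 - significance
    expected = n_experiments * alpha
    at_least_one = 1.0 - (1.0 - alpha) ** n_experiments
    return expected, at_least_one


expected, at_least_one = false_positive_risk(20)
print(f"expected false wins: {expected:.1f}")
print(f"chance of at least one: {at_least_one:.0%}")
```

Under these assumptions, 20 no-op experiments produce about two spurious wins on average, and the chance of seeing at least one is close to 88%.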
Pavel Sokov
Ease into it
Gopal Vijayaraghavan

One theory on experiment volume
Gopal Vijayaraghavan

Individually small chances of
finding a real effect
The “loads of experiments” theory:
Knowing where to run experiments is a problem.

Knowing where to run experiments is the biggest problem.

Running a lot of experiments is a way to find such a place.

Running a lot of experiments is a good way to find such a place.
if bucket(user) and eligible(user):
    return render('new.tpl')
return render('old.tpl')
derp
The biggest problem is subtle mistakes
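
One way the snippet above can go wrong, sketched under assumptions: if bucket() records the assignment as a side effect, then calling it before eligible() logs ineligible users as treatment even though they are shown old.tpl. The exposure log, the hash scheme, and the eligibility rule below are hypothetical stand-ins, not the talk's code:

```python
import hashlib

# Hypothetical exposure log: user_id -> "treatment" or "control".
assignments = {}


def bucket(user_id: str) -> bool:
    """Return True for treatment; records the assignment as a side effect."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    in_treatment = digest[0] % 2 == 0
    assignments[user_id] = "treatment" if in_treatment else "control"
    return in_treatment


def eligible(user_id: str) -> bool:
    # Hypothetical rule: only every third user can see the new template.
    return int(user_id) % 3 == 0


def render_bucket_first(user_id: str) -> str:
    # Same order as the slide: bucket() runs even for ineligible users.
    if bucket(user_id) and eligible(user_id):
        return "new.tpl"
    return "old.tpl"


def render_eligible_first(user_id: str) -> str:
    # Short-circuiting means ineligible users are never bucketed or logged.
    if eligible(user_id) and bucket(user_id):
        return "new.tpl"
    return "old.tpl"


def diluted_fraction(render, n_users: int = 3000) -> float:
    """Fraction of users logged as treatment who actually saw old.tpl."""
    assignments.clear()
    shown = {str(i): render(str(i)) for i in range(n_users)}
    treated = [u for u, arm in assignments.items() if arm == "treatment"]
    mismatched = [u for u in treated if shown[u] == "old.tpl"]
    return len(mismatched) / len(treated)


print("bucket first:  ", diluted_fraction(render_bucket_first))
print("eligible first:", diluted_fraction(render_eligible_first))
```

With the bucket-first ordering, most of the logged treatment group never saw the treatment, so any real effect is diluted in the analysis. Nothing crashes and nothing looks wrong in review, which is what makes the mistake subtle.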
CODE:
not immediately intuitive
STATS:
oh wow much worse
ODDS OF ANARCHY
ARE UNCORRELATED TO
EXPERIMENT POWER
The “loads of experiments” theory:
Knowing where to run experiments is a problem.

Knowing where to run experiments is the biggest problem.

Running a lot of experiments is a way to find such a place.

Running a lot of experiments is a good way to find such a place.
[Flow diagram: Idea → Build it → experiment? → ship it, annotated "expensive" and "error prone"]
[Flow diagram, revised: Idea → arithmetic? → Build it → experiment? → ship it, with the added "arithmetic?" step marked "cheap"]
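
The kind of arithmetic that can be done before building anything: a rough sample-size estimate. This is a sketch using the standard normal-approximation formula for comparing two proportions, not necessarily what experimentcalculator.com computes; the baseline rate, lift, and traffic figures are made up for illustration:

```python
import math
from statistics import NormalDist


def users_per_arm(baseline: float, relative_lift: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed in each arm to detect the given lift.

    baseline: current conversion rate (e.g. 0.02 for 2%).
    relative_lift: hoped-for relative improvement (e.g. 0.05 for +5%).
    Uses a two-sided test at level alpha with the requested power.
    """
    p1 = baseline
    p2 = baseline * (1.0 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(baseline: float, relative_lift: float,
                daily_users: int, arms: int = 2) -> int:
    """Days of traffic needed if every daily user enters the experiment."""
    total = arms * users_per_arm(baseline, relative_lift)
    return math.ceil(total / daily_users)


# Hypothetical inputs: 2% baseline conversion, hoping for a 5% relative lift,
# 10,000 eligible users per day.
print(users_per_arm(0.02, 0.05))
print(days_to_run(0.02, 0.05, daily_users=10_000))
```

With these made-up inputs the answer is a bit over 300,000 users per arm, or about two months of traffic. If the page in question cannot supply that, the idea can be dropped or reshaped before any code is written.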
Gopal Vijayaraghavan

Ok, once you know what you’re doing
Changing process
[Flow diagram: Idea → arithmetic? → Build it → experiment? → ship it, annotated "implications"]
class CogWidget {
    // stuff
}
$w = new CogWidget();

class CogWidget {
    // stuff
}
class CogWidget2 {
    // stuff
    // new stuff
}
$w = (bucket($user) == 'control') ?
    new CogWidget() :
    new CogWidget2();
“The problem with code re-use is that it gets in the way of changing your mind later on.”
tef, https://programmingisterrible.com/post/139222674273/write-code-that-is-easy-to-delete-not-easy-to
class CogWidget {
    // stuff
}
class CogWidget2 {
    // stuff
    // new stuff
}
$w = new CogWidget();
[Flow diagram: Idea → arithmetic? → Build it → experiment? → ship it, with a "Cleanup" step added]
[Calendar graphics (S M T W R F S): a full month, days 1 to 31, labelled "meetings"; then a two-week span, days 1 to 14, shown without and then with the "meetings" label]
if bucket(user):
    return render('new.tpl')
return render('old.tpl')

if has_no_edge_cases(user):
    if bucket(user):
        return render('new.tpl')
return render('old.tpl')
IT’S EASY
just change basically everything
about development and add an
entirely new technical discipline
OK NOT REALLY
maybe start by building a single
team with expertise
https://github.com/mcfunley/optimizely-changes
Concluding
Midway upon the journey of our life
I found myself within a forest dark
For the straightforward pathway had been lost
Correctness
YES: A cross-disciplinary base of expertise for executing the full lifecycle of product experiments that can be built upon moving forward.
NO: A declarative YAML framework feeding realtime metrics into a Kafka cluster for self-adjusting multi-arm bandit testing.
Effectiveness
Adoption
“Yes, it’s true that a team at Google couldn’t decide between two blues, so they’re testing 41 shades between each blue to see which one performs better…”
Douglas Bowman, http://stopdesign.com/archive/2009/03/20/goodbye-google.html
CREDIBILITY
Comes from success as well as
avoiding egregious wastes of time
Humility
@mcfunley
mcfunley@gmail.com
mcfunley.com
