Causal Python
PyCon Israel 2021
Dr. Hanan Shteingart
Summary
Data + Assumptions → Causal Inference → Better Decisions
Mission Impossible
A typical causal workshop… PyCon 2021
45 min!
https://github.com/amit-sharma/causal-inference-tutorial
Agenda
MOTIVATION THEORY PRACTICE
Motivation
The need to go beyond predictions
Most Important Business Question?
What ACTIONS should I take to maximize my KPIs?
ACTIONS: business interventions
KPI(S): outcome(s) you care about
Three Layers of Analytics
1. There are three types of analytic questions (descriptive, predictive, prescriptive)
2. What businesses need is better decisions (not better predictions)
3. “There is a gap between making a prediction and making a decision” - S. Athey 2017, Science.
Griffin, D. K. (2020).
Athey, S. (2017).
Bertsimas, D., & Kallus, N. (2020).
Prescriptive is Neglected
• Prescriptive methods seem to be neglected
• What is the effect of doing an action A?
• What is the optimal policy π to maximize the KPI(s)?
https://www.kaggle.com/kaggle-survey-2020
Most $$$ in AI will be in 2 areas!
• Two main target markets:
  • Marketing & Sales
  • Supply-chain management and manufacturing
• What’s common?
  • Increase some KPI – 𝑅
  • by doing some actions – 𝐴
  • in some context – 𝑆
  • under some constraints – 𝐶
• Causal Inference and Reinforcement Learning!
https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/most-of-ais-business-uses-will-be-in-two-areas
Don’t believe me
Don’t believe me 2
"causal inference helps us provide a better user experience for customers on the Uber platform"
"we rely on quasi-experiments and causal inference methods, especially to measure new marketing and advertising ideas."
"we analyze marketing campaigns and the impact of app preloads using a fourth type of observational study format."
"figuring out whether booking an attraction ticket increases long term user engagement"
"Leveraging a market-level approach to measure landing page effectiveness on Airbnb" (difference-in-differences)
Theory
Causal Inference 101
Summary
$P(X)$
$P(Y \mid X)$
$P(Y \mid X, do(T))$
Why Is Causal Inference Different from Supervised Learning?
Causal Inference Main Concepts
Directed Acyclic Causal Graph (DACG) – the truth about what causes what
Causal Discovery
Find the causal dependence given a dataset (otherwise you need an expert)
$T \to Y$
Potential Outcomes
What would have happened if (only) the treatment had been set to $t$
$Y^{t} = Y \mid do(T = t)$
Causal Inference
Find the average treatment effect (ATE):
$\tau_{ATE} = E(Y^{1}) - E(Y^{0})$
CATE/ITE/HTE
What is the effect per unit?
$\tau_{ITE}(X) = E(Y^{1} \mid X) - E(Y^{0} \mid X)$
Policy Evaluation
What is the value of policy $\pi(X) = \Pr(T \mid X)$?
$PE = E_{T \sim \pi(X)}(Y)$
Policy Optimization
What is the best policy?
$PO = \arg\max_{\pi} E_{T \sim \pi(X)}(Y)$
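Policy evaluation and policy optimization can be sketched on simulated data. The example below is only an illustration under stated assumptions: the logs are generated with a uniformly random treatment (so the logging propensity is known to be 0.5), the data-generating process is made up, and `policy_value` and the three candidate policies are hypothetical names, not part of any library.

```python
import random

rng = random.Random(0)

# Simulated logs (hypothetical): X is a binary context, T is assigned by a
# fair coin, and treatment helps when X == 1 but hurts when X == 0.
logs = []
for _ in range(20000):
    x = rng.randint(0, 1)
    t = rng.randint(0, 1)
    effect = 1.0 if x == 1 else -0.5
    y = 1.0 + effect * t + rng.gauss(0, 0.1)
    logs.append((x, t, y))

LOGGING_PROPENSITY = 0.5  # known because T was randomized


def policy_value(policy, logs):
    """Inverse-propensity estimate of E_{T ~ pi(X)}(Y) for a deterministic policy."""
    total = 0.0
    for x, t, y in logs:
        if policy(x) == t:  # keep only logs where the policy agrees with the logged action
            total += y / LOGGING_PROPENSITY
    return total / len(logs)


# Policy evaluation: score each candidate policy.
candidates = {
    "treat_none": lambda x: 0,
    "treat_all": lambda x: 1,
    "treat_if_x1": lambda x: x,
}
values = {name: policy_value(pi, logs) for name, pi in candidates.items()}

# Policy optimization: argmax over the candidate set.
best = max(values, key=values.get)
print(values, best)
```

In this simulation the true policy values are 1.0, 1.25, and 1.5, so the context-dependent policy should come out on top.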
What is the Fundamental Problem?
• Counterfactuals are a missing-data problem
• Play make-believe with potential outcomes
https://www.bradyneal.com/causal-inference-course
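| $i$ | $T$ (treatment) | $Y$ (outcome) | $Y^{1}$ (potential outcome) | $Y^{0}$ (potential outcome) | $\tau = Y^{1} - Y^{0}$ (individual treatment effect) |
|---|---|---|---|---|---|
| 1 | 0 | 0 | ? | 0 | ? |
| 2 | 1 | 1 | 1 | ? | ? |
| 3 | 1 | 0 | 0 | ? | ? |
| 4 | 0 | 1 | ? | 1 | ? |
| 5 | 0 | 1 | ? | 1 | ? |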
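The missing-data view can be reproduced in a few lines. This sketch takes the five observed (treatment, outcome) pairs from the table and fills in only the potential outcome that was actually observed; the variable names are illustrative.

```python
# Observed data from the table: unit -> (treatment T, observed outcome Y).
observed = {1: (0, 0), 2: (1, 1), 3: (1, 0), 4: (0, 1), 5: (0, 1)}

rows = []
for unit, (t, y) in observed.items():
    # The observed outcome reveals only the potential outcome for the
    # treatment actually received; the other one stays missing (None).
    y1 = y if t == 1 else None
    y0 = y if t == 0 else None
    ite = (y1 - y0) if (y1 is not None and y0 is not None) else None
    rows.append((unit, t, y, y1, y0, ite))

# Every individual treatment effect is missing.
missing_ite = sum(1 for row in rows if row[5] is None)
print(missing_ite)
```

No unit has both potential outcomes, so no individual treatment effect can be computed directly from the data.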
Quiz
• Exercise is known to reduce Cholesterol level
• You collected a medical dataset and plotted these variables against
each other
• What can explain this?
Confounders Create Bias in Effect Estimation
• Age is a confounder which affects both the treatment (Exercise) and the outcome (Cholesterol)
• This creates a bias!
https://towardsdatascience.com/implementing-causal-inference-a-key-step-towards-agi-de2cde8ea599
[Figure: two causal diagrams over the nodes X, T, and Y]
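A small simulation can make the bias concrete. The numbers below are invented for illustration (they are not from the talk): age group raises both exercise and cholesterol, while the true effect of exercise on cholesterol is set to -1. The naive slope has the wrong sign; the slope within age groups recovers the true effect.

```python
import random

rng = random.Random(1)


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


# Hypothetical data-generating process: X = age group, T = exercise, Y = cholesterol.
data = []
for _ in range(4000):
    age_group = rng.randint(0, 3)
    exercise = 2.0 * age_group + rng.gauss(0, 1)
    cholesterol = 10.0 * age_group - 1.0 * exercise + rng.gauss(0, 1)
    data.append((age_group, exercise, cholesterol))

# Naive estimate: ignore age entirely.
naive = slope([t for _, t, _ in data], [y for _, _, y in data])

# Adjusted estimate: slope within each age group, then averaged.
within = []
for g in range(4):
    grp = [(t, y) for x, t, y in data if x == g]
    within.append(slope([t for t, _ in grp], [y for _, y in grp]))
adjusted = sum(within) / len(within)

print(naive, adjusted)
```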
Beyond Confounders
Red lines should not be accounted for.
Lederer et al., 2019
Identifiability
• The ability to estimate a causal effect from observed data.
• If the following assumptions hold, then
$E(Y^{a} \mid X = x) = E(Y \mid A = a, X = x)$
$E(Y^{a}) = E_{x}(E(Y^{a} \mid X))$
1. Stable Unit Treatment Value Assumption (SUTVA)
for $i \neq j$: $A_i \perp A_j$ and $Y_i \perp A_j$
2. Consistency
$A = a \Rightarrow Y = Y^{a} \quad \forall a$
3. Ignorability
$Y^{0}, Y^{1} \perp A \mid X$
4. Positivity
$P(A = a \mid X = x) > 0 \quad \forall a, x$
Not to be confused with
the law of total expectation
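Of these assumptions, positivity is the one that can be checked directly on discrete data. A minimal sketch, assuming records of the form (covariate stratum, treatment); `positivity_violations` and the sample records are hypothetical:

```python
from collections import defaultdict


def positivity_violations(records):
    """Return covariate strata in which some treatment value never occurs."""
    seen = defaultdict(set)
    treatments = set()
    for x, a in records:
        seen[x].add(a)
        treatments.add(a)
    return sorted(x for x, arms in seen.items() if arms != treatments)


# Illustrative records: every "old" unit is treated, so P(A = 0 | X = old) = 0.
records = [("young", 0), ("young", 1), ("old", 1), ("old", 1), ("mid", 0), ("mid", 1)]
violations = positivity_violations(records)
print(violations)
```

In a stratum like "old" there are no untreated units to compare with, so the effect there cannot be estimated from the data without extrapolation.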
Quiz: which assumption is this?
What to control for (what is 𝑋)?
When the causal DAG is complicated, do-calculus (Pearl) helps with identification
• Input: DAG + Data
• Output: identification (a recipe for how to estimate the effect)
Sacerdote et al., International Journal of Epidemiology, 2012
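Full do-calculus is beyond a short snippet, but one simple special case is easy to sketch: when all direct causes (parents) of the treatment are observed, adjusting for them satisfies the backdoor criterion. The DAG below and the helper names are hypothetical, chosen to match the exercise and cholesterol example.

```python
# Hypothetical DAG stored as node -> list of parents.
parents = {
    "age": [],
    "exercise": ["age"],
    "cholesterol": ["age", "exercise"],
}


def is_acyclic(parents):
    """Check acyclicity with a depth-first search over parent links."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path: cycle
        state[node] = "visiting"
        ok = all(visit(p) for p in parents.get(node, []))
        state[node] = "done"
        return ok

    return all(visit(n) for n in parents)


def parent_adjustment_set(parents, treatment):
    """Parents of the treatment: a valid backdoor set if all are observed."""
    return sorted(parents[treatment])


assert is_acyclic(parents)
adjustment = parent_adjustment_set(parents, "exercise")
print(adjustment)
```

Real identification tools handle unobserved variables and larger graphs; this only covers the simplest case.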
Estimation Methods
1. Stratification – aggregate over strata
If $E(Y \mid A = a, X = x) = E(Y^{a} \mid X = x)$, then:
$E(Y^{a}) = \sum_{x} P(X = x)\, E(Y \mid A = a, X = x)$
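The stratification formula translates almost directly into code. A minimal sketch on a tiny made-up dataset of (x, a, y) records; `standardized_mean` is a hypothetical helper name:

```python
from collections import defaultdict


def standardized_mean(records, a):
    """Estimate E(Y^a) = sum_x P(X=x) * E(Y | A=a, X=x) from (x, a, y) records."""
    n = len(records)
    by_x = defaultdict(list)
    for x, ai, y in records:
        by_x[x].append((ai, y))
    total = 0.0
    for x, rows in by_x.items():
        ys = [y for ai, y in rows if ai == a]
        if not ys:
            raise ValueError(f"positivity violated in stratum {x!r}")
        # Weight the stratum mean by the stratum's share of the whole sample.
        total += (len(rows) / n) * (sum(ys) / len(ys))
    return total


# Illustrative data: treatment is rare in stratum 0 and common in stratum 1.
records = (
    [(0, 1, 3.0)] + [(0, 0, 1.0)] * 5 +
    [(1, 1, 6.0)] * 3 + [(1, 0, 5.0)]
)
ate = standardized_mean(records, 1) - standardized_mean(records, 0)
print(ate)
```

Here the stratum effects are 2 and 1 with weights 0.6 and 0.4, giving an ATE of 1.6, whereas the unadjusted difference in means would be about 3.58.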
Estimation Methods
2. Matching – find “twins” in high dimension
Propensity Score
• Define $A = 1$ for treatment and $A = 0$ for control. We denote the propensity score for subject $i$ by
$\pi_i = \Pr(A = 1 \mid X_i)$
• The propensity score is a “balancing score”: if we control or match for it, treated and control units have the same covariate distribution, so (under ignorability) the effect estimate is unbiased
$P(X \mid \pi(X) = p, A = 1) = P(X \mid \pi(X) = p, A = 0)$
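The balancing property can be checked by hand on a toy dataset. In this sketch (made-up counts, illustrative names) the covariate values "a" and "b" share the same propensity 0.25, and the distribution of X among treated and control units with that propensity is identical.

```python
from collections import Counter

# Hypothetical records (x, a): covariate value and binary treatment.
records = (
    [("a", 1)] * 2 + [("a", 0)] * 6 +
    [("b", 1)] * 1 + [("b", 0)] * 3 +
    [("c", 1)] * 3 + [("c", 0)] * 1
)

# Empirical propensity score pi(x) = Pr(A = 1 | X = x).
n_x = Counter(x for x, _ in records)
n_treated = Counter(x for x, a in records if a == 1)
propensity = {x: n_treated[x] / n_x[x] for x in n_x}


def covariate_distribution(p, arm):
    """Empirical P(X | pi(X) = p, A = arm)."""
    xs = [x for x, a in records if propensity[x] == p and a == arm]
    counts = Counter(xs)
    return {x: c / len(xs) for x, c in counts.items()}


treated_dist = covariate_distribution(0.25, 1)
control_dist = covariate_distribution(0.25, 0)
print(treated_dist, control_dist)
```

Both distributions put 2/3 on "a" and 1/3 on "b", which is what makes matching on the one-dimensional score sufficient.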
Estimation Methods
3. Propensity Matching – find twins in one dimension
Inverse Propensity Weighting
• $\pi_i = \Pr(A_i = 1 \mid X = x_i)$
• $ATE = E(Y^{1} - Y^{0}) \approx \frac{1}{n} \sum_{i} Y_i \frac{A_i - \pi_i}{\pi_i (1 - \pi_i)}$
It can be shown that IPTW and standardization are equivalent
(Technical Point 2.3, see Appendix)
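The equivalence can be seen numerically when the propensity is estimated by the empirical treatment frequency in each stratum. A sketch on a small made-up dataset of (x, a, y) records; the names are illustrative:

```python
from collections import Counter

records = (
    [(0, 1, 3.0)] + [(0, 0, 1.0)] * 5 +
    [(1, 1, 6.0)] * 3 + [(1, 0, 5.0)]
)
n = len(records)

# Empirical propensity per stratum: pi(x) = Pr(A = 1 | X = x).
n_x = Counter(x for x, _, _ in records)
n_treated = Counter(x for x, a, _ in records if a == 1)
pi = {x: n_treated[x] / n_x[x] for x in n_x}

# IPW estimate: (1/n) * sum_i Y_i * (A_i - pi_i) / (pi_i * (1 - pi_i)).
ate_ipw = sum(
    y * (a - pi[x]) / (pi[x] * (1 - pi[x])) for x, a, y in records
) / n


def stratum_mean(x, a):
    """Mean outcome among units with covariate x and treatment a."""
    ys = [y for xi, ai, y in records if xi == x and ai == a]
    return sum(ys) / len(ys)


# Standardization: sum_x P(X=x) * (E(Y|A=1,X=x) - E(Y|A=0,X=x)).
ate_std = sum(
    (n_x[x] / n) * (stratum_mean(x, 1) - stratum_mean(x, 0)) for x in n_x
)
print(ate_ipw, ate_std)
```

Both estimates equal 1.6 up to floating-point error. With a parametric propensity model (for example logistic regression) the two would generally differ slightly.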
Estimation Methods
4. IPTW – Inverse Probability of Treatment Weighting
Better Predictions ↛ Better Effect Estimation
Which model is more accurate?
Model A predicts the outcome more accurately
Model B estimates the causal effect more accurately
• “The effect of ads is positive, and in small companies it is twice the effect in large ones”
[Figure: Effect (uplift in CTR) for small and large companies. True effect: 1, 0.5; Model A: 0.25, -1; Model B: 1, 0.5]
[Figure: Outcome (CTR) for small untreated, small treated, large untreated, large treated. True ROI (unknown): 1, 2, 5, 5.5; Model A: 1.5, 1.75, 5.5, 4.5; Model B: -0.5, 0.5, 3, 3.5]
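The point of this slide can be verified with the values shown in its two charts. The sketch below compares the models by mean absolute error on the outcome and on the effect; the choice of error metric and the helper names are assumptions made for illustration.

```python
# Chart values, ordered as: small untreated, small treated,
# large untreated, large treated (CTR outcomes).
true_outcome = [1.0, 2.0, 5.0, 5.5]
model_a = [1.5, 1.75, 5.5, 4.5]
model_b = [-0.5, 0.5, 3.0, 3.5]


def mae(pred, truth):
    """Mean absolute error between two equally long lists."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def effects(outcomes):
    """Treated minus untreated, for (small, large) companies."""
    return [outcomes[1] - outcomes[0], outcomes[3] - outcomes[2]]


outcome_error = {
    "A": mae(model_a, true_outcome),
    "B": mae(model_b, true_outcome),
}
effect_error = {
    "A": mae(effects(model_a), effects(true_outcome)),
    "B": mae(effects(model_b), effects(true_outcome)),
}
print(outcome_error, effect_error)
```

Model A has the smaller outcome error (0.5625 against 1.75) yet gets the effect wrong, including its sign for large companies. Model B is consistently biased on the outcome, but the bias cancels in the difference, so its effect error is zero.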
Refutation (aka Model Validation)
Placebo Treatment
• Replace treatment with a random variable.
Irrelevant Additional Confounder
• Add a random common cause variable.
Subset validation
• Remove a random subset of the data.
Random Replace
• Randomly replace a covariate with an irrelevant variable.
Selection Bias
• Blackwell, 2013
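The placebo-treatment check is simple enough to sketch by hand. The example assumes simulated randomized data with a true effect of 2.0 and a plain difference-in-means estimator; libraries that automate refutation work differently in detail, and all names here are illustrative.

```python
import random

rng = random.Random(2)


def diff_in_means(treatment, outcome):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical randomized data with a true effect of 2.0.
treatment = [rng.randint(0, 1) for _ in range(5000)]
outcome = [2.0 * t + rng.gauss(0, 1) for t in treatment]

estimate = diff_in_means(treatment, outcome)

# Placebo refuter: swap in a random treatment that cannot affect the outcome
# and re-run the same estimator; the result should be close to zero.
placebo_estimates = []
for _ in range(20):
    placebo = [rng.randint(0, 1) for _ in treatment]
    placebo_estimates.append(diff_in_means(placebo, outcome))
mean_placebo = sum(placebo_estimates) / len(placebo_estimates)

print(estimate, mean_placebo)
```

If the placebo estimate were far from zero, that would suggest the estimator is picking up something other than the treatment.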
Application example: Uplift Modeling
• E.g., instead of predicting who will churn → predict whose churn is most likely to be reduced by the treatment
• Steps:
1. Estimate CATE
2. Rank users according to expected effect size
3. The more users you target, the lower the marginal performance (diminishing returns)
• See CausalML
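The three steps can be sketched without any library. The segment names, churn rates, and user counts below are invented for illustration; in practice the per-segment effects would come from a CATE estimator rather than from known rates.

```python
# Hypothetical segments: name -> (churn rate if treated, churn rate if untreated, users).
segments = {
    "new_users": (0.30, 0.50, 100),
    "casual": (0.40, 0.50, 300),
    "loyal": (0.09, 0.10, 400),
    "dormant": (0.80, 0.75, 200),
}

# Step 1: CATE per segment, expressed as the reduction in churn due to treatment.
cate = {
    name: untreated - treated
    for name, (treated, untreated, _) in segments.items()
}

# Step 2: rank segments by expected effect, largest first.
ranking = sorted(cate, key=cate.get, reverse=True)

# Step 3: cumulative number of churns prevented as more segments are targeted.
cumulative = []
total = 0.0
for name in ranking:
    total += cate[name] * segments[name][2]
    cumulative.append((name, round(total, 2)))

print(ranking)
print(cumulative)
```

The gain per additional targeted user shrinks down the ranking, and targeting the last segment (where treatment increases churn) lowers the total.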
Summary – Supervised vs Causal Learning
| | Supervised Learning | Causal Inference |
|---|---|---|
| Predicts | outcome $P(Y \mid X)$ | effect of change $P(Y \mid do(X))$ |
| Assumption | Passive observer | Decision maker |
| Train-Test | Equally distributed | Distribution shift |
| Validation | Easy, via hold-out | Fundamental challenge. Better prediction is NOT better causal estimation |
| Feature set | Quantitative (overfit / underfit) | Qualitative – could cause a bias in the estimate |
| Domain Knowledge | Nice to have; deep neural networks surpass humans without it | Essential for making assumptions and avoiding pitfalls |
For Who?
Practice
Causality in Python
Causality in Python https://vesoft-inc.github.io/github-statistics/
Like scikit-learn 10 years ago
Typical Stages in a Causal Project
1. Model – assumptions as a graph (DAG). If this is missing you can try causal discovery methods
2. Identify – turn assumptions into a list of what to control for
3. Estimate – use estimation methods to estimate the effect
4. Refute – validate and check for robustness
Let’s read some code
Appendix



Editor's Notes

  • #12:
    LinkedIn - https://hdsr.mitpress.mit.edu/pub/wjhth9tr/release/1
    Netflix - https://netflixtechblog.com/reimagining-experimentation-analysis-at-netflix-71356393af21
    Google – Optimize - https://www.blog.google/products/marketingplatform/analytics/new-ways-manage-and-measure-your-experiments-google-optimize/
    Google – research CausalImpact - https://google.github.io/CausalImpact/CausalImpact.html, Alex D'amour https://research.google/pubs/pub47884/
    Airbnb - https://medium.com/airbnb-engineering/experimentation-measurement-for-search-engine-optimization-b64136629760
    Booking - https://booking.ai/pydata-amsterdam-2019-at-booking-com-198c2d3ae34a
    Uber - https://eng.uber.com/causal-inference-at-uber
  • #30: https://iyarlin.github.io/2019/05/20/causal-inference-bake-off-kaggle-style/