amshar@microsoft.com
http://www.github.com/amit-sharma/causal-inference-tutorial
Use these correlations to make a predictive model.
Future Activity -> f(number of friends, logins in past month)
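
A minimal sketch of such a predictive model in R, fitted on simulated data. The column names (`num_friends`, `logins_past_month`, `future_activity`) and the coefficients used to generate the data are assumptions for illustration; none of them come from the tutorial's dataset.

```r
# Hypothetical, simulated data: these columns and coefficients are illustrative only.
set.seed(1)
n <- 500
users <- data.frame(
  num_friends       = rpois(n, lambda = 20),
  logins_past_month = rpois(n, lambda = 10)
)
users$future_activity <- 2 + 0.3 * users$num_friends +
  0.8 * users$logins_past_month + rnorm(n, sd = 2)

# Fit f(number of friends, logins in past month) as a linear regression.
model <- lm(future_activity ~ num_friends + logins_past_month, data = users)

# Predict future activity for a new user.
new_user <- data.frame(num_friends = 30, logins_past_month = 12)
predicted <- predict(model, newdata = new_user)
print(coef(model))
print(predicted)
```

A model like this can predict well as long as the observed correlations persist, but its coefficients alone do not say what would happen if the number of friends were changed by an intervention.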

Old Algorithm (A)    New Algorithm (B)
50/1000 (5%)         54/1000 (5.4%)
Low-activity users:
Old Algorithm (A)    New Algorithm (B)
10/400 (2.5%)        4/200 (2%)
High-activity users:
Old Algorithm (A)    New Algorithm (B)
40/600 (6.7%)        50/800 (6.25%)
[Figure: CTR (%) for low-activity and high-activity users]
Is Algorithm A better?

                              Old Algorithm (A)    New Algorithm (B)
CTR for low-activity users    10/400 (2.5%)        4/200 (2%)
CTR for high-activity users   40/600 (6.7%)        50/800 (6.25%)
Total CTR                     50/1000 (5%)         54/1000 (5.4%)
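
The reversal in the table can be checked directly. The sketch below recomputes the per-stratum and total CTRs from the counts given above; the data frame layout and column names (`activity`, `algorithm`, `clicks`, `n`) are assumptions made for this example.

```r
# Click counts and denominators copied from the table above.
ctr <- data.frame(
  activity  = c("Low", "Low", "High", "High"),
  algorithm = c("A", "B", "A", "B"),
  clicks    = c(10, 4, 40, 50),
  n         = c(400, 200, 600, 800)
)

# CTR within each activity stratum: A is ahead in both strata.
ctr$rate <- ctr$clicks / ctr$n
print(ctr)

# Total CTR, ignoring activity level: B is ahead.
totals <- aggregate(cbind(clicks, n) ~ algorithm, data = ctr, FUN = sum)
totals$rate <- totals$clicks / totals$n
print(totals)

# Share of each algorithm's traffic that comes from high-activity users.
high <- ctr[ctr$activity == "High", ]
high_share <- high$n / totals$n[match(high$algorithm, totals$algorithm)]
names(high_share) <- high$algorithm
print(high_share)
```

Algorithm B draws 80% of its traffic from high-activity users, against 60% for A, which is enough to lift its total CTR above A's even though A has the higher CTR within each group.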
Average comment length decreases over time.
But for each yearly cohort of users, comment length increases over time.
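
One way both statements can hold at once is a shift in cohort mix. The sketch below uses made-up numbers (cohort sizes, starting comment lengths, and a growth of 5 per year are all assumptions, not the data behind the slides): every cohort's comments get longer, yet the population average falls because later, larger cohorts start with shorter comments.

```r
# Hypothetical numbers chosen only to illustrate the pattern.
cohorts <- data.frame(
  join_year   = c(2010, 2011, 2012),
  n_users     = c(100, 300, 900),   # later cohorts are larger
  base_length = c(100, 70, 40)      # later cohorts start with shorter comments
)

# One row per (cohort, calendar year) in which the cohort is active.
grid <- merge(cohorts, data.frame(year = 2010:2012))  # no shared columns: cross join
grid <- grid[grid$year >= grid$join_year, ]

# Within each cohort, comment length grows by 5 per year of tenure.
grid$avg_length <- grid$base_length + 5 * (grid$year - grid$join_year)

# Population-wide average per calendar year, weighted by cohort size.
overall <- sapply(split(grid, grid$year), function(d) {
  weighted.mean(d$avg_length, d$n_users)
})
print(round(overall, 1))
```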
http://plato.stanford.edu/entries/causation-mani/
http://plato.stanford.edu/entries/causation-counterfactual/
Dunning (2002), Rosenzweig-Wolpin (2000)
Does new Algorithm B increase CTR for recommendations on
Windows Store, compared to old algorithm A?
$\mathrm{Propensity}(\mathrm{NewAlgo} \mid \mathrm{User}_i) = \mathrm{Logistic}(a_{cat_1}, a_{cat_2}, \ldots, a_{cat_n})$
Compare CTR between users with the same propensity score.
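
A minimal sketch of this procedure on simulated data, using base R's `glm`. The covariates `a_cat1` and `a_cat2` stand in for per-category activity, and the assignment mechanism, click model, and true effect of +0.02 are all assumptions made for illustration. Grouping users into propensity quintiles is one common approximation to "the same propensity score".

```r
# Simulated data; all column names and effect sizes are assumptions.
set.seed(42)
n <- 50000
users <- data.frame(
  a_cat1 = rpois(n, 5),   # hypothetical activity count in category 1
  a_cat2 = rpois(n, 3)    # hypothetical activity count in category 2
)
# More active users are more likely to get the new algorithm (confounding).
p_new <- plogis(-2 + 0.25 * users$a_cat1 + 0.2 * users$a_cat2)
users$new_algo <- rbinom(n, 1, p_new)
# Clicks depend on activity and on a true treatment effect of +0.02.
p_click <- 0.02 + 0.01 * users$a_cat1 + 0.01 * users$a_cat2 + 0.02 * users$new_algo
users$click <- rbinom(n, 1, p_click)

# Step 1: estimate each user's propensity with logistic regression.
ps_model <- glm(new_algo ~ a_cat1 + a_cat2, family = binomial, data = users)
users$propensity <- fitted(ps_model)

# Step 2: group users with similar propensity (here, quintiles).
breaks <- unique(quantile(users$propensity, probs = seq(0, 1, 0.2)))
users$stratum <- cut(users$propensity, breaks = breaks, include.lowest = TRUE)

# Step 3: compare CTR between algorithms within each stratum, then average.
ctr_diff <- function(d) {
  mean(d$click[d$new_algo == 1]) - mean(d$click[d$new_algo == 0])
}
by_stratum <- sapply(split(users, users$stratum), ctr_diff)
naive <- ctr_diff(users)
stratified <- weighted.mean(by_stratum, as.numeric(table(users$stratum)))
print(c(naive = naive, stratified = stratified))
```

In this simulation the naive difference is inflated because active users both click more and receive the new algorithm more often; comparing within propensity strata removes most of that inflation. The approach only adjusts for covariates that enter the propensity model.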
[Figure: two panels. "Ego Network": user u connected to friends f1 to f5. "Non-Friends": user u alongside non-friends n1 to n5.]
http://tylervigen.com/spurious-correlations
> nrow(user_app_visits_A)
[1] 1,000,000
> length(unique(user_app_visits_A$user_id))
[1] 10,000
> length(unique(user_app_visits_A$product_id))
[1] 990
> length(unique(user_app_visits_A$category))
[1] 10
> library(dplyr)  # provides summarise()
> user_app_visits_B = read.csv("user_app_visits_B.csv")
> naive_observational_estimate <- function(user_visits){
    # Naive observational estimate:
    # simply the fraction of visits that resulted in a recommendation click-through.
    est = summarise(user_visits,
                    naive_estimate=sum(is_rec_visit)/length(is_rec_visit))
    return(est)
  }
> naive_observational_estimate(user_app_visits_A)
naive_estimate
[1] 0.200768
> naive_observational_estimate(user_app_visits_B)
naive_estimate
[1] 0.226467
> stratified_by_activity_estimate(user_app_visits_A)
Source: local data frame [4 x 2]
activity_level stratified_estimate
1 1 0.1248852
2 2 0.1750483
3 3 0.2266394
4 4 0.2763522
> stratified_by_activity_estimate(user_app_visits_B)
Source: local data frame [4 x 2]
activity_level stratified_estimate
1 1 0.1253469
2 2 0.1753933
3 3 0.2257211
4 4 0.2749867
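
The definition of `stratified_by_activity_estimate` does not appear in this transcript. A hypothetical sketch of what such a function could look like is given below, assuming the visits table has an `activity_level` column alongside the 0/1 `is_rec_visit` column. It uses base R's `aggregate` to stay self-contained; the tutorial's own version may differ.

```r
# Hypothetical sketch, not the tutorial's definition.
# Assumes user_visits has columns activity_level and is_rec_visit (0/1).
stratified_by_activity_estimate <- function(user_visits) {
  # Fraction of recommendation click-through visits within each activity level.
  est <- aggregate(is_rec_visit ~ activity_level, data = user_visits, FUN = mean)
  names(est)[names(est) == "is_rec_visit"] <- "stratified_estimate"
  est
}

# Toy data for illustration only.
toy_visits <- data.frame(
  activity_level = c(1, 1, 1, 1, 2, 2, 2, 2),
  is_rec_visit   = c(0, 0, 0, 1, 0, 1, 1, 0)
)
result <- stratified_by_activity_estimate(toy_visits)
print(result)
```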
> stratified_by_category_estimate(user_app_visits_A)
Source: local data frame [10 x 2]
category stratified_estimate
1 1 0.1758294
2 2 0.2276829
3 3 0.2763157
4 4 0.1239860
5 5 0.1767163
… … …
> stratified_by_category_estimate(user_app_visits_B)
Source: local data frame [10 x 2]
category stratified_estimate
1 1 0.2002127
2 2 0.2517528
3 3 0.3021371
4 4 0.1503150
5 5 0.1999519
… … …
> naive_observational_estimate(user_app_visits_A)
naive_estimate
[1] 0.200768
> ranking_discontinuity_estimate(user_app_visits_A)
discontinuity_estimate
[1] 0.121362
About 40% of app visits coming from recommendation click-throughs are not causal: they could have happened even without the recommendation system.
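
The 40% figure follows from the two estimates printed above, under the assumption that the discontinuity estimate captures the causal share of click-through visits:

```r
naive         <- 0.200768  # naive_observational_estimate(user_app_visits_A)
discontinuity <- 0.121362  # ranking_discontinuity_estimate(user_app_visits_A)

# Share of recommendation click-throughs not attributed to the recommender.
non_causal_share <- 1 - discontinuity / naive
print(round(non_causal_share, 3))  # 0.396
```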

From prediction to causation: Causal inference in online systems