Applied Data Science
for monetization
pitfalls, common misconceptions, and novel approaches
Vladislav Grozin
Co-founder & Head of DS @ INCYMO.AI
Ph.D. @ Colorado University Boulder (USA)
Solve problems in:
- Improving retention via gifts
- AB testing new mechanics
- Update roll-out analysis
Learn Today
- how and why common solutions fail
- how to do better: 3 tools you can use today!
NB: NO MATH!
Just some (criminally) underused methods (or tricks?.. algorithms, maybe?..)
About Me
7 years - DS products for e-commerce and other online services
Academic background:
- 5 published papers;
- interest in new and applicable approaches
… and amateur game developer (in the past)!
Now switched to providing DS solutions for games and work @ Incymo.ai
Case 1
Improving retention via gifts
Case 1 / Setting
Imagine:
• want: prevent players from churning
• idea: we can do something to retain them (for example, send them gifts)
But which players should receive these gifts to maximize profits?
Typical solution:
• build a churn prediction model to estimate the probability that a player will leave
Case 1 / Pitfalls
Questions:
• What are we actually predicting?
• That the player won’t return tomorrow? In a week? In a month?..
• What then?
• Send gifts to players with p > 90%?.. Or 75% < p < 80%?..
Hard to answer precisely!
Let’s handwave and say “it just works™”
Case 1 / Pitfalls
Semi-realistic assumptions:
1. Let’s say some users like freebie items, but churn a lot
2. We correctly identify them and send them gift items
3. It works, and they are retained. That’s recorded in the data.
Case 1 / Pitfalls
aaand after some time…
1. The model is updated using a new dataset
2. It observes that users with these traits do not churn
3. Now the model assigns them a low churn probability
4. …and we don’t send them items anymore
Case 1 / Problem Core
There is a disconnect:
1. We want to improve retention
2. Collect behavior data
3. Predict churn probability
4. ????
5. Decide which players should receive freebie items
6. Enjoy retention
What’s missing?
● We’d like to send gifts to the most susceptible users
● … that’s not measured by churn probability
● we need the change in probability
There is a gap between “predictions” and “decision making”
Case 1 / Solution / Uplift Models
• Estimate the change in probabilities after performing an action
• Plenty of implementations (scikit-uplift, pylift, …)
*“p(churn)” is not quite the same as “p(churn | didn’t get a gift)”; ignoring details for now
Player Churn% ↓Churn% | if we send gift Prediction-based decision Uplift-based decision
A 95% 5% Send
B 90% 10% Send
C 85% 30% Send
20% Send
5%
1%
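The idea of estimating a change in churn probability, rather than the churn probability itself, can be sketched without any uplift library. This is a minimal illustration, assuming gifts were handed out at random so that gifted and non-gifted players are comparable; the function name `uplift_by_segment`, the segment labels, and all counts are hypothetical. Packages such as scikit-uplift or pylift fit models per player instead of averaging per segment.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """records: (segment, got_gift, churned) tuples from a randomized gift experiment.

    Returns {segment: churn rate without gift - churn rate with gift}.
    """
    # For every segment keep [churned, total] counters for gifted and non-gifted players.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, got_gift, churned in records:
        cell = counts[segment][bool(got_gift)]
        cell[0] += int(churned)
        cell[1] += 1
    uplift = {}
    for segment, cells in counts.items():
        treated, control = cells[True], cells[False]
        if treated[1] == 0 or control[1] == 0:
            continue  # uplift cannot be estimated without both groups
        uplift[segment] = control[0] / control[1] - treated[0] / treated[1]
    return uplift

# Hypothetical data: "loyal_to_churn" players almost always churn, gift or not;
# "on_the_fence" players churn less often, but react strongly to a gift.
records = (
    [("loyal_to_churn", False, True)] * 19 + [("loyal_to_churn", False, False)] * 1
    + [("loyal_to_churn", True, True)] * 18 + [("loyal_to_churn", True, False)] * 2
    + [("on_the_fence", False, True)] * 10 + [("on_the_fence", False, False)] * 10
    + [("on_the_fence", True, True)] * 6 + [("on_the_fence", True, False)] * 14
)
uplift = uplift_by_segment(records)
ranked = sorted(uplift, key=uplift.get, reverse=True)
print(ranked)
```

In this toy data a churn-probability ranking would put "loyal_to_churn" first, while the uplift ranking puts "on_the_fence" first, which is the gap between prediction and decision that the slides describe.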
Case 1 / Conclusion
Uplift models:
• estimate gain from an action
• easy to use: ready-made implementations exist (scikit-uplift, pylift, …)
• more suitable than prediction-based decision-making
No one really needs predictions.
Applied Data Science is actually about data-guided automated decision-making
Case 2
AB testing new mechanics
Case 2 / Setting
Imagine:
• New players underperform in $$$
• We designed two “Welcome” offer pop-ups
• We want to pick the best one and fine-tune its parameters
Typical solution:
• Run an AB test: “Welcome1” vs “Welcome2” vs baseline
• Allocate subgroups with different parameter values to pick the best
Case 2 / Pitfalls
• How should we run the test? When to stop?
• Should we stop before the planned end?
• We are deliberately losing money during the test
• Will the chosen “best” weights remain “best” forever?
Case 2 / A Better Approach
Multi-armed bandits!
Directly make “decisions” from a set
by searching for the best “decision”,
where the best one is the one that gives the most “value”.
Well tested and widely applied in e-commerce, marketing, etc.
Case 2 / Multi-armed Bandits / Setting
Imagine: you enter a casino and see X slot machines
- On using one (pulling its “arm”), it may give you some $$
- The chance and amount of $$ are random
- No prior knowledge about the distributions, but you “see” all machines
- Machines have different distributions that do not change
- You’d like to maximize total $$$
- Pulls are free
- Goal: minimize cumulative regret (the $$$ lost compared with always pulling the best arm)
Case 2 / Multi-armed Bandits / Setting
Example: 3 machines
$$$ distribution (unknown to you):
A (50%: 5$)
B (10%: 10..20$)
C (1%: 1000$)
You pulled each arm twice:
A (0$, 5$)
B (0$, 15$)
C (0$, 0$)
Your next pull?
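A quick piece of arithmetic shows why the question is tricky. The sketch below compares the true expected payout per pull with the averages observed after two pulls per arm. It assumes B's "10..20$" win is uniform, so its mean win is 15$; the variable names are illustrative only.

```python
# (probability of a win, mean size of a win) per machine, taken from the example.
# Assumption: B pays uniformly between 10$ and 20$, i.e. 15$ on average.
true_arms = {"A": (0.50, 5.0), "B": (0.10, 15.0), "C": (0.01, 1000.0)}

# What was actually observed after pulling each arm twice.
observed = {"A": [0, 5], "B": [0, 15], "C": [0, 0]}

# Expected payout per pull (unknown to the player).
expected = {arm: p * win for arm, (p, win) in true_arms.items()}

# Average payout per pull seen so far.
observed_mean = {arm: sum(r) / len(r) for arm, r in observed.items()}

best_true = max(expected, key=expected.get)
best_observed = max(observed_mean, key=observed_mean.get)
print("best by true expectation:", best_true)
print("best by observed average:", best_observed)
```

A player who greedily follows the observed averages would keep pulling B, although C pays the most in expectation. Two pulls per arm are far too few to tell, so some continued exploration is needed.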
Case 2 / Multi-armed Bandits / Algo
Thompson Sampling algorithm for the multi-armed bandit problem:
1. Estimate the $$$ distribution for each “arm”
2. Draw a sample from each arm’s estimated distribution
3. Pull the arm with the largest sample
Application to our problem:
Each arm = offer variant (“Welcome 1”, “Welcome 2”)
Pull = assigning a variant to a player and showing the offer
Reward = future money from that player
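The three steps can be sketched in a few lines. This is a simplified illustration, not a production setup: it assumes the reward is binary (the player either buys after seeing the offer or not), so each arm's estimate is a Beta distribution over its conversion rate. The conversion rates and the function name `thompson_sampling` are hypothetical, and players are simulated.

```python
import random

def thompson_sampling(true_rates, n_pulls, seed=0):
    """Simulate Thompson Sampling with Bernoulli rewards; returns pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms    # players who bought after seeing this variant
    losses = [0] * n_arms  # players who did not
    pulls = [0] * n_arms
    for _ in range(n_pulls):
        # Steps 1-2: the estimate for each arm is Beta(wins + 1, losses + 1);
        # draw one sample from each arm's estimate.
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)]
        # Step 3: show the variant whose sample is the largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulated player response (in a real game this comes from the player).
        bought = rng.random() < true_rates[arm]
        wins[arm] += int(bought)
        losses[arm] += int(not bought)
        pulls[arm] += 1
    return pulls

# Hypothetical conversion rates: baseline, "Welcome 1", "Welcome 2".
pulls = thompson_sampling([0.02, 0.05, 0.20], n_pulls=5000)
print(pulls)
```

Most of the traffic ends up on the variant with the highest conversion rate, while the weaker variants still get an occasional pull. Modelling money amounts instead of a yes/no purchase would need a different distribution per arm.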
Case 2 / Multi-armed Bandits / Algo
Advantages:
- Converges to the best offer variant and keeps using it
- Does so in a near-optimal number of trials (= very fast)
- Much better than running a fixed-duration AB test
- If we leave it running: adapts to drifts in the player cohort
- Much much much better than re-running AB tests
Case 2 / Conclusion
Multi-armed bandits:
• drop-in replacement for AB tests
• easy to implement; pick the best solution faster
• make some assumptions about the setting (the example above meets them)
All models / machine decision-making make assumptions
(you’d better be explicit about that! are you?)
You shouldn’t cross-validate hyperparameters on users using AB tests
Case 3
Update roll-out analysis
Case 3 / Setting
• We released a bunch of updates: a crash fix for iOS, an optimization for Android, and some gameplay changes for monetization
• It was rolled out to half of the players
• A week has passed
• We want to check (and measure) how much more money we are making
Dashboard says: +10% ARPU
Roll out to 100% of players?..
Case 3 / Pitfalls 1
Dashboard says:
A: (5+3+5+4+3+5+4) = 29$
B: (4+5+3+5+4+5+5) = 32$
Lift = B / A - 1
= 32$ / 29$ - 1
≈ +10% ARPU
Good!
Roll out?..
Case 3 / Pitfalls 1 / ARPU Metric
● You can’t just add up ARPU across days!
Sum(ARPU per day) != Total ARPU for these days
● Let’s look at an example (“-” means “no visit”)
Player       Day1    Day2    Day3
A            $1      -       -
B            $0      $0      $1
C            -       $1      $0
D            -       -       $1
E            -       -       $1
Daily $      $1.00   $1.00   $3.00
Daily ARPU   $0.50   $0.20   $0.60
ARPU (if we sum the daily values): $1.30
Total $: $5.00
Players: 5
ARPU (the right one): $1.00
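The non-additivity is easy to reproduce on a tiny log. The sketch below uses its own hypothetical two-day purchase log (not the numbers above) and assumes daily ARPU means that day's revenue divided by the players who visited that day.

```python
# Hypothetical purchase log: day -> {player: revenue that day}.
# A player who is missing from a day did not visit on that day.
daily_revenue = {
    "day1": {"p1": 2.0, "p2": 0.0},
    "day2": {"p2": 1.0, "p3": 0.0, "p4": 1.0},
}

# Daily ARPU: that day's revenue divided by that day's active players.
daily_arpu = {day: sum(rev.values()) / len(rev) for day, rev in daily_revenue.items()}
summed_daily_arpu = sum(daily_arpu.values())

# Period ARPU: total revenue divided by the number of distinct players in the period.
players = set().union(*daily_revenue.values())
total_revenue = sum(sum(rev.values()) for rev in daily_revenue.values())
period_arpu = total_revenue / len(players)

print("sum of daily ARPU:", round(summed_daily_arpu, 2))
print("period ARPU:", round(period_arpu, 2))
```

The two numbers differ because each daily ARPU has its own denominator, and a player who visits on several days is counted in several of them.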
Case 3 / Pitfalls 2
Recalculated ARPU properly: +5% ARPU
Good now?
No! A single metric doesn’t tell the whole story and may mislead
(imagine: we broke the Android version and fixed the iOS one).
(There are a lot of things wrong with this approach.)
Also, this +5% can be attributed to randomness.
Solution: use confidence intervals.
But some metrics have no formula for confidence intervals…
Case 3 / Bootstrap / Solution
An algorithm that calculates a CI for any per-user metric!
Steps:
1. Calculate X: an array of numbers, one per player; each value is that player’s metric
($$$ for each active player in our case; 0$ if no purchases)
2. Repeat N times:
a) Resample X (as Y): make a new array (same size as X) by picking random values from X (independently, with replacement)
b) Compute the average of Y
This gives N different numbers (N averages of the resampled X).
3. Compute percentiles of these N averages: (5%, 50%, 95%). They are the (lower CI bound, central estimate, upper CI bound) of your metric, i.e. a 90% confidence interval
Voila!
Case 3 / Bootstrap / Solution
That’s ~10 lines of code.
Works with ANY per-user metric!
With that, we can decide whether we should roll out.
Intuition:
Step 2a simulates “AB tests from a parallel universe”; step 2b computes the average metric of each one.
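The bootstrap steps translate almost line by line into code. This is a minimal sketch of the percentile bootstrap using only the standard library; the function name `bootstrap_ci`, the number of resamples, and the per-player revenue values are hypothetical.

```python
import random

def bootstrap_ci(values, n_resamples=2000, seed=0):
    """Return the (5th, 50th, 95th) percentiles of the bootstrapped mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Step 2a: resample with replacement, same size as the original array.
        resample = rng.choices(values, k=n)
        # Step 2b: average of the resampled array.
        means.append(sum(resample) / n)
    means.sort()

    def percentile(p):
        # Nearest-rank lookup in the sorted list of averages.
        return means[min(n_resamples - 1, int(p * n_resamples))]

    # Step 3: lower bound, central estimate, upper bound (a 90% interval).
    return percentile(0.05), percentile(0.50), percentile(0.95)

# Hypothetical revenue per active player: most pay nothing, a few pay a lot.
revenue = [0.0] * 90 + [5.0] * 8 + [50.0] * 2
low, mid, high = bootstrap_ci(revenue)
print(f"ARPU ~ {mid:.2f} (90% CI: {low:.2f} .. {high:.2f})")
```

To compare the two halves of a roll-out, one option is to run this on each group and check whether the intervals overlap, or to bootstrap the difference in means directly.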
Case 3 / Conclusion
• Bootstrap algorithm: computes a CI for any per-user metric in ~10 lines of code
• ARPU is not additive (can’t just add up daily values)
• Using only one metric misleads (and disappoints)
Metrics have assumptions behind them too!
A product decision based on a single number is under-supported
Discussion
Game Dev & Data Science: right now
• Mobile games: it is hard to make money, and they are an ad spam fest
• A plethora of usable data is collected, yet…
• No adoption of complex techniques/models, and low adoption of simple ones
(PERSONAL SPECULATION!)
Game Dev & Data Science: right now
Low adoption?
Disbelief: many tried standard approaches, with meager results.
Domain complexity: games are dynamic and highly reactive.
- that makes them hard to model; common tools are unsuitable.
Integration complexity: models may break control over gameplay (if misused)
What’s a good $$$ source? Focusing on the paying few (“whales”) is a good strategy, unless you have good personalization tools:
- what about extracting 0.5$ from 50% of your non-paying players?
(PERSONAL SPECULATION!)
What We’ve Learned Today
Case 1: “Improving retention via gifts”
1. Uplift modeling, a better alternative to predicting stuff
2. Decisions are more important than “just prediction”
Case 2: “AB testing new mechanics”
1. Multi-armed bandits, a better alternative to AB tests
2. Algorithms have assumptions behind them
Case 3: “Update roll-out analysis”
1. Bootstrap technique for easy confidence intervals
2. Metrics have assumptions too, and may mislead
Personal Hypotheses
● Misuse of tools + hard domain > failed early experiments > distrust > low adoption
● Mobile games are overfitting to whales
INCYMO.AI
Join us!
Free blog on monetization:
LinkedIn: Vlad Grozin
E-mail: vg@incymo.ai
Twitter/ Telegram: @rampeer

Applied Data Science for monetization: pitfalls, common misconceptions, and novel approaches / Vladislav Grozin (INCYMO.AI)

  • 1. Applied Data Science for monetization pitfalls, common misconceptions, and novel approaches Vladislav Grozin Co-founder & Head of DS @ INCYMO.AI Ph.D. @ Colorado University Boulder (USA)
  • 2. Solve Problems in Improving retention via gifts 2
  • 3. Solve problems in: AB testing new mechanics 3 Solve Problems in
  • 4. Solve problems in: Update roll-out analysis 4 Solve Problems in
  • 5. Learn Today - how and why common solutions fail - how to do better: 3 tools you can use today! NB: NO MATH! Just some (criminally) underused methods (or tricks?... algorithms, maybe?..) 5 Learn Today
  • 6. 7 years - DS products for e-commerce and other online services 6 About me
  • 7. 7 years - DS products for e-commerce and other online services Academic background: - 5 published papers; - interest in new and applicable approached 7 About Me
  • 8. 7 years - DS products for e-commerce and other online services Academic background: - 5 published papers; - interest in new and applicable approached … and amateur game developer (in past)! 8 About About Me
  • 9. About Me 7 years - DS products for e-commerce and other online services Academic background: - 5 published papers; - interest in new and applicable approached … and amateur game developer (in past)! Now switched to providing DS solution for games and work @ Incymo.ai 9 About
  • 11. Case 1 / Setting Imagine: • want: prevent players from churning 11
  • 12. Case 1 / Setting Imagine: • want: prevent players from churning • idea: can do something to retain them 12
  • 13. Case 1 / Setting Imagine: • want: prevent players from churning • idea: can do something to retain them But which players should receive these gifts to maximize profits? 13
  • 14. Case 1 / Setting Imagine: • want: prevent players from churning • idea: can do something to retain them But which players should receive these gifts to maximize profits? 14 Typical solution: • build a churn prediction model to estimate probability that player will leave
  • 15. Case 1 / Pitfalls Questions: • What are we actually predicting? • Player won’t return tomorrow? In a week? In 1 month?.. 15
  • 16. Case 1 / Pitfalls Questions: • What are we actually predicting? • Player won’t return tomorrow? In a week? In 1 month?.. • What’s then? • Send to players with p>90%?.. Or 75% < p <80%?.. 16
  • 17. Case 1 / Pitfalls Questions: • What are we actually predicting? • Player won’t return tomorrow? In a week? In 1 month?.. • What’s then? • Send to players with p>90%?.. Or 75% < p <80%?.. 17 Hard to answer precisely! Let’s handwave and say “it just works™”
  • 18. Case 1 / Pitfalls Semi-realistic assumptions: 1. Some users like freebie items, but churn a lot 18
  • 19. Case 1 / Pitfalls Semi-realistic assumptions: 1. Some users like freebie items, but churn a lot 2. We correctly identify them and send them gift items 19
  • 20. Case 1 / Pitfalls Semi-realistic assumptions: 1. Let’s say some users like freebie items, but churn a lot 2. We correctly identify them and send them gift items 3. It works, and they are retained. That’s recorded in the data. 20
  • 21. Case 1 / Pitfalls aaand after some time… 21
  • 22. Case 1 / Pitfalls aaand after some time… 1. The model is updated using new dataset 22
  • 23. Case 1 / Pitfalls aaand after some time… 1. The model is updated using new dataset 2. It observes that users with these traits do not churn 23
  • 24. Case 1 / Pitfalls aaand after some time… 1. The model is updated using new dataset 2. It observes that users with these traits do not churn 3. Now, model assigns them low churn probability 24
  • 25. Case 1 / Pitfalls aaand after some time… 1. The model is updated using new dataset 2. It observes that users with these traits do not churn 3. Now, model assigns them low churn probability 4. …and we don’t send them items anymore 25
  • 26. Case 1 / Problem Core There is a disjoint: 1. We want to improve retention 2. Collect behavior data 3. Predict churn probability 4. ???? 5. Decide on which players should receive freebie items 6. Enjoy retention 26
  • 27. Case 1 / Problem Core There is a disjoint: 1. We want to improve retention 2. Collect behavior data 3. Predict churn probability 4. ???? 5. Decide on which players should receive freebie items 6. Enjoy retention What’s missing? ● We’d like to send to the most susceptible users ● … that’s not measured by churn probability ● we need change in probability 27
  • 28. Case 1 / Problem Core There is a disjoint: 1. We want to improve retention 2. Collect behavior data 3. Predict churn probability 4. ???? 5. Decide on which players should receive freebie items 6. Enjoy retention What’s missing? ● We’d like to send to the most susceptible users ● … that’s not measured by churn probability ● we need change in probability There is a gap between “predictions” and “decision making”28
  • 29. • Estimate the change in probabilities after performing a change • Plenty of implementations (scikit-uplift, pylift, …) Case 1 / Solution / Uplift Models *“p(churn)” is not quite the same as “p(churn|didn’t got a gift)”, ignoring details for now 29 Player Churn% ↓Churn% | if we send gift Prediction-based decision Uplift-based decision A 95% 5% Send B 90% 10% Send C 85% 30% Send 20% Send 5% 1%
  • 30. Case 1 / Conclusion Uplift models: • estimate gain from an action • easy to use: implemented in most model packages • more suitable than prediction-based decision-making 30
  • 31. Case 1 / Conclusion Uplift models: • estimate gain from an action • easy to use: implemented in most model packages • more suitable than prediction-based decision-making No one really needs predictions. Applied Data Science is actually about data-guided automated decision-making 31
  • 32. Case 2 AB testing new mechanics 32
  • 33. Case 2 / Setting Imagine: • New players underperform in $$$ 33
  • 34. Case 2 / Setting Imagine: • New players underperform in $$$ • We designed two “Welcome” offer pop-ups 34
  • 35. Case 2 / Setting Imagine: • New players underperform in $$$ • We designed two “Welcome” offer pop-ups • We want to pick the best and fine-tune parameters 35
  • 36. Case 2 / Setting Imagine: • New players underperform in $$$ • We designed two “Welcome” offer pop-ups • We want to pick the best and fine-tune parameters Typical solution: • Run an AB test: “Welcome1” vs “Welcome2” vs baseline 36
  • 37. Case 2 / Setting Imagine: • New players underperform in $$$ • We designed two “Welcome” offer pop-ups • We want to pick the best and fine-tune parameters Typical solution: • Run an AB test: “Welcome1” vs “Welcome2” vs baseline • Allocate subgroups with different parameter values to pick the best 37
  • 38. Case 2 / Pitfalls • How should we run the test? When to stop? 38
  • 39. Case 2 / Pitfalls • How should we run the test? When to stop? • Should we stop before the planned end? 39
  • 40. Case 2 / Pitfalls • How should we run the test? When to stop? • Should we stop before the planned end? • We are deliberately losing money during the test 40
  • 41. Case 2 / Pitfalls • How should we run the test? When to stop? • Should we stop before the planned end? • We are deliberately losing money during the test • Will the chosen “best” weights remain “best” forever? 41
  • 42. Case 2 / A Better Approach Multi-armed bandits! A well-known algorithm 42
  • 43. Case 2 / A Better Approach Multi-armed bandits! Directly makes “decisions” from a set 43
  • 44. Case 2 / A Better Approach Multi-armed bandits! Directly makes “decisions” from a set by searching for the best “decision”, where the best one is the one that gives the most “value”. 44
  • 45. Case 2 / A Better Approach Multi-armed bandits! Directly makes “decisions” from a set by searching for the best “decision”, where the best one is the one that gives the most “value”. Has been well tested and widely applied in e-commerce, marketing, etc. 45
  • 46. Case 2 / Multi-armed Bandits / Setting Imagine: you enter a casino and see X slot machines 46
  • 47. Case 2 / Multi-armed Bandits / Setting - When you use a machine (pull its “arm”), it may give you some $$ - The chance and the amount of $$ are random - No prior knowledge about the distributions, but you “see” all machines - Machines have different distributions that do not change 47
  • 48. Case 2 / Multi-armed Bandits / Setting - When you use a machine (pull its “arm”), it may give you some $$ - The chance and the amount of $$ are random - No prior knowledge about the distributions, but you “see” all machines - Machines have different distributions that do not change - You’d like to maximize total $$$ - Pulls are free - Goal metric: cumulative regret (to be minimized) 48
  • 49. Case 2 / Multi-armed Bandits / Setting Setting (last slide recap) ● X slot machines ● Using one => maybe $$$ ○ Random chance and amount ○ Each machine has a different distribution ● Maximize total $$$ per pull Example: 3 machines, $$$ distribution (unknown to you): A (50%: 5$) B (10%: 10..20$) C (1%: 1000$) You pulled each arm twice: A(0$, 5$) B(0$, 15$) C(0$, 0$) Your next pull? 49
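A small worked version of the three-machine example, as a sketch. It assumes machine B pays uniformly between 10$ and 20$ when it pays (the slide only gives the range), so its average payout is taken as 15$. It compares the true expected value per pull, which the player cannot see, with the averages observed after two pulls per arm.

```python
# True payout model per machine: (chance of paying, average payout when it pays).
# B's average payout of 15$ assumes a uniform payout on 10..20$ (an assumption).
machines = {
    "A": (0.50, 5.0),
    "B": (0.10, (10.0 + 20.0) / 2),
    "C": (0.01, 1000.0),
}

# Expected $ per pull = chance of paying * average payout.
expected = {name: p * payout for name, (p, payout) in machines.items()}

# What the player has actually seen after pulling each arm twice.
observed = {"A": [0, 5], "B": [0, 15], "C": [0, 0]}
observed_mean = {name: sum(v) / len(v) for name, v in observed.items()}

best_true = max(expected, key=expected.get)
best_observed = max(observed_mean, key=observed_mean.get)

print("expected $ per pull:", expected)
print("observed average:", observed_mean)
print("truly best:", best_true, "| looks best so far:", best_observed)
```

Under these assumptions the machine that looks best after two pulls each is not the one with the highest expected payout, which is why always picking the current leader is risky.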
  • 50. Case 2 / Multi-armed Bandits / Algo Thompson Sampling algorithm for the multi-armed bandit problem: 1. Estimate the $$$ distribution for each “arm” 2. Draw a sample from each arm’s distribution 3. Pull the arm with the largest sample 50
  • 51. Case 2 / Multi-armed Bandits / Algo Thompson Sampling algorithm for the multi-armed bandit problem: 1. Estimate the $$$ distribution for each “arm” 2. Draw a sample from each arm’s distribution 3. Pull the arm with the largest sample Application to our problem: Each arm = offer variant Pull = assigning a variant to a player and showing the offer Reward = future money from that player 51
  • 52. Case 2 / Multi-armed Bandits / Algo Application: Each arm = offer variant (“Welcome 1”, “Welcome 2”) Pull = assigning variant to a player, showing offer Reward = future money from that player 52
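The three steps can be sketched as follows. This is a simplified illustration, not the speaker's implementation: it assumes the reward is binary (the player buys the offer or not) so that each arm's distribution can be a Beta posterior, whereas the slides use future money as the reward. The purchase rates and the simulated players are hypothetical.

```python
import random


def thompson_sampling(true_rates, n_players, seed=0):
    """Assign offer variants to simulated players with Beta-Bernoulli Thompson Sampling.

    true_rates: hypothetical purchase probability per variant (unknown to the algorithm).
    Returns how many players were assigned to each variant.
    """
    rng = random.Random(seed)
    wins = {arm: 0 for arm in true_rates}    # players who purchased
    losses = {arm: 0 for arm in true_rates}  # players who did not
    pulls = {arm: 0 for arm in true_rates}

    for _ in range(n_players):
        # Steps 1-2: each arm's estimate is a Beta(wins + 1, losses + 1) posterior;
        # draw one sample from every arm's posterior.
        samples = {
            arm: rng.betavariate(wins[arm] + 1, losses[arm] + 1)
            for arm in true_rates
        }
        # Step 3: show the variant with the largest sample.
        arm = max(samples, key=samples.get)
        pulls[arm] += 1

        # Simulated player response (in production this is the observed reward).
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls


# Hypothetical purchase rates for the baseline and the two offers.
true_rates = {"baseline": 0.2, "Welcome1": 0.4, "Welcome2": 0.7}
pulls = thompson_sampling(true_rates, n_players=3000)
print(pulls)
```

Early on all variants get shown; as evidence accumulates, most players are routed to the variant that performs best in the simulation.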
  • 53. Case 2 / Multi-armed Bandits / Algo Advantages: - Converges to best offer variant; and keeps using it 53
  • 54. Case 2 / Multi-armed Bandits / Algo Advantages: - Converges to the best offer variant, and keeps using it - Does so in close to the fewest trials possible (= very fast) - Much better than running a fixed-duration AB test 54
  • 55. Case 2 / Multi-armed Bandits / Algo Advantages: - Converges to the best offer variant, and keeps using it - Does so in close to the fewest trials possible (= very fast) - Much better than running a fixed-duration AB test - If we leave it running: adapts to player cohort drifts - Much much much better than re-running AB tests 55
  • 56. Case 2 / Conclusion Multi-armed bandits: • drop-in replacement for AB tests • easy to implement; pick the best solution faster • they make some assumptions about the setting (the provided example meets them) 56
  • 57. Case 2 / Conclusion All models / machine decision-making make assumptions (you’d better be explicit about that! are you?) 57
  • 58. Case 2 / Conclusion All models / machine decision-making make assumptions (you’d better be explicit about that! are you?) You shouldn’t cross-validate hyperparameters on users using AB tests 58
  • 59. Case 3 Update roll-out analysis 59
  • 60. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization 60
  • 61. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization • They were rolled out to half of the players 61
  • 62. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization • They were rolled out to half of the players • A week has passed 62
  • 63. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization • They were rolled out to half of the players • A week has passed • We want to check (and measure) how much more money we are making 63
  • 64. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization • They were rolled out to half of the players • A week has passed • We want to check (and measure) how much more money we are making Dashboard says +10% ARPU 64
  • 65. Case 3 / Setting • We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization • They were rolled out to half of the players • A week has passed • We want to check (and measure) how much more money we are making Dashboard says +10% ARPU Roll out to 100% of players?.. 65
  • 66. Case 3 / Pitfalls 1 Dashboard says: A: (5+3+5+4+3+5+4) = 29$ B: (4+5+3+5+4+5+5) = 32$ Lift = B/A = 32$ / 29$ = +10% ARPU Good! Rolling out?.. 66
  • 67. Case 3 / Pitfalls 1 / ARPU Metric ● You can’t just add up ARPU across days! Sum(ARPU per day) != Total ARPU for these days 67
  • 68. Case 3 / Pitfalls 1 / ARPU Metric ● Let’s look at an example (“-” means “no visit”). Player: Day1 / Day2 / Day3. A: $1 / - / -. B: $0 / $0 / $1. C: - / $1 / $0. D: - / - / $1. E: - / - / $1. Daily $: $1.00 / $1.00 / $3.00. Daily ARPU: $0.50 / $0.20 / $0.60. ARPU (if we sum the daily values): $1.30. Total $: $5.00. Players: 5. ARPU (the right one): $1.00. 68
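A short sketch of the same five-player example in code. It assumes daily ARPU is that day's revenue divided by the players who visited that day; this is one common definition and is an assumption here, so the per-day figures it produces need not match the ones on the slide. The point it illustrates is unchanged: the sum of daily ARPU values is not the ARPU for the whole period.

```python
# Revenue per player per day; None means the player did not visit that day.
revenue = {
    "A": [1, None, None],
    "B": [0, 0, 1],
    "C": [None, 1, 0],
    "D": [None, None, 1],
    "E": [None, None, 1],
}
n_days = 3

# Daily ARPU (assumed definition): day's revenue / players active that day.
daily_arpu = []
for day in range(n_days):
    visits = [days[day] for days in revenue.values() if days[day] is not None]
    daily_arpu.append(sum(visits) / len(visits))

# Period ARPU: total revenue / number of distinct players in the period.
total_revenue = sum(x for days in revenue.values() for x in days if x is not None)
total_arpu = total_revenue / len(revenue)

print("daily ARPU:", daily_arpu)
print("sum of daily ARPU:", sum(daily_arpu))
print("ARPU for the whole period:", total_arpu)
```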
  • 69. Case 3 / Pitfalls 2 Recalculated ARPU properly: +5% ARPU Good now? 69
  • 70. Case 3 / Pitfalls 2 Recalculated ARPU properly: +5% ARPU Good now? No! A single metric doesn’t tell the whole story and may mislead (imagine: we broke the Android version and fixed the iOS one) (there are a lot of ‘wrong things’ with this ‘approach’) 70
  • 71. Case 3 / Pitfalls 2 Recalculated ARPU properly: +5% ARPU Good now? No! A single metric doesn’t tell the whole story and may mislead. Also, this +5% could be due to randomness. 71
  • 72. Case 3 / Pitfalls 2 Recalculated ARPU properly: +5% ARPU Good now? No! A single metric doesn’t tell the whole story and may mislead. Also, this +5% could be due to randomness. Solution: use confidence intervals. But some metrics have no formula for confidence intervals… 72
  • 73. Case 3 / Bootstrap / Solution An algorithm that calculates a CI for any per-user metric! 73
  • 74. Case 3 / Bootstrap / Solution Steps: 1. Calculate X: an array of numbers, one per player; each value is that player’s metric ($$$ for each active player in our case; 0$ if no purchases). 2. Repeat N times: a) Resample X as Y: make a new array (same size as X) by picking random values from X (independently, with replacement). b) Compute the average of Y. This gives N different numbers (N averages of resampled X). 3. Compute percentiles of these N averages: (5%, 50%, 95%). They are the lower CI bound, the average value, and the upper CI bound of your metric. Voila! 74
  • 75. Case 3 / Bootstrap / Solution That’s ~10 lines of code. Works with ANY per-user metric! With that, we can decide whether we should roll out. Intuition: step 2a simulates “AB tests from a parallel universe”; step 2b computes its average metric. Steps (last slide recap): 1. Calculate X with per-user metrics 2. Repeat N times: a) Resample X as Y b) Compute avg(Y) 3. Percentiles (5%, 50%, 95%) of the step 2b results (an array of length N) give the average metric with its CI 75
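The steps above can be sketched in a few lines of standard-library Python. The per-player revenue array is made up for illustration, and the function name `bootstrap_ci` and its parameters are assumptions of this sketch, not something from the talk.

```python
import random
from statistics import fmean


def bootstrap_ci(values, n_resamples=2000, seed=0):
    """Percentile bootstrap for the mean of a per-user metric.

    Returns the (5%, 50%, 95%) percentiles of the resampled averages.
    """
    rng = random.Random(seed)
    # Step 2: resample with replacement (same size as the input) and average, N times.
    means = sorted(
        fmean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )

    # Step 3: read percentiles off the sorted averages.
    def percentile(q):
        return means[int(q * (len(means) - 1))]

    return percentile(0.05), percentile(0.50), percentile(0.95)


# Step 1: hypothetical revenue per active player (0 if no purchases).
revenue = [0.0] * 90 + [5.0] * 8 + [50.0] * 2

low, mid, high = bootstrap_ci(revenue)
print(f"ARPU ~ {mid:.2f} (90% interval: {low:.2f} .. {high:.2f})")
```

To compare a rolled-out group against the control group, one option is to run this on each group's array and check whether the intervals overlap.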
  • 76. Case 3 / Conclusion • Bootstrap algorithm: computes a CI for any per-user metric in ~10 lines of code • ARPU is not additive (can’t just add two values) • Using only one metric misleads (and disappoints) 76
  • 77. Case 3 / Conclusion • Bootstrap algorithm: computes a CI for any per-user metric in ~10 lines of code • ARPU is not additive (can’t just add two values) • Using only one metric misleads (and disappoints) Metrics have assumptions behind them too! A product decision based on a single number is under-supported 77
  • 79. Game Dev & Data Science: right now • Mobile games: hard to make money, and they are an ad spam fest • A plethora of usable data is collected, yet… • No adoption of complex techniques/models, and low adoption of simple ones (PERSONAL SPECULATION!) 79
  • 80. Game Dev & Data Science: right now Low adoption? Disbelief: many tried standard approaches and got meager results. Domain complexity: games are dynamic and highly reactive; that makes them hard to model, and common tools are unsuitable. Integration complexity: models may break control over gameplay (if misused) (PERSONAL SPECULATION!) 80
  • 81. Game Dev & Data Science: right now Low adoption? Disbelief: many tried standard approaches and got meager results. Domain complexity: games are dynamic and highly reactive; that makes them hard to model, and common tools are unsuitable. Integration complexity: models may break control over gameplay (if misused) What’s a good $$$ source? Focusing on the paying few (“whales”) is a good strategy, unless you have good personalization tools: then what about extracting 0.5$ from 50% of your non-paying players? (PERSONAL SPECULATION!) 81
  • 82. What We’ve Learned Today Case 1 “Improving retention via gifts”: 1. Uplift modeling, a better alternative to predicting stuff 2. Decisions are more important than “just prediction” 82
  • 83. What We’ve Learned Today Case 1 “Improving retention via gifts”: 1. Uplift modeling, a better alternative to predicting stuff 2. Decisions are more important than “just prediction” Case 2 “AB testing new mechanics”: 1. Multi-armed bandits, a better alternative for AB 2. Algorithms have assumptions behind them 83
  • 84. What We’ve Learned Today Case 1 “Improving retention via gifts” 1. Uplift modeling, a better alternative to predicting stuff 2. Decisions are more important than “just prediction” Case 2: “AB testing new mechanics”: 1. Multi-armed bandits, a better alternative for AB 2. Algorithms have assumptions behind them Case 3 “Update roll-out analysis”: 1. Bootstrap technique for easy confidence interval 2. Metrics have assumptions too; and may mislead 84
  • 85. Personal Hypotheses ● Misuse of tools + hard domain => failed early experiments => distrust => low adoption ● Mobile games are overfitting to whales 85
  • 86. INCYMO.AI Join us! Free blog on monetization: 86 LinkedIn: Vlad Grozin E-mail: vg@incymo.ai Twitter/ Telegram: @rampeer