Applied Data Science for monetization: pitfalls, common misconceptions, and novel approaches / Vladislav Grozin (INCYMO.AI)

Applied Data Science
for monetization
pitfalls, common misconceptions, and novel approaches
Vladislav Grozin
Co-founder & Head of DS @ INCYMO.AI
Ph.D. @ Colorado University Boulder (USA)

Solve problems in:
- Improving retention via gifts
- AB testing new mechanics
- Update roll-out analysis
Learn Today
- how and why common solutions fail
- how to do better: 3 tools you can use today!
NB: NO MATH!
Just some (criminally) underused methods (or tricks?... algorithms, maybe?..)

About Me
7 years - DS products for e-commerce and other online services
Academic background:
- 5 published papers;
- interest in new and applicable approaches
… and amateur game developer (in the past)!
Now switched to providing DS solutions for games, and work @ INCYMO.AI

Case 1
Improving retention via gifts

Case 1 / Setting
Imagine:
• want: prevent players from churning
• idea: we can do something (e.g. send gifts) to retain them
But which players should receive these gifts to maximize profits?
Typical solution:
• build a churn prediction model to estimate the probability that a player will leave

Case 1 / Pitfalls
Questions:
• What are we actually predicting?
• That the player won’t return tomorrow? In a week? In a month?..
• What then?
• Send to players with p > 90%?.. Or with 75% < p < 80%?..
Hard to answer precisely!
Let’s handwave and say “it just works™”

Case 1 / Pitfalls
Semi-realistic assumptions:
1. Let’s say some users like freebie items, but churn a lot
2. We correctly identify them and send them gift items
3. It works, and they are retained. That’s recorded in the data.

Case 1 / Pitfalls
aaand after some time…
1. The model is updated using the new dataset
2. It observes that users with these traits do not churn
3. Now the model assigns them a low churn probability
4. …and we don’t send them items anymore
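The loop described above can be reproduced with a toy simulation. This is only a sketch: the churn rates, the 50% sending threshold, and the "retraining" rule (the next model simply learns the churn rate it last observed) are all made-up assumptions, not numbers from the talk.

```python
# Hypothetical numbers for one segment of "freebie lovers".
CHURN_WITHOUT_GIFT = 0.9   # assumed churn rate when no gift is sent
CHURN_WITH_GIFT = 0.3      # assumed churn rate when a gift is sent
SEND_THRESHOLD = 0.5       # assumed rule: send a gift if predicted churn > 50%


def simulate_retraining(rounds):
    """Return (predicted churn, gift sent?, observed churn) for each retraining round."""
    history = []
    predicted = CHURN_WITHOUT_GIFT  # the first model is trained on data with no gifts
    for _ in range(rounds):
        send_gift = predicted > SEND_THRESHOLD
        observed = CHURN_WITH_GIFT if send_gift else CHURN_WITHOUT_GIFT
        history.append((predicted, send_gift, observed))
        # "Retraining": the next model learns the churn rate it just observed,
        # without knowing that the gift is what caused it.
        predicted = observed
    return history


history = simulate_retraining(4)
for round_no, (predicted, send_gift, observed) in enumerate(history, start=1):
    print(f"round {round_no}: predicted={predicted:.0%} gift={send_gift} observed={observed:.0%}")
```

Under these assumptions the policy flips every round: gifts are sent, churn drops, the retrained model stops sending gifts, churn returns.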

Case 1 / Problem Core
There is a disconnect:
1. We want to improve retention
2. Collect behavior data
3. Predict churn probability
4. ????
5. Decide which players should receive freebie items
6. Enjoy retention
What’s missing?
● We’d like to send gifts to the most susceptible users
● … that’s not measured by churn probability
● we need the change in probability
There is a gap between “predictions” and “decision making”

Case 1 / Solution / Uplift Models
• Estimate the change in probabilities after performing an action
• Plenty of implementations (scikit-uplift, pylift, …)
*“p(churn)” is not quite the same as “p(churn | didn’t get a gift)”; ignoring the details for now
Player | Churn% | ↓Churn% if we send gift | Prediction-based decision | Uplift-based decision
A 95% 5% Send
B 90% 10% Send
C 85% 30% Send
20% Send
5%
1%
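The simplest way to estimate a "change in probability" is to compare churn rates with and without a gift inside each player segment. The sketch below assumes gifts were handed out at random in a past experiment; the segment names and counts are hypothetical. It does not use scikit-uplift or pylift, whose models do this per player rather than per segment.

```python
from collections import defaultdict


def make_log():
    """Build a hypothetical experiment log of (segment, got_gift, churned) rows."""
    # segment, got_gift, number of players, number who churned (all made up)
    counts = [
        ("loyal", False, 100, 5), ("loyal", True, 100, 4),
        ("freebie_lovers", False, 100, 85), ("freebie_lovers", True, 100, 55),
        ("lost_causes", False, 100, 95), ("lost_causes", True, 100, 90),
    ]
    rows = []
    for segment, got_gift, players, churned in counts:
        rows += [(segment, got_gift, True)] * churned
        rows += [(segment, got_gift, False)] * (players - churned)
    return rows


def churn_uplift_by_segment(log):
    """Uplift = churn rate without a gift minus churn rate with a gift, per segment."""
    totals = defaultdict(lambda: [0, 0])  # (segment, got_gift) -> [players, churned]
    for segment, got_gift, churned in log:
        totals[(segment, got_gift)][0] += 1
        totals[(segment, got_gift)][1] += int(churned)
    uplift = {}
    for segment in {seg for seg, _ in totals}:
        n_ctrl, churn_ctrl = totals[(segment, False)]
        n_gift, churn_gift = totals[(segment, True)]
        uplift[segment] = churn_ctrl / n_ctrl - churn_gift / n_gift
    return uplift


uplift = churn_uplift_by_segment(make_log())
ranking = sorted(uplift, key=uplift.get, reverse=True)
print(ranking)
```

In this made-up data a churn-prediction rule would target "lost_causes" first (95% churn), while the uplift ranking puts "freebie_lovers" first, because that is where the gift changes the outcome most.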

Case 1 / Conclusion
Uplift models:
• estimate gain from an action
• easy to use: ready-made implementations exist (scikit-uplift, pylift, …)
• more suitable than prediction-based decision-making
No one really needs predictions.
Applied Data Science is actually about data-guided automated decision-making

Case 2
AB testing new mechanics

Case 2 / Setting
Imagine:
• New players underperform in $$$
• We designed two “Welcome” offer pop-ups
• We want to pick the best and fine-tune parameters
Typical solution:
• Run AB test: “Welcome1” vs “Welcome2” vs baseline
• Allocate subgroups with different parameter values to pick the best

Case 2 / Pitfalls
• How should we run the test? When to stop?
• Should we stop before the planned end?
• We are deliberately losing money during the test
• Will the chosen “best” weights remain “best” forever?

Case 2 / A Better Approach
Multi-armed bandits!
A well-known algorithm
Directly makes “decisions” from a set
by searching for the best “decision”,
where the best one is the one that gives the most “value”.
Has been well-tested and applied widely in e-commerce, marketing, etc.

Case 2 / Multi-armed Bandits / Setting
Imagine: you enter a casino and see X slot machines
- On using one (pulling its “arm”), it may give you some $$
- The chance and amount of $$ are random
- No prior knowledge about the distributions, but you “see” all machines
- Machines have different distributions that do not change
- You’d like to maximize total $$$
- Pulls are free
- Goal: minimize the cumulative regret metric

Setting (last slide recap)
● X slot machines
● Using one => maybe $$$
○ Random chance and amount
○ Each machine has a different distribution
● Maximize total $$$ per pull
Example: 3 machines
$$$ distribution (unknown to you)
A (50%: 5$)
B (10%: 10..20$)
C (1%: 1000$)
You pulled each arm twice:
A(0$, 5$)
B(0$, 15$)
C(0$, 0$)
Your next pull?
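A quick calculation shows why the question is tricky. The sketch below compares the true expected payout per pull with what the two observed pulls suggest. The only added assumption is that machine B pays uniformly between 10$ and 20$, i.e. 15$ on average.

```python
from statistics import mean

# True distributions from the example (unknown to the player): (win chance, average payout).
# For B, a uniform payout between 10$ and 20$ is assumed, giving an average of 15$.
true_machines = {"A": (0.50, 5.0), "B": (0.10, 15.0), "C": (0.01, 1000.0)}
observed_pulls = {"A": [0, 5], "B": [0, 15], "C": [0, 0]}

expected_per_pull = {name: chance * payout for name, (chance, payout) in true_machines.items()}
observed_mean = {name: mean(pulls) for name, pulls in observed_pulls.items()}

best_by_truth = max(expected_per_pull, key=expected_per_pull.get)
best_by_sample = max(observed_mean, key=observed_mean.get)
print(expected_per_pull, observed_mean, best_by_truth, best_by_sample)
```

Always pulling the arm with the best observed average would stick with B, although C pays the most in expectation. A bandit algorithm has to keep exploring arms that look bad after only a few pulls.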

Case 2 / Multi-armed Bandits / Algo
Thompson Sampling algorithm for the multi-armed bandit problem:
1. Estimate the $$$ distribution for each “arm”
2. Draw a sample from each arm’s distribution
3. Pull the arm with the largest sample
Application to our problem:
Each arm = offer variant (“Welcome 1”, “Welcome 2”)
Pull = assigning a variant to a player and showing the offer
Reward = future money from that player
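The three steps can be sketched in a few lines for the simplest case. This sketch assumes a binary reward (the player either buys the offer or not) instead of "future money", and models each variant's purchase rate with a Beta distribution. The purchase rates used in the simulation are hypothetical.

```python
import random


def thompson_sampling(true_purchase_rates, n_players, seed=0):
    """Beta-Bernoulli Thompson Sampling; returns how many players saw each variant."""
    rng = random.Random(seed)
    n_arms = len(true_purchase_rates)
    successes = [0] * n_arms  # players who purchased, per variant
    failures = [0] * n_arms   # players who did not purchase, per variant
    for _ in range(n_players):
        # Steps 1-2: Beta(successes + 1, failures + 1) is the current estimate of
        # each variant's purchase rate; draw one sample from each.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(n_arms)]
        # Step 3: show the variant with the largest sample.
        arm = samples.index(max(samples))
        # Simulated player response (in production: the observed outcome).
        if rng.random() < true_purchase_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [successes[a] + failures[a] for a in range(n_arms)]


# Hypothetical purchase rates for: baseline, "Welcome 1", "Welcome 2".
pulls = thompson_sampling([0.05, 0.10, 0.20], n_players=5000)
print(pulls)
```

Early on all variants get shown; as evidence accumulates, most players are routed to the variant with the highest purchase rate. Handling monetary rewards would need a different reward model than the Beta distribution assumed here.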

Advantages:
- Converges to the best offer variant, and keeps using it
- Does so in a near-optimal number of trials (= very fast)
- Much better than running a time-fixed AB test
- If we leave it running: adapts to player cohort drifts
- Much much much better than re-running AB tests

Case 2 / Conclusion
Multi-armed bandits:
• drop-in replacement for AB tests
• easy to implement; pick the best solution faster
• they make some assumptions about the setting (the provided example meets them)
All models / machine decision-making make assumptions
(you’d better be explicit about that! are you?)
You shouldn’t cross-validate hyperparameters on users using AB tests

Case 3
Update roll-out analysis

Case 3 / Setting
• We released a bunch of updates: a crash patch for iOS, an optimization for Android, and some gameplay changes for monetization
• It was rolled out to half of the players
• A week has passed
• We want to check (and measure) how much more money we are making
Dashboard says: +10% ARPU
Rolling out to 100% of players?..

Case 3 / Pitfalls 1
Dashboard says:
A: (5+3+5+4+3+5+4) = 29$
B: (4+5+3+5+4+5+5) = 32$
Lift = B / A
= 32$ / 29$
= +10% ARPU
Good!
Rolling out?..

Case 3 / Pitfalls 1 / ARPU Metric
● You can’t just add up ARPU across days!
Sum(ARPU per day) != Total ARPU for these days
● Let’s look at an example (“-” means “no visit”)
Player | Day1 | Day2 | Day3
A | $1 | - | -
B | $0 | $0 | $1
C | - | $1 | $0
D | - | - | $1
E | - | - | $1
Daily $ | $1.00 | $1.00 | $3.00
Daily ARPU | $0.50 | $0.20 | $0.60
ARPU (if we sum dailies): $1.30
Total $: $5.00
Players: 5
ARPU (the right one): $1.00
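The non-additivity can be checked directly on the example's purchase table. This sketch assumes daily ARPU means that day's revenue divided by the players who visited that day. With that denominator the daily values come out as $0.50, $0.50 and $0.75, which differs from the daily figures printed on the slide. The conclusion is the same either way: the sum of daily ARPUs overstates the ARPU for the period.

```python
# Purchases per player per day from the example; None means "no visit".
purchases = {
    "A": [1, None, None],
    "B": [0, 0, 1],
    "C": [None, 1, 0],
    "D": [None, None, 1],
    "E": [None, None, 1],
}

daily_arpu = []
for day in range(3):
    visits = [days[day] for days in purchases.values() if days[day] is not None]
    daily_arpu.append(sum(visits) / len(visits))  # revenue / players active that day

total_revenue = sum(v for days in purchases.values() for v in days if v is not None)
total_arpu = total_revenue / len(purchases)  # revenue / distinct players in the period

print(daily_arpu, sum(daily_arpu), total_arpu)
```

The daily denominators count the same returning player several times, while the period ARPU counts each player once. That is why the two numbers cannot be reconciled by simple addition.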

Case 3 / Pitfalls 2
Recalculated ARPU properly: +5% ARPU
Good now?
No! A single metric doesn’t tell the whole story and may mislead
(imagine: we broke the Android version, and fixed the iOS one)
(there are a lot of ‘wrong things’ with this ‘approach’)
Also, this +5% can be attributed to randomness.
Solution: use confidence intervals.
But some metrics have no formula for confidence intervals…

Case 3 / Bootstrap / Solution
An algorithm that calculates a CI for any per-user metric!
Steps:
1. Calculate X: an array of numbers, one per player; each value = that user’s metric
($$$ for each active player in our case; 0$ if no purchases)
2. Repeat N times:
a) Resample X (as Y): make a new array (same size as X) by picking random numbers from X (independently, with replacement)
b) Compute the average of Y
This gives N different numbers (N averages of resampled X).
3. Compute percentiles of these N averages: (5%, 50%, 95%). They are the (lower CI bound, central estimate, upper CI bound) of your metric, i.e. a 90% interval
Voila!
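The steps above can be sketched as follows. The per-player revenue array is hypothetical, and the function name and the choice of 2000 resamples are illustrative only.

```python
import random


def bootstrap_ci(values, n_resamples=2000, seed=0):
    """Return the (5th, 50th, 95th) percentiles of the bootstrapped mean of `values`."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=len(values))   # step 2a: sample with replacement
        averages.append(sum(resample) / len(resample))  # step 2b: average of the resample
    averages.sort()

    def percentile(q):
        # Step 3: read a percentile off the sorted averages.
        return averages[int(q * (n_resamples - 1))]

    return percentile(0.05), percentile(0.50), percentile(0.95)


# Hypothetical per-player revenue: most players pay nothing, a few pay a lot.
revenue = [0.0] * 90 + [1.0] * 7 + [5.0] * 2 + [50.0]
low, mid, high = bootstrap_ci(revenue)
print(low, mid, high)
```

With a single large payer in the data the interval comes out wide. That width is exactly the information a bare average hides.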

Case 3 / Bootstrap / Solution
That’s ~10 lines of code.
Works with ANY per-user metric!
With that, we can decide whether we should roll out.
Intuition:
Step 2a simulates “AB tests from a parallel universe”; step 2b computes its average metric.
Steps (last slide recap):
1. Calculate X with per-user metrics
2. Repeat N times:
a) Resample X as Y
b) Compute avg(Y)
3. Percentiles (5%, 50%, 95%) of the step 2b results (an array of length N) give the average metric with its CI
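One possible way to turn this into a roll-out decision is to bootstrap the difference between the two player groups and check whether the interval contains zero. The slides do not spell this variant out, so treat it as a sketch. The group data and the function name are hypothetical.

```python
import random


def bootstrap_lift_ci(control, treated, n_resamples=2000, seed=0):
    """Percentile interval (5%, 50%, 95%) for mean(treated) - mean(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        c = rng.choices(control, k=len(control))  # resample each group independently
        t = rng.choices(treated, k=len(treated))
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    diffs.sort()
    return tuple(diffs[int(q * (n_resamples - 1))] for q in (0.05, 0.50, 0.95))


# Hypothetical per-player revenue for the old version (control) and the update (treated).
control = [0.0] * 180 + [2.0] * 15 + [10.0] * 5
treated = [0.0] * 175 + [2.0] * 18 + [10.0] * 7
low, mid, high = bootstrap_lift_ci(control, treated)
# If the interval contains 0, the observed lift may be just noise.
print(low, mid, high, "noise cannot be ruled out" if low <= 0 <= high else "lift looks real")
```

In this made-up data the treated group earns more on average, yet the interval still spans zero, so the apparent lift alone would not justify a full roll-out.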

Case 3 / Conclusion
• Bootstrap algorithm: computes a CI for any per-user metric in ~10 lines of code
• ARPU is not additive (can’t just add two values)
• Using only one metric misleads (and disappoints)
Metrics have assumptions behind them too!
A product decision based on a single number is under-supported

Discussion

Game Dev & Data Science: right now
• Mobile games: hard to make money; and they are an ad spam fest
• A plethora of usable data is collected, yet…
• No adoption of complex, and low adoption of simple, techniques/models
(PERSONAL SPECULATION!)

Low adoption?
Disbelief: many tried standard approaches, with meager results.
Domain complexity: games are dynamic and highly reactive.
- that makes them hard to model; common tools are unsuitable.
Integration complexity: models may break control over gameplay (if misused)
What’s a good $$$ source? Focusing on the paying few (“whales”) is a good strategy, unless you have good personalization tools:
- what about extracting 0.5$ from 50% of your non-paying players?

What We’ve Learned Today
Case 1: “Improving retention via gifts”:
1. Uplift modeling, a better alternative to predicting stuff
2. Decisions are more important than “just prediction”
Case 2: “AB testing new mechanics”:
1. Multi-armed bandits, a better alternative to AB tests
2. Algorithms have assumptions behind them
Case 3: “Update roll-out analysis”:
1. Bootstrap technique for easy confidence intervals
2. Metrics have assumptions too, and may mislead

Personal Hypotheses
● Misuse of tools + hard domain > failed early experiments > distrust > low adoption
● Mobile games are overfitting to whales

INCYMO.AI
Join us!
Free blog on monetization:
LinkedIn: Vlad Grozin
E-mail: vg@incymo.ai
Twitter/ Telegram: @rampeer
