Custom-Made Games
with Machine Learning
and Big Data
África Periáñez, PhD
CEO, Yokozuna Data
GDC2019 San Francisco
@aperianez
Dr. África Periáñez
Founder and CEO of Yokozuna Data
PhD Mathematics (Ensemble Learning) - 2015 University of Reading
MSc String Theory - 2006 CERN
MSc Theoretical Physics - 2003 UAM
BSc Physics - 2001 UAM
2005/2006
2006/2010
2011/2015
2015/...
Founded in 2015, joined Keywords Studios in 2018
to push back the frontiers of General Behavioral Machine Learning
and to revamp the video-game industry: Personalized games
What is
Yokozuna Data?
Mission
To unlock the knowledge of big game databases
To convert unstructured data into actionable information
in order to understand and predict individual player behavior
THE TEAM
Vitor Santos, MA
DESIGN &
BUSINESS DIRECTOR
Álvaro de Benito, MA
PR & COMMUNICATION
LEAD
Nitin Kumar, MSc
FULL-STACK
ENGINEER
África Periáñez, PhD
Founder & CEO
Pooja Revanna, MSc
BACKEND
ENGINEER
Omid Aladini, MSc
DATA INFRASTRUCTURE
ADVISOR
Dexian Tang, MSc
BIG DATA
ENGINEER
Javier Grande, PhD
SCIENTIFIC
EDITOR
Yu-Kai Hung, MSc
COMMUNITY MANAGER
FOR ASIA
Ana Fernández, MSc
SENIOR RESEARCH
DATA SCIENTIST
Pei Pei Chen, MSc
MACHINE LEARNING
ENGINEER LEAD
Anna Guitart, MSc
DATA
SCIENTIST
Jing Li, PhD
MACHINE LEARNING
ENGINEER
Cristian Conteduca, MSc
BACKEND
ENGINEER LEAD
Peng Xiao, BSc
BIG DATA
ENGINEER
Shi Hui Tan, MSc
DATA
SCIENTIST
Yokozuna Data + AWS GameSparks + AI Cloud DevOps
PUSHING BACK
THE FRONTIERS
OF GAME
DATA SCIENCE:
A state-of-the-art
machine learning
engine that predicts
individual player
behavior
ACADEMIA
SCIENTIFIC
RESEARCH
INTERDISCIPLINARITY
START-UP
VIDEO GAME
BUSINESS
CHALLENGES
Highly sophisticated games allow players
to express nuanced emotions through their in-game actions
PLAYER RETENTION
GAME
OPTIMIZATION
GAME
DEVELOPMENT
PERSONALIZATION
Individual
playlists
Personalized
film selection
Individual
search results
Product
recommendations
WHEN WILL PLAYERS
LEAVE THE GAME?
DATE LEVEL PLAYTIME MONEY
WHICH ITEM WILL THEY PURCHASE NEXT?
Billing history
Distribution of item
probabilities
Time to next purchase
08/06 08/15 08/27 09/06
Playtime
GAME
APPLICATIONS
Players who
may stop
purchasing
Upcoming
churners
RECOMMENDS THE BEST
SEQUENCE OF EVENTS TO MAXIMIZE
PLAYER ENGAGEMENT CONSIDERING
EXTERNAL AND ENVIRONMENTAL FACTORS
Personalized
matching
Who should you compete
against in Mario Kart?
Which clan is your best
opponent in Clash of Clans?
PERSONALIZATION
External Personalization
Behavioral Personalization
Item recommendation
system
Action
recommendation
Rewards
and discounts
Engagement and
retention-motivated
actionable
recommendations
PERSONALIZATION
VIDEO GAME
CHALLENGES
Predicting player
behavioral outcomes is key
to the success of game developers
Man is a deterministic
device thrown into a
probabilistic Universe
Amos Tversky
Daniel Kahneman
Nobel prize 2002
NEW
PLAYERS
Video game challenges:
Increasing Retention
Acquiring new users is expensive: in Japan, the average
cost-per-install for gaming apps reached 6.07 USD in 2018
and inefficient: between 75% and 90% of new players churn on the first day,
and only 5% remain after one month
Source: Liftoff and Adjust, 2018
VIP
PLAYERS
Retention of the most valuable players is crucial:
the top 10% of paying users contribute 60% of the revenue
Thousands of titles are published every year and
compete for the same players' time and attention
Video game challenges:
Increasing Retention
Only a small fraction of users make purchases, so
identifying these users and predicting their Customer Lifetime Value is crucial
A key challenge for game developers is to convert
players from non-premium to premium
Actionable Goals:
1) tailor marketing efforts: In-game targeting of advertising and price promotions
2) customize game difficulty, e.g. dynamic difficulty adjustment
3) manage user acquisition campaigns
Video game challenges:
Maximizing the engagement of VIP players
Targeting the right players by customizing game events
and publishing them at the right time
Identify the reasons behind behavioral trends
Find the best acquisition, marketing and game event strategies
Video game challenges:
Optimizing Game Events and Marketing Campaigns
SOLVING VIDEO GAME BUSINESS CHALLENGES
$1.3M/month
sales increase
+10%
VIP Retention
$4M/month
sales increase
+5%
PU Increase
$8M/year
sales increase
Improving
development
of levels with
the highest
churn rate
NUMBERS FOR
AAA MOBILE GAMES
$200K/month
sales increase
$70K/month
sales increase
+5%
PU Increase
$10M/year
sales increase
+10%
VIP Retention
Improving
development
of levels with
the highest
churn rate
NUMBERS FOR
WESTERN CASUAL GAMES
1/ OPERATIONAL PLAYER
BEHAVIORAL PREDICTION
2/ DEEP LEARNING AND
ENSEMBLE LEARNING
3/ BIG DATA
ENSEMBLE LEARNING
Multiple learners are trained to solve
the same problem
Ordinary machine learning approaches
learn one hypothesis from training data
Ensemble methods try to construct a set
of hypotheses and combine them
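The idea of building a set of hypotheses and combining them can be sketched in a few lines. The example below is only an illustration, not the method used in the talk: it assumes bagging (bootstrap resampling) with one-feature threshold classifiers and a majority vote, and the toy data (days since last login, churn label) is hypothetical.

```python
import random
from collections import Counter

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict `polarity` when x >= threshold."""
    threshold, polarity = stump
    return polarity if x >= threshold else 1 - polarity

def train_stump(rows):
    """Learn ONE hypothesis: the threshold rule with the fewest training errors."""
    candidates = [(t, p) for t in sorted({x for x, _ in rows}) for p in (0, 1)]
    return min(candidates,
               key=lambda s: sum(stump_predict(s, x) != y for x, y in rows))

def train_ensemble(rows, n_learners=25, seed=0):
    """Learn a SET of hypotheses, each from a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(rows) for _ in rows])
            for _ in range(n_learners)]

def ensemble_predict(stumps, x):
    """Combine the hypotheses with a majority vote."""
    votes = Counter(stump_predict(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (days since last login, churned? 1 = yes)
rows = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 0),
        (9, 1), (10, 1), (12, 1), (15, 1), (20, 1)]
stumps = train_ensemble(rows)
prediction_active = ensemble_predict(stumps, 1)
prediction_idle = ensemble_predict(stumps, 20)
```

Each stump on its own is a single hypothesis in the sense of the slide; the vote across resampled stumps is the ensemble.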
DEEP LEARNING
[Figure: an artificial neural network (input, output, label; learning) compared with a deep learning model (input, 1st to nth feature layers, output, label, reconstruction error; learning and generalization)]
PLAYER LIFETIME VALUE PREDICTION?
UPP: UPCOMING PURCHASES PREDICTION
Lifetime value (LTV) is an estimate of the amount a player will spend from
today until she exits the game
DATASET
Unknown churn day
Various lifetime
(from 1 day to years)
CURRENT DAY
CONVOLUTIONAL NEURAL NETWORK (CNN, CONVNET)
Convolutional layers Max pooling layers
Training:
Optimize weights and bias
to minimize error.
Downsampling
With filters that cover more than one input, CNNs can learn local
connectivity between inputs.
Proposed by LeCun et al. in 1998 [1], CNNs have been widely applied
to image processing, signal processing, and time series prediction.
[1] LeCun, Yann, et al. "Object recognition with gradient-based learning." Shape, contour and grouping in computer vision. Springer, Berlin, Heidelberg, 1999. 319-345.
Reference:
https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/
https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050
[Figure: CNN architecture. Input (feature time series), convolutional layer (32 filters), pooling layer (subsample), fully connected layers (300, 150 and 60 nodes), output]
- CNNs can learn local connectivity between inputs.
- CNNs are able to learn user behavior directly from the raw time series.
CONVOLUTIONAL NEURAL NETWORKS (CNNs)
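To make "local connectivity" and "downsampling" concrete, here is a minimal sketch of one convolutional filter followed by max pooling on a raw time series. It is an illustration under stated assumptions, not the network from the talk: the filter weights are fixed by hand (a trained CNN would learn them), and the daily playtime values are hypothetical.

```python
def conv1d(series, kernel, bias=0.0):
    """Slide the filter over the series; each output sees only len(kernel) inputs."""
    k = len(kernel)
    return [sum(w * series[i + j] for j, w in enumerate(kernel)) + bias
            for i in range(len(series) - k + 1)]

def relu(values):
    """Non-linearity applied after the convolution."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Downsampling: keep the strongest response in each non-overlapping window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Hypothetical daily playtime (hours) for one player
playtime = [0, 1, 3, 6, 6, 5, 2, 0, 0]
# Hand-picked filter that responds when playtime is rising over a 3-day window
rising_filter = [-1, 0, 1]

feature_map = conv1d(playtime, rising_filter)
pooled = max_pool1d(relu(feature_map), size=2)
```

The feature map is large where playtime is increasing and negative where it is dropping; pooling then summarizes each pair of neighbouring responses. A real model would stack many such learned filters (32 in the figure) before the fully connected layers.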
YOKOZUNA DATA PEER-REVIEWED ARTICLES

CHURN PREDICTION IN MOBILE SOCIAL GAMES: TOWARDS A COMPLETE ASSESSMENT USING SURVIVAL ENSEMBLES
A. Periáñez, A. Saas, A. Guitart and C. Magne
IEEE DSAA 2016 Montreal

DISCOVERING PLAYING PATTERNS: TIME SERIES CLUSTERING OF FREE-TO-PLAY GAME DATA
A. Guitart, A. Periáñez and A. Saas
IEEE CIG 2016 Santorini

GAMES AND BIG DATA: A SCALABLE MULTI-DIMENSIONAL CHURN PREDICTION MODEL
P. Bertens, A. Guitart and A. Periáñez, A. Saas
IEEE CIG 2017 New York

FORECASTING PLAYER BEHAVIORAL DATA AND SIMULATING IN-GAME EVENTS
A. Guitart, P. Chen, P. Bertens and A. Periáñez
IEEE FICC 2018 Singapore

THE WINNING SOLUTION TO THE IEEE CIG 2017 GAME DATA MINING COMPETITION
A. Guitart, P. Chen, A. Periáñez

A MACHINE-LEARNING ITEM RECOMMENDATION SYSTEM FOR VIDEO GAMES
P. Chen, A. Guitart, P. Bertens, A. Periáñez
IEEE CIG 2018 Maastricht

CUSTOMER LIFETIME VALUE IN VIDEO GAMES USING DEEP LEARNING AND PARAMETRIC MODELS
P. Chen, A. Guitart, A. Fernández del Río and A. Periáñez
IEEE Big Data 2018 Seattle

THE WINNING
SOLUTION TO
THE IEEE CIG 2017
GAME
DATA MINING
COMPETITION
Number of registrants in the competition: 264 teams
2 tracks:
- Which players will leave the game
- When they will leave the game
Results

Track 1: Which players will leave the game

Rank  Team                  Test1 Score  Test2 Score  Total Score
1     YokozunaData (Japan)  0.610098     0.63326      0.62145
2     UTU (Finland)         0.60326      0.60370      0.60348
3     TripleS (Korea)       0.57968      0.62459      0.60130
4     TheCowKing            0.59370      0.60718      0.60036
5     goedleio              0.57717      0.56205      0.58882

Track 2: When they will leave the game

Rank  Team                  Test1 Score  Test2 Score  Total Score
1     YokozunaData (Japan)  0.883248     0.616499     0.726151
2     IISLABSKKU            1.034321     0.679214     0.819972
3     UTU (Finland)         0.927712     0.898471     0.912857
4     TripleS (Korea)       0.958308     0.891106     0.923486
5     DTND                  1.032688     0.930417     0.978888

CONDITIONAL INFERENCE SURVIVAL ENSEMBLES
TWO-STEP ALGORITHM:
1) The optimal split variable is selected by testing the
association between the covariates and the response
2) The optimal split point is determined by
comparing two-sample linear statistics for all
possible partitions of the split variable
RANDOM SURVIVAL FOREST [4]
RSF is based on the original random forest algorithm [5]
RSF favors variables with many possible
split points over variables with fewer
[4] Ishwaran H. et al., 2008. Random Survival Forests.
[5] Breiman L., 2001. Random Forests.
CENSORED DATA PROBLEM RESULTS
Predicted survival curves as a function of
playtime (hours), level and days of existing and new players
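The censored-data problem arises because players who are still active have no observed churn day, yet they still carry information: they survived at least until today. The sketch below illustrates how a survival curve can account for that. It assumes the classical Kaplan-Meier estimator, which is a much simpler technique than the conditional inference survival ensembles named in the talk, and the player data is hypothetical.

```python
def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at every time t where at least one churn was observed.

    durations: days (or hours, or levels) each player has been observed
    churned:   1 if the player left at that time, 0 if still active (censored)
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)   # players still being observed just before time t
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        churns = leaving = 0
        # Group all players whose observation ends at the same time t
        while i < len(events) and events[i][0] == t:
            churns += events[i][1]
            leaving += 1
            i += 1
        if churns:
            survival *= 1 - churns / at_risk
            curve.append((t, survival))
        # Censored players drop out of the risk set without counting as churn
        at_risk -= leaving
    return curve

# Hypothetical players: observed days and whether churn was seen
durations = [2, 3, 3, 5, 8, 8]
churned = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(durations, churned)
```

Dropping the censored players, or treating them as churners, would bias the curve downwards; keeping them in the risk set until their last observed day is what the survival formulation adds.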
ITEM 1
ITEM 2
ITEM 3
SAMPLING TO HANDLE MULTI-LABEL OUTPUTS
Players make multiple purchases
- Multiple prediction targets
Subsample until time t and find next purchase after t
- Single label to train on
- Enable multiple subsamples to enlarge training set
- Reduce overfitting
ITEM RECOMMENDATION
MODEL
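The sampling scheme above can be sketched as follows. This is a minimal illustration, assuming each player's purchases are available as a time-ordered list; the function name, the item names and the choice of cutting at a random purchase index (equivalent to picking a time t between two purchases) are assumptions, not details from the talk.

```python
import random

def make_training_examples(purchases, n_samples, seed=0):
    """Turn one player's purchase history into (history, next_item) pairs.

    purchases: time-ordered list of (time, item) with at least two entries
    Each sample cuts the history at a random point t, keeps everything
    up to t as input, and uses the first purchase after t as the single label.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n_samples):
        # Keep at least one purchase before the cut and one after it
        cut = rng.randint(1, len(purchases) - 1)
        history = [item for _, item in purchases[:cut]]
        next_item = purchases[cut][1]
        examples.append((history, next_item))
    return examples

# Hypothetical purchase log for one player: (day, item)
purchases = [(1, "sword"), (4, "potion"), (9, "shield"), (15, "potion")]
examples = make_training_examples(purchases, n_samples=5)
```

Every example has exactly one label, so a standard multi-class classifier can be trained on it, and drawing several cuts per player enlarges the training set as the slide describes.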
GAMES WITH LARGE NUMBER OF ITEMS
High-dimensional
item space
Dimensionality reduction
to train ML models
Operational in-game item
recommendation to individual
players
MACHINE LEARNING MODELS
AN OPERATIONAL PREDICTION SYSTEM
BIG DATA ENGINEERING INFRASTRUCTURE
The Problem
The Solution
Supporting data from thousands of games
and millions of Monthly Active Users (MAU)
A CLOUD DISTRIBUTED-SYSTEM DESIGN FOR:
- Data upload
- Databases and storage
- Parallel computing for data processing
and machine learning execution
SCALING TO INFINITY
[Figure: pipeline diagram with three stages (data upload, compute, results). Labelled components: upload servers/storage, compute instances, cloud storage, computing servers, web servers, database cluster, Kubernetes cluster]
YOKOZUNA DATA ARCHITECTURE ON AWS
[Figure: architecture diagram. Region: Asia Pacific (Tokyo), inside a VPC]
- Client
- Web app service: web server + engine (m5.4xlarge)
- DB cluster: DB instance (db.r4.2xlarge), x3 nodes
- Cassandra service: Cassandra servers + engine (m5.2xlarge)
- S3: storage and storage backup
SUMMARY (1)
The biggest challenge is to retain users
with very diverse tastes, desires and motivations
- Profile players at the individual level and direct them
towards the activities that are more likely to interest them
- Tailor game events, pricing and promotions
- Identify VIP players sooner to provide a premium service
SUMMARY (2)
Yokozuna Data provides accurate predictions of:
- Individual-user actions
- Personalized promotions and rewards (recommendations)
- Moment and level at which players will leave the game
- Money they will spend in the game and their potential
CONTACT
aperianez@yokozunadata.com
linkedin.com/in/africaperianez
@yokozunadata
www.yokozunadata.com
THANK YOU! :)
