PRACTICAL BANDITS
FOR BUSINESS
Yan Xu
Houston Machine Learning Meetup
June 22, 2019
OUTLINE
- Recap on Bandit Problem
- A Contextual-Bandit Approach to Personalized News Article Recommendation
http://rob.schapire.net/papers/www10.pdf
- An efficient bandit algorithm for realtime multivariate optimization
https://www.kdd.org/kdd2017/papers/view/an-efficient-bandit-algorithm-for-realtime-multivariate-optimization
MULTI-ARMED BANDITS
DILEMMA: EXPLORATION VS.
EXPLOITATION
The exploration/exploitation trade-off is a dilemma we
frequently face in choosing between options.
Take the same route home, or try a new route?
Choose your favorite restaurant, or the new one?
Listen to your favorite music channel, or try a new artist?
Attend a new meetup?
HOW TO RESOLVE THE
DILEMMA
https://pavlov.tech/2019/03/02/animated-multi-armed-bandit-policies/
Epsilon Greedy
UCB (Upper Confidence Bound)
Thompson Sampling
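A minimal sketch of the three policies listed above, assuming a Bernoulli (click / no-click) bandit. The function names, the `run` simulator, the default `epsilon=0.1`, the UCB1 form of the confidence bonus, and the Beta(1, 1) prior are illustrative assumptions, not taken from the slides.

```python
import math
import random


def epsilon_greedy(counts, sums, t, rng, epsilon=0.1):
    """Explore a random arm with probability epsilon, else exploit the best mean."""
    if 0 in counts or rng.random() < epsilon:
        return rng.randrange(len(counts))
    means = [s / n for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def ucb1(counts, sums, t, rng):
    """Pick the arm with the highest empirical mean plus confidence bonus."""
    for arm, n in enumerate(counts):
        if n == 0:  # play every arm once before trusting the bound
            return arm
    scores = [s / n + math.sqrt(2 * math.log(t) / n)
              for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson(counts, sums, t, rng):
    """Sample each arm's rate from its Beta posterior; play the largest sample."""
    samples = [rng.betavariate(1 + s, 1 + n - s) for s, n in zip(sums, counts)]
    return max(range(len(counts)), key=samples.__getitem__)


def run(policy, true_rates, horizon=2000, seed=0):
    """Simulate a Bernoulli bandit; return (total reward, cumulative regret)."""
    rng = random.Random(seed)
    k = len(true_rates)
    counts, sums = [0] * k, [0] * k
    regret = 0.0
    for t in range(1, horizon + 1):
        arm = policy(counts, sums, t, rng)
        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        sums[arm] += reward
        regret += max(true_rates) - true_rates[arm]
    return sum(sums), regret
```

For example, `run(thompson, [0.2, 0.5, 0.8])` simulates 2000 rounds against three arms with made-up click rates and returns the total reward together with the cumulative regret against always playing the best arm.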
REWARD AND REGRET
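The slide content here did not survive extraction. As a reference point, the two quantities are usually defined as follows. This is standard textbook notation and an assumption about what the slides showed: K arms, \mu_a the expected reward of arm a, a_t the arm played at round t, and r_t the reward observed at round t.

```latex
\begin{align}
\text{Cumulative reward after } T \text{ rounds:} \quad
  & \sum_{t=1}^{T} r_t \\
\text{Expected regret:} \quad
  & R(T) = T\,\mu^{*} - \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{a_t}\right],
  \qquad \mu^{*} = \max_{a \in \{1,\dots,K\}} \mu_a
\end{align}
```

Maximizing expected reward and minimizing expected regret are equivalent goals, since T\,\mu^{*} does not depend on the policy.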
MULTI-ARMED BANDITS
FORMULATION
PRACTICAL BANDITS
APPLICATION
BANDITS FOR PERSONALIZED
RECOMMENDATION
BANDITS FOR NEWS
RECOMMENDATION
CONTEXTUAL BANDITS
[0.1, 0.6]
[0.6, 0.4]
[0.7, 0.1]
[0.4, 0.2]
LINUCB ALGORITHM
LINEAR DISJOINT MODEL
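A minimal sketch of LinUCB with disjoint linear models, following the update rules in the news-recommendation paper from the outline: each arm a keeps A_a (initialized to the identity) and b_a (initialized to zero), scores a context x with theta_a^T x + alpha * sqrt(x^T A_a^{-1} x), and after observing reward r updates A_a += x x^T and b_a += r x. The class name, the pure-Python helpers, and the choice to store A_a^{-1} directly via the Sherman-Morrison identity are assumptions made here to keep the example dependency-free.

```python
import math


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class DisjointLinUCB:
    """LinUCB with one independent ridge-regression model per arm."""

    def __init__(self, n_arms, n_features, alpha=1.0):
        self.alpha = alpha
        d = n_features
        # A_a starts as the identity, so its inverse does too.
        self.A_inv = [[[1.0 if i == j else 0.0 for j in range(d)]
                       for i in range(d)] for _ in range(n_arms)]
        self.b = [[0.0] * d for _ in range(n_arms)]

    def scores(self, contexts):
        """contexts[a] is the feature vector for arm a; returns one UCB per arm."""
        out = []
        for A_inv, b, x in zip(self.A_inv, self.b, contexts):
            theta = matvec(A_inv, b)                     # ridge estimate
            width = math.sqrt(dot(x, matvec(A_inv, x)))  # confidence width
            out.append(dot(theta, x) + self.alpha * width)
        return out

    def select(self, contexts):
        s = self.scores(contexts)
        return max(range(len(s)), key=s.__getitem__)

    def update(self, arm, x, reward):
        # Sherman-Morrison: (A + x x^T)^-1 = A^-1 - (A^-1 x)(A^-1 x)^T / (1 + x^T A^-1 x)
        A_inv = self.A_inv[arm]
        u = matvec(A_inv, x)
        denom = 1.0 + dot(x, u)
        for i in range(len(x)):
            for j in range(len(x)):
                A_inv[i][j] -= u[i] * u[j] / denom
        self.b[arm] = [b_i + reward * x_i for b_i, x_i in zip(self.b[arm], x)]
```

A typical round calls `select` with one feature vector per candidate article, shows the chosen article, then calls `update` with the observed click (1) or no-click (0). Larger `alpha` widens the confidence bonus and so explores more.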
UPPER BOUND ILLUSTRATION
FEATURE FREE VS LINEAR
CONTEXTUAL BANDIT
BANDITS EVALUATION
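The evaluation slides did not survive extraction. The news-recommendation paper from the outline evaluates policies offline by replaying logged events that were collected with uniformly random arm selection: an event counts only when the policy under test picks the same arm that was logged. The sketch below assumes that is the method meant here; the function name, the callback signatures, and the event tuple layout are hypothetical.

```python
def replay_evaluate(policy_select, policy_update, logged_events):
    """Estimate a policy's per-trial reward from uniformly random logged data.

    logged_events yields (context, logged_arm, reward) tuples.
    Returns (average reward over matched events, number of matched events).
    """
    total_reward, matched = 0.0, 0
    for context, logged_arm, reward in logged_events:
        if policy_select(context) != logged_arm:
            continue  # mismatch: drop the event, the policy learns nothing
        policy_update(context, logged_arm, reward)
        total_reward += reward
        matched += 1
    average = total_reward / matched if matched else float("nan")
    return average, matched
```

Because mismatched events are discarded, only about 1/K of the log is used when there are K arms, so the log needs to be much longer than the number of trials one wants to evaluate.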
DEALING WITH HIGH
DIMENSIONALITY
~1000 binary features per user; ~100 binary features per article
RESULT: PERSONALIZED
NEWS
Omniscient: always chooses the article with highest empirical
CONCLUSION
AMAZON: BANDITS FOR
MULTIVARIATE OPTIMIZATION
Published at KDD 2017. (KDD 2019 is in Alaska!)
OPTIMIZING WEB LAYOUT
PROBLEM FORMULATION
STEP 1: PROBIT REGRESSION
STEP 2: THOMPSON
SAMPLING
STEP 3: HILL-CLIMBING TO
DECIDE
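Steps 2 and 3 can be sketched as follows, assuming a layout is a list with one widget choice per slot and the reward model is linear in first-order terms plus pairwise interactions. Everything concrete here is a hypothetical simplification: the weight keys, the independent Gaussian posteriors (the probit-regression update from Step 1 is omitted), and the fixed-order sweep over slots.

```python
import itertools
import random


def sample_weights(mean, std, rng):
    """Thompson step: draw one weight per key from independent Gaussians."""
    return {key: rng.gauss(mean[key], std[key]) for key in mean}


def layout_score(layout, w):
    """Linear score: first-order terms plus pairwise widget interactions."""
    score = sum(w.get((slot, c), 0.0) for slot, c in enumerate(layout))
    for (i, ci), (j, cj) in itertools.combinations(enumerate(layout), 2):
        score += w.get((i, ci, j, cj), 0.0)
    return score


def with_choice(layout, slot, choice):
    """Copy of layout with a single slot changed."""
    new = list(layout)
    new[slot] = choice
    return new


def hill_climb(n_choices, w, rng, max_sweeps=20):
    """Greedy search: optimize one slot at a time with the others held fixed."""
    layout = [rng.randrange(n) for n in n_choices]
    for _ in range(max_sweeps):
        improved = False
        for slot, n in enumerate(n_choices):
            best = max(range(n),
                       key=lambda c: layout_score(with_choice(layout, slot, c), w))
            if layout_score(with_choice(layout, slot, best), w) > layout_score(layout, w):
                layout[slot] = best
                improved = True
        if not improved:  # no single-slot change helps: local optimum
            break
    return layout
```

Each request would draw fresh weights with `sample_weights` and then call `hill_climb` on that draw, so exploration comes from the sampling and the search only has to be cheap. Hill-climbing returns a local optimum, which is why random restarts are a common addition.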
SIMULATION RESULT
In the simulation, the strength of widget interactions is
controlled through alpha_2.
EXPERIMENT ON REAL
TRAFFIC
• After only a single week of online
optimization, we saw a 21%
conversion increase compared to the
median layout
SUMMARY
Contextual bandits
- Linear payoff
- Add interaction components
- UCB: Variance estimation of expected rewards
- Thompson sampling: Sample weights from posterior distribution
Applications
- Recommendation
- Multi-variate optimization
For more details
- A Contextual-Bandit Approach to Personalized News Article Recommendation
http://rob.schapire.net/papers/www10.pdf
- An efficient bandit algorithm for realtime multivariate optimization
https://www.kdd.org/kdd2017/papers/view/an-efficient-bandit-algorithm-for-realtime-multivariate-optimization
