Sequential Learning in
the Position-Based Model
Claire Vernade, Olivier Cappé, Paul Lagrée (Télécom ParisTech)
B. Kveton, S. Katariya, Z. Weng, C. Szepesvári (Adobe Research, U. Alberta)
« Don’t use Bandit Algorithms, they probably don’t work for you. »
- Chris Stucchio
C. Stucchio's blog: https://www.chrisstucchio.com/blog/2015/dont_use_bandits.html
Position-Based Model
1
2
3
4
✓
Xt ⇠ B(1 ⇥ ✓)
Chucklin et al. (2008):
!
Cascade Model,
User Browsing,
DCN,
CCN,
DCM,
…
Multi-Armed Bandit
[Figure: three arms with unobserved expected rewards $\theta_1, \theta_2, \theta_3$ and the estimated empirical averages after a few pulls; values shown on the slide: 0.53, 0.61, 0.42, 0.40, 0.60, 0.55]
Two Bandit Games
1. Website optimization: You are the website manager
2. Ad placement: You want to place the right ad in the right location
[Figure: a page with display positions 1 to 4; labels shown: Balzac, Zola]
Website Optimization
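$A_t = (a_1, a_2, a_3, a_4)$: the four items displayed in positions 1 to 4 (shown as images on the slide), with attractiveness parameters $\theta_1, \theta_2, \theta_3, \theta_4$
$r_t = \kappa_1 \theta_{a_1} + \kappa_2 \theta_{a_2} + \kappa_3 \theta_{a_3} + \kappa_4 \theta_{a_4}$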
Multiple-Plays Bandits in the Position-Based Model. NIPS 2016
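As a minimal sketch of the position-based reward, assuming each position $l$ has an examination probability `kappa[l]` and each item $k$ an attractiveness `theta[k]`, the snippet below computes the expected number of clicks for a displayed list and draws one round of simulated clicks. The function names and all numeric values are hypothetical, chosen only for illustration.

```python
import random

def expected_reward(kappa, theta, action):
    """Expected number of clicks when item action[l] is shown in position l."""
    return sum(kappa[l] * theta[item] for l, item in enumerate(action))

def simulate_clicks(kappa, theta, action, rng):
    """Draw one Bernoulli(kappa[l] * theta[item]) click per displayed position."""
    return [1 if rng.random() < kappa[l] * theta[item] else 0
            for l, item in enumerate(action)]

# Hypothetical parameters: 4 positions, 5 candidate items.
kappa = [1.0, 0.6, 0.3, 0.1]
theta = [0.5, 0.4, 0.3, 0.2, 0.1]

# With decreasing kappa, showing the most attractive items first
# maximizes the expected reward.
best_action = sorted(range(len(theta)), key=lambda k: -theta[k])[:len(kappa)]

rng = random.Random(0)
clicks = simulate_clicks(kappa, theta, best_action, rng)
```

The learner does not know `theta` in advance, which is why the display order has to be learned from the click feedback.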
Website Optimization
The C-KLUCB algorithm
The KL-UCB algorithm for Bounded Stochastic Bandits and Beyond. Cappé, Garivier, COLT 2011
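The slide gives no details of C-KLUCB itself, so the following is only a sketch of the standard KL-UCB index for Bernoulli rewards on which it builds: the largest mean $q$ that is still plausible given the empirical average, found by bisection. The exploration budget `log(t)` is a common simplification assumed here, and the function names are hypothetical.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(mean, pulls, t, iters=50):
    """Largest q in [mean, 1] such that pulls * kl(mean, q) <= log(t)."""
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid  # mid is still plausible, move the lower end up
        else:
            hi = mid  # mid is too optimistic, move the upper end down
    return lo

# Hypothetical usage: an arm with empirical mean 0.5 after 10 pulls, at round 100.
index = klucb_index(0.5, 10, 100)
```

At each round, a KL-UCB-style algorithm plays the arm (or the list of arms) with the highest indices; the index shrinks towards the empirical mean as the arm is pulled more often.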
Website Optimization
Complexity Theorem (Lower Bound on the Regret)
For any uniformly efficient algorithm, the regret is asymptotically bounded
from below:
for $T$ large enough, $R(T) \ge \log(T) \times C(\kappa, \theta)$
[Figure: regret $R(T)$ versus round $t$ (log scale, $10^2$ to $10^4$) for the Lower Bound, C-KLUCB and Ranked-UCB]
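Ad Placement
$$\begin{pmatrix} \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \theta_k \kappa_l & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot \end{pmatrix}$$
Row $k$ carries the parameter $\theta_k$ and column $l$ the parameter $\kappa_l$ (labels on the slide: $\theta_1, \theta_2, \theta_3, \theta_4$ and $\kappa_1, \kappa_2, \kappa_3, \kappa_4$).
$A_t = (k, l)$
$r_t = \theta_k \kappa_l$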
Stochastic Rank-1 Bandits. AISTATS 2017
K × L arms but only K + L parameters!
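To make the parameter count concrete, here is a small sketch that builds the matrix of expected rewards as an outer product of a row vector and a column vector. The values of `theta` and `kappa` are hypothetical.

```python
# Hypothetical parameters: K = 3 rows and L = 4 columns.
theta = [0.8, 0.5, 0.3]
kappa = [0.9, 0.6, 0.4, 0.2]

# Expected reward of arm (k, l) is theta[k] * kappa[l]: a rank-1 matrix.
means = [[t * k for k in kappa] for t in theta]

n_arms = len(theta) * len(kappa)    # K x L = 12 arms
n_params = len(theta) + len(kappa)  # K + L = 7 parameters

# The best arm pairs the best row with the best column.
arms = [(k, l) for k in range(len(theta)) for l in range(len(kappa))]
best_arm = max(arms, key=lambda a: means[a[0]][a[1]])
```

Treating the 12 arms as unrelated would ignore this structure; exploiting it is what lets the regret scale with K + L rather than K × L.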
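Ad Placement
Stochastic Rank-1 Bandits. AISTATS 2017
Complexity Theorem (Lower Bound on the Regret)
For any uniformly efficient algorithm, the regret is asymptotically bounded
from below by
$$\liminf_{T \to \infty} \frac{R(T)}{\log(T)} \ge \sum_{k=2}^{K} \frac{\theta_1 \kappa_1 - \theta_k \kappa_1}{d(\theta_k \kappa_1; \theta_1 \kappa_1)} + \sum_{l=2}^{L} \frac{\theta_1 \kappa_1 - \theta_1 \kappa_l}{d(\theta_1 \kappa_l; \theta_1 \kappa_1)}$$
Which can be rewritten: for any $T$ sufficiently large,
$$R(T) \ge \log(T) \left( C_{row}(\kappa, \theta) + C_{col}(\kappa, \theta) \right)$$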
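As a rough numerical illustration of this lower bound, the sketch below evaluates the two sums, assuming Bernoulli rewards (so that $d$ is the Bernoulli KL divergence), that the first row and first column are strictly the best, and that the missing symbol in the extracted formula is the position parameter $\kappa$. The function names and values are hypothetical.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def rank1_lower_bound_constant(theta, kappa):
    """Sum of gap / KL terms over suboptimal rows (first column) and columns (first row).

    Assumes theta[0] and kappa[0] are strictly the largest entries.
    """
    best = theta[0] * kappa[0]
    c_row = sum((best - theta[k] * kappa[0]) / bernoulli_kl(theta[k] * kappa[0], best)
                for k in range(1, len(theta)))
    c_col = sum((best - theta[0] * kappa[l]) / bernoulli_kl(theta[0] * kappa[l], best)
                for l in range(1, len(kappa)))
    return c_row + c_col

# Hypothetical 2 x 2 instance.
constant = rank1_lower_bound_constant([0.5, 0.25], [1.0, 0.5])
```

Only K + L − 2 terms appear in the constant: the suboptimal rows are compared in the best column and the suboptimal columns in the best row.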
Ad Placement
BM-KLUCB
Idea: alternately explore the rows and the
columns of the matrix using KL-UCB
[Figure: regret $R(T)$ versus round $t$ (log scale, $10^2$ to $10^6$), K = 3, L = 3, for the Lower Bound and R1klucb]
Take-Home Message
‘Real-life’ bandit algorithms are getting real… but not yet.
What comes next for bandit models in recommendation and conversion optimization: stochastic bandits with delays, rank-1 best-arm identification, higher-rank models?
No-free-lunch theorems: exploration comes at a price that depends on the complexity of the problem.
Existing ‘super theoretical’ work on bandits gives us super efficient algorithms in the end…
@vernadec
