Multi-armed bandits are a technique for improving experimentation velocity: instead of assigning users to variations at random, the algorithm actively chooses which variation, or "arm", to show each user, balancing exploration of all options against exploitation of the best one identified so far. In a presentation, Dr. Ilias Flaounas showed that multi-armed bandits can reduce the number of samples needed in A/B tests by 45-80% compared with random assignment. He also discussed an extension, kernelized contextual bandits, which located a target such as the city of Bristol from tweets mentioning it in roughly 300 iterations, far fewer than exhaustively searching all 10,000 candidate locations.
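To illustrate the explore/exploit trade-off described above, here is a minimal sketch of a Bernoulli Thompson-sampling bandit in Python. This is one common bandit strategy, not necessarily the one used in the talk; the class name, the two-arm simulation, and the 5% and 7% conversion rates are illustrative assumptions.

```python
import random


class ThompsonSamplingBandit:
    """Bernoulli Thompson sampling over a set of arms (variations)."""

    def __init__(self, n_arms):
        # Beta(1, 1) prior on each arm's unknown conversion rate
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms

    def select_arm(self):
        # Sample a plausible conversion rate for each arm from its posterior
        # and play the arm with the highest sample; uncertain arms still get
        # explored, while clearly better arms get exploited more often.
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # reward is 1 (e.g. the user converted) or 0
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


if __name__ == "__main__":
    # Toy simulation: two variations with assumed true conversion rates.
    true_rates = [0.05, 0.07]
    bandit = ThompsonSamplingBandit(n_arms=2)
    for _ in range(10_000):
        arm = bandit.select_arm()
        reward = 1 if random.random() < true_rates[arm] else 0
        bandit.update(arm, reward)
    print("pulls per arm:",
          [s + f - 2 for s, f in zip(bandit.successes, bandit.failures)])
```

Running the toy simulation, most traffic ends up routed to the better-performing arm, which is the mechanism by which bandits need fewer samples than a fixed 50/50 split.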