Multi-armed bandits are a technique for improving experimentation velocity: instead of assigning users to variations at random, the algorithm actively chooses which variation, or "arm", to show each user, balancing exploration of all options against exploitation of the best one identified so far. In a presentation, Dr. Ilias Flaounas showed that multi-armed bandits can reduce the number of samples needed in A/B tests by 45-80% compared with random assignment. He also discussed an extension, kernelized contextual bandits, which located a target such as the city of Bristol from tweets mentioning it in roughly 300 iterations, far fewer than exhaustively searching all 10,000 candidate locations.
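To illustrate the explore/exploit trade-off described above, here is a minimal sketch of a Bernoulli Thompson-sampling bandit in Python. This is one common bandit strategy, not necessarily the one used in the talk; the class name, the two-arm simulation, and the 5% and 7% conversion rates are illustrative assumptions.

```python
import random


class ThompsonSamplingBandit:
    """Bernoulli Thompson sampling over a set of arms (variations)."""

    def __init__(self, n_arms):
        # Beta(1, 1) prior on each arm's unknown conversion rate
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms

    def select_arm(self):
        # Sample a plausible conversion rate for each arm from its posterior
        # and play the arm with the highest sample; uncertain arms still get
        # explored, while clearly better arms get exploited more often.
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # reward is 1 (e.g. the user converted) or 0
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


if __name__ == "__main__":
    # Toy simulation: two variations with assumed true conversion rates.
    true_rates = [0.05, 0.07]
    bandit = ThompsonSamplingBandit(n_arms=2)
    for _ in range(10_000):
        arm = bandit.select_arm()
        reward = 1 if random.random() < true_rates[arm] else 0
        bandit.update(arm, reward)
    print("pulls per arm:",
          [s + f - 2 for s, f in zip(bandit.successes, bandit.failures)])
```

Running the toy simulation, most traffic ends up routed to the better-performing arm, which is the mechanism by which bandits need fewer samples than a fixed 50/50 split.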