This document presents an ensemble contextual bandit approach to personalized recommendation. It targets the cold-start problem, in which too little data is available to validate individual recommendation models. Each model is treated as an arm of a contextual multi-armed bandit, with user features serving as the context. Thompson sampling allocates recommendation opportunities among the models, so models with higher estimated click-through rates are selected more often. The resulting ensemble can perform close to the best individual model without requiring the models to be evaluated separately during the cold-start period.
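
The allocation idea can be illustrated with a minimal Python sketch. It rests on several assumptions that the summary does not confirm: clicks are modeled as Bernoulli outcomes with a Beta(1, 1) prior per model, and user features are reduced to a discrete context bucket (a simplification standing in for whatever contextual model the original work uses). The names `EnsembleThompsonSampler`, `select_model`, `update`, and `simulate`, as well as the click-through rates in the usage note, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


class EnsembleThompsonSampler:
    """Beta-Bernoulli Thompson sampling over recommendation models.

    Each (context bucket, model) pair keeps its own Beta posterior over
    the click-through rate. Bucketing the context is an assumption made
    for this sketch, not something taken from the source.
    """

    def __init__(self, model_names, seed=None):
        self.model_names = list(model_names)
        self.rng = random.Random(seed)
        # (context_bucket, model_name) -> [alpha, beta]; Beta(1, 1) prior.
        self.params = defaultdict(lambda: [1.0, 1.0])

    def select_model(self, context_bucket):
        """Draw one CTR sample per model and return the model with the largest draw."""
        samples = {}
        for name in self.model_names:
            alpha, beta = self.params[(context_bucket, name)]
            samples[name] = self.rng.betavariate(alpha, beta)
        return max(samples, key=samples.get)

    def update(self, context_bucket, model_name, clicked):
        """Record one impression: a click raises alpha, a non-click raises beta."""
        posterior = self.params[(context_bucket, model_name)]
        if clicked:
            posterior[0] += 1.0
        else:
            posterior[1] += 1.0


def simulate(true_ctr, rounds=5000, seed=0, context_bucket="new_user"):
    """Run a toy simulation with made-up CTRs; return how often each model was chosen."""
    sampler = EnsembleThompsonSampler(true_ctr.keys(), seed=seed)
    env = random.Random(seed + 1)  # separate generator for simulated clicks
    counts = {name: 0 for name in true_ctr}
    for _ in range(rounds):
        chosen = sampler.select_model(context_bucket)
        clicked = env.random() < true_ctr[chosen]
        sampler.update(context_bucket, chosen, clicked)
        counts[chosen] += 1
    return counts
```

Calling, for example, `simulate({"model_a": 0.05, "model_b": 0.20})` with these invented rates should show most recommendation opportunities going to `model_b` as its posterior sharpens, while `model_a` still receives occasional exploratory traffic. No per-model offline evaluation is needed before serving, which is the property the summary highlights for the cold-start period.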