A Multiple-play Bandit Algorithm Applied to Recommender Systems

Edition Number: 28
Date: May 2015
Place: Hollywood, Florida, USA
PageStart: 67
PageEnd: 72
Abstract: 

For several web tasks, such as ad placement or e-commerce, recommender systems must recommend multiple items to their users---such problems can be modeled as bandits with multiple plays. State-of-the-art methods require running as many single-play bandit algorithms as there are items to recommend. In contrast, recent theoretical work in the machine learning literature has designed new algorithms that address the multiple-play case directly, and these algorithms were proved to have strong theoretical guarantees. In this paper, we compare one such multiple-play algorithm with previous methods. On two real-world datasets, we show that the multiple-play algorithm converges to equivalent values but learns about three times faster than state-of-the-art methods. We also show that carefully adapting these earlier methods can improve their performance.
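To make the multiple-play setting concrete, the sketch below shows a generic top-L UCB strategy: instead of running L independent single-play bandit instances, a single learner maintains one set of statistics and plays the L arms with the highest upper-confidence indices each round. This is only an illustrative sketch of the multiple-play idea under standard UCB assumptions, not the specific algorithm evaluated in the paper; all function and variable names here are our own.

```python
import math

def multiple_play_ucb(reward_fn, n_arms, n_plays, horizon):
    """Illustrative multiple-play UCB sketch (not the paper's algorithm).

    Each round, the n_plays arms with the highest UCB indices are
    played jointly, so one learner shares statistics across all slots.
    """
    counts = [0] * n_arms     # times each arm has been played
    sums = [0.0] * n_arms     # cumulative reward per arm
    history = []

    for t in range(1, horizon + 1):
        def index(a):
            # Unplayed arms get +inf so every arm is tried at least once.
            if counts[a] == 0:
                return float("inf")
            mean = sums[a] / counts[a]
            return mean + math.sqrt(2.0 * math.log(t) / counts[a])

        # Select the n_plays arms with the largest indices (multiple plays).
        chosen = sorted(range(n_arms), key=index, reverse=True)[:n_plays]
        for a in chosen:
            counts[a] += 1
            sums[a] += reward_fn(a)
        history.append(chosen)

    return history, counts
```

With deterministic per-arm rewards (e.g. `[0.1, 0.2, 0.9, 0.8]` and `n_plays=2`), the learner quickly concentrates its plays on the two best arms while still exploring the others at a logarithmic rate.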