Algorithmes de bandit pour les systèmes de recommandation : le cas de multiples recommandations simultanées

Edition Number: 
12
Date: 
March, 2015
Place: 
Paris, France
PageStart: 
73
PageEnd: 
88
Abstract: 

The multiple-play recommender systems (RS) are RS which recommend several items to the users. RS are based on learning models in order to choose the items to recommend. Among these models, the bandit algorithms offer the advantage to learn and exploite the learnt elements at the same time. Current approaches require running as many instances of a bandit algorithm as there are items to recommend. As opposed to that, we handle all recommendations simultaneously, by a single instance of a bandit algorithm. We show on two benchmark datasets (Movielens and Jester) that our method, MPB (Multiple Play Bandit), obtains a learning rate about thirteen times faster while obtaining equivalent click-through rates. We also show that the choice of the bandit algorithm used impacts the level of improvement.