Parametric Bandits: The Generalized Linear Case

ConferenceName:

Neural Information Processing Systems

url:

NIPS*2010

Date:

December, 2010

Place:

Vancouver, Canada

Authors:

Abstract:

We consider structured multi-armed bandit problems based on the Generalized Linear Model (GLM) framework of statistics. For these bandits, we propose a new algorithm, called GLM-UCB. We derive finite time, high probability bounds on the regret of the algorithm, extending previous analyses developed for the linear bandits to the non-linear case. The analysis highlights a key difficulty in generalizing linear bandit algorithms to the non-linear case, which is solved in GLM-UCB by focusing on the reward space rather than on the parameter space. Moreover, as the actual effectiveness of current parameterized bandit algorithms is often poor in practice, we provide a tuning method based on asymptotic arguments, which leads to significantly better practical performance. We present two numerical experiments on real-world data that illustrate the potential of the GLM-UCB approach.

Direct link:

Proceedings published as Advances in Neural Information Processing Systems 23 (with supplementary material)
