Optimal Best Arm Identification with Fixed Confidence

Edition Number: 
29
Date: 
June, 2016
Place: 
New York, USA
PageStart: 
998
PageEnd: 
1027
Abstract: 

Julia code for the simulations on GitHub
We provide a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the 'Track-and-Stop' strategy, which we prove to be asymptotically optimal. It consists of a new sampling rule (which tracks the optimal proportions of arm draws highlighted by the lower bound) and a stopping rule named after Chernoff, for which we give a new analysis.
COLT presentation on YouTube (with some unfortunate sound recording problems)
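
To make the stopping rule mentioned in the abstract more concrete, here is a minimal Python sketch of a Chernoff-style generalized likelihood ratio test. It is an illustration under stated assumptions, not the authors' implementation: it assumes Gaussian arms with known standard deviation sigma, and the names glr_statistic and chernoff_should_stop are hypothetical. The threshold is left as a caller-supplied parameter, and the sampling rule that tracks the optimal proportions is not shown.

```python
def glr_statistic(mu_a, n_a, mu_b, n_b, sigma=1.0):
    """Gaussian GLR statistic for 'arm a is better than arm b'.

    Assumes known variance sigma**2 and mu_a >= mu_b. It equals
    n_a * d(mu_a, m) + n_b * d(mu_b, m), where m is the pooled mean and
    d(x, y) = (x - y)**2 / (2 * sigma**2) is the Gaussian KL divergence.
    """
    return (n_a * n_b / (n_a + n_b)) * (mu_a - mu_b) ** 2 / (2 * sigma ** 2)


def chernoff_should_stop(means, counts, threshold, sigma=1.0):
    """Return (stop, best): whether to stop, and the empirical best arm.

    means[a]  : empirical mean of arm a
    counts[a] : number of draws of arm a (each must be >= 1)
    threshold : caller-supplied threshold (an assumption of this sketch)
    """
    arms = range(len(means))
    best = max(arms, key=lambda a: means[a])
    # Stop once the empirical best arm beats every other arm by a margin
    # that is statistically large relative to the threshold.
    z = min(
        glr_statistic(means[best], counts[best], means[b], counts[b], sigma)
        for b in arms
        if b != best
    )
    return z > threshold, best
```

For example, with two arms of empirical means 1.0 and 0.0, each drawn 10 times, and sigma = 1, the statistic is (10 * 10 / 20) * 1 / 2 = 2.5, so the sketch stops for any threshold below 2.5 and keeps sampling otherwise.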

Arxiv Number: 
1602.04589
Hal Number: 
01273838