Optimal Best Arm Identification with Fixed Confidence

Submitted by zenno on Mon, 08/22/2016 - 11:33

We provide a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the 'Track-and-Stop' strategy, which we prove to be asymptotically optimal. It combines a new sampling rule, which tracks the optimal proportions of arm draws identified by the lower bound, with a stopping rule named after Chernoff, for which we give a new analysis.
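
To make the stopping side of the strategy concrete, here is a minimal sketch of a Chernoff-style stopping check. It is not the authors' implementation. It assumes Gaussian arms with a known standard deviation `sigma`, so that the pairwise generalized likelihood ratio statistic has a closed form. The function name `chernoff_should_stop` and the threshold `log((log(t) + 1) / delta)` are illustrative assumptions of this sketch. The sampling rule, which requires computing the optimal proportions, is not shown.

```python
import math


def chernoff_should_stop(means, counts, delta, sigma=1.0):
    """Return (stop, best_arm) for a Chernoff-style stopping check.

    Assumptions of this sketch (not taken from the source page):
    - arms are Gaussian with known standard deviation `sigma`;
    - the threshold is log((log(t) + 1) / delta), one illustrative choice.
    """
    k = len(means)
    t = sum(counts)
    # Every arm must have been drawn at least once for the statistic to exist.
    if k < 2 or min(counts) == 0:
        return False, None

    # Empirical best arm.
    best = max(range(k), key=lambda a: means[a])

    # Gaussian closed form of the pairwise statistic between the empirical
    # best arm and each challenger b:
    #   Z = N_best * N_b / (N_best + N_b) * (mean_best - mean_b)^2 / (2 sigma^2)
    z_min = min(
        counts[best] * counts[b] / (counts[best] + counts[b])
        * (means[best] - means[b]) ** 2 / (2.0 * sigma ** 2)
        for b in range(k)
        if b != best
    )

    threshold = math.log((math.log(t) + 1.0) / delta)
    return z_min > threshold, best


if __name__ == "__main__":
    # Well-separated arms with many samples: the check recommends stopping.
    print(chernoff_should_stop([1.0, 0.0], [100, 100], delta=0.05))
    # Same empirical means with very few samples: keep sampling.
    print(chernoff_should_stop([1.0, 0.0], [2, 2], delta=0.05))
```

The check stops only when the empirical best arm beats every challenger by a statistically significant margin, which is why the minimum over challengers is compared with the threshold.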

Bibliographic reference:

Conference on Learning Theory, no. 29, Jun. 2016, arXiv:1602.04589, hal-01273838

Authors:

Aurélien Garivier, Emilie Kaufmann
