The classical Upper-Confidence Bound policies are known to have some nice optimality
properties in simple bandit models. In more general contexts, however, they appear
quite unsatisfactory, because the estimation procedure they rely
on. In this talk, I will explain how the Empirical Likelihood method may be adapted to
address this issue. A central requirement of the bandit context is to obtain
non-asymptotic confidence bounds, whereas most results on the Empirical Likelihood
method are asymptotic.
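
For illustration (not part of the abstract itself), here is a minimal Python sketch contrasting a classical UCB index with a KL-based index of the kind that arises from Empirical Likelihood confidence bounds, in the simple case of Bernoulli rewards. The function names, the exploration level log(t), and the bisection tolerance are all illustrative assumptions, not a definitive implementation of the method discussed in the talk.

```python
import math

def ucb1_index(mean, n_pulls, t):
    # Classical UCB1 index: empirical mean plus a Hoeffding-type bonus.
    return mean + math.sqrt(2.0 * math.log(t) / n_pulls)

def kl_bernoulli(p, q):
    # Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q),
    # with clipping to avoid log(0) at the boundary.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def kl_ucb_index(mean, n_pulls, t, precision=1e-6):
    # KL-based upper confidence bound for Bernoulli rewards: the largest
    # q >= mean such that n_pulls * KL(mean, q) <= log(t). Since
    # KL(mean, .) is increasing on [mean, 1], bisection finds it.
    level = math.log(t) / n_pulls
    lo, hi = mean, 1.0
    while hi - lo > precision:
        q = (lo + hi) / 2.0
        if kl_bernoulli(mean, q) <= level:
            lo = q
        else:
            hi = q
    return lo
```

By Pinsker's inequality, KL(p, q) >= 2 (p - q)^2, so the KL-based confidence bound is never looser than the Hoeffding-type bound used by UCB1; this is one concrete sense in which the cruder estimation procedure of the classical policies can be improved upon.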