Non-Asymptotic Sequential Tests for Overlapping Hypotheses and application to near optimal arm identification in bandit models

Date: 
May, 2019
Hal Number: 
02123833
Abstract: 
In this paper, we study sequential testing problems with overlapping hypotheses. We first focus on the simple problem of assessing if the mean µ of a Gaussian distribution is ≥ε−or≤ε; if µ∈(−ε,ε), both answers are considered to be correct. Then, we consider PAC-best arm identification in a bandit model: given K probability distributions on R with means µ1,...,µK , we derive the asymptotic complexity of identifying, with risk at most δ, an index I∈1,...,K such that µ_I≥max_i µ_i−ε. We provide non asymptotic bounds on the error of a parallel General Likelihood Ratio Test, which can also be used for more general testing problems. We further propose lower bound on the number of observation needed to identify a correct hypothesis. Those lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form, and a low-level form) whose relative merits are discussed.

Keywords: