Context Tree Selection: A Unifying View

Date: 
November, 2011
PageStart: 
2 488
PageEnd: 
2 506
Arxiv Number: 
1011.2424
Abstract: 
Context tree models have been introduced by Rissanen as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen’s algorithm Context and penalized maximum likelihood. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning over-estimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006), Galves et al. (2008), Galves and Leonardi (2008), Leonardi (2010), refining asymptotic results of Bühlmann and Wyner (1999) and Csiszár and Talata (2006).
Issue: 
11
Volume: 
121