List of Publications

Alexandre Marthe, Aurélien Garivier, Claire Vernade

Apr. 2024

ArXiv:2210.05222

hal-03805239

Keywords:

Optimization, Auctions

Motivated by programmatic advertising optimization, we consider the task of sequentially allocating budget across a set of resources. At every time step, a feasible allocation is chosen and only a corresponding random return is observed. The goal is to maximize the cumulative expected sum of returns. This is a realistic model for budget allocation across subdivisions of marketing campaigns, with the objective of maximizing the number of conversions. We study direct search (also known as pattern search) methods for linearly constrained and derivative-free optimization in the presence of noise, which apply in particular to sequential budget allocation. These algorithms, which do not rely on hierarchical partitioning of the resource space, are easy to implement; they respect the operational constraints of resource allocation by avoiding evaluation outside of the feasible domain; and, they are also compatible with warm start by being (approximate) descent algorithms. However, they have not yet been analyzed from the perspective of cumulative regret. We show that direct search methods achieves finite regret in the deterministic and unconstrained case. In the presence of evaluation noise and linear constraints, we propose a simple extension of direct search that achieves a regret upper-bound of the order of T^(2/3). We also propose an accelerated version of the algorithm, relying on repeated sequential testing, that significantly improves the practical behavior of the approach.

Beyond Average Return in Markov Decision Processes

2023

Aug. 2023

arXiv:2310.20266

hal-04264487

available online (discussed here)

Keywords:

Reinforcement learning, MDP, risk-aware optimization

Abstract What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics. We summarize the characterization of these classes for policy evaluation, and give a new answer for the planning problem. Interestingly, we prove that only generalized means can be optimized exactly, even in the more general framework of Distributional Reinforcement Learning (DistRL). DistRL permits, however, to evaluate other functionals approximately. We provide error bounds on the resulting estimators, and discuss the potential of this approach as well as its limitations. These results contribute to advancing the theory of Markov Decision Processes by examining overall characteristics of the return, and particularly risk-conscious strategies.

On Adaptivity in Quantum Testing

Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier

Clément Lalanne, Aurélien Garivier, Rémi Gribonval

Sep. 2023

hal-04107265

available on openreview

Keywords:

sequential statistics, quantum channels

Can adaptive strategies outperform non-adaptive ones for quantum hypothesis selection? We exhibit problems where adaptive strategies provably reduce the number of required samples by a factor four in the worst case, and possibly more when the actual difficulty of the problem makes it possible. In addition, we exhibit specific hypotheses classes for which there is a provable polynomial separation between adaptive and non-adaptive strategies -- a specificity of the quantum framework that does not appear in classical testing.

About the Cost of Global Privacy in Density Estimation

Clément Lalanne, Aurélien Garivier, Rémi Gribonval

Aug. 2023

hal-04133655

available on openreview

Keywords:

Differential Privacy, Non-parametrics

We study non-parametric density estimation for densities in Lipschitz and Sobolev spaces, and under global privacy. In particular, we investigate regimes where the privacy budget is not supposed to be constant. We consider the classical definition of global differential privacy, but also the more recent notion of global concentrated differential privacy. We recover the result of Barber & Duchi (2014) stating that histogram estimators are optimal against Lipschitz distributions for the L2 risk, and under regular differential privacy, and we extend it to other norms and notions of privacy. Then, we investigate higher degrees of smoothness, drawing two conclusions: First, and contrary to what happens with constant privacy budget (Wasserman & Zhou, 2010), there are regimes where imposing privacy degrades the regular minimax risk of estimation on Sobolev densities. Second, so-called projection estimators are near-optimal against the same classes of densities in this new setup with pure differential privacy, but contrary to the constant privacy budget case, it comes at the cost of relaxation. With zero concentrated differential privacy, there is no need for relaxation, and we prove that the estimation is optimal.

Private Statistical Estimation of Many Quantiles

Fortieth International Conference on Machine Learning

Jul. 2023

ArXiv:2302.06943

hal-03986170

Keywords:

Differential Privacy, Inverse Sensibility

This work studies the estimation of many statistical quantiles under differential privacy. More precisely, given a distribution and access to i.i.d. samples from it, we study the estimation of the inverse of its cumulative distribution function (the quantile function) at specific points. For instance, this task is of key importance in private data generation. We present two different approaches. The first one consists in privately estimating the empirical quantiles of the samples and using this result as an estimator of the quantiles of the distribution. In particular, we study the statistical properties of the recently published algorithm introduced by Kaplan et al. 2022 that privately estimates the quantiles recursively. The second approach is to use techniques of density estimation in order to uniformly estimate the quantile function on an interval. In particular, we show that there is a tradeoff between the two methods. When we want to estimate many quantiles, it is better to estimate the density rather than estimating the quantile function at specific points.

Quantum Channel Certification with Incoherent Strategies

Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier

Conference on Learning Theory, Bengalore

n° 36

COLT

Jul. 2023

ArXiv:2303.01188

Keywords:

Quantum Physics, Sequential statistics, Information Theory

In the problem of quantum channel certification, we have black box access to a quantum process and would like to decide if this process matches some predefined specification or is ε-far from this specification. The objective is to achieve this task while minimizing the number of times the black box is used. Here, we focus on optimal incoherent strategies for two relevant extreme cases of channel certification. The first one is when the predefined specification is a unitary channel, e.g., a gate in a quantum circuit. In this case, we show that testing whether the black box is described by a fixed unitary operator in dimension d or ε-far from it in the trace norm requires Θ(d/ε^2) uses of the black box. The second setting we consider is when the predefined specification is a completely depolarizing channel with input dimension din and output dimension dout. In this case, we prove that, in the non-adaptive setting, Θ(d_in^2 d_out^1.5 / ε^2) uses of the channel are necessary and sufficient to verify whether it is equal to the depolarizing channel or ε-far from it in the diamond norm. Finally, we prove a lower bound of Ω(d_in^2d_out/ε^2) for this problem in the adaptive setting. Note that the special case d_in=1 corresponds to the well-studied quantum state certification problem.

On the Statistical Complexity of Estimation and Testing under Privacy Constraints

Clément Lalanne, Aurélien Garivier, Rémi Gribonval

Clément Lalanne, Clément Gastaud, Nicolas Grislain, Aurélien Garivier, Rémi Gribonval

Apr. 2023

ArXiv:2210.02215

hal-03794374

Keywords:

Differential Privacy, Inverse Sensibility

Producing statistics that respect the privacy of the samples while still maintaining their accuracy is an important topic of research. We study minimax lower bounds when the class of estimators is restricted to the differentially private ones. In particular, we show that characterizing the power of a distributional test under differential privacy can be done by solving a transport problem. With specific coupling constructions, this observation allows us to derivate Le Cam-type and Fano-type inequalities for both regular definitions of differential privacy and for divergence-based ones (based on Renyi divergence). We then proceed to illustrate our results on three simple, fully worked out examples. In particular, we show that the problem class has a huge importance on the provable degradation of utility due to privacy. For some problems, privacy leads to a provable degradation only when the rate of the privacy parameters is small enough whereas for other problem, the degradation systematically occurs under much looser hypotheses on the privacy parametters. Finally, we show that the known privacy guarantees of DP-SGLD, a private convex solver, when used to perform maximum likelihood, leads to an algorithm that is near-minimax optimal in both the sample size and the privacy tuning parameters of the problem for a broad class of parametric estimation procedures that includes exponential families.

Private Quantiles Estimation in the Presence of Atoms

accepted

Information and Inference

Jul. 2023

ArXiv:2202.08969

hal-03572701

Keywords:

Differential Privacy, Quantile Estimation, Inverse Sensibility

We address the differentially private estimation of multiple quantiles (MQ) of a dataset, a key building block in modern data analysis. We apply the recent non-smoothed Inverse Sensitivity (IS) mechanism to this specific problem and establish that the resulting method is closely related to the current state-of-the-art, the JointExp algorithm, sharing in particular the same computational complexity and a similar efficiency. However, we demonstrate both theoretically and empirically that (non-smoothed) JointExp suffers from an important lack of performance in the case of peaked distributions, with a potentially catastrophic impact in the presence of atoms. While its smoothed version would allow to leverage the performance guarantees of IS, it remains an open challenge to implement. As a proxy to fix the problem we propose a simple and numerically efficient method called Heuristically Smoothed JointExp (HSJointExp), which is endowed with performance guarantees for a broad class of distributions and achieves results that are orders of magnitude better on problematic datasets.

On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits.

Antoine Barrier, Aurélien Garivier, Gilles Stoltz

The 34th International Conference on Algorithmic Learning Theory

Feb. 2023

ArXiv:2210.00895

hal-03792668

Keywords:

Bandit models, Best-arm identification, fixed budget

We lay the foundations of a non-parametric theory of best-arm identification in multi-armed bandits with a fixed budget T. We consider general, possibly non-parametric, models D for distributions over the arms; an overarching example is the model D = P(0,1) of all probability distributions over [0,1]. We propose upper bounds on the average log-probability of misidentifying the optimal arm based on information-theoretic quantities that correspond to infima over Kullback-Leibler divergences between some distributions in D and a given distribution. This is made possible by a refined analysis of the successive-rejects strategy of Audibert, Bubeck, and Munos (2010). We finally provide lower bounds on the same average log-probability, also in terms of the same new information-theoretic quantities; these lower bounds are larger when the (natural) assumptions on the considered strategies are stronger. All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.

On the complexity of All ε-Best Arms Identification

Aymen Al Marjani, Aurélien Garivier, Tomáš Kocák

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD'22)

Sep. 2022

ArXiv:2202.06280

hal-03570280

Keywords:

Bandit models, Best-arm identification,

We consider the problem introduced by [MJTN20] of identifying all the ε-optimal arms in a finite stochastic multi-armed bandit with Gaussian rewards. In the fixed confidence setting, we give a lower bound on the number of samples required by any algorithm that returns the set of ε-good arms with a failure probability less than some risk level δ. This bound writes as T * ε (µ) log(1/δ), where T * ε (µ) is a characteristic time that depends on the vector of mean rewards µ and the accuracy parameter ε. We also provide an efficient numerical method to solve the convex max-min program that defines the characteristic time. Our method is based on a complete characterization of the alternative bandit instances that the optimal sampling strategy needs to rule out, thus making our bound tighter than the one provided by [MJTN20]. Using this method, we propose a Track-and-Stop algorithm that identifies the set of ε-good arms w.h.p and enjoys asymptotic optimality (when δ goes to zero) in terms of the expected sample complexity. Finally, using numerical simulations, we demonstrate our algorithm's advantage over state-of-the-art methods, even for moderate values of the risk parameter.

Génération de traces cellulaires réalistes

Anne Josiane Kouam, Aline Carneiro Viana, Aurélien Garivier, Alain Tchana

CORES 2022 – 7ème Rencontres Francophones sur la Conception de Protocoles, l’Évaluation de Performance et l’Expérimentation des Réseaux de Communication

May 2022

hal-03658019

Keywords:

CDR réalistes, Réseaux cellulaires, Simulation

Du fait de la confidentialité des données que contiennent les traces cellulaires CDRs (Call Data Records), leur obtention est soumise à des lois strictes qui limitent leur accessibilité aux chercheurs, et leur utilisabilité même lorsqu’ils sont disponibles. Pour pallier à ce manque, dans cet article, nous nous intéressons à la production de traces CDRs réalistes sur la base d’un ensemble de caractérisations de CDRs réels obtenus auprès d’un opérateur.

A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Antoine Barrier, Aurélien Garivier, Tomáš Kocák

25th International Conference on Artificial Intelligence and Statistics (AISTAT'22)

Mar. 2022

ArXiv:2105.12978

hal-03236583

Keywords:

auctions

We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy called Exploration-Biased Sampling is not only asymptotically optimal: we also prove non-asymptotic bounds occurring with high probability. To the best of our knowledge, this is the first strategy with such guarantees. But the main advantage over other algorithms like Track-and-Stop is an improved behavior regarding exploration: Exploration-Biased Sampling is slightly biased in favor of exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are allowed by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe of high independent interest.

A A/B/n Testing with Control in the Presence of Subpopulations

Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter M Koolen

Dec. 2021

ArXiv:2110.15573

hal-03407247

Keywords:

Bandit models, Best-arm identification, A/B testing

Motivated by A/B/n testing applications, we consider a finite set of distributions (called \emph{arms}), one of which is treated as a \emph{control}. We assume that the population is stratified into homogeneous subpopulations. At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation. The quality of each arm is assessed through a weighted combination of its subpopulation means. We propose a strategy for sequentially choosing one arm per time step so as to discover as fast as possible which arms, if any, have higher weighted expectation than the control. This strategy is shown to be asymptotically optimal in the following sense: if is the first time when the strategy ensures that it is able to output the correct answer with probability at least , then grows linearly with at the exact optimal rate. This rate is identified in the paper in three different settings: (1) when the experimenter does not observe the subpopulation information, (2) when the subpopulation of each sample is observed but not chosen, and (3) when the experimenter can select the subpopulation from which each response is sampled. We illustrate the efficiency of the proposed strategy with numerical simulations on synthetic and real data collected from an A/B/n experiment.

Navigating to the Best Policy in Markov Decision Processes

Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

Dec. 2021

ArXiv:2106.02847

hal-03454436

Keywords:

Reinforcement learning, Best policy identification, Markov Decision Processes

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible. We propose an information-theoretic lower bound on the average number of steps required before a correct answer can be given with probability at least . This lower bound involves a non-convex optimization problem, for which we propose a convex relaxation. We further provide an algorithm whose sample complexity matches the relaxed lower bound up to a factor . This algorithm addresses general communicating MDPs; we propose a variant with reduced exploration rate (and hence faster convergence) under an additional ergodicity assumption. This work extends previous results relative to the \emph{generative setting}~\cite{marjani2020adaptive}, where the agent could at each step observe the random outcome of any (state, action) pair. In contrast, we show here how to deal with the \emph{navigation constraints} and comment on the gap between the generative and online settings. Our analysis relies on an ergodic theorem for non-homogeneous Markov chains which we consider of wide interest in the analysis of Markov Decision Processes.

Sequential Algorithms for Testing Closeness of Distributions

Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier

Juliette Achddou, Olivier Cappé, Aurélien Garivier

spotlight

Dec. 2021

ArXiv:2205.06069

hal-03481048

Keywords:

sequential tests

What advantage do \emph{sequential} procedures provide over batch algorithms for testing properties of unknown distributions? Focusing on the problem of testing whether two distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ on $\{1,\dots, n\}$ are equal or $\eps$-far, we give several answers to this question. We show that for a small alphabet size $n$, there is a sequential algorithm that outperforms any batch algorithm by a factor of at least $4$ in terms sample complexity. For a general alphabet size $n$, we give a sequential algorithm that uses no more samples than its batch counterpart, and possibly fewer if the actual distance $\TV(\mathcal{D}_1, \mathcal{D}_2)$ between $\mathcal{D}_1$ and $\mathcal{D}_2$ is larger than $\eps$. As a corollary, letting $\eps$ go to $0$, we obtain a sequential algorithm for testing closeness when no a priori bound on $\TV(\mathcal{D}_1, \mathcal{D}_2)$ is given that has a sample complexity $\tilde{\mathcal{O}}(\frac{n^{2/3}}{\TV(\mathcal{D}_1, \mathcal{D}_2)^{4/3}})$: this improves over the $\tilde{\mathcal{O}}(\frac{n/\log n}{\TV(\mathcal{D}_1, \mathcal{D}_2)^{2} })$ tester of~\citet{daskalakis2017optimal} and is optimal up to multiplicative constants. We also establish limitations of sequential algorithms for the problem of testing closeness: they can improve the worst case number of samples by at most a constant factor.

Fast Rate Learning in Stochastic First Price Bidding

13th Asian Conference on Machine Learning (ACML)

n°13

Nov. 2021

ArXiv:2107.01835

hal-03277164

Keywords:

Bandits, Auctions

First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. As far as learning is concerned, first-price auctions are more challenging because the optimal bidding strategy does not only depend on the value of the item but also requires some knowledge of the other bids. They have already given rise to several works in sequential learning, many of which consider models for which the value of the buyer or the opponents' maximal bid is chosen in an adversarial manner. Even in the simplest settings, this gives rise to algorithms whose regret grows as $\sqrt{T}$ with respect to the time horizon $T$. Focusing on the case where the buyer plays against a stationary stochastic environment, we show how to achieve significantly lower regret: when the opponents' maximal bid distribution is known we provide an algorithm whose regret can be as low as $\log^2(T)$; in the case where the distribution must be learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+ \epsilon}$ regret, for any $\epsilon>0$. To obtain these results, we introduce two novel ideas that can be of interest in their own right. First, by tranposing results obtained in the posted price setting, we provide conditions under which the first-price biding utility is locally quadratic around its optimum. Second, we leverage the observation that, on small sub-intervals, the concentration of the variations of the empirical distribution function may be controlled more accurately than by using the classical Dvoretzky-Kiefer-Wolfowitz inequality. Numerical simulations confirm that our algorithms converge much faster than alternatives proposed in the literature for various bid distributions, including for bids collected on an actual programmatic advertising platform.

Epsilon Best Arm Identification in Spectral Bandits

Tomáš Kocák, Aurélien Garivier

International Joint Conference on Artificial Intelligence (IJCAI)

n°30

Aug. 2021

direct link on ijcai.org

Keywords:

Bandits, Graphs, Best-arm identification

We propose an analysis of PAC identification of an epsilon-best arm in graph bandit models with Gaussian distributions. We consider finite but potentially very large bandit models where the set of arms is endowed with a graph structure, and we assume that the arms' expectations $\bmu$ are smooth with respect to this graph. Our goal is to identify an arm whose expectation is at most $\epsilon$ below the largest of all means. We focus on the fixed-confidence setting: given a risk parameter $\delta$, we consider sequential strategies that yield an epsilon-optimal arm with probability at least 1-delta. All such strategies use at least T*_{R,epsilon}(mu)\log(1/delta) samples, where R is the smoothness parameter. We identify the complexity term T*_{R,epsilon}(mu) as the solution of a min-max problem for which we give a game-theoretic analysis and an approximation procedure. This procedure is the key element required by the asymptotically optimal Track-and-Stop strategy.

Efficient Algorithms for Stochastic Repeated Second-price Auctions

Juliette Achddou, Olivier Cappé, Aurélien Garivier

Algorithmic Learning Theory (ALT)

n°32

Mar. 2021

ArXiv:2011.05072

hal-02997579

Keywords:

auctions

Developing efficient sequential bidding strategies for repeated auctions is an important practical challenge in various marketing tasks. In this setting, the bidding agent obtains information, on both the value of the item at sale and the behavior of the other bidders, only when she wins the auction. Standard bandit theory does not apply to this problem due to the presence of action-dependent censoring. In this work, we consider second-price auctions and propose novel, efficient UCB-like algorithms for this task. These algorithms are analyzed in the stochastic setting, assuming regularity of the distribution of the opponents' bids. We provide regret upper bounds that quantify the improvement over the baseline algorithm proposed in the literature. The improvement is particularly significant in cases when the value of the auctioned item is low, yielding a spectacular reduction in the order of the worst-case regret. We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies. As an alternative, we propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm. We provide a critical analysis of this class of strategies, showing both important advantages and limitations. In particular, we provide a minimax lower bound and propose a nearly minimax-optimal instance of this class.

Self-Concordant Analysis of Generalized Linear Bandits with Forgetting

Yoan Russac, Olivier Cappé, Aurélien Garivier

24th International Conference on Artificial Intelligence and Statistics (AISTAT'21)

Apr. 2021

ArXiv:2011.00819

hal-02984117

Keywords:

Non-stationary Bandits

Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them. In contrast to the case of linear bandits, existing algorithms for GLB have two drawbacks undermining their applicability. First, they rely on excessively pessimistic concentration bounds due to the non-linear nature of the model. Second, they require either non-convex projection steps or burn-in phases to enforce boundedness of the estimators. Both of these issues are worsened when considering non-stationary models, in which the GLB parameter may vary with time. In this work, we focus on self-concordant GLB (which include logistic and Poisson regression) with forgetting achieved either by the use of a sliding window or exponential weights. We propose a novel confidence-based algorithm for the maximum-likehood estimator with forgetting and analyze its perfomance in abruptly changing environments. These results as well as the accompanying numerical simulations highlight the potential of the proposed approach to address non-stationarity in GLB.

Hierarchical and Unsupervised Graph Representation Learning with Loukas's Coarsening

Louis Béthune, Yacouba Kaloga, Pierre Borgnat, Aurélien Garivier, Amaury Habrard

Algorithms 13(9)

Aug. 2020

206 (pp. 1-22)

ArXiv:2007.03373

Direct link, https://doi.org/10.3390/a13090206, Special Issue Efficient Graph Algorithms in Machine Learning

Keywords:

Graph clustering

We propose a novel algorithm for unsupervised graph representation learning with attributed graphs. It combines three advantages addressing some current limitations of the literature: i) The model is inductive: it can embed new graphs without re-training in the presence of new data; ii) The method takes into account both micro-structures and macro-structures by looking at the attributed graphs at different scales; iii) The model is end-to-end differentiable: it is a building block that can be plugged into deep learning pipelines and allows for back-propagation. We show that combining a coarsening method having strong theoretical guarantees with mutual information maximization suffices to produce high quality embeddings. We evaluate them on classification tasks with common benchmarks of the literature. We show that our algorithm is competitive with state of the art among unsupervised graph representation learning methods.

Best Arm Identification in Spectral Bandits

Tomáš Kocák, Aurélien Garivier

International Joint Conference on Artificial Intelligence (IJCAI)

n°29

Jul. 2020

pp.2200-2226

ArXiv:2005.09841

hal-02552568

direct link

Keywords:

Bandits, Graphs, Best-arm identification

We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint. We provide and analyze an efficient gradient ascent algorithm to compute the sample complexity of this problem as a solution of a non-smooth max-min problem (providing in passing a simplified analysis for the unconstrained case). Building on this algorithm, we propose an asymptotically optimal strategy. We furthermore illustrate by numerical experiments both the strategy's efficiency and the impact of the smoothness constraint on the sample complexity. Best Arm Identification (BAI) is an important challenge in many applications ranging from parameter tuning to clinical trials. It is now very well understood in vanilla bandit models, but real-world problems typically involve some dependency between arms that requires more involved models. Assuming a graph structure on the arms is an elegant practical way to encompass this phenomenon, but this had been done so far only for regret minimization. Addressing BAI with graph constraints involves delicate optimization problems for which the present paper offers a solution.

Algorithms for Non-Stationary Generalized Linear Bandits

Yoan Russac, Olivier Cappé, Aurélien Garivier

Jan. 2020

ArXiv:2003.10113

hal-02514151

Keywords:

Andressa Cerqueira, Aurélien Garivier, Florencia Leonardi

The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings. In the example of binary rewards, logistic regression is well-known to be preferable to the use of standard linear modeling. Previous works have shown how to deal with GLMs in contextual online learning with bandit feedback when the environment is assumed to be stationary. In this paper, we relax this latter assumption and propose two upper confidence bound based algorithms that make use of either a sliding window or a discounted maximum-likelihood estimator. We provide theoretical guarantees on the behavior of these algorithms for general context sequences and in the presence of abrupt changes. These results take the form of high probability upper bounds for the dynamic regret that are of order d^2/3 G^1/3 T^2/3 , where d, T and G are respectively the dimension of the unknown parameter, the number of rounds and the number of breakpoints up to time T. The empirical performance of the algorithms is illustrated in simulated environments.

A note on perfect simulation for exponential random graph models

ESAIM: Probability and Statistics

n°24

Mar. 2020

pp 138--147

ArXiv:1710.00873

hal-02502270

freely available online , DOI 10.1051/ps/2019024

Keywords:

Perfect Simulation

In this paper we propose a perfect simulation algorithm for the Exponential Random Graph Model, based on the Coupling From The Past method of Propp & Wilson (1996). We use a Glauber dynamics to... more

Thresholding Bandit for Dose-ranging: The Impact of Monotonicity

Aurélien Garivier, Pierre Ménard, Laurent Rossi

International Conference on Machine Learning, Artificial Intelligence and Applications (ICMA)

(online)

Aug. 2020

pp.38--61

ArXiv:1711.04454

hal-01629479

proceedings available here

Keywords:

Non-Asymptotic Sequential Tests for Overlapping Hypotheses and application to near optimal arm identification in bandit models

We analyze the sample complexity of the thresholding bandit problem, with and without the assumption that the mean values of the arms are increasing. In each case, we provide a lower bound valid... more

Aurélien Garivier, Emilie Kaufmann

Sequential Analysis

vol. 40

Mar. 2021

pp. 61-96, 18th Wald Prize

Keywords:

In this paper, we study sequential testing problems with overlapping hypotheses. We first focus on the simple problem of assessing if the mean µ of a Gaussian distribution is ≥ε−or≤ε; if µ∈(−ε,ε),... more

Julia code for best-arm identification

X-Armed Bandits: Optimizing Quantiles and Other Risks

Léonard Torossian, Aurélien Garivier, Victor Picheny

Asian Conference on Machine Learning

n°11

JMLR: Workshop and Conference Proceedings

101

Nov. 2019

252-267

Keywords:

We propose and analyze StoROO, an algorithm for risk optimization on stochastic black-box functions derived from StoOO. Motivated by risk-averse decision making fields like agriculture, medicine,... more

Conditional quantile sequential estimation for stochastic codes

Tatiana Labopin-Richard, Fabrice Gamboa, Aurélien Garivier, Jérôme Stenger

Journal of Statistical Theory and Practice

Vol. 13 (55)

Dec. 2019 (first version in Feb. 2016)

pp.1-40

ArXiv:1508.06505

hal-01187329

available on the editor's website, DOI

This paper is devoted to the estimation of conditional quantile, more precisely the quantile of the output of a real stochastic code whose inputs are in R d. In this purpose, we introduce a... more

A Review on Quantile Regression for Stochastic Computer Experiments

Léonard Torossian, Victor Picheny, Robert Faivre , Aurélien Garivier

Reliability Engineering and System Safety

201(C)

2020

ArXiv:1901.07874

hal-02010735

sciencedirect link, DOI: 10.1016/j.ress.2020.106858

Keywords:

Machine learning

We report on an empirical study of the main strategies for conditional quantile estimation in the context of stochastic computer experiments. To ensure adequate diversity, six metamodels are... more

L'IA du Quotidien peut elle être Éthique ? : Loyauté des Algorithmes d'Apprentissage Automatique

Philippe Besse, Céline Castets-Renard, Aurélien Garivier, Jean-Michel Loubes

Statistique et Société

vol.

(3)

Dec. 2018

pp.9-31

ArXiv:1810.01729

hal-01886699

En accès libre sur le site de Statistique et Société

Keywords:

Machine learning

Associant données massives (big data) et algorithmes d'apprentissage automatique (machine learning), la puissance des outils de décision automatique suscite autant d'espoir que de craintes... more

Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

Emilie Kaufmann, Wouter M. Koolen, Aurélien Garivier

available on the website of neurips

n°32

Dec. 2018

ArXiv:1806.00973

hal-01804581

Keywords:

Mastane Achab, Stéphan Clémençon, Aurélien Garivier

(poster) Learning the minimum/maximum mean among a finite set of distributions is a fundamental sub-task in planning, game tree search... more

Profitable Bandits

Asian Conference on Machine Learning

n°10

Nov. 2018

ArXiv:1805.02908

Keywords:

Aurélien Garivier, Pierre Ménard, Gilles Stoltz

Originally motivated by default risk management applications, this paper
investigates a novel problem, referred to as the profitable bandit problem
here. At each step, an agent chooses... more

Explore First, Exploit Next: The True Shape of Regret in Bandit Problems

Mathematics of Operations Research vol 44(2), pp.377-399

May 2019

ArXiv:1602.07182

hal-01276324

available on pubsonline.informs https://doi.org/10.1287/moor.2017.0928

Keywords:

Grégoire Jauvion, Nicolas Grislain, Pascal Dkengne Sielenou, Aurélien Garivier, Sébastien Gerchinovitz

We revisit lower bounds on the regret in the case of multi-armed bandit problems. We obtain non-asymptotic, distribution-dependent bounds and provide simple proofs based only on well-known properties... more

Optimization of a SSP's Header Bidding Strategy using Thompson Sampling

Special Interest Group on Knowledge Discovery and Data Mining

n°2 018

Aug. 2018

hal-01822524

Keywords:

Machine learning, UCB, HMM, Bandit Problems

In the last decade digital media: web or app publishers, generalised the use of real-time ad auctions to sell their ad-spaces. Multiple auction platforms, also called Supply Side Platforms (SSP)... more

KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints

Aurélien Garivier, Hédi Hadiji, Pierre Ménard, Gilles Stoltz

Journal of Machine Learning Research

23 (179)

Aug. 2022

pp.1−66

ArXiv:1805.05071

hal-01785705

Keywords:

UCB, Empirical Likelihood, Bandit Problems

In the context of K-armed stochastic bandits with distribution only assumed to be supported by [0,1], we introduce a new algorithm, KL-UCB-switch, and prove that is enjoys simultaneously a... more

Learning the distribution with largest mean: two bandit frameworks

Emilie Kaufmann, Aurélien Garivier

ESAIM: Proceedings and Surveys

vol.

Dec. 2017

pp.114-131

ArXiv:1702.00001

hal-01449822

accessible sur le site d'ESAIM:PS

Keywords:

Machine learning, Recommender systems, UCB, Self-Normalized, Bandit Problems

Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization. This... more

A minimax and asymptotically optimal algorithm for stochastic bandits

Pierre Ménard, Aurélien Garivier

Algorithmic Learning Theory

n°28

Oct. 2017

pp.223-237

ArXiv:1702.07211

hal-01475078

published in Proceedings of Machine Learning Research (PMLR) vol. 76

Keywords:

Mastane Achab, Stéphan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade

We propose the kl-UCB ++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions. We prove that it is simultaneously asymptotically optimal (in the... more

Max K-armed bandits: On the ExtremeHunter algorithm and beyond

European Conference on Machine Learning

n°2 017

Sep. 2017

published in

Keywords:

This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values. Our contribution is twofold.
We... more

Les systèmes de recommandations en ligne

Les Big Data à Découvert, Editions CNRS

Mar. 2017

Keywords:

Big Data, Recommender systems, Bandit Problems

Quel est le point commun entre un site de vente sur Internet, un système de navigation GPS, un essai clinique et un moteur de publicité en ligne ? Petit article de 2 pages pour présenter le domaine... more

Diversified Bandit: the k-diversified Multiple-Play Recommendation Algorithm

Max Chevalier, Aurélien Garivier, Jonathan Louëdec, Josiane Mothe

Mar. 2017

Keywords:

Recommender systems, Bandit Problems

Recommender systems (RS) aim at recommending either one or several items simultaneously : the former type of RS is named single play RS, while the latter is named multiple-play RS. The... more

On Explore-Then-Commit Strategies

Aurélien Garivier, Emilie Kaufmann, Tor Lattimore

Machine learning, Recommender systems, Bandit Problems

n°30

Dec. 2016

ArXiv:1605.08988

hal-01322906

Keywords:

We study the problem of minimising regret in two-armed bandit problems with Gaussian rewards. Our objective is to use this simple setting to illustrate that strategies based on an exploration... more

Algorithme de bandit et obsolescence : un modèle pour la recommandation

Jonathan Louëdec, Laurent Rossi, Max Chevalier, Aurélien Garivier, Josiane Mothe

Conférence Francophone sur l'Apprentissage Automatique

n°18

Jul. 2016

accès direct

Keywords:

Aurélien Garivier, Emilie Kaufmann, Wouter M. Koolen

Un nombre croissant de systèmes numériques font appel à des algortihmes de bandits pour combiner efficacement exploration de l'environnement et exploitation de l'information accumulée. Les modèles... more

Maximin Action Identification: A New Bandit Framework for Games

In: Proceedings of of Machine Learning Research vol. 49

n°29

Jun. 2016

pp.1 028-1 050

ArXiv:1602.04676

hal-01273842

Keywords:

Reinforcement Learning, Bandit Problems

We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search. It consists in identifying the best action in a game, when the player may sample... more

Optimal Best Arm Identification with Fixed Confidence

Aurélien Garivier, Emilie Kaufmann

In: Proceedings of of Machine Learning Research vol. 49

n°29

Jun. 2016

pp.998-1 027

ArXiv:1602.04589

hal-01273838

Keywords:

MDL, Bandit Problems

Julia code of the simulations on github, updated code for newer versions of Julia
We provide a complete characterization of the complexity of best-arm identification in... more

Bregman superquantiles. Estimation methods and applications.

Tatiana Labopin-Richard, Fabrice Gamboa, Aurélien Garivier, Bertrand Iooss

Dependence Modeling

vol.

(1)

Mar. 2016

pp.76-108

ArXiv: 1405.6677

hal-00996440

accès libre sur Dependence Modeling

Keywords:

Non-Parametrics

In this work, we extend some parameters built on a probability distribution introduced before to the case where the proximity between real numbers is measured by using a Bregman divergence. This... more

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

Journal of Machine Learning Research

vol.

(1)

Jan. 2016

pp.1-42

ArXiv:1407.4443

hal-01024894

available on JMLR

Keywords:

Self-Normalized, Bandit Problems

The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning. Whereas the achievable limit in terms of regret... more

Modélisation Aléatoire et Statistique - Journées MAS 2014 (éditeur)

ESAIM: Proceedings and Surveys

vol.

Oct. 2015

en libre accès sur ESAIM:procs

Depuis 1996, les Journées du groupe Modélisation Aléatoire et Statistique (MAS) de la SMAI réunissent tous les deux ans une large part des communautés probabilistes et statistiques françaises. L... more

Systèmes de recommandations : algorithmes de bandits et évaluation expérimentale

Jonathan Louëdec, Max Chevalier, Aurélien Garivier, Josiane Mothe

Journées de statistiques

n°2 015

Jun. 2015

version de travail

Keywords:

Teaching, UCB

Les systèmes de recommandation à très grande échelle sont aujourd'hui omniprésents sur internet : ouvrages conseillés à l'achat dans les librairies en ligne, articles recommandés sur les... more

A Multiple-play Bandit Algorithm Applies to Recommender Systems

Jonathan Louëdec, Max Chevalier, Josiane Mothe, Aurélien Garivier, Sébastien Gerchinovitz

Florida Artificial Intelligence Research Society Conference

n°28

May. 2015

pp.67-72

version de travail

Keywords:

Recommender systems, UCB, Bandit Problems

For several web tasks such as ad placement or e-commerce, recommender systems must recommend multiple items to their users---such problems can be modeled as bandits with multiple plays. State-of-... more

Algorithmes de bandit pour les systèmes de recommandation : le cas de multiples recommandations simultanées

Jonathan Louëdec, Max Chevalier, Aurélien Garivier, Josiane Mothe

Conférence en Recherche d’Information et Applications

n°12

Mar. 2015

pp.73-88

version de travail

Keywords:

Recommender systems, UCB, Bandit Problems

The multiple-play recommender systems (RS) are RS which recommend several items to the users. RS are based on learning models in order to choose the items to recommend. Among these models, the... more

Perfect Simulation Of Processes With Long Memory: A 'Coupling Into And From The Past' Algorithm

Random Structures and Algorithms

vol.

(2)

Mar. 2015

pp.300-319

ArXiv:1106.5971

hal-00798388

DOI: 10.1002/rsa.20527

Keywords:

VLMC, Perfect Simulation

We describe a new algorithm for the perfect simulation of variable length Markov chains and random systems with perfect connections. This algorithm generalizes Propp and Wilson's simulation... more

Algorithmes de bandits pour la recommandation à tirages multiples

Jonathan Louëdec, Max Chevalier, Aurélien Garivier, Josiane Mothe

Document numérique (Hermès)

vol.

Feb. 2015

pp.59-79

lien direct

Keywords:

Poursuite d'étude après un IUT STID : l'exemple du Cursus de Master en Ingénierie Statistique et Informatique Décisionnelle de Toulouse

Les systèmes de recommandation (SR) à tirages multiples font référence aux SR recommandant plusieurs objets en même temps aux utilisateurs. La plupart des SR s'appuient sur des modèles d'... more

Cécile Chouquet, Aurélien Garivier

Journées de statistiques

n°2 014

Jun. 2014

hal-00977409

Keywords:

Teaching

Après obtention de leur DUT, de nombreux étudiants des IUT STID souhaitent poursuivre leurs études pour obtenir un Master qui leur donne la possibilité de travailler comme ingénieur-expert ou chef... more

On the Complexity of A/B Testing

Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

free access on JMLR W&CP 35

n°27

Jun. 2014

pp.461-481

ArXiv:1405.3224

hal-00990254

Keywords:

UCB, Self-Normalized, Bandit Problems

A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes. We provide distribution-dependent lower bounds for the performance of A/B testing... more

Big Data - Retour vers le Futur 3; De Statisticien à Data Scientist

Philippe Besse, Aurélien Garivier, Jean-Michel Loubes

Ingénierie des Systèmes d'Information

vol.

(3)

Mar. 2014

pp.93-105

ArXiv:1403.3758

hal-00959267

version protégée acsm accessible sur le site d'ISI

Keywords:

Big Data, Teaching

The rapid evolution of information systems managing more and more voluminous data has caused profound paradigm shifts in the job of statistician, becoming successively data miner, bioinformatician... more

Informational Confidence Bounds for Self-Normalized Averages and Applications

IEEE Information Theory Workshop

Sep. 2013

pp.489-493

ArXiv:1309.3376

hal-00862062

Keywords:

VLMC, UCB, Self-Normalized, Bandit Problems

We present deviation bounds for self-normalized averages and applications to estimation with a random number of observations.
The results rely on a peeling argument in exponential martingale... more

Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation

Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz

Annals of Statistics

vol.

(3)

Jun. 2013

pp.1 516-1 541

ArXiv:1210.1136

hal-00738209

accessible on Projet Euclid

Keywords:

Bandit Problems, Empirical Likelihood, Non-Parametrics, Self-Normalized, UCB

We consider optimal sequential allocation in the context of the so-called stochastic multi-armed bandit model. We describe a generic index policy, in the sense of Gittins (1979), based on upper... more

Joint Estimation of Intersecting Context Tree Models

Antonio Galves, Aurélien Garivier, Elisabeth Gassiat

Scandinavian Journal of Statistics

vol.

(2)

Jun. 2013

pp.344-362

ArXiv:1102.0673

hal-00738202

accessible on wiley online library

Keywords:

BIC, MDL, Model Selection, Non-Parametrics, VLMC

We study a problem of model selection for data produced by two different context tree sources. Motivated by linguistic questions, we consider the case where the probabilistic context trees... more

Optimal Discovery with Probabilistic Expert Advice: Finite Time Analysis and Macroscopic Optimality

Sébastien Bubeck, Damien Ernst, Aurélien Garivier

Journal of Machine Learning Research

vol.

Feb. 2013

pp.601-623

free access on JMLR

Keywords:

Bandit Problems, UCB

We consider an original problem that arises from the issue of security analysis of a power system and that we name optimal discovery with probabilistic expert advice. We address it with an algorithm... more

Optimal Discovery with Probabilistic Expert Advice

Sébastien Bubeck, Damien Ernst, Aurélien Garivier

IEEE Conference on Decision and Control

n°51

Dec. 2012

pp.6 808-6 812

ArXiv:1110.5447

Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

Keywords:

Bandit Problems, UCB

Motivated by issues of security analysis for power systems, we analyze a new problem, called optimal discovery with probabilistic expert advice. We address it with an algorithm based on the... more

On Bayesian Upper Confidence Bounds for Bandit Problems

International Conference on Artificial Intelligence and Statistics

n°15

Apr. 2012

pp.592-600

workshop proceedings available online

Keywords:

Bandit Problems, Bayesian Methods, Self-Normalized, UCB

Stochastic bandit problems have been analyzed from two different perspectives: a frequentist view, where the parameter is a deterministic unknown quantity, and a Bayesian approach, where the... more

Sequential Monte Carlo smoothing for general state space Hidden Markov Models

Randal Douc, Aurélien Garivier, Eric Moulines, Jimmy Olsson

Annals of Applied Probability

vol.

(6)

Dec. 2011

pp.2 109-2 145

ArXiv:1202.2945

hal-00839311

available on Project Euclide

Keywords:

Aurélien Garivier, Florencia Leonardi

Computing smoothing distributions, the distributions of one or more states conditional on past, present, and future observations is a recurring problem when operating on general hidden Markov models... more

Context Tree Selection: A Unifying View

Stochastic Processes and their Applications

vol.

121

(11)

Nov. 2011

pp.2 488-2 506

ArXiv:1011.2424

available on ScienceDirect

Keywords:

BIC, MDL, Non-Parametrics, Self-Normalized, VLMC

Context tree models have been introduced by Rissanen as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper... more

Analyse d'algorithmes pour l'estimation et l'optimisation stochastiques

Aurélien Garivier, Eric Moulines

Mémoire d'habilitation à diriger des recherches (HDR)

Nov. 2011

direct link

Keywords:

VLMC, UCB, Sequential Monte-Carlo, Self-Normalized, Reinforcement Learning, Good-Turing, Empirical Likelihood, BIC, Bayesian Methods, HMM, MDL, Perfect Simulation, Non-Parametrics, Model Selection, Bandit Problems

On Upper-Confidence Bound Policies for Non-stationary Bandit Problems

Algorithmic Learning Theory

n°22

Oct. 2011

pp.174-188

ArXiv:0805.3415

hal-00281392

Proceedings published as Lecture Notes in Computer Science Volume 6925

Keywords:

Bandit Problems, Self-Normalized, UCB

Many problems, such as cognitive radio, parameter control of a scanning tunnelling microscope or internet advertisement, can be modelled as non-stationary bandit problems where the distributions... more

The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond

Aurélien Garivier, Olivier Cappé

Optimally Sensing a Single Channel Without Prior Information: The Tiling Algorithm and Regret Bounds

n°24

Jul. 2011

pp.359-376

ArXiv:1102.2490

available on jmlr.org

Keywords:

UCB, Self-Normalized, Bandit Problems

This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems. We prove two distinct results: first, for arbitrary bounded... more

Sarah Filippi, Olivier Cappé, Aurélien Garivier

IEEE Journal of Selected Topics in Signal Processing

vol.

(1)

Feb. 2011

pp.68-76

ArXiv:0908.0319

hal-00408867

Reinforcement Learning, UCB

Keywords:

We consider the task of optimally sensing a two-state Markovian channel with an observation cost and without any prior information regarding the channel's transition probabilities. This task is... more

Parametric Bandits: The Generalized Linear Case

Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári

Proceedings published as Advances in Neural Information Processing Systems 23 (with supplementary material)

n°24

Dec. 2010

Keywords:

Bandit Problems, Bayesian Methods, UCB

We consider structured multi-armed bandit problems based on the Generalized Linear Model (GLM) framework of statistics. For these bandits, we propose a new algorithm, called GLM-UCB. We derive... more

Optimism in Reinforcement Learning and Kullback-Leibler Divergence

Sarah Filippi, Olivier Cappé, Aurélien Garivier

Allerton Conference on Communication, Control, and Computing

n°48

Sep. 2010

pp.115-122

ArXiv:1004.5229

hal-00476116

Reinforcement Learning, Self-Normalized, UCB

Keywords:

We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. In MDPs, optimism can be implemented by carrying out... more

Bandits contextuels: apprentissage par renforcement dans les modèles linéaires généralisés

Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári

Journées Francophones Planification, Décision, et Apprentissage pour la conduite de systèmes

Jun. 2010

available on the conference's site

Keywords:

Bandit Problems, UCB

Nous considérons des problèmes de bandits multibras structurés dans lesquels l’agent est guidé par une connaissance a priori sur la structure de la récompense qui peut être exploitée de manière à... more

On Approximate Maximum Likelihood Methods for Blind Identification: How to Cope with the Curse of Dimensionality

Steffen Barembruch, Aurélien Garivier, Eric Moulines

IEEE Transactions on Signal Processing

vol.

(11)

Nov. 2009

pp.4 247-4 259

Keywords:

We discuss approximate maximum-likelihood methods for blind identification and deconvolution. These algorithms are based on particle approximation versions of the expectation-maximization (EM)... more

A Lower-bound for the Maximin Redundancy in Pattern Processes

Stéphan Clémençon, Aurélien Garivier, Jessica Tressou

Entropy

vol.

(4)

Oct. 2009

pp.634-642

available on Entropy

Keywords:

Universal Coding

We show that the maximin average redundancy in pattern coding is eventually larger than 1.84 (n/log n)1/3 for messages of length n. This improves recent results on pattern redundancy, although it... more

Pseudo Regenerative Block-Bootstrap for Hidden Markov Models

IEEE Statistical Signal Processing Workshop

n°15

Aug. 2009

pp.465-468

A MDL approach to HMM with Poisson and Gaussian emissions. Application to order identification

Keywords:

HMM

This paper is devoted to extend the regenerative block-bootstrap (RBB) proposed for regenerative Markov chains to Hidden Markov Models {(Xn, Yn)}. In the HMM setup, regeneration times of the... more

Antoine Chambaz, Aurélien Garivier, Elisabeth Gassiat

Journal of Statistical Planning and Inference

vol.

139

(3)

Mar. 2009

pp.962-977

available on ScienceDirect

Keywords:

Bayesian Methods, BIC, HMM, MDL, Model Selection

We address the issue of order identification for hidden Markov models with Poisson and Gaussian emissions. We prove information-theoretic BIC-like mixture inequalities in the spirit of Finesso [1991... more

Coding on countably infinite alphabets

Stéphane Boucheron, Aurélien Garivier, Elisabeth Gassiat

IEEE Transactions on Information Theory

vol.

(1)

Jan. 2009

pp.358-373

ArXiv:0801.2456

hal-00121892

Non-Parametrics, Universal Coding

Keywords:

This paper describes universal lossless coding strategies for compressing sources on countably infinite alphabets. Classes of memoryless sources defined by an envelope condition on the marginal... more

On Approximate Maximum Likelihood Methods for Blind Identification: How to Cope with the Curse of Dimensionality

Steffen Barembruch, Aurélien Garivier, Eric Moulines

International Workshop on Signal Processing Advances for Wireless Communications

n°9

Jun. 2008

pp.639-643

Keywords:

Steffen Barembruch, Aurélien Garivier, Eric Moulines

We discuss approximate maximum likelihood methods for blind identification and deconvolution. These algorithms are based on particle approximation versions of the EM algorithm. We consider two... more

On Optimal Sampling for Particle Filtering in Digital Communication

International Workshop on Signal Processing Advances for Wireless Communications

n°9

Jun. 2008

pp.634-638

Keywords:

Particle filtering has been successfully used to approximate the fixed-lag or fixed-interval smoothing distributions in digital communication and to perform approximate maximum likelihood... more

Redundancy of the Context-Tree Weighting Method on Renewal and Markov Renewal Processes

IEEE Transactions on Information Theory

vol.

(12)

Dec. 2006

pp.5 579-5 586

Keywords:

Bayesian Methods, Non-Parametrics, Universal Coding, VLMC

The Context Tree Weighting method (CTW) is shown to be almost adaptive on the classes of renewal and Markov renewal processes. Up to logarithmic factor, ctw achieves the minimax pointwise redundancy... more

Modèles contextuels et alphabets infinis en théorie de l'information

Mémoire de thèse (PhD dissertation)

Nov. 2006

Keywords:

VLMC, Universal Coding, BIC, Bayesian Methods, HMM, MDL, Non-Parametrics, Model Selection

Consistency of the unlimited BIC Context Tree estimator

IEEE Transactions on Information Theory

vol.

(10)

Oct. 2006

pp.4 630-4 635

Keywords:

Bayesian Methods, BIC, Model Selection, VLMC

The Bayesian information criterion (BIC) and Krichevsky- Trofimov (KT) version of minimum description length (MDL) principle are popular in the study of model selection. For order estimation of... more

Parsing et codage universel

Anthony Winstanley, Aurélien Garivier, Mark R. Greenstreet

Rapport de DEA (Master's dissertation)

Sep. 2003

Keywords:

Universal Coding

Codage universel sans perte par transformée grammaticale.

An event spacing experiment

International Symposium on Asynchronous Circuits and Systems

n°8

Apr. 2002

pp.47-56