MATH5107 & INFO5161:
Master of Advanced Mathematics, Probability and Statistics track, and course CR01 of the M2IF.
Lecturers
Jean-Christophe Mourrat, Aurélien Garivier, Rémi Gribonval
Course description
This course will introduce the notion of concentration of measure and highlight its applications, notably in high-dimensional data processing and machine learning.
The course will start from deviation inequalities for averages of independent variables, and illustrate their usefulness in the analysis of random graphs and of random projections for dimension reduction.
It will then be shown how other high-dimensional random functions concentrate, and what guarantees this concentration yields for randomized algorithms and for machine learning procedures that learn from large training collections.
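As a purely illustrative aside (not part of the official course material), the short NumPy sketch below empirically checks two of the phenomena mentioned above: concentration of the average of independent bounded variables around its mean (Hoeffding's inequality) and approximate norm preservation by a random projection, in the spirit of the Johnson-Lindenstrauss lemma. All parameter values are arbitrary choices made for the illustration.

```python
# Illustrative sketch only; parameter values are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

# 1) Concentration of averages: for X_1, ..., X_n i.i.d. uniform on [0, 1],
#    Hoeffding's inequality gives P(|mean - 1/2| >= t) <= 2 * exp(-2 * n * t^2).
n, t, trials = 1000, 0.05, 10_000
means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
empirical = np.mean(np.abs(means - 0.5) >= t)
bound = 2 * np.exp(-2 * n * t**2)
print(f"P(|mean - 1/2| >= {t}): empirical {empirical:.4f}, Hoeffding bound {bound:.4f}")

# 2) Random projection for dimension reduction: a Gaussian matrix scaled by
#    1/sqrt(m) approximately preserves the Euclidean norm of a fixed vector.
d, m = 10_000, 400
x = rng.normal(size=d)
A = rng.normal(size=(m, d)) / np.sqrt(m)
print(f"Norm ratio ||Ax|| / ||x|| when projecting R^{d} -> R^{m}:",
      f"{np.linalg.norm(A @ x) / np.linalg.norm(x):.3f}")
```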
Presentation slides
Weekly schedule 2024-2025: Fridays 14:00-16:00
- [09.13] (Gribonval) Deviation bounds and Chernoff's method
- [09.20] (Gribonval)
- [09.27] (Garivier) notes for lectures 3, 4 and 5
- [10.04] (Garivier)
- [10.11] (Garivier)
- [10.18] (Mourrat)
- [10.25] (Mourrat)
- [11.08] (Garivier)
- [11.15] (Mourrat)
- [11.22] (Mourrat)
- [11.29] (Gribonval)
- [12.06] (Gribonval)
- [12.13] Project Defenses (13:30-17:30)
Exercises
Exercise sheet number 1
Exercise sheet number 2
Exercise sheet number 3
Notebooks
Research articles
Each group of 4 students chooses one of the following articles:
- Rahimi, A., & Recht, B. (2007). Random Features for Large Scale Kernel Machines. Advances in Neural Information Processing Systems (NIPS), (1), 1–8. https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf
- Le, Q., Sarlós, T., & Smola, A. (2013). Fastfood – Approximating Kernel Expansions in Loglinear Time. http://proceedings.mlr.press/v28/le13.pdf
- Tropp, J. A. (2011). User-Friendly Tail Bounds for Sums of Random Matrices. Foundations of Computational Mathematics. https://doi.org/10.1007/s10208-011-9099-z
- McAllester, D. and Ortiz, L. (2003). Concentration Inequalities for the Missing Mass and for Histogram Rule Error. Journal of Machine Learning Research, 4. https://www.jmlr.org/papers/volume4/mcallester03a/mcallester03a.pdf
- Baraniuk, R., Davenport, M., DeVore, R. A., & Wakin, M. B. (2008). A simple proof of the restricted isometry property for random matrices. Constructive Approximation, 28(3), 253–263. https://www.math.tamu.edu/~rdevore/publications/131.pdf
- Arlot, S. Model selection and estimator selection for statistical learning (Cours Peccot). First lecture: statistical learning and estimator selection. https://www.imo.universite-paris-saclay.fr/~arlot/enseign/2011Peccot/peccot_notes_cours1.pdf
- Boucheron, S. and Thomas, M. Concentration inequalities for order statistics. https://projecteuclid.org/euclid.ecp/1465263184
- Chandrasekaran, V., Recht, B., Parrilo, P. A., & Willsky, A. S. (2012). The Convex Geometry of Linear Inverse Problems. Found. Comput. Math., 12(6), 805–849. http://doi.org/10.1007/s10208-012-9135-7
- Garivier, A. & Pilliat, E. (2024). On Sparsity and Sub-Gaussianity in the Johnson-Lindenstrauss Lemma. http://doi.org/10.1145/375551.375608
- Puy, G., Davies, M. E., & Gribonval, R. (2017). Recipes for Stable Linear Embeddings From Hilbert Spaces to $\mathbb{R}^m$. IEEE Transactions on Information Theory, 63(4), 2171–2187. http://doi.org/10.1109/tit.2017.2664858
- Tropp, J. A. (2008). On the conditioning of random subdictionaries. Appl. Comput. Harmon. Anal., 25(1), 1–24. https://www.sciencedirect.com/science/article/pii/S1063520307000863
- Bach, F. (2015). On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions. https://jmlr.org/papers/volume18/15-178/15-178.pdf
- Chatterjee, S. (2006). Stein's method for concentration inequalities. https://arxiv.org/abs/math/0604352
- Spencer, J. (1985). Six standard deviations suffice. Trans. Amer. Math. Soc., 289, 679–706. https://www.ams.org/journals/tran/1985-289-02/S0002-9947-1985-0784009-0
Prerequisites
Basic knowledge of probability theory, linear algebra and analysis over the reals.
Bibliography
- Concentration Inequalities: A Nonasymptotic Theory of Independence, by Stéphane Boucheron, Gábor Lugosi and Pascal Massart
- High-Dimensional Probability – An Introduction with Applications in Data Science, by Roman Vershynin
- Understanding Machine Learning, From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David
Evaluation
Three homeworks count in total for 50% of the final grade; they will be made progressively available on the portail des études. The project on a research article accounts for the other 50%.
Project: paper / chapter presentation
- End of September: distribution and allocation of topics. Students organize an allocation of the proposed topics in groups of 4; in case of disagreement, the allocation will be adjusted by the course lecturers.
- Monday November 1st: submission deadline for videos + slides + written reports (half-page summary).
- Monday November 8th: oral session (10 minutes of questions per group).
EXPECTED WORK
a/ Written document: half-page summary
Written in French or English, on half a page, in the format of a conference-paper abstract, describing the topic, what is at stake, the difficulties, the novelties, and the main results.
Care must be taken with writing style, grammar, and spelling.
b/ Video presentation and slides
A 15-minute lecture, to be made available to the course lecturers as a video recording, with attention paid to a balanced share of speaking time highlighting the participation of each student in the group.
A PDF version of the slides, with page numbers, to be made available to the course lecturers.
Each student is invited to watch the videos of the other groups before the oral session.
Participation of all students in the oral session is mandatory, to answer questions from the course lecturers.
It is important to respect the allotted duration of the video lecture: do not hesitate to use a timer!
A lecture of limited duration cannot convey the same information as a 20-to-30-page paper or chapter: you have to make choices so as to offer us a viewpoint on the topic. Count roughly one and a half minutes per slide (about ten slides for a 15-minute lecture), and seek a good balance between giving a global view and being technically precise on selected aspects.
Adapting the content to the expected background of the targeted audience is important: here the audience consists of the course lecturers as well as all the students who followed the course.
As an example, a standard lecture structure can be as follows: the considered problem and its context; state-of-the-art approaches and/or approaches studied during the course; the proposed approach; synthesis and discussion, possibly highlighting your own contributions (critical perspective, complementary bibliographical study, implementation, difficulties, …); conclusion.
Among other things, the grade will reflect an evaluation of:
Your understanding of the topic and of the studied papers / chapters.
Your ability to explain them: many people in the audience are not necessarily experts.
Your critical analysis of the studied papers / chapters.
Possible complementary bibliographical references bringing additional viewpoints.
PLANNED PRACTICALITIES
Submission of written + video material: via the "portail des études" if technically suitable.
Oral session: on site if possible, otherwise in hybrid mode or by videoconference (BBB of the "portail des études").
Details will be given as soon as possible, depending on technical and sanitary constraints.
Some of these proposals date back to last year, but some may remain active or relevant. Otherwise, you can find a list here; it is not at all comprehensive: it gathers some proposals that we have received, together with a few colleagues from Lyon. Other possibilities exist.