M2 course, ENS Lyon: Mathematical foundations of deep neural networks

Machine Learning

Remote access to online lectures:

BBB link on the Portail des Etudes
Lectures: Thursday 13:30-17:00

Lecturers

Aurélien Garivier, Rémi Gribonval, Nelly Pustelnik

Course description

This course is offered in the Master 2 Advanced Mathematics program at ENS Lyon. It provides a detailed overview of the mathematical foundations of modern learning techniques based on deep neural networks. Starting with the universal approximation property of neural networks, the course will then show why depth improves the ability of networks to provide accurate function approximations for a given computational budget. Tools to address the optimization problems that arise when training networks on large data collections will then be covered, and their convergence properties will be reviewed. Finally, statistical results on the generalization guarantees of deep neural networks will be described, both in the classical underfitting regime and in the overfitting regime leading to the so-called “double descent” phenomenon.

Class Notes (last year)

Evaluation

50% for the three homework assignments, 50% for the presentation of an article.

Articles proposed for presentations

For the project part, please send an email to the three lecturers with the composition of your group (3 or 4 names, with all group members in cc: of the email) and your ordered list of three subjects.

Deadline: Tuesday, January 25th

  1. B. Ghorbani, S. Mei, T. Misiakiewicz, and A. Montanari, “Linearized two-layers neural networks in high dimension,” arXiv preprint, 2019.
  2. L. Venturi, A. S. Bandeira, and J. Bruna, “Spurious Valleys in One-hidden-layer Neural Network Optimization Landscapes,” Journal of Machine Learning Research, vol. 20, no. 133, pp. 1–34, 2019.
  3. X. Chen, J. Liu, Z. Wang, and W. Yin, “Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds,” NeurIPS, 2018.
  4. L. Chizat, E. Oyallon, and F. Bach, “On Lazy Training in Differentiable Programming,” NeurIPS, 2019.
  5. M. Telgarsky, “Benefits of depth in neural networks,” Journal of Machine Learning Research, vol. 49, pp. 1517–1539, 2016.
  6. R. Ward, X. Wu, and L. Bottou, “AdaGrad stepsizes: Sharp convergence over nonconvex landscapes,” ICML, 2019.
  7. T. Cohen and M. Welling, “Group Equivariant Convolutional Networks,” ICML, 2016.
  8. M. Weiler and G. Cesa, “General E(2)-Equivariant Steerable CNNs,” NeurIPS, 2019.
  9. S. Herbreteau, E. Moebel, and C. Kervrann, “Normalization-Equivariant Neural Networks with Application to Image Denoising,” arXiv preprint arXiv:2306.05037, 2023.
  10. M. Phuong and C. H. Lampert, “Functional vs. parametric equivalence of ReLU networks,” ICLR, 2020.
  11. H. Bölcskei, P. Grohs, G. Kutyniok, and P. Petersen, “Optimal Approximation with Sparsely Connected Deep Neural Networks,” SIAM J. Math. Data Sci., vol. 1, no. 1, pp. 8–45, 2019.
  12. M. Telgarsky, “Neural Networks and Rational Functions,” ICML, 2017.
  13. D. Yarotsky, “Error bounds for approximations with deep ReLU networks,” Neural Networks, 2017.
  14. P. L. Combettes and J.-C. Pesquet, “Deep Neural Network Structures Solving Variational Inequalities,” Set-Valued and Variational Analysis, Springer, 2020.
  15. R. Arora, A. Basu, P. Mianjy, and A. Mukherjee, “Understanding Deep Neural Networks with Rectified Linear Units,” ICLR, 2018.
  16. A. Daniely, “Depth Separation for Neural Networks,” Proceedings of the 2017 Conference on Learning Theory (COLT), vol. 65, pp. 690–696, 2017.

Other references

  • Neural Network Learning: Theoretical Foundations, by Martin Anthony and Peter L. Bartlett, Cambridge University Press.
  • Convex Optimization, by Stephen Boyd and Lieven Vandenberghe, Cambridge University Press.