Statistical learning theory 2024/25
General Information
First lecture: Saturday 21.09 at 10h00 in room R208 (and via the Zoom link above).
Lectures: Tuesdays 9h30--10h50 in room M302 and on Zoom, by Bruno Bauwens.
Seminars: online in Google Meet, by Nikita Lukianenko.
Please join the Telegram group. The course is similar to last year's.
Homeworks
Deadline: every 2 weeks, before the lecture. The tasks are at the end of each problem list. (Problem lists will be updated; check the year.)
Before the 3rd lecture: see problem lists 1 and 2. Before the 5th lecture: see problem lists 3 and 4. Etc.
Email homeworks to brbauwens-at-gmail.com. Start the subject line with SLT-HW. Results will be here.
Late policy: 1 homework can be submitted at most 24 hours late without explanation.
Course materials
| Video | Summary | Slides | Lecture notes | Problem list | Solutions |
|---|---|---|---|---|---|
| Part 1. Online learning | |||||
| 21 Sept | Philosophy. The online mistake bound model. The halving and weighted majority algorithms. | sl01 | ch00 ch01 | prob01 | |
| 24 Sept | The perceptron algorithm. Kernels. The standard optimal algorithm. | sl02 | ch02 ch03 | prob02 | |
| 01 Oct | Prediction with expert advice. Recap probability theory (seminar). | sl03 | ch04 ch05 | prob03 | |
| Part 2. Distribution independent risk bounds | |||||
| 08 Oct | Necessity of a hypothesis class. Sample complexity in the realizable setting, examples: threshold functions and finite classes. | sl04 | ch06 | prob05 | |
| 15 Oct | Growth functions, VC-dimension, and the characterization of sample complexity via the VC-dimension | sl05 | ch07 ch08 | prob06 | |
| 22 Oct | Risk decomposition and the fundamental theorem of statistical learning theory | sl06 | ch09 | prob07 | |
| 29 Oct | Bounded differences inequality, Rademacher complexity, symmetrization, contraction lemma. | sl07 | ch10 ch11 | prob08 | |
| Part 3. Margin risk bounds with applications | |||||
| 05 Nov | Simple regression, support vector machines, margin risk bounds, and neural nets with dropout regularization | sl08 | ch12 ch13 | prob09 | |
| 12 Nov | Kernels: RKHS, representer theorem, risk bounds | sl09 | ch14 | prob10 | |
| 19 Nov | AdaBoost and the margin hypothesis | sl10 | ch15 | prob11 | |
| 26 Nov | Implicit regularization of stochastic gradient descent in overparameterized neural nets (recording with many details about the Hessian) | | ch16 ch17 | | |
| 03 Dec | Part 2 of previous lecture: Hessian control and stability of the NTK. | | | | |
The lectures in October and November are based on the book:
Foundations of Machine Learning, 2nd ed., Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar, 2018.
A gentle introduction to the material of the first 3 lectures and an overview of probability theory can be found in chapters 1-6 and 11-12 of the following book: Sanjeev Kulkarni and Gilbert Harman, An Elementary Introduction to Statistical Learning Theory, 2012.
Grading formula
Final grade = 0.35 * [score of homeworks] + 0.35 * [score of colloquium] + 0.3 * [score on the exam] + bonus from quizzes.
All homework questions have the same weight. Each solved extra homework task increases the exam score by 1 point. At the end of each lecture there is a short quiz in which you may earn 0.1 bonus points on the final non-rounded grade.
There is no rounding except when transforming the final grade into the official grade, where arithmetic rounding is used.
Autogrades: if you need at most 6/10 on the exam to obtain the maximal 10/10 for the course, the 10/10 will be given automatically. This may happen because of extra homework tasks and quiz bonuses.
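For concreteness, here is a minimal sketch of the grade computation in Python (the 0-10 scale of each component, the cap at 10, and the example numbers are assumptions for illustration, not official rules):

```python
import math

def final_grade(homework, colloquium, exam, quiz_bonus=0.0, extra_tasks=0):
    """Sketch of the grading formula; all components assumed on a 0-10 scale."""
    exam = min(10.0, exam + extra_tasks)  # each solved extra task adds 1 exam point (cap assumed)
    raw = 0.35 * homework + 0.35 * colloquium + 0.3 * exam + quiz_bonus
    # No intermediate rounding; arithmetic (round-half-up) rounding is applied
    # only when converting to the official grade.
    return min(10, math.floor(raw + 0.5))

# Example: 9/10 homeworks, 8/10 colloquium, 7/10 exam, one 0.1 quiz bonus:
# 0.35*9 + 0.35*8 + 0.3*7 + 0.1 = 8.15, which rounds to 8.
print(final_grade(9.0, 8.0, 7.0, quiz_bonus=0.1))
```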
Colloquium
Rules and questions from last year.
Date: TBA
Problems exam
TBA
-- You may use handwritten notes, lecture materials from this wiki (either printed or on your PC), and Mohri's book.
-- You may not search the internet or interact with other humans (e.g., by phone or forums).
Office hours
Bruno Bauwens: TBA
Nikita Lukianenko: write in Telegram; the time is flexible.