Teachers
Teaching language
English
Prerequisites
Concepts and tools equivalent to those taught in the following course units (UEs):
| LSTAT2020 | Logiciels et programmation statistique de base |
| LSTAT2120 | Linear models |
| LSTAT2100 | Modèles linéaires généralisés et données discrètes |
Topics covered
The course focuses on the 'high-dimensional modelling' framework and on techniques for parameter estimation, model selection and valid inferential procedures for high-dimensional statistical models.
Learning outcomes
At the end of this learning unit, the student is able to:
| 1 | With regard to the learning outcomes (AA) framework of the Master's programme in statistics, general orientation, this activity contributes primarily to the development and acquisition of the following learning outcomes: 1.4, 1.5, 2.4, 4.3, 6.1, 6.2 |
Content
The class is focused on the presentation of key concepts of statistical learning and high-dimensional models such as:
- Statistical learning
- Challenges concerning high-dimensional models and differences from low-dimensional models
- Classical variable selection techniques for linear regression models: R², adjusted R², Mallows' Cp
- Information criteria selection: KL divergence, AIC/TIC/BIC derivation
- Cross-validation based selection: Leave-one-out and K-fold
- Underfitting, overfitting and the bias-variance trade-off
- Ridge shrinkage: theoretical properties, bias/variance trade-off, GCV
- Lasso shrinkage: regularization paths, LARS, coordinate descent algorithm, prediction error bounds, degrees of freedom for lasso, support recovery, stability selection, knock-offs; inference by debiasing, post-selection inference, Bayesian inference
- Extensions of Lasso: elastic net, group lasso, adaptive lasso, fused lasso
- Other techniques: sparse graphical models and networks, sparse PCA, sparse Discriminant Analysis
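To make one item of the list above concrete, the coordinate descent algorithm for the lasso can be sketched in a few lines of Python. This is a minimal illustration and not course material: the function names, the simulated data and the penalty level `lam` are all assumptions chosen for the example, and the criterion is taken to be (1/(2n))·||y − Xβ||² + λ·||β||₁.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_objective(X, y, beta, lam):
    """Lasso criterion (1 / (2n)) * ||y - X beta||^2 + lam * ||beta||_1."""
    n = X.shape[0]
    resid = y - X @ beta
    return resid @ resid / (2 * n) + lam * np.abs(beta).sum()


def lasso_cd(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for the lasso criterion above.

    Assumes no column of X is identically zero.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # (1/n) * ||x_j||^2 for each column
    resid = y.astype(float)             # residual y - X beta, with beta = 0
    for _ in range(n_sweeps):
        for j in range(p):
            # Inner product of x_j with the partial residual that excludes x_j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep the residual in sync
            beta[j] = new_bj
    return beta


# Simulated high-dimensional example (p > n) with a sparse coefficient vector.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lam = 0.2  # arbitrary penalty level; in practice chosen e.g. by cross-validation
beta_hat = lasso_cd(X, y, lam)
print("non-zero coefficients:", np.count_nonzero(beta_hat))
print("objective:", lasso_objective(X, y, beta_hat, lam))
```

For any penalty at or above max_j |x_jᵀy| / n the lasso solution is exactly zero, which gives a natural starting point when computing a regularization path over a decreasing grid of penalties.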
Teaching methods
The class consists of lectures (30h) and exercise sessions (7.5h).
Both the lectures and the exercise sessions (TP) are intended to be held face to face.
Teaching language: English.
Student assessment methods
June session:
- During the semester, the student must submit 2 compulsory assignments (short, 2-3 pages maximum each), counting for 1 point of the final grade (0.5 points per assignment). The assignments are to be completed individually or in groups of 2, and one mark is assigned per group. Assignments submitted after the deadline are not considered.
- A project (written in French or English, between 6 and 12 pages using the template on Moodle, appendices not included) illustrating the methods of the course, worth 5 points. The written project is submitted before the exam session and discussed with the teacher during the exam session. It is evaluated on the basis of the written report and of the answers given in an oral discussion (without slides) of the results and methodology of the report. The project is to be completed individually or in groups of 2, and one mark is awarded per group. Projects submitted after the deadline are not considered.
- An oral exam (~45 min), in which the teacher assesses knowledge of the material covered in class (14 points) as well as the quality of the project and of the assignments.
The final grade for the LSTAT2450 course in June is the sum of the points obtained for the assignments, for the project and for knowledge of the material covered in class.
To validate the course, the student needs a final mark of 10 or more.
August session:
- A project (written in French or English, between 6 and 12 pages using the template on Moodle, appendices not included) illustrating the methods of the course, worth 5 points. The written project is submitted before the exam session and discussed with the teacher during the exam session. It is evaluated on the basis of the written report and of the answers given in an oral discussion (without slides) of the results and methodology of the report. The project is to be completed individually or in groups of 2, and one mark is awarded per group. Projects submitted after the deadline are not considered.
- An oral exam (~45 min), in which the teacher assesses knowledge of the material covered in class (15 points) and the quality of the project.
The final grade for the LSTAT2450 course in August is the sum of the points obtained for the project and for knowledge of the material covered in class. The points awarded for the assignments do not count in the August session, as continuous assessment applies only to work done during the semester.
To validate the course, the student needs a final mark of 10 or more.
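The grading arithmetic for the two sessions can be summarised in a short sketch. The function names and the session labels `"june"` and `"august"` are hypothetical and used only for illustration; the point values come from the description above, where the maxima add up to 20 in both sessions (1 + 5 + 14 in June, 5 + 15 in August).

```python
def final_grade(session, project, oral, assignments=0.0):
    """Final mark as the sum of the components described above.

    session: "june" or "august" (hypothetical labels).
    project: points for the project (at most 5).
    oral: points for knowledge of the course material
          (at most 14 in June, at most 15 in August).
    assignments: points for the two assignments (at most 1); ignored in August.
    """
    if session == "june":
        return assignments + project + oral
    if session == "august":
        return project + oral
    raise ValueError("session must be 'june' or 'august'")


def is_validated(grade):
    """The course is validated with a final mark of 10 or more."""
    return grade >= 10


# Made-up marks, for illustration only.
june = final_grade("june", project=4.0, oral=9.0, assignments=1.0)
august = final_grade("august", project=4.0, oral=9.0, assignments=1.0)
print(june, is_validated(june))
print(august, is_validated(august))
```

With the same component marks, the August total is lower than the June total by exactly the assignment points, since those are not carried over.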
Other information
Software: R/Python
French-friendly class.
Online resources
Moodle website of the class: LSTAT2450 - Statistical learning. Estimation, selection and inference.
https://moodle.uclouvain.be/course/view.php?id=4214
Bibliography
- Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
- James, G., Witten, D., Hastie, T. and Tibshirani, R. (2014). An Introduction to Statistical Learning: With Applications in R. Springer.
- Hastie, T., Tibshirani, R. and Wainwright, M. J. (2015). Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press.
- Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data. Springer.
Course materials
- Course slides available during the semester
Faculty or entity in charge
Programmes / courses offering this learning unit (UE)
Programme title
Acronym
Credits
Prerequisites
Learning outcomes
Master [120] in Data Science: Statistics
Master [120] in Statistics: Biostatistics
Master [120] in Mathematics
Master [120] in Statistics: General
Master [120] in Mathematical Engineering
Master [120] in Data Science Engineering
University certificate: Statistics and data science (15/30 credits)
Master [120] in Data Science: Information Technology