LSTAT2013 - Concepts de base en statistique inférentielle
LSTAT2120 Linear models
LSTAT2020 Logiciels et programmation statistique de base
At the end of this learning unit, the student is able to:
With regard to the AA reference framework of the Master's programme in Statistics, general orientation, this activity contributes to the development and acquisition of the following AAs, as a matter of priority: 1.4, 1.5, 2.4, 4.3, 6.1, 6.2.
- Statistical learning
- Challenges concerning high-dimensional models and differences from low-dimensional models
- Classical variable selection techniques for linear regression models: R², adjusted R², Mallows' Cp
- Information criteria selection: KL divergence, AIC/TIC/BIC derivation
- Cross-validation based selection: Leave-one-out and K-fold
- Under- and overfitting or the bias-variance trade-off
- Ridge shrinkage: theoretical properties, bias/variance trade-off, GCV
- Lasso shrinkage: regularization paths, LARS, coordinate descent algorithm, prediction error bounds, degrees of freedom for lasso, support recovery, stability selection, knock-offs; inference by debiasing, post-selection inference, Bayesian inference
- Extensions of Lasso: elastic net, group lasso, adaptive lasso, fused lasso
- Other techniques: sparse graphical models, sparse PCA, sparse discriminant analysis
Due to the COVID-19 crisis, the information in this section is particularly likely to change.
The class consists of lectures (30h) and exercise sessions (7.5h).
Teaching language: English.
Due to the COVID-19 crisis, the information in this section is particularly likely to change.
An oral examination, where the instructors evaluate:
- knowledge of the concepts seen in class throughout the semester (50% of the points);
- the quality of a data analysis/simulation project (written in French or English, minimum 5 and maximum 8 pages using the template on Moodle, appendices not included) that illustrates the statistical learning methods in a concrete case (50% of the points). This written project must be handed in before the exam session and is discussed with the instructors during the exam session. The evaluation of the project is based on the written manuscript and on the responses to questions in an oral discussion about the results and the methodology used for the report.
To be allowed to take part in the examination, the student has to submit 3 compulsory homework assignments (short, 1-2 pages maximum per assignment). The homework assignments are not graded, as they are not part of the evaluation.
Submitting fewer than 3 homework assignments results in failure of the course!
- Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
- James, G., Witten, D., Hastie, T. and Tibshirani, R. (2014). An Introduction to Statistical Learning: With Applications in R. Springer.
- Hastie, T., Tibshirani, R. and Wainwright, M. J. (2015). Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC.
- Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press.
- Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data. Springer.
- Course slides available on Moodle.