3.00 credits

15.0 h + 7.5 h

Q1

Teacher(s)

Segers Johan

Language

French

Main themes

The course presents an overview of the main tools of exploratory multivariate data analysis via factorial methods. The data is projected onto a low-dimensional subspace while retaining maximum information. This reduction in dimension facilitates visualization and aids in the discovery of information and patterns in a data table.
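
As a rough illustration of the projection idea described above, the sketch below finds the first principal axis of a tiny made-up data table by power iteration on its covariance matrix, then projects the rows onto that axis. It is written in plain Python purely for illustration: the course itself works in R, and the data values, function names, and iteration count are assumptions rather than course material.

```python
import math

def center(rows):
    """Subtract the column means so the point cloud is centered at the origin."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (divisor n - 1) of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov by power iteration.

    Assumes cov is non-zero and that the uniform start vector is not
    orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance carried by the axis v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

# Hypothetical data table: 5 individuals (rows), 2 variables (columns).
data = [[2.0, 1.9], [0.0, 0.1], [-1.0, -1.1], [1.0, 0.9], [-2.0, -1.8]]

centered = center(data)
cov = covariance(centered)
axis, variance = first_component(cov)

# Coordinates of each individual on the first principal axis.
scores = [sum(r[j] * axis[j] for j in range(len(axis))) for r in centered]

# Share of the total variance retained by the one-dimensional projection.
explained = variance / sum(cov[i][i] for i in range(len(cov)))
```

In this toy table the two columns are almost perfectly correlated, so a single axis retains nearly all of the variance, and `scores` gives the one-dimensional coordinates of the five rows.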

- Reminders of algebra and geometry useful for data analysis
- Basic principles of factorial methods
- Principal component analysis
- Clustering: k-means and hierarchical clustering
- Linear discriminant analysis

Content

- Data matrices
- Principal component analysis
- Classification: k-means clustering and hierarchical clustering
- Linear discriminant analysis

Teaching methods

During the lectures, the teacher presents the various statistical methods, covering the questions and datasets to which they apply, the underlying mathematical theory, and how to program them in R. Homework assignments are given, and their solutions are also discussed in the lectures.

The tutorials take place in computer rooms. Their primary objective is to let students practise applying the methods to real datasets in R.

Evaluation methods

Exam (12/20):

- written, closed book, with the help of a formula list and a pocket calculator
- exercises and questions involving (small) calculations, interpretation of computer output, and understanding of the main results and formulas

- Test 1: Data matrices and principal component analysis
- Test 2: Clustering and linear discriminant analysis

Project (8/20):

- individually or in pairs
- application to real data, with the dataset to be found by the students themselves
- written report, to be submitted by the date or dates announced during the semester
- detailed instructions will be provided in the exercise sessions and on the MoodleUCL course page

Other information

Prerequisites:

- vector and matrix calculus
- Euclidean geometry: points, spaces, orthogonality, distances, angles
- basic notions in statistics: sample mean, (co)variance, correlation, covariance matrix, conditional probabilities, normal distribution, chi-square distribution

Online resources

All teaching material is made available through the Moodle UCLouvain course page: slides, exercises, software scripts. Links to relevant external material are also provided: online courses, videos, software documentation.

Bibliography

- Escofier, B. and Pagès, J. (2016): Analyses factorielles simples et multiples, 5th edition, Dunod, Paris.
- Lebart, L., Piron, M. and Morineau, A. (2006): Statistique exploratoire multidimensionnelle, 4th edition, Dunod, Paris.
- Saporta, G. (2011): Probabilités, analyse des données et statistique, 3rd revised edition, Editions TECHNIP, Paris.

Faculty or entity

**LSBA**