Multivariate Statistical Analysis

linge1222  2019-2020  Louvain-la-Neuve

Note from June 29, 2020
Although we do not yet know how long the social distancing related to the Covid-19 pandemic will last, and regardless of the changes that had to be made to the evaluation of the June 2020 session relative to what this learning unit description provides for, the teachers may still adopt new evaluation methods for this learning unit; the teachers have communicated, or will communicate, the details of these methods to the students as soon as possible.
4 credits
30.0 h + 15.0 h
Q2
Teacher(s)
Segers Johan; Uyttendaele Nathan (substitute for Segers Johan)
Language
French
Prerequisites

The prerequisite(s) for this Teaching Unit (Unité d’enseignement – UE) for the programmes/courses that offer this Teaching Unit are specified at the end of this sheet.
Main themes
Part 1: Basic descriptive methods and basic notation. In this part, students are taught how matrix notation facilitates the treatment of multidimensional data, along with the basic properties of random vectors. They also learn that the basic (uni- and bivariate) descriptive tools have both uses and limitations.
Part 2: Techniques of multivariate data analysis. In this part, students learn about basic dimension reduction techniques for continuous and qualitative variables (principal components, correspondence analysis). Basic classification techniques are also presented. A wide range of examples is given to illustrate these methods and show when they should be used.
Part 3: Multivariate analysis models. In this part, students see how to model relations between variables: linear models (including analysis of variance and analysis of covariance), which use explanatory variables to explain the variation in a response variable. Models adapted to a categorical response variable are also introduced: log-linear models for contingency tables, the logit model and discriminant analysis models. Here too, a wide range of examples is given to illustrate these methods and show when they should be used.
Aims

At the end of this learning unit, the student is able to:

1 This course develops the elements introduced in the basic Probability and Statistics courses within a multivariate framework, the aim being to equip students with the instruments they need to analyse multidimensional data sets. By the end of the course, students should be able to use the most widely-used instruments to analyse real data. A key aim of the course will therefore be to give students a clear understanding of the methods and how to apply them, and how to use relevant analytical software.
 

The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.
Content
  • Introduction to multivariate data analysis
  • Linear algebra and Euclidean geometry
  • Descriptive statistics for data matrices
  • Principal component analysis
  • Cluster analysis: k-means clustering and hierarchical cluster algorithms
  • Linear discriminant analysis
  • Distribution theory
  • Multiple linear regression, including AN(C)OVA
  • Logistic regression
Teaching methods
  • Lectures: the teacher introduces the concepts through an application and then presents the abstract form
  • Exercise sessions in computer rooms: the teacher gives students real-data problems to solve using the statistical software environment R.
Evaluation methods
  • Computer test: at the end of the course, students answer multiple-choice questions about real data sets, using the statistical software environment R. This part is open-book.
  • Exam: written, closed-book, with the help of a formula list and a pocket calculator. The exam comprises both theory questions and exercises on interpreting and reconstructing the output of the R software.
Online resources
The list of formulas, the slides used in the lectures and the computer labs, R software documentation and links to external web resources (videos, on-line courses, documents) are available on the MoodleUCL course page.
Bibliography
  • Härdle, W. and L. Simar (2007): Applied Multivariate Statistical Analysis, 2nd Edition, Springer-Verlag, Berlin.
  • James, G., Witten, D., Hastie, T. and R. Tibshirani (2013): An Introduction to Statistical Learning, Springer, New York.
  • Saporta, G. (2011): Probabilités, analyse des données et statistique, 3e édition révisée, Editions TECHNIP, Paris.
Teaching materials
  • syllabus "LINGE1222 - Multivariate Statistical Analysis" (J. Segers)
Faculty or entity
ESPO


Programmes/courses offering this Teaching Unit

Title of the programme
Acronym
Credits
Prerequisites
Aims
University Certificate: Statistics and Data Science (15/30 credits)

Bachelor in Mathematics

Bachelor : Business Engineering