Teacher(s)
Language
English
Prerequisites
The student has mastered the basics of statistics and data analysis, including probability, statistical inference, analysis of variance, and linear regression. The student is also proficient in programming in R.
Main themes
Advanced statistical analysis techniques for biological data: multivariate statistics, generalized linear mixed models.
Learning outcomes
At the end of this learning unit, the student is able to:
1. Contribution of the teaching unit to the program's AA reference framework: in line with the BOE2M program's competency framework, this teaching unit contributes to the development and acquisition of the following skills: 3 (3.3, 3.4, 3.5, 3.6).
2. Course-specific learning outcomes: the student expands their knowledge of data visualization and statistical analysis, with a specific focus on multivariate statistical methods and generalized linear mixed models.
Content
Module 1: Linear statistical modeling
Theoretical introduction to mixed and generalized models (6h); practical sessions in R (14h); two case studies on mixed and generalized models (4h+2h).
Module 2: Multivariate data exploration
This module covers how to visualize and check multivariate data, how to summarize a set of continuous variables with a smaller number of variables through PCA (Principal Component Analysis), how to perform the equivalent of PCA for categorical data (FCA, Factorial Correspondence Analysis), and how to unravel the links between two sets of continuous variables (Canonical Correlation Analysis). The teaching philosophy is that statistical methods are tools, and that the key skill the student should acquire is the ability to choose the right tool for the job, parameterize it, and interpret its results critically. Real examples from ecology will be used to illustrate both clean cases and more difficult ones that are closer to real life.
Teaching methods
Lectures, seminars, and exercise sessions in a computer room. Students are encouraged to participate actively in all these activities.
Exercises: learn to solve a statistical problem. Faced with a problem, the student identifies the appropriate analysis, checks that the conditions for applying it are met, performs the statistical test in R, and interprets and illustrates the results.
Evaluation methods
The two modules are evaluated separately, and each module counts for 10 of the 20 points of the final score. Because the final score must be an integer, the sum of the two marks is rounded up if both modules are passed (at least 5/10 each) and rounded down otherwise.
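For illustration only, the rounding rule can be sketched in Python as follows. The function name `final_score` is hypothetical and not part of any official grading tool. The sketch assumes each module mark is a number between 0 and 10.

```python
import math


def final_score(module1: float, module2: float) -> int:
    """Combine two module marks (each out of 10) into an integer score out of 20.

    Illustrative sketch of the rule stated above: the sum is rounded up
    when both modules are passed (at least 5/10 each), and rounded down otherwise.
    """
    for mark in (module1, module2):
        if not 0 <= mark <= 10:
            raise ValueError("each module mark must lie between 0 and 10")

    total = module1 + module2
    both_passed = module1 >= 5 and module2 >= 5
    return math.ceil(total) if both_passed else math.floor(total)
```

For example, marks of 5.5 and 6 sum to 11.5 and give 12, because both modules are passed. Marks of 4.5 and 7 also sum to 11.5 but give 11, because Module 1 is failed.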
Module 1 (Linear statistical modeling):
Open-book exam, including two exercises on LMM and GLM(M) in R (based on the practical sessions and the first seminar) and one case study (based on the second seminar).
Module 2 (Multivariate data analysis):
Open-book written exam consisting of multiple-choice questions, open questions, and practical exercises solved in R on a computer. The exam is taken on Moodle, in a computer room on campus, unless health regulations require that it be taken remotely.
Other information
A basic knowledge of R is required: the student is expected to be able to create and modify R data sets independently and to perform basic data management and statistical analysis procedures. A student who lacks these skills must acquire them independently, for example through the many free resources available online.
Online resources
All resources are available on the Moodle website: slides from the lectures and practical sessions, data sets and R scripts, links to additional resources, and supporting books.
Teaching materials
- Course visuals and materials supporting the practical work
Faculty or entity