Data analysis and modeling of biological systems

lboe2112  2022-2023  Louvain-la-Neuve

5.00 credits
24.0 h + 36.0 h
Q1
Teacher(s)
De Laender Frederik; Segers Johan
Language
English
Prerequisites
To follow this course successfully, you should be acquainted with the concept of probability and the rules of probability calculus, the basics of statistical inference, the principles and practice of the classical methods for statistical analysis of continuous data (regression, analysis of variance) and of discrete data (contingency tables, goodness-of-fit tests), and the use of statistical software to apply these methods.
Main themes
Taking into account the most frequently encountered needs of researchers in biology, as well as time constraints, the course consists of two main modules: Linear Modeling and Methods of Multivariate Analysis.
The examples presented are mainly drawn from research in ecology.
Learning outcomes

At the end of this learning unit, students:

- Are aware of the necessity of planning any scientific experiment before it is started.
- Have practiced the principles of experimental design in the context of a personal scientific question.
- Are able to review, choose, and knowingly apply the best-adapted methods for modeling and analysing data from their domain of expertise in biology.
- Are able to set up a scientific experiment, manage the data it generates, analyse them (usually with the help of computer software), and critically interpret the results.
- Have shown their ability to report a scientific experiment in a written document and in an oral presentation. These reports may be prepared in groups of two or three students.
 
Content
Module 1: Linear statistical modeling
Theoretical introduction to mixed and generalised linear models (6h); two case studies on mixed and generalised models (2x3h). At the end of each case study lecture, the teacher will present a scientific article that resembles the case study just presented and ask the students to interpret the statistical results in that paper (e.g. what is wrong with the statistical techniques? what could have been done better? what do the results mean or imply?). The students will write down their answers in their group report, which serves as the evaluation (see ‘Evaluation methods’).

Module 2: Multivariate data exploration
After a brief reminder of linear algebra and Euclidean geometry (vectors, matrices, distances, angles), the following (canonical) ordination techniques will be treated: principal component analysis, correspondence analysis, redundancy analysis (also called principal component analysis on instrumental variables), and canonical correspondence analysis. The lectures (12h) will cover both theoretical background and practical implementation in modern statistical software (R, package ade4).
Teaching methods
Lectures, seminars, and exercise sessions in a computer room. Self-study.
Exercises: learn to solve a statistical problem, i.e. find the appropriate analysis for a given problem, check the conditions of application for that analysis, perform the statistical test in R, and interpret and illustrate the results.
Evaluation methods
The two modules will be evaluated separately, each module contributing 10/20 to the final score.
Module 1 (Linear statistical modeling):
A group report serves as the evaluation:
  • Groups of 4 students (2 with stronger and 2 with weaker background in statistics).
  • The exercises during the exercise sessions are used to ‘train’ the students. The solutions to these problems should not be in the reports.
  • For the report, students get a new problem to solve.
  • In the report, the students also have to answer a question on the interpretation of an analysis in a published article.
Module 2 (Multivariate data exploration):
The evaluation consists of two parts, each counting for 5/10:
  • A written open-book exam on the correct interpretation of the numerical and graphical output of a given data analysis. An exemption test will be organised near the end of the lectures.
  • A group report on an actual data analysis, covering both the implementation of the method and the interpretation of the output, to be carried out by the students.
Online resources
Moodle UCLouvain
R scripts of the recommended book Zuur et al. (2007): http://highstat.com/index.php/analysing-ecological-data
Self-study website: http://webapps.fundp.ac.be/umdb/biostats2017/
Bibliography
  • Lecture slides, syllabus for the practical sessions, datasets, computer code. Self-study website.
  • Alain F. Zuur, Elena N. Ieno, Graham M. Smith, Analysing Ecological Data, Springer Science, 2007 (optional).
  • Pierre Legendre, Louis Legendre, Numerical Ecology, Elsevier, 2012 (optional).
Teaching materials
  • Slides for lectures and practical sessions, syllabus for the practical sessions, datasets, computer code. Self-study website.
Faculty or entity
BIOL


Programmes offering this learning unit (LU)

Title of the programme
Acronym
Credits
Prerequisites
Learning outcomes
Master [120] in Biology of Organisms and Ecology

Master [60] in Biology