Multivariate statistics for linguistics

5.00 credits

22.5 h + 10.0 h

Teacher(s)

François Thomas

Language

English
> French-friendly

Main themes

Access to this course is restricted to students who have already successfully completed an introductory statistics course, e.g. LFIAL2260.
The course will cover the following topics:

Concept and structure of a statistical model (linear or non-linear) and examples of typical linguistic research questions
Regression models: introduction to different regression models (e.g. linear regression, logistic regression), variable selection algorithms, the notion of multicollinearity, model estimation, interpretation of parameters, quality of predictions
Analysis of variance: presentation of different analysis of variance techniques (for parametric and non-parametric data; for independent or repeated measures; with one or two classification criteria, etc.), the logic of the F-test, multiple comparisons of means (post-hoc tests)
Linear mixed models: notion of generalised linear model, random effects, hierarchical models
Exploratory methods: exploration of linguistic data with methods such as principal component analysis, factor analysis, etc.
Classification methods: introduction to classification models, e.g. decision trees
Model validation: measures of goodness of fit, residual analysis, tests of homogeneity of variance and of sphericity, detection of outliers or influential points, variable transformation, etc.

Learning outcomes

At the end of this learning unit, the student is able to:
1	Translate a linguistic research problem into a series of statistical questions, choose the appropriate methods, apply them and present all the results in a report.

2	Understand and explain the statistical concepts underlying the different methods used in the course.

3	Apply the various statistical methods covered in the course to textual data using the R software.

This teaching unit contributes to the development and mastery of the following skills and outcomes of the ELAL programmes (ELAL learning outcomes): 1.4; 2.3; 2.6; 3.1; 3.2; 3.3; 3.5; 4.5; 5.1; 5.2; 5.3

Content

After an introduction to the central role of statistical methods in linguistics, the course will cover various topics:

Notions of statistical modelling
ANOVA I - Analysis of variance with one classification criterion: Classical model, post-hoc comparisons and Kruskal-Wallis ANOVA
ANOVA II - Analysis of variance with two classification criteria
ANOVA for repeated measures: Classical model and Friedman ANOVA
Simple and multiple regression models and residual analysis
Simple and multiple logistic regression models
GLM - General linear model and mixed models
Exploratory multivariate statistical analyses: Principal component analysis and factor analysis
Classification methods: decision trees

Teaching methods

Lectures, personal readings and exercises (during lectures or in the tutorials)

Evaluation methods

The evaluation has three components:

continuous assessment (exercises during the tutorials, and readings) (30%)
written examination (30%)
personal written essay (40%)

In September, the evaluation is adapted as follows:

written examination (50%)
personal written essay (50%)

Generative artificial intelligence (AI) must be used responsibly and in accordance with the practices of academic integrity (see the official UCLouvain guidelines for the good use of AI). Scientific integrity requires that sources be cited, and the use of AI must always be reported. Using AI for tasks where it is explicitly forbidden will be considered cheating.

Other information

Course materials (available on Moodle):

slides;
articles or book chapters;
additional exercises.

Bibliography

Field, A. (2013). Discovering statistics using IBM SPSS statistics. Sage.
Howell, D. (2008). Méthodes statistiques en sciences humaines. Paris: De Boeck Université.
Muller, C. (1992). Initiation aux méthodes de la statistique linguistique. Champion.
Rasinger, S. M. (2008). Quantitative Research in Linguistics. New York: Continuum International Publishing Group.

Faculty or entity

> ELAL

Programmes offering this learning unit

Title of the programme

Acronym

Credits

Prerequisites

Learning outcomes

Master [120] in Linguistics

LING2M

Master [120] in Modern Languages and Literatures : German, Dutch and English

GERM2M

Master [120] in Modern Languages and Literatures : General

ROGE2M