Programme

Course content: Predictive modeling for linguists with R

The bootcamp is a hands-on introduction to statistical methods for both graduate students and seasoned researchers. It is loosely based on the third edition (2021) of Gries’s textbook Statistics for linguistics with R (see prerequisites). The course is intended for linguists who already have a basic knowledge of statistics and some experience using R, and who wish to improve their proficiency in the statistical modeling of linguistic data. Using the open-source software and programming language R, we will cover:

  • fundamental aspects of fixed-effects regression modeling for both numeric and binary response variables; these include data exploration and preparation for modeling, model formulation and selection, and the numerical and visual interpretation and evaluation of models;
  • more advanced aspects of fixed-effects regression modeling such as contrasts for ordinal predictors, orthogonal contrasts, curvature of numeric predictors, and possibly general linear hypothesis tests;
  • the theoretical foundations of mixed-effects regression modeling;
  • applications of mixed-effects modeling for both numeric and binary response variables;
  • tree-based methods and random forests: 'fitting' them, interpreting them with importance scores and partial dependence scores, and detecting (not just capturing) interactions.

Typical schedule

Week day     Schedule

Monday       9.00-9.30 Welcome
             9.30-12.30 Class
             2.00-5.00 Class
             7.00 Welcome dinner

Tuesday      9.00-12.15 Class
             1.45-5.00 Class

Wednesday    9.00-12.15 Class
             1.45-5.00 Class

Thursday     9.00-12.15 Class
             1.45-5.00 Class

Friday       9.00-12.15 Class
             1.45-5.00 Class

Class sessions of more than two hours include a 15-minute break.