Corpora

The CECL has been responsible for the compilation of several corpora, some of which are still ongoing. Three main types of corpora can be distinguished:

  • learner corpora, which bring together data produced by foreign language learners (ICLE, FRIDA, LINDSEI, LONGDALE, VESPA) and learners of institutionalized second-language varieties of English (NESSI);
  • multilingual corpora, in which data from several languages are collected (PLECI, Mult-Ed);
  • pedagogical corpora, which contain pedagogical materials, for instance textbook materials (TeMa, CONNECT).

In addition, LOCNESS is a corpus of native novice writing. While most of our corpora represent (native or non-native) varieties of English, FRIDA contains French texts written by learners of French, and our multilingual corpora include several languages apart from English (French, Dutch, Swedish).