Although we do not yet know how long the social distancing related to the Covid-19 pandemic will last, and regardless of the changes that had to be made to the evaluation for the June 2020 session relative to what this learning unit description provides, teachers may still adopt new evaluation methods for the learning unit. Teachers have communicated, or will communicate, the details of these methods to students as soon as possible.
This course covers the following topics:
- corpus design: data collection, archiving and markup.
- corpus typology: spoken and written corpora; monolingual vs multilingual; native vs learner; diachronic vs synchronic.
- major electronic corpora: British National Corpus, International Corpus of English, International Corpus of Learner English, MICASE, Louvain International Database of Spoken English Interlanguage, etc.
- corpus annotation (POS-tagging, lemmatization, parsing, semantic tagging, prosodic annotation, error tagging).
- automated analysis of lexis, grammar and discourse.
By the end of this learning unit, students are expected to have a solid theoretical background in corpus linguistics and to have mastered the main techniques and tools used to analyse spoken and written computerized data. They will be able to read the scientific literature and conduct their own research in the field.
The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.
In January or September: written exam counting for 80% of the final grade.
A WORD OF CAUTION: students who have not handed in their written assignment(s) on time will fail this course overall (with a mark of 7/20, or less if their mean is lower).
- Course notes and PowerPoint slides on Moodle
- Reading portfolio