Natural language processing

lfial2620  2022-2023  Louvain-la-Neuve

5.00 credits
22.5 h
Fairon Cédrick; Tack Anaïs (substitute for Fairon Cédrick)
Main themes
The course begins with a study of the architecture of a complex automatic language processing system (recognition, analysis, generation). It continues with the central linguistic theories and computational formalisms of ANLP. Special attention is given to the presentation and analysis of real applications.
Learning outcomes

At the end of this learning unit, the student is able to:

1. The course teaches students the basic theory needed to understand the current objectives and issues of automatic natural language processing (ANLP). Students also learn to analyse and explain the practical and technical limits that arise when building computer systems for language processing (ambiguity, the need for adaptable linguistic resources, multilingualism, etc.). By the end of the course, students will have received an overview of the state of the art in ANLP, be able to take a critical approach to ANLP applications, and have a general knowledge of the main theories in the field.
The course is given in the form of an interactive lecture. A reader composed of specialized articles allows students to prepare for the lectures.
Course Outline:
- The domain of NLP (naming, historical overview, levels of analysis)
- Coding and pre-processing
- Formal languages (regular expressions, FSA)
- Probabilistic language models (notions of probability, n-gram models)
- Lexical resources (electronic dictionaries, etc.)
- Lemmatization
- POS-tagging (rule-based approach, HMMs)
- Formal grammars (Chomsky hierarchy, context-free grammars)
- Syntactic parsing (general principles, alternatives)
- Lexical semantics (thesauri, ontologies, WordNet)
- Vector semantics (distributionalism, word embeddings)
Evaluation methods
- Two practical assignments to be completed during the semester [6 points]
- Written (or oral) examination focusing mainly on the course and, to a lesser extent, on important concepts from the required readings. [14 points]
Other information
English-friendly course: course taught in French but offering facilities in English.
Teaching materials
  • Jurafsky & Martin, "Speech and Language Processing" (2nd edition)
Faculty or entity

Programmes offering this learning unit

Title of the programme
Learning outcomes
Master [120] in Data Science : Statistic

Master [120] in French and Romance Languages and Literatures : French as a Foreign Language

Master [120] in Linguistics