Stochastic Optimal Control and Reinforcement Learning

linma2222  2025-2026  Louvain-la-Neuve

5.00 credits
30.0 h + 22.5 h
Q1
Language
English
Prerequisites
This course assumes familiarity with dynamical systems (at the level of LEPL1106: Signals and Systems, and LINMA1510: Linear Control) and with calculus and linear algebra (at the level of LEPL1101: Algebra, and LEPL1102: Calculus I). LINMA2470: Stochastic Modelling is highly recommended.
Main themes
  • Foundations of probability and optimal control
  • Finite-state systems and MDPs
  • State-space models: LTI, hybrid, and nonlinear
  • Optimal control in the face of model uncertainty
  • Reinforcement learning
Learning outcomes

Contribution of the course to the program objectives:
  • AA1.1, AA1.2, AA1.3, AA2.2
  • AA5.5
  • AA6.3
At completion of this course, the student will be able to:
  • understand the concept of optimizing a stochastic process or system;
  • reformulate practical problems as mathematical decision/design problems for stochastic systems;
  • utilize the foundational tools from stochastic optimal control and reinforcement learning to solve decision/design problems for stochastic systems;
  • apply algorithmic tools to solve stochastic optimal control problems exactly or approximately, and understand their strengths, limitations, and scope of applicability;
  • apply the concept of exploitation vs exploration and regret minimization;
  • provide an exact or approximate solution to stochastic optimal control problems, with applications in diverse fields, such as financial mathematics, robotics, …
Transversal learning outcomes:
  • Handling unforeseen technical issues that appear when optimizing a real-world system.
  • Making reasonable hypotheses for a given problem, and evaluating them a posteriori.
  • Taking part in a technical class in English.
 
Content
Part 1: Foundations of probability, systems, and optimal control
Part 2: Exact algorithms for optimal decision-making and control
Part 3: Approximate algorithms
Part 4: Data-driven optimal decision-making and control, and applications
Teaching methods
Learning will be based on face-to-face lectures, interleaved with practical exercise sessions and supervised homework. In addition, the course may include a project or a presentation to be carried out in groups.
Evaluation methods
  • If the exam is passed: exam (60% of the final mark) and project during the semester (40% of the final mark).
  • If the exam is not passed (less than 10/20), only the exam grade counts as the final mark.
  • In September, only the second-session exam counts for the final mark.
  • Other activities, such as quizzes and homework exercises, can be taken into account in the course grade.
  • Oral examinations may replace in part or entirely other parts of the evaluation.
The use of AI and the exchange or distribution of (parts of) solutions with other individuals are forbidden for any graded activity.
Teaching materials
  • Meyn, Control Systems and Reinforcement Learning (Cambridge University Press, 2022)
Faculty or entity


Programmes offering this learning unit

Title of the programme
Acronym
Credits
Prerequisites
Learning outcomes
Master [120] in Actuarial Science

Master [120] in Statistics: General

Master [120] in Mathematical Engineering