Student project

Multi-Armed Bandits for Addressing the Exploration/Exploitation Trade-off in Self Improving Learning Environment

This project proposes the use of machine learning techniques such as Multi-Armed Bandits to implement self-improving learning environments. The goal of a self-improving learning environment is to perform good pedagogical choices while measuring the efficiency of these choices. The modeling of students is done using the LFA model and fitted on a dataset of university courses to allow to simulate students. Three experiments with simulated students are carried out and show that the Multi-Armed Bandit approach improves learning outcomes.

Related material