Multi-Armed Bandits for Addressing the Exploration/Exploitation Trade-off in Self Improving Learning Environment

This project proposes using machine learning techniques such as Multi-Armed Bandits to implement self-improving learning environments. The goal of a self-improving learning environment is to make good pedagogical choices while measuring how effective these choices are. Students are modeled with the LFA model, which is fitted on a dataset of university courses so that students can be simulated. Three experiments with simulated students are carried out and show that the Multi-Armed Bandit approach improves learning outcomes.
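
To illustrate the exploration/exploitation trade-off named in the title, the following is a minimal sketch of a bandit choosing among pedagogical activities. It rests on assumptions not stated in the abstract: the UCB1 rule stands in for whichever bandit algorithm the project actually uses, each activity is a hypothetical Bernoulli arm with a fixed success probability instead of an LFA-based simulated student, and the function name `ucb1` and the probabilities are illustrative only.

```python
import math
import random


def ucb1(success_probs, rounds, seed=0):
    """Run UCB1 over Bernoulli arms; return (pull counts, total reward).

    Assumption: each arm is a pedagogical activity whose reward is 1 when a
    (hypothetical) simulated student succeeds and 0 otherwise.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms   # how often each activity was chosen
    means = [0.0] * n_arms  # running mean reward per activity
    total = 0.0
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # try every activity once before comparing them
        else:
            # Mean reward (exploitation) plus a bonus that shrinks as an
            # activity is tried more often (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, total


# Three hypothetical activities with different success probabilities.
counts, total = ucb1([0.2, 0.5, 0.8], rounds=2000)
print(counts, total)
```

The pull counts show how often each activity was selected, which is also what lets such an environment measure the effect of its choices while it makes them.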


Advisor(s):
Dillenbourg, Pierre
Year:
2017




 Record created 2017-08-09, last modified 2018-09-13
