Abstract—The ability to improve teaching strategies online
is important for an intelligent tutoring system (ITS) to perform
adaptive teaching. Reinforcement learning (RL) may help an
ITS acquire this ability. Conventionally, RL works in a Markov
decision process (MDP) framework. However, to handle
uncertainties in teaching/studying processes, we need to apply
the partially observable Markov decision process (POMDP)
model in building an ITS. In a POMDP framework, it is difficult
to use the improvement algorithms of conventional RL
because the state information they require is not directly observable. In our
research, we have developed an RL
technique that enables a POMDP-based ITS to learn from its
teaching experience and improve its teaching strategies online.
Index Terms—Computer supported education, intelligent
tutoring system, reinforcement learning, partially observable
Markov decision process.
F. Wang is with the School of Computer Science, University of Guelph,
Ontario, Canada (e-mail: fjwang@uoguelph.ca).
Cite: Fangju Wang, "Reinforcement Learning in a POMDP Based Intelligent Tutoring System for Optimizing Teaching Strategies," International Journal of Information and Education Technology, vol. 8, no. 8, pp. 553-558, 2018.