Abstract—With the rapid development of information and communication technologies (ICTs), e-learning has presented new opportunities for learning. Learners are no longer restricted by location or time. Lecture videos are among the most commonly used learning materials on e-learning platforms: they present knowledge in a lively manner and keep learners attentive during the learning process. While organizing lecture videos in a sequential list seems a natural choice, it makes searching for domain concepts inefficient and fails to show the relationships between such concepts. In this work, the task of constructing a knowledge representation scheme for lecture video corpora is explored. The knowledge representation aims to facilitate the search for domain concepts and to extract the relationships between concepts, so that effective learning strategies can be identified for the corpus. A framework combining text recognition, speech recognition, multimodal analysis, and clustering techniques is proposed for constructing the knowledge representation. Two lecture video corpora on the topics of general chemistry and geometry were acquired from Khan Academy to demonstrate the feasibility of the proposed framework. Experimental results show that the framework can be used to achieve the intended goals in specific domains.
Index Terms—E-learning, knowledge representation, online education, learning strategies.
The authors are with the Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong (e-mail: leofpm@ust.hk, tcpong@ust.hk).
Cite: Pak-Ming Fan and Ting-Chuen Pong, "Constructing Knowledge Representation from Lecture Videos through Multimodal Analysis," International Journal of Information and Education Technology, vol. 3, no. 3, pp. 304-309, 2013.