Abstract

This paper presents the MUST-VIS system for the MediaMixer/VideoLectures.NET Temporal Segmentation and Annotation Grand Challenge. The system allows users to visualize a lecture as a series of segments represented by keyword clouds, with relations to other similar lectures and segments. Segmentation is performed using a multi-factor algorithm which takes advantage of the audio (through automatic speech recognition and word-based segmentation) and video (through the detection of actions such as writing on the blackboard). The similarity across segments and lectures is computed using a content-based recommendation algorithm. Overall, the graph-based representation of segment similarity appears to be a promising and cost-effective approach to navigating lecture databases.

Details

Actions