Multimedia event modelling and recognition

The recognition of events in multimedia data is a challenging area of research. The growth in the amount of multimedia data being produced and stored increases the need for systems capable of analysing this data automatically. Such analysis supports efficient browsing of the data and information retrieval from it. The specific task addressed in this thesis is the recognition of sequences of high-level semantic events in both sports broadcasts and meeting data. In particular, the use of statistical machine learning techniques to model and recognise events in this task is investigated.

A vital task in multimedia event recognition is feature extraction. Feature extraction is an open problem in multimedia, and there is currently no universal set of general, robust features. In this thesis, we introduce a novel method of segmenting the playfield in sports. In this method, Gaussian Mixture Models are used to model the general colour of grass playfields, and Maximum a Posteriori adaptation is used to adapt this general model to the specific colour of various grass sports fields. This method of playfield segmentation produced significantly improved results over existing methods.

One issue in multimedia event recognition is the overfitting of statistical models when low-level features are used with relatively small amounts of training data. One approach to this problem investigated here is the decomposition of the data into multiple streams based on different feature types. In this multi-stream event modelling, each stream is processed separately and the recognition decisions from each stream are then combined to give an overall decision. We present a study of various multi-stream modelling techniques based on variations of Hidden Markov Models (HMMs). These techniques are tested on the task of event recognition in sports and meetings. Multi-stream techniques offered some improvement in performance through the ability to balance the contribution from each stream.

A different approach to the problem of overfitting in statistical models is to use a hierarchical approach to recognition. In Layered-HMMs, the recognition process is broken down into a number of layers, where the recognition decisions for low-level events are used as data in order to recognise high-level events. A technique combining unsupervised clustering and Layered-HMMs to recognise sequences of high-level events in the sport of rugby is presented. More precisely, unsupervised clustering is used to define a number of low-level sub-events. The first layer of the Layered-HMM then produces recognition probabilities for these sub-events, and the high-level events are modelled in terms of these probabilities. The Layered-HMM method was found to give more robust performance in high-level event recognition than standard modelling techniques.
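
As a rough illustration of the playfield colour adaptation described above, the sketch below adapts the means of a general Gaussian Mixture Model towards pixels sampled from one particular field. It assumes the common relevance-factor form of Maximum a Posteriori adaptation, applied to the means only, with diagonal covariances. The abstract does not give the exact formulation used in the thesis, so the function names, the `relevance` parameter and its default value are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at the colour vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def map_adapt_means(weights, means, variances, pixels, relevance=16.0):
    """Return MAP-adapted means; weights and variances are left unchanged.

    Assumption: relevance-factor MAP adaptation of the means only.
    """
    k = len(weights)
    dim = len(means[0])
    counts = [0.0] * k                      # soft pixel count per component
    sums = [[0.0] * dim for _ in range(k)]  # responsibility-weighted sums

    for x in pixels:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # pixel is too far from every component to contribute
        for j in range(k):
            resp = likes[j] / total
            counts[j] += resp
            for d in range(dim):
                sums[j][d] += resp * x[d]

    adapted = []
    for j in range(k):
        # alpha is the weight given to the new data versus the general model.
        alpha = counts[j] / (counts[j] + relevance)
        if counts[j] > 0.0:
            data_mean = [s / counts[j] for s in sums[j]]
        else:
            data_mean = list(means[j])
        adapted.append([alpha * dm + (1.0 - alpha) * m
                        for dm, m in zip(data_mean, means[j])])
    return adapted
```

With this form, a component that explains many pixels of the new field moves close to their mean, while a component that explains few pixels stays near the general model. This is what allows adaptation from a small sample of a specific field.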

    Keywords: vision

    Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 3370 (2005)
    Section de génie électrique et électronique
    Faculté des sciences et techniques de l'ingénieur
    Institut de génie électrique et électronique
    Laboratoire de l'IDIAP
    Jury: Patrick Bouthemy, Juan Ramon Mosig, Jean-Marc Odobez, Gerhard Rigoll, Jean-Philippe Thiran

    Public defense: 2005-11-24


    Record created on 2005-10-12, modified on 2016-08-08

