Abstract

This paper presents MP3D, a scalable video coding scheme based on a redundant 3-D spatio-temporal dictionary of functions. The spatial component of the dictionary consists of directional, anisotropically scaled functions that form a rich collection of visual primitives. The temporal component is tuned to capture most of the energy along motion trajectories in the video sequence. The MP3D coder first finds motion trajectories, then applies a spatio-temporal decomposition using an adaptive approximation algorithm based on Matching Pursuit (MP). The coefficients and function parameters are quantized and coded progressively under multiple rate constraints, so that decoding can adapt by simple bit stream truncation. The motion fields are losslessly coded and transmitted to the decoder as side information. The multi-resolution structure of the dictionary allows flexible adaptation of spatial and temporal resolution. The scheme is shown to yield rate-distortion performance comparable to that of state-of-the-art schemes such as H.264 and MPEG-4. It is a promising alternative for low- and medium-rate applications, or as a flexible base layer for higher-rate video systems.
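To make the Matching Pursuit step concrete, the following is a minimal sketch of the generic greedy algorithm on a toy 1-D signal. It is not the MP3D coder itself: the seven-atom dictionary, the signal values, and the helper names (`matching_pursuit`, `reconstruct`, `energy`) are assumptions chosen for illustration, and motion estimation, quantization, and entropy coding are left out. It only illustrates why an ordered list of (atom, coefficient) pairs can be truncated to obtain a coarser approximation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale an atom to unit norm so that projections are comparable.
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def energy(v):
    return dot(v, v)

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP: return ordered (atom_index, coefficient) terms and the residual."""
    residual = list(signal)
    terms = []
    for _ in range(n_iter):
        # Select the atom with the largest absolute projection on the residual.
        projections = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(projections[k]))
        coeff = projections[best]
        terms.append((best, coeff))
        # Subtract the selected component from the residual.
        residual = [r - coeff * g for r, g in zip(residual, dictionary[best])]
    return terms, residual

def reconstruct(terms, dictionary, length):
    # Sum the selected atoms weighted by their coefficients.
    out = [0.0] * length
    for k, c in terms:
        out = [o + c * g for o, g in zip(out, dictionary[k])]
    return out

# Hypothetical redundant dictionary on R^4: four spikes plus three wider atoms.
dictionary = [normalize(a) for a in (
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
    [1, 1, 0, 0], [0, 0, 1, -1], [1, 1, 1, 1],
)]
signal = [4.0, 4.0, 1.0, -1.0]

terms, residual = matching_pursuit(signal, dictionary, 4)

# Squared error when only the first n terms are kept (n = 0..4).
errors = [
    energy([s - r for s, r in zip(signal, reconstruct(terms[:n], dictionary, len(signal)))])
    for n in range(len(terms) + 1)
]
```

Because each iteration removes the component of the residual along a unit-norm atom, the entries of `errors` never increase as more terms are kept. In this sketch, keeping a prefix of `terms` plays the role that bit stream truncation plays in the scheme described above.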
