We present an MPEG-7 compliant description of generic video sequences aimed at their scalable transmission and reconstruction. The proposed method allows efficient and flexible video coding while retaining the advantages of textual descriptions in database applications. Visual objects are described in terms of their shape, color, texture and motion; these features can be extracted automatically and are sufficient in a wide range of applications. To permit partial sequence reconstruction, at least one simple qualitative descriptor and one quantitative descriptor are provided for each feature. In addition, we propose a structure for organizing the descriptors into objects and scenes, and outline some possible applications of our method. Experimental results obtained with news and video surveillance sequences validate our method and highlight its main features.