Compact representations for static and dynamic texture synthesis

Textures are found everywhere in nature. Together with color and shape, texture is one of the fundamental characteristics of an object. Texture conveys the idea of a repeated structure, and it is not limited to the visual domain: we can refer to the texture of a sound, a fabric, or a certain wine. In image processing, a static texture is defined as an image showing spatial stationarity, while a dynamic texture is a sequence of images showing temporal stationarity. Texture synthesis is the process of producing artificial textures starting from a given texture sample. In this thesis, we consider texture synthesis starting from compact representations. For static textures, this reduces to defining the smallest portion of a texture that, when used for synthesis, produces a synthetic texture indistinguishable from the original image. We designed a psychophysical experiment to quantify this size for a collection of textures and two synthesis algorithms: the parametric algorithm by Portilla and Simoncelli and the procedural algorithm by Nealen and Alexa. The results of the experiment show that, under pre-attentive viewing conditions, the size of the compact texture does not depend on the algorithm used for synthesis but is a characteristic of the texture itself. We evaluated whether an objective function, based on the spatial correlation between pixels, can predict the texture size necessary for synthesis. We found that a statistical measure based on random walks correlated well with the results of the subjective experiment.
In the case of dynamic textures, the term "compact" applies to the size of the model used for synthesis. We propose a dynamic texture analysis that obtains a more compact model starting from the linear model of Soatto and Doretto. Current methods reduce the dimension of the data by applying the SVD to the video frames unfolded into column vectors, which exploits only the temporal correlation. We avoid the unfolding operation and decompose the signal directly using a multidimensional decomposition known as the Higher-Order SVD (HOSVD). Chromatic components are exploited more efficiently by combining the HOSVD with the YCbCr color encoding of the input data. Tests show that the combined model has far fewer parameters than models derived with other algorithms, for the same visual quality and approximately the same computational cost of synthesis. The HOSVD-based model was used in two applications for portable devices, where memory and computational power are limited. We created a video effect for webcams and smartphones in which a dynamic texture is synthesized and blended into the video in real time.
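To make the contrast with frame unfolding concrete, the following is a minimal sketch of a truncated HOSVD applied to a video stored as a 4-way array. It is not the implementation used in the thesis: the function names (`unfold`, `mode_product`, `truncated_hosvd`, `reconstruct`, `parameter_count`), the axis order (height, width, channel, frame), and the choice of ranks are all assumptions made for illustration, and the color conversion and the dynamic model fitted on top of the decomposition are left out.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (n-mode product)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def truncated_hosvd(tensor, ranks):
    """Return a core tensor and one orthonormal factor matrix per mode.

    `ranks` gives the number of singular vectors kept in each mode
    (hypothetical parameter; the thesis may select ranks differently).
    """
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Rebuild an approximation of the original tensor from the model."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

def parameter_count(core, factors):
    """Number of stored values: core entries plus all factor entries."""
    return core.size + sum(u.size for u in factors)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a video: height x width x channel x frame.
    video = rng.standard_normal((16, 16, 3, 20))
    core, factors = truncated_hosvd(video, ranks=(8, 8, 2, 5))
    print("original values:", video.size)
    print("model values:   ", parameter_count(core, factors))
```

Because each mode (rows, columns, color, time) is truncated separately, the model size can be reduced along spatial and chromatic dimensions as well as along time, whereas an SVD of the unfolded frames truncates only one subspace. With random data, as in the demo, the truncated reconstruction is poor; the saving is meaningful only for signals that are actually correlated along those modes.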

