Skip to Main Content
Videos representing flames, water, smoke, etc., are often defined as dynamic textures: "textures" because they are characterized by the redundant repetition of a pattern and "dynamic" because this repetition is also in time and not only in space. Dynamic textures have been modeled as linear dynamic systems by unfolding the video frames into column vectors and describing their trajectory as time evolves. After the projection of the vectors onto a lower dimensional space by a singular value decomposition (SVD), the trajectory is modeled using system identification techniques. Synthesis is obtained by driving the system with random noise. In this paper, we show that the standard SVD can be replaced by a higher order SVD (HOSVD), originally known as Tucker decomposition. HOSVD decomposes the dynamic texture as a multidimensional signal (tensor) without unfolding the video frames on column vectors. This is a more natural and flexible decomposition, since it permits us to perform dimension reduction in the spatial, temporal, and chromatic domain, while standard SVD allows for temporal reduction only. We show that for a comparable synthesis quality, the HOSVD approach requires, on average, five times less parameters than the standard SVD approach. The analysis part is more expensive, but the synthesis has the same cost as existing algorithms. Our technique is, thus, well suited to dynamic texture synthesis on devices limited by memory and computational power, such as PDAs or mobile phones.