Gianfranco Doretto / Publications

DYNAMIC TEXTURES: modeling, learning, synthesis, animation, segmentation, and recognition

Doretto, G.
DYNAMIC TEXTURES: modeling, learning, synthesis, animation, segmentation, and recognition
Ph.D. Thesis, University of California, Los Angeles, CA, 2005. Committee: Adnan Darwiche, Petros Faloutsos, Demetri Terzopoulos, Ying Nian Wu, Stefano Soatto (Chair)

Abstract

Dynamic textures are sequences of images of dynamic scenes that exhibit some temporal regularity properties, intended in a statistical sense; these include, for example, ocean waves, smoke, whirlwind, fire, foliage, but also moving objects with a “defined shape,” for instance flowers, or flags in wind etc. This work presents a characterization of this class of video sequences, and poses the problems of modeling, learning, synthesis, animation, recognition, and segmentation of dynamic textures.

Since, in absence of any additional prior knowledge, the visual reconstruction problem from images alone is ill-posed, in this work we give up trying to infer the physical model that generated the images, and analyze sequences of images solely as visual signals. We do so by building a statistical framework, and draw on disciplines like time series analysis, system, control, and identification theory.

We derive three generative models, the simplest possible, that are able to capture, respectively, the temporal second-order statistics, the spatio-temporal second-order statistics, and the higher-order temporal statistics of dynamic textures. We propose to learn model parameters in the maximum-likelihood sense, or minimum prediction error variance. We derive efficient closed-form inference procedures for learning the second-order statistics, and revert to nonlinear optimization techniques for the higher-order ones. After learning a model, it can be used to extrapolate, or predict new image data both in the temporal and spatial domain. We analyze the meaning of the parameters of a model, and show how they can be manipulated to control, or animate the simulation. Using the geometry of subspaces, and statistical pattern recognition theory we derive a technique to discriminate between models, and assess the potential for building a recognition system. Finally, by combining these results with a variational framework, we design a region-based segmentation system able to partition a video sequence into regions characterized by different spatio-temporal statistics.
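
The abstract does not give the model equations, so the following is only a rough illustration of what closed-form learning and synthesis for the temporal second-order case might look like. It assumes a linear state-space form, y_t = C x_t + mean and x_{t+1} = A x_t + B v_t with white Gaussian v_t, fitted by an SVD followed by least squares. The function names, the toy data, and the parameter choices are hypothetical and are not taken from the thesis.

```python
import numpy as np


def learn_dynamic_texture(frames, n_states):
    """Fit y_t = C x_t + mean, x_{t+1} = A x_t + noise in closed form.

    frames: array of shape (pixels, T), one vectorized image per column.
    This is an assumed, simplified procedure (SVD + least squares).
    """
    mean = frames.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(frames - mean, full_matrices=False)
    C = U[:, :n_states]                      # orthonormal appearance basis
    X = s[:n_states, None] * Vt[:n_states]   # state trajectory, (n_states, T)
    # Least-squares transition matrix: minimizes ||X[:, 1:] - A X[:, :-1]||_F
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    resid = X[:, 1:] - A @ X[:, :-1]
    Q = resid @ resid.T / (X.shape[1] - 1)   # driving-noise covariance estimate
    return A, C, Q, mean, X[:, 0]


def synthesize(A, C, Q, mean, x0, n_frames, rng):
    """Simulate the fitted model forward to generate new frames."""
    # Factor B with B @ B.T == Q, via eigendecomposition (Q may be rank-deficient)
    w, V = np.linalg.eigh(Q)
    B = V * np.sqrt(np.clip(w, 0.0, None))
    x = x0.copy()
    out = np.empty((C.shape[0], n_frames))
    for t in range(n_frames):
        out[:, t] = C @ x + mean[:, 0]
        x = A @ x + B @ rng.standard_normal(x.shape[0])
    return out


# Toy "video": 50 pixels driven by two sinusoids plus a little noise
rng = np.random.default_rng(0)
t = np.arange(60)
basis = rng.standard_normal((50, 2))
signal = np.vstack([np.sin(0.3 * t), np.cos(0.3 * t)])
frames = basis @ signal + 0.01 * rng.standard_normal((50, 60))

A, C, Q, mean, x0 = learn_dynamic_texture(frames, n_states=2)
new_frames = synthesize(A, C, Q, mean, x0, n_frames=100, rng=rng)
```

In this sketch the synthesized sequence can be longer than the training sequence, which corresponds to temporal extrapolation; editing `A` or `Q` before calling `synthesize` is one simple way to experiment with how the parameters affect the simulation.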

BibTeX

@PhdThesis{doretto05dissertation,
  Title                    = {{DYNAMIC TEXTURES}: modeling, learning, synthesis, animation, segmentation, and recognition},
  Author                   = {Doretto, G.},
  School                   = {University of California},
  Year                     = {2005},
  Address                  = {Los Angeles, CA},
  Month                    = {March},
  Note                     = {{C}ommittee: {A}dnan {D}arwiche, {P}etros {F}aloutsos, {D}emetri {T}erzopoulos, {Y}ing {N}ian {W}u, {S}tefano {S}oatto ({C}hair)},
  Abstract                 = {Dynamic textures are sequences of images of dynamic scenes that exhibit some temporal regularity properties, intended in a statistical sense; these include, for example, ocean waves, smoke, whirlwind, fire, foliage, but also moving objects with a “defined shape,” for instance flowers, or flags in wind etc. This work presents a characterization of this class of video sequences, and poses the problems of modeling, learning, synthesis, animation, recognition, and segmentation of dynamic textures.
Since, in absence of any additional prior knowledge, the visual reconstruction problem from images alone is ill-posed, in this work we give up trying to infer the physical model that generated the images, and analyze sequences of images solely as visual signals. We do so by building a statistical framework, and draw on disciplines like time series analysis, system, control, and identification theory.
We derive three generative models, the simplest possible, that are able to capture, respectively, the temporal second-order statistics, the spatio-temporal second-order statistics, and the higher-order temporal statistics of dynamic textures. We propose to learn model parameters in the maximum-likelihood sense, or minimum prediction error variance. We derive efficient closed-form inference procedures for learning the second-order statistics, and revert to nonlinear optimization techniques for the higher-order ones. After learning a model, it can be used to extrapolate, or predict new image data both in the temporal and spatial domain. We analyze the meaning of the parameters of a model, and show how they can be manipulated to control, or animate the simulation. Using the geometry of subspaces, and statistical pattern recognition theory we derive a technique to discriminate between models, and assess the potential for building a recognition system. Finally, by combining these results with a variational framework, we design a region-based segmentation system able to partition a video sequence into regions characterized by different spatio-temporal statistics.},
  Bib2html_pubtype         = {Theses},
  Bib2html_rescat          = {Dynamic Textures, Visual Motion Analysis, Visual Motion Segmentation, Visual Motion Recognition, Shape and Appearance Modeling, Image Based Rendering},
  File                     = {doretto05dissertation.pdf:doretto\\thesis\\doretto05dissertation.pdf:PDF},
  Owner                    = {doretto}
}