Representing and modeling the motion and spatial support of multiple objects and surfaces from motion video sequences is an important intermediate step towards dynamic image understanding. One such representation, called layered representation, has recently been proposed. Although a number of algorithms have been developed for computing these representations, there has not been a consolidated effort into developing a precise mathematical formulation of the problem. This paper presents one such formulation based on maximum likelihood estimation (MLE) of mixture models and the minimum description length (MDL) encoding principle. The three major issues in layered motion representation are: (i) how many motion models adequately describe image motion, (ii) what are the motion model parameters, and (iii) what is the spatial support layer for each motion model
Published in:
Computer Vision, 1995. Proceedings., Fifth International Conference on
Date of Conference: 20-23 Jun 1995