Accessing, organizing, and manipulating home videos present technical challenges due to their unrestricted content and lack of storyline. We present a methodology to discover cluster structure in home videos, which uses video shots as the unit of organization, and is based on two concepts: (1) the development of statistical models of visual similarity, duration, and temporal adjacency of consumer video segments and (2) the reformulation of hierarchical clustering as a sequential binary Bayesian classification process. A Bayesian formulation allows for the incorporation of prior knowledge of the structure of home video and offers the advantages of a principled methodology. Gaussian mixture models are used to represent the class-conditional distributions of intra- and inter-segment visual and temporal features. The models are then used in the probabilistic clustering algorithm, where the merging order is a variation of highest confidence first, and the merging criterion is maximum a posteriori. The algorithm does not need any ad-hoc parameter determination. We present extensive results on a 10-h home-video database with ground truth which thoroughly validate the performance of our methodology with respect to cluster detection, individual shot-cluster labeling, and the effect of prior selection.
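To make the sequential binary Bayesian formulation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single scalar feature per shot, measures inter-cluster dissimilarity as the absolute difference of cluster means, and uses hand-picked Gaussian mixture parameters (`SAME`, `DIFF`) in place of models learned from data. The function names (`gmm_pdf`, `posterior_same`, `cluster`) and the parameter `prior_same` are hypothetical. The merging order here is plain highest-posterior-first, a simplification of the variation the abstract mentions, and temporal adjacency and duration are omitted.

```python
import math

def gmm_pdf(x, components):
    """Density of a 1-D Gaussian mixture; components = [(weight, mean, std), ...]."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in components
    )

# Hypothetical class-conditional models of the pairwise dissimilarity feature.
# In practice these would be estimated from labeled training data.
SAME = [(0.7, 0.10, 0.05), (0.3, 0.25, 0.10)]  # pair belongs to one cluster
DIFF = [(0.5, 0.60, 0.15), (0.5, 0.90, 0.20)]  # pair belongs to different clusters

def posterior_same(x, prior_same=0.5):
    """Posterior probability that a pair with dissimilarity x should be merged."""
    p_same = prior_same * gmm_pdf(x, SAME)
    p_diff = (1.0 - prior_same) * gmm_pdf(x, DIFF)
    return p_same / (p_same + p_diff)

def cluster(shots, prior_same=0.5):
    """Greedy agglomeration: each step is a binary merge / don't-merge decision."""
    clusters = [[i] for i in range(len(shots))]

    def mean(c):
        return sum(shots[i] for i in c) / len(c)

    while len(clusters) > 1:
        # Find the candidate pair with the highest merge posterior.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                x = abs(mean(clusters[a]) - mean(clusters[b]))
                p = posterior_same(x, prior_same)
                if best is None or p > best[0]:
                    best = (p, a, b)
        p, a, b = best
        # MAP criterion: stop once "different clusters" is the more probable class.
        if p <= 0.5:
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters

# Toy input: three visually similar shots followed by two others.
print(cluster([0.10, 0.12, 0.15, 0.80, 0.85]))  # [[0, 1, 2], [3, 4]]
```

Because the stopping rule is the MAP decision itself, the sketch needs no distance threshold or target number of clusters; the only tunable quantity is the prior, which corresponds to the prior-selection effect the abstract evaluates.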