This paper presents the trajectory tree, a hierarchical region-based representation for video sequences. Motion and spatial features from multiple frames are used to generate a set of temporal regions structured within a hierarchy of scale and motion coherency. The resulting representation offers a global description of the entire video sequence and enhances its potential for semantic analysis. A multiscale segmentation strategy is proposed whereby region-merging criteria of progressively greater complexity are used to define partition layers of increasing aptitude for object detection. A novel data structure, called the trajectory adjacency graph, is defined for the long-term analysis of partition sequences. Mechanisms are also introduced for assessing connectivity, verifying temporal continuity, and proposing merging operations based on the homogeneity of color, affine motion, and translational motion over the entire sequence. Finally, as demonstrated through experimental results, the trajectory tree offers concise yet versatile support for video object segmentation, description, and retrieval tasks.
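As an illustration only, the general idea of merging adjacent temporal regions into a hierarchy can be sketched as below. This is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not specify the algorithm, so the `Node` fields, the greedy merge order, the single color-distance criterion, and the fixed `threshold` are all hypothetical. The affine and translational motion criteria and the temporal continuity checks mentioned above are omitted.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    """Hypothetical temporal region: a leaf, or the merge of two children."""
    node_id: int
    size: int            # total pixel count over all frames
    mean_color: tuple    # size-weighted mean color
    frames: set          # frame indices in which the region appears
    children: tuple = () # ids of the two merged regions (empty for leaves)


def color_distance(a, b):
    """Euclidean distance between the mean colors of two regions."""
    return sum((x - y) ** 2 for x, y in zip(a.mean_color, b.mean_color)) ** 0.5


def merge(a, b, new_id):
    """Create the parent region of a and b with a size-weighted mean color."""
    total = a.size + b.size
    mean = tuple((a.size * x + b.size * y) / total
                 for x, y in zip(a.mean_color, b.mean_color))
    return Node(new_id, total, mean, a.frames | b.frames, (a.node_id, b.node_id))


def build_tree(leaves, edges, threshold):
    """Greedily merge the most similar adjacent pair until none is close enough.

    `edges` is an assumed adjacency relation between leaf region ids.
    Returns (nodes, roots): every node by id, and the ids of surviving regions.
    """
    nodes = {n.node_id: n for n in leaves}
    adj = {n.node_id: set() for n in leaves}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    ids = count(max(nodes) + 1)
    while True:
        candidates = [(color_distance(nodes[u], nodes[v]), u, v)
                      for u in adj for v in adj[u] if u < v]
        if not candidates:
            break
        dist, u, v = min(candidates)
        if dist > threshold:
            break
        new = merge(nodes[u], nodes[v], next(ids))
        nodes[new.node_id] = new
        # The merged region inherits the neighbors of both children.
        neighbors = (adj.pop(u) | adj.pop(v)) - {u, v}
        for w in neighbors:
            adj[w] -= {u, v}
            adj[w].add(new.node_id)
        adj[new.node_id] = neighbors
    return nodes, sorted(adj)
```

Running `build_tree` with a loose threshold merges more regions than a strict one, which loosely mirrors the notion of partition layers at different scales. In the paper, the layers are defined by merging criteria of increasing complexity rather than by a single threshold.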