Efficiently and effectively identifying similar videos is an important and nontrivial problem in content-based video retrieval. This paper proposes a subspace symbolization approach, namely SUDS, for content-based retrieval on very large video databases. The novelty of SUDS is that it explores the data distribution in subspaces to build a visual dictionary; videos are then processed with techniques derived from string matching, after a two-step data simplification. Specifically, we first propose an adaptive approach, called VLP, to extract a series of dominant subspaces of variable lengths from the whole visual feature space, without requiring the dimensions of a subspace to be consecutive. A stable visual dictionary is built by clustering the video keyframes over each dominant subspace. A compact video representation model is developed by transforming each keyframe into a word, that is, a series of symbols with one symbol per dominant subspace, and then each video into a series of words. Next, we present an innovative similarity measure called CVE, which adopts a complementary information compensation scheme based on the visual features and sequence context of videos. Finally, an efficient two-layered index strategy with a number of query optimizations is proposed to facilitate video retrieval. The experimental results demonstrate the high effectiveness and efficiency of SUDS.
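To make the two-step simplification concrete, the following is a minimal Python sketch of the symbolization step only: each keyframe becomes a word (one symbol per subspace) and each video becomes a series of words. It is an illustration under stated assumptions, not the authors' implementation. The dominant subspaces and the per-subspace cluster centroids are taken as given inputs (the abstract does not specify how VLP or the clustering works), and assigning the nearest centroid by Euclidean distance is an assumption. All function names and the toy data are hypothetical. CVE and the two-layered index are not covered.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def project(vector, dims):
    """Keep only the listed dimensions; they need not be consecutive."""
    return [vector[d] for d in dims]


def symbolize_keyframe(vector, subspaces, centroids):
    """Map one keyframe feature vector to a word: one symbol per subspace.

    The symbol is the index of the nearest centroid in that subspace
    (nearest-centroid assignment is an assumption of this sketch).
    """
    word = []
    for dims, cents in zip(subspaces, centroids):
        point = project(vector, dims)
        symbol = min(range(len(cents)), key=lambda i: dist(point, cents[i]))
        word.append(symbol)
    return tuple(word)


def symbolize_video(keyframes, subspaces, centroids):
    """Map a video (a sequence of keyframe vectors) to a series of words."""
    return [symbolize_keyframe(k, subspaces, centroids) for k in keyframes]


# Toy data (hypothetical): a 5-dimensional feature space with two subspaces
# of different lengths and non-consecutive dimensions.
subspaces = [[0, 2], [1, 3, 4]]
centroids = [
    [[0.0, 0.0], [1.0, 1.0]],                              # subspace 0: 2 clusters
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.5, 0.5]],   # subspace 1: 3 clusters
]
video = [
    [0.1, 0.9, 0.0, 1.0, 0.8],
    [0.9, 0.4, 1.0, 0.5, 0.6],
]

words = symbolize_video(video, subspaces, centroids)
print(words)  # [(0, 1), (1, 2)]
```

Once videos are sequences of such words, comparing two videos reduces to comparing two symbol sequences, which is where string matching techniques become applicable.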