Content-based video copy detection over a large corpus with complex transformations is important but challenging. Most existing methods lack either the robustness needed to detect severely deformed copies or the accuracy needed to localize copy segments. In this paper, we propose a video copy detection approach that exploits complementary audio-visual features and sequential pyramid matching (SPM). Several independent detectors first match visual key frames or audio clips using individual features, and then aggregate the frame-level results into video-level results with SPM, which calculates video similarities by sequence matching at multiple granularities. Finally, the detection results from the basic detectors are fused and further filtered to generate the final result. Excellent performance on the TRECVid 2010 copy detection task demonstrates the effectiveness of our approach.
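The abstract does not give the SPM formulation, so the following is only a minimal illustrative sketch of what "sequence matching at multiple granularities" could look like. It assumes each frame-level match is a (query time, reference time, score) triple, splits both sequences into 2^l equal segments at level l, and counts a match as aligned when both timestamps fall into the same segment index. The function names, the normalization by total score, and the weighting that favors finer levels are all assumptions made for illustration, not the method described in the paper.

```python
from typing import List, Tuple

# One frame-level match: (query_time, reference_time, match_score).
# This representation is an assumption for the sketch.
FrameMatch = Tuple[float, float, float]


def _cell_index(t: float, start: float, length: float, cells: int) -> int:
    """Map a timestamp to one of `cells` equal-width segments, clamped to range."""
    idx = int((t - start) / length * cells)
    return max(0, min(idx, cells - 1))


def sequential_pyramid_similarity(
    matches: List[FrameMatch],
    query_span: Tuple[float, float],
    ref_span: Tuple[float, float],
    levels: int = 3,
) -> float:
    """Hypothetical SPM-style score in [0, 1] for non-negative match scores."""
    q_start, q_end = query_span
    r_start, r_end = ref_span
    q_len, r_len = q_end - q_start, r_end - r_start
    if q_len <= 0 or r_len <= 0 or levels < 1:
        raise ValueError("spans must have positive length and levels must be >= 1")

    total_score = sum(score for _, _, score in matches)
    if total_score <= 0:
        return 0.0

    # Assumed weighting: level l gets weight 2^l / (2^levels - 1), so the
    # weights sum to 1 and finer temporal granularities count more.
    weight_norm = 2 ** levels - 1
    similarity = 0.0
    for level in range(levels):
        cells = 2 ** level
        # Sum the scores of matches whose query and reference timestamps
        # land in the same segment at this granularity.
        aligned = sum(
            score
            for q_t, r_t, score in matches
            if _cell_index(q_t, q_start, q_len, cells)
            == _cell_index(r_t, r_start, r_len, cells)
        )
        similarity += (cells / weight_norm) * (aligned / total_score)
    return similarity


# Toy data: the same four frame matches, once in temporal order and once reversed.
aligned_matches = [(0.0, 10.0, 1.0), (1.0, 11.0, 1.0), (2.0, 12.0, 1.0), (3.0, 13.0, 1.0)]
reversed_matches = [(0.0, 13.0, 1.0), (1.0, 12.0, 1.0), (2.0, 11.0, 1.0), (3.0, 10.0, 1.0)]
print(sequential_pyramid_similarity(aligned_matches, (0.0, 4.0), (10.0, 14.0)))
print(sequential_pyramid_similarity(reversed_matches, (0.0, 4.0), (10.0, 14.0)))
```

In this toy example the temporally consistent matches score higher than the reversed ones, because the reversed matches agree only at the coarsest level. A bag-of-frames score that ignores temporal order would rate both sets identically, which is the kind of distinction a multi-granularity sequence match is meant to capture.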