In this paper, we propose a fast coarse-to-fine video retrieval scheme using shot-level spatio-temporal statistics. The scheme consists of a two-step coarse search followed by a fine search. In the coarse search stage, shot-level motion and color distributions are computed as spatio-temporal features for shot matching. The first-step coarse search uses these shot-level global statistics to drastically reduce the size of the search space. By adding a shot adjacent to the first query shot, the second-step coarse search introduces a "causality" relation between two consecutive shots to improve the search accuracy. Finally, the fine-search step refines the search result using local color features extracted from the key frames of the query shots. Our experimental results show that the proposed method achieves good retrieval performance at much lower complexity than single-pass methods.
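
The three-stage pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Shot` record, its field names, the L1 histogram distance, and the thresholds `t1` and `t2` are all assumptions introduced here for illustration, since the abstract does not specify the actual features or matching criteria. The sketch only shows the control flow: a cheap global filter on the first query shot, a consecutive-shot check on the survivors, and a key-frame-based ranking of what remains.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Shot:
    """Hypothetical shot record; all field names are illustrative."""
    video_id: str
    index: int                      # position of the shot within its video
    global_hist: Sequence[float]    # stand-in for shot-level motion/color statistics
    keyframe_hist: Sequence[float]  # stand-in for local key-frame color features


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two equal-length feature vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coarse_to_fine_search(
    query: List[Shot],
    database: List[Shot],
    t1: float,
    t2: float,
    top_k: int,
) -> List[Tuple[str, int, float]]:
    """Return up to top_k (video_id, shot_index, fine_distance) matches.

    query must hold two consecutive shots: query[0] and its successor query[1].
    """
    q0, q1 = query[0], query[1]

    # Coarse step 1: keep only shots whose global statistics are close to q0.
    candidates = [
        s for s in database if l1_distance(s.global_hist, q0.global_hist) <= t1
    ]

    # Coarse step 2: the shot that follows a candidate in the same video
    # must also match the second query shot.
    by_position = {(s.video_id, s.index): s for s in database}
    survivors = []
    for s in candidates:
        nxt = by_position.get((s.video_id, s.index + 1))
        if nxt is not None and l1_distance(nxt.global_hist, q1.global_hist) <= t2:
            survivors.append((s, nxt))

    # Fine step: rank the survivors by key-frame feature distance.
    scored = [
        (
            l1_distance(s.keyframe_hist, q0.keyframe_hist)
            + l1_distance(n.keyframe_hist, q1.keyframe_hist),
            s,
        )
        for s, n in survivors
    ]
    scored.sort(key=lambda pair: pair[0])
    return [(s.video_id, s.index, d) for d, s in scored[:top_k]]
```

In this sketch the more expensive key-frame comparison runs only on candidates that pass both coarse filters, which is where the complexity saving over a single-pass comparison against every shot would come from.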