In this paper, we propose a novel content-based video retrieval method for short video clips stored on consumer video-sharing Web sites. The method is based on the Earth Mover's Distance (EMD), which lets us evaluate the dissimilarity between videos that differ in number of shots and in duration. From each video we extract four kinds of features: color, motion, sound, and the position of shots. We integrate them by defining the ground distance of the EMD as the weighted sum of the Euclidean distances of these four features. In video retrieval experiments on YouTube videos, we obtained an average precision of up to 0.98, which shows the effectiveness of the proposed method. In addition, integrating the four kinds of features outperformed using any single feature, which shows that feature combination is effective.
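To make the role of the ground distance concrete, the following is a minimal sketch of the idea, not the implementation evaluated in the paper. It assumes each video is a list of (shot, weight) pairs, where a shot is a dictionary of four feature vectors and the weight is an integer such as the shot length in frames. The feature dimensions, the `alphas` mixing weights, and the function names `ground_distance` and `emd` are all illustrative assumptions. The transportation problem is solved here with a plain successive-shortest-path min-cost flow, chosen only to keep the example dependency-free.

```python
import math
from collections import deque

FEATURES = ("color", "motion", "sound", "position")  # the four feature kinds


def ground_distance(shot_a, shot_b, alphas):
    """Weighted sum of per-feature Euclidean distances between two shots."""
    return sum(alphas[f] * math.dist(shot_a[f], shot_b[f]) for f in FEATURES)


def emd(sig_p, sig_q, alphas):
    """EMD between two videos given as lists of (shot, integer_weight).

    Assumption: weights are positive integers, so the flow loop terminates.
    The total flow is min(sum of weights), which allows partial matching
    between videos with different numbers of shots and durations.
    """
    n, m = len(sig_p), len(sig_q)
    source, sink = n + m, n + m + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, (_, w) in enumerate(sig_p):
        add_edge(source, i, w, 0.0)
    for j, (_, w) in enumerate(sig_q):
        add_edge(n + j, sink, w, 0.0)
    for i, (shot_p, w_p) in enumerate(sig_p):
        for j, (shot_q, _) in enumerate(sig_q):
            add_edge(i, n + j, w_p, ground_distance(shot_p, shot_q, alphas))

    total_flow, total_cost = 0, 0.0
    while True:
        # Bellman-Ford (queue-based) shortest path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        in_queue = [False] * len(graph)
        dist[source] = 0.0
        queue = deque([source])
        in_queue[source] = True
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for k, (v, cap, cost, _) in enumerate(graph[u]):
                # Small tolerance guards against float round-off cycles.
                if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, k)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[sink] is None:
            break  # no augmenting path left: maximum flow reached

        # Find the bottleneck capacity along the path, then push flow.
        push, v = math.inf, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            edge = graph[u][k]
            edge[1] -= push
            graph[v][edge[3]][1] += push
            total_cost += push * edge[2]
            v = u
        total_flow += push

    # Normalize by the total flow, as in the standard EMD definition.
    return total_cost / total_flow if total_flow else 0.0
```

In a retrieval setting, one would compute `emd(query, candidate, alphas)` for every candidate video and rank candidates by increasing distance. Setting all but one entry of `alphas` to zero reduces the ground distance to a single feature, which corresponds to the single-feature baselines the abstract compares against.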
Published in:
Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
Date of Conference: 12-15 Oct. 2008