Approximately 10^5 video clips are posted every day on the Web. The popularity of Web-based video databases poses a number of challenges to machine vision scientists: how do we organize, index and search such a large wealth of data? Content-based video search and classification have been proposed in the literature and applied successfully to analyzing movies, TV broadcasts and lab-made videos. We explore the performance of some of these algorithms on a large data-set of approximately 3000 videos. We collected our data-set directly from the Web, minimizing bias for content or quality, so as to have a faithful representation of the statistics of this medium. We find that the algorithms that we have come to trust do not work well on Web video clips, because their quality is lower and their subject matter is more varied. We will make the data publicly available to encourage further research.