Abstract:
Many emerging application areas in video and image processing require real-time or faster visual concept detection. Examples include indexing of online user-generated video content and 24/7 archiving of TV broadcasts. The current state of the art in concept detection uses bag-of-visual-words features with computationally heavy kernel-based classifiers. We argue that this approach is not feasible for real-time applications, and propose instead to use combinations of fast linear classifiers. In experiments with the large-scale TRECVID 2011 video database and 50 concepts, we compare several methods to improve the retrieval performance of standard linear classifiers. Fusing classifiers trained on different features and using multi-learn and homogeneous kernel maps achieves state-of-the-art retrieval precision, while retaining real-time performance even for large sets of concepts.
Date of Conference: 11-15 November 2012
Date Added to IEEE Xplore: 14 February 2013
Conference Location: Tsukuba, Japan