Graph-based semi-supervised learning has been shown to address the scarcity of labeled data effectively and efficiently in many real-world applications, such as video concept detection. However, the similarity metric, a key component of these algorithms, has not been fully investigated. Specifically, existing approaches estimate the similarity between two samples from the spatial properties of video data; the temporal properties, an essential characteristic of video, are not embedded in the similarity measure. In this paper, we propose a novel framework based on spatio-temporal correlation for the task of video concept detection. The framework simultaneously takes into account both the spatial and temporal properties of video data to improve the computation of similarity. We apply the proposed framework to video concept detection and report superior performance compared to key existing approaches on the benchmark TRECVID data set.
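To make the idea of a similarity that uses both spatial and temporal properties concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not specify. It assumes a Gaussian kernel on feature vectors for the spatial term, an exponential decay in the temporal gap between shots for the temporal term, and a simple product to combine them. The function names, the parameters `sigma` and `tau`, and the toy data are all hypothetical.

```python
import math


def spatial_similarity(x, y, sigma=1.0):
    """Gaussian kernel on feature vectors, a common choice in graph-based SSL."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def temporal_similarity(t_i, t_j, tau=2.0):
    """Exponential decay in the temporal gap between two shots (assumed form)."""
    return math.exp(-abs(t_i - t_j) / tau)


def spatio_temporal_affinity(features, times, sigma=1.0, tau=2.0):
    """Build a symmetric affinity matrix W with a zero diagonal.

    W[i][j] is the product of the spatial and temporal terms; the product
    is an illustrative assumption, not the paper's combination rule.
    """
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = (spatial_similarity(features[i], features[j], sigma)
                 * temporal_similarity(times[i], times[j], tau))
            W[i][j] = W[j][i] = w
    return W


# Three hypothetical shots: shots 1 and 2 have identical features,
# but shot 1 is adjacent in time to shot 0 while shot 2 is far away.
features = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
times = [0, 1, 10]
W = spatio_temporal_affinity(features, times)
```

In this toy example, shots 1 and 2 are equally distant from shot 0 in feature space, so a purely spatial measure would weight both edges the same. With the temporal term included, the edge to the temporally adjacent shot receives the larger weight.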