Semantic feature extraction from video shots and fast video sequence matching are both required for efficient retrieval in a large video database. A novel mechanism for similarity retrieval is proposed, together with a similarity measure between video sequences that takes into account the spatio-temporal variation across consecutive frames. To bridge the semantic gap between low-level features and the rich meaning that users wish to capture, video shots are analyzed and characterized by the high-level feature of motion activity in the compressed domain. The extracted motion-activity features are further described by a 2D histogram that is sensitive to the spatio-temporal variation of moving objects. To reduce the dimensionality of the feature vector space in sequence matching, the discrete cosine transform (DCT) is used to map the semantic features of consecutive frames to the frequency domain while retaining the discriminatory information and preserving the Euclidean distance between feature vectors. Experiments are performed on MPEG-7 test video streams, and the sequence-matching results show that a few DCT-transformed coefficients are adequate, which demonstrates the effectiveness of the proposed video retrieval mechanism.
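The dimensionality-reduction step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_ortho`, `signature`, `euclidean`), the toy per-frame feature vectors, and the choice to apply the DCT along the time axis of each feature dimension are all assumptions made for illustration. It relies only on a general property of the orthonormal DCT-II: keeping every coefficient preserves Euclidean distance exactly, and keeping only the first few gives a distance that never exceeds the full one.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II of a list of floats (distance-preserving)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def signature(frames, keep):
    """Hypothetical sequence signature.

    frames: list of per-frame feature vectors (equal length).
    keep:   number of low-frequency DCT coefficients kept per dimension.
    The DCT is taken along time for each feature dimension (an assumption).
    """
    dims = len(frames[0])
    sig = []
    for d in range(dims):
        series = [frame[d] for frame in frames]  # one feature over time
        sig.extend(dct_ortho(series)[:keep])
    return sig

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Toy sequences: 4 frames, 2 illustrative feature values per frame.
seq_a = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
seq_b = [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7], [0.2, 0.8]]

full = euclidean(signature(seq_a, 4), signature(seq_b, 4))      # all coefficients
reduced = euclidean(signature(seq_a, 2), signature(seq_b, 2))   # truncated
print(full, reduced)
```

Because the truncated distance is a lower bound on the full distance, a candidate whose truncated distance already exceeds the search radius can be discarded without ever computing the full distance.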
Date of Conference: 10-12 Dec. 2003