By Topic

Multimodal feature fusion for robust event detection in web videos

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

8 Author(s)
Pradeep Natarajan ; Speech, Language and Multimedia Business Unit, Raytheon BBN Technologies, Cambridge, MA 02138 ; Shuang Wu ; Shiv Vitaladevuni ; Xiaodan Zhuang
more authors

Combining multiple low-level visual features is a proven and effective strategy for a range of computer vision tasks. However, limited attention has been paid to combining such features with information from other modalities, such as audio and videotext, for large scale analysis of web videos. In our work, we rigorously analyze and combine a large set of low-level features that capture appearance, color, motion, audio and audio-visual co-occurrence patterns in videos. We also evaluate the utility of high-level (i.e., semantic) visual information obtained from detecting scene, object, and action concepts. Further, we exploit multimodal information by analyzing available spoken and videotext content using state-of-the-art automatic speech recognition (ASR) and videotext recognition systems. We combine these diverse features using a two-step strategy employing multiple kernel learning (MKL) and late score level fusion methods. Based on the TRECVID MED 2011 evaluations for detecting 10 events in a large benchmark set of ~45000 videos, our system showed the best performance among the 19 international teams.

Published in:

Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on

Date of Conference:

16-21 June 2012