Merging Segmentations of Low-level and Mid-level Time Series for Audio Class Discovery

Authors:
Regunathan Radhakrishnan (Mitsubishi Electric Research Labs, 201 Broadway, Cambridge, MA 02139, USA); Ajay Divakaran

In our previous work, we proposed a time series analysis framework that detects outlying subsequences in an input time series to bring out events of interest in sports and surveillance audio [1]. The input time series in this framework can consist of either mid-level audio classification labels or low-level cepstral features. In this paper, we present an algorithm that uses kernel alignment to merge the segmentation results of these two time series representations of the same content. The algorithm first finds an optimal kernel bandwidth parameter (sigma) that aligns the similarity matrices obtained from the low-level and the mid-level time series. It then uses the gain in kernel alignment as a measure to further match the segmentations. Our results on sports audio show that the proposed algorithm combines the advantages of both the low-level and the mid-level time series, suppressing irrelevant patterns while retaining sufficient information for discovering key audio classes.
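
The first step described in the abstract, choosing a bandwidth that aligns the two similarity matrices, can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: a Gaussian kernel on the low-level features, a binary "same label" similarity matrix for the mid-level labels, the standard empirical kernel alignment (normalized Frobenius inner product), and a simple grid search over sigma. The function names and the toy data are hypothetical, and the second step (using the alignment gain to match segmentations) is not sketched here.

```python
import numpy as np

def gaussian_kernel(features, sigma):
    """Similarity matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(features ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def label_kernel(labels):
    """Assumed mid-level similarity: 1 where two frames share a label, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def kernel_alignment(k1, k2):
    """Empirical alignment <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return np.sum(k1 * k2) / np.sqrt(np.sum(k1 * k1) * np.sum(k2 * k2))

def best_sigma(features, labels, sigmas):
    """Grid search (an assumption, not the paper's stated optimizer) for the
    bandwidth whose Gaussian kernel best aligns with the label kernel."""
    target = label_kernel(labels)
    scores = [kernel_alignment(gaussian_kernel(features, s), target) for s in sigmas]
    best = int(np.argmax(scores))
    return sigmas[best], scores[best]

# Toy stand-in for cepstral features: two well-separated clusters of frames,
# with mid-level labels that agree with the clusters.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 0.3, (20, 4)), rng.normal(3.0, 0.3, (20, 4))])
labels = [0] * 20 + [1] * 20
sigmas = np.logspace(-2, 2, 30)

sigma, score = best_sigma(features, labels, sigmas)
print(f"selected sigma = {sigma:.3g}, alignment = {score:.3f}")
```

In this toy setting a very small sigma makes the Gaussian kernel nearly diagonal and a very large one makes it nearly constant; both align poorly with the block structure of the label kernel, so the search settles on an intermediate bandwidth.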

Published in:

2006 Fortieth Asilomar Conference on Signals, Systems and Computers

Date of Conference:

Oct. 29 - Nov. 1, 2006