By Topic

Perceptual audio features for unsupervised key-phrase detection

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Dirk von Zeddelmann ; Fraunhofer-FKIE, KOM Department, 53343 Wachtberg, Germany ; Frank Kurth ; Meinard Müller

We propose a new type of audio feature (HFCC-ENS) as well as an unsupervised method for detecting short sequences of spoken words (key-phrases) within long speech recordings. Our technical contributions are threefold: Firstly, we propose to use bandwidth-adapted filterbanks instead of classical MFCC-style filters in the feature extraction step. Secondly, the time resolution of the resulting features is adapted to account for the temporal characteristics of the spoken phrases. Thirdly, the key-phrase detection step is performed by matching sequences of the resulting HFCC-ENS features with features extracted from a target speech recording. We evaluate the proposed method using the German Kiel Corpus and furthermore investigate speech-related properties of the proposed feature.

Published in:

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Date of Conference:

14-19 March 2010