A novel feature-extraction for speech recognition based on multiple acoustic-feature planes

Author(s)
Nitta, T.; Multimedia Eng. Lab., Toshiba Corp., Kawasaki, Japan

This paper describes an attempt to incorporate functions of the auditory nervous system into the feature extractor of a speech recognizer. The functions cover four well-known types of response to sound stimuli: local peaks of the steady sound spectrum, ascending FM sound, descending FM sound, and sharply rising and falling sound. Each function is realized as a three-level derivative operator and applied to a time-spectrum (TS) pattern X(t,f) taken from the output of a 26-channel band-pass filter (BPF). The resulting acoustic cues of the input speech, represented as multiple acoustic-feature planes (MAFP), are compressed with the Karhunen-Loève transform (KLT) and then classified. In experiments on a Japanese E-set (12 consonantal parts of /Ci/) extracted from continuous speech, the MAFP reduced the error rate for unknown speakers to 17.0%, compared with 34.5% for X(t,f) and 29.6% for X(t,f)+ΔtX(t,f) (dimension=64).
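
The pipeline summarized in the abstract (derivative operators applied to X(t,f), stacking into feature planes, KLT compression) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the actual three-level operators, so the 3x3 kernels here are hypothetical stand-ins, and the names `KERNELS`, `extract_planes`, `fit_klt`, and `project` are invented for this sketch. The KLT is computed as a projection onto the leading principal axes of the training data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical 3x3 kernels (assumptions, NOT the operators from the paper).
# Rows index time frames, columns index frequency channels.
_RIDGE = np.array([[ 2, -1, -1],
                   [-1,  2, -1],
                   [-1, -1,  2]], dtype=float)

KERNELS = {
    # Second difference across frequency: responds to local spectral peaks.
    "spectral_peak": np.array([[-1, 2, -1]] * 3, dtype=float),
    # Ridge where frequency rises with time.
    "ascending_fm": _RIDGE,
    # Ridge where frequency falls with time.
    "descending_fm": _RIDGE[:, ::-1].copy(),
    # First difference across time: positive for onsets, negative for offsets.
    "onset_offset": np.array([[-1] * 3, [0] * 3, [1] * 3], dtype=float),
}


def extract_planes(ts_pattern):
    """Correlate a (frames, channels) pattern with each kernel.

    Returns an array of shape (4, frames - 2, channels - 2), one plane
    per kernel, in the order of KERNELS.
    """
    windows = sliding_window_view(ts_pattern, (3, 3))
    return np.stack(
        [np.einsum("tfij,ij->tf", windows, k) for k in KERNELS.values()]
    )


def fit_klt(samples, dim):
    """Estimate a KLT basis from training vectors of shape (n, d).

    Returns the sample mean and the `dim` leading principal axes
    (rows are orthonormal basis vectors).
    """
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:dim]


def project(vector, mean, basis):
    """Compress one flattened feature vector onto the KLT basis."""
    return basis @ (vector - mean)
```

In use, each utterance segment would be turned into planes with `extract_planes`, flattened with `.ravel()`, and compressed with `project`; the abstract reports results at dimension 64, which corresponds to `dim=64` here given enough training vectors.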

Published in:

Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, 1998 (Volume: 1)

Date of Conference:

12-15 May 1998