By Topic

The Automatic Speech Recognition System for Conversational Sound

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Sakai, T. ; Department of Electrical Engineering, Kyoto University, Kyoto, Japan. ; Doshita, S.

This paper describes the method and the system investigated to solve the problem encountered in the automatic recognition of speech sound. From research in the automatic analyzer of speech sound, a monosyllable recognition system was constructed in which the phoneme is used as the basic recognition unit. Recently this system has been developed to accept the conversational speech sound with unlimited vocabulary. The mechanical recognition of conversational speech sound requires two basic operations. One is the segmentation of the continuous speech sound into several discrete intervals (or segments), each of which may be thought to correspond to a phoneme, and the other is the pattern recognition of such segments. For segmentation, by defining two criteria, ``stability'' and ``distance,'' the properties of the time pattern obtained by the analysis of input speech sound may be examined. The principle of the recognition is based on the mechanism of the articulation in our speech organ. Corresponding to this, the machine has the functions called phoneme classification, vowel analysis and consonant analysis. A conversational speech recognition system with the phonetic contextual approach is also applied to the vowel recognition where the time pattern of input speech is matched with the stored standard patterns in which the phonetic contextual effects are taken into consideration. The time pattern which has great variety may be effectively expressed by the new representation of ``sequential pattern'' and ``weighting pattern.''

Published in:

Electronic Computers, IEEE Transactions on  (Volume:EC-12 ,  Issue: 6 )