The representation of continuous speech with a periodically sampled orthogonal basis

1 Author(s)
O'Neill, L. ; Bell Telephone Laboratories, Inc. Holmdel, N.J.

The efficient transmission or processing of speech requires a compromise between quality and bandwidth. Systems for bandwidth reduction, such as the vocoder, are usually designed to preserve the spectral content of the signal. High-quality systems, on the other hand, generally preserve waveshape by using high digital sampling rates. The determination of an adequate compromise is seriously impeded by the basic differences between these two approaches. The objective here is to investigate an analysis-synthesis procedure, previously used to represent other signals, as a vehicle for determining this compromise. The continuous speech is divided arbitrarily into time periods, and each period is expressed as a set of coefficients of an exponential expansion. The distinctive nature of speech is reflected in the choice of basis and analysis period rather than in special processing operations such as the pitch extraction of a vocoder. It has been demonstrated by digital simulation that, with a proper selection of parameters, both the temporal waveshape and the spectrum can be preserved by this method. The statistically selected basis consists of ten pairs of damped sines and cosines, and the experimentally chosen analysis period is 5.2 milliseconds. The coefficients of this expansion were measured by digital filtering on the computer. The simulated system is capable of synthesizing high-quality speech for speakers whose average pitch ranged from 80 to 245 Hz without changing either the basis or the period. Although the feasibility of such a system has been demonstrated, a detailed investigation of coding techniques will be necessary before its efficiency can be compared with that of other approaches.
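
The frame-by-frame expansion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the basis frequencies, the damping constant, or the sampling rate, so `FS`, `FREQS`, and `DECAY` below are purely illustrative assumptions, as are the function names. The paper measured the coefficients by digital filtering and selected the basis statistically; this sketch substitutes an ordinary least-squares fit and a hand-picked basis, so it shows the general idea only, not the author's method.

```python
import numpy as np

# All numeric values except the 5.2 ms period and the ten sine/cosine
# pairs are assumptions made for illustration only.
FS = 10_000                          # assumed sampling rate in Hz
FRAME_LEN = 52                       # 5.2 ms analysis period at the assumed FS
FREQS = np.linspace(150, 3000, 10)   # assumed frequencies of the ten pairs (Hz)
DECAY = 100.0                        # assumed damping constant (1/s)


def damped_basis(n_samples, fs, freqs_hz, decay_per_s):
    """Return an (n_samples, 2 * len(freqs_hz)) matrix of damped cos/sin columns."""
    t = np.arange(n_samples) / fs
    envelope = np.exp(-decay_per_s * t)
    columns = []
    for f in freqs_hz:
        columns.append(envelope * np.cos(2 * np.pi * f * t))
        columns.append(envelope * np.sin(2 * np.pi * f * t))
    return np.stack(columns, axis=1)


def analyze(signal, basis, frame_len):
    """Split the signal into fixed frames and fit coefficients per frame.

    Least squares stands in here for the digital filtering used in the paper.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    coeffs, *_ = np.linalg.lstsq(basis, frames.T, rcond=None)
    return coeffs.T  # shape: (n_frames, n_basis_functions)


def synthesize(coeffs, basis):
    """Rebuild the waveform by summing the weighted basis functions per frame."""
    return (coeffs @ basis.T).reshape(-1)


basis = damped_basis(FRAME_LEN, FS, FREQS, DECAY)

# Round trip on a synthetic signal that lies in the span of the basis.
rng = np.random.default_rng(0)
true_coeffs = rng.standard_normal((4, basis.shape[1]))
signal = synthesize(true_coeffs, basis)
est_coeffs = analyze(signal, basis, FRAME_LEN)
recon = synthesize(est_coeffs, basis)
```

Each frame is reduced to 20 coefficients in this sketch, compared with 52 samples under the assumed sampling rate. Real speech does not lie exactly in the span of such a basis, so the reconstruction would be approximate, with quality depending on how the basis is chosen.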

Published in:

IEEE Transactions on Audio and Electroacoustics (Volume: 17, Issue: 1)