The efficient transmission or processing of speech requires a compromise between quality and bandwidth. Systems for bandwidth reduction, such as the vocoder, are usually designed to preserve the spectral content of the signal. High-quality systems, on the other hand, generally preserve waveshape by using high digital sampling rates. The basic differences between these two approaches seriously impede the determination of an adequate compromise. The objective here is to investigate an analysis-synthesis procedure, previously used to represent other signals, as a vehicle for determining this compromise. The continuous speech is divided arbitrarily into time periods, and each period is expressed as a set of coefficients of an exponential expansion. The distinctive nature of speech is reflected in the choice of basis and analysis period rather than in special processing operations such as the pitch extraction of a vocoder. Digital simulation has demonstrated that, with a proper selection of parameters, this method can preserve both the temporal waveshape and the spectrum. The statistically selected basis consists of ten pairs of damped sines and cosines, and the experimentally chosen analysis period is 5.2 milliseconds. The coefficients of this expansion were measured by digital filtering on the computer. The simulated system is capable of synthesizing high-quality speech for speakers whose average pitch ranged from 80 to 245 Hz without changing either the basis or the period. Although the feasibility of such a system has been demonstrated, a detailed investigation of coding techniques will be necessary before its efficiency can be compared with that of other approaches.
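The frame-by-frame expansion described above can be illustrated with a minimal sketch. It is not the simulation reported here: the 10 kHz sampling rate, the evenly spaced frequencies, and the single decay rate are assumptions made for illustration (the actual basis was selected statistically), and the coefficients are obtained by a least-squares projection onto a Gram-Schmidt orthonormalized basis rather than by the digital filtering mentioned in the abstract. All function names are hypothetical.

```python
import math

FS = 10_000               # assumed sampling rate in Hz (not stated in the abstract)
FRAME_S = 0.0052          # analysis period from the abstract: 5.2 ms
N = round(FS * FRAME_S)   # samples per analysis frame under the assumed rate

# Hypothetical basis parameters; the paper's basis was chosen statistically.
FREQS_HZ = [400.0 * k for k in range(1, 11)]  # ten sine/cosine pairs
DAMPING = 100.0                               # assumed decay rate in 1/s


def damped_basis():
    """Return 20 vectors: one damped cosine and one damped sine per frequency."""
    t = [n / FS for n in range(N)]
    basis = []
    for f in FREQS_HZ:
        w = 2.0 * math.pi * f
        basis.append([math.exp(-DAMPING * x) * math.cos(w * x) for x in t])
        basis.append([math.exp(-DAMPING * x) * math.sin(w * x) for x in t])
    return basis


def dot(u, v):
    """Inner product of two equal-length sample vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Modified Gram-Schmidt: orthonormal vectors spanning the same subspace."""
    ortho = []
    for v in vectors:
        w = list(v)
        for q in ortho:
            c = dot(w, q)
            w = [a - c * b for a, b in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        ortho.append([a / norm for a in w])
    return ortho


def analyze(frame, ortho):
    """Expansion coefficients of one frame (least-squares projection)."""
    return [dot(frame, q) for q in ortho]


def synthesize(coeffs, ortho):
    """Rebuild a frame from its expansion coefficients."""
    out = [0.0] * len(ortho[0])
    for c, q in zip(coeffs, ortho):
        out = [o + c * b for o, b in zip(out, q)]
    return out


# Toy usage: expand and resynthesize one frame of a 120 Hz tone.
ortho = orthonormalize(damped_basis())
frame = [math.sin(2.0 * math.pi * 120.0 * n / FS) for n in range(N)]
coeffs = analyze(frame, ortho)
approx = synthesize(coeffs, ortho)
```

Each frame is reduced to 20 coefficients, one per basis function. How closely `approx` follows `frame` depends on how well the chosen basis spans typical speech frames, which is why the choice of basis and analysis period carries the speech-specific design in this approach.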