In spite of their great potential for bandwidth saving in long-distance telephony, vocoders have not found widespread acceptance. Two major problems have retarded their application. First is their strong electrical accent. Second is the so-called "pitch problem": namely, deducing the nature of the talker's vocal excitation from his speech waveform. The reliability of this deduction and measurement depends critically upon a high input speech-to-noise ratio, particularly between 50 and 200 cps. In many communication situations, this requirement precludes satisfactory operation. This limitation can be removed by a new method known as "voice excitation," which eliminates the necessity for a decision-making pitch detector. The principal advantage of voice excitation is its insensitivity to input signal-to-noise ratio and equalization. A voice-excited vocoder (VEV) with a 720 cps (250-970 cps) baseband and 17 spectrum channels low-passed to 25 cps each, covering the band 970-3700 cps, has been built and evaluated. The test shows an average PB-word intelligibility of 86%, compared to 92% for input speech of the same bandwidth, both with an 18 dB signal-to-noise ratio. Quality tests indicate that listeners rate VEV speech "as good as" the input in about 90% of the test utterances. Only 19% of conventional vocoder utterances were so considered. The vocoder performed about equally well for each of the 12 speakers in the quality test. Voice-discrimination tests indicate that voice identity is well preserved. Crucial factors influencing the remade speech quality are the accuracy of spectral flattening and the impulse response of the analyzer low-pass filters. These results indicate that the principle of voice excitation provides the key to practical speech bandwidth reduction.
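As a rough illustration of the bandwidth saving implied by the figures above, the following Python sketch adds up the baseband and the channel signals. It assumes that the transmitted bandwidth is simply the baseband width plus the sum of the 17 channel low-pass widths, with no allowance for guard bands or multiplexing overhead; that accounting, and the function and variable names, are illustrative assumptions rather than figures stated in the text.

```python
# Figures quoted in the abstract (cps = cycles per second, i.e. Hz).
BASEBAND_LOW_CPS = 250
BASEBAND_HIGH_CPS = 970
NUM_CHANNELS = 17
CHANNEL_LOWPASS_CPS = 25
SPECTRUM_HIGH_CPS = 3700


def vev_bandwidth_budget():
    """Return a rough bandwidth budget for the voice-excited vocoder.

    Assumption (not from the source): the transmitted bandwidth is the
    plain sum of the baseband and the channel low-pass widths.
    """
    baseband = BASEBAND_HIGH_CPS - BASEBAND_LOW_CPS       # uncoded baseband
    channels = NUM_CHANNELS * CHANNEL_LOWPASS_CPS         # spectrum channels
    transmitted = baseband + channels
    original = SPECTRUM_HIGH_CPS - BASEBAND_LOW_CPS       # full input band
    return {
        "baseband_cps": baseband,
        "channels_cps": channels,
        "transmitted_cps": transmitted,
        "original_cps": original,
        "compression_ratio": original / transmitted,
    }


if __name__ == "__main__":
    for name, value in vev_bandwidth_budget().items():
        print(f"{name}: {value}")
```

Under these assumptions the vocoder would occupy 1145 cps against 3450 cps for the 250-3700 cps input band, a saving of roughly three to one.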