This paper presents some possible acoustic feature differences between natural and synthesized speech. Sentences spoken by a natural adult male voice and sentences synthesized on a VOTRAX ML-1 Speech Synthesizer were recorded in a soundproof booth. Each recorded sentence was segmented into voiced, unvoiced, and silence regions. Parameters such as zero crossings, the linear prediction coefficient, and energy were used to make the classification. The results indicate that the synthesized speech tends to contain more unvoiced regions than the natural speech. The classification accuracy was 99% for the natural speech and 85% for the synthesized speech.
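As an illustration of the kind of frame-level classification the abstract describes, the following is a minimal sketch, not the authors' method. It computes short-time energy, a zero-crossing rate, and the first-order linear prediction coefficient (the normalized lag-1 autocorrelation) for one frame, then applies fixed thresholds. The function names, the frame length, and all threshold values are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math


def frame_features(frame):
    """Return (energy_db, zero_crossing_rate, lpc1) for one frame of samples.

    lpc1 is the first-order linear prediction coefficient, R(1)/R(0).
    """
    n = len(frame)
    r0 = sum(x * x for x in frame)                      # lag-0 autocorrelation
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))   # lag-1 autocorrelation
    energy_db = 10.0 * math.log10(r0 / n + 1e-12)       # small floor avoids log(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1)                           # sign changes per sample pair
    lpc1 = r1 / r0 if r0 > 0 else 0.0
    return energy_db, zcr, lpc1


def classify_frame(frame, silence_db=-40.0, zcr_max_voiced=0.25, lpc1_min_voiced=0.5):
    """Label a frame 'silence', 'voiced', or 'unvoiced'.

    All three thresholds are hypothetical values, not taken from the paper.
    """
    energy_db, zcr, lpc1 = frame_features(frame)
    if energy_db < silence_db:
        return "silence"
    # Voiced speech tends to be low-pass: few zero crossings, high lag-1 correlation.
    if zcr <= zcr_max_voiced and lpc1 >= lpc1_min_voiced:
        return "voiced"
    return "unvoiced"
```

Under these assumed thresholds, a 30 ms frame of a 120 Hz sinusoid sampled at 8 kHz would be labeled voiced, a frame of white noise unvoiced, and an all-zero frame silence. A practical system would tune the thresholds on labeled data, or replace them with a statistical decision rule.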
Published in:
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '79 (Volume: 4)
Date of Conference: Apr 1979