This paper presents a voice activity detection (VAD) algorithm based on the Wavelet Packet Transform and Teager Energy Operator (TEO) processing. The signal is first decomposed into subband signals, using the multi-resolution analysis property of the Wavelet Transform to extract and analyse the time-frequency components that correspond to speech. TEO processing is then applied to each subband signal to better distinguish the subbands that contain speech. The variances of the TEO-processed subband signals are summed to obtain a parameter called the Voice Activity Shape (VAS), which is higher in speech regions than in non-speech regions. Experimental results show that our VAD performs better than the G.729B VAD, particularly in difficult noise conditions and when the speech signal is passed through a nonlinear communication channel. Results are reported for real speech communications from a spaceship to a terrestrial 3G cellular network, assuming nonlinear interference.
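The processing chain described above (subband decomposition, TEO per subband, sum of subband variances) can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter, the decomposition depth of 3, the use of the population variance, and the function names `haar_split`, `wavelet_packet`, `teo`, and `vas` are all assumptions made for illustration, since the abstract does not specify the wavelet, the depth, or the framing.

```python
import math
from statistics import pvariance


def haar_split(x):
    """One Haar analysis step (assumed filter): return (lowpass, highpass) halves."""
    s = math.sqrt(2.0)
    n = len(x) - len(x) % 2  # ignore a trailing odd sample
    low = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return low, high


def wavelet_packet(x, depth):
    """Full wavelet packet tree: split every node, giving 2**depth subbands."""
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            next_bands.extend(haar_split(band))
        bands = next_bands
    return bands


def teo(x):
    """Discrete Teager energy: psi[n] = x[n]**2 - x[n-1] * x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def vas(frame, depth=3):
    """Voice Activity Shape for one frame: sum of the TEO variances over subbands.

    The depth and the population variance are illustrative assumptions.
    """
    return sum(pvariance(teo(band)) for band in wavelet_packet(frame, depth))
```

A detector built on this sketch would compute `vas` frame by frame and compare it with a threshold; how that threshold is chosen or adapted is not stated in the abstract and is left open here.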
Date of Conference: 20-25 July 2009