An Improved Voice Activity Detection Using Higher Order Statistics

Authors: Ke Li (Siemens, Beijing, China); M.N.S. Swamy; M.O. Ahmad

In this paper, we use the properties of the higher order statistics (HOS) of speech and noise signals to develop an improved voice activity detection (VAD) scheme. The proposed scheme employs the logarithm of the kurtosis of the LPC residual of a speech signal and is shown to be more effective and efficient in detecting active speech in medium to low signal-to-noise ratio (SNR) conditions, without being unduly affected by variations in the signal energy. To overcome the inability of HOS to detect unvoiced speech, a second metric, the low band to full band energy ratio, is introduced. Depending on the estimated mean SNR, the proposed scheme works adaptively in two modes: a simple mode using only the SNR, and an enhanced mode using the HOS, the low band to full band energy ratio, and the SNR. The adaptive scheme avoids unnecessary computation while maintaining the same performance as a scheme that works only in the enhanced mode. Simulation results are presented to demonstrate the effectiveness of the proposed voice activity detection scheme.
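
The central metric of the abstract, the logarithm of the kurtosis of the LPC residual, can be illustrated with a minimal sketch. This is not the authors' implementation: the LPC order (10), the autocorrelation method with a Levinson-Durbin recursion, the use of non-excess kurtosis, the frame length, and all function names are assumptions chosen for illustration. The low band to full band energy ratio and the SNR-driven mode switching are omitted because the abstract gives no thresholds or band edges for them.

```python
import math
import random

def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (unnormalized sums)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def lpc(frame, order):
    """Levinson-Durbin recursion; returns a[0..order] with a[0] = 1,
    so the prediction error is e[n] = sum_j a[j] * x[n - j]."""
    r = autocorr(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err <= 0.0:
        return a  # silent frame: leave the predictor as the identity
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:
            break
    return a

def lpc_residual(frame, order=10):
    """Filter the frame with its own inverse LPC filter A(z)."""
    a = lpc(frame, order)
    return [sum(a[j] * frame[n - j] for j in range(order + 1) if n - j >= 0)
            for n in range(len(frame))]

def kurtosis(x):
    """Non-excess kurtosis m4 / m2^2 (about 3 for Gaussian data)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) if m2 > 0.0 else 0.0

def log_kurtosis(frame, order=10, floor=1e-12):
    """Log of the residual kurtosis; `floor` (an assumption) guards log(0)."""
    return math.log(max(kurtosis(lpc_residual(frame, order)), floor))

# Synthetic frames, purely illustrative: Gaussian noise versus a
# "voiced-like" pulse train passed through a stable two-pole resonator.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
voiced = []
for n in range(400):
    pulse = 1.0 if n % 80 == 0 else 0.0
    y1 = voiced[-1] if n >= 1 else 0.0
    y2 = voiced[-2] if n >= 2 else 0.0
    voiced.append(pulse + 1.3 * y1 - 0.8 * y2)

print(log_kurtosis(noise), log_kurtosis(voiced))
```

In this toy setup the residual of the voiced-like frame is close to the sparse pulse train that excited the resonator, so its kurtosis is far above the Gaussian value, while the noise residual stays near Gaussian. Because kurtosis is a ratio of moments, scaling the frame leaves the metric unchanged, which is consistent with the abstract's remark that the detector is not unduly affected by variations in signal energy.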

Published in:

IEEE Transactions on Speech and Audio Processing (Volume 13, Issue 5)