Voice-enabled applications over the Internet are rapidly gaining popularity. Reducing the total bandwidth requirement can make a non-trivial difference for subscribers with low-speed connectivity. Voice activity detection (VAD) algorithms for VoIP applications can save bandwidth by filtering out the frames that do not contain speech. In this paper we introduce a novel technique for identifying the voice and silent regions of a speech stream that is well suited to VoIP calls. We use an information-theoretic measure, called spectral entropy, to differentiate silence from speech. Specifically, we develop a heuristic approach that uses an adaptive threshold to minimize missed detections in the presence of noise. The performance of our approach is compared with the relatively new 3GPP TS 26.194 (AMR-WB) standard and is also assessed with listeners' intelligibility ratings. Our algorithm yields comparatively better bandwidth savings while maintaining good quality in the speech streams.
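The abstract does not spell out the heuristic, so the following is only a minimal sketch of the general idea of spectral-entropy VAD with an adaptive threshold, not the authors' algorithm. It assumes that speech frames have a more peaked spectrum (lower entropy) than background noise, that the first frame contains noise only, and that the threshold tracks an exponentially smoothed noise-entropy estimate. The names `spectral_entropy` and `detect_voice` and the parameters `margin` and `alpha` are hypothetical, chosen for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real frame over bins 0..N/2, via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        # Assumption: an all-zero frame is treated as maximally flat (silence).
        return 1.0
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def detect_voice(frames, margin=0.2, alpha=0.9):
    """Return one boolean per frame: True where the frame is judged speech.

    Assumptions (not taken from the paper): the first frame is noise only,
    and a frame is speech when its entropy falls more than `margin` below
    the running noise-entropy estimate.
    """
    noise_entropy = spectral_entropy(frames[0])
    decisions = []
    for frame in frames:
        h = spectral_entropy(frame)
        is_speech = h < noise_entropy - margin
        if not is_speech:
            # Adapt the noise estimate only on frames judged non-speech.
            noise_entropy = alpha * noise_entropy + (1.0 - alpha) * h
        decisions.append(is_speech)
    return decisions
```

In a VoIP sender, frames flagged `False` would be the candidates for suppression. A real implementation would use an FFT rather than the naive DFT shown here, and would likely add hangover logic so that trailing speech is not clipped.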