This paper describes a fast multiscale time-domain technique for the analysis of natural speech waveforms in the presence of noise. The technique is based on the variance fractal dimension trajectory algorithm, which is used to detect not only the external boundaries of an utterance but also its internal pauses, which represent unvoiced speech. The algorithm can also identify internal features of phonemes. These features can be amplified so that speech utterances can be segmented into sentences, words and phonemes. Extensive experiments on speech utterances digitized at 44.1 kilosamples per second, with 16 bits per sample, show that this approach is superior to other energy-based boundary-detection techniques.
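As an illustration of the general idea, a variance fractal dimension trajectory can be sketched as follows. This is a minimal sketch and not the authors' implementation: it assumes the standard power-law relation Var[x(t+k) - x(t)] ∝ k^(2H) and the time-series dimension D = 2 - H. The function names, the lag set, the window and hop sizes, and the fallback value for constant windows are all hypothetical choices made for the example.

```python
import numpy as np


def variance_fractal_dimension(window, lags=(1, 2, 4, 8)):
    """Estimate D = 2 - H for one window of a 1-D signal.

    H is taken as half the slope of log(variance of increments)
    against log(lag). The lag set is an assumed, illustrative choice.
    """
    x = np.asarray(window, dtype=float)
    log_lag, log_var = [], []
    for k in lags:
        increments = x[k:] - x[:-k]      # x(t + k) - x(t)
        v = np.var(increments)
        if v > 0:                        # skip lags with zero variance
            log_lag.append(np.log(k))
            log_var.append(np.log(v))
    if len(log_lag) < 2:
        # Assumed fallback: a constant window (e.g. digital silence)
        # has no roughness, so report the topological dimension.
        return 1.0
    slope = np.polyfit(log_lag, log_var, 1)[0]   # slope estimates 2H
    return 2.0 - 0.5 * slope


def vfd_trajectory(signal, window_size=512, hop=256, lags=(1, 2, 4, 8)):
    """Slide a window over the signal and return one D value per window.

    window_size and hop are hypothetical defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - window_size + 1, hop)
    return np.array([
        variance_fractal_dimension(signal[s:s + window_size], lags)
        for s in starts
    ])
```

On a trajectory computed this way, noise-like windows sit near 2 and smoother, more strongly correlated windows sit closer to 1, which is the kind of contrast a boundary detector could threshold.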