The problem of estimating the pitch period of a speech waveform contaminated by acoustically coupled background noise is formulated to include the properties of the spectral envelope by postulating a state-variable model for the speech generation process. Applying the maximum likelihood estimation technique, the optimum processor uses a Kalman filter preprocessor to flatten the spectrum. The resulting signal is then passed through a bank of comb filters, and the optimum pitch corresponds to the comb filter for which the output energy is largest. The Kalman prefilter reduces to an LPC filter only when the speech is generated by an all-pole process and the signal-to-noise ratio is large. For the low signal-to-noise ratio case, a parallel formant speech generation model is more likely to lead to practical numerical algorithms for estimating the spectral coefficients.
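
The two-stage structure described above (spectral flattening followed by a comb filter bank) can be illustrated with a minimal sketch. It is not the paper's processor: it assumes the high signal-to-noise ratio, all-pole special case, so the Kalman prefilter is replaced by a plain LPC inverse filter, and each comb filter is approximated by averaging the flattened signal over whole candidate periods. The selected period is the one whose harmonic-passing comb output has the most energy per sample, which is the same as the one whose complementary notch comb leaves the least residual. The function names, the LPC order, the 8 kHz sampling rate, and the synthetic test signal are all hypothetical choices made for illustration.

```python
import numpy as np

def lpc_inverse_filter(x, order=10):
    """Flatten the spectrum of x with an LPC inverse filter (autocorrelation method).
    Assumption: stands in for the Kalman prefilter in the high-SNR, all-pole case."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Prediction residual e[n] = x[n] - sum_k a[k] * x[n - k].
    e = x.copy()
    for k in range(1, order + 1):
        e[k:] -= a[k - 1] * x[:-k]
    return e

def comb_bank_pitch(e, periods):
    """Return the candidate period (in samples) with the largest comb output energy."""
    e = np.asarray(e, dtype=float)
    scores = []
    for p in periods:
        k = len(e) // p                      # number of whole periods available
        frames = e[:k * p].reshape(k, p)     # one row per candidate period
        periodic = frames.mean(axis=0)       # comb output: average across periods
        scores.append(np.dot(periodic, periodic) / p)  # energy per sample
    scores = np.array(scores)
    return periods[int(np.argmax(scores))], scores

def synth_voiced(period=80, n=2400, noise_std=0.01, seed=0):
    """Hypothetical test signal: impulse train through a two-pole resonator plus noise."""
    rng = np.random.default_rng(seed)
    exc = np.zeros(n)
    exc[::period] = 1.0
    r, theta = 0.95, 2 * np.pi * 500 / 8000  # 500 Hz resonance at an assumed 8 kHz rate
    a1, a2 = 2 * r * np.cos(theta), -r * r
    y = np.zeros(n)
    for i in range(n):
        y[i] = exc[i]
        if i >= 1:
            y[i] += a1 * y[i - 1]
        if i >= 2:
            y[i] += a2 * y[i - 2]
    return y + noise_std * rng.standard_normal(n)

if __name__ == "__main__":
    x = synth_voiced(period=80)
    e = lpc_inverse_filter(x, order=4)
    best, _ = comb_bank_pitch(e, list(range(40, 150)))
    print("estimated pitch period (samples):", best)
```

Integer multiples of the true period fit a periodic signal equally well and capture slightly more noise, so this sketch keeps the candidate range below twice the true period; a practical estimator would need an explicit tie-break or penalty, which is not shown here.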