Research has shown that degrading acoustic background noise influences speech quality across phoneme classes in a nonuniform manner. As a result, many speech enhancement algorithms deliver variable quality in noisy environments. A phoneme classification procedure is proposed which directs single-channel constrained speech enhancement. The procedure performs broad phoneme class partitioning of noisy speech frames using a continuous mixture hidden Markov model recognizer in conjunction with a perceptually motivated cost-based decision process. Once noisy speech frames are identified, iterative speech enhancement based on all-pole parameter estimation with inter- and intra-frame spectral constraints is employed. The phoneme class-directed enhancement algorithm is evaluated using TIMIT speech data and shown to result in substantial improvement in objective speech quality over a range of signal-to-noise ratios and individual phoneme classes.
Published in: Speech and Audio Processing, IEEE Transactions on (Volume: 3, Issue: 1)
Date of Publication: Jan 1995
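
For readers unfamiliar with iterative enhancement based on all-pole parameter estimation, the following is a minimal sketch of the general idea for a single frame: estimate an all-pole model, build a Wiener-style gain from it, filter, and repeat. It is not the algorithm from the paper. It omits the inter- and intra-frame spectral constraints and the phoneme class direction entirely, and the function names, the model order, the iteration count, and the assumption that the noise power spectrum is known are all illustrative choices.

```python
import numpy as np

def lpc_autocorr(frame, order):
    """All-pole coefficients via the autocorrelation method (hypothetical helper)."""
    n = len(frame)
    # Autocorrelation lags r[0..order] (unnormalized sums).
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations, lightly regularized so the solve is well posed.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1 : order + 1])
    # Prediction-error energy acts as the model gain.
    gain = max(r[0] - a @ r[1 : order + 1], 1e-12)
    return a, gain

def iterative_wiener(noisy, noise_psd, order=10, n_iter=3, nfft=512):
    """Unconstrained iterative Wiener filtering of one frame (illustrative only).

    noise_psd: assumed-known noise power spectrum in |FFT|^2 units,
               one value per rfft bin (length nfft // 2 + 1).
    """
    noisy = np.asarray(noisy, dtype=float)
    Y = np.fft.rfft(noisy, nfft)
    est = noisy.copy()
    for _ in range(n_iter):
        # Re-estimate the all-pole model from the current speech estimate.
        a, g = lpc_autocorr(est, order)
        A = np.fft.rfft(np.concatenate(([1.0], -a)), nfft)
        speech_psd = g / np.maximum(np.abs(A) ** 2, 1e-12)
        # Real-valued gain between 0 and 1 in every frequency bin.
        H = speech_psd / (speech_psd + noise_psd)
        # Always filter the original noisy spectrum, then keep the frame span.
        est = np.fft.irfft(H * Y, nfft)[: len(noisy)]
    return est
```

For white noise of variance `sigma2` in an unwindowed frame of length `N`, a reasonable choice under these assumptions is `noise_psd = np.full(nfft // 2 + 1, N * sigma2)`. In practice the noise spectrum would have to be estimated, and unconstrained iterations can over-sharpen the model spectrum, which is the kind of behavior that spectral constraints are meant to limit.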