This paper investigates a possible relationship between the performance of a telephony speech recognition system and the objective speech quality assessment method described in ITU-T Recommendation P.862, known as Perceptual Evaluation of Speech Quality (PESQ). Experiments were conducted with various additive background noises and at different distances between the microphone and the sound source. The preliminary results suggest that telephony speech recognition rates can be mapped to the mean opinion score (MOS) obtained by PESQ using a relatively simple polynomial relationship. This indicates that the PESQ MOS can act as a reliable predictor of the achievable recognition rates for telephony-based speech recognition systems.
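As an illustration of what such a polynomial mapping might look like in practice, the following is a minimal sketch, not the authors' procedure. It assumes paired measurements of PESQ MOS and recognition rate (in percent) are already available, and fits a least-squares polynomial with NumPy. The function names `fit_mos_to_rate` and `predict_rate`, the default degree of 2, and the clipping of predictions to the 0-100 range are all assumptions made for this example; the abstract does not state the polynomial order or any coefficients.

```python
import numpy as np


def fit_mos_to_rate(mos, rate, degree=2):
    """Least-squares polynomial fit of recognition rate (%) against PESQ MOS.

    `degree` is an assumption: the abstract only says the relationship is a
    "relatively simple polynomial". Returns coefficients, highest power first.
    """
    mos = np.asarray(mos, dtype=float)
    rate = np.asarray(rate, dtype=float)
    if mos.ndim != 1 or mos.shape != rate.shape:
        raise ValueError("mos and rate must be 1-D sequences of equal length")
    if mos.size <= degree:
        raise ValueError("need more data points than the polynomial degree")
    return np.polyfit(mos, rate, degree)


def predict_rate(coeffs, mos):
    """Evaluate the fitted polynomial and clip the result to a valid percentage."""
    return np.clip(np.polyval(coeffs, mos), 0.0, 100.0)
```

Given measured pairs, `coeffs = fit_mos_to_rate(mos_values, rates)` followed by `predict_rate(coeffs, new_mos)` would give an estimated recognition rate for a new PESQ score. Such a fit is only meaningful within the MOS range covered by the measurements.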
Date of Conference: 17-21 May 2004