Previously, we compared several objective measures for estimating the subjective speech intelligibility scores of the Japanese Diagnostic Rhyme Test (DRT). PESQ-derived MOS, segmental SNR (SNRseg), frequency-weighted segmental SNR (fwSNRseg), and composite measures were tested. We mapped each measure to its corresponding intelligibility score using quadratic equations trained on one speaker and one noise type, and tested them on a different speaker of the same gender with the same noise type. Accurate intelligibility estimation was possible, especially with fwSNRseg and SNRseg. In this paper, we further investigate the estimation accuracy when the speaker gender or the noise type differs between training and testing. Accuracy hardly decreased when the speaker gender was mismatched, but decreased slightly when the noise type was mismatched. Even so, with fwSNRseg the correlation between subjective and estimated intelligibility remained above 0.8, whereas the other measures showed much lower correlation.
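The mapping procedure described above (fit a quadratic on one condition, apply it to another, then correlate estimated and subjective scores) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the function names are hypothetical, the fit is assumed to be ordinary least squares, the correlation is assumed to be Pearson's, and all numbers are synthetic placeholders rather than data from the study.

```python
import numpy as np


def fit_quadratic_mapping(measure, intelligibility):
    """Least-squares fit of intelligibility = a*m**2 + b*m + c.

    Returns the coefficients [a, b, c] (highest degree first).
    """
    return np.polyfit(measure, intelligibility, deg=2)


def estimate_intelligibility(coeffs, measure):
    """Apply a fitted quadratic mapping to objective measure values."""
    return np.polyval(coeffs, measure)


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic placeholder data (assumption): an objective measure in dB
# and a made-up quadratic relation to intelligibility in percent.
rng = np.random.default_rng(0)


def toy_relation(m):
    return -0.1 * m**2 + 4.0 * m + 40.0


# "Training" condition, e.g. one speaker and one noise type.
train_measure = np.linspace(-5.0, 15.0, 12)
train_scores = toy_relation(train_measure) + rng.normal(0.0, 2.0, 12)

# "Testing" condition, e.g. a different speaker or noise type.
test_measure = np.linspace(-4.0, 14.0, 10)
test_scores = toy_relation(test_measure) + rng.normal(0.0, 2.0, 10)

coeffs = fit_quadratic_mapping(train_measure, train_scores)
estimated = estimate_intelligibility(coeffs, test_measure)
correlation = pearson_correlation(test_scores, estimated)
print(f"coefficients: {coeffs}, test correlation: {correlation:.3f}")
```

In a mismatched-condition experiment, the training and testing arrays would come from different speaker genders or noise types, and the drop in correlation relative to the matched case would indicate how well the mapping generalizes.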