Abstract:
Four non-intrusive models are compared that predict human speech recognition thresholds (SRTs, i.e., the signal-to-noise ratios at which 50% of words are recognized) in different acoustic environments. Three of them use the blind binaural processing stage (bBSIM) as a front-end, while one model uses the spectral representation of the left and right ear signal channels together with their difference. Predictions are evaluated for three acoustic environments (anechoic, office, and cafeteria) with speech from the front and noise from different directions. Despite many technical differences across the models, all of them perform quite accurately (root mean squared prediction errors below 2.2 dB for all models). This implies that any of the non-intrusive models can be used to predict SRTs measured for listeners with normal hearing in stationary noise across different acoustic environments and spatial configurations.
Published in: Speech Communication; 14th ITG Conference
Date of Conference: 29 September 2021 - 01 October 2021
Date Added to IEEE Xplore: 21 December 2021
Print ISBN: 978-3-8007-5627-8
Conference Location: online