A Study of Salient Modulation Domain Features for Speaker Identification | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper studies the ranges of acoustic and modulation frequencies of speech most relevant for identifying speakers and compares the speaker-specific information present in the temporal envelope against that present in the temporal fine structure. This study uses correlation and feature importance measures, random forest and convolutional neural network models, and reconstructed speech signals with specific acoustic and/or modulation frequencies removed to identify the salient points. It is shown that the range of modulation frequencies associated with the fundamental frequency is more important than the 1–16 Hz range most commonly used in automatic speech recognition, and that the 0 Hz modulation frequency band contains significant speaker information. It is also shown that the temporal envelope is more discriminative among speakers than the temporal fine structure, but that the temporal fine structure still contains useful additional information for speaker identification. This research aims to provide a timely addition to the literature by identifying specific aspects of speech relevant for speaker identification that could be used to enhance the discriminant capabilities of machine learning models.
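
As an illustration of the envelope versus fine-structure distinction drawn in the abstract, the sketch below splits a signal into a temporal envelope and a temporal fine structure using the analytic signal (Hilbert transform). This is an assumption made for illustration: the abstract does not say which decomposition the authors use, and in practice it would typically be applied per acoustic subband rather than to the full signal. The function names are hypothetical, and a plain O(N^2) DFT is used only to keep the example free of third-party dependencies.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); fine for short demo signals."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def analytic_signal(x):
    """Build the analytic signal by zeroing negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0  # keep DC unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0  # keep the Nyquist bin unchanged
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        weights[k] = 2.0  # double the positive frequencies
    return idft([spectrum[k] * weights[k] for k in range(n)])


def envelope_and_fine_structure(x):
    """Return (temporal envelope, temporal fine structure) of a real signal.

    Envelope: magnitude of the analytic signal.
    Fine structure: cosine of its instantaneous phase.
    """
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]
    fine_structure = [math.cos(cmath.phase(v)) for v in z]
    return envelope, fine_structure


if __name__ == "__main__":
    # Hypothetical test tone: a carrier (32 cycles) amplitude-modulated
    # by a slow modulator (4 cycles) over a 256-sample window.
    n = 256
    tone = [
        (1 + 0.5 * math.cos(2 * math.pi * 4 * t / n))
        * math.cos(2 * math.pi * 32 * t / n)
        for t in range(n)
    ]
    env, tfs = envelope_and_fine_structure(tone)
    print(min(env), max(env))
```

For the amplitude-modulated tone in the usage example, the envelope recovers the slow modulator and the fine structure recovers the carrier, so multiplying the two reproduces the input. Removing or filtering one of the two parts before resynthesis is one way, under these assumptions, to probe how much speaker information each carries.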
Date of Conference: 14-17 December 2021
Date Added to IEEE Xplore: 03 February 2022
Conference Location: Tokyo, Japan
