Journals & Magazines >IEEE/ACM Transactions on Audi... >Volume: 31

Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

Automatic accent classification is an active research field concerning speech processing. It can be useful to identify a speaker's region of origin, which can be applied ...Show More

Metadata

Abstract:

Automatic accent classification is an active research field concerning speech processing. It can be useful to identify a speaker's region of origin, which can be applied in police investigations carried out by Law Enforcement Agencies, as well as for the improvement of current speech recognition systems. This article presents a novel descriptor called Grad-Transfer, extracted using the Gradient-weighted Class Activation Mapping (Grad-CAM) method based on convolutional neural network (CNN) interpretability. Additionally, we propose a methodology for accent classification that implements Grad-Transfer, which is based on transferring the knowledge acquired by a CNN to a classical machine learning algorithm. The article works on two hypotheses: the coarse localization maps produced by Grad-CAM on spectrograms are able to highlight the regions of the spectrograms that are important for predicting accents, and Grad-Transfer descriptors computed from audios represent distinctive descriptions of the target accents. These hypotheses were demonstrated experimentally, clustering the generated Grad-Transfer descriptors according to the original accent of the audios using Birch and

$k$ -means algorithms. We carried out experiments on the Voice Cloning Toolkit dataset, seeing an increase of macro average accuracy, and unweighted average recall in the results obtained by a Gaussian Naive Bayes classifier up to 23.00%, and 23.58%, respectively, compared to a model trained with spectrograms. This demonstrates that Grad-Transfer is able to improve the performance of accent classification models and opens the door to new implementations in similar tasks.

Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing ( Volume: 31)

Page(s): 2859 - 2871

Date of Publication: 21 July 2023

ISSN Information:

DOI: 10.1109/TASLP.2023.3297961

Funding Agency:

Contents

References is not available for this document.

Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping

Abstract:

Metadata

Abstract:

ISSN Information:

Funding Agency:

References

IEEE Account

Purchase Details

Profile Information

Need Help?

Improvement of Accent Classification Models Through Grad-Transfer From Spectrograms and Gradient-Weighted Class Activation Mapping

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

Funding Agency:

References

IEEE Account

Purchase Details

Profile Information

Need Help?