
Complementary Learning System Theory-based Active Learning for Audio Classification


Abstract:

Deep learning has significantly advanced audio classification, achieving remarkable results. However, these successes often rely on extensive manual annotation of audio, a labor-intensive and costly process. Active Learning (AL) presents a promising solution by minimizing the required amount of annotation through the iterative selection of the most informative audio samples. Current AL methods for audio classification typically depend solely on the latest model checkpoint, overlooking the dynamics of the entire training process. The Complementary Learning Systems (CLS) theory posits that the interplay between short-term and long-term memory systems can effectively measure sample uncertainty, offering a means to capture training dynamics. In this work, we introduce a novel AL framework for audio classification, termed CLS-AL, which addresses the limitations of existing methods by simultaneously maintaining both short-term and long-term memory models. This dual-memory approach allows for a more comprehensive consideration of training dynamics. The divergence in predictions between these memory models provides a new metric for evaluating the uncertainty of unlabeled samples, enhancing the effectiveness of the AL sample selection process. We demonstrate the effectiveness and generalizability of CLS-AL through extensive experiments on a diverse set of audio datasets, showing that CLS-AL clearly outperforms existing state-of-the-art methods.
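
The dual-memory scoring idea can be illustrated with a short sketch. The following is a minimal, hypothetical implementation, not the paper's code: it assumes the short-term memory is the latest checkpoint, the long-term memory is an exponential moving average (EMA) of its weights, and prediction divergence is measured by a symmetric KL divergence between class posteriors; names such as update_long_term, score_unlabeled, and ema_decay are illustrative.

```python
# Hedged sketch of CLS-style uncertainty scoring for active learning.
# Assumptions (not taken from the paper): the long-term memory is an EMA of the
# short-term model's weights, and divergence is a symmetric KL of softmax outputs.
import torch
import torch.nn.functional as F


@torch.no_grad()
def update_long_term(long_term, short_term, ema_decay=0.99):
    """Drift the long-term model toward the short-term model after each training step."""
    for p_lt, p_st in zip(long_term.parameters(), short_term.parameters()):
        p_lt.mul_(ema_decay).add_(p_st, alpha=1.0 - ema_decay)


@torch.no_grad()
def score_unlabeled(short_term, long_term, unlabeled_loader, device="cpu"):
    """Return one uncertainty score per unlabeled clip: the symmetric KL divergence
    between the two memory models' class posteriors (higher = more uncertain)."""
    short_term.eval()
    long_term.eval()
    scores = []
    for x in unlabeled_loader:                      # x: batch of audio features
        x = x.to(device)
        log_p = F.log_softmax(short_term(x), dim=-1)
        log_q = F.log_softmax(long_term(x), dim=-1)
        kl_pq = F.kl_div(log_q, log_p, log_target=True, reduction="none").sum(-1)
        kl_qp = F.kl_div(log_p, log_q, log_target=True, reduction="none").sum(-1)
        scores.append(0.5 * (kl_pq + kl_qp).cpu())
    return torch.cat(scores)
```

In a typical AL round under these assumptions, the long-term model would start as a copy of the short-term model, be updated with update_long_term while training on the labeled pool, and the top-scoring clips from score_unlabeled would then be sent for annotation.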
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India


I. Introduction

Applications of audio classification have grown significantly, and their impact has become increasingly evident [1], [2]. These applications span various domains, including acoustic scene classification [3], underwater acoustic signal classification [4], [5], and urban sound classification [6]. Although supervised and self-supervised learning methods achieve superior performance on these tasks, they typically require extensive labeling, which can be time-consuming and labor-intensive [7]. A key challenge in this field is to efficiently select the most informative audio data for labeling within a limited budget, thereby reducing costs while maximizing the model’s performance.
