Abstract:
Automatic speech recognition (ASR) in air traffic control (ATC) is a low-resource task with limited data and costly annotation. Fine-tuning self-supervised pre-trained models is a potential solution, but it is time-consuming and computationally expensive, and may degrade the model's ability to extract robust features. We therefore propose a continual learning approach for end-to-end ASR that maintains performance on both the new and the original tasks. To address catastrophic forgetting in continual learning for ASR, we propose a knowledge distillation-based method combined with stochastic encoder-layer fine-tuning. This approach efficiently retains knowledge from previous tasks with limited training data, reducing the need for extensive joint training. Experiments on open-source ATC datasets show that our method effectively reduces forgetting and outperforms existing techniques.
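The abstract does not give the exact formulation, but the two named ingredients admit a compact sketch. Below is a minimal, hypothetical PyTorch training step assuming a CTC-based end-to-end model, with a frozen copy of the previous-task model as the distillation teacher and only a randomly chosen subset of encoder layers unfrozen at each step. All names and hyperparameters (`student.encoder.layers`, `lam`, `k`, `temperature`) are illustrative, not taken from the letter.

```python
# Minimal sketch of KD-based continual learning with stochastic
# encoder-layer fine-tuning. Hypothetical model/batch layout: the
# model maps padded inputs to (B, T, C) logits for CTC decoding.
import random
import torch
import torch.nn.functional as F

def select_trainable_layers(encoder_layers, k):
    # Stochastic encoder-layer fine-tuning (assumed reading of the
    # abstract): freeze every encoder layer, then randomly unfreeze
    # k of them for this training step.
    for layer in encoder_layers:
        for p in layer.parameters():
            p.requires_grad = False
    for layer in random.sample(list(encoder_layers), k):
        for p in layer.parameters():
            p.requires_grad = True

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Standard soft-label KD: KL divergence between temperature-softened
    # teacher and student distributions (not necessarily the paper's
    # exact loss).
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

def training_step(student, teacher, batch, optimizer, lam=0.5, k=3):
    # One continual-learning step: CTC loss on the new (ATC) task plus
    # a distillation term that anchors the student to the frozen
    # previous-task teacher. lam and k are illustrative hyperparameters.
    select_trainable_layers(student.encoder.layers, k)
    student_logits = student(batch["inputs"])            # (B, T, C)
    with torch.no_grad():
        teacher_logits = teacher(batch["inputs"])        # (B, T, C)
    ctc = F.ctc_loss(
        student_logits.log_softmax(-1).transpose(0, 1),  # CTC expects (T, B, C)
        batch["targets"],
        batch["input_lengths"],
        batch["target_lengths"],
    )
    loss = ctc + lam * distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the teacher is only read under `torch.no_grad()` and most encoder layers stay frozen, each step touches far fewer parameters than full fine-tuning, which is consistent with the abstract's claim of reduced training cost.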
Published in: IEEE Signal Processing Letters (Volume: 32)