No Data Required: Zero-Shot Domain Adaptation for Automatic Music Transcription (IEEE Conference Publication)

Abstract:

Automatic music transcription (AMT) takes a music recording and outputs a transcription of the underlying music. Deep learning models trained for AMT rely on large amounts of annotated training data, which are available only for some domains such as Western classical piano music. Using pre-trained models on out-of-domain inputs can lead to significantly lower performance. Fine-tuning or retraining on new target domains is expensive and relies on the presence of labeled data. In this work, we propose a method for taking a pre-trained transcription model and improving its performance on out-of-domain data without the need for any training data, requiring no fine-tuning or retraining of the original model. Our method uses the model to transcribe pitch-shifted versions of an input, aggregating the output across these versions where the original model is unsure. We take a model originally trained for piano transcription and present experiments under two domain shift scenarios: recording condition mismatch (piano with different recording setups) and instrument mismatch (guitar and choral data). We show that our method consistently improves note- and frame-based performance.
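
The aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the aggregation rule, the set of pitch shifts, or the criterion for "unsure", so everything below is an assumption for illustration: the input is a toy (frames, pitches) array so that a pitch shift is a simple shift along the pitch axis (a real system would pitch-shift the audio itself), `model` is any callable returning per-frame pitch probabilities, outputs are averaged, and "unsure" means the original probability falls between the hypothetical thresholds `low` and `high`.

```python
import numpy as np


def shift_pitch_axis(x, semitones):
    """Shift a (frames, pitches) array along the pitch axis, zero-padding the edges."""
    out = np.zeros_like(x)
    if semitones == 0:
        out[:] = x
    elif semitones > 0:
        out[:, semitones:] = x[:, :-semitones]
    else:
        out[:, :semitones] = x[:, -semitones:]
    return out


def aggregate_over_shifts(model, x, shifts=(-1, 1), low=0.3, high=0.7):
    """Illustrative aggregation; NOT the paper's exact procedure.

    model  : callable mapping a (frames, pitches) input to probabilities
             of the same shape (hypothetical stand-in for a pre-trained model)
    shifts : pitch shifts in semitones (assumed values)
    low, high : assumed band in which the original output counts as "unsure"
    """
    base = model(x)
    realigned = [base]
    for s in shifts:
        probs = model(shift_pitch_axis(x, s))
        # Undo the shift so every output lines up with the original pitches.
        realigned.append(shift_pitch_axis(probs, -s))
    mean = np.mean(realigned, axis=0)
    # Only overwrite the original prediction where it was uncertain.
    unsure = (base > low) & (base < high)
    return np.where(unsure, mean, base)
```

The pre-trained model is only called, never updated, which matches the abstract's claim that no fine-tuning or retraining is required. Confident predictions are left untouched, so the extra transcriptions influence only the frames and pitches where the original output was ambiguous.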
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India
