Abstract:
Annotation of the perceived emotion of a music piece is required for an automatic music emotion recognition system. Most music emotion datasets are developed for Western pop songs. The problem is that a music emotion recognizer trained on such datasets may not work well for non-Western pop songs, due to differences in acoustic characteristics and emotion perception inherent in cultural background. The problem has also been observed in cross-cultural and cross-dataset studies; however, little work has explored how to adapt a model pre-trained on a source music genre to a target music genre of interest. In this paper, we propose to address the problem with an unsupervised adversarial domain adaptation method. It employs neural network models to make the target music indistinguishable from the source music in a learned feature representation space. Because emotion perception is multifaceted, three types of input feature representations, related to timbre, pitch, and rhythm, are considered for performance evaluation. The results show that the proposed method effectively improves the prediction of the valence of Chinese pop songs by a model trained on Western pop songs.
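To make the idea concrete, the adversarial scheme described in the abstract can be sketched in the style of DANN-style domain adaptation (gradient reversal): a shared feature extractor is trained both to predict valence on labeled source songs and to fool a source-vs-target domain discriminator, so source and target features become indistinguishable. This is only an illustrative sketch, not the paper's actual architecture; the layer sizes, learning rate, and synthetic "Western"/"Chinese" feature tensors below are all hypothetical stand-ins.

```python
# Minimal sketch of unsupervised adversarial domain adaptation via
# gradient reversal. All names, sizes, and data here are hypothetical.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

feature_extractor = nn.Sequential(nn.Linear(40, 16), nn.ReLU())  # shared encoder
emotion_regressor = nn.Linear(16, 1)   # valence head, trained on source labels only
domain_classifier = nn.Linear(16, 2)   # source-vs-target discriminator

opt = torch.optim.Adam(
    list(feature_extractor.parameters())
    + list(emotion_regressor.parameters())
    + list(domain_classifier.parameters()),
    lr=1e-2,
)

torch.manual_seed(0)
x_src = torch.randn(32, 40)         # labeled source features ("Western")
y_src = torch.randn(32, 1)          # valence annotations for the source
x_tgt = torch.randn(32, 40) + 1.0   # unlabeled, domain-shifted target ("Chinese")

for step in range(50):
    opt.zero_grad()
    f_src = feature_extractor(x_src)
    f_tgt = feature_extractor(x_tgt)
    # Supervised emotion loss: only the source domain has annotations.
    task_loss = nn.functional.mse_loss(emotion_regressor(f_src), y_src)
    # Adversarial loss: the discriminator learns to separate domains, while
    # the reversed gradient drives the encoder to make them indistinguishable.
    feats = torch.cat([GradReverse.apply(f_src, 1.0),
                       GradReverse.apply(f_tgt, 1.0)])
    dom_labels = torch.cat([torch.zeros(32, dtype=torch.long),
                            torch.ones(32, dtype=torch.long)])
    dom_loss = nn.functional.cross_entropy(domain_classifier(feats), dom_labels)
    (task_loss + dom_loss).backward()
    opt.step()
```

In this setup the target domain contributes no emotion labels at all; it influences training only through the domain loss, which is what makes the adaptation unsupervised.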
Date of Conference: 17-20 December 2018
Date Added to IEEE Xplore: 17 January 2019