Abstract:
Convolutional neural network (CNN)-based methods have dominated recent research on cover song identification (CSI). A typical example is the ByteCover system we proposed, which has achieved state-of-the-art results on all the mainstream CSI datasets. In this paper, we propose an upgraded version of ByteCover, termed ByteCover2, which further improves on ByteCover in both identification performance and efficiency. Compared with ByteCover, ByteCover2 adds a PCA-FC module, which combines principal component analysis (PCA) with a fully-connected (FC) neural network to reduce the dimensionality of the audio embedding, allowing ByteCover2 to perform CSI more precisely and efficiently. We evaluated ByteCover2 on multiple datasets with different embedding dimensions and training settings, where ByteCover2 outperformed all the compared methods, including ByteCover, even with an embedding dimension of 128, which is 15 times smaller than that of ByteCover.
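As a rough illustration of how PCA and an FC layer can be combined for dimensionality reduction, the sketch below builds the weights of a linear layer from a PCA projection, so the layer initially reproduces PCA and could afterwards be fine-tuned like any other FC layer. This is one possible reading and not necessarily the PCA-FC module of ByteCover2: the abstract does not describe the module's internals, the function name `pca_init_fc` is hypothetical, and the input size of 512 and the random data are made up for the example. Only the target dimension of 128 comes from the abstract.

```python
import numpy as np

def pca_init_fc(embeddings, out_dim):
    """Return (W, b) for a linear layer y = x @ W + b that equals a PCA projection.

    Hypothetical helper: it assumes the FC layer is initialised from PCA.
    """
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    W = vt[:out_dim].T   # shape (in_dim, out_dim)
    b = -mean @ W        # folds the mean-centering step into the bias
    return W, b

# Made-up data: 300 embeddings of size 512 (sizes chosen for the example only).
rng = np.random.default_rng(0)
x = rng.normal(size=(300, 512))

W, b = pca_init_fc(x, out_dim=128)
reduced = x @ W + b      # what the FC layer outputs before any fine-tuning
print(reduced.shape)
```

In a deep-learning framework, `W` and `b` would be copied into a trainable linear layer, which is what would let the reduction be learned jointly with the rest of the network under this assumption.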
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 23-27 May 2022
Date Added to IEEE Xplore: 27 April 2022