Abstract:
In this paper, we propose a new framework called independent deeply learned matrix analysis (IDLMA), which unifies a deep neural network (DNN) and independence-based mult...Show MoreMetadata
Abstract:
In this paper, we propose a new framework called independent deeply learned matrix analysis (IDLMA), which unifies a deep neural network (DNN) and independence-based multichannel audio source separation. IDLMA utilizes both pretrained DNN source models and statistical independence between sources for the separation, where the time-frequency structures of each source are iteratively optimized by a DNN while enhancing the estimation accuracy of the spatial demixing filters. As the source generative model, we introduce a complex heavy-tailed distribution to improve the separation performance. In addition, we address a semi-supervised situation; namely, a solo-recorded audio dataset can be prepared for only one source in the mixture signal. To solve the limited-data problem, we propose an appropriate data augmentation method to adapt the DNN source models to the observed signal, which enables IDLMA to work even in the semi-supervised situation. Experiments are conducted using music signals with a training dataset in both supervised and semi-supervised situations. The results show the validity of the proposed method in terms of the separation accuracy.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing ( Volume: 27, Issue: 10, October 2019)