In this work, we study different initialization methods for the nonnegative matrix factorization (NMF) dictionaries, or bases. Good initializations for the NMF dictionary are needed because NMF decomposition is a non-convex problem with many local minima. We evaluate the effect of NMF initialization on audio source separation applications. In supervised audio source separation, NMF is first used to train a set of basis vectors (a basis matrix) for each source in an iterative fashion. NMF is then used to decompose the mixed signal spectrogram as a weighted linear combination of the trained basis vectors of all sources in the mixed signal. The estimate for each source is computed by summing the decomposition terms that include its corresponding trained bases. In this work, we use principal component analysis (PCA), spherical K-means, and fuzzy C-means (FCM) to initialize the NMF basis matrices during the training procedure. Experimental results show that better initialization of the NMF bases gives better audio separation performance than NMF with random initialization.
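The supervised pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the NMF cost function, so the sketch assumes the Euclidean cost with standard multiplicative updates, and the function names (`spherical_kmeans_init`, `nmf`, `separate`) are hypothetical. Spectrograms are assumed to be nonnegative arrays of shape (frequency bins, time frames). Only the spherical K-means initialization is shown; PCA and FCM would replace the first function.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def spherical_kmeans_init(V, k, n_iter=20, seed=0):
    """Cluster unit-norm spectrogram frames by cosine similarity.

    Returns a (freq x k) strictly positive matrix of centroids,
    usable as an initial NMF basis matrix.
    """
    rng = np.random.default_rng(seed)
    X = V / (np.linalg.norm(V, axis=0, keepdims=True) + EPS)
    # Start from k distinct frames (requires k <= number of frames).
    C = X[:, rng.choice(X.shape[1], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmax(C.T @ X, axis=0)  # nearest centroid by cosine
        for j in range(k):
            members = X[:, labels == j]
            if members.shape[1] > 0:
                c = members.sum(axis=1)
                C[:, j] = c / (np.linalg.norm(c) + EPS)
    # Keep entries strictly positive so multiplicative updates can move them.
    return np.maximum(C, EPS)


def nmf(V, W, n_iter=200, update_W=True, seed=0):
    """Multiplicative-update NMF (Euclidean cost, an assumption here).

    W is the initial basis matrix; the gains H start from random values.
    With update_W=False the bases stay fixed and only H is estimated.
    """
    rng = np.random.default_rng(seed)
    W = W.copy()
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, bases):
    """Decompose the mixture over the concatenated trained bases and
    return one estimate per source (the terms using that source's bases)."""
    W_all = np.hstack(bases)
    _, H = nmf(V_mix, W_all, update_W=False)
    estimates, start = [], 0
    for W_s in bases:
        stop = start + W_s.shape[1]
        estimates.append(W_s @ H[start:stop])
        start = stop
    return estimates
```

For training, each source's basis matrix would be obtained as `W_s, _ = nmf(V_s, spherical_kmeans_init(V_s, k))`, where `V_s` is that source's training spectrogram, and the list of trained matrices is then passed to `separate` together with the mixture spectrogram. Swapping `spherical_kmeans_init` for a random positive matrix gives the random-initialization baseline the abstract compares against.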