This paper proposes a new approach to speech feature extraction, nonnegative tensor principal component analysis (NTPCA) with a sparse constraint. We encode speech as a general higher-order tensor in order to extract robust features from multiple interrelated feature subspaces. First, speech signals are represented by a cochleagram, which models the frequency selectivity of the basilar membrane and inner hair cells. Then, NTPCA extracts a low-dimensional sparse representation based on the tensor structure for robust speaker modeling. An alternating projection algorithm is used to obtain a stable solution and to ensure that the useful information in each subspace of the higher-order tensor is preserved. Experimental results demonstrate that our method increases recognition accuracy, particularly in noisy environments.
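The alternating projection idea can be illustrated with a minimal NumPy sketch. It is not the algorithm of this paper: the update rule is a generic projected-gradient heuristic (gradient step, soft threshold for sparsity, clipping at zero for nonnegativity, column normalization), and the tensor layout (samples x frequency x time) is assumed for illustration. All names and parameters (`unfold`, `mode_product`, `update_basis`, `ntpca_sketch`, `project`, `sparsity`, `overlap`, `ranks`) are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (r, I_mode)."""
    out = np.tensordot(matrix, np.moveaxis(tensor, mode, 0), axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def update_basis(scatter, basis, sparsity, overlap=0.5, steps=50):
    """Heuristic update of one mode's basis (assumed rule, not the paper's)."""
    step = 1.0 / (np.linalg.norm(scatter, 2) + 1e-12)
    for _ in range(steps):
        gram = basis.T @ basis
        offdiag = gram - np.diag(np.diag(gram))
        # Ascent on tr(U^T S U), minus a soft penalty on column overlap.
        basis = basis + step * (scatter @ basis) - overlap * (basis @ offdiag)
        # Soft threshold (sparsity), then clip at zero (nonnegativity).
        basis = np.maximum(basis - step * sparsity, 0.0)
        # Rescale columns to unit length; all-zero columns stay zero.
        basis = basis / np.maximum(np.linalg.norm(basis, axis=0), 1e-12)
    return basis

def ntpca_sketch(samples, ranks, sparsity=0.01, sweeps=5, seed=0):
    """Alternate over modes: fix the other bases, then update one basis."""
    rng = np.random.default_rng(seed)
    dims = samples.shape[1:]  # axis 0 indexes samples
    bases = [np.abs(rng.standard_normal((d, r))) for d, r in zip(dims, ranks)]
    bases = [b / np.linalg.norm(b, axis=0) for b in bases]
    for _ in range(sweeps):
        for n in range(len(dims)):
            partial = samples
            for k in range(len(dims)):
                if k != n:
                    # Project every mode except n onto its current basis.
                    partial = mode_product(partial, bases[k].T, k + 1)
            mat = unfold(partial, n + 1)
            scatter = mat @ mat.T / samples.shape[0]  # uncentered second moment
            bases[n] = update_basis(scatter, bases[n], sparsity)
    return bases

def project(samples, bases):
    """Low-dimensional features: project each sample on every mode basis."""
    out = samples
    for k, basis in enumerate(bases):
        out = mode_product(out, basis.T, k + 1)
    return out

# Usage on synthetic nonnegative data standing in for cochleagram patches.
data = np.abs(np.random.default_rng(1).standard_normal((20, 8, 6)))
bases = ntpca_sketch(data, ranks=(3, 2))
features = project(data, bases)
```

Each sweep updates one mode while the others are held fixed, so every subspace is refined using the data as seen through the current estimates of the remaining subspaces. The overlap penalty only discourages the columns of a basis from collapsing onto the same direction; it does not enforce exact orthogonality.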