Visualization of emotional speech data is an important tool for speech researchers seeking deeper insight into the structure of complex, multidimensional data. A visualization method is presented that combines feature selection and classifier optimization to learn Isomap manifolds of emotional speech data. The resulting manifold is based on the features that best discriminate between the given emotional classes in a target space of specified embedding dimension. A nonlinear mapping function based on generalized regression neural networks (GRNNs) provides generalization to new data. A low-dimensional manifold of emotional speech data comprising neutral, sad, angry, and happy expressions was constructed from prosodic and acoustic features of speech. Experimental results indicate that a 3-D embedding yields the best classification performance. The manifold structure is readily visualized and matches the circumplex and conical shapes predicted by dimensional models of emotion. Listening tests show excellent correlation between the organization of the data on the manifold and listeners' judgments of emotional intensity.
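The two computational ingredients named above, Isomap embedding and a GRNN mapping for out-of-sample points, can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the synthetic feature matrix, the neighborhood size, and the kernel width are placeholder assumptions, and the GRNN is written in its standard Nadaraya-Watson form (a Gaussian-kernel-weighted average of training embeddings).

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Placeholder stand-in for selected prosodic/acoustic feature vectors
# (e.g., pitch, energy, duration statistics); real work would use the
# discriminative features chosen by the paper's selection step.
X = rng.normal(size=(200, 10))

# 3-D Isomap embedding -- the dimensionality the abstract reports as
# giving the best classification performance.
iso = Isomap(n_neighbors=10, n_components=3)
Y = iso.fit_transform(X)


def grnn_predict(x_new, X_train, Y_train, sigma=1.0):
    """Map a new feature vector onto the learned manifold.

    A GRNN reduces to kernel regression: the output is the
    Gaussian-weighted average of the training embeddings, so nearby
    training points dominate the prediction. sigma is an assumed
    smoothing parameter.
    """
    d2 = np.sum((X_train - x_new) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma**2))
    return w @ Y_train / w.sum()


# Embed an unseen utterance's features without refitting Isomap.
y_new = grnn_predict(rng.normal(size=10), X, Y)
print(Y.shape, y_new.shape)
```

In this setup the GRNN serves the role the abstract assigns it: Isomap itself only embeds the training set, so the kernel regressor provides the generalization needed to place new utterances on the manifold.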