Skip to Main Content
In this paper we present a novel method for extracting affective information from movies, based on speech data. The method is based on a 2D representation of speech emotions (Emotion Wheel). The goal is twofold. First, to investigate whether the Emotion Wheel offers a good representation for emotions associated with speech signals. To this end, several humans have manually annotated speech data from movies using the Emotion Wheel and the level of disagreement has been computed as a measure of representation quality. The results indicate that the emotion wheel is a good representation of emotions in speech data. Second, a regression approach is adopted, in order to predict the location of an unknown speech segment in the Emotion Wheel. Each speech segment is represented by a vector of ten audio features. The results indicate that the resulting architecture can estimate emotion states of speech from movies, with sufficient accuracy.