Emotion expression is an essential part of human interaction, and rich emotional information is conveyed through the human face. In this study, we analyze detailed motion-captured facial data from ten speakers, male and female, during emotional speech. We derive compact facial representations using methods motivated by Principal Component Analysis and speaker face normalization. Moreover, we model emotional facial movements by conditioning on knowledge of speech-related movements (articulation). In speaker-independent experiments, we achieve average classification accuracies on the order of 75% for happiness, 50-60% for anger and sadness, and 35% for neutrality. We also find that dynamic modeling and the use of viseme information improve recognition accuracy for anger, happiness, and sadness, as well as overall unweighted performance.
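The pipeline of speaker face normalization followed by a PCA-motivated dimensionality reduction can be sketched as below. This is an illustrative sketch only: the synthetic marker data, the choice of per-speaker mean removal as the normalization step, and the number of retained components are all assumptions, not the paper's actual method or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 frames of 30 marker coordinates per speaker,
# for 2 speakers with different baseline face shapes (the study used 10 speakers).
speakers = [rng.normal(loc=mu, scale=1.0, size=(200, 30)) for mu in (0.0, 5.0)]

# Speaker face normalization, assumed here to be per-speaker mean removal,
# so speaker-specific face geometry does not dominate the learned components.
normalized = [x - x.mean(axis=0, keepdims=True) for x in speakers]
data = np.vstack(normalized)  # pooled frames: shape (400, 30)

# PCA via SVD on the pooled, normalized frames.
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component

k = 5  # assumed size of the compact facial representation
compact = centered @ vt[:k].T  # low-dimensional features: shape (400, 5)

print(compact.shape)
print(round(float(explained[:k].sum()), 3))
```

The compact per-frame features would then feed a downstream emotion classifier; pooling normalized frames across speakers is what makes the learned subspace usable in speaker-independent experiments.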