Emotion Recognition in Speech by Multimodal Analysis of Audio and Text

IEEE Conference Publication

Abstract:

Emotion recognition remains a challenging research task because of its sensitive and multifaceted nature. It has recently attracted considerable attention owing to its significance in psychology, human-computer interaction, and healthcare, where people’s facial expressions, voice qualities, and spoken words are used to better understand their emotions. While emotion recognition has the potential to help address various health problems, the main challenge such systems face is accurately identifying hidden nuances in expressions and thus the underlying emotions they convey. A person’s true emotions may remain concealed or be misidentified when only one mode of input is analyzed; therefore, multimodal streams of input are used to provide a more holistic view of those emotions. This paper proposes a novel framework that fuses the results of two uni-modal methods of emotion recognition, audio and text, to develop a robust and versatile emotion recognition system. The results show that signal processing and language processing can be used to reliably detect emotion from audio and text, with accuracies of 96% and 94.1%, respectively. Further, the approach presented in this paper can be used as a depression detection and monitoring tool to help mental healthcare professionals accurately detect symptoms of depression.
Date of Conference: 19-20 January 2023
Date Added to IEEE Xplore: 22 February 2023
Conference Location: Noida, India

I. Introduction

Humans have an innate ability to recognize the emotions conveyed by other humans. Emotion recognition drives interaction between people and is crucial in social environments for mutual understanding, gauging affability, and avoiding conflict. Humans are very revealing by nature and disclose their emotions in many ways [1]. For example, even something as seemingly uninvolved as the eyebrows can convey a multitude of emotions [2]. Although emotion recognition is often taken for granted because it comes so naturally, its potential in a wide variety of applications cannot be overstated.
