M3F: Multi-Modal Continuous Valence-Arousal Estimation in the Wild | IEEE Conference Publication | IEEE Xplore


Abstract:

In this paper, we propose a multi-modal multi-feature (M³F) approach for in-the-wild valence-arousal estimation. The proposed M³F framework fuses visual features from videos with acoustic features from the audio tracks to estimate valence and arousal. We follow a CNN-RNN paradigm, in which spatio-temporal visual features are extracted with a 3D convolutional network and/or a pretrained 2D convolutional network, together with a bidirectional recurrent neural network. We evaluate the M³F framework on the validation set provided by the Affective Behavior Analysis in-the-wild (ABAW) Challenge, held in conjunction with the IEEE International Conference on Automatic Face and Gesture Recognition (FG) 2020, where it significantly outperforms the baseline method.
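
The fuse-then-recur idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The assumptions here are: per-frame visual and acoustic feature vectors are already extracted (the CNN stage is not modelled), fusion is plain concatenation, the recurrent layer is a simple Elman-style RNN run in both directions, weights are random and untrained, and outputs are squashed to [-1, 1]. All function names and sizes are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a small random weight matrix (rows x cols). Untrained, illustration only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_pass(inputs, w_in, w_rec):
    """Run a simple tanh RNN over a sequence and return the hidden state at each step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def estimate_va(visual_seq, audio_seq, hidden_size=8, seed=0):
    """Fuse per-frame features, run a bidirectional RNN, return (valence, arousal) per frame."""
    rng = random.Random(seed)
    # Assumed fusion scheme: concatenate both modalities frame by frame.
    fused = [v + a for v, a in zip(visual_seq, audio_seq)]
    in_dim = len(fused[0])
    w_in_f = make_matrix(hidden_size, in_dim, rng)
    w_rec_f = make_matrix(hidden_size, hidden_size, rng)
    w_in_b = make_matrix(hidden_size, in_dim, rng)
    w_rec_b = make_matrix(hidden_size, hidden_size, rng)
    w_out = make_matrix(2, 2 * hidden_size, rng)
    # Forward pass over time, and a backward pass re-aligned to frame order.
    forward = rnn_pass(fused, w_in_f, w_rec_f)
    backward = rnn_pass(fused[::-1], w_in_b, w_rec_b)[::-1]
    outputs = []
    for h_fwd, h_bwd in zip(forward, backward):
        valence, arousal = matvec(w_out, h_fwd + h_bwd)
        # tanh keeps both estimates in [-1, 1] (an assumed output range).
        outputs.append((math.tanh(valence), math.tanh(arousal)))
    return outputs


# Toy input: 5 frames, 3 visual and 2 acoustic features per frame (made-up values).
visual = [[0.1 * t, 0.2, -0.1] for t in range(5)]
audio = [[0.05 * t, 0.3] for t in range(5)]
predictions = estimate_va(visual, audio)
```

Each frame's estimate depends on both earlier and later frames, which is the point of making the recurrent layer bidirectional. A real system would replace the random weights with trained ones and the toy vectors with CNN and audio features.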
Date of Conference: 16-20 November 2020
Date Added to IEEE Xplore: 18 January 2021
Conference Location: Buenos Aires, Argentina
