Audio and Video-based Emotion Recognition using Multimodal Transformers | IEEE Conference Publication | IEEE Xplore