Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network | IEEE Conference Publication | IEEE Xplore