Learning Cross-Modal Audiovisual Representations with Ladder Networks for Emotion Recognition | IEEE Conference Publication | IEEE Xplore