Representation Learning Through Cross-Modal Conditional Teacher-Student Training For Speech Emotion Recognition | IEEE Conference Publication