Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive Learning | IEEE Conference Publication | IEEE Xplore