MOMA: Mixture-of-Modality-Adaptations for Transferring Knowledge from Image Models Towards Efficient Audio-Visual Action Recognition | IEEE Conference Publication | IEEE Xplore