Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection | IEEE Conference Publication | IEEE Xplore