Cross-Aligned Fusion For Multimodal Understanding | IEEE Conference Publication | IEEE Xplore