A Cross-Modal Alignment Method Based on Adaptive Feature Aggregation and Spatial Fusion Mechanism | IEEE Conference Publication | IEEE Xplore