Abstract:
Visible-infrared (RGB-IR) image fusion has shown great potential in object detection based on unmanned aerial vehicles (UAVs). However, the weak misalignment problem between multimodal image pairs limits its performance in object detection. Most existing methods often ignore the modality gap and emphasize a strict alignment, resulting in an upper bound on alignment quality and an increase in implementation costs. To address these challenges, we propose a novel method named Offset-guided Adaptive Feature Alignment (OAFA), which can adaptively adjust the relative positions between multimodal features. Considering the impact of the modality gap on cross-modality spatial matching, a Cross-modality Spatial Offset Modeling (CSOM) module is designed to establish a common subspace in which to estimate precise feature-level offsets. Then, an Offset-guided Deformable Alignment and Fusion (ODAF) module is utilized to implicitly capture optimal fusion positions for the detection task rather than conducting a strict alignment. Comprehensive experiments demonstrate that our method not only achieves state-of-the-art performance in the UAV-based object detection task but also shows strong robustness to the weak misalignment problem.
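To make the offset-guided alignment idea concrete, below is a minimal sketch (not the authors' code) of how predicted cross-modality offsets can drive deformable resampling of one modality's features before fusion: a small convolutional head predicts per-location sampling offsets from the concatenated RGB/IR features (a stand-in for CSOM's common-subspace offset estimation), and a deformable convolution resamples the IR feature map at those positions rather than enforcing strict pixel-level alignment (the role ODAF plays). The module name OffsetGuidedAlign, the additive fusion, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class OffsetGuidedAlign(nn.Module):
    """Illustrative offset-guided deformable alignment (not the paper's implementation)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.k = kernel_size
        # Predict 2 (x, y) offsets per kernel sampling point from the
        # concatenated cross-modality features.
        self.offset_head = nn.Conv2d(
            2 * channels, 2 * kernel_size * kernel_size,
            kernel_size=3, padding=1)
        # Weights of the deformable conv that resamples the IR features.
        self.weight = nn.Parameter(
            torch.randn(channels, channels, kernel_size, kernel_size) * 0.01)

    def forward(self, feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        # Estimate feature-level offsets from both modalities jointly.
        offsets = self.offset_head(torch.cat([feat_rgb, feat_ir], dim=1))
        # Resample IR features at the predicted positions instead of
        # enforcing a strict alignment.
        aligned_ir = deform_conv2d(feat_ir, offsets, self.weight,
                                   padding=self.k // 2)
        return feat_rgb + aligned_ir  # simple additive fusion for illustration

# Usage: align and fuse 64-channel feature maps from the two modalities.
rgb = torch.randn(1, 64, 80, 80)
ir = torch.randn(1, 64, 80, 80)
fused = OffsetGuidedAlign(64)(rgb, ir)
print(fused.shape)  # torch.Size([1, 64, 80, 80])
```

Because the offsets are learned from detection supervision rather than a registration target, the sampling positions are free to converge to whatever placement best serves the downstream detector, which is the intuition behind capturing "optimal fusion positions" instead of a strict alignment.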
Date of Conference: 16-22 June 2024
Date Added to IEEE Xplore: 16 September 2024