I. Introduction
As an emerging sensor for autonomous driving [1], 4D millimeter-wave radar is increasingly becoming a key solution for environmental perception [2], [3] and for simultaneous localization and mapping (SLAM) in complex environments [7]. Due to the sparsity of its point clouds [8], 4D radar struggles to distinguish the semantic information of objects, whereas camera sensors provide rich texture and semantic details of the scene [9]. The fusion of 4D radar and camera is therefore considered a robust combination strategy. For such multi-sensor fusion [10], [11], real-time and stable extrinsic calibration is essential for efficient integration.