Abstract:
Salient Object Detection (SOD) aims to identify and segment the most striking elements within an image. SOD methods can be categorized by their input data, for example RGB-D (depth) and RGB-T (thermal). Previous research has focused primarily on saliency detection for a single data type, and forcing an RGB-D SOD model to process RGB-T data significantly degrades its performance. In addition, current methods still struggle to detect the fine edge details of salient objects and to achieve end-to-end training. To address these issues, we introduce diffSOD, which leverages Stable Diffusion together with a cross-modal feature rectification and fusion module, and recasts salient object detection as a denoising process from a noisy mask to an object mask. It offers a unified solution for salient object detection that spans both RGB-D SOD and RGB-T SOD. Extensive experiments validate the effectiveness of the proposed diffSOD, demonstrating that it efficiently detects salient objects in both RGB-D and RGB-T data while outperforming state-of-the-art methods.
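To make the "noisy mask to object mask" idea concrete, the following is a minimal sketch of a generic DDPM-style forward noising step applied to a binary saliency mask, plus the algebraic inversion a denoiser would rely on. It is not the authors' implementation: the abstract does not specify the noise schedule, the mask scaling, or whether diffusion runs in pixel or latent space, so the cosine schedule, the [-1, 1] mask range, and the function names `cosine_alpha_bar`, `noise_mask`, and `recover_mask` are all assumptions made for illustration.

```python
import numpy as np

def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level at step t under a cosine schedule (assumed, not from the paper)."""
    f = lambda u: np.cos((u / num_steps + s) / (1 + s) * np.pi / 2) ** 2
    return f(t) / f(0)

def noise_mask(mask, t, num_steps, rng):
    """Forward process: blend a clean mask (scaled to [-1, 1]) with Gaussian noise at step t."""
    a_bar = cosine_alpha_bar(t, num_steps)
    eps = rng.standard_normal(mask.shape)
    noisy = np.sqrt(a_bar) * mask + np.sqrt(1.0 - a_bar) * eps
    return noisy, eps

def recover_mask(noisy, eps_pred, t, num_steps):
    """Invert the forward step given a noise estimate (a trained network would supply eps_pred)."""
    a_bar = cosine_alpha_bar(t, num_steps)
    return (noisy - np.sqrt(1.0 - a_bar) * eps_pred) / np.sqrt(a_bar)

# Toy 8x8 mask with a square "salient object", scaled from {0, 1} to {-1, 1}.
rng = np.random.default_rng(0)
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
mask = mask * 2.0 - 1.0

noisy, eps = noise_mask(mask, t=500, num_steps=1000, rng=rng)
# With the true noise standing in for a perfect prediction, the mask is recovered exactly.
recovered = recover_mask(noisy, eps, t=500, num_steps=1000)
```

In a model of the kind the abstract describes, the noise estimate would presumably come from a network conditioned on the fused RGB and depth or thermal features; here the true noise is used only to show that the forward step is invertible.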
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025