Abstract:
A key factor hindering further performance gains in current camouflaged object detection (COD) models is the lack of feature discriminability at fine granularity. We address this problem from two complementary perspectives. First, because scenes are complex, the discriminative feature representations of camouflaged objects appear at different scales and levels of semantic abstraction. A mechanism is therefore needed to increase feature diversity so that more information potentially beneficial for COD can be integrated. Second, appearance similarity between objects and their surroundings inevitably leads to similarity in features, so enhancing feature diversity alone is insufficient. The model must also be given semantic perception capabilities to amplify the subtle discrepancies between objects and their surroundings in the feature embedding. Motivated by the first point, we propose a cross-scale interaction module (CSIM) that applies cross-attention between features at different scales to enhance the diversity of feature representations. Regarding the second point, we propose semantic-guided feature learning (SGFL), which uses explicit supervision to encourage the model to enlarge feature discrepancies. Experiments on four popular COD datasets show that our method outperforms recent state-of-the-art (SOTA) methods. In addition, polyp segmentation experiments show that it is also effective for related COD-like tasks.
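The abstract does not give implementation details for CSIM, but the core idea it names, cross-attention where queries come from one scale and keys/values from another, can be sketched generically. The sketch below is an illustration under assumed shapes and projection names (`Wq`, `Wk`, `Wv`), not the authors' actual design:

```python
# Minimal sketch of cross-scale cross-attention (NumPy only).
# Assumption: features from each scale are flattened into token matrices
# of shape (num_tokens, d); all projections here are random placeholders.
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def cross_scale_attention(fine, coarse, seed=0):
    """fine: (N1, d) tokens from a high-resolution scale;
    coarse: (N2, d) tokens from a low-resolution scale.
    Queries are taken from the fine scale, keys/values from the coarse
    scale, so fine features are enriched with coarse-scale context."""
    rng = np.random.default_rng(seed)
    d = fine.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q = fine @ Wq                              # (N1, d)
    k = coarse @ Wk                            # (N2, d)
    v = coarse @ Wv                            # (N2, d)
    attn = softmax(q @ k.T / np.sqrt(d))       # (N1, N2), rows sum to 1
    return fine + attn @ v                     # residual cross-scale fusion


fine = np.random.default_rng(1).standard_normal((16, 8))
coarse = np.random.default_rng(2).standard_normal((4, 8))
out = cross_scale_attention(fine, coarse)
print(out.shape)  # (16, 8): fine-scale tokens, now carrying coarse context
```

Running the same mechanism in the opposite direction (queries from the coarse scale) would symmetrically enrich coarse features with fine detail; whether CSIM does so is not stated in the abstract.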
Published in: IEEE Transactions on Circuits and Systems for Video Technology ( Volume: 34, Issue: 12, December 2024)