
Frequency Domain-Based Cross-Layer Feature Aggregation Network for Camouflaged Object Detection



Abstract:

Despite the progress of existing techniques in Camouflaged Object Detection (COD), problems such as multi-target omission, small-object misjudgment, and insufficient localization and segmentation accuracy remain. Image frequency-domain transformation offers an advantage here: high-frequency features capture detailed information such as edges and textures, while low-frequency features depict the overall outline of the image, which can improve the accuracy of camouflaged object detection. This paper therefore proposes a Frequency Domain-Based Cross-layer Feature Aggregation Network (FCFANet), aiming to mitigate multi-target omission, small-target loss, object localization deviation, and insufficient segmentation accuracy in complex scenes. FCFANet mainly consists of an Intra- and Inter-layer Enhancement Module (IEM) and a Frequency-Spatial Interaction Fusion Module (FSIFM). IEM reduces noise and enhances the feature representation, while FSIFM extracts frequency information and strengthens feature discrimination through complementary fusion of the spatial and frequency domains, thus realizing precise localization and high-precision segmentation of camouflaged objects. Experiments comparing FCFANet with 16 state-of-the-art (SOTA) methods show that it outperforms them on four benchmark datasets.
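
To make the frequency-domain rationale above concrete, the following is a minimal NumPy sketch (illustrative only, not the paper's implementation) that splits a grayscale image into low- and high-frequency reconstructions with a 2-D FFT. The circular cutoff radius is an assumed illustrative parameter; the low band recovers the coarse outline while the high band retains edges and texture.

    # Illustrative sketch (not the paper's code): splitting an image into
    # low- and high-frequency components with a 2-D FFT. The hypothetical
    # cutoff `radius` controls the split.
    import numpy as np

    def frequency_split(image: np.ndarray, radius: int = 16):
        """Return (low_freq, high_freq) reconstructions of a 2-D grayscale image."""
        f = np.fft.fftshift(np.fft.fft2(image))      # centre the spectrum
        h, w = image.shape
        yy, xx = np.ogrid[:h, :w]
        dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
        low_mask = dist <= radius                    # circular low-pass mask
        low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
        high = np.fft.ifft2(np.fft.ifftshift(f * (~low_mask))).real
        return low, high

    # Usage: low ~ overall outline, high ~ edge/texture cues
    image = np.random.rand(128, 128)                 # stand-in for a real image
    low, high = frequency_split(image, radius=16)
    assert np.allclose(low + high, image, atol=1e-8) # the two bands sum back to the image
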
Published in: IEEE Signal Processing Letters (Volume: 32)
Page(s): 2005 - 2009
Date of Publication: 07 March 2025



I. Introduction

Camouflaged object detection (COD) seeks to accurately identify and segment visual objects that blend closely into their surrounding environment. It has great potential for applications in areas such as healthcare [1], military [2], and agriculture [3]. With the rise of deep learning techniques, COD methods have made tremendous progress. For example, Fan et al. [4] introduced the COD10K dataset and a simple but effective network, SINet. C2FNet [5] built contextual connections by aggregating intermediate- and high-level features. FDNet [6] employed feature grafting and an interference-sensing mechanism to fine-tune COD tasks. In addition, FSANet [7] enhanced object features through multiplicative feature fusion; CamoFocus [8] improved COD performance with a Feature Splitting and Modulation module and a Context Refinement Module; DIRNet [9] improved accuracy with a Bilateral Interaction Module and an Adjacent Aggregation Interaction Module; and DINet [10] enhanced three-dimensional perception for RGB-D COD by fusing depth maps.

Despite these advances, the high similarity of visual features between camouflaged objects and their backgrounds remains challenging, leading to problems in the detection process such as multi-target omission and small-target misjudgment. To address these problems, some methods based on frequency-domain transformation have been preliminarily explored. For instance, Zhong et al. [11] introduced frequency-domain features as additional cues to better distinguish camouflaged objects from the background. He et al. [12] proposed a feature decomposition and edge reconstruction model. Cong et al. [13] proposed frequency perception and correction-fusion modules based on octave convolution, and Liang et al. [14] proposed an efficient Frequency Injection Module that injects frequency-domain cues at different stages to enhance feature representation.
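
To illustrate the general idea behind injecting frequency-domain cues into spatial features, the sketch below shows a hypothetical PyTorch layer; it is not the paper's FSIFM or any of the cited modules. The layer name, the learnable 1x1 gating of the spectrum, and the additive fusion with a spatial branch are all assumptions made for illustration.

    # Hypothetical sketch of frequency-spatial feature fusion (NOT the
    # paper's FSIFM): a spatial branch refines features with a 3x3
    # convolution while a frequency branch re-weights the real/imaginary
    # spectrum via torch.fft; the two branches are fused additively.
    import torch
    import torch.nn as nn

    class FreqSpatialFusion(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.spatial = nn.Conv2d(channels, channels, 3, padding=1)   # spatial-domain branch
            self.freq_gate = nn.Conv2d(2 * channels, 2 * channels, 1)    # assumed spectral gating
            self.fuse = nn.Conv2d(channels, channels, 1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # frequency branch: FFT -> gate real/imag parts -> inverse FFT
            spec = torch.fft.rfft2(x, norm="ortho")
            gated = self.freq_gate(torch.cat([spec.real, spec.imag], dim=1))
            re, im = gated.chunk(2, dim=1)
            freq_feat = torch.fft.irfft2(torch.complex(re, im),
                                         s=x.shape[-2:], norm="ortho")
            # complementary fusion of the two domains
            return self.fuse(self.spatial(x) + freq_feat)

    # Usage on a dummy feature map
    feat = torch.randn(1, 64, 32, 32)
    out = FreqSpatialFusion(64)(feat)
    print(out.shape)  # torch.Size([1, 64, 32, 32])
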

