I. Introduction
Camouflage is a common phenomenon in nature. Animals often blend into their surroundings to confuse prey or hide from predators [1]. For example, the skin of the leaf-tailed gecko is covered with bumps that mimic the texture of tree bark, which helps it avoid predators. Alligator snapping turtles, which have a dark brown or black shell with a series of sharp protrusions, often hide in the mud and wait for prey to approach. In human society, camouflaged object detection (COD) also has broad application prospects, such as medical image analysis, agricultural pest detection, architectural design, and species conservation [2], [3]. The purpose of COD is to find objects that are hidden in their surroundings [4]. However, compared with traditional object detection or segmentation tasks, COD poses additional challenges, which can be observed in Fig. 1. Firstly, the contrast between the object and the background is often weak. Even when the object occupies a large portion of the image, it is difficult to detect completely, and local parts of the object are often missed, as shown in row 1 of Fig. 1. Secondly, complex backgrounds and small target objects further complicate the task. The pixel imbalance between small targets and the background degrades detection performance, so that object details are not clearly detected, as shown in row 2 of Fig. 1. Thirdly, when a scene contains multiple camouflaged objects, some of them are often missed, as shown in row 3 of Fig. 1. Finally, occlusion between objects is one of the main causes of incomplete detection of camouflaged objects, as illustrated in row 4 of Fig. 1.
Fig. 1. Visual examples of camouflaged object detection results from different methods in challenging scenes. Row 1: detection of large camouflaged objects that are highly similar to the background. Row 2: detection of small camouflaged objects hidden in the environment. Row 3: detection of multiple camouflaged objects. Row 4: detection of occluded camouflaged objects. (Best viewed digitally.)