Infrared Small Target Detection Based on Singularity Analysis and Constrained Random Walker

Effective infrared small target detection is still challenging due to small target sizes and background clutter. Many advanced methods do not perform well in preserving and detecting multiscale objects in complex scenes. We propose an infrared small target detection method that suppresses the background and adapts to targets of different sizes. Based on the singular value analysis in the facet model, we propose a multiderivative descriptor to enhance the targets and suppress various clutter in dual derivative channels. In the first-order derivative channel, we design four facet kernels with different directions to enhance and preserve the isotropic small targets and suppress the block clutter. In the second-order derivative channel, we use the facet kernel to enhance the center pixels of targets and suppress the band clutter. To adapt to targets of various sizes, we propose a constrained random walker technique, which includes an adaptive matching algorithm that extracts the local region of each candidate based on size and shape constraints. The experimental results demonstrate that the proposed method can accurately detect multiscale small targets in complex scenes, resulting in better detection performance than the state-of-the-art methods.


I. INTRODUCTION
INFRARED small target detection is one of the key techniques in infrared search and tracking (IRST) systems, such as precision guidance, early-warning, and air defense systems [1], [2], [3]. Due to the long observation distance, infrared targets are usually spatially small and therefore lack shape, texture, and structure detail. In addition, the target is usually immersed in a complex background and is prone to interference from different forms of clutter [4], [5], [6]. This makes it difficult to suppress complex backgrounds and to detect diverse targets. Therefore, infrared small target detection is still a challenging task. In infrared images, small targets appear as bright areas with isotropic Gaussian characteristics. Furthermore, the targets can be represented as singularities under the second-order derivative facet model [7], which is used to fit the gray distribution in local regions. Facet fitting thus transforms the target detection task into a search for singularities. However, the facet kernel with a single derivative channel assumes a fixed target size and grayscale distribution, which may cause information loss, such as missing target boundaries, and clutter residual. To address these problems, we propose a multiderivative descriptor with a dual-channel facet kernel to model small targets simultaneously from multiple directions. The multichannel cross-fused images enhance the isotropic small targets and effectively preserve their integrity, while the anisotropic clutter is suppressed. In addition, the facet kernel and random walker (FKRW) method was first applied to the segmentation task and achieved good performance. The random walker (RW) segmentation algorithm depends heavily on the marker points that are used to label the foreground (targets) and background within a local region. The marker points are determined by intensity at a single scale, which causes local position bias and mislabeling. Some targets with low contrast and diverse sizes are wrongly classified as background and omitted. To address these problems, we propose a constrained random walker (CRW) technique, in which an adaptive matching algorithm extracts the local region of each candidate based on size and shape constraints.
We use derivative filters of different orders to decompose the infrared image. The targets are transformed into singular points in each subchannel, as shown in Fig. 1. For the purpose of detecting the targets and suppressing the background clutter, all singular points in the directional channels of various orders are fused. To adapt to the size diversity of small targets, we propose a CRW technique to extract the appropriate local regions of each candidate to locate the background seed points based on an adaptive matching algorithm. The main research contributions of this article are summarized as follows.
1) We propose a multiderivative descriptor that not only enhances small targets but also suppresses complex backgrounds, based on the isotropy of infrared small targets and the anisotropy of clutter.
2) We propose a CRW technique to adapt to the size diversity of small targets, in which an adaptive matching algorithm is designed to extract the appropriate local regions based on the connected domain constraint.
3) The proposed method outperforms the state-of-the-art methods on the three datasets, achieving the best precision and intersection over union (IoU).

The rest of the article is organized as follows. Related work is outlined in Section II. The proposed method is detailed in Section III. Section IV presents the experimental results and discusses the comparison with state-of-the-art methods. Finally, Section V concludes this article and presents future work.

II. RELATED WORK
In recent years, many methods have been proposed for infrared small target detection. The existing methods are mainly divided into two categories: 1) conventional methods and 2) deep learning methods.

A. Conventional Methods
Conventional methods can be divided into the following four categories: 1) filter-based methods [8], [9], [10]; 2) data structure-based methods [11], [12], [13], [14]; 3) human vision system (HVS)-based methods [1], [15], [16], [17], [18], [19]; 4) progressive detection-based methods [3], [4], [5], [7], [20].

a) Filter-based methods: Filter-based techniques have been frequently utilized in small target detection based on the differences in the spatial organization and gray information between the target and background. Based on a similarity judgment, a bilateral filtering algorithm [8] was proposed to judge the image boundary accurately. However, the texture residual of the background led to a high false alarm rate when the signal-to-noise ratio was low. To address this problem, the phase spectrum of the quaternion Fourier transform [9] was proposed to enhance small targets in four data channels. Nevertheless, it led to high-frequency noise and clutter. The balanced ring Top-hat transformation [10] was presented to suppress the background adaptively and capture contrast information based on adaptive structuring elements and a balanced ring shape. However, it failed to detect multiscale targets due to the manual selection of the morphological structuring elements. Filter-based methods rely on prior assumptions, resulting in the inefficient detection of multiscale targets.
b) Data structure-based methods: Based on the sparsity of the target and the low-rank background, the infrared patch-image (IPI) model [11] was proposed to detect small targets. Nevertheless, backgrounds containing heterogeneous clutter violated the low-rank assumption, resulting in noise residuals, and the matrix decomposition and reconstruction increased the computational complexity. To alleviate these problems, the nonnegative infrared patch-image model (NIPPS) [12] adopted a partial sum minimization of singular values to eliminate salient residuals and a nonnegative constraint to accelerate the convergence speed. RIPT [13] introduced the local structure prior and designed an element-wise weight that could suppress the remaining edges while preserving the fuzzy targets. To overcome the bias problem caused by fixed weighting parameters, a nonconvex model of the partial sum of the tensor nuclear norm (PSTNN) [14] was proposed, which used tensor singular value decomposition (t-SVD) to reduce the computational complexity. Unfortunately, the rank of targets in low signal-to-noise ratio (SNR) images could not be estimated by the energy ratio. The data structure-based methods were stable but time-consuming.

c) Human vision system-based methods: Based on the local texture characteristic of a small target, the local contrast measure (LCM) [15] was proposed to calculate the contrast difference in a local region. However, the dark targets were wrongly omitted. To solve this problem, the multiscale patch-based contrast measure (MPCM) [16] was designed to detect dark and bright targets simultaneously based on the image patch difference. To avoid the interference of highlighting noise caused by the average and maximum gray values of the subblocks [16], a new local contrast measure (NLCM) [17] was proposed to distinguish between noise and target using an improved variance (IVar) and mean (IMean) of region gray values.
To suppress significant noise, a multidirectional derivative-based weighted contrast measure (MDWCM) [18] was presented to capture the multidirectional derivative properties of the target and clutter. However, the highlight noise reduced the performance, and the weak targets were omitted. To address the problem of target submergence caused by a high-brightness background, the enhanced closest-mean background estimation method [19] was proposed to suppress the high-brightness background using the closest-mean principle under the eight orientations of the surrounding layer. Many works are still unsatisfying due to imperfect knowledge of the target structure. In pursuit of higher robustness, a novel IR small target detection method utilizing a halo structure prior (HSP)-based LCM (HSPLCM) was proposed to adequately consider the structure characteristic of the target using the IR image structure tensor [1].

d) Progressive detection-based methods: To use various characteristics of small targets to improve the performance of infrared small target detection, some progressive detection methods were proposed. The local and nonlocal spatial information [5] was used to progressively suppress the structured edges, unstructured clutter, and noise. Based on the local contrast and gradient properties of small targets, a fast adaptive masking and scaling with iterative segmentation (FAMSIS) [20] method was proposed to strike a good balance between computational speed and performance. Furthermore, a method derived from the FKRW [7] was proposed to extract the core shape of the targets. The local segmentation scheme improved the detection performance. To detect dark and light targets, a novel method using multiple morphological profiles [4] was proposed to detect various types of targets by different attributes with discontinuous pruning values.
To extract adequate target features from the images with low signal-to-noise ratio, an effective method based on three-order tensor creation and Tucker decomposition was proposed [3] to exploit more spatial and structural information.

B. Deep Learning Methods
To address the intrinsic feature scarcity of small targets, a deep network [21] was designed to reinforce the semantics of small targets, combining a depthwise parameterless nonlinear feature refinement layer for extracting contrast features with bottom-up attention modulation for integrating the smaller-scale subtle details of low-level features into the high-level features of the deeper layers. To implement unsupervised extraction of invariant features, a denoising autoencoder was proposed [22] to detect infrared small targets. The small targets were treated as noise to suppress the background. The work on asymmetric contextual modulation (ACM) [23] was the first to contribute an open dataset, named SIRST, to advance the research on infrared small target detection. The ACM module used both a top-down global attention module and a bottom-up local attention module to encode semantic information and spatial details. A local patch network (LPNet) [6] designed a supervised attention module and a weight-sharing structure to jointly extract the global and local features of targets. However, due to the lack of learnable information, deep learning methods for infrared small target detection still need to be improved and developed.

III. PROPOSED METHOD
To enhance the small targets and reduce the clutter, we propose an infrared small target detection method based on singularity analysis and CRW. The whole framework of the proposed method is exhibited in Fig. 2, including the three stages. In the preprocessing stage, we use statistical filtering and mean filtering to eliminate bright areas and isolate noise points. Furthermore, we analyze the singularity of each point in the infrared image by the facet model. A multiderivative descriptor is proposed to enhance bright targets and suppress clutter, which consists of four first-order derivative filters with different directions for retaining the target areas, and a second-order directional derivative filter for suppressing the clutter. To adapt to the size diversity of small targets, we design a CRW, where an adaptive matching algorithm is introduced to extract the appropriate local region to locate the background seed points. Finally, we multiply the candidate target image with the segmentation result image by pixel points to obtain the final weight map.

A. Singularity Analysis
Facet fitting can transform the target detection task into a search for singular points [24], [25], [26], and the facet kernel of the second-order derivative [7] is employed to detect candidate targets. It highlights point regions that fit the size of the filter kernel. However, this kernel only enhances the central areas of the small targets while suppressing their edges (see Fig. 1), resulting in detail loss and block noise, and it is inefficient for multiscale targets. To overcome these limitations, we introduce the first-order derivative and propose a multiderivative graph based on the facet model to extract more singular points. In facet first-order filtering, the targets are represented as bimodal responses with a high and a low peak, whereas the clutter is transformed into slightly varying strip textures [27], [28]. The first-order filter can detect isolated bumps in multiple directions of the signal, resulting in a structure of singular points for candidate extraction. The first-order facet model achieves a stronger and more complete singularity response and suppresses the clutter in different directions. An infrared small target has an isotropic Gaussian-like shape, whereas the background clutter is locally oriented [24]. Because the target is isotropic, its singular points in the different order channels produce overlapping responses, which improves the integrity of the enhanced targets. For anisotropic clutter, the response points differ across channels, resulting in weak overlapping responses. As a result, most of the clutter is effectively suppressed.

1) First-Order Facet Model:
We design four first-order directional derivative (FODD) filters with different directions to decompose the image. Let $R$ and $C$ be the index sets of a symmetric neighborhood. The intensity function $f(r, c)$ represents the gray intensity value of point $(r, c)$ in the neighborhood $R \times C$ and is defined as follows:

$$f(r, c) = \sum_{n=1}^{10} K_n P_n(r, c) \tag{1}$$

where $P_n(r, c)$ is the series of discrete orthogonal polynomials given in (2), $K_n$ ($n = 1, \ldots, 10$) denotes the coefficient, which is a linear combination of the intensity values of $f(r, c)$ estimated by least squares fitting, and $W_n$ denotes the weight of the $n$th element in $P(r, c)$. According to (2), $P_n(r, c)$ are first-order when $n = 2, 3, 7, 8, 9, 10$, and the corresponding $W_n$ are used to calculate the first-order partial derivatives at the center pixel $(0, 0)$ of the neighborhood $\Omega \in \mathbb{R}^{R \times C}$. The response of the FODD filter at the center $(0, 0)$ is then defined along the angle $\alpha$, where $\alpha$ is the clockwise angle from the vertical axis.

We develop FODD maps for four angles ($\alpha = 0^\circ, 45^\circ, 90^\circ, 135^\circ$). The target is converted into maximum and minimum points in the corresponding direction on each FODD map. Through the fusion and addition of the extreme points in the four directions, we can obtain more singular points to express the target. At the same time, we convert the negative extreme values to positive values by taking the absolute value $|\cdot|$ to unify the target singularity. Fig. 3 shows the FODD maps, where small targets yield partial singular points of similar shape and intensity in the different directional channels. However, this fusion leaves residuals on the edges of the clutter because the heterotropic clutter is anisotropic.
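As an illustration of how first-order facet filters can be built, the following minimal sketch derives the row and column derivative kernels from a cubic facet fit and evaluates a directional response. It rests on several assumptions that are not confirmed by the text above: a 5 × 5 neighborhood, the standard discrete orthogonal polynomial basis for that neighborhood, the rule $W_n = P_n / \sum P_n^2$, a directional response of the form $f_r \cos\alpha + f_c \sin\alpha$, and fusion by summing absolute responses. All function names are hypothetical.

```python
from fractions import Fraction
import math

# Assumption: 5x5 neighborhood, R = C = {-2, ..., 2}, with the standard
# discrete orthogonal polynomial basis of the cubic facet model.
IDX = [-2, -1, 0, 1, 2]
BASIS = [
    lambda r, c: Fraction(1),
    lambda r, c: Fraction(r),
    lambda r, c: Fraction(c),
    lambda r, c: Fraction(r * r - 2),
    lambda r, c: Fraction(r * c),
    lambda r, c: Fraction(c * c - 2),
    lambda r, c: Fraction(r) ** 3 - Fraction(17, 5) * r,
    lambda r, c: Fraction((r * r - 2) * c),
    lambda r, c: Fraction(r * (c * c - 2)),
    lambda r, c: Fraction(c) ** 3 - Fraction(17, 5) * c,
]


def weight_kernel(n):
    """Assumed weight W_n(r, c) = P_n(r, c) / sum(P_n^2), n = 1..10."""
    p = BASIS[n - 1]
    norm = sum(p(r, c) ** 2 for r in IDX for c in IDX)
    return [[p(r, c) / norm for c in IDX] for r in IDX]


def combine(terms):
    """Linear combination of weight kernels: sum(coef * W_n)."""
    out = [[Fraction(0)] * 5 for _ in range(5)]
    for coef, n in terms:
        w = weight_kernel(n)
        for i in range(5):
            for j in range(5):
                out[i][j] += coef * w[i][j]
    return out


# Differentiating the fitted cubic at the center (0, 0) gives
# df/dr = K2 - (17/5) K7 - 2 K9 and df/dc = K3 - 2 K8 - (17/5) K10.
KERNEL_R = combine([(1, 2), (Fraction(-17, 5), 7), (-2, 9)])
KERNEL_C = combine([(1, 3), (-2, 8), (Fraction(-17, 5), 10)])


def fodd_response(patch, alpha_deg):
    """Directional derivative at the center of a 5x5 patch along alpha."""
    fr = sum(KERNEL_R[i][j] * patch[i][j] for i in range(5) for j in range(5))
    fc = sum(KERNEL_C[i][j] * patch[i][j] for i in range(5) for j in range(5))
    a = math.radians(alpha_deg)
    # Assumed convention: alpha is measured from the row (vertical) axis.
    return float(fr) * math.cos(a) + float(fc) * math.sin(a)


def fused_first_order(patch, angles=(0, 45, 90, 135)):
    """Sum of absolute directional responses (assumed form of the fusion)."""
    return sum(abs(fodd_response(patch, a)) for a in angles)
```

For a linear ramp patch `f(r, c) = 2r + 3c`, the sketch returns 2 along 0° and 3 along 90°, which is the expected slope in each direction. Exact fractions are used for the kernels so that the orthogonality of the basis is not obscured by rounding.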
2) Multiderivative Descriptor: To suppress the heterotropic clutter in the different order channels and enhance the singularity, we perform a secondary fusion of the first-order fusion result and the second-order image to obtain the multiderivative description image. We use the second-order facet kernel $F$ in FKRW [7] to fit the gray distribution of the whole image $I$, obtaining the result

$$M_s = F * I$$

where $I$ is the image preprocessed by statistical filtering and mean filtering, and $*$ denotes the convolution operation. Consequently, the first-order fusion result $M_o$ and $M_s$ are fused to obtain the multiderivative fusion image

$$M_m = M_o \times M_s$$

where $\times$ represents the element-wise multiplication of the image matrices. The position and shape of the clutter generated by first-order and second-order filtering are different. The fusion of the multiderivative descriptor can therefore effectively eliminate this clutter, whose response values become almost zero, as shown in Fig. 4. The facet model is sensitive to brightness changes and can detect targets with intensity mutations. However, some singular points inside the targets without mutations are lost, as shown in Fig. 5. More singular points are missed as the target size increases, while the bright ring structure composed of singular points is still formed in the multiderivative fusion result.
To ensure the integrity of the target, we perform a connected domain pixel restoration operation on the multiderivative fusion image to recover the internal singular points. First, we use a threshold value $T$ to obtain the binary image:

$$T = \mu + k\sigma$$

where $\mu$ and $\sigma$ are the mean and standard deviation of $M_m$, respectively, and $k$ is an important parameter for extracting the singular value structure. When $k$ is small, we can extract more candidate pixels and form a complete target ring structure. When $k$ is large, some candidate pixels will be lost, but less clutter will be introduced. The parameter $k$ is empirically set to 5 in this article. Then, we recover the singular points inside the ring structure on the binary image to form a complete candidate target. Finally, we assign to the candidate target pixels their gray intensity distribution in the original image, which restores the target well. We obtain the complete candidate target graph, denoted as $M_f$. Compared with the filtering results of the raw facet kernel [7], the proposed multiderivative descriptor can extract more complete candidate targets with a higher signal-to-noise ratio, as shown in Fig. 6.
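The thresholding and restoration steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold form `mean + k * std`, the use of the population standard deviation, and the 4-connected flood fill used to recover pixels enclosed by the ring are all assumptions, and the function names are hypothetical.

```python
import statistics
from collections import deque


def binarize(fused, k=5.0):
    """Binary map: 1 where the fused response exceeds mean + k * std (assumed form)."""
    values = [v for row in fused for v in row]
    t = statistics.fmean(values) + k * statistics.pstdev(values)
    return [[1 if v > t else 0 for v in row] for row in fused]


def fill_enclosed(binary):
    """Set to 1 every 0-pixel that is not 4-connected to the image border."""
    h, w = len(binary), len(binary[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque(
        (i, j)
        for i in range(h)
        for j in range(w)
        if (i in (0, h - 1) or j in (0, w - 1)) and binary[i][j] == 0
    )
    for i, j in queue:
        outside[i][j] = True
    # Breadth-first flood fill of the background, starting from the border.
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < h and 0 <= nj < w:
                if binary[ni][nj] == 0 and not outside[ni][nj]:
                    outside[ni][nj] = True
                    queue.append((ni, nj))
    return [
        [1 if binary[i][j] or not outside[i][j] else 0 for j in range(w)]
        for i in range(h)
    ]


def restore_intensity(mask, image):
    """Give each candidate pixel its gray intensity from the original image."""
    return [
        [image[i][j] if mask[i][j] else 0 for j in range(len(mask[0]))]
        for i in range(len(mask))
    ]
```

On a toy map containing a 3 × 3 ring of strong responses, `fill_enclosed` recovers the enclosed center pixel, giving a solid 3 × 3 candidate. On very small images a value of `k = 5` may exceed the largest attainable deviation, so a smaller `k` is needed for toy examples.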

B. Constrained Random Walker
To handle the diversity of small targets and remove residual clutter, we propose a CRW algorithm. According to the size and shape of the connected domain of the target, the adaptive matching technique matches the appropriate local area and selects the correct marker pixels for the candidate points. This improves the accuracy of the segmentation algorithm and, in turn, the overall denoising and detection performance.
To remove the clutter residual in the results of the multiderivative descriptor, we employ the RW algorithm [7], [29], [30] to perform two-class (target and background) recognition in the local region of each candidate. The RW algorithm selects the edge points of a local region as the background seed points to extract the targets. Let $G = (V, E)$ be a graph with pixels $v \in V$ and edges $e \in E$. The node set is defined as $V = \{v_1, v_2, \ldots, v_N\}$, where $N$ is the number of pixels. The intensity change between two pixels $v_i$ and $v_j$ is quantified by the weight $W_{ij}$, which is defined as follows:

$$W_{ij} = \exp\left(-\beta \left(I_{v_i} - I_{v_j}\right)^2\right)$$

where $I_{v_i}$ and $I_{v_j}$ indicate the intensity values of $v_i$ and $v_j$. The free parameter $\beta$ controls the weighting degree and is empirically set to 200. Define the discrete Laplacian matrix $L$ to represent the weight between any two pixels in the graph. Note that $L$ is symmetric since the edges $E$ are undirected. The $N \times N$ matrix $L$ is defined as

$$L_{ij} = \begin{cases} d_i, & i = j \\ -W_{ij}, & v_i \text{ and } v_j \text{ are adjacent} \\ 0, & \text{otherwise} \end{cases}$$

where $d_i = \sum_j W_{ij}$ is the degree of $v_i$. Partition the node set $V$ into two sets, $V_\Psi$ (marked/seed nodes) and $V_\Phi$ (unmarked nodes). $V_\Psi$ contains all the seed points, and each seed point in $V_\Psi$ has been assigned a label from $K = \{1, 2\}$. We label the points of the targets as 1 and those of the background as 2. From the intensity representation of graph $G$ and the set $V_\Psi$ of the marked pixels, the RW algorithm can calculate the probability $P_j^k$ that pixel $v_j$ belongs to class $k$. The set of probabilities $P^k$ belonging to class $k$ can be partitioned as $P_\Psi^k$ and $P_\Phi^k$, denoting the probabilities that $V_\Psi$ and $V_\Phi$ belong to class $k$, respectively. The probabilities of the seed set $V_\Psi$ are

$$P_j^k = \begin{cases} 1, & v_j \text{ is labeled } k \\ 0, & \text{otherwise} \end{cases} \qquad v_j \in V_\Psi.$$

We may reorder the matrix $L$ to reflect the subsets

$$L = \begin{bmatrix} L_\Psi & B \\ B^T & L_\Phi \end{bmatrix}.$$

Given all of the above preparation, the solution to the combinatorial Dirichlet problem may be found by solving

$$L_\Phi P_\Phi^k = -B^T P_\Psi^k.$$

The segmentation performance of RW is determined by the accuracy of the labeled pixels. Therefore, it is important to select appropriate foreground and background seeds.
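To make the role of the seeds concrete, the sketch below computes random walker probabilities on a small 4-connected pixel grid. It is an illustration under stated assumptions rather than the implementation used in this article: the weights take the Gaussian form $\exp(-\beta (I_i - I_j)^2)$, intensities are assumed to be scaled to [0, 1], and instead of solving the linear system directly the unlabeled probabilities are obtained by Gauss-Seidel sweeps, in which each unlabeled pixel is repeatedly replaced by the weighted mean of its neighbors. The function name and the `sweeps` parameter are hypothetical.

```python
import math


def random_walker_prob(image, seeds, beta=200.0, sweeps=500):
    """Probability that each pixel belongs to the target class (label 1).

    image: 2-D list of intensities, assumed scaled to [0, 1].
    seeds: dict mapping (row, col) to 1 (target) or 2 (background).
    """
    h, w = len(image), len(image[0])
    prob = [
        [1.0 if seeds.get((i, j)) == 1 else 0.0 for j in range(w)]
        for i in range(h)
    ]
    for _ in range(sweeps):
        for i in range(h):
            for j in range(w):
                if (i, j) in seeds:
                    continue  # seed probabilities stay fixed
                num = den = 0.0
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if 0 <= ni < h and 0 <= nj < w:
                        diff = image[i][j] - image[ni][nj]
                        wij = math.exp(-beta * diff * diff)
                        num += wij * prob[ni][nj]
                        den += wij
                if den > 0.0:
                    # Harmonic update: weighted mean of the neighbors.
                    prob[i][j] = num / den
    return prob


def label_matrix(prob, threshold=0.5):
    """Two-class labels: 1 (target) where the probability exceeds the threshold."""
    return [[1 if p > threshold else 2 for p in row] for row in prob]
```

On a one-row image `[1, 1, 0, 0, 0]` with a target seed at the left end and a background seed at the right end, the bright pixel next to the target seed receives a probability close to 1, and the dark pixels receive probabilities close to 0, because the weight across the intensity edge is almost zero. A misplaced background seed inside the target would pull the target pixels toward class 2, which is why the seed location matters.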
The 11 × 11 segmentation area used in [7] cannot detect multiscale small targets. When the target is large, the background markers fall into the target area, and the target pixels are classified as background clutter, resulting in the loss of the target (see Fig. 7). When the target is small, the relatively large segmentation area increases the computational complexity. Therefore, an appropriate local region for each candidate is a precondition for this task. We design a CRW algorithm to accurately locate the seed points, in which the adaptive matching technique extracts the appropriate local region for each candidate point based on the connected domain constraints. To handle diverse targets, the adaptive matching technique extracts the local segmentation region by constructing a segmentation box. The segmentation box should satisfy the following three principles.
1) All the target pixels should be included.
2) The target pixels will not appear on the edges of the segmentation box.
3) The area should be as small as possible.

First, the smallest rectangle surrounding the connected domain of the target is built according to its size and shape. Second, this rectangle is expanded appropriately to form the corresponding segmentation box. We construct a double-window strategy to divide the local area around the target into the target area and the background area, as shown in Fig. 8. The inner box (red box) is designed to prevent the target pixels from entering the background area. The area between the inner and outer boxes belongs to the background. The outer box (blue box) is used to select the appropriate segmentation area and locate the background seed points. We mark the ring of pixels on the edge of the outer box as the background seed points and take the pixel with the highest intensity in the inner box as the target seed point.

Algorithm 1 Constrained Random Walker
    ...
    Sort the pixels by intensity in descending order.
    for p = 1 : P_n do
        x_1 = x_0 − 3; y_1 = y_0 − 3;
        n = b + 2 × 3; m = a + 2 × 3;
        Obtain the segmentation box L_{m×n} and its upper-left coordinate (x_1, y_1).
        L_{m×n}(1, :), L_{m×n}(end, :), L_{m×n}(:, 1), L_{m×n}(:, end) = 0, p = 1
        The probability matrix (a) and label matrix (b) are obtained by the Random Walker algorithm from matrix L_{m×n}.
    end for
Here, x_0 and y_0 are the abscissa and ordinate of the upper-left corner of the inner box, respectively, and a and b are the length and width of the inner box, respectively. We can obtain the upper-left coordinate of the segmentation box (x_1, y_1), its length m = a + 6, and its width n = b + 6. To keep the connected domain intact and to prevent target pixels from lying on the border, we increase the length and width of the inner box by 6 pixels to locate the segmentation box. This ensures that the border of the segmentation box is located in the background area. We then take the region in the segmentation box as the local segmentation region and classify its pixels into two categories. The process of the CRW algorithm is described in Algorithm 1.
In Algorithm 1, c denotes one of the connected domains in N_c, and p denotes one of the pixels in P_n. The distance between x_0 and x_1, and likewise between y_0 and y_1, is set to 3.
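The construction of the segmentation box and the seed selection can be sketched as follows. This is a minimal illustration under assumptions: coordinates are (row, column) rather than the (x, y) convention above, boxes are clipped at the image border (a case the text does not discuss), and the names `inner_box`, `segmentation_box`, `select_seeds`, and `margin` are hypothetical.

```python
def inner_box(component):
    """Smallest rectangle around a connected domain given as (row, col) pixels.

    Returns (top, left, height, width).
    """
    rows = [r for r, _ in component]
    cols = [c for _, c in component]
    top, left = min(rows), min(cols)
    return top, left, max(rows) - top + 1, max(cols) - left + 1


def segmentation_box(inner, image_shape, margin=3):
    """Expand the inner box by `margin` pixels per side, clipped to the image."""
    top, left, height, width = inner
    h, w = image_shape
    new_top, new_left = max(top - margin, 0), max(left - margin, 0)
    bottom = min(top + height - 1 + margin, h - 1)
    right = min(left + width - 1 + margin, w - 1)
    return new_top, new_left, bottom - new_top + 1, right - new_left + 1


def select_seeds(image, inner, outer):
    """Border of the outer box -> background (2); brightest inner pixel -> target (1)."""
    top, left, m, n = outer
    seeds = {}
    for i in range(top, top + m):
        for j in range(left, left + n):
            if i in (top, top + m - 1) or j in (left, left + n - 1):
                seeds[(i, j)] = 2
    it, il, ih, iw = inner
    brightest = max(
        ((i, j) for i in range(it, it + ih) for j in range(il, il + iw)),
        key=lambda p: image[p[0]][p[1]],
    )
    seeds[brightest] = 1
    return seeds
```

For a connected domain with a 2 × 2 bounding box away from the image border, the segmentation box is 8 × 8, that is, 6 pixels larger in each dimension, so a band of three background pixels separates the target from the background seeds on every side.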
The core of adaptive segmentation is the transition from the minimum rectangle box $L_{a \times b}$ to the segmentation box $L_{m \times n}$, in which the matrix is expanded by three pixels in all directions. As a result, we obtain a probability matrix of the unlabeled pixels with respect to the target seed point and a label matrix of the pixels belonging to the first class (objects), as shown in Fig. 9. Based on the label matrix and probability matrix, we use the two novel local contrast descriptors defined in [7], which are related to the probability and pixel intensity, to extract real small targets. In these descriptors, R, θ, and η represent the local region, the pixels segmented as class 1, and the central pixel, respectively. θ \ η indicates all the pixels segmented as class 1 excluding the central pixel, whereas R \ θ represents the pixels segmented as class 2. ζ_cp is the value of ζ for the θ pixels; P represents the probabilities of class 1; p indicates the mean value of the probabilities; and η ⊂ θ means that more pixels than the central pixel are segmented as class 1.
The region of the surrounding background pixels is denoted as $B = (\theta \oplus D_2) \oplus D_2 - \theta \oplus D_2$, where $\oplus$ represents the dilation operation and $D_2$ is a disk-shaped morphological structuring element with a radius of 2 pixels. $I(\cdot)$ represents the infrared image intensities, and $I(B)_{\max}$ and $I(\theta)$ are the maximum and ... Furthermore, we use the fusion results of the multiderivative descriptor to enhance the targets by (23), resulting in the true targets.

The results in Fig. 10 show that the proposed constrained random walker can detect small targets of different sizes. Fig. 10(a) shows a small rectangular target in a cloudy background, where our method makes the target more prominent while suppressing cloud clutter. Fig. 10(b) shows a large target with an irregular shape, where our method effectively preserves the texture and improves the integrity of the target. Fig. 10(c) shows a ship-shaped target with a blurred boundary, where our method distinguishes the edges well and detects targets of various shapes. In short, the details of the small targets are well preserved.
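The surrounding background region $B$ can be illustrated with a set-based dilation. This is a minimal sketch under assumptions: the disk-shaped element is taken as all offsets within Euclidean distance 2 of the origin (other discretizations of a disk exist), pixels are not clipped to the image, and the function names are hypothetical.

```python
def disk(radius):
    """Offsets (dr, dc) within Euclidean distance `radius` of the origin."""
    return [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def dilate(pixels, offsets):
    """Binary dilation of a pixel set by a structuring element."""
    return {(r + dr, c + dc) for r, c in pixels for dr, dc in offsets}


def background_ring(theta, radius=2):
    """B = (theta dilated twice) minus (theta dilated once)."""
    d = disk(radius)
    once = dilate(theta, d)
    return dilate(once, d) - once
```

Because the once-dilated region is removed, the ring never touches the segmented target pixels themselves: it starts beyond a 2-pixel guard band around θ and is about 2 pixels thick.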

IV. EXPERIMENTS AND DISCUSSION
In this section, we use three public datasets published for the detection of small infrared targets. We present the experimental results of the proposed method, where several metrics are employed to quantitatively evaluate the performance. Furthermore, we compare the proposed method with state-of-the-art methods. The conventional methods include MDWCM [18], FKRW [7], PSTNN [14], and FAMSIS [20], and the deep learning methods include attentional local contrast networks (ALCNet) [21] and asymmetric contextual modulation (ACM) [23]. We also discuss the impact of different parameter settings on the proposed method. All experiments are conducted on a PC with a 2.30-GHz Intel i5-6300HQ CPU and 4.00-GB memory, using MATLAB 2016b.

1) Datasets Introduction:
To validate the effectiveness and robustness of the proposed method, we conducted tests on the public datasets MDvsFA (Dataset1) [31], SIRST (Dataset2) [23], and IRSTD-1k (Dataset3) [32]. A detailed description of the real infrared image datasets is given in Table I. The three datasets all consist of real infrared sequence images, which can effectively test the performance of the proposed method.
2) Evaluation Metrics: We used the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) to assess the performance of the compared methods. The ROC curve represents the varying relationship between the detection probability $P_d$ and the false alarm rate $F_a$, which are defined as follows:

$$P_d = \frac{\text{number of true targets detected}}{\text{number of actual targets}} \tag{24}$$

$$F_a = \frac{\text{number of false pixels detected}}{\text{number of total pixels in images}}. \tag{25}$$
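As a worked illustration of (24) and (25), the following sketch computes the two rates and approximates the AUC from a set of ROC points. The trapezoidal rule and the function names are assumptions made for this example; the article does not state how the AUC is integrated.

```python
def detection_probability(true_detections, actual_targets):
    """P_d: detected true targets divided by the number of actual targets."""
    return true_detections / actual_targets


def false_alarm_rate(false_pixels, total_pixels):
    """F_a: falsely detected pixels divided by the total number of pixels."""
    return false_pixels / total_pixels


def auc(roc_points):
    """Area under a ROC curve given as (F_a, P_d) points, by the trapezoidal rule."""
    pts = sorted(roc_points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )
```

Each ROC point is obtained by fixing one segmentation threshold on the response image and evaluating both rates; sweeping the threshold traces the curve.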
The signal-to-clutter ratio (SCR), SCR gain (SCRG), and background suppression factor (BSF) describe the degree of contrast between the response image and the original image, and are defined as follows:

$$\mathrm{SCR} = \frac{|\mu_t - \mu_b|}{\sigma_b} \tag{26}$$

where $\mu_t$ denotes the average pixel value of the target, and $\sigma_b$ and $\mu_b$ represent the standard deviation and the average value of the pixels in the neighboring region;

$$\mathrm{SCRG} = \frac{\mathrm{SCR}_{out}}{\mathrm{SCR}_{in}} \tag{27}$$

where $\mathrm{SCR}_{out}$ and $\mathrm{SCR}_{in}$ are the SCR values of the processed image and the original image, respectively; and

$$\mathrm{BSF} = \frac{\sigma_{in}}{\sigma_{out}} \tag{28}$$

where $\sigma_{in}$ and $\sigma_{out}$ denote the standard deviation of the background region (i.e., the whole image except the target region) in the original image and the response image, respectively. The computation of SCR involves the standard deviation of the image intensities in the background pixels around the target pixels, which for some methods may be close to zero. In that case, SCR tends to infinity and cannot be used to evaluate the performance. Consequently, another metric is adopted to evaluate the performance of target enhancement, namely, the contrast gain (CG) [33], which is defined as

$$\mathrm{CG} = \frac{\mathrm{CON}_{out}}{\mathrm{CON}_{in}} \tag{29}$$

where $\mathrm{CON}_{out}$ and $\mathrm{CON}_{in}$ denote the contrast (CON) of the response image and the original infrared image, respectively. The CON is defined as

$$\mathrm{CON} = |\mu_t - \mu_b| \tag{30}$$

where $\mu_t$ and $\mu_b$ are the same as those in (26).
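The metrics in (26) to (30) can be sketched in a few lines. The sketch assumes that pixel values are passed as flat lists, that the population standard deviation is used, and that SCR is reported as infinity when the neighborhood has zero variance; these choices and the function names are assumptions for illustration.

```python
import math
import statistics


def con(target, neighborhood):
    """CON = |mu_t - mu_b|, from target and neighboring-region pixel values."""
    return abs(statistics.fmean(target) - statistics.fmean(neighborhood))


def scr(target, neighborhood):
    """SCR = |mu_t - mu_b| / sigma_b; infinite when sigma_b is zero."""
    sigma_b = statistics.pstdev(neighborhood)
    if sigma_b == 0:
        return math.inf
    return con(target, neighborhood) / sigma_b


def gain(value_out, value_in):
    """Ratio used for both SCRG = SCR_out / SCR_in and CG = CON_out / CON_in."""
    return value_out / value_in


def bsf(background_in, background_out):
    """BSF = sigma_in / sigma_out over the background pixels."""
    return statistics.pstdev(background_in) / statistics.pstdev(background_out)
```

The zero-variance branch makes the motivation for CG explicit: a method that flattens the neighborhood completely yields an infinite SCR, whereas CON and CG remain finite.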
We also employ the intersection over union (IoU) and normalized IoU (nIoU) to evaluate the proposed method. IoU is a pixel-level metric that evaluates the contour description capability of the algorithm, and nIoU is specifically designed as a more balanced metric between conventional methods and deep learning methods:

$$\mathrm{IoU} = \frac{TP}{T + P - TP} \tag{31}$$

$$\mathrm{nIoU} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP[i]}{T[i] + P[i] - TP[i]} \tag{32}$$

where $n$ is the total sample number, and $TP$, $T$, and $P$ denote the true positive, true, and positive pixels, respectively.
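The difference between the two metrics is easiest to see in code. In this sketch, IoU accumulates the pixel counts over the whole dataset before dividing, whereas nIoU averages the per-image ratios; the accumulation over the dataset, the flat 0/1 mask representation, and the value 0 for an empty union are assumptions for illustration.

```python
def _counts(pred, true):
    """Return (TP, T, P) for one image given flat 0/1 masks."""
    tp = sum(1 for p, t in zip(pred, true) if p and t)
    return tp, sum(1 for t in true if t), sum(1 for p in pred if p)


def iou(preds, trues):
    """Dataset-level IoU: counts are accumulated over all images first."""
    tp = t = p = 0
    for pred, true in zip(preds, trues):
        a, b, c = _counts(pred, true)
        tp, t, p = tp + a, t + b, p + c
    union = t + p - tp
    return tp / union if union else 0.0


def niou(preds, trues):
    """Normalized IoU: mean of the per-image IoU values."""
    scores = []
    for pred, true in zip(preds, trues):
        tp, t, p = _counts(pred, true)
        union = t + p - tp
        scores.append(tp / union if union else 0.0)
    return sum(scores) / len(scores)
```

With one image containing a tiny, half-matched target and another containing a larger, perfectly matched target, IoU is dominated by the larger target, while nIoU weights both images equally.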

B. Qualitative Comparisons
Multiscale target detection and complex background suppression have always been two major difficulties in infrared small target detection. The proposed method can effectively detect diverse targets in complex backgrounds, as shown in Figs. 11 and 12.
In Fig. 11, the first row shows the original images containing the smaller targets, and the third row shows the original images containing the larger targets. The second and fourth rows show the corresponding detection results of our method. The enlarged results are shown in the bottom-left corner of each image. As can be seen, our method not only significantly enhances targets of different sizes, shapes, and brightnesses but also preserves their texture well. Overall, our method has good detection performance for diverse small targets.
Fig. 11. Examples of multisize infrared target detection results of our method.

The six representative images in Fig. 12 show various types of complex background, including cloud clutter, highlighted areas, highlighted edges, and block noise. From left to right, the background of the six images becomes increasingly complex. The first row exhibits the original images; the second row exhibits the corresponding 3-D images; and the third row exhibits the corresponding 3-D images of the detection results. Cloud clutter, highlight interference, and irregular noise in the original image can be suppressed to highlight the real targets. It can be seen intuitively from the 3-D graphs of the detection results that the proposed method has a significant suppression effect on various types of complex background.

Fig. 13 shows some images of the detection results for all the compared methods and our method. Seven representative infrared images are selected as part of the visualization results for presentation. In these images, the size of the targets, the type of clutter, and the complexity of the background are significantly different. The experimental results show that the proposed method attains good detection performance for multiscale targets in complex backgrounds. For weak targets in complex backgrounds (sixth row), almost all methods are disturbed by clutter, leading to more false alarms. Although PSTNN suppresses complex backgrounds, it can also lead to the loss of targets. It can be seen that our method effectively preserves the target while suppressing the complex background. There are bright edges in the fourth and seventh images, where many algorithms leave clutter residue, but our algorithm suppresses the edge clutter well. The fifth image contains more cloud clutter, and it can be seen that the deep learning-based methods misdetect the block clouds as targets and introduce false alarms.
For a more intuitive presentation, the detection results of the different algorithms in Fig. 13 are shown as 3-D views in Fig. 14, where the background suppression and target enhancement effects of each algorithm can be seen more directly. The proposed method suppresses various types of clutter and detects the real target in the complex background. FKRW is sensitive to bright areas, susceptible to interference, and not suitable for large targets. PSTNN suppresses clutter but also removes weak targets, so many targets are missed. In contrast, our algorithm detects diverse targets and effectively suppresses complex backgrounds.
Table II lists the CON and CG results on the three datasets. Our method obtains the highest CON and CG values on Dataset2. Although MDWCM and FAMSIS achieve slightly higher CON and CG values than our method on Dataset1 and Dataset3, our method achieves the highest $P_d$ on the three datasets, as shown in Table IV.
Table III shows the average BSF obtained by all methods. The average BSF is the sum of the BSF values over all images in a dataset divided by the number of images, and it evaluates the background suppression ability of an algorithm. Because it relies on the prior assumption that targets have the highest brightness, the human vision system-based method MDWCM cannot effectively enhance targets against bright backgrounds, resulting in an unsatisfactory BSF. PSTNN, a data structure-based method, suppresses the background more effectively than MDWCM and achieves a higher BSF value. Compared with these methods, the proposed method achieves the best background suppression performance. In particular, the BSF values of our method are much higher than those of FKRW. This demonstrates that the proposed multiderivative descriptor can suppress complex backgrounds with a low signal-to-noise ratio and enhance the targets.
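The averaging used for Table III can be illustrated with a short sketch. It assumes the commonly used definition BSF = sigma_in / sigma_out, where sigma_in and sigma_out are the standard deviations of the background pixels before and after processing. The function names and the toy pixel lists are hypothetical and are not the evaluation code used in this article.

```python
from statistics import pstdev


def bsf(background_in, background_out):
    """Background suppression factor for one image.

    Assumed definition: sigma_in / sigma_out, the ratio of the background
    standard deviation before and after processing.
    """
    sigma_out = pstdev(background_out)
    if sigma_out == 0:
        # A perfectly flat output background has no finite BSF.
        return float("inf")
    return pstdev(background_in) / sigma_out


def average_bsf(image_pairs):
    """Sum of the per-image BSF values divided by the number of images."""
    values = [bsf(before, after) for before, after in image_pairs]
    return sum(values) / len(values)


# Toy background pixels (hypothetical values, flattened to 1-D lists).
pairs = [
    ([0, 2, 0, 2], [0, 1, 0, 1]),  # sigma 1.0 -> 0.5, BSF = 2
    ([0, 4, 0, 4], [0, 1, 0, 1]),  # sigma 2.0 -> 0.5, BSF = 4
]
print(average_bsf(pairs))
```

A larger value means that the processed background fluctuates less than the original background, which is why a higher average BSF is read as stronger background suppression.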
The ROC curves of the compared methods on the three datasets are shown in Figs. 15, 16, and 17, respectively. Our method obtains the best performance on all three datasets. On Dataset1, when $F_a$ reaches $0.2 \times 10^{-4}$, the $P_d$ of the proposed method reaches 0.66, which is much higher than that of the compared methods. Furthermore, the $P_d$ of our method reaches 0.9 when $F_a$ reaches $0.8 \times 10^{-4}$, the best performance among the compared methods. Since MDWCM relies on intensity distribution values, it fails when high-brightness clutter from a complicated background is present in the infrared image. On Dataset2, the targets vary in size (from a few pixels to about a dozen pixels) and in brightness. Because FKRW assumes a target brightness distribution and a fixed-scale prior, it achieves poor detection performance on this dataset. Compared with FKRW, the proposed multiderivative descriptor can accurately detect both dark and bright targets, and the adaptive matching algorithm can locate the local region of each candidate for segmentation. As a result, our method attains a $P_d$ above 0.9. On Dataset3, which has complex backgrounds, the $F_a$ values of conventional methods such as MDWCM, FAMSIS, and PSTNN are higher than on Dataset1 and Dataset2, which indicates that these methods cannot accurately detect targets in complex backgrounds. In contrast, the proposed multiderivative descriptor suppresses the various clutter generated by the complex background using four first-order facet kernels and a second-order facet kernel, reaching a $P_d$ of 0.9 at an $F_a$ of $0.6 \times 10^{-4}$. Table IV lists the results of these methods evaluated by $P_d$, $F_a$, and AUC. The best $P_d$ values of the conventional methods on the three datasets are 0.7937, 0.8354, and 0.8357, respectively.
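As a hedged illustration of how the points on such ROC curves can be obtained, the sketch below sweeps a threshold over a detection map. It assumes that the detection probability is the number of detected targets divided by the number of true targets, that the false alarm rate is the number of false alarm pixels divided by the total number of pixels, and that a target counts as detected when any pixel of its ground-truth region exceeds the threshold. The function name, the matching rule, and the toy data are assumptions for illustration only.

```python
def roc_points(score_maps, target_regions, thresholds):
    """Return one (false alarm rate, detection probability) pair per threshold.

    score_maps:     list of 2-D lists of detection scores, one per image
    target_regions: per image, a list of sets of (row, col) pixels,
                    one set per ground-truth target
    """
    total_targets = sum(len(regions) for regions in target_regions)
    total_pixels = sum(len(m) * len(m[0]) for m in score_maps)
    points = []
    for t in thresholds:
        detected = 0
        false_pixels = 0
        for scores, regions in zip(score_maps, target_regions):
            target_pixels = set().union(*regions)
            # Assumed rule: a target is detected if any of its pixels passes.
            for region in regions:
                if any(scores[r][c] > t for r, c in region):
                    detected += 1
            # Every passing pixel outside all targets is a false alarm.
            for r, row in enumerate(scores):
                for c, value in enumerate(row):
                    if value > t and (r, c) not in target_pixels:
                        false_pixels += 1
        points.append((false_pixels / total_pixels, detected / total_targets))
    return points


# Toy 4 x 4 detection map (hypothetical values).
scores = [[0.0] * 4 for _ in range(4)]
scores[1][1] = 0.9  # true target
scores[3][3] = 0.5  # clutter residue
targets = [{(1, 1)}]

curve = roc_points([scores], [targets], thresholds=[0.4, 0.7, 0.95])
print(curve)
```

Lowering the threshold admits the clutter pixel as a false alarm, while raising it too far loses the target, which is the trade-off the ROC curves summarize.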
Compared with the conventional methods, our method achieves state-of-the-art infrared small target detection performance and attains the best $P_d$, $F_a$, and AUC values, as listed in Table IV. For the deep learning methods, the highest $P_d$ values are 0.7986 and 0.8856 on Dataset1 and Dataset2, respectively. The corresponding $P_d$ values of the proposed method are 0.8417 and 0.9213, which are higher than the $P_d$ values of ALC-Net [21]. Although ALC-Net designs a local contrast module to extract long-range context features and achieves the highest $P_d$ on Dataset3, the $P_d$ of our method still reaches 0.9213, which is almost equal to that of ALC-Net. Furthermore, the $F_a$ and AUC results of our method are better than those of ALC-Net.
Table V shows the IoU and nIoU results obtained by all the methods. IoU and nIoU reflect the segmentation performance of an algorithm, that is, the integrity of the target in the detection result. Interestingly, MDWCM has a relatively high detection rate but low IoU and nIoU. This indicates that its segmentation performance is poor: only part of each target is detected, and the integrity of the target cannot be guaranteed. The deep learning methods achieve better segmentation performance, whereas the conventional methods have lower IoU and nIoU. Because our method is based on pixel-level segmentation, it shows superior target segmentation performance and significantly improves the integrity of the target.
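The difference between a high detection rate and a high IoU can be made concrete with a small sketch. It assumes the definitions commonly used in the infrared small target segmentation literature: IoU pools the intersection and union pixels over the whole dataset, whereas nIoU averages the per-image IoU. The function name and the toy masks are hypothetical.

```python
def iou_and_niou(pred_masks, true_masks):
    """Return (IoU, nIoU) for paired predicted and ground-truth masks.

    Each mask is a set of (row, col) pixel coordinates.
    Assumed definitions: IoU pools intersections and unions over the
    dataset; nIoU is the mean of the per-image IoU values.
    """
    total_inter = 0
    total_union = 0
    per_image = []
    for pred, true in zip(pred_masks, true_masks):
        inter = len(pred & true)
        union = len(pred | true)
        total_inter += inter
        total_union += union
        if union > 0:
            # Images with no predicted and no true pixels are skipped.
            per_image.append(inter / union)
    iou = total_inter / total_union if total_union else 0.0
    niou = sum(per_image) / len(per_image) if per_image else 0.0
    return iou, niou


# Image 1: the target is hit, but only half of its pixels are recovered.
pred_1 = {(0, 0), (0, 1)}
true_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
# Image 2: a single-pixel target is recovered exactly.
pred_2 = {(5, 5)}
true_2 = {(5, 5)}

iou, niou = iou_and_niou([pred_1, pred_2], [true_1, true_2])
print(iou, niou)
```

In image 1 the target would still count as detected, yet its per-image IoU is only 0.5, which mirrors the case of a method with a high detection rate but low IoU and nIoU.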

C. Parameter Sensitivity Analysis
In this section, we perform experiments to analyze the effect of the parameter values on the proposed method. The parameter k controls the extraction of the candidate pixels: a smaller k retains more candidate pixels, whereas a larger k removes more of them. It is therefore an important parameter for balancing the detection probability and the false alarm rate. The ROC curves show the $P_d$ and $F_a$ results obtained by the proposed method on the three datasets for different values of k ∈ [2, 10]. According to the experimental results, when k is between 5 and 6, the detection probability is stable and the false alarm rate stays within a limited range. To minimize the false alarm rate while maintaining a high detection probability, we set k to 5.
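The influence of k on the number of candidate pixels can be sketched as follows. The exact thresholding rule is not restated in this section, so the sketch assumes a common adaptive form, threshold = mean + k × standard deviation of the response map; this rule, the function name, and the toy response values are assumptions rather than the implementation used in this article.

```python
from statistics import mean, pstdev


def candidate_pixels(response_map, k):
    """Return the (row, col) positions whose response exceeds the threshold.

    Assumed rule: threshold = mean + k * standard deviation of the map.
    """
    values = [v for row in response_map for v in row]
    threshold = mean(values) + k * pstdev(values)
    return [
        (r, c)
        for r, row in enumerate(response_map)
        for c, v in enumerate(row)
        if v > threshold
    ]


# Toy 5 x 5 response map (hypothetical values).
response = [[0.0] * 5 for _ in range(5)]
response[2][2] = 10.0  # strong response
response[0][4] = 3.0   # weak response

for k in (1, 2):
    print(k, candidate_pixels(response, k))
```

With k = 1 both responses pass the threshold, while with k = 2 only the strong one remains, which illustrates why a larger k lowers the false alarm rate at the risk of discarding weak targets.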
We also use computational load as an evaluation metric, and the results are listed in Table VI. Conventional methods require only inference time, whereas deep learning methods require both training time and inference time. Compared with the conventional methods, our inference time is not markedly longer and our $P_d$ is higher. Compared with the deep learning methods, our overall time is much shorter and our $P_d$ is also higher.

V. CONCLUSION
To enhance clutter suppression in complex backgrounds and further improve the detection of targets with varying characteristics, this article proposes a novel method based on singularity analysis and a constrained random walker (CRW). A multiderivative descriptor based on facet fitting transforms small targets into singularities in directional channels of different orders, and these singularities compose the candidate pixels. At the same time, the missing singular points inside the target are restored to improve the integrity of the candidate pixels. The analysis, search, fusion, and recovery of singular points not only enhance the targets but also suppress the background clutter effectively. In addition, the proposed CRW algorithm, based on the target-connected domain, uses an adaptive matching technique to select an appropriate segmentation region for each candidate point. This allows the background seed points to be located accurately and the foreground to be distinguished effectively from the background. The experiments demonstrate that the proposed method outperforms other state-of-the-art methods in target detection and background suppression.
There are still some issues worth considering. Our method still leaves much room for improvement in the segmentation performance, and our future research directions include improving the segmentation algorithm and the overall detection efficiency.