Multi-Scale Continuous Gradient Local Binary Pattern for Leaky Cable Fixture Detection in High-Speed Railway Tunnel

Features of leaky cable fixtures extracted by Local Binary Pattern (LBP) and its variants in high-speed railway tunnels suffer from insufficient descriptive power and high dimensionality. This paper proposes a new operator named Multi-scale Continuous Gradient Local Binary Pattern (MCG-LBP), which realizes the scale transformation of feature maps while keeping the descriptor low-dimensional. In MCG-LBP, a bi-directional triplet around the central pixel is first presented to indicate the specific gradient direction in the circular neighborhood. Then, an effective dimensionality reduction strategy is introduced to perform successive down-sampling iterations. Finally, the multi-scale joint descriptor is encoded from continuous gradient sequences on different down-sampling maps, and a Support Vector Machine is used to classify faulty cable fixtures. The proposed MCG-LBP yields a discriminative description through the complementary gradient information generated by combining different single-scale features. Its low descriptor dimensionality and the absence of complex parameters to tune also give it higher computational efficiency. Experimental results show that the Recall and Precision of MCG-LBP reach 92.6% and 83.5% respectively on the cable fixture data set, which is superior to the state-of-the-art methods.


I. INTRODUCTION
High-speed rail is one of the most economical modes of transportation globally, and it is also a landmark product of modern informatization construction. As transportation demand increases, it is particularly essential to maintain the regular operation of railway communication systems. Network coverage in mountain tunnel sections is mainly provided by leaky cables, which are usually hung on the tunnel wall and fixed with specialized fixtures [1]. Fig. 1 shows the leaky cable fixtures, which are seriously affected by the air pressure and power waves generated when a high-speed train passes through the tunnel. At the same time, the humid geological environment of the tunnel accelerates the loosening or even detachment of cable fixtures. Manual positioning and one-by-one regular inspection is not only severely restricted by environmental and human factors, but also poses great safety hazards. With the increasing mileage of railway tunnels, traditional inspection methods can no longer meet actual needs, and automatic inspection has become an inevitable trend in cable fixture detection. To achieve this, a high-speed camera first needs to be mounted on the train to capture a complete image of the tunnel section. Then computer vision approaches are used to extract cable fixture characteristics frame by frame to carry out the inspection [2]. Currently, the data acquisition part has already been implemented, but the detection part still mainly relies on manual video playback for troubleshooting, which requires a lot of human resources. As for handcrafted methods, Shang uses Snake to determine the location of the leaky cable and Haar features to extract the characteristics of the fixtures, which are easily disturbed by wall cracks and background noise [3].
To achieve fixture detection, Ma uses YOLOv4 [4] and Zhang uses SSD [5], but their results are limited by the insufficient data set of faulty cable fixtures. Therefore, it is necessary to continuously optimize the feature extraction algorithm, reduce the dependence on the fault data set, and improve the anti-interference ability. Comprehensive surveys of texture descriptors can be found in [6].
Owing to its discriminative quality and the ease of training with a small amount of data, the Local Binary Pattern (LBP) proposed by Ojala et al. is well suited to cable fixture detection [7]. Moreover, feature extraction and texture classification based on statistical learning of LBP are widely used in other fields [5]-[6], such as fault detection [8], medical image analysis [9], face detection and recognition [10], texture classification [11], and many others. Cable fixture fault detection in this paper is just one application of this line of research, which makes good use of texture analysis through the spatial distribution of gray level values. The conventional way to improve the detection performance of LBP is to mine texture information more deeply or to fuse multi-scale features, which helps to deal with external changes in imaging conditions such as rotation [12][13], scaling, illumination [14], and noise interference based on a threshold scheme [15]. Many variants have been proposed since the emergence of LBP. Local Ternary Pattern (LTP) quantizes the local difference into three levels [16]; it responds sensitively to gray-scale changes and is also easily affected by noise. Center-Symmetric Local Binary Pattern (CS-LBP), a representative variant of LBP, was proposed to enhance robustness, and its use of center-symmetric difference information greatly reduces computational complexity and feature dimensionality [17]. However, CS-LBP ignores the role of the central pixel, and its parameters are troublesome to tune. Completed Local Binary Pattern (CLBP) consists of three complementary components built from sign information, magnitude information, and the mean gray level [18]. The CLBP operator takes both local structure and textural magnitude into consideration to achieve much better classification accuracy.
Completed Local Ternary Pattern (CLTP) subdivides texture features further, but it suffers from the serious limitation of excessively large dimensions [19]. The operator called Attractive-and-Repulsive Center-symmetric Local Binary Pattern (ARCS-LBP) establishes relations between the central pixel and center-symmetric pixel pairs to improve the quality of descriptors [20]. To highlight the advantages of gradient components, Local Derivative Pattern (LDP) is computed over derivative images in four directions and is coded by comparing the edge and corner responses [21]. However, none of the above operators can achieve feature scale transformation. Three-Patch Local Binary Pattern (TPLBP) enlarges the feature scale by extending the extraction unit from pixel to block and calculating the similarity difference between relevant blocks [22]. Local Directional Ternary Pattern (LDTP) uses masks to expand the scale, and combines the ternary mode with derivative transformation to enhance the expression of edge features [23]. But it also requires a large amount of computation, especially for a small target in a large image. In order to realize the automatic detection of leaky cable fixtures, issues such as incomplete expression, high dimensionality, and complex calculation need to be resolved.
In this paper, we propose an effective operator to achieve cable fixture detection, and the main contributions are highlighted as follows:
• The bi-directional triplet (BDT) is presented to express a more specific gradient direction in the circular neighborhood, which also avoids complicated parameter adjustments.
• An effective dimensionality reduction strategy is introduced in the down-sampling process. We perform non-maximum suppression according to the key gradient frequency in a Cell unit to obtain the next down-sampling map, which realizes the scale transformation and further improves the computational efficiency of extracting continuous gradient features.
• The continuous gradient feature is proposed to obtain the edge and corner features of the cable fixture contour; it counts the combinations of relevant continuous gradient sequences in a Block unit. The final MCG-LBP descriptor is constructed by cascading the continuous gradient features extracted from multiple down-sampling maps of different scales.
The rest of this paper is organized as follows. Section II briefly discusses related methods such as LBP, some of its variants, and non-LBP operators. Section III describes BDT, the dimensionality reduction strategy, and the continuous gradient feature in detail. Section IV presents extensive experimental results compared with other LBP variants and non-LBP operators. The conclusion is given in Section V.

II. RELATED WORK
In this part, we mainly introduce feature extraction algorithms such as LBP, CS-LBP and ARCS-LBP. A review of other LBP variants and non-LBP descriptors is also given below.

A. LBP AND CS-LBP
LBP was originally proposed by Ojala et al., and it can capture sharp gray-level differences in fine texture [7]. A slight change to the encoding method gives this operator better orientation invariance and rotation invariance, so it is widely used in face recognition and texture classification. The coding idea of LBP is to compare the gray value of the center pixel with those of $P$ sampling pixels in a circular neighborhood of radius $R$. A Boolean function then produces a binary sequence that contains only 0 or 1. The LBP formula is defined as follows:

$$LBP_{P,R}(x_c, y_c) = \sum_{i=0}^{P-1} s(g_i - g_c)\, 2^i, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

where $(x_c, y_c)$ is the coordinate of the center pixel $c$, $g_c$ and $g_i$ $(i = 0, 1, \ldots, P-1)$ denote the gray values of the center pixel $c$ and of the sampling pixels in the neighborhood respectively, $P$ is the total number of sampling points, and $R$ is the radius of the circular neighborhood.
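As an illustration of the coding rule described above, the following is a minimal Python sketch of the basic LBP operator for P = 8 and R = 1. The function name `lbp_8_1`, the neighbour ordering, and the use of integer pixel positions for the diagonal neighbours (instead of bilinear interpolation on the circle) are assumptions made for brevity, not details taken from the paper.

```python
import numpy as np

# (row, col) offsets of the P = 8 neighbours at R = 1, counter-clockwise from 0 degrees.
# Assumption: diagonal neighbours are read at integer pixel positions instead of
# being bilinearly interpolated on the circle.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def lbp_8_1(image):
    """Return the LBP code (0..255) of every interior pixel of a gray image."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    codes = np.zeros_like(center)
    for i, (dy, dx) in enumerate(OFFSETS):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # s(g_i - g_c) = 1 when g_i >= g_c; the bit is weighted by 2**i.
        codes += (neighbour >= center).astype(np.int32) << i
    return codes

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 8, 3]])
print(lbp_8_1(patch))  # [[69]]: bits 0, 2 and 6 are set (neighbours 7, 9, 8)
```

With eight comparisons per pixel, the code takes one of 256 values, which is the histogram length that CS-LBP later reduces.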
In order to enhance the performance of the LBP descriptor in the spatial direction, CS-LBP was introduced by Heikkila et al. [17]. Whereas the LBP operator encodes the gray-level difference between each sampling point and the center pixel, CS-LBP encodes the gray-level difference of pixel pairs that are symmetric about the center pixel, and the threshold needs to be determined experimentally. The CS-LBP formula is defined as follows:

$$CS\text{-}LBP_{P,R,T}(x_c, y_c) = \sum_{i=0}^{(P/2)-1} s\left(g_i - g_{i+(P/2)}\right) 2^i, \qquad s(x) = \begin{cases} 1, & x > T \\ 0, & \text{otherwise} \end{cases}$$

where $g_i$ $(i = 0, 1, \ldots, (P/2)-1)$ and $g_{i+(P/2)}$ both correspond to the gray values of peripheral pixels, and $T$ is the threshold, which can generally be specified from the global average gray value of the input picture. Clearly, CS-LBP is closely related to the gradient operator, as it relies on the gray-level difference between pixel pairs in a circular neighborhood.
The main advantages of the CS-LBP operator are the following three aspects:
1) As shown in Fig. 2, under the same sampling number and sampling radius, the feature vector of the CS-LBP operator is more compact, fewer comparisons are required, and the feature dimension of the descriptor is also lower [17].
2) CS-LBP inherits good properties from both texture and gradient operators [24].
3) The CS-LBP descriptor has higher stability on flat regions and performs more robustly than descriptors that use only texture or only gradient [25].
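The compactness argument can be made concrete with a short Python sketch of CS-LBP for P = 8 and R = 1: only four center-symmetric pairs are compared, so the code takes one of 16 values. The function name `cs_lbp_8_1`, the neighbour ordering, the integer-position sampling, and the threshold values in the example are illustrative assumptions.

```python
import numpy as np

# Same neighbour ordering as a standard 8-point ring; index i and i + 4 are
# center-symmetric. Assumption: integer pixel positions, no interpolation.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def cs_lbp_8_1(image, threshold):
    """Return the CS-LBP code (0..15) of every interior pixel of a gray image."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    for i in range(4):
        (dy1, dx1), (dy2, dx2) = OFFSETS[i], OFFSETS[i + 4]
        g_i = img[1 + dy1:h - 1 + dy1, 1 + dx1:w - 1 + dx1]
        g_opp = img[1 + dy2:h - 1 + dy2, 1 + dx2:w - 1 + dx2]
        # One bit per symmetric pair: set when g_i - g_(i+P/2) exceeds T.
        codes += ((g_i - g_opp) > threshold).astype(np.int32) << i
    return codes

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 8, 3]])
print(cs_lbp_8_1(patch, threshold=2))  # [[1]]: only the pair (7, 4) differs by more than 2
print(cs_lbp_8_1(patch, threshold=0))  # [[13]]: pairs (7, 4), (9, 8) and (5, 3)
```

Note that the center value 6 never enters the computation, which is the limitation that ARCS-LBP and the bi-directional triplet set out to address.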

B. ARCS-LBP
The CS-LBP operator ignores the key role of the center point in texture feature extraction, which leads to an incomplete representation of directional characteristics. ARCS-LBP, proposed by Y. El Merabet, addresses this shortcoming by taking the center point into account [20]. When calculating the magnitude difference of a pixel pair, however, it is still quite difficult to adjust the threshold, especially for images with obvious brightness differences.

C. OTHER LBP VARIANT AND NON-LBP OPERATORS
The calculation process of the LBP operator is very simple and easy to modify, which makes it suitable for different application requirements including texture classification, face recognition, scene classification, and medical image analysis. Pan et al. proposed Local Vector Quantization Pattern (LVQP) for texture classification [26]. Chakraborty et al. offered Local Directional Gradient Pattern (LDGP) and Local Gradient Hexa Pattern (LGHP) for facial image recognition and retrieval [27]-[28]. Local Maximum Edge Binary Pattern (LMEBP) and Local Tetra Pattern (LTrP) were proposed to obtain edge and corner information for fault detection [29]-[30]. These deep mining operators are mainly used to extract texture features instead of contour features. Another classic local operator, Histograms of Oriented Gradients (HOG), is mainly used to detect the contour features of a target by statistically analyzing the gradient magnitude distribution over different orientations [31]. Hence fusion descriptors such as HOG-LBP and HOG-CLBP have been proposed [32]-[33]. Dimensionality reduction techniques are also applied to these fusion descriptors, including Principal Component Analysis (PCA) [34], Linear Discriminant Analysis (LDA) [35]-[36], Fisher's Linear Discriminant Analysis (FLD) [37], and Independent Component Analysis (ICA) [38]. Li et al. used a two-dimensional histogram built from LBP for wood defect classification [39]. However, both the deep mining algorithms and the feature fusion algorithms are limited in scale transformation, and it is also difficult to reduce the feature dimension of their descriptors. Although deep learning has made outstanding contributions to fault detection in recent years, for example the enhanced SSD-based network for faulty cable fixtures [40], it still suffers from computational complexity and difficult parameter adjustment [41], especially when the faulty fixture data set is small. Consequently, local features based on statistical learning appear better suited to cable fixture detection [42].

III. PROPOSED METHOD
The CS-LBP and ARCS-LBP operators only roughly indicate that there are gradient differences along some diameter directions within the circular neighborhood, and neither of them points out the specific gradient direction of the pixel pairs. Therefore, this paper proposes a bi-directional triplet model based on the fundamental idea of calculating the gray difference of a center-symmetric pixel pair. The center-symmetric pixel pair and its center pixel are combined into a triplet, and the specific gradient direction is determined from the ratio of the differences between the center pixel and the pixels on both sides of this triplet. Since the output of this step is a binary sequence and no decimal conversion is performed, the number of channels in the resulting feature maps is the same as the number of sampling points. We then implement the scale transformation of the multi-channel binary feature maps. The final MCG-LBP descriptor is constructed by extracting continuous gradient features from a series of down-sampling maps at different scales, and Fig. 4 shows the overall process of feature extraction. The approaches involved are described in this section.

A. THE BI-DIRECTIONAL TRIPLET
For convenience of presentation, the same sampling conditions as for the LBP and CS-LBP operators above are used when calculating the BDT value. In the circular neighborhood, the number of sampling pixels $P$ is 8, and the sampling radius $R$ is 1. Firstly, bilinear interpolation is used to obtain the gray values of all sampling pixels in the circular neighborhood. In each triplet, the gray-level differences between the center pixel and the two center-symmetric sampling pixels are calculated respectively. Then the ratio of these two differences is compared with the threshold to determine the specific gradient direction. Finally, the gradient directions of all triplets in the circular neighborhood are calculated, and the output of BDT is obtained. The complete calculation process is shown in Fig. 5, and we explain the four groups of triplets in detail. The calculation of BDT always uses the triplet as the basic unit, and the values of $u$, $v$, $w$ for each group need to be calculated in advance. For Group0, the value of $-u_0/v_0$ is negative, so the two count bits 0 and 4, corresponding to the gradient directions of 0° and 180°, are both marked as 0, which means the triplet does not output a gradient direction. For Group1, the value of $-u_1/v_1$ is 1.5, which is within the range of the threshold $T$, and the other constraint $w_1$ is positive. So count bit 1 is marked as 1 and count bit 5 is marked as 0, which means the output gradient direction of this triplet is 45°. For ease of illustration, the value of $T$ is fixed in this example, although in practice it needs to be determined experimentally. For Group2, although the value of $-u_2/v_2$ is positive, it is out of the range of $T$, so the two count bits of this triplet are both marked as 0, which also means that the pixel pair has no gradient direction. For Group3, the value of $u_3$ is 0, so there is no gradient direction for this pair, and the two count bits are also marked as 0.
In summary, the output value of this BDT is 00000010. The complete formula of BDT is defined as follow: If u = 0 and v = 0: where the result of $BDT(x_c, y_c)$ is a one-dimensional array with $P$ elements, and $x_i(m)$ $(i = 0, 1, \ldots, (P/2)-1)$ is the binary value of each bit in the array. Repeating the above operations over the input image, we obtain a complete preliminary gradient feature map with eight channels. Each pixel of the preliminary feature map corresponds to a unique one-dimensional array composed of eight binary digits. In short, there are four groups of triplets in the circular neighborhood, and each triplet has only three possible outputs, which means that the one-dimensional array of each pixel can take $3^4 = 81$ values.
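The four cases walked through above can be mimicked with a small Python sketch. Because the complete BDT formula is not reproduced here, the definitions used in the code are assumptions rather than the paper's exact formula: `u = g_i - g_c`, `v = g_opp - g_c`, `w = g_i - g_opp`, and a hypothetical threshold range `[t_low, t_high]` standing in for $T$. The gray values in the example are invented so that the four groups fall into the same four cases as in the text.

```python
def bdt_triplet(g_i, g_c, g_opp, t_low=0.5, t_high=2.0):
    """Return (bit_i, bit_opp) for one triplet.

    Assumptions (not confirmed by the paper): u = g_i - g_c, v = g_opp - g_c,
    w = g_i - g_opp, and the threshold T is the closed range [t_low, t_high].
    """
    u = g_i - g_c
    v = g_opp - g_c
    if u == 0 or v == 0:
        return 0, 0                      # Group3 case: no gradient direction
    ratio = -u / v
    if ratio <= 0:
        return 0, 0                      # Group0 case: ratio is negative
    if not (t_low <= ratio <= t_high):
        return 0, 0                      # Group2 case: ratio outside the range of T
    w = g_i - g_opp
    return (1, 0) if w > 0 else (0, 1)   # Group1 case: the sign of w picks the direction

def bdt(neighbours, g_c, t_low=0.5, t_high=2.0):
    """neighbours: P gray values ordered counter-clockwise starting at 0 degrees."""
    half = len(neighbours) // 2
    bits = [0] * len(neighbours)
    for i in range(half):
        bits[i], bits[i + half] = bdt_triplet(
            neighbours[i], g_c, neighbours[i + half], t_low, t_high)
    return bits

# Invented gray values: Group0 ratio < 0, Group1 ratio = 1.5, Group2 ratio = 10, Group3 u = 0.
bits = bdt([12, 13, 20, 10, 13, 8, 9, 7], g_c=10)
print("".join(str(b) for b in reversed(bits)))  # 00000010 (count bit 1, i.e. 45 degrees)
```

Under these assumptions each triplet yields (0, 0), (1, 0) or (0, 1), which is consistent with the three cases per triplet counted above.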

B. SCALE TRANSFORMATION AND DIMENSIONALITY REDUCTION
If the continuous gradient feature is extracted directly from the preliminary gradient feature map, the dimension of the resulting descriptor will be too large. Therefore, an effective dimensionality reduction strategy is proposed to decrease the total number of binary pixel value arrangements. At the same time, this strategy realizes the scale transformation by calculating different down-sampling feature maps rather than enlarging the radius of the sampling circle or dividing the input image into 2×2 or 4×4 sub-images. The value range of a single pixel is compressed, and the key gradient is used to represent the overall gradient direction trend in a Cell unit. The relationship between gradient directions and count bits is shown in Fig. 5, where the count bits, numbered from 0 to 7 counterclockwise, correspond to the eight gradient directions respectively. In a Cell unit, the values of all binary arrays are accumulated bit by bit, so that the direction corresponding to the count bit with the maximum accumulated value is the key gradient direction. In Fig. 6, the Cell is composed of four binary array pixels a, b, c, d whose values are calculated by BDT, and we take the key gradient direction of this Cell as an example. It can be seen from the cumulative result that the values in directions 45° and 90° are maximal, so the corresponding count bits 1 and 2 are marked as 1, and the other count bits are marked as 0.
FIGURE 6. Calculating the characteristics of the key gradient in a Cell; the output value is a one-dimensional array.
The final output coding value of this Cell is 00000110. The calculation formula is defined as follows:

$$Y(j) = \sum_{i} X_i(j), \qquad f(Y) = J \ \text{such that} \ Y(J) = \max_{j} Y(j)$$

where $X_i$ is the value of the $i$-th binary array in the Cell, $n$ is the total number of binary arrays in the Cell, and $j$ $(j = 0, \ldots, P-1)$ is the count bit, which also represents the channel index in the preliminary gradient feature map and the down-sampling feature maps. $Y$ is the accumulation of all one-dimensional arrays in the Cell, and the function $f(Y)$ outputs the value $J$ that corresponds to the channel with the maximum accumulation. Repeating the above operations over the entire preliminary gradient feature map yields the first down-sampling feature map. Note that one Cell can output multiple key gradients, which means that more than one channel may output 1. The same iteration is then performed on the first down-sampling feature map to obtain the second down-sampling feature map. In the down-sampling process, the stride is always 2, and the Cell size is 2×2. The size of the down-sampling feature map in each layer corresponds to the size of the sub-images produced by traditional segmentation such as 2×2, 4×4, and 8×8. Non-maximum suppression is performed according to the frequency of key gradients in a Cell, which also differs from max pooling in deep learning. Fig. 7 is a schematic diagram of the output of each down-sampling layer. Whereas the input image has only one channel, the preliminary gradient feature map and all down-sampling feature maps have eight channels. Through scale transformation, typical contour features such as edges and corners are preserved, and some isolated noise points or wall cracks are removed during the down-sampling process.
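The Cell-level accumulation and non-maximum suppression described above can be sketched in a few lines of NumPy. The function name `downsample_key_gradient` is hypothetical, the four arrays in the example are invented to reproduce the 00000110 outcome, and the handling of a Cell with no gradient at all (output all zeros) is an assumption that the text does not spell out.

```python
import numpy as np

def downsample_key_gradient(feature_map, cell=2):
    """Down-sample an (H, W, P) binary gradient map with non-overlapping cells.

    The stride equals the cell size (2 in the paper), so the output is (H//2, W//2, P).
    """
    fm = np.asarray(feature_map, dtype=np.int32)
    h, w, p = fm.shape
    h2, w2 = h // cell, w // cell
    cells = fm[:h2 * cell, :w2 * cell].reshape(h2, cell, w2, cell, p)
    votes = cells.sum(axis=(1, 3))               # Y: per-channel accumulation in each Cell
    peak = votes.max(axis=-1, keepdims=True)
    # Keep every channel that reaches the maximum (ties produce several key gradients).
    # Assumption: a Cell without any gradient (peak == 0) outputs all zeros.
    return ((votes == peak) & (peak > 0)).astype(np.uint8)

# Invented 2x2 Cell: pixel a votes for bits 1 and 2, b for bit 1, c for bit 2, d for bit 5.
cell_pixels = np.zeros((2, 2, 8), dtype=np.uint8)
cell_pixels[0, 0, [1, 2]] = 1
cell_pixels[0, 1, 1] = 1
cell_pixels[1, 0, 2] = 1
cell_pixels[1, 1, 5] = 1

key = downsample_key_gradient(cell_pixels)[0, 0]
print("".join(str(b) for b in key[::-1]))  # 00000110 (count bits 1 and 2)
```

Unlike max pooling, which keeps the largest activation per channel, this rule compares channels against each other inside a Cell and suppresses every direction that is not among the most frequent ones.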

C. THE EXTRACTION PROCESS OF CONTINUOUS GRADIENT FEATURE
Deep mining operators and fusion feature operators of LBP mostly perform texture analysis based only on the gray-level differences of local sampling points, and ignore the similarity between adjacent gradients. Therefore, we propose the continuous gradient feature, which responds well to the relationship between key gradients and the adjacent gradients that appear on multiple down-sampling feature maps. To distinguish it from the Cell unit in the down-sampling process above, a 3×3 window is defined in the extraction process, and this unit is named Block. The number of channels in a Block is also eight, and the stride is set to 1 for feature extraction.
The specific calculation process of extracting continuous gradient features is shown in Fig. 8. Firstly, the reference channels, whose central point value is 1, need to be determined in the Block unit. If no channel has a central point with value 1, the Block does not output a continuous gradient feature. It can be seen that only channel 2 and channel 4 meet the condition, so these two channels are used as references to extract continuous gradient features independently. The maximum continuous sequence length formed by the eight neighborhood sampling points in reference channel 2 is four, and the maximum continuous sequence lengths of the adjacent channels 1 and 3 are one and two respectively. The coding number 2412 belonging to reference channel 2 is thus obtained. In the same way, the coding number of reference channel 4 is 4222. Notice that the maximum continuous sequence length of reference channel 4 is less than three, which means that reference channel 4 does not output a value, so the final output of this Block is only 2412. The constraints on calculating continuous gradient features are summarized in the following three points:
1) At least one channel must have a center value of 1 in a Block with eight channels. If not, the Block does not output a continuous gradient feature.
2) The maximum continuous sequence length in the reference channel must not be less than three, which ensures the dominance of the reference gradient feature.
3) The reference channel and the adjacent channels can exchange roles in different Blocks, where their gradient characteristics complement each other. Therefore, when the maximum continuous sequence length of an adjacent channel exceeds three, the length is still marked as 3. The specific sequence information of adjacent channels does not affect their performance when they are used as reference channels in other Blocks.
Therefore, there are 768 (768 = 8×6×4×4) encoding types that meet the above conditions.
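The three constraints above can be sketched in Python as follows. Several details are assumptions for illustration only: the eight neighbours of the Block are treated as a circular sequence, channel adjacency wraps around (channel 0 is adjacent to channel 7), and the ring ordering `RING`, the helper names, and the example Block are invented. The example is built so that channel 2 produces 2412 and channel 4 (which would be 4222) is suppressed, as in the walkthrough.

```python
import numpy as np

# (row, col) of the 8 neighbours of a 3x3 Block, in ring order (assumed ordering).
RING = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]

def max_circular_run(bits):
    """Length of the longest run of 1s, treating the sequence as circular."""
    bits = [int(b) for b in bits]
    if all(bits):
        return len(bits)
    best = run = 0
    for b in bits + bits:            # doubling the list handles wrap-around runs
        run = run + 1 if b else 0
        best = max(best, run)
    return best

def channel_run(block, ch):
    return max_circular_run(block[y, x, ch] for y, x in RING)

def block_codes(block):
    """Return the continuous gradient codes of one 3x3x8 Block as strings."""
    codes = []
    for c in range(8):
        if block[1, 1, c] != 1:      # constraint 1: reference channel needs center value 1
            continue
        ref = channel_run(block, c)
        if ref < 3:                  # constraint 2: reference run must be at least 3
            continue
        left = min(channel_run(block, (c - 1) % 8), 3)    # constraint 3: cap at 3
        right = min(channel_run(block, (c + 1) % 8), 3)
        codes.append(f"{c}{ref}{left}{right}")
    return codes

block = np.zeros((3, 3, 8), dtype=int)
for ch, ring_positions in {2: [0, 1, 2, 3], 1: [0], 3: [4, 5], 4: [6, 7], 5: [0, 1]}.items():
    for k in ring_positions:
        block[RING[k][0], RING[k][1], ch] = 1
block[1, 1, 2] = block[1, 1, 4] = 1   # channels 2 and 4 are the reference candidates

print(block_codes(block))             # ['2412']
print(8 * len(range(3, 9)) * 4 * 4)   # 768 encoding types
```

The count follows directly from the constraints: 8 reference channels, 6 admissible reference lengths (3 to 8), and 4 capped lengths (0 to 3) for each of the two adjacent channels.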

IV. EXPERIMENT & DISCUSSION
The performance of the MCG-LBP operator is validated by a comprehensive comparison with a large number of state-of-the-art LBP variants and non-LBP operators. The quality of the proposed method is verified by quantitative and qualitative evaluations on leaky cable fixture data sets. In order to obtain the best MCG-LBP descriptor, the extraction efficiency and the classification accuracy of each single-layer feature are also compared. All experiments are implemented in Python 3.7; the CPU of the experimental environment is an Intel(R) i5-4210H, the memory is 8 GB, and the GPU is an NVIDIA GTX 850.

A. EXPERIMENTAL SETUP
1) LEAKY CABLE FIXTURE DATA SET
In this section, the leaky cable fixture data set is introduced in detail. As is shown in Fig. 9 (a), a CMOS dual-line camera is used to take the entire picture during the whole train journey, and the size of obtained picture is 2048×512.
Since the camera has its own light source, which stays on during the entire working time, the influence of illumination on the image is reduced. Even so, the brightness of different backgrounds varies considerably even within the same tunnel section, as in Fig. 9 (b) and (c), which makes the threshold setting quite troublesome. Fig. 10 (a), (b), and (c) show that the shapes of normal cable fixtures are relatively simple, and in particular their contour characteristics are basically the same. While the shapes of faulty cable fixtures shown in Fig. 10

2) THRESHOLD OF BDT
In order to obtain the most suitable threshold for BDT, a total of 50 images whose global average gray levels fall in different intervals are selected. The relationship between the global average gray level and the threshold is then analyzed through the imaging quality of the leaky cable fixture contour. Gaussian filtering is performed on the input image, and the preliminary gradient direction feature maps obtained under different threshold conditions are compared. To make the process as intuitive as possible, the one-dimensional binary array is converted into a decimal gray value. Although the range of the threshold should be as small as possible, it still needs to meet the following two requirements:
1) The preliminary gradient direction feature map must show the outline of the leaky cable fixture completely and clearly.
2) The background noise of the tunnel wall must be filtered to a certain extent.
For the convenience of display, the cable fixture part is cropped from the entire image. It can be seen from Tab. 1 that the larger the threshold range, the better the display in low-brightness images, while more interference appears in high-brightness images. Although a small threshold range can effectively remove background noise, it also weakens the expression of contour features. Therefore, it is necessary to set an adaptive threshold according to the global average gray value, which is defined as follows: where $x$ is the global average gray value of the input image. From Tab. 1, it can be seen that the ratio of the gray-level differences stays basically around 1. For images with high brightness, the imaging quality is generally good across the different threshold ranges, so the threshold range can be smaller, which filters out more noise. For images with low brightness, a smaller threshold range causes the loss of contour features. To ensure that the contour features are displayed completely, the threshold range needs to be larger in this case. The threshold range is therefore set to five different levels, and different average gray levels correspond to different thresholds.

B. EXPERIMENTAL COMPARISON RESULTS
The following two aspects of experimental research are carried out. The first experimental study is to compare between different down-sampling layers to obtain the best fusion descriptor of MCG-LBP, and the other is to compare with LBP variant and non-LBP operators to verify the superiority of MCG-LBP.

1) COMPARISON BETWEEN DIFFERENT DOWN-SAMPLING LAYERS
The down-sampling process iteratively produces several feature maps. Although their size decreases gradually, the process still incurs a certain computational cost compared with directly dividing the feature maps into 4×4 or 2×2 sub-images. Therefore, it is necessary to consider computational efficiency and detection accuracy together to extract the most cost-effective descriptor. In the experiment, a total of four iterations are performed, and continuous gradient features are then extracted from each of the four down-sampling feature maps. Their histogram features are displayed in Fig. 11. It can be seen that the proportions of the dominant features remain basically unchanged, but the other features change considerably. Each subsequent down-sampling map contains fewer continuous gradient features than the previous layer, which is the main reason why only four iterations are performed. In the down-sampling process, the stride of the Cell is 2, and in the continuous gradient feature extraction process, the stride of the Block is 1. Therefore, it can be inferred that the relationship between the time taken by the down-sampling process and by the feature extraction process is as follows:
Due to the different strides of these two processes, the feature extraction process takes more time than the down-sampling process, so it is critical to select appropriate down-sampling maps from which to extract continuous gradient features. Average Recall, average Precision and average Accuracy are used as the evaluation indicators of the classification result. Recall is the proportion of faulty fixtures that are correctly judged as faulty, Precision is the proportion of fixtures judged as faulty that are actually faulty, and Accuracy is the proportion of all fixtures that are judged correctly.
The specific definitions are as follows:

$$Recall = \frac{TP}{TP + FN}, \qquad Precision = \frac{TP}{TP + FP}, \qquad Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

where $TP$ represents the number of faulty fixtures correctly identified as faulty, $TN$ represents the number of normal fixtures identified as normal, $FP$ represents the number of normal fixtures identified as faulty, and $FN$ represents the number of faulty fixtures identified as normal. As we can see from Tab. 2, the time taken to extract each layer's features conforms to (17). The table also shows the performance of extracting continuous gradient features from each single down-sampling map, including the Recall and Precision indicators. Identifying faulty cable fixtures is what matters most in practice, so Recall is the primary factor. Although it takes less time to extract continuous gradient features from later down-sampling maps, the classification performance is reduced to a certain extent.
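For reference, the three indicators can be computed from the four confusion counts as in the short Python sketch below. The function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (recall, precision, accuracy), with faulty fixtures as the positive class."""
    recall = tp / (tp + fn)                      # faulty fixtures that were found
    precision = tp / (tp + fp)                   # predicted-faulty fixtures that are faulty
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # all fixtures judged correctly
    return recall, precision, accuracy

# Hypothetical counts: 100 faulty and 900 normal fixtures.
recall, precision, accuracy = classification_metrics(tp=90, tn=880, fp=20, fn=10)
print(f"{recall:.3f} {precision:.3f} {accuracy:.3f}")  # 0.900 0.818 0.970
```

With a strongly unbalanced data set such as this hypothetical one, Accuracy stays high even when many faulty fixtures are missed, which is one reason Recall is treated as the primary indicator.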
There are a total of six cascading types that combine these four single-layer features in pairs, and the detection results are shown in Tab. 3. Here, 1-2 means concatenating the first layer descriptor and the second layer descriptor, and the other types are named in the same way. There is not only a big gap between the Recall and Precision of these fusion features, but the time taken by the different cascading types also differs greatly, especially for descriptors that include the first layer. In addition, three-level connections such as 1-2-3 require more time, and their feature dimensions also increase. The Recall of 2-3 reaches 92.3%, which is the highest among all descriptors. Finally, considering both classification performance and time cost, the fusion type 2-3 is defined as MCG-LBP, which is used in the comparison with other variant operators.

2) COMPARISON WITH LBP VARIANT AND NON-LBP OPERATORS
The MCG-LBP fusion descriptor is compared with up-to-date methods on the cable fixture data set. Supplementary details on the LBP variant and non-LBP operators are as follows:

1) CLBP_S^{riu2} [43]. The final feature dimension of the output descriptor is 3200 (3200 = 200×4×4). For convenience, CLBP refers specifically to this method, and CLTP is calculated in the same way.
2) For the CS-LBP operator, the input images are divided into 8×8 sub-images, each with a resolution of 256×64. Since the feature dimension of each sub-image is 16, the final feature dimension is 1024 (1024 = 16×8×8).
3) For the HOG operator, each gradient component has 9 bins and each block is 128×32, which closely resembles dividing the image into 16×16. Histogram features are counted over 256 windows on the whole input image, so the dimension is 2304 (2304 = 256×9).
4) The length of LDP for a reference pixel is 32 bits, which is too large. We therefore calculate it only to the second order, and the uniform pattern is used for coding in the derivative space of the different directions. The input images are divided into 4×4, and the feature dimension is 3776 (3776 = 16×4×59).
5) For the LDGP operator, we also calculate only to the second order and divide the input image into 4×4. The feature dimension is 4096 (4096 = 16×4×2^6).
6) The masks of LDTP are the same as in [23]. To avoid an excessive dimension, we divide the image into 2×2, and the dimension is 4096 (4096 = 4×2×2^9).
7) The HOG_LBP, HOG_CS-LBP, and HOG_CLBP fusion features concatenate the descriptors obtained above, so their dimensions add up directly.
8) For the other variant patterns, we perform a non-overlapping 4×4 division, so the resolution of each sub-image is 512×128; the sampling radius is 1, and the number of sampling points is 8.

The normal fixture set and the faulty fixture set are clearly unbalanced in size, so another evaluation index, AUC, needs to be added. It is widely used in two-class classification in machine learning and is more tolerant of sample imbalance. It relies on the false positive rate:

$$\mathrm{FPR} = \frac{FP}{FP + TN}$$

where TN is the number of normal fixtures correctly identified as normal. The ROC curve is drawn with FPR as the abscissa and Recall as the ordinate, and the AUC value is the area enclosed by the ROC curve and the coordinate axis from 0 to 1.

From the comparison results in Tab. 4, it can be seen that the MCG-LBP operator proposed in this paper has significant advantages in classifying faulty cable fixtures. Recall and Precision reach 92.3% and 85.7% respectively. In terms of feature dimension, MCG-LBP is only slightly higher than CS-LBP and far lower than the other LBP variant and non-LBP operators, yet its detection performance is greatly improved. The second down-sampling map is 512×128, and the third down-sampling map is 256×64. If we instead divide the input image 4×4 directly, there are sixteen sub-images of 512×128 each. The two maps together contain 81,920 pixels, whereas the sixteen sub-images contain 1,048,576. Compared with image division, extracting features from the down-sampling maps therefore clearly reduces the computational cost. The main reason is that our dimensionality reduction strategy shrinks the area from which features are extracted through down-sampling iterations, instead of extracting from sub-images one by one, which also shortens the final descriptor effectively. Even the continuous gradient features obtained from a single layer achieve high Recall and Precision, such as 92.3% and 94.1%. The second and third down-sampling maps not only realize the scale transformation of the feature map through iteration, but also contain the key gradient information of the cable fixture contour. Therefore, the MCG-LBP descriptor proposed in this paper has obvious advantages.
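The ROC curve and AUC index used in this comparison can be sketched in a few lines of pure Python. This is an illustration only, under several assumptions: faulty fixtures are labeled 1 and normal fixtures 0, a higher classifier score indicates a faulty fixture, and no two samples share the same score. The function names and the toy scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, Recall) obtained by lowering the threshold
    one sample at a time. Assumes no tied scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, label in ranked if label == 1)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1  # a faulty fixture is now identified as faulty
        else:
            fp += 1  # a normal fixture is now identified as faulty
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels and classifier scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
auc_value = auc(points)
print(f"AUC = {auc_value:.4f}")
```

Because FPR and Recall are each normalized by the size of their own class, the curve does not depend on the ratio of normal to faulty samples, which is what makes AUC suitable for an unbalanced data set.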

V. CONCLUSION
This paper proposes the MCG-LBP operator for leaky cable fixture detection in railway tunnels. To extract more specific gradient information in the circular neighborhood, a bi-directional triplet model is introduced. Then, a dimensionality reduction strategy performs successive down-sampling iterations, which not only realizes the scale transformation of cable fixture features, but also reduces the computation needed to extract continuous gradient features and greatly increases computational efficiency. Finally, to further enhance the single-layer descriptor, the continuous gradient features extracted from different down-sampling maps are cascaded to improve classification performance. MCG-LBP improves both detection performance and feature dimension and, in particular, reduces the dependence on the faulty data set, which makes it well suited to leaky cable fixture detection. The proposed algorithm also provides a reference for the detection of small targets in large images, such as airport and subway security inspection and remote sensing image detection. In future research, combining MCG-LBP with deep learning is a topic worthy of study, especially for realizing a real-time detection system for leaky cable fixtures.