Obstacle Distance Measurement Under Varying Illumination Conditions Based on Monocular Vision Using a Cable Inspection Robot

Obstacle distance measurement is one of the key technologies for autonomous navigation of high-voltage transmission line inspection robots. To improve the robustness of obstacle distance measurement under varying illumination conditions, this article develops a method that fuses image enhancement with robot monocular vision so that the robot can adapt to various illumination levels while running along the transmission line. For inspection of high-voltage transmission lines in overexposed (excessively bright) environments, a specular highlight suppression method is proposed to suppress specular reflections in an image; when scene illumination is insufficient, a robust low-light image enhancement method based on a tone mapping algorithm with weighted guided filtering is presented. Based on the monocular vision measurement principle, the error generation mechanism is analyzed through experiments, and a parameter modification mechanism is introduced. The two proposed image enhancement methods outperform other state-of-the-art enhancement algorithms in qualitative and quantitative analyses. The experimental results show that the measurement error is less than 3% for static distance measurements and less than 5% for dynamic distance measurements within 6 m. The proposed method meets the requirements of high-accuracy positioning, real-time performance and strong robustness, and it contributes to the sustainable development of inspection robots in the power industry.


I. INTRODUCTION
Power infrastructure is an important foundation for people's livelihoods and industrial development. The inspection of high-voltage transmission lines is necessary, routine work for the safe operation of power systems [1]. As the transmission network grows, transmission lines inevitably cross more complex terrain and cover larger areas [2]. Long-term exposure to harsh natural conditions (hail, strong wind and rainstorms) leads to strand breakage, counterweight slippage, line fitting damage and changes in the safe distance. Therefore, there is an urgent demand for inspecting transmission lines regularly to ensure their stable operation [3]. Currently, the inspection of high-voltage transmission lines is mainly divided into three categories: manual inspection [4], unmanned aerial vehicle (UAV) inspection [5] and robot inspection [6]. Manual inspection is laborious, time intensive and inefficient. Some power line segments near rugged mountains and rivers cannot be inspected, as they are difficult for inspectors to access. The UAV inspection payload is limited, and its endurance time is short; UAVs lose control when encountering strong wind. Robot inspection features high safety, low cost and strong load capacity and can adapt to bad weather. On this basis, research on robots for high-voltage transmission line inspection is of great significance to the sustainable development of the power industry and the protection of people's lives and safety.
During the inspection of power transmission lines, counterweights, suspension clamps and insulator strings hinder the robot from running efficiently and stably on the ground wire. Obstacle recognition and localization have therefore become an important development direction for inspection robots. After the robot completes the identification and classification of obstacles, the pan-tilt-zoom (PTZ) camera of the robot is used to locate the obstacles and measure the corresponding distance so that the appropriate obstacle crossing action can be executed. The factors that affect the detection of obstacles are highlighted as follows: 1) In a strong-light environment, the image features of obstacles are damaged to varying degrees. 2) In a weak-light environment, insufficient light prevents the target features from being recognized. Existing research methods do not fully consider these factors, so it is difficult to ensure the accuracy of obstacle positioning when a robot runs in the field. Methods based on monocular vision positioning have been widely used in inspection robots. In reference [7], a method based on a homography matrix and structure constraints was proposed to detect obstacles, and its calculation speed was fast. However, most experiments were carried out in ideal environments in which strong illumination did not interfere with the images. In long-term field inspection, the robot is vulnerable to strong-light and weak-light interference. Especially at noon and in the evening, strong light and weak light, respectively, have a great influence on the image, which leads to a low recognition rate and reduced obstacle positioning accuracy. Therefore, eliminating light interference is of great significance for obstacle localization.
In recent decades, numerous highlight removal methods have been presented. These methods are roughly divided into four categories: multiple-image-based methods, single-image-based methods, learning-based methods and polarization filter methods. For multiple-image-based methods, Shah et al. presented a specular highlight removal method for video frames based on detecting correspondences [8]. Guo et al. proposed a robust method that employs three prior structures of decomposed layers [9]. Although multiple-image-based methods can remove specular highlights well, their practicability is low since the required source images are often unavailable [10]. For polarization filter methods, Nayar et al. used a polarization filter to determine the diffuse reflection component; this method can address surface highlights with rich texture and different material properties [11]. Umeyama and Godin obtained images of constant diffuse reflection components and specular reflection components of different intensities with a polarization filter [12]. However, polarization filters are not suitable for many practical applications [13]. Among learning-based methods, Chen et al. proposed a method for removing highlights from face images, but it is limited to that single application [14]. Funke et al. used a residual convolutional neural network (CNN) to remove highlights from images; however, this method requires considerable training data to improve the robustness and generalizability of the algorithm [15]. For single-image-based methods, Tan and Ikeuchi [16] introduced the concept of specular-free images, which use a single image to remove the highlights of textured surfaces. However, this method is very time consuming and does not meet the requirements of real-time applications [16].
Shen and Zheng calculated the chromaticity distance between unclassified pixels and the center points of all classes, but a threshold must be set to control the classification of pixels, so the classification results depend on the chosen threshold. In recent years, many weak-light enhancement algorithms have been proposed. These methods are mainly divided into three categories: enhancement methods based on the histogram equalization algorithm [17], [18], enhancement methods based on retinex theory [19]–[21] and enhancement methods based on deep learning [22]. It is difficult to adjust the intensity of image enhancement based on the histogram equalization algorithm, which can produce over-enhanced results [23]. Because the retinex algorithm usually uses a Gaussian low-pass filter to estimate the illumination component, the overall contrast of the image is not high and still suffers from halo artifacts [24]. Chen et al. established low-light image datasets and developed a network to learn the enhancement function, but the method performed well only on the constructed datasets [25].
Robot distance measurement methods mainly include lidar ranging [26] and optical ranging [27], [28]. Lidar sensors are expensive and heavy [29]. Optical ranging comprises monocular ranging and binocular ranging. Binocular ranging requires accurate matching, and the matching process is time consuming, so its impact on the real-time performance of visual navigation systems cannot be ignored. Compared with binocular ranging, monocular ranging has the advantages of simple principles, good real-time performance and low cost [30]; it is therefore more practical. Li Cheng et al. studied a monocular ranging algorithm for visual navigation of line inspection robots and ultimately achieved effective obstacle crossing; however, this method did not consider the influence of outdoor illumination on the ranging results [7]. In view of the abovementioned light interference with the visual system, this paper studies the characteristics of the illumination and treats strong and weak light separately. For strong light, a reflection component separation algorithm is applied, with an effective pixel clustering method and a method for estimating the intensity ratio of each cluster, to suppress the specular highlights in the image. For weak illumination, a tone mapping algorithm based on weighted guided filtering is used: global tone mapping preprocesses the image; the weighted guided filter then replaces the Gaussian filter of the single-scale retinex algorithm; finally, the logarithmic domain is converted back to the real domain to obtain the enhanced image. Based on the monocular vision ranging model of the inspection robot, combined with the characteristics of ground line imaging, a parameter modification model is introduced to improve the accuracy and robustness of obstacle positioning. It also improves the adaptability of the robot to the external environment.
In summary, by analyzing transmission line corridors and inspection robot detection methods, a method based on monocular vision is proposed under the condition of light variations. The main innovations are as follows.
(1) Highlight suppression: According to the dichromatic reflection model, we propose a real-time highlight separation algorithm for a single image. First, the chromaticity map of the modified specular-free (MSF) image of the original image is calculated, and the chromaticity values in the chromaticity map are sorted. The data points at the 1/3 and 2/3 positions of the sorted dataset are selected as the initial centers, and the color clusters are split or terminated adaptively according to the sum of squared errors. Finally, an optimization algorithm is applied to estimate the single intensity ratio of each cluster, which is used to separate the diffuse reflection components from the specular pixels.
(2) Low-light image enhancement: To increase the contrast in the image, the image is preprocessed by S-equation transformation. In local mapping processing, to solve the problems of high computational complexity and fuzzy image details of the Gaussian filter, a weighted guided filter (i.e., a filter combining the variance in the local window to adjust the regularization factor adaptively) is used to replace the Gaussian filter. The filter has a good effect on image edge processing, effectively removing the halo phenomenon, and has a fast processing speed.
(3) Monocular vision: Combined with the characteristics of camera models and ground line imaging, a monocular distance estimation algorithm for inspection robots is proposed. The ranging error mechanism is analyzed by static experiments. According to the error generation mechanism, a parameter modification scheme is proposed.
This article is organized as follows. In the ''CIR Vision System'' section, we describe the mechanical structure of the proposed CIR. In the ''Method'' section, the reflection component separation algorithm, tone mapping algorithm based on weighted guided filtering, and obstacle distance measurement based on monocular vision are proposed in detail. The ''Experiments'' section describes several experiments conducted to verify the performance of the proposed method. The ''Conclusions'' section concludes this article.

II. CIR VISION SYSTEM
As shown in Figure 1, the structure of the inspection robot is an antisymmetric suspension type with two arms [31]. The vision system of the inspection robot is composed of two network PTZ cameras, one on each side of robot box I. The lens of one network PTZ camera can be rotated to any angle to detect damage to transmission lines, broken strands of ground wires, loose strands, fitting damage, tower deformation, etc.
The lens of network PTZ camera II is fixed at an upward angle so that its field of view covers the length of wire used for positioning, i.e., for measuring the distance between obstacles on the ground line and the robot.
When the robot moves on the linear rail, the walking wheel motor is servo driven throughout; that is, the walking wheel motor controls the rotation of the walking wheel through the driver during the whole inspection process to keep the robot's speed controllable. To achieve motion control, the robot must drive the multi-joint motion mechanism to perform the corresponding actions. A PC104 industrial computer can effectively control multiple motors at the same time, such as the start, stop and rotation direction of the lifting joint motor and the inward and outward rotation of the swing arm joint. Because the lens angle of camera II is fixed, the angle and intensity of direct illumination can damage image features in strong-light areas. When the robot inspects in the evening, the lack of light hinders the recognition of image features. Therefore, to facilitate subsequent image recognition, it is necessary to compensate for both strong and weak illumination, improve the image contrast, and enhance the image details. Monocular ranging requires known parameters determined by the actual application scenario. In the monocular vision ranging model, the known parameters include the focal length, the vertical distance between the camera optical center and the ground wire, and the distance between the camera lens and the nearest imageable position on the ground wire.

III. METHOD

A. MONOCULAR VISION SYSTEM FLOW
The designed monocular vision system consists of five modules dedicated to different tasks: image acquisition, image classification, highlight suppression, low-light image enhancement, and obstacle distance measurement. This system can be described by the flow diagram shown in Figure 2. When the average brightness of the image falls within [0, 85), it is defined as a low-light image; within [85, 170), a medium image; and within [170, 255], a highlight image.
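As a concrete reference, the brightness-based image classification above can be sketched as follows (a minimal sketch; assigning the boundary values 85 and 170 to the upper band is an implementation choice, since the source brackets overlap at the boundaries):

```python
import numpy as np

def classify_brightness(image: np.ndarray) -> str:
    """Classify an 8-bit grayscale image by its mean brightness.

    Bands follow the paper: low-light below 85, medium below 170,
    highlight otherwise. Boundary values go to the upper band here
    (an assumption; the source intervals overlap at 85 and 170).
    """
    mean_brightness = float(np.mean(image))
    if mean_brightness < 85:
        return "low-light"
    if mean_brightness < 170:
        return "medium"
    return "highlight"
```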

B. REFLECTION COMPONENT SEPARATION ALGORITHM
We summarize the procedure of the specular highlight suppression algorithm as follows.

1) INTENSITY RATIO ESTIMATION
First, the MSF image is calculated, and then the minimum and maximum diffuse reflectances of all pixels are calculated according to the MSF image. The classification of the relevant pixels in the highlight area is performed in the maximum and minimum chroma space. The intensity ratio is then estimated to separate the diffuse and specular reflection components according to the classification results. According to the dichromatic reflection model [32], each highlight pixel is a linear superposition of diffuse and specular components. The expression is as follows:

I(x) = m_d(x)Λ + m_s(x)Γ   (1)

where I(x) is the intensity value of the image pixel; x = (x, y) is the pixel coordinate of the image; m_d is the diffuse reflection weighting coefficient; m_s is the specular weighting coefficient; Λ = [Λ_r, Λ_g, Λ_b]^T is the diffuse chromaticity; and Γ = [Γ_r, Γ_g, Γ_b]^T is the specular chromaticity. We assume that the light source has been corrected to white and normalized for the input image, so that Γ = [Γ, Γ, Γ]^T with Γ = 1/3. Surfaces of nonuniform materials with the same color can be divided into pixels with only diffuse reflection components and pixels with both diffuse and specular components. On this basis, Shen et al. proposed the concept of the intensity ratio, which uses the ratio of the maximum intensity value to the intensity range value (the maximum intensity value minus the minimum intensity value) of a diffuse pixel to suppress the specular highlight in the image [13]. For a nonuniform surface, the pixels are clustered in the minimum and maximum chromaticity space according to the intensity ratio to suppress the highlight in the image. We use the concept of the intensity ratio to separate the diffuse and specular components. We obtain the minimum and maximum intensities per pixel by taking the minimum and maximum values over the single-channel images:

I_min(x) = min{I_r(x), I_g(x), I_b(x)}   (2)
I_max(x) = max{I_r(x), I_g(x), I_b(x)}   (3)

where Λ_min = min{Λ_r, Λ_g, Λ_b} and Λ_max = max{Λ_r, Λ_g, Λ_b}. For a pixel with pure diffuse reflection, I_min(x) = m_d(x)Λ_min and I_max(x) = m_d(x)Λ_max.
Based on equations (2) and (3), the range intensity is

I_range(x) = I_max(x) − I_min(x) = m_d(x)(Λ_max − Λ_min)   (4)

The intensity ratio is

I_ratio(x) = I_max(x)/I_range(x) = Λ_max/(Λ_max − Λ_min)   (5)

For a pixel with both diffuse and specular reflections, the intensity ratio is

I_ratio(x) = (m_d(x)Λ_max + m_s(x)Γ)/(m_d(x)(Λ_max − Λ_min))   (6)

After clustering pixels with almost the same diffuse chromaticity, we must select the intensity ratio for each cluster. Shen et al. arranged the luminance ratios of all pixels in a region in ascending order and then selected the luminance ratio at an appropriate position as the luminance ratio of the diffuse reflection pixels in the region. This approach must sort the whole image and keep the original position of each pixel after sorting, which requires extra calculation time. Therefore, we propose an alternative method to estimate the luminance ratio of diffuse reflection pixels without sorting each cluster. The pseudocode of our algorithm is given in Algorithm 1.
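The per-pixel minimum, maximum, range and intensity ratio defined above can be computed directly; the sketch below assumes an H×W×3 RGB array and adds a small epsilon to guard against division by zero on achromatic pixels (an implementation detail, not part of the derivation):

```python
import numpy as np

def intensity_ratio(image: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Per-pixel ratio I_max / (I_max - I_min) over the R, G, B channels.

    For a purely diffuse pixel this ratio depends only on chromaticity,
    which is what allows pixels of the same diffuse colour to share one
    ratio per cluster. eps avoids division by zero on achromatic pixels.
    """
    img = image.astype(np.float64)
    i_max = img.max(axis=2)   # I_max(x)
    i_min = img.min(axis=2)   # I_min(x)
    return i_max / (i_max - i_min + eps)
```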

2) PIXEL CLUSTERING FOR COLOR IMAGE SEGMENTATION
For color images, we use color clustering to divide the image into different color regions and then separate the highlight components in each color region according to the algorithm proposed in the previous section. Because the existence of highlights seriously affects the real color of the image, it is necessary to eliminate the influence of highlights on the image color before clustering the highlight image. Shen proposed the concept of MSF images based on specular-free images [33]. The calculation process of the MSF image is simpler and more robust in the presence of noise. Therefore, the MSF chromaticity map is used to cluster the original image.

Algorithm 1 Intensity Ratio Estimation
Input: I_ratio−origin: an image with unselected intensity ratios; n: maximum number of iterations; t: a ratio threshold
for each cluster c of the image do
    p ← average I_ratio(x) over the pixels in c;
    s ← number of pixels in c;
    while iteration times < n do
        d ← number of pixels in c whose I_ratio(x) ≤ p;
        adjust p according to the deviation of d/s from the target ratio t;
    end while
    take p as the single intensity ratio of cluster c;
end for
For a color image, the expression of the MSF image is as follows:

I_MSF(x) = I(x) − I_min(x) + Ī_min   (7)

where Ī_min is the mean of I_min over the image:

Ī_min = (1/N) Σ_x I_min(x)   (8)

where N is the number of pixels in the image. The chromaticity map of the original image is

σ_MSF−c(x) = I_MSF−c(x) / (I_MSF−r(x) + I_MSF−g(x) + I_MSF−b(x)),  c ∈ {r, g, b}   (9)

where σ_MSF(x) = [σ_MSF−r, σ_MSF−g, σ_MSF−b]^T. The MSF chromaticity map retains the original color features of the object, so it can be used to cluster the image colors. The K-means clustering algorithm is one of the most commonly used clustering algorithms. However, it also has some defects: (1) the selection of the initial clustering centers directly affects the final clustering result, which may lead to a local optimal solution and clustering failure; (2) the number of clusters cannot be determined in advance and can only be roughly estimated from previous experience, which generally does not yield the best clustering effect. In view of these shortcomings, we propose a new pixel clustering method. By sorting the chromaticity values in the dataset, the data points at the 1/3 and 2/3 positions of the sorted dataset are selected as the initial centers, and clusters are split adaptively. The selection of the initial centers is not random, which effectively avoids selecting outliers as initial centers and reduces the number of iterations in the clustering process. The clustering method used in this paper dynamically determines the number of clusters during clustering and decides whether to split or terminate a cluster according to the sum of squared errors. The procedure of our algorithm is given in Algorithm 2.
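A minimal sketch of the MSF construction and the 1/3–2/3 initial-center selection described above, assuming Shen's formulation I_MSF = I − I_min + mean(I_min); the function names are illustrative:

```python
import numpy as np

def msf_image(image: np.ndarray) -> np.ndarray:
    """Modified specular-free (MSF) image: subtract the per-pixel minimum
    channel and add back its global mean (Shen's formulation)."""
    img = image.astype(np.float64)
    i_min = img.min(axis=2, keepdims=True)
    return img - i_min + i_min.mean()

def initial_centers(chroma: np.ndarray) -> tuple:
    """Pick the chroma values at the 1/3 and 2/3 positions of the sorted
    dataset as the two initial cluster centers, avoiding outlier seeds."""
    flat = np.sort(chroma.ravel())
    n = flat.size
    return flat[n // 3], flat[2 * n // 3]
```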

C. TONE MAPPING ALGORITHM BASED ON WEIGHTED GUIDED FILTERING
We summarize the procedure of the low-light enhancement algorithm in Figure 4.
According to the definition of retinex theory proposed by Land [34], the image observed by the human eye can be expressed as the product of the reflection component and the illuminance component; only the reflection component represents the true attributes of the object and is independent of the illuminance component. Jobson et al. proposed single-scale retinex (SSR) [19] and multiscale retinex (MSR) [20] based on the center/surround retina theory. The definition of SSR is:

R_i(x, y) = log I_i(x, y) − log[G(x, y) * I_i(x, y)]   (10)

where i ∈ {r, g, b}, representing the R, G and B color channels; R_i(x, y) is the pixel value of the reflection image in the ith color channel; I_i(x, y) is the pixel value of the original image I at the ith color channel (x, y); * represents the Gaussian convolution operation; and G(x, y) is the Gaussian surround function G(x, y) = K exp(−(x² + y²)/(2σ²)), where K is the normalization constant chosen so that ∬ G(x, y) dx dy = 1, and σ represents the scale parameter of G(x, y). The value of σ can significantly affect the image enhancement results. When the SSR algorithm enhances the image, it cannot achieve a good balance between the local detail information and the color fidelity of the image, and the dynamic range compression of the image cannot achieve good results. In later research, Jobson et al. developed the MSR algorithm. MSR is defined as:

R_i(x, y) = Σ_{n=1}^{N} w_n { log I_i(x, y) − log[G_n(x, y) * I_i(x, y)] }   (11)

where i ∈ {r, g, b}, representing the R, G and B color channels; R_i(x, y) is the pixel value of the reflection image R at the ith color channel (x, y); N is the number of scales; w_n is the weight of the nth scale; I_i(x, y) is the pixel value of the original image I at the ith color channel (x, y); and G_n(x, y) is the Gaussian surround function at the nth scale. Multiscale retinex is much better than SSR at color retention and detail highlighting, but it is much more computationally complex and prone to halo effects. The guided filter has advantages in image detail enhancement and can effectively reduce artifacts. However, because the guided filter applies the same fixed regularization factor in every window, edge regions and flat regions are smoothed to the same degree, which can blur details near strong edges.

Algorithm 2 Pixel Clustering
Input: A dataset: S = {x_1, x_2, ..., x_n}; number of initial cluster centers: K_0 = 2; error threshold: ϑ
1) The chroma values in the dataset are sorted from smallest to largest, and the data points at positions ⌊n/3⌋ and ⌊2n/3⌋ of the sorted sequence S_sort are selected as the initial cluster centers c_1 and c_2, where S_sort is the result of all pixels' chroma values arranged in ascending order, ⌊·⌋ represents rounding down, and n is the number of data points in the dataset.
2) For each data point, calculate its distance to each cluster center, find the minimum distance and the corresponding cluster, and assign the point to that cluster.
3) Calculate the sum of squared distances between each data point and its cluster center over all clusters of dataset S. If this sum is less than ϑ, the clustering ends; otherwise, continue.
4) Select the largest cluster S_max, where |S_ci| is the number of data points in the cluster with center c_i, S_max is the largest subset of dataset S, and its cluster center is c_max. Find two points x_p and x_q in subset S_max.
5) Remove the cluster center c_max of the S_max subset, and merge x_p and x_q into the set of cluster centers; that is, let c_new = c_k − {c_max}, c_new = c_new ∪ {x_p, x_q}, where c_k is the cluster center set of this round of clustering.
6) Take the cluster centers in c_new as the initial centers of a new round of clustering. The K-means clustering algorithm is used to divide dataset S to obtain the new cluster center set c_new; turn to 3).
7) The current clustering result is the final clustering result.

The global mapping algorithm uses the same mapping function to map all pixels in the image. The algorithm is simple and fast and has low computational complexity. The compression curve can be set in advance or according to the image content. We use the tone mapping algorithm based on the S-equation [35]:

L_out = L_wa (1 + L_wa / L²_wa_max) / (1 + L_wa)   (12)

where L_out is the brightness image after compression; L_wa_max = L_wmax / L_avg; L_wmax is the maximum brightness value of the original image; L_wa = L_w / L_avg; and L_w is the brightness value of the original image. Converting the RGB color space of L_out to the HSV color model yields the value channel

L_out−V(x, y) = max{L_out−R(x, y), L_out−G(x, y), L_out−B(x, y)}   (13)

where L_out−R(x, y), L_out−G(x, y) and L_out−B(x, y) are the gray values of image L_out at the R, G and B color channels at (x, y), respectively, and L_out−V(x, y) is the gray value of image L_out at the V channel. In the subsequent local mapping, L_out−V serves as the input image P(x, y) of the guided filter, I_V(x, y) is the gray value of the guidance image I_V at the V channel (x, y), and the regularization parameter is introduced below.
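The single-scale retinex operation defined in this section (the log of the image minus the log of its Gaussian-surround blur) can be sketched per channel as follows; the separable 1-D convolution and the +1 offset before the logarithm are implementation conveniences, not part of the paper's method:

```python
import numpy as np

def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Normalized 1-D Gaussian kernel of total width 2*radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def single_scale_retinex(channel: np.ndarray, sigma: float = 3.0) -> np.ndarray:
    """SSR on one channel: log(I) - log(G * I), with the 2-D Gaussian
    surround applied as a separable 1-D convolution along each axis."""
    img = channel.astype(np.float64) + 1.0  # avoid log(0)
    k = gaussian_kernel_1d(sigma, radius=int(3 * sigma))
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return np.log(img) - np.log(blurred)
```

On a uniform image the interior response is zero, since blurring a constant leaves it unchanged; only edges and texture survive, which is exactly the illumination-removal behaviour SSR is meant to have.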
After preprocessing by the global tone mapping algorithm, the dynamic range of the image is compressed, and then local tone mapping based on the retinex algorithm is used to maintain high local contrast and detail information. To eliminate the halo phenomenon, this paper adopts the local edge-preserving filter and uses the weighted guided filter to replace the Gaussian filter of the retinex algorithm. Li et al. added edge weights to guided filtering to form weighted guided filtering; that is, the regularization factor is adaptively adjusted in combination with the variance in the local window [36]. The edge weight is expressed as:

Γ(i) = (1/M) Σ_{j=1}^{M} (σ²_{I,1}(i) + χ) / (σ²_{I,1}(j) + χ)   (14)

where M is the total number of pixels in guided image I; σ²_{I,1}(i) and σ²_{I,1}(j) are the variances of I in the 3 × 3 neighborhoods centered on pixel i and pixel j, respectively; and χ is a small constant. To preserve the details of the image and minimize the difference between the input image and the output image, filtering is transformed into an optimization problem; that is, the following cost is minimized in each window w_k:

E(a_k, b_k) = Σ_{i∈w_k} [ (a_k I_i + b_k − P_i)² + (ε/Γ(k)) a_k² ]   (15)

where I_i is the value of the guided image at pixel i; a_k and b_k are the coefficients of the linear function when the window center is k; P_i is the value of the input image at pixel i; and ε is the regularization factor.
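The edge-aware weight can be computed from local 3 × 3 variances; a sketch under the formulation above (the `box3` helper and the default χ value are illustrative choices, and the averaged ratio is regrouped algebraically into a product):

```python
import numpy as np

def box3(a: np.ndarray) -> np.ndarray:
    """3x3 mean filter with edge-replicated padding (illustrative helper)."""
    p = np.pad(a.astype(np.float64), 1, mode="edge")
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def edge_weight(I: np.ndarray, chi: float = 1e-4) -> np.ndarray:
    """Edge-aware weight: (1/M) * sum_j (var(i)+chi)/(var(j)+chi), computed
    as (var(i)+chi) * mean_j[1/(var(j)+chi)] with 3x3 local variances."""
    m = box3(I)
    var = box3(I * I) - m * m
    return (var + chi) * np.mean(1.0 / (var + chi))
```

On a flat image every local variance vanishes, so the weight is 1 everywhere; pixels on strong edges receive weights above 1, which shrinks the effective regularization ε/Γ and preserves those edges.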
Solving this least-squares problem in each window gives

a_k = [ (1/m) Σ_{i∈w_k} I_i P_i − μ_k P̄_k ] / ( σ²_k + ε/Γ(k) ),   b_k = P̄_k − a_k μ_k   (16)

where μ_k and σ²_k are the mean and variance of I in window w_k, and P̄_k is the mean of P in w_k. Averaging the coefficients of all windows covering pixel i yields the output:

q_i = (1/m) Σ_{k∈w_i} (a_k I_i + b_k)   (17)

where m is the number of pixels in window w_k; q_i is the output image at pixel i processed by weighted guided filtering; and I_i is the value of the guided image at pixel i. The weighted guided filter processes images quickly while retaining image edge information. The local adaptation equation can be expressed as follows:

L_R(x, y) = log L_out−V(x, y) − log q(x, y)   (18)

where L_R(x, y) is the locally adaptive output, and q(x, y) is the output of the weighted guided filter. Finally, the enhanced image is obtained by converting the logarithmic domain back to the real domain.
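For reference, the classic guided filter of He et al., on which the weighted variant builds, can be sketched with box means; here the regularization eps is fixed rather than divided by the edge weight per window, a simplification of the weighted version described above:

```python
import numpy as np

def box_mean(a: np.ndarray, r: int) -> np.ndarray:
    """Mean over a (2r+1)^2 window via an integral image,
    with edge-replicated padding (helper, not from the paper)."""
    p = np.pad(a.astype(np.float64), r + 1, mode="edge")
    s = p.cumsum(axis=0).cumsum(axis=1)
    n = 2 * r + 1
    h, w = a.shape
    return (s[n:n + h, n:n + w] - s[:h, n:n + w]
            - s[n:n + h, :w] + s[:h, :w]) / (n * n)

def guided_filter(I: np.ndarray, p: np.ndarray,
                  r: int = 2, eps: float = 1e-3) -> np.ndarray:
    """He et al.'s guided filter: q = mean(a)*I + mean(b), where a and b
    solve the per-window ridge regression of p against guide I."""
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    var_I = box_mean(I * I, r) - mean_I * mean_I
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)
```

Self-guided filtering (p = I) smooths flat regions while keeping step edges, which is the property exploited when it replaces the Gaussian surround of the retinex algorithm.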

D. OBSTACLE DISTANCE MEASUREMENT BASED ON MONOCULAR VISION
A network PTZ camera is installed on each side of the control box. When the robot is walking on the wire, it must identify obstacles before ranging. Here, a visual method is used to identify obstacles on the line; because of space constraints, the image recognition process is not discussed in this article. In previous work, our research team addressed image recognition, and many years of research and field experiments provide an effective guarantee for obstacle recognition. After an obstacle is identified, the distance along the wire between it and the robot can be determined. The monocular ranging model of the robot is shown in Figure 5. Endpoint C of a counterweight is closest to the PTZ camera, and its projection point on the imaging plane is C'. The optical center of the PTZ camera is O_0, the vertical distance between the optical center and the transmission line is |O_0A| = L_0, and the optical axis OO_0 is perpendicular to the image plane. The focal length is f = |OO_0|. The distance between the camera lens and the nearest imageable position on the ground wire is |O_0B| in Figure 5; nearer points on the wire are outside the field of view, so position B is considered the near-critical point. The final measured quantity is the distance between the obstacle at C and point A in Figure 5; that is, |AC| = S, the distance along the transmission line between the obstacle and the PTZ camera lens. Under ideal conditions, the transformation relationship between the image coordinate system o-xy and the pixel coordinate system o-uv is as follows:

u = x/d_x + u_0,   v = y/d_y + v_0   (19)

where u_0, v_0, d_x, d_y and f are all internal parameters of the PTZ camera, which can be obtained by calibration; (u_0, v_0) represents the pixel coordinates of the image center; and d_x and d_y represent the physical dimensions of each pixel in the x-axis and y-axis directions, respectively.
From equation (19), the image coordinates of the projection point C' can be recovered from its pixel coordinates as x = (u − u_0)d_x and y = (v − v_0)d_y (20). As shown in Figure 5, the triangle formed by the optical center O_0, the near-critical point B and the obstacle point C is related to its projection on the image plane by similar triangles. Combining (20) with the geometric model of camera imaging, i.e., the focal length f, the vertical distance L_0 and the near-critical distance |O_0B|, the transformation relationship between the image position of C' and the position of C along the wire can be derived (21), and the calculation formula of the distance |AC| = S between the obstacle and the robot is obtained.
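A toy sketch of the pixel-to-image conversion of equation (19) and a similar-triangles range estimate; the `distance_along_wire` helper is an illustrative simplification (optical axis assumed parallel to the wire), not the paper's full geometric model with the near-critical point B:

```python
def pixel_to_image(u: float, v: float, u0: float, v0: float,
                   dx: float, dy: float) -> tuple:
    """Invert equation (19): pixel (u, v) -> image-plane (x, y),
    with x = (u - u0) * dx and y = (v - v0) * dy."""
    return (u - u0) * dx, (v - v0) * dy

def distance_along_wire(y: float, f: float, L0: float) -> float:
    """Illustrative similar-triangles range: with the optical axis parallel
    to the wire at vertical offset L0, a wire point at distance D images
    at height y = f * L0 / D, so D = f * L0 / y. All units in meters."""
    return f * L0 / y
```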

IV. EXPERIMENT
To verify the effectiveness of the image enhancement and monocular ranging algorithms, images of the simulated experimental site on the roof of a college building were selected as the test data. Image quality is usually evaluated both subjectively and objectively. Subjective evaluation uses personal intuition to judge image quality; it is simple and fast, but the results vary greatly between observers, and when the enhanced results differ only slightly, subjective evaluation struggles to detect these subtle differences. Subjective evaluation must therefore be combined with objective evaluation to fully analyze the quality of the enhanced images. Objective evaluation uses mathematical models based on the human visual system to score image quality through evaluation indicators. Common indicators include the information entropy (IE), peak signal-to-noise ratio (PSNR), mean value (MV), standard deviation (SD) and average gradient (AG). The calculation formulas are as follows.
(1) The MV (μ): The MV reflects the overall brightness of the image; the larger the MV, the brighter the image overall, and vice versa. The expression is:

μ = (1/(M × N)) Σ_{i=1}^{M} Σ_{j=1}^{N} I(i, j)

where M × N represents the size of the image, and I(i, j) represents the gray value of the pixel in the ith row and jth column.
(2) The SD (σ): The SD represents the variation in gray values, that is, the change in the edge texture of the image, which mainly conveys the detailed information of the image. The higher the SD, the richer the texture details and the greater the contrast, which is more conducive to human observation; conversely, the smaller the SD, the less obvious the details. The expression is:

σ = sqrt( (1/(M × N)) Σ_{i=1}^{M} Σ_{j=1}^{N} (I(i, j) − τ)² )

where M × N represents the size of the image, I(i, j) represents the gray value of the pixel in the ith row and jth column, and τ represents the mean value.
(3) The IE (E): The IE reflects the richness of the information available in the image. The larger the IE value, the richer the information contained in the image; the expression is as follows:

$$E = -\sum_{e=0}^{255} P_{e}\log_{2} P_{e}$$

where P_e represents the proportion of pixels with gray value e in the image.
(4) AG: The AG reflects the clarity of the image and of the texture and edges of its fine details. The expression is as follows:

$$AG = \frac{1}{M \times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\sqrt{\frac{(\partial f/\partial x)^{2} + (\partial f/\partial y)^{2}}{2}}$$

where M × N represents the size of the image, ∂f/∂x represents the gradient in the horizontal direction, and ∂f/∂y represents the gradient in the vertical direction.
(5) Spatial frequency (SF): The SF reflects the overall activity of the spatial domain of an image. The formula is

$$SF = \sqrt{RF^{2} + CF^{2}}$$

where RF is the spatial row frequency and CF is the spatial column frequency, expressed as follows:

$$RF = \sqrt{\frac{1}{M \times N}\sum_{i=1}^{M}\sum_{j=2}^{N}\left[F(i,j) - F(i,j-1)\right]^{2}}$$

$$CF = \sqrt{\frac{1}{M \times N}\sum_{i=2}^{M}\sum_{j=1}^{N}\left[F(i,j) - F(i-1,j)\right]^{2}}$$

where F(i, j) represents the pixel gray value of the ith row and jth column of the image.
(6) PSNR: The PSNR is the most important objective evaluation index for images. The larger the PSNR is, the better the anti-noise performance of the image enhancement algorithm:

$$PSNR = 10\log_{10}\frac{(2^{n}-1)^{2}}{MSE},\qquad MSE = \frac{1}{M \times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[X(i,j) - Y(i,j)\right]^{2}$$
Here, MSE represents the mean squared error between the current image X and the reference image Y; M × N represents the size of the image; n is the number of bits per pixel, generally taken as 8, so the number of pixel gray levels is 256. The unit of the PSNR is dB, and the larger the value, the smaller the noise in the image.
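To make the six indexes concrete, the following sketch computes them with NumPy. It assumes the standard definitions given above (for the paper's elided formulas); the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def image_metrics(img, ref=None, n_bits=8):
    """Compute the evaluation indexes MV, SD, IE, AG, SF and (optionally) PSNR.

    img, ref: 2-D grayscale arrays; ref is only needed for the PSNR.
    """
    I = img.astype(np.float64)
    mv = I.mean()                      # mean value (overall brightness)
    sd = I.std()                       # standard deviation (contrast)

    # Information entropy over the gray-level histogram
    hist, _ = np.histogram(img, bins=2 ** n_bits, range=(0, 2 ** n_bits))
    p = hist / hist.sum()
    p = p[p > 0]
    ie = -np.sum(p * np.log2(p))

    # Average gradient from horizontal/vertical finite differences
    gx = np.diff(I, axis=1)[:-1, :]    # crop to a common shape
    gy = np.diff(I, axis=0)[:, :-1]
    ag = np.sum(np.sqrt((gx ** 2 + gy ** 2) / 2)) / I.size

    # Spatial frequency: row frequency RF and column frequency CF
    rf = np.sqrt(np.sum(np.diff(I, axis=1) ** 2) / I.size)
    cf = np.sqrt(np.sum(np.diff(I, axis=0) ** 2) / I.size)
    sf = np.sqrt(rf ** 2 + cf ** 2)

    psnr = None
    if ref is not None:
        mse = np.mean((I - ref.astype(np.float64)) ** 2)
        peak = 2 ** n_bits - 1
        psnr = 10 * np.log10(peak ** 2 / mse) if mse > 0 else float("inf")

    return {"MV": mv, "SD": sd, "IE": ie, "AG": ag, "SF": sf, "PSNR": psnr}
```

A uniform image gives SD = IE = AG = SF = 0, and an image identical to its reference gives an infinite PSNR, which is a quick sanity check on the implementation.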

A. PARAMETERS FOR THE SPECULAR HIGHLIGHT SUPPRESSION METHOD
In the experiment, highlight images of three simulated circuits are collected. There are two key parameters in the highlight suppression method: the proportional threshold t and the error threshold ϑ. From the above analysis, the settings of these two parameters have a certain impact on the results, so we focus on analyzing the proportional threshold and the error threshold to select the best algorithm parameters. In the experiment, we tested the images under different thresholds. Figure 6 shows the distribution of the PSNR of the three images under different proportional thresholds t and error thresholds ϑ.
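The threshold sweep behind Figure 6 can be sketched as a plain grid search. Here `suppress` and `metric` are placeholders for the paper's specular highlight suppression routine and a quality index such as the PSNR; the grid bounds are illustrative assumptions.

```python
import itertools
import numpy as np

def tune_thresholds(images, refs, suppress, metric):
    """Grid-search the proportional threshold t and the error threshold theta.

    suppress(img, t, theta): stand-in for the highlight suppression routine.
    metric(result, ref):     stand-in for a quality score (e.g. PSNR).
    Returns the (t, theta) pair with the best average score over all images.
    """
    t_grid = np.arange(0.40, 0.71, 0.05)       # proportional threshold t
    theta_grid = np.arange(0.10, 0.51, 0.05)   # error threshold theta
    best, best_score = None, -np.inf
    for t, theta in itertools.product(t_grid, theta_grid):
        score = np.mean([metric(suppress(im, t, theta), ref)
                         for im, ref in zip(images, refs)])
        if score > best_score:
            best = (round(float(t), 2), round(float(theta), 2))
            best_score = score
    return best, best_score
```

Averaging the score over several test images, as done here, is what yields a single robust parameter pair rather than per-image optima.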
The results in Figure 6 show that, to obtain the best highlight separation result, different images require different threshold parameters. In practice, however, it is not realistic to set different parameters for each image to obtain the optimal result, so we find a set of robust threshold parameters: t = 0.55 and ϑ = 0.3. To demonstrate the superiority of the highlight elimination method in this paper, it is compared with other algorithms, and the experimental results are evaluated from both subjective and objective aspects. The comparison with other typical algorithms and the latest specular highlight suppression algorithm is shown in Figure 7.
As shown in Figure 7 and Table 1, the counterweight is in a strong-light environment, and the upper part of the counterweight has strong sunlight. The MSR algorithm increases the average gray value of the image by 32.16%, making the  strong-light image brighter. From the perspective of visual effects, the HE and AK [40] algorithms do not separate the sun highlights in the image. The illumination of the image processed by the proposed method is uniform, and the specular reflection component of the image is effectively separated.
As shown in Figure 8 and Table 1, the tower head and obstacle group are in a strong backlit environment, and the sunlight in the lower part is very strong and occupies part of the image area. After HE processing, the strong-light-occupied part of the image becomes larger. After MSR processing, the image becomes white and over-enhanced. The Shen et al. and Yang et al. algorithms are less effective on the obstacle images, and there is no obvious change in the image after AK processing. After processing by the proposed method, the specular highlight component is separated, and the image becomes clear. Figure 9 shows the highlight image of a double-loop simulated experimental circuit. The SD of the HE method is 37.69% higher than that of the original image, and the contrast of the image is stretched. However, it can be seen from the image that HE mainly stretches the contrast of the white clouds in the background, while the foreground of the dark area becomes darker and its contrast decreases. The MSR algorithm improves the brightness of the highlight image by 22.58%. The SDs of the Shen and Zheng [13] and Yang et al. [37] algorithms are 22.72% and 19.7% higher than that of the original image, respectively, while the SD of the proposed algorithm is 25.38% higher, and the contrast of the whole image is greatly improved. The above comparison is made at the level of subjective perception, and the evaluation results are easily affected by the external environment and psychological factors. For the same image, different people will come to different conclusions, so it is difficult to develop a unified standard for subjective evaluation, which is unstable and difficult to apply on a large scale. Given these shortcomings of subjective evaluation in practical applications, the objective evaluation method relies on only the image itself to establish a mathematical model, so that image quality can be compared objectively with specific values.
The objective evaluation of images mainly includes the IE, average brightness, SD, AG, SF and PSNR. The objective evaluation method has the advantages of simple calculation, fast operation speed and objective stability; therefore, it is often used as the gold standard to evaluate image quality. Table 1 lists the objective evaluation standards and gives the evaluation indexes of each image.
According to the test data in Table 1, data analysis is carried out; the results are shown in Figures 10-12.
The data in Table 1 show that the HE, AK and proposed algorithms suppress the brightness of the highlight image, and the MSR algorithm excessively increases the brightness of the image.
The SD value of the HE algorithm is the largest because the histogram after HE enhancement is not flat, and the gray level is reduced. The SD value of the proposed method is second only to that of the HE algorithm, and the experimental results show that the method enhances the image contrast and edge texture changes.
The proposed algorithm reaches the maximum IE value, indicating that the image enhanced by this method contains rich information and high definition. The AK algorithm also reaches a higher IE value because it estimates the image diffuse reflection well.
The AG value obtained by the HE algorithm is the highest because the image processed by the HE algorithm increases the contrast of the background interference information, so the HE algorithm is not considered. The proposed algorithm achieves the second highest AG value, which proves that the image processed by this method has richer levels.
For the image SF, the HE algorithm achieves the highest SF value because it excessively enhances the local contrast. The SF value of the proposed method is second only to that of the HE algorithm, which proves that the image enhanced by this method has better clarity.
For the image PSNR, the value of the proposed method is the highest, and the method has the best image noise suppression effect. The PSNR of the AK algorithm ranks after the proposed method, with a strong denoising ability and good robustness. The Shen et al. and Yang et al. algorithms increase the definition of the image while reducing the sense of hierarchy.

B. PARAMETERS FOR THE WEAK-LIGHT ENHANCEMENT METHOD
Lambda is a key parameter in the weak-light enhancement algorithm. From the above analysis, the setting of this parameter has a certain impact on the results of exposure enhancement, so we focus on analyzing lambda to select the best algorithm parameter. We tested different values in the experiment; the results of the image exposure enhancement under different lambdas are shown in Figure 13.
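The lambda sweep of Figure 13 can likewise be sketched as a one-dimensional parameter search. Here `enhance` stands in for the paper's tone-mapping weak-light enhancement and `score` for a no-reference quality index (e.g. the IE); both names and the grid are illustrative assumptions.

```python
import numpy as np

def pick_lambda(image, enhance, score,
                grid=(0.05, 0.08, 0.1, 0.2, 0.3, 0.5, 0.8, 1.0, 1.2)):
    """Sweep the enhancement weight lambda and return the best value.

    enhance(image, lam): stand-in for the weak-light enhancement routine.
    score(result):       stand-in for a no-reference quality index.
    """
    scores = {lam: score(enhance(image, lam)) for lam in grid}
    best = max(scores, key=scores.get)   # lambda with the highest score
    return best, scores
```

Inspecting the full `scores` dictionary (rather than only the maximum) is what reveals the plateau around lambda = 0.08-0.1 reported below.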
The results in Figure 13 show that when lambda = 0.2, 0.3, 0.5, 0.8, 1.0 and 1.2, the degree of weak-light enhancement is not as good as when lambda = 0.1. When lambda = 0.08 and 0.1, the weak-light-enhancement effect is very good and nearly identical, so we set lambda = 0.1. To prove the superiority of the weak-light-enhancement algorithm, we compare it with other algorithms, and the experimental results are evaluated from both subjective and objective aspects. The results of the selected images are displayed in Figure 14.
As shown in Figure 14, Figure 15 and Table 2, the MSR, AFBE [38], BIMEF [39], and LIME [41] algorithms improve the overall brightness of weak-light images to the level of the strong-light images. The PSNR of the HE algorithm is the lowest among the six enhancement algorithms, and it is easy to introduce new noise. The PSNR and IE of the proposed method are the highest, which indicates that the effects of noise reduction and clarity enhancement are good.
As shown in Figure 16 and Table 2, although the AG and SF values of the HE algorithm are the highest, the image distortion is too high, and the image details are seriously lost; obviously, the HE method is not desirable. The proposed method can achieve high definition, prominent local information and good color fidelity at the same time.
The µ, σ , E, AG, SF and PSNR are used as evaluation indexes to objectively evaluate the image enhancement algorithms.
The data in Table 2 show that the brightness of the weak-light image is greatly improved by the six algorithms. The MSR and AFBE algorithms over-enhance these three images, and the enhanced images become high-brightness images.
The HE method achieves the maximum SD value compared with the other five algorithms because the HE algorithm has a good ability to improve the degree of image gray value change. The proposed method achieves the next largest SD, which shows that the method can effectively enhance the details of the image.
The image enhanced by the proposed method reaches the maximum IE value, which shows that the weak-light-enhancement factor selected by the proposed method is the best and that the method estimates the illumination of each pixel of the image well. The MSR, AFBE, and BIMEF algorithms achieve poor IE results.
For the AG of the image, the value obtained by the HE algorithm is the highest because the HE algorithm enhances the local contrast unnaturally, so the HE algorithm is excluded. The second highest AG is obtained by the proposed algorithm, which proves that this method can enhance the definition of the fine details of the texture and edges of the image.
For the SF value of the image, the HE algorithm obtains the highest value because the gray level of the transformed image is reduced and is not flat. The SF value of the proposed method is second only to that of the HE algorithm, which proves that the proposed method has a good ability to highlight image details.
The PSNR of the proposed method is the highest, and the effect of suppressing image noise is the best. The PSNR of the HP algorithm ranks very high, which effectively denoises while maintaining the image edge clarity. The HE and MSR algorithms easily cause color distortion and introduce new noise in the color image enhancement results.

C. OBSTACLE DISTANCE MEASUREMENT PARAMETER ANALYSIS
For the estimated distance, from equation (25), the factor that affects the distance measurement result is the focal length f. From the image, we obtain |OB′|, |OC′| and |B′C′|; |O₀A| and |O₀B| must be obtained through actual measurement. During the ranging process, the focal length and rotation angle of the navigation camera must be fixed to keep the values of |O₀B′|, ∠O₀BC and ∠O₀B′C′ unchanged. Therefore, the estimated distance is related only to the position of the target on the transmission line, which is determined by the sizes of |O₀B| and ∠B′O₀C′. Among them, ∠B′O₀C′ can be directly obtained by image processing, and it is related to the focal length f. From equations (22) and (23), the values of u₀ and v₀ can be obtained from the acquired images. The resolution of the captured image is 1280 × 720; therefore, u₀ = 640 and v₀ = 360. The parameters of the monocular ranging experiment are listed in Table 3.
The focal length f is the result of calibration, and there are errors in the calibration parameters. We set f = 947 pixels as the initial value.

D. OBSTACLE DISTANCE MEASUREMENT PARAMETER MODIFICATION
Traditional parameter modification methods include function fitting and neural networks. A neural network can be attempted when the causality is determined and the conventional fitting effect is not good; however, a neural network needs a long learning phase, which does not meet the instantaneous-efficiency requirement of robot ranging, so it is not a wise choice. The first type of traditional fitting-based modification adds a variable to the model to modify the parameter, but this method invisibly increases the complexity of the model. The other type fits and modifies the parameters of the model itself; it changes only the variables in the original model and keeps the complexity of the original model unchanged.
From equation (33), it can be found that the cause of the error is related to the size of the focal length f. Before establishing the parameter modification, obstacles are first placed on the ground wire at fixed intervals (0.1 m) and measured with a laser rangefinder; then, the ground wire is marked, as shown in Figure 20.
The expression between the focal length and the actual distance is given in equation (34), where i (i = 1, 2, · · · , 52) represents the mark number of the distance from the obstacle to the robot from 0.8 m to 6 m (with an interval of 0.1 m). B′C′ᵢ represents the value of |B′C′| when the obstacle is at the ith position on the transmission line, OC′ᵢ is the value of |OC′| at the ith position, and Sᵢ is the actual distance S (measured by a laser rangefinder) at the ith position. If Sᵢ, B′C′ᵢ, OC′ᵢ, ∠O₀B′C′, L₀ and L₁ are known during data preparation, then there is only one unknown quantity fᵢ in the model, and fᵢ is obtained after calculation. This paper uses the fitting method to fit the calculated fᵢ. Equation (34) shows that fᵢ changes with B′C′ᵢ; as shown in Figure 21, fᵢ can therefore be fitted as a function of B′C′ᵢ. When the results of fitting the parameter fᵢ are obtained, the relationship between fᵢ and B′C′ᵢ is as follows.
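The per-position focal lengths fᵢ can be fitted against the image spans B′C′ᵢ as described above. The sketch below uses a low-order polynomial via `numpy.polyfit`; the polynomial form is an assumption for illustration, as the paper's fitted relationship is given by its own equation.

```python
import numpy as np

def fit_focal(bc_pixels, f_values, degree=2):
    """Fit the per-position focal lengths f_i against the image spans B'C'_i.

    bc_pixels: measured |B'C'| values (pixels) at the 52 marked positions.
    f_values:  focal lengths f_i back-solved from the known distances S_i.
    Returns a callable f(bc) that replaces the fixed calibration value.
    """
    coeffs = np.polyfit(np.asarray(bc_pixels, float),
                        np.asarray(f_values, float), degree)
    return np.poly1d(coeffs)

# During ranging, the constant f = 947 pixels would then be replaced by the
# fitted value, e.g.: f = fit_focal(bc_marks, f_marks)(bc_measured)
```

Because the fit is computed once offline and evaluated in closed form online, it keeps the model complexity unchanged, matching the second parameter modification strategy described above.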

E. MONOCULAR VISION STATIC RANGING EXPERIMENT
Multiple targets at different distances were selected for the distance measurement experiment. The obstacle distance measurement parameters are L₀ = 0.8 m, L₁ = 0.93 m, ∠AO₀B = 30.66°, ∠O₀B′C′ = 50.08° and |O₀B′| = 1268 pixels; we take the counterweight distance measurement as an example. Table 4 shows the experimental data: fifty measurements were made, and the measured values were then averaged.
Because the focal length f is directly calculated from the actual distance, the distance calculated by monocular vision is basically equal to the actual distance in the static experiments. According to Table 4, the monocular vision parameter modification greatly reduces the error rate, which is basically kept below 3%. This finding shows that the introduction of the monocular vision parameter modification achieves good results, and the ranging accuracy fully meets the 5% static error requirement of the inspection robot. When the obstacle image is in a strong-light environment, the effects of the obstacle distance measurement before and after applying the highlight suppression algorithm in this paper are compared in Table 5.
In Table 5, the SHB and SHA are the obstacle distance values measured through the visual method before and after applying the specular highlight suppression algorithm, respectively, and the SHBE and SHAE are the corresponding error rates. Table 5 shows that when the specular highlight suppression algorithm is introduced, the ranging accuracy is significantly improved. When the obstacle image is in a low-light environment, the effects of the obstacle distance measurement before and after applying the weak-light-enhancement algorithm in this paper are compared in Table 6.
Similarly, the LLB and LLA are the obstacle distance values measured through the visual method before and after applying the low-light image enhancement algorithm, respectively, and the LLBE and LLAE are the corresponding error rates. Table 6 shows that when the low-light image enhancement method is introduced, the ranging accuracy is significantly improved.
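The static error rates reported in Tables 4-6 follow a simple relative-error computation over repeated readings; a minimal sketch (function name illustrative):

```python
import numpy as np

def static_error_rate(measurements, true_distance):
    """Average repeated monocular readings (50 per target in Table 4) and
    report the relative error (%) against a laser-rangefinder ground truth."""
    est = float(np.mean(measurements))
    err = abs(est - true_distance) / true_distance * 100.0
    return est, err
```

Applying this to the before/after columns of Tables 5 and 6 is what quantifies the accuracy gain from the two enhancement algorithms.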

F. MONOCULAR VISION DYNAMIC RANGING EXPERIMENT
In addition to the static test, robot motion test experiments were planned and carried out [42]-[44]. Because the robot carries both the vision equipment and a walking odometer, the odometer can be used to verify the distance estimation accuracy. In the experiment, the robot moves at three different speeds: wheel speeds of 20.85 r/min, 46.23 r/min and 68.46 r/min, corresponding to linear speeds of 6.36 m/min, 14.1 m/min and 20.88 m/min. The steps of the robot dynamic ranging experiment are as follows.
(1) A time stamp is added to the monocular vision program, and the time stamp and monocular ranging values are displayed in the image. As shown in Figure 22, the time stamp is Beijing time.
(2) When the robot runs at a constant speed, the test distance at a certain moment of monocular vision is S 1 , and the corresponding time stamp is displayed as T 1 . When the time stamp is T 2 (T 2 > T 1 ), the test distance of monocular vision is S 2 .
(3) The speed V of the robot is known. Let S_difference = |S₁ − S₂| and D₁ = V(T₂ − T₁), and determine whether D₁ is equal to S_difference. The monocular vision dynamic error rate is defined as (|D₁ − S_difference|/D₁) × 100%. Figure 23 shows the curves of the distance estimated by monocular vision and the actual distance when the robot moves at the three speeds. In Figure 23, the red, blue and cyan curves correspond to the actual distances between the robot and the obstacle at the three speeds of 6.36 m/min, 14.1 m/min and 20.88 m/min, and the black, magenta and green curves correspond to the distances estimated by the robot vision at the same three speeds.
In Figure 23, the abscissa represents the time, and the ordinate represents the distance. To reflect the advantages of the proposed ranging algorithm, the analysis is shown in Table 7 and Figure 24. The visual dynamic error rate curves of the robot at the three speeds of 6.36 m/min, 14.1 m/min and 20.88 m/min at different time points are shown in Figure 24. Table 7 shows that the error rate of dynamic ranging is basically kept below 5%, so the ranging accuracy fully meets the 10% dynamic error requirement of the robot.
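Step (3) of the dynamic test reduces to a short calculation; a sketch (function name illustrative, units as in the experiment):

```python
def dynamic_error_rate(s1, t1, s2, t2, v):
    """Compare the vision-estimated travel |S1 - S2| with the
    odometer-predicted travel D1 = V * (T2 - T1).

    Times in minutes, speed v in m/min, distances in metres.
    Returns the dynamic error rate in percent.
    """
    d1 = v * (t2 - t1)          # distance the robot should have covered
    s_diff = abs(s1 - s2)       # distance change reported by vision
    return abs(d1 - s_diff) / d1 * 100.0

# e.g. at 6.36 m/min over 0.5 min the robot should close 3.18 m on the
# obstacle; a vision-estimated closure of 3.05 m gives roughly a 4% error.
```

Running this at many time stamps for each of the three speeds produces the error rate curves of Figure 24.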

V. CONCLUSION
Aiming at the problems of illumination influence and obstacle distance measurement in obstacle images taken by a high-voltage transmission line inspection robot during autonomous operation, a method fusing image enhancement with monocular vision is proposed. A highlight suppression algorithm is used to suppress the highlighted part of an image, so that the overall brightness of the processed image is significantly reduced and the image contrast is enhanced; compared with five other image enhancement methods, this method has the best ability to retain detail information, as shown through visual analysis and objective evaluation of image quality. The low-illumination part of the image is enhanced by the tone mapping algorithm based on a weighted guided filter, which improves the overall brightness and clarity of the image and highlights the image details; compared with five other state-of-the-art algorithms, this method achieves the best performance on illumination compensation and contrast enhancement through visual analysis and objective evaluation of image quality. In this paper, the monocular ranging model of the robot is established, and the ranging parameter modification is introduced. According to the error generation mechanism, the experiment is improved, and the accuracy of the ranging method is finally verified. The experimental results of outdoor ranging show that the error rate is less than 3% for static ranging and less than 5% for dynamic ranging, with good robustness and strong anti-jamming ability.
YI WU received the B.S. degree in mechanical engineering from the School of Power and Mechanical Engineering, Wuhan University, in 2018, where he is currently pursuing the M.M. degree. His research interests include system modeling and intelligent control algorithms.
VOLUME 9, 2021