Centerline Extraction Algorithm for the Weld Line Structured Light Stripe Based on the Pyramid Scene Parsing Network

Exploiting the strong feature learning ability of the pyramid scene parsing network (PSPNet), this paper proposes a method for extracting the centerline of the structured light stripe in weld images that combines PSPNet with the Steger algorithm. The method avoids traditional, complex weld image preprocessing and simplifies the steps required to extract the centerline of the weld line structured light stripe. PSPNet is used to predict the pixels containing weld feature information: its pyramid pooling module fuses local and global contextual features to supplement the feature information at the weld edges, and the Steger algorithm is then used to extract the weld feature centerline. The results show that the proposed method can accurately extract the centerline of the weld line structured light stripe under reflection interference, reaching an average of 86.8% on the segmentation accuracy index mean intersection over union (MIoU) and 18.93 pixels on the weld extraction accuracy index root mean square error (RMSE), with an average extraction time of 0.188 s per image.


I. INTRODUCTION
In recent years, with the rapid development of automation in industries such as manufacturing and the continuous improvement of related technologies, robotic metal welding has gradually become an indispensable link in the production process. To meet the requirements of modern automated production, real-time information about the weld centerline and weld feature points must be obtained during welding. Weld seam tracking technology based on active vision has therefore become a research hotspot in the field of automatic welding [1]. In factory production, welding is constantly affected by interference such as spatter, dust, and strong arc light, so in automatic welding it is particularly important to use line structured light technology to accurately obtain weld position information [2].
At present, there are many methods for extracting the centerline of the line structured light stripe in weld images. Weiming et al. [4] proposed a fast laser stripe center extraction algorithm with a certain degree of noise resistance; however, when the background is complex, the stripe quality is poor, or the stripe intensity is non-uniform, the algorithm must be combined with additional algorithms to remain robust. Keqi et al. [5] proposed a weld detection method based on image preprocessing and post-processing, but the method is extremely complex and has poor noise resistance. Jie and Yawen [6] proposed a variable threshold segmentation algorithm based on the Otsu threshold; this method effectively removes the influence of background noise but takes too long, which reduces efficiency. Because of the smoke, dust, and lighting conditions during welding, these methods all require complex image preprocessing, which in turn demands considerable practical experience from practitioners to select suitable operations for different scenes and complex conditions. The operation steps are also too cumbersome to be universally applicable. Yongshuai et al. [7] proposed a weld feature extraction method based on a fully convolutional neural network, but its accuracy is limited: it can only extract the edge contour of the weld and cannot extract the centerline of the weld line structured light stripe. Weld image feature extraction therefore needs to move toward intelligent learning algorithms with broader learning ability [8], and deep-learning-based weld feature extraction algorithms have begun to appear in related research fields to improve the adaptability and anti-interference capability of weld feature extraction [9].
Traditional neural network image segmentation algorithms often lack sufficient accuracy when segmenting small or inconspicuous objects, and may even misclassify them. PSPNet [10], designed for scene parsing and semantic segmentation tasks, fuses local and global information through its pyramid pooling module and uses an auxiliary supervision loss as an optimization strategy; its segmentation accuracy on multiple datasets exceeds that of FCN, DeepLab-v2 [11], and other neural network models. In this paper, a weld laser line feature extraction method based on PSPNet and the Steger algorithm [12] is proposed. The method accurately obtains the position of the weld centerline in the weld image, ensuring extraction accuracy while greatly simplifying the operation steps.

A. PSPNet ALGORITHM PRINCIPLE
PSPNet is a network proposed for relatively complex scene parsing problems. Its structure uses the residual network (ResNet) [13] as the main backbone, with several dedicated modules added after the end of ResNet. The network targets a weakness of traditional convolutional neural networks: they make poor use of the context information in the scene and cannot highlight and predict inconspicuous category features (details that are easily overlooked). The pyramid structure is introduced to solve this problem. In deep learning, the size of the receptive field indirectly determines the extent to which image context information is used. ResNet effectively expands the receptive field through dilated convolution and cross-layer connections, but as the depth of the network increases, the actual receptive field is still smaller than the theoretical one. The pyramid pooling module in PSPNet uses pooling at different scales to effectively alleviate this gap between the actual and theoretical receptive fields.

Let the input image size be $W_1 \times H_1 \times D_1$, where $W_1$ is the image width, $H_1$ the image height, and $D_1$ the image depth (number of channels). The output size after a convolution layer is

$$W = \frac{W_1 - F + 2P}{S} + 1 \tag{1}$$

$$H = \frac{H_1 - F + 2P}{S} + 1 \tag{2}$$

$$D = K \tag{3}$$

where $W$, $H$, and $D$ are the width, height, and depth (number of channels) of the output feature map, $K$ is the number of convolution kernels, $F$ is the size of the convolution kernel, $S$ is the stride, and $P$ is the number of padding pixels. The ReLU function is used as the activation function to keep the output non-negative and help prevent overfitting; its mathematical expression is

$$f(x) = \max(0, x) \tag{4}$$

The convolution and activation operation is

$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right) \tag{5}$$

where the bracketed part is the main feature extraction operation of the convolutional layer: $l$ is the current convolutional layer of the network, $j$ indexes one of the feature maps of that layer, $x_j^l$ is the output feature map, $x_i^{l-1}$ is the $i$-th input map in the receptive field, $M_j$ is the set of selected input maps, $k_{ij}^l$ is the convolution kernel connecting the $i$-th map of layer $l-1$ with the $j$-th map of layer $l$, $b_j^l$ is the bias term, and $f(\cdot)$ is the activation function. If the $q$-th layer is a sub-sampling (pooling) layer, its $h$-th feature map is

$$x_h^q = f\left(w_h^q \cdot \mathrm{down}(x_h^{q-1}) + b_h^q\right) \tag{6}$$

where $\mathrm{down}(\cdot)$ is the down-sampling function (down-sampling using the maximum), $w_h^q$ is the weight, and $b_h^q$ is the bias. During forward propagation, the weight parameters of the network are updated by minimizing the training loss. The loss during training is calculated as

$$\mathrm{Loss} = -\frac{1}{n}\sum_{x=1}^{n}\sum_{m=1}^{M} Q(x = m)\,\ln\frac{e^{z_m}}{\sum_{j=1}^{M} e^{z_j}} \tag{7}$$

where $\mathrm{Loss}$ is the network training loss, $Q(x = m)$ is the probability that pixel $x$ belongs to category $m$, $n$ is the number of pixels in the current training image, $z_j$ is the score of the $j$-th category, and $M$ is the total number of categories; there are two categories in this paper, so $M = 2$. In the back-propagation phase that updates the network weight parameters [14], this paper adopts stochastic gradient descent (SGD), which updates the weights using a linear combination of the negative gradient $\nabla L(W_t)$ and the previous weight update value. The calculation is as follows:

$$V_{t+1} = \mu V_t - \alpha \nabla L(W_t) \tag{8}$$

$$W_{t+1} = W_t + V_{t+1} \tag{9}$$

In the formulas, $W_t$ is the weight matrix of the $t$-th iteration, $V_t$ is the weight update value of the $t$-th iteration, $\alpha$ is the base learning rate applied to the negative gradient, and $\mu$ is the weighting of the update value $V_t$, used to weight the influence of the previous gradient direction on the current gradient descent direction.
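To make Eqs. (8) and (9) concrete, the following minimal NumPy sketch performs one momentum update step; the array shapes and the random gradient are illustrative assumptions, not the paper's actual training code.

```python
import numpy as np

def sgd_momentum_step(W, V, grad, alpha=1e-4, mu=0.9):
    """One SGD-with-momentum update following Eqs. (8)-(9).

    W    : current weight matrix W_t
    V    : previous update value V_t
    grad : gradient of the loss, nabla L(W_t)
    alpha: base learning rate
    mu   : momentum coefficient
    """
    V_next = mu * V - alpha * grad   # Eq. (8): blend old direction with new gradient
    W_next = W + V_next              # Eq. (9): apply the update
    return W_next, V_next

# Illustrative usage with arbitrary shapes and a random gradient.
W = np.zeros((3, 3))
V = np.zeros_like(W)
grad = np.random.randn(3, 3)
W, V = sgd_momentum_step(W, V, grad)
```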

B. PRINCIPLE OF STEGER ALGORITHM
The Steger algorithm is a method for extracting the centerline of the structured light stripe in weld line images based on the Hessian matrix [15]. The direction in which the absolute value of the second-order directional derivative of the image $I(x, y)$ is maximal corresponds to the normal direction of the line structured light stripe, and this direction can be determined by computing the eigenvalues and eigenvectors of the Hessian matrix. For the line structured light stripe image (weld image), the Hessian matrix is expressed as

$$H(x, y) = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{bmatrix} \tag{10}$$

where the partial derivatives are obtained by convolving the image with the derivatives of a two-dimensional Gaussian convolution template

$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right) \tag{11}$$

which highlights the gray-level distribution characteristics of the stripe:

$$I_{x} = \frac{\partial G}{\partial x} * I, \qquad I_{y} = \frac{\partial G}{\partial y} * I \tag{12}$$

$$I_{xx} = \frac{\partial^{2} G}{\partial x^{2}} * I, \qquad I_{xy} = \frac{\partial^{2} G}{\partial x \partial y} * I, \qquad I_{yy} = \frac{\partial^{2} G}{\partial y^{2}} * I \tag{13}$$

Here $I(x, y)$ denotes the image patch centered on the point $(x, y)$ and equal in size to the two-dimensional Gaussian convolution template.
The normal direction at an image point $(x_0, y_0)$ is the direction in which the absolute value of the second-order directional derivative at that point is maximal [16]; it is given by the eigenvector $(n_x, n_y)$ corresponding to the eigenvalue of the Hessian matrix at $(x_0, y_0)$ with the largest absolute value. The tangent direction is perpendicular to the normal direction. The gray-level distribution along the normal direction of the point $(x_0, y_0)$ is expanded by a second-order Taylor expansion and represented as $I(x_0 + tn_x, y_0 + tn_y)$. Since the gray values of the light stripe follow a Gaussian distribution along the stripe normal, with intensity increasing toward the stripe center, the point where the first derivative of $I(x_0 + tn_x, y_0 + tn_y)$ with respect to $t$ is zero is the sub-pixel stripe center point $(x_0 + tn_x, y_0 + tn_y)$ along the normal direction of $(x_0, y_0)$. The value of $t$ is

$$t = -\frac{n_x I_x + n_y I_y}{n_x^2 I_{xx} + 2 n_x n_y I_{xy} + n_y^2 I_{yy}} \tag{14}$$

where $I_x$, $I_y$, $I_{xx}$, $I_{xy}$, and $I_{yy}$ are the partial derivatives defined in Eqs. (12) and (13), evaluated at $(x_0, y_0)$.
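To make the procedure concrete, the Python sketch below implements the core Steger step described above using Gaussian-derivative filtering from SciPy; the Gaussian scale `sigma` and the foreground threshold `thresh` are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def steger_centerline(img, sigma=2.0, thresh=30):
    """Sub-pixel stripe centers via the Hessian/Taylor step of Eqs. (10)-(14)."""
    img = img.astype(np.float64)
    # Gaussian partial derivatives (Eqs. (12)-(13)); axis 0 is y, axis 1 is x.
    Ix  = gaussian_filter(img, sigma, order=(0, 1))
    Iy  = gaussian_filter(img, sigma, order=(1, 0))
    Ixx = gaussian_filter(img, sigma, order=(0, 2))
    Iyy = gaussian_filter(img, sigma, order=(2, 0))
    Ixy = gaussian_filter(img, sigma, order=(1, 1))

    centers = []
    ys, xs = np.nonzero(img > thresh)  # only consider bright stripe pixels
    for y, x in zip(ys, xs):
        # Hessian matrix of Eq. (10) in (x, y) component order.
        H = np.array([[Ixx[y, x], Ixy[y, x]],
                      [Ixy[y, x], Iyy[y, x]]])
        evals, evecs = np.linalg.eigh(H)
        k = np.argmax(np.abs(evals))   # eigenvalue with largest absolute value
        nx, ny = evecs[:, k]           # normal direction (n_x, n_y)
        denom = nx**2 * Ixx[y, x] + 2 * nx * ny * Ixy[y, x] + ny**2 * Iyy[y, x]
        if abs(denom) < 1e-12:
            continue
        t = -(nx * Ix[y, x] + ny * Iy[y, x]) / denom   # Eq. (14)
        # Keep the point only if the extremum lies inside this pixel.
        if abs(t * nx) <= 0.5 and abs(t * ny) <= 0.5:
            centers.append((x + t * nx, y + t * ny))
    return np.array(centers)
```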

A. OVERALL DESIGN OF ALGORITHM
In this paper, the weld line structured light stripe centerline is extracted by first using PSPNet to segment the line structured light stripe region in the weld image and then applying the Steger algorithm to extract the stripe centerline. The algorithm therefore consists of two parts; the specific process is shown in Figure 1. To reduce the time required for model training and accelerate convergence, a pre-trained ResNet-50 is used and fine-tuned. PSPNet extracts weld image features with the ResNet-50 backbone, then extracts deep and shallow image features through the pyramid pooling module, composed of pooling layers with kernels of different sizes, and fuses them at multiple levels to reduce the probability of false segmentation. The network comprises a feature extraction and classification stage followed by an up-sampling stage, ending with a convolution layer and an up-sampling layer to achieve end-to-end output. The data set is fed into the network and trained by fine-tuning the network parameters; training stops once the network loss has converged sufficiently. The trained model is then used to predict the image segmentation, the Steger algorithm extracts the centerline of the structured light stripe, and finally the segmentation result and the learned parameters are checked. The PSPNet network structure is shown in Figure 2. The part inside the black dotted line in Figure 2 is the core of the PSPNet network, the pyramid pooling module. The feature maps produced by the backbone are pooled adaptively at four scales, 1 × 1 (red), 2 × 2 (yellow), 3 × 3 (blue), and 6 × 6 (green), which helps the network integrate global context information. Taking the uppermost level as an example, it is the coarsest 1 × 1 global pooling, generating a single-pixel multi-channel output. If the pyramid has N levels, a 1 × 1 convolution is applied after pooling to reduce the number of channels to 1/N of the original. After convolution, each level is up-sampled by bilinear interpolation and fused, as prior information, with the original feature map from before the pyramid pooling module; further convolutions are then applied in turn, and finally a complete output image is obtained.
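As a sketch of how such a pyramid pooling module can be realized, the PyTorch snippet below pools at the four scales named above, reduces channels with 1 × 1 convolutions, up-samples bilinearly, and concatenates the result with the input feature map; the channel count and class name are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Pyramid pooling at 1x1, 2x2, 3x3, and 6x6 scales (PSPNet-style sketch)."""

    def __init__(self, in_channels, scales=(1, 2, 3, 6)):
        super().__init__()
        out_channels = in_channels // len(scales)  # each level keeps 1/N channels
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),                        # pool to s x s
                nn.Conv2d(in_channels, out_channels, 1, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for s in scales
        )

    def forward(self, x):
        h, w = x.shape[2:]
        # Up-sample every pooled level back to the input size by bilinear
        # interpolation and concatenate with the original feature map.
        pyramids = [
            F.interpolate(stage(x), size=(h, w), mode="bilinear", align_corners=False)
            for stage in self.stages
        ]
        return torch.cat([x] + pyramids, dim=1)

# Illustrative usage: a 2048-channel backbone feature map.
feat = torch.randn(1, 2048, 40, 40)
fused = PyramidPooling(2048)(feat)   # -> shape (1, 4096, 40, 40)
```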

C. EXTRACTION OF STRIPE CENTER LINE OF LINE STRUCTURED LIGHT IN WELD IMAGE
After the weld image has been processed by the above algorithm, the high-accuracy Steger algorithm is used to extract the centerline of the weld line structured light stripe. To reduce processing time, image coordinates matching the light stripe area of the weld image are set through the Rect class of OpenCV (an open-source computer vision and machine learning software library), and the line structured light stripe in the image is cropped into a separate region of interest (ROI). This ensures real-time weld seam tracking, reduces image processing time as much as possible, and limits the interference of irrelevant weld image regions on the extraction accuracy of the stripe centerline. For a collected weld image, the specific steps for extracting the stripe centerline of the line structured light image are as follows:

Step 1: Acquire the original weld image and input it into the PSPNet model trained on the data set to segment the image.

Step 2: According to the preset image size, obtain the ROI region centered on the weld stripe.

Step 3: Use the Steger algorithm to extract the centerline of the weld line structured light stripe, and output the extraction result.
After the original weld image is processed through the above steps, the final results are as shown in Figure 3: the algorithm effectively removes interference such as reflections while retaining the complete centerline of the line structured light stripe in the weld image.
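A minimal end-to-end sketch of these three steps is given below; the segmentation wrapper `segment_fn`, the ROI size, and the reuse of the `steger_centerline` helper from the earlier sketch are all illustrative assumptions rather than the paper's released code.

```python
import cv2
import numpy as np

ROI_SIZE = (320, 320)  # assumed preset ROI size

def extract_weld_centerline(image_path, segment_fn):
    """Step 1-3 pipeline: segment, crop an ROI, run Steger on the stripe region."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Step 1: PSPNet segmentation (segment_fn wraps the trained model).
    mask = segment_fn(img)                 # binary mask of the stripe pixels

    # Step 2: ROI of the preset size centered on the segmented stripe.
    ys, xs = np.nonzero(mask)
    cx, cy = int(xs.mean()), int(ys.mean())
    w, h = ROI_SIZE
    x0 = max(cx - w // 2, 0)
    y0 = max(cy - h // 2, 0)
    roi = np.where(mask, img, 0)[y0:y0 + h, x0:x0 + w]

    # Step 3: sub-pixel centerline via the Steger sketch shown earlier,
    # shifted back to full-image coordinates.
    centers = steger_centerline(roi)
    return centers + np.array([x0, y0])
```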

IV. EXPERIMENTAL PROCESS AND RESULTS ANALYSIS

A. DATA SET PREPARATION
In this paper, PSPNet-50 (the PSPNet variant built on ResNet-50), which offers high accuracy, is selected. Since it is a supervised-learning neural network, a labeled sample data set must be prepared for network training so that the corresponding network model can be generated and weld image segmentation realized. The original image data set used in the experiment consists of 800 training images, 200 validation images, and 100 test images of welding seams taken by a high-resolution industrial digital camera. To reduce the hardware resource and time cost of training the network model, the collected images are resized to 320 × 320. The Label-Me image annotation tool is used to annotate the weld line structured light images, where the label background denotes the background with visual RGB value (0, 0, 0), and the label 1001 denotes the weld with visual RGB value (128, 0, 0). The details are shown in Figure 4.
To ensure the final segmentation quality, improve the generalization performance of the trained network model, and prevent the overfitting problem caused by too little sample data, the original image data set and the label image data set are expanded: the original and label images are mirrored and rotated. This yields 3200 training images, 800 validation images, and 400 test images, as shown in Figure 5.
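A small sketch of this mirror-and-rotate expansion, applied identically to each image and its label, is shown below; the choice of OpenCV flip and rotate calls is an illustrative assumption consistent with the stated 4x expansion.

```python
import cv2

def augment_pair(image, label):
    """4x expansion of an (image, label) pair by mirroring and rotation."""
    pairs = [(image, label)]
    # Horizontal mirror, applied to image and label alike.
    pairs.append((cv2.flip(image, 1), cv2.flip(label, 1)))
    # 90- and 180-degree rotations, also applied to both.
    for rot in (cv2.ROTATE_90_CLOCKWISE, cv2.ROTATE_180):
        pairs.append((cv2.rotate(image, rot), cv2.rotate(label, rot)))
    return pairs
```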

B. PSPNet NETWORK MODEL PARAMETERS AND TRAINING
The hardware environment of the experiments in this article is as follows: the CPU is an i7-9750 processor, the RAM is 16 GB, and training is accelerated on a GPU. Python is installed through Anaconda3 (64-bit). Adam is used as the model optimizer, with the momentum parameter set to 0.9 and the learning rate to 0.0001. The batch size is set to 1 and the number of epochs to 300. BatchNorm is used for normalization and ReLU as the activation function. Conv and Conv2d both denote convolutional layers. To prevent over-fitting, a Dropout layer with ratio 0.1 is applied after conv2d_5. MaxPooling2D denotes the max pooling operation, res_branch the residual module, lambda the linear interpolation module, concatenate_1 the inter-layer fusion operation, and average_pooling2d_1 average pooling. The specific parameters of each layer of the model are shown in Table 1.
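The training setup described above can be summarized in a short PyTorch sketch; the tiny stand-in network and dummy data below are hypothetical placeholders for PSPNet-50 and the weld data set, while the optimizer, learning rate, batch size, and epoch count follow the text.

```python
import torch
import torch.nn as nn

# Tiny stand-in network; the paper's actual model is PSPNet-50 (ResNet-50 backbone).
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 2, 1))

# Dummy samples standing in for the 320 x 320 weld images and their labels.
images = torch.randn(4, 3, 320, 320)
labels = torch.randint(0, 2, (4, 320, 320))
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(images, labels), batch_size=1, shuffle=True
)

# Optimizer settings from the text: Adam, momentum parameter 0.9, lr = 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
criterion = nn.CrossEntropyLoss()   # pixel-wise cross-entropy, M = 2 classes

for epoch in range(300):            # epochs = 300, as in the text
    for image, label in loader:     # batch size = 1
        optimizer.zero_grad()
        loss = criterion(model(image), label)
        loss.backward()
        optimizer.step()
```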

C. EXPERIMENTAL RESULT ANALYSIS
To verify whether the algorithm in this paper improves image segmentation accuracy, the UNet and SegNet networks are selected for comparison of segmentation results, trained with the same data and ResNet-50 model weight parameters, and the trained models are validated on the test data set. To reduce the influence of chance, four original images are randomly selected from the 3200-image data set, one from each of the four weld types (800 images per type), and the three algorithms are used for segmentation prediction. The comparison of segmentation results for the PSPNet, UNet, and SegNet algorithms is shown in Figure 6. To better compare performance, several evaluation indexes commonly used for neural network image segmentation are adopted to measure segmentation accuracy: pixel accuracy (PA), mean pixel accuracy (MPA), and mean intersection over union (MIoU). The comparison of these indicators is shown in Tables 2, 3, and 4. As the experimental results show, the PSPNet algorithm used in this paper achieves, averaged over the four weld images, a PA of 99.8%, an MPA of 96.7%, and an MIoU of 86.8%. From Tables 2, 3, and 4, PSPNet leads UNet by 0.9% in MPA and 1% in MIoU, with identical PA. The gap between PSPNet and SegNet is larger: PSPNet leads SegNet by 8.8% in MPA and 3.1% in MIoU, while the PA difference is only 0.2%. The main reason for this is related to the data set used in this paper: the background occupies the main body of each weld segmentation image, so the PA values are high, and since all three algorithms perform well on lap weld and butt weld segmentation, their pixel accuracies differ little. Based on these data, the PSPNet algorithm adopted in this paper has a clear accuracy advantage over the traditional UNet and SegNet algorithms.
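For reference, a common way to compute PA, MPA, and MIoU from a confusion matrix is sketched below; this is a standard formulation of these metrics, not code from the paper.

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes=2):
    """PA, MPA, and MIoU from integer predicted and ground-truth label maps."""
    # Confusion matrix: rows = ground-truth class, cols = predicted class.
    cm = np.bincount(
        gt.ravel() * num_classes + pred.ravel(), minlength=num_classes**2
    ).reshape(num_classes, num_classes)

    pa = np.diag(cm).sum() / cm.sum()              # pixel accuracy
    per_class_acc = np.diag(cm) / cm.sum(axis=1)   # per-class accuracy
    mpa = np.nanmean(per_class_acc)                # mean pixel accuracy
    iou = np.diag(cm) / (cm.sum(axis=1) + cm.sum(axis=0) - np.diag(cm))
    miou = np.nanmean(iou)                         # mean intersection over union
    return pa, mpa, miou
```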
To verify whether the centerline extraction algorithm in this paper improves the accuracy of extracting the centerline of the line structured light stripe in weld images, and whether it offers good stability and extraction speed, the segmentation results of the above algorithms are used as experimental objects, and the traditional gray weighted centroid algorithm (GWCA) and extreme value algorithm (EVA) are selected for comparison with the proposed algorithm. The weld line structured light stripe centerlines extracted by each algorithm are shown in Figures 7, 8, and 9. The centerline extracted by the algorithm in this paper is visibly close to the actual center position of the stripe, while the centerlines extracted by the other algorithms exhibit position offsets of varying degrees; in particular, the centerline extracted by the extreme value algorithm is severely offset in several groups of weld images. To compare the accuracy of each algorithm quantitatively, the root mean square error (RMSE) is used to analyze the extraction results, as shown in Tables 5, 6, and 7. From these tables, the overall RMSE of the PSPNet-Steger extraction results on the T-type, lap, and butt weld images is 18.93 pixels. This is smaller than that of the SegNet-Steger and UNet-Steger algorithms, and much smaller than that of the PSPNet-extreme value and PSPNet-gray weighted centroid algorithms. Only on the V-type weld is the centerline extraction slightly inferior to the UNet-Steger algorithm. Moreover, judging from the average results of each algorithm, the overall RMSE of the Steger-based methods is smaller than that of the gray weighted centroid and extreme value algorithms, which shows that the proposed algorithm maintains high accuracy.
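As a reference for how such an RMSE between extracted and ground-truth center positions can be computed, a minimal sketch follows; it assumes one center point per image row (a near-vertical stripe), which is an illustrative simplification.

```python
import numpy as np

def centerline_rmse(extracted, reference):
    """RMSE (in pixels) between extracted and reference per-row x-coordinates."""
    extracted = np.asarray(extracted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((extracted - reference) ** 2)))
```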
To compare the time cost of the algorithms accurately, the time comparison results of the PSPNet-Steger algorithm and the other algorithms are shown in Tables 8, 9, and 10. The PSPNet-Steger algorithm takes slightly more time than the PSPNet-gray weighted centroid algorithm and the PSPNet-extreme value algorithm: the fastest, the PSPNet-extreme value algorithm, takes 0.062 s, while the PSPNet-Steger algorithm takes 0.188 s. However, the three algorithms differ little in time, and all meet the real-time requirements of the welding process. In summary, the overall performance of the algorithm in this paper is better than that of the other two algorithms: with almost identical run time, it achieves the smallest centerline accuracy error and the highest accuracy for the extracted stripe center points. The proposed weld centerline processing algorithm therefore meets the requirements for extracting the centerline of the weld line structured light stripe.

V. CONCLUSION
In this paper, a centerline extraction algorithm for the weld line structured light stripe based on PSPNet is proposed. At present, line structured light welding robots are susceptible to environmental factors, which leads to welding deviation, and the associated image preprocessing is overly complex. By combining deep learning technology, PSPNet is applied to the extraction of the weld line structured light stripe centerline. Compared with the SegNet and UNet algorithms, PSPNet has good anti-interference ability, achieves high extraction accuracy for the stripe centerline, and simplifies the image preprocessing process. The complex manual operations are eliminated, providing a new idea for weld image processing methods.