Prediction Model for Back-Bead Monitoring During Gas Metal Arc Welding Using Supervised Deep Learning

Maintaining a consistent weld shape during gas metal arc welding (GMAW) is vital for ensuring the specified weld quality. However, the back-bead is often not uniformly generated owing to changes in the narrow gap between the base metals during butt joint GMAW, which substantially influences weldability. Automating the GMAW process requires the capability of real-time weld quality monitoring and diagnosis. In this study, we developed a convolutional neural network-based back-bead prediction model. Specifically, scalogram feature image data were acquired by performing the Morlet wavelet transform on the welding current measured in the short-circuit transfer mode of the GMAW process. The acquired scalogram feature image data were then analyzed and used to develop labeled weld quality training data for the convolutional neural network model. The model predictions were compared with welding data acquired through additional experiments to validate the proposed prediction model. The prediction accuracy was approximately 93.5%, indicating that the findings of this study could serve as a foundation for the future development of automated welding systems.


I. INTRODUCTION
Gas metal arc welding (GMAW) is a welding method in which metal is melted by generating an arc between a consumable electrode and a base metal. Owing to its high metal deposition rate, GMAW is suitable for automatic welding and is widely applied in various industries, such as shipbuilding and automobile manufacturing [1]. In butt joint GMAW, the back-bead is typically not uniformly generated on the back side of the welds because of differences in penetration depth or gaps between the workpieces of each welded section. Lack of back-bead uniformity adversely affects the mechanical properties and weldability of the welded structure [2]. Consistently generating a back-bead improves productivity by reducing the work required to repair the welds. Therefore, the real-time prediction of back-bead generation is crucial to monitoring and minimizing changes in the back-bead shape.

(The associate editor coordinating the review of this manuscript and approving it for publication was Shunfeng Cheng.)
Several studies of the GMAW process have been conducted to predict the shape of the weld bead, optimize process parameters, and improve processes. Most of these studies employed statistical models, artificial neural network (ANN) models, or machine vision. For example, Lee et al. [3] performed multiple regression analysis for welding process variable control to obtain the desired back-bead shape and developed a process variable prediction system using an inverse transformation that could be practically applied to automated welding. Jeong et al. [4] applied a back-propagation neural network to model the welding process and identify the relationships between welding variables and the bead shape. From the results obtained, they developed a system to determine the optimal back-bead shape. Lee and Koh [5] proposed a method for predicting the width and depth of the back-bead using an ANN. They considered four welding parameters: groove gap, welding current, arc voltage, and welding speed. Kim et al. [6] developed a mathematical model for predicting the width and height of the back-bead. They optimized welding parameters, including welding current, welding speed, wire feed speed and torch angle, based on their empirical models during an open-gap pipeline joining process. Joseph et al. [7] calculated the wire transfer rate and the heat input based on the current waveform measured during pulse-GMAW to investigate their correlation with the weld bead shape. Nagesh and Datta [8] used a back-propagation neural network algorithm to correlate welding process variables with characteristic weld bead geometry parameters and degree of penetration. They applied the algorithm to predict the bead geometry and weld penetration. Cho et al. [9] analyzed the behavior of the molten pool and weld beads by performing three-dimensional transient numerical simulations using the volume of fluid method in a gas metal arc V-groove welding process. Pinto-Lopera et al. 
[10] developed a system that provides real-time weld bead width and height measurements during the GMAW process using a camera and an optical sensor.
Despite the above efforts, no studies to date have reported prediction of back-bead generation and assessment of weldability during the butt joint GMAW process by using only the welding current waveform signal measured during the welding process, without requiring additional cameras.
The welding current signal measured during the GMAW process has been the primary parameter used to judge and determine general welding quality [11]. The welding current is not only an important welding parameter in determining the depth of penetration, but also affects the transfer mode of the molten metal. However, the welding current signal is irregular, and it is difficult to obtain reliable data in real time. Kanti and Rao [12] applied an ANN to estimate the bead shape by extracting the peak value of the welding current and the wire feed rate (WFR) in the time domain. In addition, monitoring technology has been developed to determine weldability using multi-sensor systems based on acoustic signals [13]–[15]. However, it is difficult to implement real-time weldability diagnostic monitoring using acoustic signals in noisy manufacturing conditions. Numerous recent studies have instead analyzed the characteristics of the measured signal in the frequency domain to establish correlations with welding quality and to judge whether the applied signal is appropriate [16]–[20]. Chu et al. [21] proposed an effective time-frequency analysis method for the short-circuit transfer mode of the GMAW process to detect welding defects and verify welding quality. Further, Huang et al. [22] studied the frequency distribution variations of welding current signals and proposed a methodology to classify welding quality. Their method uses feature vectors to extract the entropy of the intrinsic mode functions using the complete ensemble empirical mode decomposition with adaptive noise combined with an extreme learning machine.
Convolutional neural networks (CNNs) have recently received significant attention. They are designed to easily learn and respond to visual object features with local, translation, and distortion invariance. CNNs have achieved performances that surpass those of humans for some complex image processing problems. Liu et al. [23] proposed a hybrid CNN-LSTM algorithm for online defect recognition of CO2 welding that extracts the primary features of the molten pool image using a CNN and uses them as input to the LSTM model. Zhang et al. [24] developed a CNN-based deep learning algorithm to detect three different welding defects during high-power disk laser welding. Zhang et al. [25] introduced a monitoring system capable of diagnosing penetration conditions during the laser welding process based on image processing algorithms using a power-efficient TX2 computing module and a CNN. Moreover, CNNs have been actively applied in a variety of fields, including speech recognition [26], object recognition [27], and drones [28]. The main advantage of a CNN is that the characteristics of each hidden layer are automatically learned from the input data [29]. Various approaches have been adopted to improve the generalization performance of CNNs by adding normalization to the training process [26], [27] or by proceeding to deeper layers [30]. Hasan and Kim [31] applied CNN-based transfer learning to achieve higher bearing failure diagnosis performance than an ANN, a support vector machine (SVM), and other methods. Verstraete et al. [32] diagnosed faults in rolling element bearings by automatically extracting features from time-frequency images using a deep CNN (DCNN) method. Lin et al. [33] reduced the scrap ratio and improved the production quality of a casting process by applying an efficient DCNN automatic feature extraction method.
Therefore, a CNN model with efficient and excellent classification performance may be a novel and effective approach for predicting back-bead generation.
In this study, we developed a system for the monitoring and prediction of back-bead generation by acquiring a scalogram feature image using the Morlet wavelet transform (MWT) of the welding current during the short-circuit transfer mode of GMAW and applying the image data to the designed CNN model. An experiment was performed using the short-circuit transfer mode of GMAW on a butt joint of a zinc-coated 590 MPa grade hot-rolled steel plate, from which a training dataset was constructed and used to train the proposed CNN. The performance of the trained CNN model and the developed monitoring system were evaluated using new welding data, which included additional cases representing other welding conditions. The remainder of this paper is organized as follows. In Section 2, MWT and the CNN theory are introduced. In Section 3, the structure of the proposed method and experimental procedure are detailed. The obtained results are discussed and experimentally validated in Section 4. Our conclusions are summarized in Section 5.

VOLUME 8, 2020

A. MORLET WAVELET TRANSFORM
The welding current signal measured during welding is often nonstationary. The Fourier transform (FT) can obtain the frequency components of such a signal but cannot effectively represent the frequency information owing to its temporal variation and multiscale characteristics [34]. The short-time FT was hence developed: it defines a window function and expresses the signal component corresponding to the frequency of the window function in the time-frequency domain to compensate for this limitation of the FT. In this case, once the size (resolution) of the window function is determined, the size of the resolution cell in the time-frequency domain remains constant. Therefore, it is difficult to interpret multiscale behavior, making these methods inappropriate for nonstationary signal analysis. Instead of these techniques, the MWT was adopted as the signal analysis method in this study, as it can represent all components of a signal in one time-frequency domain. The MWT is a general form of the continuous wavelet transform [35], [36] and is defined by Eq. (1).

W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt    (1)
where \psi^{*} is the complex conjugate of the wavelet function and \psi_{a,b}(t) is a family of wavelets, synthesized from the mother wavelet \psi(t) through the compression (scale) coefficient a and the translation coefficient b, i.e., \psi_{a,b}(t) = \frac{1}{\sqrt{a}} \psi\!\left(\frac{t-b}{a}\right). The mother wavelet of the MWT can be written as Eq. (2).

\psi(t) = C_n\, e^{i \omega_0 t}\, e^{-t^2/2}    (2)
where C_n is the normalization constant, and \omega_0 is the center frequency of the Morlet wavelet. The compression coefficient a of the mother function of the MWT is extremely significant to the wavelet transformation and is defined as a = 2^{1/v}, where v is the scale per octave; the larger the value of v, the finer the scale discretization. A smaller a value provides a higher time resolution, whereas a larger a value provides a higher scale resolution. In this study, the MWT analysis was conducted using MATLAB software, and L1 normalization [37], [38] was applied to normalize the frequency amplitudes. The scale per octave was set to 32, and an MWT [39] with a compression coefficient of 1 was applied to the welding current signal to derive its scalogram. The scalogram is defined as the squared magnitude of the wavelet transform and can be expressed as Eq. (3).

S(a, b) = |W(a, b)|^2    (3)
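As a rough illustration of Eqs. (1)–(3), the following NumPy sketch computes an L1-normalized Morlet scalogram of a signal. This is not the authors' MATLAB implementation; the function name `morlet_scalogram`, the center frequency `w0 = 6`, and the 50 Hz toy signal are illustrative assumptions.

```python
import numpy as np

def morlet_scalogram(signal, fs, freqs, w0=6.0):
    """Morlet-wavelet scalogram |W(a, b)|^2 of a 1-D signal.

    Each analysis frequency f (Hz) is mapped to the scale a = w0 / (2*pi*f)
    in seconds; amplitudes are L1-normalized (factor 1/a), mirroring the
    normalization mentioned in the text.
    """
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs              # centered time axis (s)
    out = np.empty((len(freqs), n))
    for k, f in enumerate(freqs):
        a = w0 / (2.0 * np.pi * f)                # scale for this frequency
        # L1-normalized complex Morlet wavelet; since conj(psi(-t)) = psi(t),
        # convolving with psi equals cross-correlating with conj(psi),
        # i.e., the wavelet transform of Eq. (1) up to discretization.
        psi = (1.0 / a) * np.pi ** -0.25 * \
              np.exp(1j * w0 * t / a) * np.exp(-0.5 * (t / a) ** 2)
        w = np.convolve(signal, psi, mode="same") / fs
        out[k] = np.abs(w) ** 2                   # scalogram entry, Eq. (3)
    return out

# toy check: a pure 50 Hz tone should respond most strongly in the 50 Hz row
fs = 1000
time = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 50 * time)
S = morlet_scalogram(x, fs, np.array([25.0, 50.0, 100.0]))
peak_row = int(np.argmax(S[:, fs // 2]))          # row index at mid-signal
```

A production pipeline would vectorize the loop and use FFT-based convolution, but the per-scale loop keeps the correspondence with Eq. (1) explicit.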

B. CNN DEVELOPMENT
The CNN was inspired by a biological vision system that repeatedly captures information from a local area through many sensory cells. A CNN is intended to use the spatial information between image pixels. Through convolution, down-sampling, and weight sharing, it reduces the computational burden. Therefore, it overcomes the limitations of existing ANNs and provides excellent performance [39]. In general, a CNN has a deep neural network structure wherein alternating convolutional and pooling layers are successively combined. Eq. (4) represents the operation performed in the convolutional layer and is depicted in Fig. 1.

y_j^l = f\left( \sum_{i \in M_j} y_i^{l-1} * w_{ij}^l + b_j^l \right)    (4)
where M_j is the set of selected input feature maps, and b_j^l denotes the bias matrix. w_{ij}^l represents the weighting matrix and connects the i-th feature map of the (l−1)-th convolutional layer to the j-th feature map of the l-th convolutional layer. * represents the convolution operation, such that the feature map of the previous layer, y_i^{l−1}, yields the feature map of the current layer, y_j^l, through a trainable weight matrix and is activated by the activation function f(·) in the output layer. The filter can be described as the sum of the weights. For this weighting, the nonlinear ReLU function is used as the activation function. The ReLU function is defined as f(x) = max(0, x) and has the advantages of simplified weight re-estimation and fast training speed. The weights connected throughout the neural network are continuously updated until convergence is reached within a predetermined number of learning iterations (epochs). Further, training proceeds more quickly than with the previously used sigmoid or hyperbolic tangent activation functions [40].
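A minimal single-channel sketch of the convolutional layer of Eq. (4), using plain NumPy (the function names and the averaging filter are illustrative assumptions, not the paper's trained weights):

```python
import numpy as np

def relu(x):
    """ReLU activation f(x) = max(0, x)."""
    return np.maximum(0.0, x)

def conv2d_valid(x, w, b):
    """Single-channel 'valid' convolutional layer, y = f(x * w + b), Eq. (4).

    x: (H, W) input feature map; w: (kh, kw) filter; b: scalar bias.
    Cross-correlation is used, as in most deep-learning frameworks.
    """
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # dot product of the filter with the local patch, plus bias
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w) + b
    return relu(out)

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((2, 2)) / 4.0       # a 2x2 averaging filter
y = conv2d_valid(x, w, b=-5.0)  # (3, 3) map: local means minus 5, clipped at 0
```

With a negative bias, the ReLU zeroes the low responses, which is exactly why the feature maps discussed later in the paper appear mostly dark.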
The down-sampling layer reduces the computation time by reducing the size of the input feature map through the pooling operation. Scherer et al. [41] found that max pooling can lead to faster convergence and better generalization. The max pooling layer, which creates a certain level of invariance to small changes, is defined by Eq. (5).

y_j^l(p, q) = \max_{0 \le u < m,\; 0 \le v < n} y_j^{l-1}(p \cdot m + u,\; q \cdot n + v)    (5)
where m and n are the sizes of the max pooling kernel in the pooling layer. The max pooling kernel is applied to the area (m × n) specified in the feature map to which the activation function is applied, and the largest value is extracted to form a new feature matrix. In this process, the weight vector is not employed. Fig. 2 shows an example of using a 2 × 2 pooling kernel and stride of two. The largest input value in each kernel is passed to the next layer, and the remaining values are discarded. After max pooling, the output image is 1/4 of the input image.
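The 2 × 2, stride-two pooling of Fig. 2 can be sketched in a few lines of NumPy (a generic illustration, not the paper's implementation; the block-reshape trick assumes non-overlapping pooling, i.e., stride equal to kernel size):

```python
import numpy as np

def max_pool(x, m=2, n=2):
    """Non-overlapping m x n max pooling with stride (m, n), as in Eq. (5)."""
    H, W = x.shape
    H2, W2 = H // m, W // n
    # reshape into (H2, m, W2, n) blocks and take the max over each block
    return x[:H2 * m, :W2 * n].reshape(H2, m, W2, n).max(axis=(1, 3))

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 5.],
              [0., 1., 7., 2.],
              [3., 6., 2., 4.]])
y = max_pool(x)   # 2x2 kernel, stride 2 -> output is 1/4 the size of the input
# y == [[4., 5.], [6., 7.]]: the largest value of each block is kept
```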
The last layer of the model uses the softmax function to perform the multi-class classification of the CNN. The softmax function is a transfer function that guarantees that all output values are real numbers between zero and one and that their sum is one. The k-th output of the softmax function is the exponential of the k-th input divided by the sum of the exponentials of all inputs, as shown in Eq. (6).

\mathrm{softmax}(x)_k = \frac{e^{x_k}}{\sum_{j=1}^{n} e^{x_j}}    (6)
where n is the number of classes, and the vector x is the input to the softmax node. In a multi-class classification problem, the classification is carried out based on the probability values of the last layer. That is, given the score of each class for the input sample x, the softmax function estimates the probability that the sample belongs to each class. Two classes were considered in this study, corresponding to the cases of the back-bead being generated and not generated.
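For the two-class case described here, Eq. (6) reduces to a short, numerically stable computation (the example scores are arbitrary, for illustration only):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax, Eq. (6): outputs are in (0, 1) and sum to 1."""
    z = np.exp(x - np.max(x))   # subtracting the max avoids overflow
    return z / z.sum()

scores = np.array([2.0, 0.5])   # e.g., class 1 (back-bead) vs. class 0 scores
p = softmax(scores)
# p sums to one; the class with the larger score receives the larger probability
```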

III. MATERIALS AND METHOD
A. PROPOSED CNN-BASED BACK-BEAD PREDICTION METHOD

Figure 3 shows the structure of the CNN-based back-bead prediction framework proposed in this study. It consists of four steps: signal preprocessing, feature image extraction, CNN model training, and prediction. First, the current signal acquired from the welding experiment (excluding the sections of unstable welding at the beginning and end of the weld) is converted into a characteristic image in the time-frequency domain using the MWT. A dataset is then constructed by sliding a 0.5 s window along the signal with 90% overlap and storing the feature image corresponding to each window. In addition, class labels of ''0'' and ''1'' are assigned for the cases without and with back-bead generation, respectively. The constructed dataset is then used as the input to train the developed CNN model. The training filters (convolutional and max pooling layers) of the CNN model slide across the input image at intervals given by the stride value, extract important information (features) as the loss function is minimized, and the trained model is stored. The CNN back-bead prediction model was verified using additional experimentally obtained welding data as the verification dataset.

B. MATERIALS AND EQUIPMENT
In this study, a galvanized 590 MPa grade hot-rolled steel sheet with a thickness of 2.3 mm and a 10 µm thick zinc coating was used as the welding material. The chemical composition and mechanical properties of the steel sheet are provided in Table 1. One-pass welding was performed in a butt joint configuration without a groove. Two 150 mm × 150 mm steel plates were used as test sheets. The welding experiments were performed in constant voltage short-circuit transfer mode with a torch angle of 0° and a work angle of 90°, and the material was placed horizontally and welded in the downhand position. Fig. 4(a) shows the case where a back-bead was not generated because the weld formed between the workpieces did not fully penetrate, and Fig. 4(b) shows the case where a back-bead was generated because the weld fully penetrated the workpieces. Figure 5 provides a photo and schematic of the experimental setup. A constant voltage direct current inverter welding machine (Fronius TPS-4000, Wels, Austria) was used, and the welding experiment was performed using an X-Y stage with biaxial motion. The welding current and voltage values were controlled during the GMAW experiment based on the WFR by a synergic program, a preprogrammed welding control mode included with the welder. The welding current data generated in the experiment were measured in real time at 10 kHz using the LabVIEW program (National Instruments, Texas, USA) with a current clamp and an analog-to-digital converter (National Instruments 9229, Texas, USA). The current clamp was installed on the cable between the welding power source and the workpieces, and the current signal was transmitted to the data-acquisition device, which sent the synchronized current signal information through the Universal Serial Bus port to the PC for processing and analysis.

C. EXPERIMENTAL PROCEDURE
To predict back-bead generation, the welding conditions were set as summarized in Table 2; the welding conditions, including the welding speed, contact-to-work distance (CTWD), and shielding gas, were set to be similar to those used in the field. Further, a welding length of 14 cm was used. A shielding gas composed of 90% Ar and 10% CO2 was used to prevent oxidation of the weld bead during the experiment. The welding wire was a 1.2 mm diameter wire meeting the requirements of the American Welding Society (AWS) A5.18 ER70S-3 standard; additionally, the CTWD was fixed at 15 mm and the welding time was 14 s. The WFR was set to 4 and 5 m/min, and the root gap between the test sheets was set to 0 and 0.5 mm.

IV. RESULTS AND DISCUSSION
A. TIME-FREQUENCY DOMAIN ANALYSIS

Table 3 presents the front and back surfaces of the beads according to the welding conditions. Observation of the back-bead shape indicates that, under different WFRs, a back-bead is not generated in the absence of a root gap between the test sheets, whereas a back-bead is generated in the presence of a root gap. In general, in the GMAW process, as the WFR increases, the deposition rate increases and the arc heat applied to the base material increases accordingly, thereby increasing the penetration depth. Fig. 6 shows the typical welding current waveforms (0.5 s) in the regions with and without a back-bead shown in Table 3. The welding current signals in the regions without a back-bead generally have a slightly larger current peak value than those in the regions with a back-bead, and the welding current frequency is observed to be higher. In addition, the arc closing stage was irregular in the welding current signal where the back-bead was not generated, whereas it was regular where the back-bead was generated. The welding current waveform frequency is considered to decrease because the arc length increases slightly during back-bead generation, which makes the arc unstable and prolongs the arc time. Moreover, the welding current value decreases slightly in this process.
Frequency analysis can effectively distinguish between normal and defective results by comparing and analyzing the frequency components of the welding signal [21]. In this study, the window size for the measured welding current signal was set to 0.5 s (5000 samples), and the windows were overlapped by 90% to analyze the frequency of the welding current. Power spectral density (PSD) analysis was then performed on each welding current window and displayed in the frequency domain. Fig. 7 provides an example of this analysis. It shows the window sliding over a certain period (3-4 s) of the experimentally obtained welding current signal, with each window overlapping the previous one by 90%. Fig. 8 shows the PSD analysis results for the welding current data. For the welding current signal acquired at a WFR of 4 m/min, the maximum PSD was observed at 58 Hz and 53 Hz for the conditions without and with back-beads, as shown in Figs. 8(a) and (b), respectively. When the WFR was 5 m/min, the maximum PSD values were observed at 53 Hz and 48 Hz for the conditions without and with back-beads, as provided in Figs. 8(c) and (d), respectively. The maximum PSD decreased as the WFR increased, with or without back-bead generation, although the welding current and voltage values increased as the WFR increased. In addition, the maximum PSD and frequency decreased when the back-bead was generated under the same welding conditions.
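The PSD analysis of a 0.5 s window can be reproduced with a simple periodogram; the NumPy sketch below is an illustration, not the authors' analysis code, and the 53 Hz test tone merely stands in for a measured welding current window:

```python
import numpy as np

def periodogram_psd(x, fs):
    """One-sided Hann-windowed periodogram PSD estimate of a real signal."""
    n = len(x)
    X = np.fft.rfft(x * np.hanning(n))            # windowed one-sided spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, (np.abs(X) ** 2) / (fs * n)

fs = 10_000
t = np.arange(0, 0.5, 1 / fs)            # one 0.5 s analysis window (5000 samples)
x = np.sin(2 * np.pi * 53 * t)           # toy stand-in for the current signal
freqs, psd = periodogram_psd(x, fs)
peak_hz = freqs[np.argmax(psd)]          # falls within one 2 Hz bin of 53 Hz
```

With a 0.5 s window the frequency resolution is 2 Hz, which is sufficient to separate the 48-58 Hz peaks reported above.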
The scalogram is a 3D color representation of the wavelet coefficients, converted from the welding current signal by the MWT. Fig. 9 shows the scalograms for the 4 m/min (welding current: 156 A, voltage: 18.3 V) and 5 m/min (welding current: 188 A, voltage: 19.5 V) WFR conditions without and with back-bead generation, respectively. The x- and y-axes of the scalograms represent time and frequency, respectively, and the z-axis provides a color representation of the magnitude of the frequency. The range of the scale bar for all scalograms was 0-50. Figs. 9(a) and 9(b) show that the magnitude of the scalogram of the welding current signal is higher in the region without a back-bead than in the region with a back-bead; a similar phenomenon was observed in Figs. 9(c) and 9(d). When the back-bead is generated, as the arc length increases, the frequency component of the welding current decreases accordingly. Therefore, it is confirmed that this frequency change of the welding current is accurately displayed on the scalogram after the MWT process. Furthermore, to accurately analyze the scalogram under each condition, the scalogram of a 2 s interval for each condition is shown in Table 4. Each scalogram image was acquired by applying the MWT to a 0.5 s welding current signal. Larger frequency magnitudes were observed mainly in the 45-65 Hz frequency band. The magnitude was larger, and the frequency band wider, in the scalogram for the 4 m/min WFR condition in which no back-bead was generated than in the case where a back-bead was generated. A similar phenomenon was observed in the experimental data obtained with a WFR of 5 m/min. These results are consistent with the PSD analysis. In this study, 231 image datasets were acquired in the specified 12 s interval by overlapping the scalogram feature images, each corresponding to a 0.5 s window of the welding current in the 1 cm to 13 cm section, by 90%.
The data from the start and end of the weld were not used because of the unstable current signal generated in these regions.

B. MODEL DEVELOPMENT AND TRAINING
The more filters a CNN has, the more features it can learn in each layer during training. Visualization techniques can be applied to systematically identify features from the learned feature map patterns. The critical factors in CNN model design are the number and size of the filters in the convolutional layer. Table 5 details the network structure, parameters, and training hyperparameters used for training the proposed CNN. Table 5 shows the design of the CNN model with a 5 × 5 pixel filter and a stride of one [27]. Dropout [42] can prevent overfitting by omitting part of the neural network from the training process. Therefore, it was placed behind the down-sampling layer and the dense (fully connected) layer, and the drop rate was set to 0.5. The ReLU activation function was applied, and the error function was minimized using an adaptive moment estimation optimizer [43], which randomly initializes and updates the weights by storing the exponential means of the gradient and the square of the gradient. The learning rate of the optimization function was set to 10^-4. The training images used in this study were 128 × 128 pixels with three channels (RGB), and the dataset comprised a total of 1667 samples. This dataset was divided at a ratio of 8:2 into a training set (1333 samples) and a validation set (334 samples). The number of training epochs was set to 100, with a batch size of 32. The plot of the training loss and validation loss as a function of the epoch is shown in Fig. 10.
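The dropout regularization mentioned above can be illustrated with the inverted-dropout scheme used by common deep learning frameworks; this is a generic sketch under that assumption, not the authors' training code:

```python
import numpy as np

def dropout(x, rate=0.5, rng=None, training=True):
    """Inverted dropout: zero a fraction `rate` of activations during training
    and scale the survivors by 1/(1 - rate), so the expected activation is
    unchanged and no rescaling is needed at inference time."""
    if not training:
        return x
    if rng is None:
        rng = np.random.default_rng(0)   # fixed seed for reproducibility here
    mask = rng.random(x.shape) >= rate   # True = neuron kept for this pass
    return x * mask / (1.0 - rate)

a = np.ones((4, 4))
d = dropout(a, rate=0.5)
# each entry of d is either 0.0 (dropped) or 2.0 (kept and rescaled by 1/0.5)
```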
During the 100 epochs performed after the weights and biases were initialized, the final loss values of the training and validation datasets were 0.09 and 0.06, respectively, indicating that they almost converged to zero and that no overfitting occurred; this also explains why the learning accuracy increased with successive epochs. The classification results for the validation dataset are shown in the confusion matrix in Fig. 11. When the training was completed, the classification results from the CNN model could be evaluated using indicators such as accuracy, precision, recall, and F_1 score. Accuracy is the ratio of correct predictions to the total validation dataset and can be expressed as Eq. (7).

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}    (7)
where TP (true positive) denotes true data accurately predicted to be in the true class; FP (false positive) represents false data incorrectly predicted to be in the true class; FN (false negative) denotes true data incorrectly predicted to be in the false class; and TN (true negative) represents false data accurately predicted to belong to the false class. Precision is the ratio between the number of detections for a given class, TP, and all cases assigned to this class, i.e., TP + FP, and is defined by Eq. (8).

\mathrm{Precision} = \frac{TP}{TP + FP}    (8)
The recall value relates the numbers of true positive and false negative results, i.e., it is the ratio of correct true classifications to all elements of the given class, and is defined in Eq. (9).

\mathrm{Recall} = \frac{TP}{TP + FN}    (9)
Finally, the F_1 score is the harmonic mean of the recall and precision values obtained from Eq. (8) and Eq. (9). It can be expressed as Eq. (10).

F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}    (10)

Table 6 provides the results of classifying the 334 validation images (20% of the total training dataset) during the training of the proposed CNN-based back-bead prediction model. For the validation data representing the cases of no back-bead generation (class 0), 144 samples were accurately classified, and six were incorrectly classified. By comparison, for the validation data representing back-bead generation (class 1), 182 samples were correctly classified, and two were classified incorrectly. The average accuracy, precision, recall, and F_1 score were 98%, 98%, 97.5%, and 97.5%, respectively, demonstrating excellent classification performance. The performance of the proposed model was then validated using new welding data that were not part of the training data.
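The four indices can be reproduced directly from the confusion-matrix counts given above; the short fragment below applies Eqs. (7)-(10) to those numbers (the choice of class 1, back-bead generated, as the positive class is our assumption):

```python
# Eqs. (7)-(10) evaluated on the validation confusion matrix reported above,
# taking class 1 (back-bead generated) as the positive class.
tp, fn = 182, 2     # back-bead cases: correctly / incorrectly classified
tn, fp = 144, 6     # no-back-bead cases: correctly / incorrectly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)            # Eq. (7)
precision = tp / (tp + fp)                            # Eq. (8)
recall = tp / (tp + fn)                               # Eq. (9)
f1 = 2 * precision * recall / (precision + recall)    # Eq. (10)
# accuracy = 326/334, about 0.976, consistent with the ~98% in Table 6
```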

C. MODEL LAYER VISUALIZATION
Two-dimensional image data can be used directly as the lowest-level input to the CNN, and the primary features of the image can be extracted in each layer through convolution and pooling. The hidden layers of the proposed CNN model are visualized in Figs. 12-15, which provide the feature maps of the test images calculated by the weight filters trained in each convolutional layer (Conv2D_1-4). An input scalogram image that was not included in the training data was used, and both input images (data with and without back-bead generation) were applied to the same trained weight filters. Fig. 12(a) is the feature map of the test image with no back-bead generation, while Fig. 12(b) is the feature map of the image with the back-bead generated. The low-level features of the input test image were intensively extracted in the first convolutional layer. In addition, because this was the initial layer, no distinct feature was extracted. When the convolutional layer was complete, the feature map was transferred to the next layer by the ReLU activation function. Because the ReLU function sets all negative values in the activation layer to zero, the feature maps were usually dark. Fig. 13 shows the second convolutional layer, in which more feature information was trained and richer features were extracted in the time domain (x-axis) than in the first convolutional layer. In addition, specialized features were extracted from the scalogram of the time-frequency domain. Fig. 14 and Fig. 15 show the sixth and eighth layer feature maps after passing the max pooling layer and activation layer (ReLU), respectively. During the down-sampling process, the number of weight filters was changed from 64 to 128, and the size of the feature map was reduced by half. During this process, the features of the scalogram became clearer as the layers became deeper. The magnitude lines in the feature maps appear in bold black because, when the back-bead was not generated, the magnitude of the corresponding frequency components was larger.

D. PERFORMANCE OF THE PROPOSED MODEL
An additional welding experiment was performed (WFR of 4.5 m/min, with the other welding conditions unchanged), and the obtained data were used to verify the CNN-based back-bead prediction model proposed in this study. The results are shown in Fig. 16 and Table 7. Fig. 16(a) shows the shape of the weld bead (top and bottom views) after the welding experiment. The back-bead was generated from the middle to the end of the weld. Figs. 16(b) and (c) show the welding current signal measured in real time during the welding process and the scalogram obtained by the MWT of the corresponding current signal, respectively. Data extracted from the 1 cm to 13 cm section of the weld were used to verify the performance of the final model; the results are shown in Fig. 16(d), which compares the predicted and actual values of back-bead generation for the new welding signal. In total, 231 test samples were analyzed, 15 prediction errors occurred, and the average prediction accuracy was 93.5%. These results confirm that back-bead generation could be accurately predicted and classified in real time using the proposed CNN model, even for new welding data acquired under welding conditions different from those represented by the training dataset.

E. CLASSIFICATION PERFORMANCE COMPARISON
The proposed CNN-based method was compared with traditional machine learning techniques: an SVM [44], [45] combined with the histogram of oriented gradients (HOG) [46], which extracts features using edge orientation information, and with the local binary pattern (LBP) [47]. A linear SVM model with a cost of 0.1 and a gamma of 0.01 was constructed using the HOG and LBP features for this comparison, and the sizes and characteristics of the images were optimized. Figure 17 shows the classification performance indices, i.e., precision, recall, F_1 score, and accuracy, for each classification model, and illustrates that the SVM method was the least accurate. This confirms that traditional machine learning techniques, such as the SVM, are not well suited for classification problems involving time-frequency images. The proposed CNN model demonstrated higher performance index values than the other three methods. Therefore, the proposed CNN-based back-bead prediction method for the GMAW process was confirmed to have excellent prediction and classification performance.

V. CONCLUSION
In practice, the accuracy with which the monitoring system can predict back-bead generation is the most critical issue in one-pass butt joint GMAW. In this paper, a back-bead generation prediction monitoring system based on the MWT and a CNN was proposed. Notable results and conclusions from this study are as follows.
1. The proposed back-bead generation prediction monitoring system converts the welding current signal into a scalogram feature image using the MWT method, and a distinct feature difference was observed in the scalogram feature image based on the presence or absence of a back-bead.
2. Through frequency analysis, the frequency components of the welding current signal, which is generated irregularly in the time domain, were obtained and analyzed in relation to the presence or absence of back-beads. The welding current signal had a relatively low frequency and a low PSD value when the back-bead was generated, whereas the current signal of the weld without a back-bead had a higher frequency and a higher PSD value.
3. The classification performance of the proposed model was tested by applying new welding data (acquired through additional experiments) that were not included in the training data. The classification accuracy for the test data representing the cases with back-bead generation was 95.7%, whereas that for the conditions in which the back-bead was not generated was 91.2%. The average prediction accuracy was approximately 93.5%.
4. Based on classification performance evaluation indices, such as accuracy, precision, recall, and F_1 score, the proposed MWT-CNN-based method was demonstrated to be superior to methods based on SVM, HOG, and LBP.
Based on the excellent classification results of the developed and proposed back-bead generation prediction model, it is expected that this study can serve as a foundation for the future investigation of automated welding monitoring processes. However, the direct applicability of the study results is limited because the proposed method was specifically developed for the short-circuit transfer GMAW process. Therefore, future research will focus on the generalization of welding quality evaluation and the development of a general-purpose monitoring system that accounts for different welding methods (such as CMT and FCAW) and welding conditions (e.g., various material types and thicknesses).