Weld Classification With Feature Extraction by FRESH Algorithm Based on Surface Topographical Optical Coherence Tomography Data for Laser Welding of Copper

The topography of a weld seam carries information about quality-relevant characteristics such as humping or spatter. Optical coherence tomography (OCT), mounted coaxially at a laser scanning optic, can be used for inline scanning of the weld topography. Feature extraction from this topographical information is challenging because mathematical representations must be found that identify the relevant features. Feature extraction based on scalable hypothesis tests (FRESH) allows for feature extraction by a combination of various time series characterization methods. FRESH's feature selection is supported by an automatically configured hypothesis test and hence allows for quick extraction of significant features from sensing data in laser welding processes. In this work, a proof-of-concept is demonstrated for weld result categorization from OCT data by feature extraction using the FRESH algorithm. Changes in weld topography are characterized over a wide range of process parameters for weld categories such as spatter, deep penetration welding, humping, and heat conduction welding. As a result, a quantified separation of weld categories is possible, which shows the feasibility of the FRESH algorithm for future quality assessments with different sensing technologies in laser welding.


I. INTRODUCTION
Laser welding of copper gains relevance in an increasing number of applications in e-mobility solutions. The physical properties of copper make laser welding a challenging task. Different weld seam inhomogeneities are possible which reduce the quality of the weld. Requirements and recommendations on quality levels for imperfections can be defined according to ISO 13919-2:2021 [1]. These imperfections can be classified into surface imperfections and internal imperfections.
The associate editor coordinating the review of this manuscript and approving it for publication was Michael Lyu.
Internal imperfections such as pores originate from instabilities of the vapor capillary [2], [3]. Pores can be generated if the pressure in the vapor capillary does not expel the load of the molten metal [4]. Melt ejections in the form of spatter can occur if the pressure in the vapor capillary exceeds the load of the molten material [4]. Spatter is a surface imperfection that can result in underfill or craters. Another surface imperfection is an irregular seam topography with excess weld metal. This type of imperfection can be caused by humping. Humping is a process phenomenon that appears as a regular drop formation due to fluid flow instabilities [5]. Humping occurs at the transition from deep penetration welding to heat conduction welding. Heat conduction welding differs from deep penetration welding in that the workpiece surface is heated and melted locally without the generation of a vapor capillary [6]. This results in weld seams whose width exceeds their depth.
In conclusion, the solidified weld seam shows different seam topographies depending on the fluid dynamics in the molten material and the pressure in the vapor capillary. This should allow for a classification of different weld inhomogeneities and weld regimes based on the topography of the weld seam.
Optical coherence tomography (OCT) is an interferometric measurement technology that enables the inline measurement of weld seam topography coaxially to the processing laser in a scanning optic [7], [8], [9], [10], [11]. For instance, Hartung et al. [11] combined OCT with external high-speed video camera measurements for the identification of spatter, which appears as height deposits on the workpiece surface. However, the height information from the OCT data was only used to support the labeling of camera images in the training process of a neural network. Stadter et al. [10] showed correlations between in-process weld depth measurements with OCT and the weld seam surface topography with a machine learning approach. The authors evaluated the height profile along the weld center by using discrete wavelet decomposition. This approach enabled a quantification of the height measurement for a characterization of the weld seam surface. Preliminary analyses were necessary to identify a wavelet decomposition with a level 5 approximation as suitable. In this approximation, the number of peaks was counted to classify welds as good or poor depending on the weld depth. This approach lacks objective criteria for the classification of weld seam inhomogeneities.
Even though deep neural network architectures can be applied efficiently to classify welds as good or poor for quality control and can solve prediction problems [12], [13], they do not facilitate information extraction [14]. Feature creation followed by a feature analysis helps to determine the driving features of the system, and this information can be further used for deriving physical knowledge [14]. Feature engineering tools like tsfresh [15], tsfeatures [16], and hctsa [17] enable automated feature extraction and hence the identification of objective criteria for the classification of weld seam inhomogeneities. A benchmark of these tools can be found in the literature [18].
Feature extraction based on scalable hypothesis tests (FRESH; Python package: tsfresh) allows for feature extraction by a combination of various time series characterization methods specifically designed for time-series data sets (e.g., height profile data) [15], [19]. FRESH's feature selection is supported by an automatically configured hypothesis test and hence allows for quick extraction of significant features from sensing data in laser welding processes. In this work, a proof-of-concept is demonstrated for weld result classification from topographical OCT data by feature extraction using the FRESH algorithm. We investigate a wide range of laser parameters for the separation of the heat conduction and deep penetration welding regimes, humping, and spatter occurrence based on surface topographical information from OCT measurements. Features are selected with the help of the FRESH algorithm and discussed for classification applicability under consideration of process knowledge. Finally, a proof-of-concept is given to demonstrate the applicability of the FRESH algorithm for the identification of features from weld surface information according to process results in laser welding.

A. EXPERIMENTAL SETUP
The experimental setup consists of a programmable scanning optic with cross-jet, processing laser with fiber, and OCT device (see Fig. 1). The laser welding process is performed using a continuous wave disk laser (Trumpf TruDisk 6001) at a wavelength of 1030 nm with a maximum average power of 6000 W. The laser light is coupled into programmable focusing optics (Trumpf PFO 33-2) with the help of a fiber (core diameter 100 µm). The focusing optic has a focal length of 255 mm and results in a laser spot diameter of 170 µm. The scanning optics consist of galvanometer scanners. These allow for scanning in an elliptical field of 90 × 50 mm. The OCT is attached to the programmable focusing optics and hence enables a coaxial positioning of the measurement beam. The OCT is an SD-OCT with a superluminescent diode with a central wavelength at 840 nm and a bandwidth of 40 nm. The measurement beam is detected on a 2048-pixel line sensor with a maximum measurement frequency of 70 kHz. The OCT system has an axial resolution of 12 µm (z-direction) and the lateral resolution is 25 µm (y-direction).

B. EXPERIMENTAL PROCEDURE
The workpiece (pure copper Cu-OF; 70 × 30 × 5 mm³) is mounted in the focus position. The weld seam length is 60 mm (bead-on-plate). The OCT measurement line has a length of 2 mm with 200 measurement points, whereby the OCT measurement line is orientated perpendicular to the weld seam and scanned with 12 mm/s along the weld seam after the welding process in the x-direction (see also Fig. 1). These parameters enable a resolution of 42 µm in the x-direction. Processing parameters are varied by average laser power and welding speed. The average laser power is varied from 2000 up to 6000 W with 1000 W steps. Welding speed is varied in a range from 2 m/min up to 100 m/min (see Table 1). Measurements are repeated three times for each set of parameters. Results are included from 309 experiments. Parameters that did not result in a welding process are excluded from the study (e.g., cutting, no melting of the workpiece).

C. DATA PROCESSING
The data processing of the OCT data follows three steps. These steps are the categorization of process results, preprocessing of OCT data as well as feature extraction and selection (see Fig. 2).
In the beginning, manual categorization of process parameters is necessary. Three categories are differentiated: spatter, humping, and desired welding regime (see Fig. 2; Step 1). Internal defects like pores and weld depth fluctuations are not specifically evaluated. Preliminary tests with X-ray analysis at a resolution of 69 µm have shown that these defects are often accompanied by spatter events for the given set of parameters. These findings are supported by literature [2]. Spatter events are chosen for categorization as spatter clearly indicates changes in surface topography. The number of spatters is counted manually with the help of the existing OCT data and offline microscopy images. Here, height depositions and underfill are attributed to spatter events. Humping is determined with visual inspection and metallographic analysis. Heat conduction welding is separated from deep penetration welding by the calculation of the aspect ratio between weld depth and weld width. Weld width and depth are determined by metallographic analysis for the set of parameters.
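The aspect-ratio criterion can be illustrated with a short sketch. The depth and width are those of the humping example discussed in Section III-B (214 µm and 345 µm). The limit of 1.0 is an assumption derived from the statement that heat conduction welds are wider than deep; the study does not state a numerical limit, and the function names are hypothetical.

```python
def aspect_ratio(depth_um: float, width_um: float) -> float:
    """Aspect ratio of a weld cross-section: weld depth divided by weld width."""
    return depth_um / width_um


def weld_regime(depth_um: float, width_um: float, limit: float = 1.0) -> str:
    """Label the weld regime from the aspect ratio.

    ASSUMPTION: limit = 1.0 (depth equal to width) is used for illustration
    only; the source does not give the numerical limit it applied.
    """
    if aspect_ratio(depth_um, width_um) > limit:
        return "deep penetration"
    return "heat conduction"


# Metallographic values reported for the humping example (Section III-B).
ratio = aspect_ratio(214.0, 345.0)
regime = weld_regime(214.0, 345.0)
print(f"aspect ratio = {ratio:.2f} -> {regime}")
```

The rounded ratio reproduces the value of 0.62 reported for that weld.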
After labeling the process parameters into categories, the OCT data must be processed so that the FRESH algorithm can be applied (see Fig. 2; Step 2). First, a region of interest (ROI) is applied to crop the individual OCT images, which avoids the influence of image artifacts and improves processing speed. The ROI has a size of 320 × 172 pixels [z × y]. The size of the ROI in the y-direction results from excluding 14 pixels at the beginning and at the end of the 200-point measurement line, which avoids unnecessary processing time and excludes measurement information from the turning point of the scanned measurement beam. The size of the ROI in the z-direction follows from the maximum and minimum measured height. Afterward, the 2D image data is compared to a reference column according to the reference height at the zero coordinate of the programmable focusing optic (x = 0). This processing step is necessary because the optical path difference changes slightly at different positions within the programmable focusing optic. In the following step, the 2D OCT data requires a dimensional reduction. Two different dimensional reductions are performed. First, the image data is reduced to a heat map for an intuitive representation of the surface topography. Second, the image data is reduced to a height profile for further processing with the FRESH algorithm. For the latter, the absolute maximum height gradient value h of each 2D height profile is extracted. These values are combined into a height gradient vector that contains the absolute maximum height value of each image frame. The height gradient vector represents a time series that carries surface topographical information about the weld characteristics.
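The reduction from 2D frames to a 1D series can be sketched as follows. This is one possible reading of the description above, not the authors' implementation. It assumes that each frame is an intensity image already cropped to 320 rows in z, that the surface is the brightest pixel in each column, that the row index serves directly as the height in pixels, and that the per-frame value is the signed deviation from the reference with the largest magnitude. All function names are hypothetical.

```python
import numpy as np


def surface_rows(frame: np.ndarray) -> np.ndarray:
    """Row index (z) of the surface for every lateral position (y).

    ASSUMPTION: the surface is the brightest pixel in each column.
    """
    return np.argmax(frame, axis=0)


def height_profile(frames: np.ndarray, reference_row: float, margin: int = 14) -> np.ndarray:
    """Reduce a stack of frames (n_frames, z, y) to one height value per frame."""
    profile = []
    for frame in frames:
        # Drop `margin` pixels at both ends of the measurement line
        # (turning points of the scanned beam): 200 - 2 * 14 = 172 columns.
        roi = frame[:, margin:frame.shape[1] - margin]
        deviation = surface_rows(roi).astype(float) - reference_row
        # Keep the signed deviation with the largest magnitude in this frame.
        # The sign convention (row index vs. physical height) is an assumption.
        profile.append(deviation[np.argmax(np.abs(deviation))])
    return np.array(profile)


# Synthetic stack: flat surface at row 100, one raised and one lowered pixel.
frames = np.zeros((3, 320, 200))
frames[:, 100, :] = 1.0
frames[1, 100, 50], frames[1, 130, 50] = 0.0, 1.0
frames[2, 100, 60], frames[2, 80, 60] = 0.0, 1.0

profile = height_profile(frames, reference_row=100.0)
print(profile)
```

The resulting vector has one entry per frame and can be passed to a time-series feature extractor as a single series per weld.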
A feature represents a measurable characteristic of the time series. The identification of relevant features is performed in Step 3 with the FRESH algorithm (see Fig. 2; Step 3). FRESH is available as open-source code in a Python package called tsfresh that allows feature extraction based on scalable hypothesis tests. This package applies feature extraction, hypothesis tests, and a multiple testing procedure to improve finding significant features. Feature extraction is performed with 63 time-series characterization methods, which calculate 794 time-series features with established feature mappings (e.g., mean, median, etc.). The extracted features are summarized in a feature matrix. Afterward, each feature $X_i$ is evaluated with respect to its significance for predicting the target $Y$ (e.g., humping). Therefore, each feature $X_i$ is statistically tested to check the following hypotheses [19]:
$$H_0^i = \{X_i \text{ is irrelevant for predicting } Y\}, \qquad H_1^i = \{X_i \text{ is relevant for predicting } Y\} \tag{1}$$
As a result, p-values are calculated for each hypothesis test $H_0^i$ to quantify the probability that feature $X_i$ is irrelevant for predicting the target $Y$. The smaller the p-value, the higher the relevance of a specific feature to predict the target under investigation. Generally, a p-value of 0.05 or lower is considered statistically significant. Afterward, multiple hypothesis testing is applied to identify relevant features. The significance of each feature is based on nonparametric hypothesis tests (Fisher test, Kolmogorov-Smirnov test, Kendall rank test), which are chosen depending on whether the feature and the target are binary or continuous. The comparison of multiple hypotheses and features leads to an accumulation of errors. The Benjamini-Yekutieli procedure reduces this error by controlling the false discovery rate (FDR) and indicating which hypotheses need to be rejected [19].
Fig. 2. Step 1 covers the categorization of process results into spatter, humping, and the corresponding weld regime. Step 2 focuses on the pre-processing of OCT data with artifact reduction by setting a ROI, elimination of the influence of chromatic aberration by leveling, and 1D reduction of the 2D data. Step 3 applies the feature extraction and selection with the FRESH algorithm before the selected features are analyzed.
In the end, the features with the lowest p-values are selected. These features are evaluated with regard to their capability for the classification task. The most suitable features are presented in the following results and are discussed for their applicability for weld process categorization under consideration of process physics.
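The multiple-testing step can be made concrete with a minimal, dependency-free sketch of the Benjamini-Yekutieli procedure. It is a generic textbook version, not the tsfresh implementation. The p-values and the FDR level of 0.05 are illustrative assumptions and are not taken from the study.

```python
def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return one flag per feature: True if its null hypothesis is rejected.

    The p-values are sorted in ascending order and compared with the
    thresholds k * fdr_level / (m * c_m), where c_m = sum_{j=1..m} 1/j.
    All hypotheses up to the largest rank k that meets its threshold are
    rejected, i.e. the corresponding features are kept as relevant.
    """
    m = len(p_values)
    c_m = sum(1.0 / j for j in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])

    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr_level / (m * c_m):
            largest_rank = rank

    relevant = [False] * m
    for idx in order[:largest_rank]:
        relevant[idx] = True
    return relevant


# ASSUMPTION: made-up p-values for five hypothetical features.
p_values = [1e-12, 0.004, 0.03, 0.2, 0.7]
relevant = benjamini_yekutieli(p_values, fdr_level=0.05)
print(relevant)
```

In this example, the third feature has a p-value below 0.05 but is not kept, because the procedure tightens the threshold to account for the number of tested features.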

III. RESULTS AND DISCUSSION
The goal of this study is the methodical identification of features that enable the classification of weld status based on surface topographical information from the weld seam without losing process knowledge. Three categories are subject to this analysis: spatter, humping, and desired weld regime. Each category describes the process result under investigation and evaluates the applicability of the chosen feature under consideration of process physics. In the end, a proof-of-concept shows the feasibility of FRESH for the identification of relevant features for weld status classification in the case of quality monitoring.

A. WELD REGIME CLASSIFICATION
The separation of the weld regimes into deep penetration welding and heat conduction welding can give an insight into whether a certain weld depth is reached. In heat conduction welding, a smaller weld depth than weld width is achieved (see Fig. 3 top left). During deep penetration welding, the vaporization temperature is reached, and a keyhole is formed which enables a higher weld depth than weld width (compare Fig. 3(a) with Fig. 3(b)). The different melt pool dynamics result in different seam topographies (see Fig. 3(a) and Fig. 3(b) middle). In heat conduction welding, less excess weld metal (weld protrusion) can be found in comparison to deep penetration welding. Deep penetration welding absorbs a higher share of the laser radiation due to multiple reflections in the keyhole. As a consequence, higher temperatures can be achieved in deep penetration welding than in heat conduction welding. Presumably, thermal warping might promote excess weld metal during solidification. Additionally, the formation of a bulge by the keyhole might promote solidification at an elevated position.
The difference in weld topography gives rise to the separation of both weld regimes.
As a feature for weld regime classification, the feature with the lowest p-value is chosen. For separating the weld regime, the autocorrelation with lag 2 ($acl_2$) is identified with a p-value of $1.25 \times 10^{-38}$. This very low p-value indicates a very good separability of the weld regimes. The autocorrelation with lag $l$, $acl_l$, calculates the autocorrelation of the time series $S$ with its lagged version of lag $l$ [20]:
$$acl_l = \frac{1}{(n-l)\,s^2} \sum_{i=1}^{n-l} \left(h_i - \bar{h}\right)\left(h_{i+l} - \bar{h}\right) \tag{2}$$
Here, $n$ is the length of the time series, $h_i$ is the $i$-th height value, $\bar{h}$ is the mean height of the time series, and $s^2$ is its variance.
The autocorrelation is a mathematical representation of the degree of similarity between a time series and a lagged version of itself. A value of +1 or −1 represents a perfect positive or negative correlation between the current value and the lagged value. Fig. 3(c) shows the feature values of $acl_2$ for the two classified weld regimes over the entire set of parameters under consideration. Values higher than 0.424 are classified as deep penetration welds, whereas values below 0.424 are classified as heat conduction welds; this limit corresponds to the maximum value identified in the heat conduction regime. High positive values can be interpreted as a measure of the persistence of data points separated by this lag to stay above or below the mean value of the signal. A lower value indicates that the data points separated by this lag alternate close to the mean value. As welds in the heat conduction regime result in less excess metal, the height above the reference value alternates around a low mean value. This results in lower autocorrelation values, close to zero, in comparison to deep penetration welds. In the deep penetration welding regime, excess weld metal with higher mean values can be seen and melt ejections are possible. Melt ejections lead to a height profile in which the height values change in tandem, because of negative height values due to missing material in the weld seam as well as added material by spatter on top of the weld seam. In consequence, the autocorrelation function shows higher positive values.
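The behavior of the lag-2 autocorrelation can be reproduced with a small pure-Python sketch. The function follows the usual definition (lagged covariance divided by the variance), and the limit of 0.424 is the one reported above. The two input profiles are synthetic stand-ins chosen for illustration (a slowly varying profile and low-amplitude noise); they are not measured weld data.

```python
import math
import random


def autocorrelation(heights, lag):
    """Autocorrelation of a height profile with its version shifted by `lag`."""
    n = len(heights)
    mean = sum(heights) / n
    variance = sum((h - mean) ** 2 for h in heights) / n
    if lag >= n or variance == 0.0:
        return float("nan")
    covariance = sum(
        (heights[i] - mean) * (heights[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / ((n - lag) * variance)


def weld_regime_from_acl2(acl_2, limit=0.424):
    """Apply the limit reported in the text to the lag-2 autocorrelation."""
    return "deep penetration" if acl_2 > limit else "heat conduction"


# ASSUMPTION: synthetic profiles for illustration only.
# Persistent profile: neighboring values stay on the same side of the mean.
persistent = [5.0 + math.sin(2.0 * math.pi * i / 100.0) for i in range(500)]
# Noise-like profile: values alternate closely around a low mean.
rng = random.Random(0)
noisy = [0.5 + rng.gauss(0.0, 0.2) for _ in range(500)]

acl2_persistent = autocorrelation(persistent, lag=2)
acl2_noisy = autocorrelation(noisy, lag=2)
print(weld_regime_from_acl2(acl2_persistent), weld_regime_from_acl2(acl2_noisy))
```

The persistent profile yields a value close to 1, while the noise-like profile yields a value close to 0, which mirrors the interpretation given in the text.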

B. CLASSIFICATION OF HUMPING
The separation between humping and no humping addresses the problem of irregular surface topography. Fig. 4(a) depicts an example of a weld with humping. Here, the characteristic recurring surface structures, which are accumulations of melt in the form of droplets, can be seen along the weld (see Fig. 4(a) top; blue arrows). The shown weld has a weld depth of 214 µm and a weld width of 345 µm, and the droplet protrudes above the surface by 91 µm. The resulting aspect ratio is 0.62, which indicates a heat conduction weld. However, as humping occurs at the border between heat conduction and deep penetration welding, humping can also be found in the deep penetration regime. The height profile in Fig. 4(a) shows that the recurring surface structure can be identified by the inline measurement system.
Autocorrelation with a lag of 6 ($acl_6$) is identified as a feature for the classification of humping (see also (2)). This feature is extracted with a p-value of $3.97 \times 10^{-15}$ and hence also shows a good statistical capability for separating topographies with and without humping. Positive values of $acl_6$ can be identified for deep penetration welds with and without spatter (see Fig. 4(b)). These values tend to show results similar to those for the classification of weld regimes. Values close to 0 can be attributed to heat conduction welds. Negative values can be correlated with humping welds. Values between 0 and −0.085 show an overlap for the classification of humping and no humping. In this overlap region, the welds without humping are heat conduction welds. Humping can be clearly separated if the autocorrelation value is below −0.085. The value of $acl_6$ for welds with humping is influenced by a change in the frequency of the droplet occurrence. Depending on the laser parameters, the molten droplets recur at different frequencies. If the humping frequency is very high, the weld tends to show $acl_6$ values similar to those of a regular heat conduction weld. A low humping frequency can be considered pre-humping [5], which is characterized by surface waves with small amplitudes and hence increases the absolute $acl_6$ value. The change in humping frequency impairs the clear separation between humping and heat conduction welding. Nevertheless, this feature is able to separate welds in the deep penetration regime without humping from welds with humping.
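Why a regularly recurring droplet pattern can drive the lag-6 autocorrelation negative can be illustrated with an idealized, purely sinusoidal height profile. This is a sketch under assumptions that are not part of the study: the period $P$ (in samples), the amplitude $A$, and a profile that is long compared with $P$.

```latex
h_i = \bar{h} + A \sin\!\left(\frac{2\pi i}{P}\right)
\quad\Longrightarrow\quad
acl_l \approx \cos\!\left(\frac{2\pi l}{P}\right),
\qquad
acl_6 \approx \cos\!\left(\frac{12\pi}{P}\right) < 0
\;\;\text{for}\;\; 8 < P < 24 .
```

For $P = 12$ samples, the lagged profile is exactly in antiphase and $acl_6 \approx -1$. With the stated x-resolution of 42 µm, the range $8 < P < 24$ would correspond to droplet spacings between roughly 0.34 mm and 1.0 mm. Real humping profiles are not sinusoidal, so these numbers are only indicative of how the recurrence frequency shifts the feature value.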

C. CLASSIFICATION OF SPATTER
In the deep penetration welding regime, spatter might occur which may negatively affect surrounding components. Spatter is characterized either by a height deposit on top of the weld or by a melt ejection from the weld seam. The latter can be represented as a remaining surface pore. Fig. 5(a) shows a regular deep penetration weld in comparison to a weld with spatter (see Fig. 5(b)). The heatmap in Fig. 5(b) clearly indicates height deposits in red, while showing melt ejections in blue. The 1D height profile supports this view (see Fig. 5(b)) and shows its applicability for the identification of such spatter events.
Fig. 4. In (a), a corresponding heat map shows the 2D height profile at reference level (green) with height deposits (yellow, red) and voids (blue) as a distance from the reference line in pixels. This measurement information is transformed into a 1D height profile for a characteristic weld with humping. (b) shows the identified feature for humping classification (autocorrelation lag 6) based on the given 1D height profile data from all included laser process parameters. Weld seams in the heat conduction regime (green) could not be clearly separated from welds with humping (red) (b).
VOLUME 10, 2022
The most suitable feature for a binary separation between spatter and deep penetration welds with no spatter is the root mean square $rms$ with a p-value of $3.88 \times 10^{-47}$. This feature is calculated by summing the squared height values $h_i$ of the 1D height profile, dividing by the length $n$ of the height profile, and taking the square root [20]:
$$rms = \sqrt{\frac{1}{n} \sum_{i=1}^{n} h_i^2} \tag{3}$$
Fig. 5(c) shows the height root mean square $rms$ for welds with spatter and without spatter. No spatter can be found if the root mean square is below 10.05. Conversely, a root mean square above 10.05 indicates weld seams with spatter events. The higher the height root mean square, the more spatter can be found. A clear separation is only possible in the case of deep penetration welding and heat conduction welding with no humping, as humping shows overlap with the height root mean square of welds with spatter up to a value of 17.02. Nevertheless, this feature enables a classification of undesired weld status.
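A minimal sketch of the root mean square feature and the reported spatter limit follows. The limits 10.05 and 17.02 are taken from the text. The two height profiles are made-up values for illustration, and the function names are hypothetical.

```python
import math

RMS_SPATTER_LIMIT = 10.05   # reported limit between no spatter and spatter
RMS_HUMPING_OVERLAP = 17.02  # humping welds overlap with spatter up to this value


def root_mean_square(heights):
    """Square root of the mean of the squared height values."""
    return math.sqrt(sum(h * h for h in heights) / len(heights))


def indicates_spatter(heights, limit=RMS_SPATTER_LIMIT):
    """True if the height profile exceeds the reported root mean square limit.

    Values between the spatter limit and RMS_HUMPING_OVERLAP may also stem
    from humping, so a second feature is needed to resolve that range.
    """
    return root_mean_square(heights) > limit


# ASSUMPTION: made-up height profiles in pixels, for illustration only.
smooth_profile = [3.0, -4.0, 3.0, -4.0]            # small deviations
ejection_profile = [2.0] * 8 + [40.0, -35.0]       # one deposit, one void

print(indicates_spatter(smooth_profile), indicates_spatter(ejection_profile))
```

A single large deposit or void dominates the sum of squares, which is why this feature responds strongly to isolated spatter events.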

D. PROOF OF CONCEPT
In conclusion, we identified three different features which enable the classification of weld status based on surface topographical information from the weld seam. To show the proof-of-concept for possible quality assessment with features extracted from surface topographical data with the FRESH algorithm, we compare two different welds regarding the identified limits for the features (see Table 2). Heat conduction welding can be found for autocorrelation with lag 2 below 0.424. Humping can be found for autocorrelation with lag 6 below 0. Spatter can be identified for a root mean square above 10.05. As exemplary welds, we choose a weld with spatter (exemplary weld 1, power 4000 W, welding speed 2 m/min) and a weld in the heat conduction regime (exemplary weld 2, power 3000 W, welding speed 30 m/min). Both welds can be clearly classified to each target. The exemplary weld 1 with spatter shows a height root mean square value of 44.62, which is an indicator for spatter. The other features, autocorrelation lag 2 with a value of 0.66 and autocorrelation lag 6 with a value of 0.39, show that no humping is present and that the weld is in the laser deep penetration regime. The example for weld 2 (heat conduction welding) shows an autocorrelation value with lag 2 of 0.37, which identifies heat conduction welding as the welding regime. The other features, root mean square with 2.49 and autocorrelation value with lag 6 of 0.00, show that no humping can be found and no spatters are identified.
Fig. 5. In the middle section of (a) and (b), corresponding heat maps show the 2D height profile at reference level (green) with height deposits (yellow, red) and voids (blue) as a distance from the reference line in pixels. This measurement information is transformed into a 1D height profile for a weld seam with no spatter (a) and with spatter (b). (c) shows the identified feature for spatter classification (root mean square) based on the given 1D height profile data from all included laser process parameters.
This shows the feasibility of the FRESH algorithm to identify relevant features for weld quality assessments based on inline OCT data of the weld seam topography. As the features for the classification of humping and spatter show an overlap region, it is recommended to use a combination of these features to reduce the false discovery rate in the case of quality assessments. The identified values are bound to the specific resolution of the measurement system. A change in measurement frequency by changing the scanning speed or the number of measurement points will have an impact on the identified limits and on the relevance of the features. The resolution of the measurement system in the x-direction must be finer than the size of the surface structure under observation to allow a classification. An improved resolution may result in a better separability. However, the separability will remain, as each welding phenomenon shows specific characteristics regarding the surface topographical structure size and tendencies in process parameters.
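The combined use of the three limits can be sketched as a simple rule set applied to the two exemplary welds. The limits and feature values are those reported in the proof-of-concept. The data structure and function names are hypothetical, and the rules are a plain threshold check rather than a trained classifier.

```python
from dataclasses import dataclass

ACL2_REGIME_LIMIT = 0.424   # below: heat conduction welding
ACL6_HUMPING_LIMIT = 0.0    # below: humping
RMS_SPATTER_LIMIT = 10.05   # above: spatter


@dataclass
class WeldFeatures:
    acl_2: float  # autocorrelation with lag 2
    acl_6: float  # autocorrelation with lag 6
    rms: float    # root mean square of the height profile


def classify(features: WeldFeatures) -> dict:
    """Apply the three reported limits independently to one weld."""
    return {
        "regime": (
            "heat conduction"
            if features.acl_2 < ACL2_REGIME_LIMIT
            else "deep penetration"
        ),
        "humping": features.acl_6 < ACL6_HUMPING_LIMIT,
        "spatter": features.rms > RMS_SPATTER_LIMIT,
    }


# Feature values reported for the two exemplary welds.
weld_1 = WeldFeatures(acl_2=0.66, acl_6=0.39, rms=44.62)  # 4000 W, 2 m/min
weld_2 = WeldFeatures(acl_2=0.37, acl_6=0.00, rms=2.49)   # 3000 W, 30 m/min

result_1 = classify(weld_1)
result_2 = classify(weld_2)
print(result_1)
print(result_2)
```

Evaluating all three features together is what allows a weld in the overlap region of one feature to be resolved by another.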
The accuracy of the method in terms of separability of weld categories could be further increased by improving the feature selection process. A detailed discussion of possible machine learning pipelines for improving the accuracy can be found in the literature [19]. One suggested improvement for feature selection is the combination of FRESH with a principal component analysis (PCA) to avoid the consideration of highly correlated features. This is relevant because the most significant feature is not necessarily the best feature for a classification task, especially when its p-value is similar to those of other relevant features. Adding classifiers (e.g., AdaBoost, random forest classifier) to the machine learning pipeline can lead to improved accuracy for the classification task [19]. However, the number of considered features is limited by the given algorithm and hence may also limit the separability. Alternative feature engineering tools should be considered in the future, even though FRESH includes a broad set of features and emerged as the best solution for time-series feature selection in a benchmark [18].

IV. CONCLUSION
In this report, experimental work was performed with a vast variety of process parameters for the classification of weld conditions in the context of laser welding of copper. The topography was measured inline with an OCT system to identify the height profile of the weld seam. The FRESH algorithm was used to identify features for weld classification depending on the weld regime (deep penetration welding/heat conduction welding), the occurrence of spatter or humping. Features were discussed for their applicability for weld process categorization under consideration of process physics. It was shown that each classification results in a feature that allows a separation between spatter, humping, and welding regime due to the specific resulting surface topographies of the weld seam. Some features showed an overlap for specific weld conditions. A combined application of the features can be used to avoid false detections and to identify defective welds. Consequently, this enabled a proof-of-concept for inline monitoring of the laser welding result based on surface topographical information that can be classified with the help of the FRESH algorithm. We expect these findings to be beneficial in the future for any manufacturing process in the context of quality monitoring. Future work will be necessary to identify limitations of our method in the case of changing measurement resolution, weld conditions, and targets under investigation (e.g., pores, oxidation).