Tool Wear Condition Monitoring in Milling Process Based on Current Sensors

Accurate tool condition monitoring (TCM) is essential for the development of fully automated milling processes. This is typically accomplished using indirect TCM methods that synthesize the information collected from one or more sensors to estimate tool condition based on machine learning approaches. Among the many sensor types available for conducting TCM, motor current sensors offer numerous advantages, in that they are inexpensive, easily installed, and have no effect on the milling process. Accordingly, this study proposes a new TCM method employing a few appropriate current sensor signal features based on the time, frequency, and time-frequency domains of the signals and an advanced monitoring model based on an improved kernel extreme learning machine (KELM). The selected multi-domain features are strongly correlated with tool wear condition and overcome the loss of useful information related to tool condition when employing a single domain. The improved KELM employs a two-layer network structure and an angle kernel function that includes no hyperparameter. This design addresses the difficulty that KELM has in learning the features of complex nonlinear data and avoids the need to preselect the kernel function and its hyperparameter. The performance of the proposed method is verified by its application to the benchmark NASA milling dataset and separate TCM experiments in comparison with existing TCM methods. The results indicate that the proposed TCM method achieves excellent monitoring performance using only a few key signal features of current sensors.


I. INTRODUCTION
Milling is a common and efficient machining operation employed in modern industrial manufacturing for fabricating mechanical parts with features such as flat surfaces, grooves, threads, and other complex geometric shapes. Cutting tools are key components in machine milling operations that are inevitably subject to wear during milling and therefore present conditions that vary over their effective lifetimes [1]. However, Konstantinos et al. [2] and Karandikar et al. [3] have determined that cutting tools are typically used for only 50%-80% of their effective lifetimes owing to excessive tool wear and breakage (i.e., tool faults). These tool faults are major causes of unscheduled downtime in milling processes and typically account for 7%-20% of the total downtime [4]. In addition, tools and tool changes account for 3%-12% of the total processing cost [5]. As such, tool faults have negative direct (capital) and indirect (time loss) effects on milling performance. Therefore, the timeliness of detecting tool conditions is critical to provide effective information for implementing scheduled tool replacement decisions without interrupting normal machine operations [6]. As a result, tool condition monitoring (TCM) has become an essential task in industrial milling processes for scheduling operations based on objective tool condition evaluations [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Gongbo Zhou.
Researchers have investigated TCM in milling processes for over thirty years based on either direct or indirect monitoring methods. Direct monitoring methods adopt optical components for visual inspection and are not suitable for industrial manufacturing settings due to the expense of the optical equipment involved and the interference of cutting fluid and cutting chips [8], [9]. Therefore, indirect monitoring methods have been widely adopted. Indirect TCM methods are data-driven methods that synthesize the information collected from one or more sensors to estimate tool condition based on machine learning approaches [10]. This study first considers past efforts focused on indirect TCM methods using various sensors in Section II and discusses their limitations. The problems associated with these past efforts are addressed by the theoretical framework and learning algorithm of the TCM method proposed in Section III. The prediction performance of the proposed method is verified in Sections IV and V by its application to the open-access benchmark NASA milling dataset [11] and separate TCM experiments in comparison with several methods. Conclusions are given in Section VI.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

II. LITERATURE REVIEW
A. SENSORS
Numerous types of sensors have been employed to obtain signals for conducting indirect TCM, such as cutting force, vibration, acoustic emission (AE), and motor current sensors.

1) CUTTING FORCE
The progressive wear of cutting tools during the milling process increases the roughness of the cutting surface, and this leads to a corresponding increase in the applied cutting force. Many studies [1], [12], [13] have demonstrated that the cutting force is very sensitive to changes in tool condition and can therefore accurately estimate the tool state. For example, Wang et al. determined that the cutting force signal is the most stable and reliable signal among all commonly employed sensor signals that are closely related to tool wear [14]. Huang et al. employed a piezoelectric dynamometer to monitor the cutting force of an end milling operation [15]. Bulent et al. adopted a rotary dynamometer to capture the cutting forces in three dimensions and the torque of the drive moment on a rotating tool [16]. However, cutting force sensors are difficult to apply in industrial settings because their physical properties are not appropriate for conducting TCM when milling medium and large workpieces, such that milling processes monitored by cutting force sensors are limited to relatively small physical workpiece sizes [17]. In addition, Koike et al. established that cutting force monitoring interferes with the motion control of the spindle and stage in a milling machine by reducing its rigidity [18]. Moreover, the expense of commercial dynamometers can unacceptably increase manufacturing costs [19], [20].

2) VIBRATION
Vibration sensors are widely employed in TCM because they are inexpensive, easily installed, and provide similar periodic signal shapes to those of cutting force sensors [21]- [23]. Besmir et al. established that the level of vibration generated during the milling process increases with increasing deterioration in the tool condition [24], and the feasibility of adopting vibration signals for TCM in milling processes has been demonstrated by numerous subsequent studies [25]- [28]. For example, Hsieh et al. demonstrated that variations in tool conditions during micro-milling processes can be distinguished according to spindle vibration acceleration signals when used in conjunction with appropriate feature extraction and classifiers [29]. Madhusudana et al. adopted a tri-axial integrated piezoelectric accelerometer on the spindle housing to capture the spindle vibration acceleration signal during face milling [30]. Gao et al. achieved good tool condition diagnostic accuracy by adopting a laser vibrometer to acquire the vibration displacement of a tool holder [31]. However, the characteristics of milling processes limit the accuracy of TCM methods employing vibration sensor signals. First, vibrations are generated during machine operation even when the tool is not engaged in cutting, as during an air-cut operation. In fact, effectively distinguishing air-cut operations from actual cutting operations remains a significant challenge in TCM methods employing vibration sensor signals. Second, vibration signals are difficult to filter and are therefore prone to providing erroneous data [24]. Finally, the position of sensor installation and cutting-fluid conditions can affect the vibration signal, which greatly complicates the training process, and can lead to inaccurate monitoring results [9].

3) ACOUSTIC EMISSION
Sensors based on AE are particularly suitable for conducting TCM in milling processes because the resulting signals are not mechanically disturbed, have a sensitivity superior to that of cutting force and vibration sensor signals, and propagate at a frequency much greater than the characteristic frequency caused by cutting, which reduces interference [32], [33]. Hassan et al. demonstrated the potential of AE signals for detecting the unstable crack propagation preceding tool chipping/breakage within a time span on the order of 10 ms [34]. Vetrichelvan et al. demonstrated that the AE signals obtained from sensors located on the top surface of the tool holder can effectively monitor crater wear in the cutting surface [2]. Mathew et al. experimentally demonstrated with 1-tooth, 2-tooth, and 3-tooth milling cutters that AE signals exhibit marked responses to changes in tool condition such as tool breakage and tool chipping [35]. Ren et al. established that AE signals captured in micro-milling processes are easily recorded and provide very rapid responses to changing conditions in the contact between the tool and workpiece [36]. However, intermittent cutting during milling processes results in AE signal spikes when individual teeth enter or exit the workpiece, which greatly complicates the analysis of AE signals [32]. In addition, AE sensors are highly sensitive to environmental noise [37], which increases the difficulty of extracting valid signal feature information.

4) MOTOR CURRENT
Because the cutting force increases with increasing tool wear, the current drawn by the electric motors of a milling machine undergoes corresponding increases [38]. Motor current sensors are considered to be more suitable for industrial manufacturing settings than cutting force sensors due to their relatively simple application and lack of installation effects on machining operations [39], [40]. Ghosh et al. demonstrated that TCM methods based on current sensor signals provide monitoring results that are fairly comparable to methods based on cutting force sensor signals in actual industrial TCM applications [8]. Stavropoulos et al. demonstrated that motor current signals correlate more strongly with tool wear than vibration signals [20]. Ammouri et al. established a TCM index based on the measured current values of the spindle and drive motors of a milling machine [38]. Hassan et al. proposed an effective signal processing technique for applying spindle motor current signals to describe tool wear conditions under different cutting parameters in high-speed roughing milling operations [41]. However, current sensors are less commonly employed for TCM in milling processes than the above three types of sensors because motor current signals include a considerable amount of noise, which obstructs the detection of small fluctuations in the cutting force, and the application of filtering typically results in the loss of high-frequency components [18], [42]. Teti et al. suggested that the proportion of spindle power required for material removal is a very small component of the total power, and that temperature increases inherent in electrical motors under load influence power consumption [43]. Specifically, the spindle current and voltage frequency in high-speed milling processes could be modified by a pulse-width modulation (PWM) module when using a 400-Hz 2-pole induction spindle motor to generate superimposed signals for maintaining a set rotational speed [44].
The limitations associated with the single-sensor TCM applications discussed above have generated an increasing interest in multi-sensor TCM [45], [46]. For example, Torabi et al. applied the signals obtained from dynamometer, accelerometer, and AE sensors to conduct TCM for a ball nose milling process [47]. Downey et al. developed a TCM system based on AE, vibration, and cutting force signals [48]. Jahromi et al. applied the signals derived from cutting force, accelerometer, and AE sensors for conducting TCM in a high-speed milling process [49]. Sohyung et al. applied three-axis dynamometer, three-axis accelerometer, AE, and current sensors for TCM in an end milling process, and demonstrated that the diagnosis accuracy was greater than that obtained using any of the single sensors [49]. Hassan et al. presented a generalized, nonintrusive multi-signal fusion approach for real-time tool wear detection by using unprocessed spindle motor current, voltage, and power signals directly [44]. However, while the use of multiple sensors can enhance the richness of information indicative of potential tool wear levels, it also raises production and maintenance costs and makes maintenance more difficult in industrial machine milling operations. Moreover, the interference caused by sensors in the milling process increases with an increasing number of sensors.

B. MONITORING MODEL
The rapid development of artificial intelligence technology in recent years has led to the use of many AI models to predict tool-wear conditions based on sensor signal data. These predominantly include artificial neural networks (ANNs), hidden Markov models (HMMs), the support vector machine (SVM), and the kernel extreme learning machine (KELM). More recently, deep learning technologies such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), with wide applications in image processing and other fields, have emerged as alternative AI models in TCM for milling processes.
Many studies have applied ANNs and HMMs to TCM in milling processes with outstanding results [33], [51], [52]. Deep neural networks such as CNNs [53], [54] and RNNs [55], [56] have also been applied with considerable success. However, ANN- and HMM-based TCM models have several drawbacks [57], [58]. First and foremost, they require a large number of training samples to obtain accurate monitoring performance, and collecting these samples is time-consuming and costly for industrial milling operations. Second, they require the preselection of critical parameters. For example, the number of hidden layers of an ANN and the number of neurons in each layer (i.e., the network structure) are critical to the performance, but the selection of network structure depends on researcher experience and is not directly related to the tool-wear process. As such, selecting the optimal network structure for conducting TCM in milling operations from among the many possible structures remains an unsolved issue. In addition, an accurate determination of the tool state duration distribution in the milling process is critical to the performance of HMMs, although no truly objective means for determining this distribution presently exists.
In contrast to the above-discussed AI models, SVM and KELM have generated considerable interest in TCM research because of their superior performance with small sample sizes [59]. An SVM applies statistical learning theory to map input samples in the original space to a high-dimensional feature space nonlinearly using a kernel function and thereby constructs a linear algorithm corresponding to the solution in the original space. A KELM was proposed for use in a single hidden layer feed-forward neural network (SLFN) with a kernel function that learns quickly, and its learning accuracy and speed have been demonstrated to be greater than those of other models such as SVM, ANN, and HMM in various applications, including classification, regression, time series forecasting, and fault diagnosis [60], [61]. Moreover, the KELM tends to achieve not only the smallest training error but also the smallest norm of the output weights. Unfortunately, KELM suffers from two drawbacks. First, as a special case of SLFNs, KELM has a shallow architecture that fails to completely extract the inherent features in raw data (particularly microarray data) like deep architectures [62]. Second, similar to SVM, the selection of the kernel function, such as a Gaussian kernel, polynomial kernel, or sigmoid kernel, and its hyperparameter greatly impact its performance with respect to the extraction of inherent features in raw data. However, no theoretical basis exists for selecting the kernel function objectively, and its hyper-parameter must be manually preset or tuned by cross-validation (CV), such as is the case for the kernel width in radial basis function (RBF) kernels. As a result, it remains unknown whether KELM can obtain optimal extraction results in the context of small sample size.

III. PROPOSED METHOD
The purpose of this study is to develop a high-performance TCM method based on KELM and current sensors, in which the drawbacks of current sensor signals are significantly reduced and the kernel function and its hyperparameter in KELM need not be preselected.

A. FRAMEWORK OF THE PROPOSED METHOD
The proposed TCM method employs a few key current sensor signal features based on the time, frequency, and time-frequency domains of the signals and an advanced monitoring model based on an improved KELM to achieve excellent TCM performance. Here, current sensors are deemed most appropriate due to their low cost and simple installation that has no effect on the milling process, while the selected multi-domain features, which are strongly correlated with tool wear condition, overcome the drawbacks associated with the use of current signals described in Section II-A and the loss of useful information related to tool condition when employing a single domain. The proposed TCM method is schematically illustrated in Figure 1. Its operation comprises three steps. The first step is current sensor signal acquisition, where the dynamic signals obtained from current sensors are collected to depict the characteristics of the milling process. The second step is feature extraction, where a few key statistical parameters in the time, frequency, and time-frequency domains of the current sensor signals are extracted. The last step involves monitoring the tool condition using an improved KELM. The second and third steps are discussed in detail in the following subsections.

B. FEATURE EXTRACTION
The three key statistical parameters associated with feature extraction include the average amplitude ($T_{\mathrm{avg}}$) of the spindle motor current in the time domain, the mean of the power spectrum ($F_{\mathrm{mps}}$) in the frequency domain, and the average wavelet energy of the first frequency band ($E_1$) obtained using the wavelet packet transform (WPT) with the db2 wavelet basis function in the time-frequency domain. Here, the WPT conducts a multi-level band division over the entire signal band, which inherits the advantages of good time-frequency localization from the wavelet transform (WT) and further decomposes the high-frequency band to increase the frequency resolution [63], [64]. As discussed in Section II, the current increases with increasing cutting force as tool wear becomes progressively severe. Therefore, changes in the values of $T_{\mathrm{avg}}$, $F_{\mathrm{mps}}$, and $E_1$ of the spindle motor current correspond approximately to changes in tool wear, as can be seen in Figures 2-4.
In the time domain, the value of $T_{\mathrm{avg}}$ is defined as follows:
$$T_{\mathrm{avg}} = \frac{1}{n} \sum_{i=1}^{n} |x_i| \tag{1}$$
where $x_i$ is the amplitude of the i-th current signal sample in a collection of n samples in the sample set. As an example, the relationship between the values of $T_{\mathrm{avg}}$ obtained for the AC spindle motor current and the tool wear values in the NASA milling dataset is shown in Figure 2. The results in the figure indicate that the peaks in tool wear with respect to cut number are correlated with the peak values of $T_{\mathrm{avg}}$, with a correlation coefficient R = 0.3779.
In the frequency domain, the value of $F_{\mathrm{mps}}$ is defined as
$$F_{\mathrm{mps}} = \frac{1}{n} \sum_{i=1}^{n} P_i \tag{2}$$
where $P_i$ is the power spectrum of the signal sample corresponding to $x_i$ obtained by the fast Fourier transform (FFT). The relationship between the value of $F_{\mathrm{mps}}$ obtained for the AC spindle motor current and the tool wear condition in the NASA milling dataset is shown in Figure 3. Again, the results in the figure indicate that the peaks in tool wear with respect to cut number are correlated with the peak values of $F_{\mathrm{mps}}$ (R = 0.6203).
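As a rough illustration of the time-domain and frequency-domain features described above, the following Python sketch computes an average amplitude and a mean power spectrum for a short signal. The function names, the use of absolute values in the average, the 1/n scaling of the power spectrum, and the naive DFT (standing in for an FFT) are assumptions made for this example rather than details confirmed by the text, and the test signal is synthetic.

```python
import cmath
import math


def average_amplitude(x):
    """Mean absolute amplitude of the samples (assumed form of T_avg)."""
    return sum(abs(v) for v in x) / len(x)


def power_spectrum(x):
    """Power per frequency bin, |X_k|^2 / n, from a naive O(n^2) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def mean_power_spectrum(x):
    """Mean of the power spectrum over all bins (assumed form of F_mps)."""
    p = power_spectrum(x)
    return sum(p) / len(p)


# Synthetic stand-in for one window of spindle current (not real data).
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
t_avg = average_amplitude(signal)
f_mps = mean_power_spectrum(signal)
```

With this scaling, Parseval's relation makes the mean power spectrum equal to the mean squared sample value, so a larger current amplitude raises both features. For real signals a library FFT would replace the quadratic-time loop.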
In the time-frequency domain, the value of $E_1$ is calculated using the db2 basis function as follows:
$$E_1 = \sum_{k} \left| d_{1,k} \right|^2, \qquad d_{1,k} = \int x(t)\, w_{1,k}(t)\, dt \tag{3}$$
where $d_{1,k}$ denotes the wavelet packet coefficients of signal $x(t)$, and $w_{1,k}(t)$ are the wavelet packets localized at $2k$ in the scale of 2. The relationship between the value of $E_1$ obtained for the AC spindle motor current and tool wear condition in the NASA milling dataset is shown in Figure 4. These results also indicate that the peaks in tool wear with respect to cut number are strongly correlated with the peak values of $E_1$.
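The band-energy idea above can be sketched in plain Python. This example assumes that the first frequency band is the low-pass (approximation) band of a one-level db2 decomposition, that boundaries are handled periodically, and that the energy is the sum of squared coefficients. None of these choices is confirmed by the text, and the names `DB2_LOWPASS`, `db2_first_band`, and `band_energy` are hypothetical.

```python
import math

# Daubechies-2 (db2) scaling filter: four taps with unit energy.
_S3 = math.sqrt(3.0)
DB2_LOWPASS = [
    c / (4.0 * math.sqrt(2.0)) for c in (1 + _S3, 3 + _S3, 3 - _S3, 1 - _S3)
]


def db2_first_band(x):
    """One-level db2 low-pass coefficients with periodic boundary handling."""
    n = len(x)
    if n % 2:
        raise ValueError("signal length must be even")
    return [
        sum(h * x[(2 * k + m) % n] for m, h in enumerate(DB2_LOWPASS))
        for k in range(n // 2)
    ]


def band_energy(coeffs):
    """Sum of squared coefficients in one band (assumed form of E_1)."""
    return sum(c * c for c in coeffs)


# A constant signal keeps all of its energy in the low band,
# while a sample-to-sample alternating signal leaves none there.
low_energy_constant = band_energy(db2_first_band([1.0, 1.0, 1.0, 1.0]))
low_energy_alternating = band_energy(db2_first_band([1.0, -1.0, 1.0, -1.0]))
```

A full wavelet packet transform would repeat this filtering on both the low-pass and high-pass outputs at each level; a dedicated wavelet library is the more practical choice for that.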

C. IMPROVED KERNEL EXTREME LEARNING MACHINE
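The KELM problem can be expressed as minimizing an objective function [61]:
$$\min_{\beta}\ \frac{1}{2}\|\beta\|_F^2 + \frac{C}{2}\sum_{i=1}^{n}\varepsilon_i^2 \quad \text{s.t.} \quad \varphi(x_i)^{T}\beta = y_i - \varepsilon_i,\quad i = 1, \ldots, n \tag{4}$$
where $\beta = [\beta_1\ \beta_2\ \ldots\ \beta_L]^{T}$ is the vector of output weights between the L nodes of the hidden layer and the output node, $\|\cdot\|_F$ is the Frobenius norm, $\varepsilon_i$ is the training error of the i-th training sample, C is the regularization parameter that facilitates a tradeoff between the norm of output weights and training errors, $(X, Y) = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ is the training sample set, and $\varphi(x_i) = [\varphi_1(x_i), \ldots, \varphi_L(x_i)]^{T}$ is the hidden-layer output vector with respect to $x_i$. Here, $\varphi(\cdot)$ is a form of feature mapping that maps the input data from the original dimension space to the L-dimensional hidden-layer feature space. The optimal value of $\beta$, denoted $\hat{\beta}$, that minimizes Eq. (4) can be efficiently solved as follows:
$$\hat{\beta} = \Phi^{T}\left(\frac{I}{C} + \Phi\Phi^{T}\right)^{-1} Y \tag{5}$$
where $\Phi = [\varphi(x_1), \ldots, \varphi(x_n)]^{T}$ is the hidden layer output matrix, I is an identity matrix, and Y is the dependent value vector in the training samples. Because the sensor signals in TCM are high-dimensional, nonlinear, and heterogeneous, the feature mapping $\varphi(\cdot)$ is unknown. Therefore, we define a kernel matrix for the extreme learning machine (ELM) using Mercer's conditions, as follows:
$$\Omega = \Phi\Phi^{T}, \qquad \Omega_{ij} = \varphi(x_i)^{T}\varphi(x_j) = k(x_i, x_j) \tag{6}$$
Then, the prediction score for test point x is determined as follows:
$$\hat{y}(x) = \varphi(x)^{T}\hat{\beta} = \begin{bmatrix} k(x, x_1) & \cdots & k(x, x_n) \end{bmatrix}\left(\frac{I}{C} + \Omega\right)^{-1} Y \tag{7}$$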
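The closed-form training and prediction steps described above can be sketched as follows, assuming the standard KELM solution in which the output weights reduce to a linear solve against the regularized kernel matrix. The Gaussian kernel, its `width` parameter, the helper names, and the toy data are illustrative assumptions only; this is plain single-layer KELM, not the improved model proposed in the paper.

```python
import math


def gaussian_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel; `width` is the hyperparameter that needs tuning."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def kelm_fit(xs, ys, kernel, c_reg):
    """Return alpha = (I/C + Omega)^{-1} Y for inputs xs and targets ys."""
    n = len(xs)
    omega = [[kernel(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        omega[i][i] += 1.0 / c_reg
    return solve(omega, ys)


def kelm_predict(x, xs, alpha, kernel):
    """Prediction score: sum over i of k(x, x_i) * alpha_i."""
    return sum(kernel(x, xi) * ai for xi, ai in zip(xs, alpha))


# Toy one-dimensional data (illustrative only, not taken from the paper).
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 1.0, 2.0, 3.0]
alpha = kelm_fit(train_x, train_y, gaussian_kernel, c_reg=1e6)
fitted = [kelm_predict(x, train_x, alpha, gaussian_kernel) for x in train_x]
```

A large C weights the training error heavily, so the fitted values nearly reproduce the targets; a smaller C shrinks the solution and smooths the fit. The feature map never appears explicitly, only kernel evaluations.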
In this context, which is similar to the context of SVM, $\varphi(\cdot)$ need not be known. Instead, a common kernel function, such as a Gaussian, linear, or polynomial kernel, can be used. In addition, the dimensionality L of the feature space (number of hidden nodes) need not be explicitly given. However, as discussed in Section II, KELM underperforms with respect to the extraction of inherent features in raw data. The discussed drawbacks of KELM are addressed in the present work by proposing an improved KELM denoted as two-layer angle KELM (TAKELM), which introduces an angle kernel function to avoid manual presets or tuning of the kernel function hyperparameter. The TAKELM architecture is illustrated in Figure 5. In detail, the input layer in the training phase consists of the independent variables $x_i$ of the training samples X, and each variable $x_i$ includes $3 \times N_s$ feature parameters, where $N_s$ denotes the number of applied current sensors. The two hidden layers consist of the optimization of two output weight vectors, where one is the weight between hidden layer 1 and the hidden input layer assigned according to Eq. (5) using the angle kernel $k_0$, and the other is the weight between hidden layer 2 and the output layer assigned according to Eq. (5) using the angle kernel $k_1$. The output layer consists of the tool wear values $y_i$ corresponding to $x_i$. In the testing phase, the input layer consists of one independent variable $x'$ to be predicted in the testing samples, which includes nine features (i.e., three features per sensor for three current sensors), and the output layer is the predicted tool wear values $y'$ assigned according to Eq. (7).
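Cho and Saul introduced a new kernel function denoted as the arc-cosine kernel, which mimics the process flow in large, multilayer neural nets [65]. The angle kernel function measures the similarity between two vectors through their angle. Let $\theta$ denote the angle between vectors x and y:
$$\theta = \cos^{-1}\left(\frac{x \cdot y}{\|x\|\,\|y\|}\right) \tag{8}$$
A general n-th order kernel function in this family can be written as follows:
$$k_n(x, y) = \frac{1}{\pi}\,\|x\|^{n}\,\|y\|^{n}\,J_n(\theta) \tag{9}$$
where the angular dependences are captured by the functions $J_n(\theta)$. These functions are given by
$$J_n(\theta) = (-1)^{n}(\sin\theta)^{2n+1}\left(\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\right)^{n}\left(\frac{\pi - \theta}{\sin\theta}\right) \tag{10}$$
In general, $J_n(\theta)$ takes its maximum value at $\theta = 0$ and decreases monotonically to zero at $\theta = \pi$. The first two expressions of $J_n(\theta)$ are given as follows:
$$J_0(\theta) = \pi - \theta \tag{11}$$
$$J_1(\theta) = \sin\theta + (\pi - \theta)\cos\theta \tag{12}$$
However, the angular dependence is more complicated for n > 1, which could affect the learning speed of TAKELM. Therefore, the kernel function is truncated at n = 1 in the present work. We also note the absence of any continuous tuning parameter in Eqs. (11) and (12), which avoids the drawback associated with manually presetting or CV tuning of a hyperparameter. Thus, the $k_0$ and $k_1$ angle kernels are, respectively, used as the kernel functions of the first and second hidden layers in TAKELM, as shown in Figure 5. These kernels are given as follows:
$$k_0(x, y) = 1 - \frac{\theta}{\pi} \tag{13}$$
$$k_1(x, y) = \frac{1}{\pi}\,\|x\|\,\|y\|\left(\sin\theta + (\pi - \theta)\cos\theta\right) \tag{14}$$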
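The two angle kernels discussed above can be evaluated with a few lines of Python. The sketch below assumes the standard zeroth-order and first-order arc-cosine kernel forms of Cho and Saul; the function names are hypothetical, and the way TAKELM stacks the two kernels across its hidden layers is not reproduced here.

```python
import math


def _norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(a * a for a in v))


def angle(x, y):
    """Angle between two nonzero vectors, with the cosine clipped to [-1, 1]."""
    dot = sum(a * b for a, b in zip(x, y))
    cos_theta = max(-1.0, min(1.0, dot / (_norm(x) * _norm(y))))
    return math.acos(cos_theta)


def k0(x, y):
    """Zeroth-order arc-cosine kernel: 1 - theta / pi."""
    return 1.0 - angle(x, y) / math.pi


def k1(x, y):
    """First-order arc-cosine kernel, which also depends on the vector norms."""
    theta = angle(x, y)
    j1 = math.sin(theta) + (math.pi - theta) * math.cos(theta)
    return _norm(x) * _norm(y) * j1 / math.pi


# Parallel, orthogonal, and opposite unit vectors.
same = k0([1.0, 0.0], [1.0, 0.0])
orthogonal = k0([1.0, 0.0], [0.0, 1.0])
opposite = k0([1.0, 0.0], [-1.0, 0.0])
```

Neither function takes a tunable argument, which is the property the text relies on: unlike a Gaussian kernel width, nothing here has to be preset or cross-validated. Clipping the cosine guards against rounding errors that would push it slightly outside the domain of `math.acos`.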

IV. COMPARATIVE VALIDATION USING BENCHMARK MILLING DATA
A. DESCRIPTION OF MILLING DATASET
The NASA milling dataset employed for validation testing of the proposed TCM method was obtained from the Matsuura machining center (MC-510V) during dry rough milling processes of cast iron or stainless steel J45 workpieces using a six-tooth face milling cutter with KC710 carbide inserts under different cutting parameters. The individual milling conditions are listed in Table 1. The parameter selections were guided by industrial applicability and recommended manufacturer settings. Therefore, the cutting speed was set to 200 m/min and the spindle speed was 826 rpm. Two different depths of cut were selected, that is, 1.5 mm and 0.75 mm, and two feed rates were considered, that is, 0.0833 mm/tooth and 0.0417 mm/tooth. The dataset includes sensor signals obtained from two vibration accelerometers, two AE sensors, and two CTA 213 current sensors clamped on the cable connectors for measuring the AC current of the AC spindle motor and the DC current of the DC spindle motor. Sampling of the sensor signals was conducted using LabVIEW, and the signals were directly transmitted to a computer for storage. Each experimental case was initiated with a new cutting tool, and the flank wear of each of the six inserts was measured offline based on optical microscopy imaging after each surface of the workpiece was completely finished. The flank wear associated with the insert obtaining the maximum flank wear value was used as the tool wear value. Here, a completely milled surface represents a single milling stage, and the number of milling stages varied depending upon the milling parameters. Of the total number of 16 cases given in the dataset, a total of 17 stages, including five stages in the 6-th case, six stages in the 8-th case, and six stages in the 16-th case, were not considered in this paper due to incomplete measured tool wear data. This yielded a total of 150 complete tool wear condition experimental datasets for the 13 cases listed in Table 1.
It should be noted that the reduction in the total sample size from 167 to 150 has little impact on the proposed algorithm, which is designed for use with small sample sizes. In addition, the cutting parameters of the 6-th, 8-th, and 16-th cases are the same as those of the 15-th, 14-th, and 5-th cases, respectively. As a result, the loss of these three cases did not reduce the number of milling condition types in the samples.

B. ANALYSIS AND RESULTS
The performance of the proposed TCM method was evaluated for both workpiece materials by dividing the sample set corresponding to the two materials into training samples and test samples according to the different cutting depths and feed rates. As can be seen in Table 1, cast iron workpieces are employed in 8 cases (1-4 and 9-12) and stainless steel J45 workpieces are employed in 5 cases (5, 7, 13-15).
Therefore, the training set for cast iron was selected from the 2-nd, 4-th, 9-th, and 11-th cases while the training set for stainless steel J45 was selected from the 5-th and 13-th cases, and the remaining cases (i.e., 1, 3, 10, and 12 for cast iron and 7, 14, and 15 for stainless steel J45) were used as the testing set. As such, the training set included data obtained for 74 milling stages, while the testing set included that for 76 milling stages. In addition, the regularization parameter C in TAKELM was optimized using a 10-fold CV method. The ground truth tool wear values and the predicted values obtained from the proposed TCM method based on TAKELM are shown in Figure 6. Here, the wear results obtained for the 7 tools employed in cases 1, 3, 7, 10, 12, 14, and 15 are listed in sequence. Qualitatively, the tool wear prediction results are observed to agree well with the ground truth tool wear data. In addition, TCM methods based on least squares SVM (LS-SVM) and KELM with current sensor signals were applied to the NASA milling dataset, and the Gaussian kernel was selected as the kernel function in LS-SVM and KELM. The regularization parameter C and the hyperparameter h of the Gaussian kernel were optimized using a 10-fold CV method. The tool wear prediction performances of the TCM methods based on LS-SVM, KELM, and TAKELM were evaluated according to several performance metrics including the mean absolute error (MAE), root mean square error (RMSE), and correlation coefficient (R). The results for the three TCM methods are presented in Table 2. A comparison of the results indicates that the proposed TCM method based on TAKELM provides superior tool wear prediction from the standpoint of all performance metrics considered.
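The three performance metrics named above can be computed as in the following sketch. It assumes that R denotes the Pearson correlation coefficient between measured and predicted wear, which the text does not state explicitly; the function names are illustrative, and the sample values are placeholders rather than results from Table 2.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Placeholder wear values in mm (not measurements from the paper).
measured = [0.10, 0.20, 0.35, 0.50]
predicted = [0.12, 0.18, 0.37, 0.48]
scores = (mae(measured, predicted), rmse(measured, predicted), pearson_r(measured, predicted))
```

MAE and RMSE are in the same units as the wear values, with RMSE penalizing large individual errors more heavily, whereas R is unitless and only measures how well the predictions track the trend.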

V. EXPERIMENTAL INVESTIGATIONS
A. DESCRIPTION OF EXPERIMENT
The experimental setup employed for conducting TCM under various milling conditions is shown in Figure 7.
A three-axis VDL850A Vertical Machining Center (Dalian Machine Tools Group, Dalian, China) was used for the experiments. The cutting tool used in the experiments was an uncoated three-tooth tungsten steel end milling cutter (diameter 10 mm), and the workpiece material was #45 steel (carbon content is about 0.45%). Because the motor was a three-phase motor, three current sensors were clamped on the motor wires to measure the currents of the three phases of the motor. In addition, several accelerometers were mounted on the spindle and table for other research purposes. As shown in Figure 8(a), the sensor signals were collected during tool-wear testing at a continuous sampling frequency of 12 kHz using an Avant MI-7016 data-acquisition instrument (Econ Technologies Co., Ltd., Hangzhou, China) and stored on a personal computer. As shown in Figure 8(b), the wear of each individual flute of the cutting tool was measured offline using a GP-300C microscope (Gaopin Precise Instrument Co., Ltd., Suzhou, China) each time after completely finishing a workpiece surface. The workpieces were uniformly sized such that a surface was completely finished after five cuts, that is, three forward cuts and two backward cuts. It is noteworthy that we found the influence of the length of rake face wear (KB) on the surface roughness of the workpiece after milling was greater than that of flank wear (VB) and the depth of rake face wear (KT). Therefore, KB was employed as the tool wear criterion in the experiments, and the tool wear value after each cutting stage was defined as the maximum KB value of the three teeth. Figure 9 illustrates the progression of tool wear after finishing a single workpiece surface 1, 5, and 10 times (i.e., 1, 5, and 10 milling stages). Figure 10 presents the tool wear value with respect to cutting time. It can be seen that the tool wear varies greatly under the same feed rate conditions.
A total of 14 operational conditions were generated with a random combination of three cutting parameters: spindle speed (2300, 2400, and 2500 rpm), depth of cut (0.4, 0.5, and 0.6 mm), and feed rate (0.058, 0.065, and 0.072 mm/tooth). The operational parameters of each experimental case are listed in Table 3. Each experimental case was initiated with a new tool, and varying numbers of milling stages were conducted until the degree of measured tool wear was at least 1.7 mm. Each milling stage consisted of five cutting passes for finishing a surface (i.e., three times forward and two times back). Each experimental case involved 10 milling stages, except for the 7-th case (11 stages) and the 14-th case (7 stages), which represented data collected for a total of 138 milling TCM samples.

B. ANALYSIS AND RESULTS
The signals obtained from the three current sensors corresponding to the last cut in each milling stage were selected to extract nine feature parameters (i.e., three feature parameters per sensor for three current sensors).
Considering the sample sizes available for the different cutting parameters, the sample set was divided into training samples and test samples according to cutting depth and spindle speed. The training set included cases in which the spindle speed was 2300 rpm or 2400 rpm and the cutting depth was 0.4 mm or 0.6 mm, while the testing set included the cases in which the spindle speed was 2500 rpm or the cutting depth was 0.5 mm. Thus, the 1-st, 2-nd, 4-th, 5-th, 6-th and 8-th cases were selected for the training set (Table 3), and the remaining cases (i.e., 3, 7, and 9-14) were used as the testing set. Therefore, the sample sizes of the training and testing sets were 60 and 78, respectively. In addition, the regularization parameter C in TAKELM was optimized using a 10-fold CV method.
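The stated sample sizes follow directly from the case assignment and the number of milling stages per case. The short Python sketch below reproduces this bookkeeping; the variable and function names are illustrative assumptions and are not taken from the original study.

```python
# Milling stages per experimental case, as stated in the text:
# 10 stages per case, except case 7 (11 stages) and case 14 (7 stages).
stages_per_case = {case: 10 for case in range(1, 15)}
stages_per_case[7] = 11
stages_per_case[14] = 7

# Case-level split reported in the text; the remaining cases form the testing set.
train_cases = [1, 2, 4, 5, 6, 8]
test_cases = [case for case in stages_per_case if case not in train_cases]


def sample_ids(cases):
    """Return one (case, stage) identifier per milling stage, i.e., per TCM sample."""
    return [
        (case, stage)
        for case in cases
        for stage in range(1, stages_per_case[case] + 1)
    ]


train_samples = sample_ids(train_cases)
test_samples = sample_ids(test_cases)

print(len(train_samples), len(test_samples))  # prints: 60 78
```

Splitting at the level of whole cases, rather than individual milling stages, keeps all samples from a given tool in the same set, so the testing set contains only cutting conditions that were not seen during training.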
The ground truth tool wear values and the predicted values obtained from the proposed TCM method based on TAKELM for the testing set are shown in Figure 11. We note here that the prediction results agree particularly well with the ground truth data. As in Section IV, the proposed TAKELM method was compared with LS-SVM and KELM. The Gaussian kernel was selected as the kernel function in LS-SVM and KELM, and the regularization parameter and the hyperparameter of the Gaussian kernel were optimized using a 10-fold CV method. The MAE, RMSE, and R values obtained for these TCM methods are listed in Table 4. A comparison of the results in the table indicates that the proposed TCM method based on TAKELM provides superior tool wear prediction from the standpoint of all performance metrics considered. In fact, the MAE and RMSE values obtained when using the proposed method are sufficiently small as to represent a practically negligible prediction error.
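For reference, the three performance metrics can be computed as in the following minimal Python sketch. It assumes that R denotes the Pearson correlation coefficient between the measured and predicted tool wear values; the function name is hypothetical, and the numerical values in the usage example are purely illustrative rather than experimental data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R) for paired measured and predicted wear values.

    R is taken here to be the Pearson correlation coefficient (an assumption).
    It is undefined when either sequence is constant.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(var_t * var_p)

    return mae, rmse, r


# Illustrative wear values in mm (not taken from the experiments).
measured = [0.2, 0.6, 1.0, 1.4]
predicted = [0.3, 0.5, 1.1, 1.3]
mae, rmse, r = regression_metrics(measured, predicted)
```

MAE and RMSE are expressed in the same unit as the tool wear value (mm), with RMSE penalizing large individual errors more heavily, whereas R is dimensionless and measures only how closely the predictions track the trend of the measured wear.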

VI. CONCLUSION
This study proposed a TCM method employing a few key current sensor signal features based on the time, frequency, and time-frequency domains of the signals and an advanced monitoring model based on an improved KELM to achieve excellent TCM performance for monitoring milling processes. Current sensors were deemed to be most appropriate owing to their low cost and simple installation, which has no effect on the milling process, while the selected multi-domain features, which are strongly correlated with tool wear condition, overcome the loss of useful information related to tool condition when employing a single domain. The proposed TAKELM employs a two-layer network structure and an angle kernel function that includes no hyperparameter. This approach overcomes the difficulty that KELM has in learning the features of complex nonlinear data and avoids the need for preselecting the kernel function and its hyperparameter. The prediction performance of the proposed method was verified by its application to the open-access benchmark NASA milling dataset and separate TCM experiments in comparison with TCM methods based on LS-SVM and KELM. The results demonstrate that the proposed method outperforms the methods based on KELM and LS-SVM and obtains prediction results with very small errors. As such, the proposed TCM method achieves excellent monitoring performance using only a few key signal features of current sensors.
It must be noted that the NASA dataset and the separate TCM experiments were both limited to cases involving low spindle speeds (i.e., 826 and 2300-2500 rpm, respectively), and only gradual changes in tool condition were observed. Therefore, further investigation is needed to verify the effectiveness of the proposed TAKELM method in high-speed milling and in monitoring other tool conditions (e.g., chipping and breakage).