Enhancing Bearing Fault Diagnosis Using Transfer Learning and Random Forest Classification: A Comparative Study on Variable Working Conditions

Rotating machines require bearings to operate smoothly. However, wear, misalignment, and poor lubrication can degrade bearings over time. Fault diagnosis models identify and classify bearing faults, but a model trained under one working condition may not perform well under others. Real-world datasets mix various working conditions; therefore, a model must be validated on datasets recorded under different working conditions. In this study, raw vibration accelerometer data recorded under variable working conditions are preprocessed using a window-length-and-stride method to generate a data format suitable for evaluating the proposed model. The model employs a transfer-learning-based VGG16 network as the feature extractor and a random forest as the classifier, and this combination proved highly effective. The proposed fault diagnosis model adapts to different work environments and enhances fault classification under variable working conditions. Its performance is evaluated using a confusion matrix heatmap, a t-SNE plot, a precision-recall curve, and a learning curve. The results indicate that the model performs well compared to others. The overall accuracy of the model is 99.90%, and both training and testing are fast. The learning curve evaluation indicates that the model does not suffer from over- or under-fitting. Overall, the model is reliable, suitable for classifying bearing faults under different working conditions, and usable for real-world purposes.


I. INTRODUCTION
The heart of any industrial entity or manufacturing facility is the machine. There are numerous varieties of machinery available in the industry. The profitability of any manufacturing facility is extremely reliant on the time machines are operational. In order to increase the company's profit margin, it is crucial to decrease outages, as maintenance expenses account for approximately 15-20% of total production costs [1]. It is a fact, however, that nearly 30% of maintenance expenditures are a waste of money due to improper maintenance strategy implementation and failure to conduct maintenance at the appropriate times. The unexpected failure of machine components may result in significant production losses. Mechanical systems rely heavily on the reliable and safe operation of rotating machinery due to its widespread importance in industry [2]. Although every machine is necessary for the operation of the plant, only 5-10% of machinery is deemed critical. Examples of essential components are motors, compressors, turbines, boilers, and generators. According to statistics, electrical motors are utilized as prime movers in over 90% of mechanical equipment, including gearboxes, compressors, and pumps [3]. Rolling element bearings, one of the most crucial parts of an electric motor, have a major effect on performance and dependability [4], [5]. Bearing-related faults, stator winding-related faults, and rotor-related faults are the most common types of motor failure. Nearly 40% of these defects are attributed to bearing-related problems [6]. However, these bearings are frequently subjected to harsh environments during operation, including high speeds, temperatures, and pressures, all of which can cause malfunctions and breakdowns. Preventing major accidents and costly economic losses requires prompt and precise detection of bearing defects [7], [8]. Therefore, working on reliable techniques for identifying bearing faults is important.

Model-driven and data-driven approaches can be used for defect identification [9], [10]. Some model-driven techniques are vibration, temperature, and wear debris analysis. Vibration analysis is the most useful of these techniques since it may provide plenty of information about anomalies [11]. Establishing the physical model is difficult without prior understanding of the physical structure, so model-driven approaches present a significant challenge. On the other hand, data-driven approaches do not need bearing expertise to identify problems using the vibration signals collected from sensors. Fault identification relies heavily on carefully selecting vibration signal features [12].

Conventional data-driven fault diagnostic models use machine learning algorithms, classifiers, and signal processing techniques [13], [14], [15]. For example, using manifold learning and wavelet neural networks, Wu et al. successfully classified faults in a diagnostic setting [16]. Cerrada et al. proposed a fault diagnosis methodology that combines a genetic algorithm and random forest [17]. Fault diagnostic techniques based on deep neural networks have also been incorporated. Jia et al. [18] trained a deep neural network with frequency spectra collected from vibration data to detect problems. The use of stacked denoising autoencoders in health state recognition was studied by Lu et al. [19]. A model for bearing diagnosis using an adaptive convolutional neural network was proposed by Guo et al. [20]. Ren et al. [21] used a model based on deep neural networks to estimate the remaining service life of rolling bearings. Using Deep Boltzmann Machines, Deep Belief Networks, and Stacked Autoencoders, Chen et al. [22] were able to detect faults in rolling bearings. Gaussian-Bernoulli deep Boltzmann machines were created by Li et al. [23] for signal analysis and feature learning. A deep neural network-based fault diagnostic model that could learn directly from time series data without the need for signal preprocessing was suggested by Zhang et al. [24].

However, the performance of these conventional data-driven approaches using machine learning algorithms is constrained to the same feature space, distribution, and working conditions. They are unable to adapt to the ever-changing work environment and its plethora of data sources. In practical situations, the operating conditions of machinery, particularly bearings, are variable. For example, a growing fault diameter makes a constant radial load impossible to maintain. Traditional diagnostic methods rely on manual feature extraction techniques, demanding domain expertise and significant time investment. Moreover, these methods often struggle to capture complex nonlinear relationships in the data [25]. Transfer learning has become popular in bearing fault diagnosis systems as a way to address these issues. Transfer learning uses models such as VGG16, pre-trained on enormous datasets, as feature extractors that have already learned important representations. A random forest classifier then classifies faults using the extracted features. This method improves generalization and feature extraction and reduces manual labor. Because they can extract relevant features from complex data, deep learning models such as VGG16 are well suited to bearing fault diagnosis. The pre-trained VGG16 model can readily identify vibration signal patterns and discriminative properties.

This study uses VGG16 feature extraction and a random forest classifier to identify bearing defects. It employs transfer learning to enhance diagnostic performance and minimize overfitting resulting from insufficient training data. The bearing defect dataset from CWRU is used to evaluate this method. Several criteria, such as confusion matrix analysis, training and testing duration, the precision-recall curve, the t-SNE (t-distributed Stochastic Neighbor Embedding) plot, and comparisons to existing methods, are used to evaluate the efficacy of this method. In addition, this study uses learning curve visualization to understand how the model's learning ability changes with the number of training samples. The results of this study demonstrate that the proposed methodology is effective in accurately detecting bearing faults under diverse operational conditions. The contributions of the research can be extended beyond bearing fault diagnosis. By employing transfer learning and random forest classification, the model exemplifies the potential of combining deep learning techniques with traditional machine learning algorithms. Knowledge gained from this research can be used to make educated decisions during the fault diagnostic process for a wide variety of mechanical and electrical parts. The following sections discuss the methodology of the proposed model, the experimental setup, the data for the different working conditions, the analysis of results, and the findings.

The associate editor coordinating the review of this manuscript and approving it for publication was Orazio Gambino.

II. THEORETICAL BACKGROUND OF THE PROPOSED MODEL

A. TRANSFER LEARNING AND VGG 16
Transfer learning is a technique in which a model is pre-trained on a massive dataset. Transfer learning mainly consists of two components: pre-trained and transfer networks. During pre-training, the model learns to recognize essential features, which helps it classify complex data patterns easily. Because of its pre-trained network, transfer learning helps the model remain efficient when dealing with small-scale datasets. It also reduces dependence on computational resources and training duration.

The VGG-16 model architecture is illustrated in Fig. 1(a), and the proposed model architecture, which is discussed in Section III-B, is depicted in Fig. 1(b). VGG-16 is composed of thirteen convolutional layers, two fully connected layers, and one SoftMax classifier. This sixteen-layer network was constructed by Karen Simonyan and Andrew Zisserman, who stacked 3 × 3 convolutional layers sequentially to keep the architecture simple [26]. The first two convolutional layers of the network each have 64 feature kernel filters with a filter size of 3 × 3. When an RGB image with a depth of 3 is passed through these layers, the output dimensions become 224 × 224 × 64. This output is forwarded to a max pooling layer with a stride of 2 [26]. The third and fourth convolutional layers each have 128 feature kernel filters with a filter size of 3 × 3. They are followed by a max pooling layer with a stride of 2, which reduces the output dimensions to 56 × 56 × 128. The fifth, sixth, and seventh layers are convolutional layers with a kernel size of 3 × 3, each comprising 256 feature maps. They are followed by a max pooling layer with a stride of 2 [26]. The eighth to thirteenth layers consist of two sets of convolutional layers with a kernel size of 3 × 3, and every layer in these sets comprises 512 kernel filters. Each set is followed by a max pooling layer with a stride of 2. The fourteenth and fifteenth layers are fully connected hidden layers of 4096 units each, and the sixteenth layer is a SoftMax output layer of 1000 units [26].
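The layer-by-layer dimensions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a 224 × 224 input, "same"-padded 3 × 3 convolutions that keep the spatial size, and 2 × 2 max pooling with stride 2 that halves it; the names `VGG16_BLOCKS` and `trace_vgg16_shapes` are hypothetical helpers introduced for illustration only.

```python
# Hypothetical helper: traces output shapes through the VGG-16 convolutional blocks.
# Assumes "same"-padded 3x3 convolutions (spatial size unchanged) and
# 2x2 max pooling with stride 2 (spatial size halved).

# (number of convolutional layers, number of filters) for each block
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg16_shapes(height=224, width=224):
    """Return a list of (stage name, (height, width, channels)) tuples."""
    shapes = []
    for index, (_, filters) in enumerate(VGG16_BLOCKS, start=1):
        # Convolutions change only the channel count.
        shapes.append((f"block{index}_conv", (height, width, filters)))
        # Pooling halves height and width and keeps the channel count.
        height, width = height // 2, width // 2
        shapes.append((f"block{index}_pool", (height, width, filters)))
    return shapes


shapes = trace_vgg16_shapes()
for name, shape in shapes:
    print(f"{name:12s} -> {shape}")

# Total number of convolutional layers across all blocks.
n_conv_layers = sum(n_convs for n_convs, _ in VGG16_BLOCKS)
```

Under these assumptions the final pooling stage yields a 7 × 7 × 512 tensor, which is the representation a downstream classifier would receive when the fully connected layers are removed.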
Each layer of VGG16 is now discussed in detail to give a clear picture of the model. Fig. 2 depicts the convolutional layer of the VGG16 model. In the convolutional layer, a kernel matrix is applied to the input matrix to generate a feature map for the succeeding layer. The kernel matrix is slid across the input matrix as part of a mathematical operation known as convolution. At each position, an element-wise matrix multiplication is performed, and the resulting values are summed to create one entry of the feature map. Convolution is a specialized linear operation used extensively in numerous fields, including image processing, statistics, and physics, and it can be applied along multiple axes. For a two-dimensional input image I and a two-dimensional kernel filter K, the convolved image S is, in the standard form of the operation,

$$S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m, j - n) \tag{1}$$
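The sliding multiply-and-sum operation described above can be sketched in plain Python. This is an illustrative sketch, not the study's implementation: it assumes no padding and a stride of 1, and the function names and the 4 × 4 input are made up for the example. Deep learning libraries typically slide the kernel without flipping it (cross-correlation); flipping the kernel by 180 degrees first gives the textbook convolution.

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Element-wise multiplication of the window and the kernel, then summation.
            total = 0
            for m in range(k_rows):
                for n in range(k_cols):
                    total += image[i + m][j + n] * kernel[m][n]
            row.append(total)
        feature_map.append(row)
    return feature_map


def flip_kernel(kernel):
    """Rotate the kernel by 180 degrees; correlating with it gives a true convolution."""
    return [row[::-1] for row in kernel[::-1]]


# Made-up input and kernel, for illustration only.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 2],
    [0, -1],
]

feature_map = correlate2d_valid(image, kernel)              # what CNN layers compute
convolved = correlate2d_valid(image, flip_kernel(kernel))   # textbook convolution
```

For learned kernels the distinction between the two variants is immaterial, because the network can learn either orientation of the filter.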
The activation function is a node that follows the convolutional layer and transforms the input signal nonlinearly. The rectified linear unit (ReLU) is a piecewise linear function that returns the input value if it is positive and returns zero otherwise [26]. A disadvantage of the convolutional layer's feature map output is that it records the precise position of input features. This means that even minor changes to the input image, such as cropping or rotation, can result in a different feature map. Down-sampling the output of the convolutional layers addresses this issue. After the nonlinearity layer, a pooling layer can be applied to accomplish the down-sampling. Pooling creates a representation that is approximately invariant to minor input translations: if the input is translated slightly, the majority of the pooled outputs retain their original values. A schematic of the pooling layer is shown in Fig. 3.

The output of the final pooling layer serves as the input to the fully connected layers in the final stage of a convolutional neural network. There may be one or more of these layers. The term "fully connected" means that each node in one layer is connected to each node in the next [26]. A schematic of the VGG16 fully connected layer is depicted in Fig. 4.
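The ReLU activation and max pooling steps can be illustrated with a small sketch. It assumes a 2 × 2 pooling window with stride 2 and uses a made-up 4 × 4 feature map; the function names are hypothetical. The last lines move one activation by a single pixel inside its pooling window to show that the pooled output does not change.

```python
def relu(feature_map):
    """Return the input value where it is positive and zero otherwise."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - size + 1, stride):
        pooled_row = []
        for j in range(0, cols - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Made-up feature map, for illustration only.
feature_map = [
    [-1, 5, 0, 2],
    [3, -2, 1, 0],
    [0, 1, -4, 6],
    [2, 0, 3, -1],
]

activated = relu(feature_map)
pooled = max_pool2d(activated)

# Move the strongest activation of the top-left window one pixel to the left.
shifted = [row[:] for row in activated]
shifted[0][0], shifted[0][1] = shifted[0][1], shifted[0][0]
pooled_after_shift = max_pool2d(shifted)
```

Because the maximum of the window is unchanged by the one-pixel move, `pooled_after_shift` equals `pooled`, which is the approximate translation invariance described above.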

B. RANDOM FOREST
The random forest classifier is a combination of tree classifiers, each generated using a random vector sampled independently from the input vector. Each tree then casts a unit vote for the most popular class in order to classify an input vector [27]. Fig. 5 depicts the random forest classifier. The Information Gain Ratio criterion [28] and the Gini Index are commonly employed attribute selection measures in decision tree induction. In the random forest classifier, the Gini Index is used to measure the impurity of an attribute with respect to the classes. For two classes, the Gini index can be expressed as

$$Gini = 1 - \left(P_{+}^{2} + P_{-}^{2}\right) \tag{2}$$

where $P_{+}$ is the probability of the positive class and $P_{-}$ is the probability of the negative class [29]. The random forest classifier grows each tree to its maximum depth using a combination of features. One notable benefit of the random forest classifier compared to other decision tree techniques, such as the one introduced by Quinlan [28], is that the fully grown trees are not pruned.
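As a small worked example of the Gini measure, the sketch below computes the impurity of a set of class labels and the size-weighted impurity after a candidate split. It generalizes the two-class case to any number of classes by summing the squared class probabilities; the function names and the label lists are illustrative assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """One minus the sum of squared class probabilities (0.0 for a pure node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def weighted_gini_after_split(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# Made-up labels, for illustration only.
parent = ["normal"] * 5 + ["faulty"] * 5
left = ["normal"] * 4 + ["faulty"] * 1
right = ["normal"] * 1 + ["faulty"] * 4

parent_gini = gini_impurity(parent)                  # evenly mixed node
split_gini = weighted_gini_after_split(left, right)  # purer children
```

A tree-growing procedure would prefer the split that lowers the weighted impurity the most relative to the parent node.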
According to existing research, the performance of tree-based classifiers is influenced more by the choice of pruning method than by the attribute selection measure [30], [31]. According to Breiman [27], the generalization error always converges as the number of trees increases, even in the absence of tree pruning. Overfitting is therefore not a concern, owing to the Strong Law of Large Numbers. The random forest classifier requires two user-defined parameters: the number of features used at each node to generate a tree and the number of trees to be grown. The importance of a feature within a single tree can be calculated as

$$fi_{i} = \frac{\sum_{j \,:\, \text{node } j \text{ splits on feature } i} ni_{j}}{\sum_{k \in \text{all nodes}} ni_{k}} \tag{3}$$

where $fi_{i}$ is the importance of feature $i$ and $ni_{j}$ is the importance of node $j$. This value can be normalized over all features:

$$normfi_{i} = \frac{fi_{i}}{\sum_{j \in \text{all features}} fi_{j}} \tag{4}$$

The final feature importance of the random forest is obtained by summing the normalized importance over the trees and dividing by the total number of trees:

$$RFfi_{i} = \frac{\sum_{j \in \text{all trees}} normfi_{ij}}{T} \tag{5}$$

where $RFfi_{i}$ is the importance of feature $i$ calculated from all the trees in the random forest, $normfi_{ij}$ is the normalized importance of feature $i$ in tree $j$, and $T$ is the total number of trees [32]. The optimal split at each node is determined using only the features selected for that node. The random forest classifier comprises N trees, where N is set by the user. To classify a new dataset, every case is passed down each of the N trees. The forest then assigns the case to the class that received the most votes from the N trees.
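The per-tree importance, its normalization, the averaging over trees, and the final majority vote can be sketched as follows. This is a minimal illustration under stated assumptions: node importances are given directly as made-up numbers, each split node is tagged with the index of the feature it splits on, and all function names are hypothetical.

```python
from collections import Counter


def tree_feature_importance(node_importance, node_feature, n_features):
    """Share of total node importance per feature, normalized to sum to 1."""
    total = sum(node_importance)
    raw = [0.0] * n_features
    for importance, feature in zip(node_importance, node_feature):
        raw[feature] += importance / total
    # Normalization is kept as a separate step to mirror the text; when every
    # node is assigned to a feature it changes nothing beyond rounding.
    norm = sum(raw)
    return [value / norm for value in raw]


def forest_feature_importance(per_tree):
    """Average the normalized per-tree importances over all trees."""
    n_trees = len(per_tree)
    n_features = len(per_tree[0])
    return [sum(tree[i] for tree in per_tree) / n_trees for i in range(n_features)]


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Made-up node importances for two small trees and two features.
tree_1 = tree_feature_importance([0.4, 0.1, 0.1], [0, 1, 0], n_features=2)
tree_2 = tree_feature_importance([0.2, 0.2], [1, 1], n_features=2)
forest = forest_feature_importance([tree_1, tree_2])

predicted = majority_vote(["IR", "OR", "IR"])
```

In practice a library implementation would supply the node importances from the impurity decrease at each split; they are hard-coded here only to keep the arithmetic visible.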

A. DATA PREPROCESSING
The raw vibration data were obtained from the Bearing Data Center at Case Western Reserve University. The primary dataset consisted mainly of vibration measurements from an accelerometer mounted at the drive end of a bearing. The data were gathered under diverse working conditions, with loads ranging from 0 to 3 horsepower and with varying rotational speeds. To begin the preprocessing phase, the original vibration data were plotted as line graphs, shown in Fig. 6. This visualization gave a comprehensive view of the patterns and characteristics of the data. A sliding window technique with window size and stride parameters was then used to generate the experimental data. A window of a predetermined size was slid over the vibration data with a predetermined step size (stride), which allowed overlapping segments to be extracted. The preprocessed dataset was constructed by arranging these segments into two-dimensional arrays and representing them as RGB images, which are shown in Fig. 7.
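The window-and-stride segmentation can be sketched as below. The text does not state the window size, the stride, or how a segment is mapped to an RGB image, so every such choice here is an assumption made for illustration: a window of 16 samples, a stride of 8, min-max scaling to 0..255, reshaping to a square array, and replicating the grayscale values into three channels. The signal is a synthetic sine wave, not CWRU data.

```python
import math


def sliding_windows(signal, window_size, stride):
    """Extract overlapping segments of length `window_size`, moving by `stride`."""
    last_start = len(signal) - window_size
    return [signal[start:start + window_size]
            for start in range(0, last_start + 1, stride)]


def segment_to_rgb(segment, side):
    """Scale a segment to 0..255, reshape it to side x side, and copy it into 3 channels."""
    if len(segment) != side * side:
        raise ValueError("segment length must equal side * side")
    low, high = min(segment), max(segment)
    span = (high - low) or 1.0  # avoid division by zero for a constant segment
    gray = [round(255 * (value - low) / span) for value in segment]
    return [[[gray[r * side + c]] * 3 for c in range(side)] for r in range(side)]


# Synthetic stand-in for a vibration record (assumption, not real data).
signal = [math.sin(0.1 * t) for t in range(100)]

segments = sliding_windows(signal, window_size=16, stride=8)
images = [segment_to_rgb(segment, side=4) for segment in segments]
```

With a stride smaller than the window size, consecutive segments share samples, which increases the number of training images obtained from a record of fixed length.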

B. PROPOSED CLASSIFICATION MODEL
The study conducted a multiclass analysis that involved identifying and classifying several distinct types of bearing faults.

C. FEATURE EXTRACTION AND CLASSIFICATION
Using the VGG16 model as a feature extractor enables this study to capture significant representations from the preprocessed vibration data. Omitting the final classification layer yields a concise and informative representation of the input data.
The feature representation is taken from the output of the final pooling layer. These extracted features carry significant information about the fundamental patterns and structures inherent in the vibration data. Once the feature representations have been obtained from the VGG16 model, the random forest classifier is used for training and prediction. The features extracted from the training dataset are used as input to train the random forest classifier, which learns the relationship between the feature representations and the labels that signify the health status of the bearing, such as normal or faulty. In the training phase, the random forest classifier is constructed by creating multiple decision trees, each using only a subset of the features. The trained classifier then makes predictions regarding the health condition of the bearings: it applies the patterns acquired during training to generate predictions for unseen vibration data. Fig. 8 illustrates the complete workflow employed in this research.

VOLUME 12, 2024
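The feature-extraction and classification workflow described in this subsection could be sketched roughly as follows. This is an illustrative sketch and not the study's code: it assumes TensorFlow/Keras and scikit-learn are installed, a 224 × 224 × 3 input, ImageNet weights, 100 trees, and a stratified 75/25 split, none of which are confirmed settings. The function names are hypothetical, and the heavy libraries are imported inside the functions so that the file can be loaded without them.

```python
def pooled_feature_length(height=224, width=224, channels=512, n_pools=5):
    """Length of the flattened output of the final VGG16 pooling layer."""
    return (height // 2 ** n_pools) * (width // 2 ** n_pools) * channels


def extract_features(images):
    """Map images of shape (n, 224, 224, 3), values 0..255, to flat VGG16 features.

    `images` is assumed to be a NumPy array. `include_top=False` drops the
    fully connected layers, so the output comes from the last pooling layer.
    """
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

    extractor = VGG16(weights="imagenet", include_top=False,
                      input_shape=(224, 224, 3))
    features = extractor.predict(preprocess_input(images.astype("float32")),
                                 verbose=0)
    return features.reshape(len(images), -1)  # shape (n, 7 * 7 * 512)


def train_and_score(features, labels, n_trees=100, seed=0):
    """Fit a random forest on 75% of the features and score it on the other 25%."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, stratify=labels, random_state=seed)
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(x_train, y_train)
    return forest, forest.score(x_test, y_test)


feature_length = pooled_feature_length()
```

A typical call sequence would be `features = extract_features(images)` followed by `forest, accuracy = train_and_score(features, labels)`, where `images` and `labels` hold the preprocessed RGB segments and their fault categories.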

IV. A CASE OF BEARING FAULT CLASSIFICATION IN MULTIPLE WORKING CONDITIONS
This section discusses the experimental configuration. The present study evaluates the efficacy of the VGG16 model in conjunction with a random forest classifier for diagnosing bearing defects, using the CWRU bearing dataset. Because natural bearing degradation is a gradual process that spans several years, researchers induce bearing defects artificially or use accelerated life testing to conduct experiments and gather data [33]. Even so, collecting data remains a time-intensive task. Fortunately, some institutions have provided their bearing fault datasets for academic research, such as the CWRU dataset [34] and the Intelligent Maintenance Systems (IMS) dataset [25]. The CWRU bearing dataset is widely recognized as a highly utilized and dependable data source within the research community. As such, this study employs the CWRU datasets to validate the efficacy of the proposed diagnostic model. The CWRU datasets are vibration recordings from a 2-horsepower electric motor. The experimental data acquisition setup of Case Western Reserve University (CWRU) is depicted in Fig. 9. Using electric discharge machining, defects are seeded into the drive end (DE) bearing (6205-2RS JEM SKF) and the fan end (FE) bearing, with faults of up to 0.040 inch at the inner raceway, rolling element, and outer raceway. Accelerometers were positioned at the 12 o'clock position at the DE, the FE, and the base of the motor housing to record vibration signals. The fault depth (FD), in inches, represents the severity level. Vibration data were recorded with the faulted bearings for motor loads of 0, 1, 2, and 3 horsepower (motor speeds of 1797 to 1730 RPM). Using a 16-channel DAT recorder, vibration signals were acquired at sampling rates of 12 kHz and 48 kHz for drive end bearing defects. Speed and horsepower were recorded from the torque transducer/encoder [35]. This work used the 48 kHz drive end bearing fault data to evaluate the proposed model. Three distinct cases are evaluated. Each case is tabulated with the fault categories Inner Race (IR), Outer Race (OR), and Ball (B) and with fault depths of 0.007 inch, 0.014 inch, and 0.021 inch. As shown in Table 1, the CWRU data are selected for training and testing based on various bearing defect types and loading conditions. The model is trained using 75% of the RGB image data, while the remaining 25% is used to evaluate the model's efficacy.
Table 2 breaks down the numbers of training and testing samples used in this study. Python was the primary programming language, employed for data preprocessing, feature extraction, classification, and the generation of the evaluation metrics and graphical visualizations used to assess the proposed methodology.

A. PERFORMANCE METRICS AND ACCURACY OF THE PROPOSED MODEL
Accuracy is a widely used performance metric for classification models. It quantifies the proportion of correctly classified instances relative to the total number of instances in the dataset. Table 3 demonstrates that the proposed model is robust under multiple load and fault conditions. In cases 1, 2, and 3, the proposed model's accuracy is 99.8%, 99.91%, and 100%, respectively. Comparing this accuracy with that of existing models is a crucial part of assessing the proposed model, and Table 4 presents the comparison with the existing literature. The performance metrics of the proposed model for the three cases are presented in Fig. 10. This figure shows how well the classifier identifies positive instances while avoiding false positives and false negatives. The precision, recall, and F1 score in Fig. 10 show that the proposed model for bearing defect classification is robust under all working conditions.

Figure 11 illustrates the heat map confusion matrix of the proposed model. The model performed less consistently in Fig. 11(a) than in the other cases: it incorrectly classifies 1.7% as IR007_0 instead of IR021_0. In addition, 0.6% of the testing samples are misclassified as IR021_0 rather than IR007_0. Therefore, the true positive rate was 98% for IR021_0 and 99% for IR007_0. Similarly, in Fig. 11(b), 1.3% of the testing samples were misclassified as B007_1 rather than OR014@6_1, so the true positive rate for this class in case 2 was 99%. In Fig. 11(c), the proposed model correctly classified all fault categories.

Figure 12 shows the proposed model's precision-recall curves for the three cases. In Fig. 12(a), some instances of classes IR007_0 and IR021_0 are misclassified. IR007_0 has a precision and recall of 99%, meaning that just 1% of the instances of this class are incorrectly classified. IR021_0 has a precision and recall of 98%, meaning that 2% of the instances of this class are misclassified. Similarly, in Fig. 12(b), precision and recall are reduced because one fault category is partly misclassified. The precision of OR014@6_1 is 99%, indicating that 1% of the instances classified as OR014@6_1 are incorrect. Likewise, its recall is 99%, meaning that 1% of the instances of this class are not identified correctly. In Fig. 12(c), all fault classes are classified flawlessly, resulting in 100% recall and precision for all fault categories. This indicates that the model identifies each instance of each fault class accurately. Figure 13 illustrates the receiver operating characteristic (ROC) curves of the proposed model.
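The relationship between a confusion matrix and the accuracy, precision, recall, and F1 values discussed above can be made concrete with a short sketch. The counts below are invented for illustration and are not the paper's results; rows are true classes and columns are predicted classes, and the function names are hypothetical.

```python
def accuracy(confusion):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_metrics(confusion, labels):
    """Precision, recall, and F1 for each class of a square confusion matrix."""
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        false_negative = sum(confusion[k]) - true_positive
        false_positive = sum(row[k] for row in confusion) - true_positive
        predicted = true_positive + false_positive
        actual = true_positive + false_negative
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Invented counts (rows: true class, columns: predicted class).
labels = ["IR007", "IR021", "Normal"]
confusion = [
    [99, 1, 0],
    [2, 98, 0],
    [0, 0, 100],
]

overall_accuracy = accuracy(confusion)
metrics = per_class_metrics(confusion, labels)
```

In this invented example, confusion between two classes lowers the recall of the class whose samples are lost and the precision of the class that receives them, which is the pattern the precision-recall discussion describes.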

B. PERFORMANCE VISUALIZATION OF MODEL ON DIFFERENT CLASSES
In Fig. 13(a), all classes are accurately classified except for classes IR007_0 and IR021_0. The AUC is 0.99 for class IR007_0 and 0.98 for class IR021_0. Similarly, in Fig. 13(b), all classes except OR014@6_1 are perfectly distinguishable, and class OR014@6_1 has an AUC of 0.99. In Fig. 13(c), the classifier distinguishes each class from the others, so the AUC value for all classes is 1.
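A per-class AUC of this kind is usually computed one-vs-rest: each class in turn is treated as positive and all others as negative. The sketch below uses the rank interpretation of the AUC (the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half). The class names and probabilities are invented, and the function names are hypothetical.

```python
def roc_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, p in zip(scores, is_positive) if p]
    negatives = [s for s, p in zip(scores, is_positive) if not p]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(probabilities, true_labels, classes):
    """AUC of each class against all other classes."""
    result = {}
    for k, cls in enumerate(classes):
        scores = [row[k] for row in probabilities]
        is_positive = [label == cls for label in true_labels]
        result[cls] = roc_auc(scores, is_positive)
    return result


# Invented class probabilities (one row per test sample, one column per class).
classes = ["A", "B"]
probabilities = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.40, 0.60],
    [0.30, 0.70],
    [0.85, 0.15],
]
true_labels = ["A", "A", "B", "B", "B"]

auc = one_vs_rest_auc(probabilities, true_labels, classes)
perfect = roc_auc([0.9, 0.8, 0.1], [True, True, False])
```

An AUC of 1 means every sample of the class is scored above every sample of the other classes, which corresponds to the perfectly separated case described for Fig. 13(c).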

C. VISUALIZATION OF DATA BEFORE AND AFTER CLASSIFICATION
The data before and after classification are visualized for the three cases in Figs. 14 and 15. In Fig. 14, t-SNE is used to plot the raw vibration data. Before classification, the data points representing the various fault categories overlap, making the fault classes difficult to distinguish. After the proposed model is trained on the data, however, it has effectively learned the underlying patterns present in the diverse datasets. When the model's learned representation is plotted using t-SNE, as depicted in Fig. 15, the defect types can be distinguished by simple visual inspection.

E. RESEARCH CHALLENGES
The most challenging aspect of bearing defect prediction is the lengthy and highly technical data acquisition process. Researchers currently rely on open-source databases to pursue effective prediction methods. Another issue is the quality of the available data. Feature extraction from unprocessed vibration data is also difficult; consequently, researchers are continually searching for more robust and efficient feature extraction techniques. Additionally, deploying a data-driven model for fault detection in a complex industrial environment is challenging.
Some issues needed to be addressed when analyzing bearing data with the proposed model. A number of outliers had to be handled during the data preprocessing phase. In a few instances, the dataset was unbalanced because one class contained fewer data points than the others, so the data points of each class must be balanced before preprocessing. During feature extraction, the input shape of the VGG16 model also had to be modified to accommodate the shape of the raw vibration data. Finally, the hyperparameters of the random forest classifier must be optimized during training to achieve the best model output.

F. CURRENT RESEARCH TRENDS AND FUTURE DIRECTIONS
Current research trends are largely shaped by combinations of deep learning and machine learning algorithms. Transfer learning is a contemporary trend that employs pre-trained models, which are more robust at feature extraction. Consequently, contemporary researchers often combine conventional machine learning methods with transfer learning to enhance the fault diagnosis procedure and the model's robustness. In the future, more sophisticated, precise, and effective deep learning-based models are likely to be developed and adopted by researchers for bearing defect diagnosis. Furthermore, scholars suggest several methodologies for future use in this domain. These include semi-supervised learning, which maximizes the utility of the available labeled data alongside extensive unlabeled datasets.
Additionally, data augmentation techniques based on Generative Adversarial Networks (GANs) and advanced models such as BigGAN are recommended to address data imbalance and scarcity. Another avenue of research is few-shot learning, which aims at robust classification accuracy with significantly reduced data volumes. Moreover, exploring transfer learning in diverse contexts within the field is encouraged, given its potential to enhance various techniques. Integrating deep learning with cloud computing and the Internet of Things (IoT) can yield a more intelligent and effective bearing fault diagnosis model. Incorporating a deep learning model into a software package would make it easier for end users to identify malfunctions in rotating machinery.

VI. CONCLUSION
The present research introduces a novel approach for the automated identification and categorization of bearing defects. The approach uses the transfer-learning-based VGG16 model as the feature extractor and a random forest as the classifier. To evaluate the performance of the proposed model, the study is conducted on multiple fault types operating under varying load conditions. The raw vibration accelerometer data are preprocessed for training and testing into a suitable 2D array format, represented as an RGB image. The proposed model has several advantages over other models. VGG16 is pre-trained on the large ImageNet dataset, and using its pre-trained weights makes feature extraction from complex datasets simple and conserves computational power. Pre-trained weights enable the model to extract meaningful information from complex images without extensive data preprocessing or feature engineering. This also reduces overfitting and improves the performance of the classifier on new samples. The model's overall accuracy is 99.90%, which is higher than that of the models it was compared with. The random forest model requires 4 seconds for training and 1 second for testing. This is possible because the features extracted by VGG16 are straightforward for the random forest model to train and test on. Feature extraction took 19 seconds for the training data and approximately 3 seconds for the testing data. All graphs in this study demonstrate that the proposed model can accurately classify various fault categories. In conclusion, the results indicate that the proposed method is efficient and practicable, with promising implementation prospects in bearing fault diagnosis.

FIGURE 2. Element-by-element matrix multiplication and summation of the results on the feature map in convolutional layer.

FIGURE 6. (a) Spectrum of Normal condition at 0 HP load (b) Spectrum of Ball fault at 2 HP load (c) Spectrum of Outer race fault at 1 HP load.

The proposed model addresses the limitations of VGG16, yielding a more robust and efficient model tailored to a particular objective. It eliminates the final fully connected layer responsible for classification and uses a random forest classifier in its place. Fig. 1(b) illustrates the architecture of the proposed model.

FIGURE 7. (a) Preprocessed RGB image of Normal condition at 0 HP load (b) Preprocessed RGB image of outer race fault at 1 HP load (c) Preprocessed RGB image of ball fault at 0 HP load.

FIGURE 9. Instrumentation for the acquisition of bearing vibration signals for CWRU bearing dataset.
The CWRU bearing dataset is widely recognized as a highly utilized and dependable data source within the research community. As such, this study employs the CWRU datasets to validate the efficacy of the proposed diagnostic model. The CWRU datasets are vibration recordings from a 2-horsepower electric motor. The experimental data acquisition setup of Case Western Reserve University (CWRU) is depicted in Fig. 9. Using electric discharge machining, defects are seeded into the drive end (DE) bearing (6205-2RS JEM SKF) and the fan end (FE) bearing.

FIGURE 11. Heat map confusion matrix of proposed model for (a) case-1.

Figure 11 illustrates the heat map confusion matrix of the proposed model. The model performed less consistently in Fig. 11(a) than in the other cases: it incorrectly classifies 1.7% as IR007_0 instead of IR021_0.
D. PERFORMANCE OF MODEL WITH CHANGING TRAINING SET SIZE
The learning curve procedure performs cross-validation and returns training and validation scores for different training dataset sizes. It helps to assess how well the model generalizes to unseen data as more training examples are used. Figure 16 illustrates the learning curve of the proposed model with changing training set size. The learning curves in Fig. 16 show that the validation accuracy of the proposed model gradually improves as the number of training samples increases. The training and validation accuracies become highly similar as the number of training samples increases. Additionally, the modest gap between the training and validation accuracy indicates that the model is afflicted by neither over-fitting nor under-fitting. All three curves indicate a healthy classifier that accurately predicts bearing fault types.

TABLE 1. Information on CWRU data selected for training and testing.

TABLE 2. Distribution of training and testing samples for different working conditions.

TABLE 3. Accuracy of proposed model in various working conditions.