Assessment of Selected Machine Learning Models for Intelligent Classification of Flyrock Hazard in an Open Pit Mine

This paper presents an alternative methodology for the study of flyrock hazards in mining, utilizing Artificial Intelligence (AI) through machine learning by classification. By using distance as a delineator to denote the consequences of a blast, the models generated two classes of blasts: safe and unsafe. In this study, statistical learning models best suited for classification, that is, K Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Artificial Neural Networks (ANNs), were used, and their classification abilities were assessed. Machine performance was evaluated using a Confusion Matrix (sensitivity and specificity) and Receiver Operating Characteristic (ROC) curve. A higher weight was assigned to the minority class (unsafe blasts). Overfitting assessment was also performed. The Wide Neural Network (WNN) demonstrated the highest classification superiority. During training and validation, 75% sensitivity, 100% specificity, and an ROC of 0.9853 were achieved. In the test phase, perfect stratification (100 %) was maintained, with an ROC of 1. The Cubic SVM exhibited 50% sensitivity, 100% specificity, and an ROC of 0.9412 during training and validation. In the test set, it achieved 100% sensitivity, 100% specificity, and a ROC of 1. Fine KNN showed 50% sensitivity, 94.1% specificity, and an ROC of 0.7206 in the validation set. The test set displayed 100% sensitivity, 100% specificity, and an ROC of 1. Conversely, Coarse DT had a higher misclassification rate, resulting in a 25% sensitivity, 76.5% specificity, and an ROC of 0.5221 during the validation phase. In the test set, it showed 50% sensitivity, 100% specificity, and an ROC of 0.75. A feedforward neural network (FNN) was designed, trained, and demonstrated to be a highly flexible classification tool. The FNN achieved an excellent classification score of 100%. 
These findings demonstrate the potential for the broad applicability of machine learning through classification in addressing flyrock challenges in open-pit mines.


I. INTRODUCTION
Recently, Machine Learning (ML) has received significant attention in science and engineering. This is because of its ability to improve the quality of human life [1] by solving complex problems associated with different phenomena [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Alberto Cano.
Machine learning (ML) is a subfield of artificial intelligence (AI) dedicated to training machines to automatically learn from data and past experiences [3] without the need for explicit programming. This process involves identifying patterns to make predictions with minimal human intervention. ML employs various statistical learning methods to improve the performance of machines over time.
ML has wide application in various fields [4]. In the mining industry, the deployment of intelligent machines has promoted efficiency and productivity, which are crucial to profitability. One of the key safety concerns in mineral exploitation is the generation of flyrock from poor blast designs. As depicted in Fig. 1, flyrock consists of rock fragments from a blast that are propelled beyond the limits of the blast area [5]. This is an undesirable occurrence in blasting operations owing to its adverse environmental impact. These rock fragments can cause significant damage to infrastructure such as machinery, buildings, and other equipment. They can also cause minor to severe injuries, and fatalities have been reported [6].
Flyrock is a complex phenomenon [7]. Bajpayee et al. postulated that it arises because of a mismatch between the mechanical strength of the rock mass, confinement of the charge in a blast hole, and distribution of the explosive energy [8]. Other investigations [9], [10] point to controllable parameters such as the use of inappropriate blast patterns, incorrect burdens, improper stemming lengths, excessive powder factors, and ill-suited delay times. In addition, uncontrollable parameters, such as density, porosity, Uniaxial Compressive Strength (UCS), primary wave and secondary wave velocities, and geological conditions, such as rock mass discontinuities, have been noted as the most probable causative agents.
Traditional methods such as empirical techniques have proven cumbersome (technically difficult to perform), unreliable, time-consuming, and prone to large errors [11]. These pitfalls have led to rapidly growing interest in ML technology, which, when trained properly, can be fast and reliable and can accommodate many parameters. The possibility of applying ML interventions to flyrock problems is of increasing interest. Therefore, several studies on flyrock distance prediction have focused on the use of AI through ML to solve flyrock-associated problems. For example, Monjezi et al. showed that an Artificial Neural Network (ANN) was effective in predicting flyrock at the Sangan iron mine in Iran [12]. Hasanipanah et al. found that the Genetic Algorithm (GA) model had high accuracy compared to the Imperialist Competitive Algorithm (ICA)- and Particle Swarm Optimization (PSO)-based models [13]. Marto et al. demonstrated that ICA-ANN was a more accurate model than the Back Propagation-ANN [14]. Trivedi et al. showed that an ANN using Back Propagation was a better predictive tool than multivariate regression analysis (MVRA) [15]. Jamei et al. compared Kernel Extreme Learning Machine (KELM) with Response Surface Methodology (RSM), Boosted Regression Tree (BRT) and Local Weighted Linear Regression (LWLR) and found KELM to have good predictive capability and to be computationally efficient [16]. These studies were aimed at flyrock distance prediction and at how ML can inform proper decision-making that minimizes or eliminates flyrock hazards.
ML can be used to solve problems via regression or classification. Regression models output continuous variables, whereas classification models have a discrete number of possible outcomes [17]. For an imbalanced dataset (data with an unequal distribution of classes), which is usually the case for flyrock, regression solvers, especially linear solvers, have been found to be less than ideal because they are highly sensitive. Machine learning through classification offers an alternative approach for understanding flyrock problems more realistically than the regression approach. For example, Hudaverdi and Akyildiz used a novel classification approach, Multiple Discriminant Analysis (MDA), to group blasts. Their study estimated the severity of the flyrock rather than a numerical value (flyrock distance). They noticed that even though their regression model exhibited high correlation coefficients, some of the regression estimates deviated by more than 10-20 metres; such estimates could not be relied upon, because accuracy is crucial and can mean, in extreme cases, the difference between life and death. Their study demonstrated that flyrock throw distance can be presented using categorical variables rather than numerical values [18].
As outlined in Table 1, many recent studies have focused on how ML can solve regression problems for safe blast outcomes. However, few studies have focused on how ML through classification can be utilized to address the negative impacts of flyrock. Therefore, as an alternative approach, more work is required to demonstrate the potential of using classification models in the study of flyrock generation and mitigation.
The aim of this paper is to illustrate the capability of ML classification in flyrock control through a case study. In this study, the classification was based on a predetermined distance criterion that delineated the blast-clearance zone. The goal was to classify, on the premise of distance, a set of blasts in an open pit mine into 'safe' and 'unsafe' blasts, denoting their severity. A safe blast in the context of this study was one whose blast-induced fragment throw was less than 90 m. Any throw distance of 90 m or more would constitute flyrock, and the blast would therefore be considered 'unsafe' since it posed the highest risk. Thus, the primary objective was to use different ML methods to assess classification performance. This comparative study determined the best classifier based on its performance. Following this ML approach, this study describes the selected machine learning methods: K Nearest Neighbors (KNN), Support Vector Machines (SVM), Artificial Neural Networks (ANNs) and Decision Tree (DT). This paper shows that classification can be relied upon to solve flyrock-related problems. To the best of our knowledge, this is the first study that has focused purely on classification techniques to characterize blast safety with reference to flyrock generation. Moreover, it considers rock mass conditions, an aspect that a number of studies [19], [20] have recommended because it received insufficient attention in the past.

II. MATERIALS AND METHODS
This section discusses the data acquisition process and the learning methods used in this study, that is, the DT, KNN, SVM, and ANNs.

A. MATERIALS
The current investigation involved sampling and analysis of blast data from an open-pit gold mine in the southern part of Kyushu Island, Japan. A total of 29 experimental blasts conducted in the mine formed the dataset used in this study. The blast design parameters are presented in Table 2. To examine the relationship between the input variables and the output variable (flight distance) for each blast, seven blast design parameters were chosen for this study. They included the burden, stemming length, Brazilian Tensile Strength (BTS), Uniaxial Compressive Strength (UCS), rock density, powder factor, and crack density of the rock mass. Crack density is the number of joints found within a 1-metre traverse of the bench face, as shown in Fig. 3.
The dataset consisted of blast geometry, explosive parameters, and rock mass properties. Most other studies have rarely incorporated adequate rock parameters. However, in this study, a significant number of key rock parameters, such as rock density, BTS, UCS, and crack density, were investigated.
After analyzing the collected blast data, a 90 m boundary was chosen. This decision was aimed at creating an imbalanced dataset, ensuring sufficient data for training in machine learning, while leaving an ample amount for testing. An 'unsafe' blast (represented by 0) denotes a flyrock distance of 90 m and above, presenting the highest risk to infrastructure and potential harm to personnel in the mine. On the other hand, a 'safe' blast (represented by 1) has a flyrock throw length below 90 m. This two-class grouping was achieved through binning with the Pandas library in Python, thereby transforming the task into a binary classification problem.
Rock blasting is a rapid occurrence; hence, a proper qualitative analysis of a blast is not possible with the naked eye. Recently, high-speed cameras have gained popularity because of their fast capture speed, high resolution, and light sensitivity. They can also be used to obtain quantitative measurements, such as the flyrock direction and velocity. In this study, a high-speed camera (Vision Research, Phantom Ver. 7.3) was set up perpendicular to the face direction, and the blast footage was recorded at a frame rate of 1000 fps. A 2-metre white line, serving as a scale, was marked perpendicular to the bench face to aid in image analysis. For each blast, 12 trackable blast-induced rock fragments were analyzed using Phantom Camera Control (PCC) software. The average velocity of each fragment was computed and taken as the 'initial velocity' of the fragment. The 'maximum initial velocity' was defined as the largest 'initial velocity' among the analyzed fragments. The experimental setup is shown in Fig. 2.
To investigate the influence of rock mass conditions, a digital camera was used to capture the joints on the bench face prior to each blast. A 1-metre traverse across the entire face was established, as shown in Fig. 3.
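The two-class labelling described above can be illustrated with a minimal sketch. The study reports using Pandas for the binning; the version below uses plain Python only so that it runs without dependencies. The function name `label_blast` and the example distances are hypothetical and are not taken from the study's dataset.

```python
# Threshold taken from the text: throws of 90 m or more are 'unsafe'.
SAFE_LIMIT_M = 90.0


def label_blast(flight_distance_m: float) -> int:
    """Return 0 for an unsafe blast (>= 90 m) and 1 for a safe blast (< 90 m)."""
    return 0 if flight_distance_m >= SAFE_LIMIT_M else 1


# Hypothetical flight distances in metres (illustrative values only).
distances = [42.5, 89.9, 90.0, 131.2]
labels = [label_blast(d) for d in distances]
print(labels)  # [1, 1, 0, 0]
```

Note that the boundary value itself (exactly 90 m) falls in the unsafe class, which matches the "90 m and above" wording.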
An overview of the key stages of this study is presented in Fig. 3. The data generated by the research are presented in Table 4, and the statistical summary of the dataset is presented in Table 5. The powder factor, stemming length, and burden were used to generate a prediction equation for the flight distance of fragmented rocks. These were chosen because of their lower p-values compared to the other parameters, which had p-values of 0.5 and above (statistically insignificant). The high p-values for rock density (0.50), BTS (0.51), UCS (0.74), and crack density (0.99) rendered them insignificant as predictors; thus, these parameters were omitted from the equation. A significance F of 0.008886, which is below the recommended threshold of 0.05, indicated that the selected features were statistically significant. Multiple regression analysis, summarized in Table 3, was conducted to determine the influence of the selected parameters on the initial velocity ($V_o$). The resulting linear equation for the initial velocity is:
$$V_o = 297.13\,PF + 5.05\,SL + 4.92\,B - 53.24 + \varepsilon \quad (1)$$
where $V_o$ is the initial velocity in m/s, $PF$ is the powder factor in kg/t, $B$ is the burden in metres, $SL$ is the stemming length in metres, and $\varepsilon$ is the error term. The average initial velocity ($\mu$) was determined as the average of the measured maximum initial velocities, which were estimated by comparing the distance traveled by the fragmented rocks after 1.0 ms with the scale. The travel distance of the fragmented rocks was calculated every 50 ms from the initiation time up to 600 ms.
Using the three-sigma (3σ) rule, on the assumption that the measured maximum initial velocity follows a normal distribution, the potential maximum initial velocity ($V_{max}$) was calculated using Equation (2):
$$V_{max} = \mu + 3\sigma \quad (2)$$
where $\mu$ is the average initial velocity and $\sigma$ is the associated standard deviation. To incorporate the riskiest scenarios, as is typical in hazard analysis, 3σ was adopted.
Thus, the maximum flight distance of the fragmented rocks is the product of the horizontal component of the potential maximum initial velocity ($V_x$) of the fragmented rock projectiles and the total flight time:
$$V_x = V_{max}\cos\theta \quad (3)$$
The total flight time ($t$) is given by:
$$t = \frac{V_{max}\sin\theta + \sqrt{(V_{max}\sin\theta)^2 + 2gH}}{g} \quad (4)$$
where $H$ is the bench height, $g$ is the acceleration due to gravity (9.81 m/s$^2$), and $\theta$ is the elevation angle in degrees. Therefore, the maximum flight distance (FD) is given by the following equation:
$$FD = V_x\,t \quad (5)$$
where FD is in metres, $V_x$ is the horizontal component of the potential maximum initial velocity in m/s, and $t$ is the total flight time of the blast-induced fragmented rock in seconds.
8590 VOLUME 12, 2024. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
FD was calculated on the assumption that flyrock occurred from the top of the face (face bursting) at an elevation angle of 45 degrees. Face bursts can be indicative of overcharging or can reflect the impact of geological weaknesses, such as joints or cracks present on the bench face.
Fig. 4 shows the correlation matrix, developed in Python using the Matplotlib and Scikit-learn libraries. It indicates how the selected parameters are associated with each other.
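The flight-distance calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes drag-free projectile motion launched from the bench crest at height H, uses the sample standard deviation for σ (the paper does not say which estimator was used), and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

G = 9.81  # acceleration due to gravity in m/s^2, as stated in the text


def potential_max_velocity(max_initial_velocities):
    """Three-sigma estimate: mean plus three standard deviations."""
    mu = mean(max_initial_velocities)
    sigma = stdev(max_initial_velocities)  # assumption: sample standard deviation
    return mu + 3.0 * sigma


def max_flight_distance(v_max, bench_height, elevation_deg=45.0):
    """Horizontal range of a drag-free projectile launched from the bench crest."""
    theta = math.radians(elevation_deg)
    v_x = v_max * math.cos(theta)  # horizontal velocity component
    v_y = v_max * math.sin(theta)  # vertical velocity component
    # Time until the fragment has fallen bench_height below the launch point.
    t = (v_y + math.sqrt(v_y ** 2 + 2.0 * G * bench_height)) / G
    return v_x * t
```

With a bench height of zero and a 45-degree angle the function reduces to the familiar range formula v²/g, which is a convenient sanity check. A taller bench lengthens the flight time and therefore the throw.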

B. METHODS
This section introduces and describes the machine learning models (Decision Tree, KNN, ANNs and SVM) that were proposed and applied in this study. Machine learning tools, that is, the classification, regression, and pattern recognition tool (nprtool) in MATLAB R2023b, were used in a Windows 10 environment to evaluate the performance of the models on the blast dataset.

1) K -NEAREST NEIGHBORS (KNN)
KNN is one of the simplest and most widely used ML algorithms, making it a baseline classifier for many pattern recognition problems [21]. It is a supervised learning method primarily used for classification tasks. KNN is nonparametric, meaning that there are no fixed parameters for any given data size [22]. The number of parameters depends on the size of the training dataset. In addition, no assumptions need to be made regarding the underlying data distribution.
The algorithm works on the assumption that 'similar things exist in close proximity,' that is, similar things are most likely to be found closer to each other. This means that an unseen point (test example) can be classified based on the values of the closest existing points (training examples).
The 'K' value refers to the number of nearest neighbor data points to include in the majority voting process. The class label assigned to a test example is decided by a majority vote among its K nearest neighbors:
$$y(d_i) = \arg\max_{k} \sum_{x_j \in kNN(d_i)} y(x_j, c_k) \quad (6)$$
where $d_i$ is a test example, $x_j$ is one of its k nearest neighbors in the training set, and $y(x_j, c_k)$ indicates whether $x_j$ belongs to class $c_k$. Equation (6) indicates that the prediction will be the class with the most members in the k-nearest neighbors.
The distance metric is a method for determining the distance between a new data point and the points in an existing training dataset [23]. More than 50 distance metrics are available for KNN. The use of the best distance metric yields good results for the test data, that is, the highest precision, recall, and accuracy [24]. For this study, the Euclidean, Manhattan, Cosine and Minkowski distance metrics were selected.
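The majority-vote rule and the role of the distance metric can be illustrated with a minimal sketch in plain Python. It is not the MATLAB implementation used in the study; the function names and the toy training points are hypothetical, and only two of the four metrics named above are shown.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=1, distance=euclidean):
    """Label the query by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the label encountered first (nearest neighbour) wins.
    return votes.most_common(1)[0][0]


# Toy data: two features per blast, label 1 = safe, 0 = unsafe (illustrative).
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [1, 1, 0, 0]
print(knn_predict(train_x, train_y, (0.2, 0.1), k=1))
print(knn_predict(train_x, train_y, (5.5, 5.0), k=3, distance=manhattan))
```

With k = 1 the prediction is simply the label of the single closest training point, which is the setting the text later associates with Fine KNN.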

2) SUPPORT VECTOR MACHINE (SVM)
SVM has proven to be dependable owing to its balanced accuracy and reproducibility when learning data classification patterns. The objective of every SVM is to optimize for correct labelling and ensure that the classifier can generalize well to new data. Although it can be used to perform regression, the SVM has been widely developed as a tool for classification. This is because it is highly versatile, making it a convenient tool across a range of data science scenarios [25].
The SVM decision function is an 'optimal' hyperplane. The hyperplane distinguishes observations belonging to one class from another (see Fig. 5). This is a result of patterns of information about these observations, which we call 'features' or support vectors. Thus, the most probable label for unseen data can be determined from the hyperplane. In addition to the raw data, interpolation can be conducted to generate derivative data that make up the features used to infer the hyperplane. Therefore, the SVM can classify both linear and nonlinear data.
A good SVM algorithm maximizes the marginal distance between the two classes while minimizing the classification error. Transformation is made possible using a kernel function. The kernel maps non-linearly separable data into a higher-dimensional feature space, where they are linearly separable [26]. SVM models have different kernels. In this study, the following kernels were investigated: Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian, and Coarse Gaussian.
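To make the kernel idea concrete, the sketch below evaluates a polynomial kernel (degree 2 would correspond to a quadratic kernel, degree 3 to a cubic one) and a Gaussian kernel on two feature vectors. The parameter names `coef0` and `scale` and their default values are assumptions made for illustration; the exact parameterization used by the MATLAB presets is not reproduced here.

```python
import math


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x . y + coef0) ** degree; degree=3 gives a cubic kernel."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def gaussian_kernel(x, y, scale=1.0):
    """exp(-||x - y||^2 / (2 * scale^2)); a larger scale gives a smoother boundary."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * scale ** 2))


x, y = (1.0, 2.0), (3.0, 4.0)
print(polynomial_kernel(x, y, degree=3))  # (11 + 1) ** 3 = 1728.0
print(gaussian_kernel(x, x))              # identical points give 1.0
```

The kernel value replaces the plain dot product inside the SVM, so the classifier behaves as if the data had been mapped to a higher-dimensional space without that mapping ever being computed explicitly.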

3) ARTIFICIAL NEURAL NETWORKS (ANN)
The human biological nervous system inspired the development of Artificial Neural Networks (ANNs) [27]. The goal of neural networks is to build machines using components that mimic biological neurons. Consequently, machines are capable of executing complex tasks similar to those of the human brain.
A neural network, as shown in Fig. 6, is a mathematical model that consists of an input layer, hidden layer, and output layer. The mathematical relationship between the input and output values is obtained once the network has been trained on historical data [28]. Weights are associated with each input depending on its degree of relevance to the output. An activation (transfer) function is then applied to produce the output value. The unit step (threshold), piecewise, linear, Gaussian, and sigmoid functions are among the most commonly used activation functions. ANNs can have different architectures, causing them to function or behave differently [29].
Feed-Forward Neural Network (FNN): This is also known as a multilayer perceptron (MLP), or simply a neural network. In an FNN, the connections between nodes do not form a cycle; that is, no loops are formed (see Fig. 7). This approach is simple because information processing is unidirectional. Once data is passed through hidden layers (nodes), it can never move backwards.
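The unidirectional flow described above can be sketched as a single forward pass through one hidden layer. This is a minimal illustration assuming sigmoid activations in both layers; the layer sizes, weights, and function names are hypothetical and do not correspond to the networks trained in the study.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum plus bias, then activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input -> hidden -> output, with no connections going backwards."""
    hidden = dense(inputs, hidden_w, hidden_b, sigmoid)
    return dense(hidden, out_w, out_b, sigmoid)


# Two inputs, two hidden nodes, one output node (illustrative weights).
output = forward([1.0, 2.0],
                 hidden_w=[[0.5, -0.25], [0.1, 0.3]], hidden_b=[0.0, 0.1],
                 out_w=[[1.0, -1.0]], out_b=[0.0])
print(output)
```

For a two-class problem, the single output can be read as a score between 0 and 1 and thresholded to obtain the class label.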

4) DECISION TREE (DT)
This is a non-parametric type of supervised machine learning, in which the decision-making process results in a visual flowchart that resembles an upside-down tree (see Fig. 8). The decision starts at the top of the tree, referred to as the root node. It adopts a data-separating sequence that results in Boolean values, that is, Yes or No/True or False. Therefore, information processing using this sequence flows from the root node to the internal nodes (branches), and terminates at the leaf nodes. Each node represents a feature. This algorithm is best utilized in solving classification problems but can also be used for regression applications [30].
Hyperparameter tuning is performed to optimize the performance of the algorithm. This can involve proper splitting, i.e., division of the main nodes into sub-nodes. This is attained using a split measure function that selects the 'best' splitting attribute based on the impurity measure. As indicated in equations (7), (8), and (9), information gain/entropy (E), Gini index (G), and misclassification error (M) are the most popular impurity measures [31].
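The three impurity measures named above can be sketched using their standard textbook definitions, where the input is the list of class proportions at a node. The notation in the paper's own equations (7) to (9) may differ; the function names here are hypothetical.

```python
import math


def entropy(proportions):
    """E = -sum(p * log2(p)), skipping empty classes."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


def gini(proportions):
    """G = 1 - sum(p^2)."""
    return 1.0 - sum(p ** 2 for p in proportions)


def misclassification_error(proportions):
    """M = 1 - max(p)."""
    return 1.0 - max(proportions)


# A node holding an even mix of two classes is maximally impure.
mixed = [0.5, 0.5]
print(entropy(mixed), gini(mixed), misclassification_error(mixed))
```

All three measures are zero for a pure node (a single class) and reach their maximum for an even split, so a candidate split is preferred when it lowers the weighted impurity of the child nodes.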

C. PERFORMANCE MEASURES
For this study, the following performance indicators were used to measure machine performance:

1) CONFUSION MATRIX TECHNIQUE
In this study, the minority class, that is, the important class with fewer samples, had a higher risk loss compared to the majority class (with more samples). The main goal was to correctly classify minority data points. To avoid poor generalization owing to hyperplanes that are more biased towards the majority class, standard machine learning aimed at achieving an optimized overall accuracy should be avoided.
For such an imbalanced set (because flyrock does not occur in every blast), overall accuracy is usually a biased measure of a classifier's goodness. To overcome this, a confusion matrix (shown in Fig. 9) with information such as False Positives/Negatives and True Positives/Negatives, sensitivity, and specificity offers a more reliable measure of the classifier performance [32].
Because the data classification is divided into two classes, the resulting confusion matrix is made up of four cells, corresponding to True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). These values are used to determine accuracy, sensitivity, and specificity.

2) RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE
This is another indicator that graphically illustrates the performance of a classifier on an imbalanced dataset. ROC curves are generated by plotting the true-positive rate (sensitivity) against the false-positive rate, that is, (1 - specificity). Sensitivity and specificity are inversely correlated; that is, a decrease in sensitivity indicates an increase in specificity. As shown in Fig. 10, the line of equality (50% specific and 50% sensitive), where no discriminative value exists, is represented by an area under the curve (AUC) of 0.5. This is a diagonal line that runs from the lower-left corner to the upper-right corner. Incorrectly predicted results are represented by a curve below the line of equality. A curve close to the line of equality signifies low accuracy and is not different from that obtained by chance. A curve above the line of equality indicates correct predictions and high discriminatory power [33], [34].
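The confusion-matrix measures and the AUC can be sketched as below. The sensitivity and specificity calls use the validation counts reported later in the paper for the Coarse DT model, with 'unsafe' treated as the positive class. The AUC function is a generic rank-based estimate (the probability that a positive example scores higher than a negative one) and is an assumption about how an AUC could be computed, not a description of the MATLAB routine used in the study.

```python
def sensitivity(tp, fn):
    """True-positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: share of actual negatives correctly rejected."""
    return tn / (tn + fp)


def auc(pos_scores, neg_scores):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Coarse DT, validation: 1 of 4 unsafe and 13 of 17 safe blasts correct.
dt_sens = sensitivity(tp=1, fn=3)
dt_spec = specificity(tn=13, fp=4)
print(f"{dt_sens:.1%} sensitivity, {dt_spec:.1%} specificity")
```

An AUC of 0.5 from this function corresponds to the line of equality, and 1.0 corresponds to scores that separate the two classes completely.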

3) CROSS ENTROPY ERROR
The cross-entropy error or loss is a metric that evaluates the performance of a classification model during machine learning. The model weights are adjusted during training to minimize loss. A better model is one that has comparatively lower loss values in the range of 0 to 1. A cross-entropy loss of zero points to a perfect model.
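For a two-class problem, the cross-entropy loss can be sketched with the standard binary form, averaged over the examples. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero; it and the function name are assumptions, not part of the study.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Labels follow the paper's coding (1 = safe, 0 = unsafe); probabilities are made up.
print(binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.7]))
```

Predictions that are confident and correct drive the loss towards zero, while an uninformative prediction of 0.5 for every example gives a loss of ln 2.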

III. RESULTS AND DISCUSSION
In this study, several machine classification algorithms were developed to classify flyrock hazard. The models would be able to predict, based on severity, whether a blast would be 'safe' or 'unsafe'. Decision Tree (DT), K Nearest Neighbors (KNN), SVM and ANNs were investigated. As shown by Hudaverdi and Akyildiz [18], a classification approach can be utilized to indicate the potential risk of a blast.
Regression models were also evaluated to justify the use of classification. Among the four machine learning models, namely, Support Vector Regression (SVR), KNN, ANNs and Decision Tree, the SVR with a linear kernel demonstrated the highest performance. It achieved the lowest Root Mean Square Error (RMSE) of 10.90, indicating the smallest prediction errors. Additionally, the SVR model showed a commendable R-squared value of 0.83, signifying a strong ability to explain variance in the data. The Variance Accounted For (VAF) was also high, at 84.20%, further validating the model's capability.
However, the Decision Tree model performed poorly, displaying negative values for both R-squared (-0.13) and VAF (-0.61), suggesting a weak fit to the data. The Neural Network model showed a high RMSE (48.95) and negative R-squared value (-2.50), indicating inadequate predictive power. In this regression context, the K-Nearest Neighbors model displayed moderate performance, with an RMSE of 22.95 and an R-squared value of 0.23. SVR emerged as the most suitable regression model for predicting flyrock in the experimental mine blasts, offering the highest accuracy and reliability compared to the other models. As observed in Fig. 12, although the SVR performed well, it still exhibited large differences between the true and predicted values, especially for the unsafe blasts, whose consequences are significant. Overall, this analysis, as illustrated in Fig. 11, indicated that machine learning via regression was not optimal in our research context. Thus, we explored the potential of classification learning to derive meaningful insight from machine performance, which is crucial in decision-making with regard to blast safety in a mine.
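The three regression indicators quoted above can be sketched with their common definitions. The VAF formula used here, (1 - var(y - ŷ) / var(y)) × 100, is the form usually found in the blasting literature and is an assumption about the study's exact definition; the example values are made up.

```python
import math
from statistics import mean, pvariance


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when worse than predicting the mean."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def vaf(y_true, y_pred):
    """Variance accounted for, in percent."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1.0 - pvariance(residuals) / pvariance(y_true)) * 100.0


# Hypothetical flight distances in metres.
y_true = [60.0, 75.0, 95.0, 120.0]
y_pred = [65.0, 70.0, 90.0, 105.0]
print(rmse(y_true, y_pred), r_squared(y_true, y_pred), vaf(y_true, y_pred))
```

A negative R-squared, as reported for the Decision Tree and Neural Network regressors, simply means that the model's squared error exceeds that of a constant prediction equal to the mean of the observed values.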
For classification, this study utilized different criteria for the hyperparameters within each modeling technique. For example, for SVM, a number of variants, such as Linear SVM, Quadratic SVM, and Medium Gaussian SVM, were examined to determine the best technique. This section presents the results of the learning models based on their classification abilities.
Initially, the dataset was split into x- and y-features. The x features represent all the explanatory variables. The y feature (response variable) holds the categorical values for the two types of blasts, that is, safe and unsafe blasts.
Training was conducted using the default settings of the MATLAB Classification Learner. Feature selection was not performed. Misclassification costs were set to default. No optimizer properties were used, and Principal Component Analysis (PCA) was deactivated. This was done to preserve the integrity of the original dataset. Thus, the machine performance was judged based purely on the original data. Twenty models were trained using the Levenberg-Marquardt algorithm (LMA).
At the end of the training, all models were assessed according to their performance. Accuracy, total cost, and training time were the three key performance indicators that informed the selection of the most suitable models. Simultaneously, the hyperparameter attributes of each model were investigated. This would show which specific tuning parameters were responsible for the training, validation, and test results.
A summary of the classification training and testing is presented in Table 6. Once their performances were ranked, the best four models were selected.
The Cubic SVM performed better than the other SVM kernels. It produced a hyperplane that was able to completely separate the vectors into two non-overlapping classes. This means that the nonlinear region was capable of separating the data more efficiently. Usually, the higher the degree of the polynomial, the more curved the resulting hyperplane. Cubic SVM, which is a third-order polynomial, was therefore more robust than quadratic SVM.
From the ANNs tested, the Wide Neural Network (WNN) performed comparatively better, perhaps due to the high
Using the Gini index, all DT algorithms yielded similar results. However, Coarse DT was unique because it achieved the shortest training time. A possible inference is that the number of splits has a direct bearing on the machine performance, that is, the lower the number of splits, the shorter the training time.
The Fine KNN has 1 neighbor compared to 10 in Medium KNN and 100 in Coarse KNN. The classifier accuracy increased when the number of neighbors decreased. Fine KNN is therefore able to make finely detailed distinctions between classes. The Euclidean distance associated with Fine KNN was observed to be the best determiner of nearest neighbors compared to the other KNN models. It performed better than the distance weighting used in the Weighted KNN, which was the second-best KNN model.
The corresponding hyperparameters are listed in Table 8.
The main purpose of this study was to assess the classification ability of SVM, KNN, ANN, and DT for the flyrock problem. After determining the best models, the next step involved carrying out a further comparative analysis among the selected models based on the main performance indicators. In this research, the Confusion Matrix and the Receiver Operating Characteristic (ROC) curve, denoted by the Area Under Curve (AUC), were the main statistical indicators used to measure the performance of the selected models.
Fig. 13 illustrates the performance of the Coarse DT algorithm. An inspection of its performance revealed that it correctly predicted 13 of 17 safe blasts (76.5% specificity). It gave 1 correct prediction out of 4 (25% True Positive Rate) for the unsafe blasts during training and validation. However, it attained 50% sensitivity for unsafe blasts during the test phase. In contrast, it correctly classified all safe blasts and thus achieved a specificity of 100% for the same test sample.
Moreover, it achieved an ROC of 0.5221 in the validation phase and an ROC of 0.75 during testing. Therefore, it can be deduced that the Coarse DT algorithm showed a slight improvement in performance on the test set.
Analysis of the Cubic SVM model during validation indicated outstanding classification performance with 100% specificity for safe blasts. Two of the four unsafe blasts were correctly classified (50% sensitivity). In the test set, the model accurately identified the two categories of blasts with 100% scores for both sensitivity and specificity.
As presented in Fig. 14, using the Cubic SVM model, an ROC of 0.9412 was achieved during training and validation, whereas the test dataset recorded a perfect ROC score of 1.
In Fig. 15, it can be observed that the classification ability of the Fine KNN model for unsafe blasts was similar to that of Cubic SVM, that is, 50% sensitivity (2 correct predictions of 4 unsafe blasts). It also had a high specificity of 94.1% for the validation set. In the test phase, the Fine KNN model showed good discriminating power because it correctly classified all blasts.
The ROC measurements indicated an improvement from 0.7206 in the validation set to 1 in the test data.
As shown in Fig. 16, the WNN generated the highest classification accuracy. The matrix showed that all 17 safe blasts and 3 of 4 unsafe blasts were correctly identified. Moreover, it registered a 100% recognition rate for both types of blasts in the test set.
The ROC also demonstrated the exceptional performance of the WNN, which had the highest ROC scores: 0.9853 for the validation set and 1 for the test set.
Finally, all models were compared, and a ranking was established to ascertain the superiority of the classifiers (see Table 9).
Overall, based on the sensitivity and specificity measures from the confusion matrix and the ROC measures, WNN and Cubic SVM were found to be the most capable algorithms. The other two models also displayed significant classification abilities. WNN and Cubic SVM displayed better ROC performance and classified blasts better than Fine KNN and Coarse DT. Coarse DT was the least desirable model: unlike the other models, it had a misclassification (1 FN) in the test sample.
To validate the results of the machine-learning exercise, an assessment of overfitting was conducted. Given the small dataset, 5-fold cross-validation was performed. Subsequently, the obtained performance was compared with the test accuracy of the four models. The "5 percentage points or less" rule of thumb, a heuristic guideline commonly used in machine learning, was chosen as a useful estimate for assessing the potential presence of overfitting.
Table 10 shows only slight variations between the test accuracy obtained from the 70-30 train-test split dataset and the results obtained from 5-fold cross-validation for both the WNN and Cubic SVM. This implies a reduced probability of overfitting in these models. Conversely, the notable performance gaps for Fine KNN and Coarse DT indicate a higher potential for overfitting. These results highlight the robust generalization capability of the WNN and Cubic SVM models.
After determining that the WNN had the best overall performance, the second phase of the study investigated how ANNs can be trained for classification tasks using a different approach. The goal of the second phase was to develop a neural network model that could achieve a 100% accuracy rate in both the training-validation and testing phases, surpassing the performance of the WNN model. This would be achieved by building a new model and assigning new parameters and functions that would tune the machine to achieve 100% classification accuracy for all types of blasts and in all datasets.
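The comparison described above amounts to checking whether the gap between cross-validated accuracy and held-out test accuracy exceeds about five percentage points. A minimal Python sketch of that check is given below. The helper names are hypothetical, the contiguous fold split is a simplification (in practice the folds would normally be shuffled and stratified), and the example accuracies are the Coarse DT validation and test values quoted in the discussion, used here on the assumption that the validation accuracy stands in for the cross-validated one.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def overfitting_gap(validation_acc, test_acc, tolerance_points=5.0):
    """Return (gap in percentage points, True if the gap exceeds the tolerance)."""
    gap = abs(test_acc - validation_acc)
    return gap, gap > tolerance_points


# 29 samples is the dataset size given in the text; applying the split to
# all 29 samples is an assumption made only for this illustration.
folds = k_fold_indices(29, k=5)
print([len(f) for f in folds])  # prints: [6, 6, 6, 6, 5]

# Coarse DT accuracies (in percent) as quoted in the discussion.
gap, flagged = overfitting_gap(validation_acc=66.7, test_acc=87.5)
print(f"gap = {gap:.1f} points, flagged = {flagged}")
# prints: gap = 20.8 points, flagged = True
```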
The nprtool in MATLAB was used in this study. This tool can be used to create models suited to pattern-recognition problems and is therefore well suited to training learning models for classification. The network architecture shown in Fig. 17 is a two-layer feedforward neural network (FNN). It contains sigmoid hidden neurons and softmax output neurons, both of which are well suited to classification. The data were split into 70% for training, 15% for validation, and 15% for testing.
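To make the architecture concrete, the forward pass of a two-layer network with sigmoid hidden neurons and a softmax output can be sketched in a few lines of Python. This is an illustration only and not the nprtool implementation: the weights and biases are arbitrary, the network is shrunk to two inputs and three hidden neurons for readability, and the logistic sigmoid is assumed for the hidden layer (MATLAB's pattern-recognition networks may use a tan-sigmoid variant).

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Numerically stable softmax over a list of logits."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input -> sigmoid hidden layer -> softmax output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)


# Arbitrary (untrained) parameters: 2 inputs, 3 hidden neurons, 2 classes.
w_hidden = [[0.5, -0.3], [-0.8, 0.2], [0.1, 0.9]]
b_hidden = [0.0, 0.1, -0.2]
w_out = [[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]]
b_out = [0.0, 0.0]

probs = forward([0.4, 0.7], w_hidden, b_hidden, w_out, b_out)
print(probs)  # two class probabilities (safe, unsafe) that sum to 1
```

Because the output layer is a softmax, the two outputs can be read as class probabilities, and the predicted class is the one with the larger value.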
The training outcomes obtained using this network architecture are shown in Fig. 18.
As can be seen in the confusion matrix, the FNN model also exhibited exceptionally high performance after it was trained and tested. The sensitivity and specificity based on the predicted outputs during the training and validation phases were 100% for the safe and unsafe blasts. In the training phase, out of a total of 21 blasts, 18 blasts were successfully categorized as 'safe' while the remaining 3 were 'unsafe.' In the validation phase, 2 'safe' blasts and 2 'unsafe' blasts were correctly predicted. To further confirm the classification power in this supervised environment, the model correctly classified 3 'safe' blasts and 1 'unsafe' blast in the test set. No blast was misclassified. Finally, a review of its performance based on the 'All Confusion Matrix' revealed that it is also a far more accurate, flexible, and reliable model than classifiers such as Fine KNN and Coarse DT, which were trained using the Classification Learner app in MATLAB.
These optimal solutions can also be verified using the ROC curve results. In all stages, from training to validation and testing, the two-layered FNN achieved a perfect score of 1 for the true-positive rate (see Fig. 19).
The best performance, as shown in Fig. 20, for the two-layered FNN occurred at epoch 30. The best model was selected based on the cross-entropy and error results. The cross-entropy loss function was selected as the objective function. Typically, high-accuracy classification models yield small cross-entropy values. The goal was therefore to select the model with the smallest cross-entropy and error values.
Table 11 lists the tabulated results from the training, validation, and testing of the two-layered FNN. The results therefore seem to indicate that machine learning models, such as KNN, SVM, DT, and ANN, can be applied to solve practical problems, such as flyrock in mines. The experiment also implies that ANNs have a significantly higher classification ability than KNN and DT. Among the ANNs studied in this research, the two-layered FNN demonstrated how training on already trained data can lead to the development of a superior learning model.
As stated earlier in this paper, most previous studies of flyrock problems in mines focused on building regression-based models for flyrock prediction. Upon closer examination of the data utilized in this study, it is evident that a nonlinear relationship exists between the input and output parameters. This nonlinearity is a challenging obstacle when attempting to solve regression problems. It is often more difficult for regression models than for classification models to predict with 100% accuracy. Although classification models have been used in other fields and in mining, very few, if any, have been used for flyrock distance determination. Therefore, this study tested classification algorithms as an alternative approach for solving flyrock-related problems. Moreover, it compared the performance of a selected number of classification models.
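The role of cross-entropy as a selection criterion can be illustrated with a short Python sketch. The function below computes the mean cross-entropy between one-hot targets and predicted class probabilities. The example predictions are invented, and the normalization (a plain mean over samples) is an assumption that may differ from the one used internally by MATLAB.

```python
import math


def cross_entropy(targets, predictions, eps=1e-12):
    """Mean cross-entropy over samples.

    targets: one-hot rows, e.g. [1, 0] = safe, [0, 1] = unsafe.
    predictions: rows of class probabilities that sum to 1.
    eps guards against log(0).
    """
    total = 0.0
    for t_row, p_row in zip(targets, predictions):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(targets)


targets = [[1, 0], [0, 1]]                  # one safe and one unsafe blast
confident = [[0.95, 0.05], [0.10, 0.90]]    # hypothetical well-trained model
uncertain = [[0.60, 0.40], [0.45, 0.55]]    # hypothetical weaker model

ce_confident = cross_entropy(targets, confident)
ce_uncertain = cross_entropy(targets, uncertain)
print(ce_confident < ce_uncertain)  # prints: True
```

Both hypothetical models classify the two blasts correctly, yet the more confident one has the smaller cross-entropy, which is why the loss is a finer selection criterion than accuracy alone.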
The aim of this study was to determine the safety aspects of a blast based on a predetermined distance. This made it possible to classify a blast into two classes: safe (< 90 m) and unsafe (> 90 m). Note that if the threshold distance changes to, say, 60 m, the same process is repeated, and ML is applied to determine the best classification algorithm. Although the distance is an important parameter, determining whether a blast is safe is a more significant goal. This classification strategy attempts to interpret the consequence of the flyrock distance in terms of the safety level assigned to a given blast. The results of the investigation indicated that Coarse DT, Cubic SVM, Fine KNN, and WNN were the four best machine-learning models to use. Among these, WNN and Cubic SVM displayed higher performance with a lower likelihood of overfitting. Fine KNN and Coarse DT were inferior to these two, with Coarse DT being the worst-performing model.
The underperformance of DT may have been due to its inability to properly learn the complex relationships between the features in the data. DTs are also highly susceptible to overfitting, that is, high accuracy on training data but low accuracy on test data, or exceptionally high results on the test data with relatively low results during the training and validation phases. For example, Coarse DT registered 66.7% validation accuracy and 87.5% test accuracy. In addition, DTs generally work better with large datasets, whereas the one investigated in this research had 29 sample points and is thus considered a small dataset.
KNN and DT are simple compared with SVM and ANNs. Only one hyperparameter, that is, the value of K (the number of closest neighbors considered when comparing distances between data points), is investigated for KNN, whereas ANNs can have many hyperparameters. ANNs operate as black-box models, meaning that the mechanism that produces the output is not clearly understood.
A high degree of nonlinearity calls for complex operations in which models such as ANNs and SVM work optimally. SVM can use kernels that map nonlinear data points into a higher-dimensional space where linear separation is possible. In this case, the cubic kernel performed best. A major advantage of SVM is its ability to classify datasets that have numerous attributes, even when a small sample size is available for training. This could be the reason for the exemplary performance of Cubic SVM presented in this paper. In ANNs, the hidden layers contain neurons that are largely responsible for their learning capabilities. In the WNN, the 100 neurons, which can be considered an intermediate number of neurons in the hidden layer, encouraged proper learning. Small neuron numbers were associated with underfitting. The WNN also did not have so many neurons as to cause overfitting, as evidenced by its ability to generalize well to the test data.
The research further investigated the classification ability of ANNs using the nprtool in MATLAB to train a two-layered FNN with the sole aim of achieving 100% classification accuracy. This was attained, showing that the FNN was equally capable and that, overall, it was a highly flexible model, since already trained data could be used for the next training phase, thus making the learning process quicker.
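The distance-based labeling described in this paragraph can be expressed as a one-line rule, which also shows how the classes change when the threshold moves from 90 m to 60 m. The Python sketch below is illustrative: the function name and the sample distances are hypothetical, and because the text does not say how a distance exactly equal to the threshold is treated, it is conservatively assumed here to be unsafe.

```python
def label_blast(flyrock_distance_m, threshold_m=90.0):
    """Label a blast by its flyrock distance.

    Assumption: a distance exactly at the threshold is treated as unsafe.
    """
    return "unsafe" if flyrock_distance_m >= threshold_m else "safe"


# Hypothetical flyrock distances in metres.
distances = [45.0, 88.5, 90.0, 120.0]

labels_90 = [label_blast(d) for d in distances]
labels_60 = [label_blast(d, threshold_m=60.0) for d in distances]
print(labels_90)  # prints: ['safe', 'safe', 'unsafe', 'unsafe']
print(labels_60)  # prints: ['safe', 'unsafe', 'unsafe', 'unsafe']
```

Changing the threshold alters the class balance, which is why the model-selection process would need to be repeated for a new distance.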
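The simplicity of KNN is apparent from how little code a basic version requires. The Python sketch below classifies a query point by majority vote among its K nearest training points using Euclidean distance. The two features and all data values are invented for illustration, feature scaling is omitted, and the default of K = 1 reflects the assumption that the Fine KNN preset uses a single neighbor.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Predict a label by majority vote among the k nearest neighbors."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (burden in m, powder factor in kg/m^3) pairs with labels.
train_x = [(3.0, 0.40), (3.2, 0.45), (2.8, 0.50), (2.0, 0.90), (1.8, 1.00)]
train_y = ["safe", "safe", "safe", "unsafe", "unsafe"]

pred_a = knn_predict(train_x, train_y, (2.1, 0.85), k=1)
pred_b = knn_predict(train_x, train_y, (3.1, 0.42), k=3)
print(pred_a, pred_b)  # prints: unsafe safe
```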
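The kernel idea mentioned above can be checked numerically. The Python sketch below assumes the common polynomial form K(x, y) = (x . y + 1)^3 for a cubic kernel (the exact scaling used by the MATLAB preset is not stated in the text) and verifies, for one-dimensional inputs, that the kernel value equals an ordinary inner product after an explicit mapping into four dimensions. The function names are hypothetical.

```python
import math


def cubic_kernel(x, y, offset=1.0):
    """Polynomial kernel of degree 3: (x . y + offset) ** 3."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + offset) ** 3


def cubic_feature_map_1d(x):
    """Explicit feature map for a scalar input and offset = 1.

    (x*y + 1)^3 = x^3 y^3 + 3 x^2 y^2 + 3 x y + 1, so the map is
    [x^3, sqrt(3) x^2, sqrt(3) x, 1].
    """
    root3 = math.sqrt(3.0)
    return [x ** 3, root3 * x ** 2, root3 * x, 1.0]


k_value = cubic_kernel([2.0], [3.0])
phi_dot = sum(a * b for a, b in
              zip(cubic_feature_map_1d(2.0), cubic_feature_map_1d(3.0)))
print(k_value)                              # prints: 343.0
print(math.isclose(k_value, phi_dot))       # prints: True
```

The kernel thus gives the SVM access to a linear separator in the mapped space without the mapped features ever being computed explicitly.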

A. SENSITIVITY ANALYSIS
Sensitivity analysis is a crucial process because it can demonstrate the relationship between input factors and their effect on the response variable. Therefore, it was possible to determine the importance of each feature on a relative basis. In this study, a sensitivity analysis was performed on all seven features to determine their degree of influence on the output. First, scatter plots of each feature against the dependent variable (i.e., the flyrock distance) were plotted. The statistical measure used to establish the influence of the features was R-squared, which indicates the proportion of variance in the dependent variable explained by the independent variable of interest. The R-squared values in Table 12 indicate the magnitude of the influence of each feature. To validate the results of the scatter plots, two additional statistical methods were employed: the ANOVA and ReliefF algorithms. These are two popular algorithms used for feature selection in machine learning. The results are presented in Figs. 21 and 22. To a large extent, all three methods were in agreement, suggesting that burden, stemming length, and powder factor were the most influential parameters on the throw distance. Additionally, they assigned varying degrees of importance to BTS, UCS, rock density, and crack density. The least influential parameters, as noted by all three methods, were the Uniaxial Compressive Strength (UCS), crack density, and BTS.
Box plots were also used to examine the strength of the relationship between the flyrock distance and two selected features, namely burden and powder factor, to which the flyrock distance is sensitive. An inverse relationship was observed between burden and flight distance (see Fig. 23). This conforms with most other studies that have established a similar relationship [35].
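For a single feature fitted against the flyrock distance with a straight line, R-squared equals the square of the Pearson correlation coefficient, so the scatter-plot measure described above can be sketched in a few lines of Python. The data values below are invented and are not taken from Table 12; the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical burden (m) and flyrock distance (m) for five blasts.
burden = [2.0, 2.5, 3.0, 3.5, 4.0]
distance = [110.0, 70.0, 95.0, 60.0, 75.0]

r = pearson_r(burden, distance)
r_squared = r ** 2  # share of variance in distance explained by burden
print(f"r = {r:.3f}, R-squared = {r_squared:.3f}")
```

In this invented example the negative sign of r reflects an inverse relationship, while R-squared alone only conveys the strength of the linear association, not its direction.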
Generally, a positive correlation was observed between powder factor and the resulting flight distance (see Fig. 24).This is supported by other research studies, which have shown that an increase in powder factor often results in a greater fragment throw distance.

IV. CONCLUSION
In this study, an alternative machine-learning method, that is, learning by classification for the prediction of flyrocks in a mine, was developed.
The following conclusions were drawn:
i. Wide Neural Networks (WNN) demonstrated the strongest classification power for the two categories of blasts, safe and unsafe, followed by Cubic SVM. This means that such models can be relied upon to predict the safety of future blasts in the mine.
ii. The performance of a model can be influenced by the dataset size. Coarse DT demonstrated poor machine-learning abilities. It had one false-negative result in the test set. It also exhibited the highest misclassification rate among the tested models. This may be attributed to the overfitting behavior common in most Decision Tree models. To a large extent, WNN, Cubic SVM, and Fine KNN performed well, indicating that a sample size of 29 blast observations was sufficient. A larger sample size would have provided a clearer picture of the classification ability of the selected learning models, particularly with a clear improvement in the performance of the Decision Tree model. Future work should therefore focus on a larger dataset to ascertain, with very high confidence, the performance of machine learning models.
iii. In the case of the mine investigated, the Powder Factor (PF), Stemming Length (SL), and Burden (B) were found to be the input parameters that had the most influence on the resulting fragment throw distance. Therefore, these variables should be given proper consideration during the blast design process to avoid flyrock accidents in mines.
iv. The input parameters in the dataset did not exhibit a strong linear relationship when gauged against the response variable. However, this study showed that machine learning by classification can be applied successfully, even for phenomena that may not be well represented using regression analysis.
v. Regression solutions are commonly validated using metrics such as the Coefficient of Determination (R-squared), Mean Absolute Error (MAE), and Mean Squared Error (MSE). Similarly, this study illustrates that, for classification problems, combining confusion-matrix measures (for example, accuracy and recall) with Receiver Operating Characteristic (ROC) analysis also offers a dependable evaluation of model performance. This alternative approach can aid domain experts in mining in making informed decisions.
Lastly, to the best of our knowledge, this study is the first of its kind to rely purely on machine learning through classification to show how flyrock problems can be analyzed. Therefore, it has opened up the possibility of deeper consideration of machine learning by classification as a solution to flyrock problems in other mines. By demonstrating the feasibility of using machine learning to solve flyrock problems, this study developed a framework for future studies to assess the performance of classification models. Future studies will aim to conduct additional test blasts to accumulate a robust dataset, that is, a sufficient sample of blast data for machine learning. Additionally, we intend to explore novel machine learning techniques, particularly those utilizing hybridization in machine learning (HML), to further evaluate their performance.

FIGURE 3. The critical stages of the research.

FIGURE 4. Correlation matrix of the blast dataset.

FIGURE 6. A perceptron with its four basic parts: input, bias, weights, and activation (step) function.

FIGURE 10. An example of a ROC curve.

FIGURE 11. Performance metrics comparison in regression models.

number of neurons (100 in its first layer, compared to 25 in the Medium Neural Network (MNN) and 10 in the Narrow Neural Network (NNN)). It also proved superior to Bilayered and Trilayered Neural Networks, which have two and three fully connected layers, respectively.

FIGURE 13. Confusion matrix and ROC for Coarse DT during the validation and testing phases.

FIGURE 14. Confusion matrix and ROC for Cubic SVM during the training and testing phases.

FIGURE 15. Confusion matrix and ROC for Fine KNN during the training and testing phases.

FIGURE 16. Confusion matrix and ROC for WNN during the validation and testing phases.

FIGURE 21. Strength of relation between input parameters and flyrock distance using the ReliefF algorithm.

FIGURE 22. Strength of relation between input parameters and flyrock distance using ANOVA.

TABLE 1. Examples of some of the most recent research on the flyrock problem using ML.

TABLE 4. The data set.

TABLE 5. Descriptive statistics of the dataset.

TABLE 7. The top four selected models.

TABLE 8. Hyperparameters for the top four selected models.

TABLE 9. Ranking of the top four machine models.

FIGURE 17. Two-layered feed-forward neural network architecture.

TABLE 11. Classification results using the two-layered FNN.

TABLE 12. Results from scatter plot analysis.