A Novel Intelligent Decision-Making Method of Shearer Drum Height Regulating Based on Neighborhood Rough Reduction and Selective Ensemble Learning

An intelligent shearer drum height regulating method is the key technology for mining at an unmanned coalface. In this study, a novel intelligent decision-making method of shearer drum height regulating is proposed, which makes decisions through a selective ensemble of Kernel Extreme Learning Machines (KELM) with a self-learning ability. In this approach, the coal shearing process of the shearer is characterized based on the extended finite state machine. Transfer attributes are introduced to establish the decision information system of shearer drum height regulating. Then, a neighborhood rough reduction method is proposed to generate distinctive attribute subsets, which are used to train the base classifiers based on the online KELM. Finally, we introduce an accuracy-guided forward search and post-pruning strategy to select part of the base classifiers for constructing an efficient and effective ensemble system for shearer drum lifting prediction. To evaluate the proposed method, four evaluation metrics are used: accuracy, precision, recall and the F1-score, which are the most popular metrics for evaluating the performance of a classifier. We use ten-fold cross validation to optimize the hyperparameters. The proposed method is compared in two different scenarios: 1) against three different classes of base classifier algorithms, namely Support Vector Machines (SVM), Classification and Regression Trees (CART) and K-Nearest Neighbor (KNN), and 2) against two traditional ensemble methods, namely bagging and random subspace. The proposed method is applied to field datasets, and the experimental results show that it is effective in comparison with the other approaches for shearer drum lifting prediction.


I. INTRODUCTION
A. BACKGROUND
The associate editor coordinating the review of this manuscript and approving it for publication was Gongbo Zhou.
As underground coal mining depth increases, disasters such as gas explosions, rock collapse and water inrush occur frequently during coal mining and seriously threaten the lives of coal miners in fully mechanized coalfaces [1]. In recent years, the unmanned fully mechanized coal mining method has been considered an effective way to solve this problem because of its high recovery ratio and extremely low mortality rate [2], [3]. The intelligent shearer, one of the main pieces of equipment in a fully mechanized coalface, is key to this method. However, intelligent regulation of the shearer drum height has always been a difficult part of making the shearer intelligent, since the natural boundary of coal seams is very irregular because roof rocks frequently sink and floor rocks frequently rise.
This makes it challenging for the shearer drum height to adapt to the undulating changes of the coal roof. A number of techniques for intelligent regulation of the shearer drum height have been devised so far [4], [5]. Roughly speaking, these techniques can be divided into two kinds: simple feedback regulating based on identification of the coal-rock interface, and intelligent regulating based on multi-information fusion. Simple feedback regulating uses the coal-rock interface information from the sensors to regulate the drum height directly. Widely researched simple feedback control techniques include control methods based on the gamma ray instrument [6], the radar coal thickness sensor [7], and thermal infrared-based coal seam tracking [8]. However, these technologies, which are based on simple drum height feedback, have not been widely applied because of the structural complexity of the coal seam and technical problems related to identification of the coal-rock interface. Furthermore, stable and practical online identification of the coal-rock interface is a complex problem, which will continue to limit simple and direct feedback regulating of the shearer for a long time.
To address these problems, automatic regulation of the shearer drum height based on multi-sensor information fusion can be considered a reasonable alternative. With the development of computational technologies, intelligent regulating methods for the shearer drum height based on field data have started to attract more attention from researchers. Chen et al. [9] developed a prediction method for the shearing trajectory using an LSTM neural network, in which the historical data of the shearer drum height was used as the only input information. Li et al. [10] presented an automatic regulating method based on the grey-Markov model. The predicted height data of the shearer drum were used to build the state transfer probability matrix of the Markov chain to achieve better precision and stability. However, Chen and Li did not achieve the expected results because a single source of input information can hardly describe the complex coal seam occurrence. Fan et al. [11] proposed a control method for an intelligent shearer height adjusting system based on Dynamic Fuzzy Neural Networks (D-FNN). However, the input parameters of the D-FNN are obtained from a mathematical model, which can hardly be adapted to all cases. Si et al. [12] combined a CNN and Dempster-Shafer evidence theory to propose an intelligent multisensor data fusion and recognition method. By analyzing the vibration acceleration and current signals of six shearing conditions, the fusion algorithm predicted the shearing trajectory of the shearer. Wang et al. [13] obtained a model for self-adaptive height adjustment of the shearer drum based on artificial immunity and memory cutting. In this work, the shearing and working parameters of the shearer were considered in order to adapt to changes in the geological conditions of the coal-seam boundary. Yang and Xiong [14] also proposed an adaptive height regulating model of the shearer drum, with two wavelet neural networks as identifier and controller respectively.
Application results show that this control system is more effective than those based on a normal neural network. Wang and Zhang [15] established an intelligent height adjustment model for shearer drums based on multi-sensor information fusion derived from the principle of minimum fuzzy entropy by testing vibration, current, acoustic emission and infrared signals during coal shearing. Xu et al. [16] proposed a self-adaptive cutting strategy using fuzzy theory, which can automatically adjust the drum height as well as judge whether the shearer is cutting rocks. However, it is a challenge to define an appropriate fuzzy set and the associated membership functions in such fuzzy logic-based models. Overall, none of these methods satisfies the expected standards of high accuracy, for two reasons. First, the attributes describing changes of the coal seam from a local perspective are not considered. Second, these methods depend on a single decision model, which makes it difficult to effectively guide the shearer to intelligently adjust the drum height in a fully mechanized coalface under complex and changeable geological conditions.

B. RESEARCH MOTIVATIONS
The research motivation of this study is twofold: the employment of the migration features and neighborhood rough reduction for describing and extracting the change law of the coal-rock interface, and the use of selective ensemble learning based on the online self-optimizing KELM in the height prediction of the shearer drum. The regulating pattern of the shearer drum height is classified into three types, that is, increasing its height, decreasing its height and holding its height. From this perspective, this research transforms the problem of shearer drum height prediction into a problem of pattern recognition (multiclass classification). Generally, an intelligent shearer height regulating classification framework based on multisensor information fusion can be divided into three stages, namely (1) data acquisition, definition of the perceptual parameters and data preprocessing, (2) classification and (3) the controller. The data are collected from the sensing system of the operating shearer. At present, the perceptual parameters used for shearer height regulating come from the shearing and working parameters of the shearer [9]-[12]. However, these parameters cannot describe the change law of the coal-rock interface to the cutting height adjusting system of the shearer drum. Therefore, we propose the migration features of the shearer working state to construct the complete parameter set, which describes the change law of the coal-rock interface from the local perspective. It is worth noting that using the migration features may introduce redundant parameters into the parameter set of the field data, which would constrain the efficiency and accuracy of the shearer height regulating model. To handle this problem and increase the classification accuracy, appropriate parameter (attribute) reduction techniques must be used. Rough set theory, which was introduced by Pawlak [17], has attracted much attention.
This methodology proves to be a powerful tool for attribute reduction [18], [19]. As we know, most of the applications select the reduction with the fewest attributes to construct a classifier at present [20], [21]. However, the information hidden in other reducts is wasted in this case. In order to avoid the decrease of accuracy of the height prediction caused by information loss, an essential technique needs to be employed.
Furthermore, a suitable classifier with a better performance must be employed to predict the regulating pattern of the height of the shearer drum. Several studies on developing ensemble models from single machine learning methods are found in the literature [22], [23]. Ensemble machine learning techniques combine the opinions of multiple learners to achieve a better performance. They allow a group of simple predictors to be used while achieving a better classification performance. Tsai [24] stated that ensemble models based on the combination of diverse classifiers accomplish a better classification performance by eliminating the other classifiers' errors. Vijaya and Sivasankar [25] proposed a signal classification model that combines feature selection with bagging ensemble classifiers to enhance performance in terms of training time and accuracy. Hu et al. [26] presented an ensemble classifier whose base classifiers are built on different attribute reduction subsets obtained by sampling randomly from the original feature set, and of which only part are selected. Saha et al. [27] used SVM, Random Forest, Decision Tree and NB to build an ensemble learning method based on majority voting for the prediction of protein interactions. In multiclass classification, Zhang et al. [28] used different classifier models to build an ensemble learning method for the prediction of rockburst intensity, and the algorithm using an ensemble classifier showed a better classification performance. Hence, the main motivation of this study is to employ ensemble classifiers in intelligent height adjusting of the shearer drum to achieve a better prediction performance.
In this paper, a novel intelligent decision-making method of shearer drum height regulating is proposed based on neighborhood rough reduction and selective ensemble learning. The model can utilize the useful attribute information more effectively, and it improves the efficiency and accuracy of the height prediction through the fast learning of the KELM. Furthermore, the fast learning and generalization performance of the proposed model is enhanced by the online self-learning KELM rather than by general classification algorithms alone. The main contributions of this paper are as follows:
(1) First, a novel intelligent decision-making method of shearer drum height regulating is proposed based on neighborhood rough reduction and selective ensemble learning.
(2) Second, we propose the migration features to describe the change law of the coal-rock interface from the local perspective to improve the generalization performance of model with irregular changes of coal-rock interface.
(3) Third, we selectively combine some of the reducts, from which redundant attributes have been removed by attribute reduction based on the multi-granularity neighborhood rough set, so that the information hidden in the reducts is utilized effectively to improve the accuracy of shearer height regulating.
(4) Finally, the online self-optimizing KELM algorithm is used in the ensemble classifier to achieve self-adaptive adjustment of the shearer drum height regulating model.
Besides, various experiments are conducted to compare the performance of the proposed method with that of the bagging and random subspace ensemble classifiers for shearer drum height regulating.
The rest of this paper is organized as follows. Section 2 introduces the modeling process of the intelligent decision-making model of shearer drum height regulating. Section 3 describes the full experiments, including performance evaluation, hyperparameter tuning, and verification of the performance of the proposed method. The main results of this study are summarized and discussed in Section 4.
(1) Field data preprocessing. First, the parameter (attribute) set of the cutting and working state in the shearer process, including general attributes and migration attributes, is constructed. The decision information system for the shearer drum height regulating is established based on field data. Second, the different attribute subsets are obtained by the attribute reduction technology based on the multi-granular neighborhood rough set. Finally, the training sample sets and test sample sets are established corresponding to each attribute subspace.
(2) Training and validating the base classifiers based on the kernel extreme learning machine. Then, an accuracy-guided forward search and post-pruning strategy is used to select part of the base classifiers for the ensemble system.
(3) Application and online self-optimization of the SDHRM-SEoKELMRS model.
Different modules are described in the following subsections.
VOLUME 9, 2021

A. ESTABLISHMENT OF THE SDHR ATTRIBUTE SPACE
1) REFINED REPRESENTATION OF THE COAL SHEARING PROCESS OF THE SHEARER
The shearer is one of the key pieces of equipment in a fully mechanized coalface, and its main tasks are shearing and loading coal. Through the automatic height adjustment device, the shearer adjusts the drum to adapt to changes of the coal seam thickness in the fully mechanized coalface, so that it can cut the coal seam along the coal-rock interface. Its principles are shown in Fig. 2. X is the running direction of the shearer in the working face. Y is the advancing direction of the shearer in the working face. Z is the cutting height direction of the shearer.
A digital representation of the shearing process is the basis for data-driven intelligent height adjustment of the shearer drum. Mataric [29] found that the interaction between an agent and the working environment can be modeled as synchronous finite-state automata. Thus, we propose a digital method for describing the coal cutting process of the shearer based on an extended finite-state machine. In this way, we establish the mapping relationship between the state parameters and the behaviors of the shearer drum.
Definition 1: The extended finite-state machine of the coal cutting process of the shearer is formulated as a six-tuple $SCEFSM = (Q, \Sigma, D, W, \delta, \lambda)$, where $Q = \{q_1, \ldots, q_n\}$ is the finite set of sensing parameters of the coal cutting process of the shearer, $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ is the value set of $Q$, $D$ is the set of behavior parameters of the shearer drum, $W$ is the value range of the behavior parameters, $\delta: Q \times \Sigma \to Q$ is the state transition function of the shearer, and $\lambda: Q \times \Sigma \to W$ is the mapping between the finite sensing parameters and the behavior parameters. In addition, a three-tuple $S = (Q, \Gamma, D)$ is defined as the pattern, where $\Gamma = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_n\}$ is the set of migration features of the coal cutting process of the shearer. $C = \{Q, \Gamma\}$ is defined as the set of condition attributes of the decision information system for the intelligent height regulating of the shearer.

2) CONSTRUCTION OF THE ATTRIBUTES SET
To realize intelligent decision-making of the shearer drum height regulating based on multi-information fusion, it is important to establish the mapping relationship between the state parameters, the change law of the coal-rock interface and the behaviors of the shearer drum. We propose the transfer attribute set to construct the attribute set, which forms the pattern space $S$ of the decision information system of the shearer drum height regulating.
The attribute set of the decision information system of the shearer drum height regulating needs to satisfy the essential conditions as follows: (1) the scale of the attributes does not change as the system changes, (2) there is a high correlation between the attributes and the behavior modes of the shearer drum, and (3) the attributes should be as complete as possible to reflect the changes of working state of the shearer drum. We establish the attributes set which is shown in Table 1 based on these three principles and engineering applications.
It can be seen from Table 1 that the attributes are not affected by the system scale. Furthermore, the attributes are mainly physical parameters reflecting the changes of the working state of the shearer drum. The parameters $c_1$-$c_5$ give the position and attitude of the shearer. The parameters $c_6$-$c_{14}$ reflect the working information of the shearer. We define the changes of the parameters $c_6$-$c_{14}$ as the migration features. The parameters $c_{18}$-$c_{26}$ are the migration attributes of the operational process, which reveal the evolution law of the coal-rock interface from the local perspective as the cutting trajectory of the shearer drum changes in time and space. The values of the migration features are obtained as the differences in the values of the parameters $c_6$-$c_{14}$ between adjacent sampling periods. Based on the attribute set in Table 1, and to simplify the analysis, we establish the decision information system of the shearer drum height regulating mode.
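The differencing step described above can be illustrated with a minimal sketch. The function names `migration_features` and `extend_with_migration` are hypothetical, the rows stand in for the working parameters sampled at consecutive periods, and setting the first period's migration values to zero is an assumption, since the text does not say how the first sample is handled.

```python
def migration_features(samples):
    """Return first differences between adjacent sampling periods.

    samples: list of rows, each row a list of working-parameter values
    taken at consecutive sampling periods.
    """
    if not samples:
        return []
    # The first period has no predecessor, so its migration values are set
    # to 0.0 (an assumption made for this sketch).
    migrated = [[0.0] * len(samples[0])]
    for prev, curr in zip(samples, samples[1:]):
        migrated.append([c - p for p, c in zip(prev, curr)])
    return migrated


def extend_with_migration(samples):
    """Append the migration features to each row (general + migration attributes)."""
    return [row + diff for row, diff in zip(samples, migration_features(samples))]
```

Each extended row then carries both the current working parameters and their change since the previous sampling period.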
Definition 2: The decision information system of the shearer drum height regulating, DIS-SDHR is formulated as a five-tuple

B. CONSTRUCTION OF THE ROUGH SUBSETS
The key problem in using the attribute subset ensemble method to adjust the height of the shearer drum is how to obtain a set of attribute subsets with good predictive power. In this section, we employ multigranular neighborhood rough sets to obtain the attribute subsets based on the decision information system of the shearer drum height regulating.

1) MULTIGRANULAR ATTRIBUTE REDUCTION FOR DIS-SDHR
Rough set theory was first proposed by Prof. Pawlak for processing imprecise, vague, and uncertain data [17]. Attribute reduction is an important application of rough set theory that is used to remove the redundant and irrelevant attributes in an attribute set. The attribute values of the decision information system of the shearer drum height regulating include both continuous and discrete data. However, Pawlak rough sets can only deal with discrete data. To deal with this problem, the neighborhood rough set and attribute reduction technology [30] are used to establish the subspaces of the attribute set by changing the neighborhood radius.
Given a decision information system of the shearer drum height regulating, DIS-SDHR, $U = \{x_1, x_2, \ldots, x_n\}$ is a finite set of samples, $C$ is the condition attribute set and $D = \{d\}$ is the decision attribute set. $B \subseteq C$ is a real-valued attribute subset for describing the samples. The δ-neighborhood of an arbitrary sample $x_i \in U$ on $B$ is defined as
$$\delta_B(x_i) = \{x_j \mid x_j \in U,\ \Delta_B(x_i, x_j) \le \delta\} \tag{1}$$
where $\delta_B(x_i)$ is also called the δ-neighborhood information granule of $x_i$, δ is the neighborhood threshold, which is a real number between 0 and 1, and $\Delta_B(\cdot, \cdot)$ is the distance function. Formula (1) represents the set of samples whose attribute values are similar to those of sample $x_i$ on the attribute set $B$. The smaller δ is, the higher the similarity between sample $x_i$ and sample $x_j$ in the attribute space $B$, and the smaller the number of individuals in the neighborhood. When δ = 0, the neighborhood rough set becomes the classical rough set. Furthermore, according to the decision attribute $d$, the sample set is divided into the sample set $d_0$ of increasing drum height, the sample set $d_1$ of decreasing drum height, and the sample set $d_2$ of holding drum height. The positive region of the sample set with respect to the feature set $B$ is defined as follows.
$$POS_B(D) = \bigcup_{k=0}^{2} \underline{N}_B d_k, \qquad \underline{N}_B d_k = \{x_i \mid \delta_B(x_i) \subseteq d_k,\ x_i \in U\} \tag{2}$$
According to formula (2), all samples with similar feature values to a sample which is in the positive region, have the same classification. This shows that the sample can be accurately classified under the feature set B. The more samples there are in the positive domain, the better the separability of the attribute set and the stronger the classification ability, which is beneficial to the learning of the classification algorithm. According to the decision information system of the shearer drum height regulating, neighborhood rough set-based attribute reduction is introduced to generate a set of reducts, and then each reduct is used to train a base classifier. To achieve this goal, a forward feature selection strategy is used based on attribute importance.
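The following is a minimal sketch of the ideas in this subsection: a δ-neighborhood, the positive region, and a greedy forward selection driven by attribute significance. It assumes a Euclidean distance on normalized attributes and measures significance as the increase in the fraction of samples lying in the positive region; both are assumptions for illustration, and all function names are hypothetical rather than taken from the paper.

```python
import math


def neighborhood(X, i, B, delta):
    """Indices of samples within distance delta of sample i on attributes B."""
    return [j for j in range(len(X))
            if math.dist([X[i][a] for a in B], [X[j][a] for a in B]) <= delta]


def positive_region(X, y, B, delta):
    """Samples whose whole delta-neighborhood shares their decision label."""
    return [i for i in range(len(X))
            if all(y[j] == y[i] for j in neighborhood(X, i, B, delta))]


def dependency(X, y, B, delta):
    """Fraction of samples in the positive region (0.0 for an empty attribute set)."""
    if not B:
        return 0.0
    return len(positive_region(X, y, B, delta)) / len(X)


def forward_reduct(X, y, delta, eps=1e-9):
    """Greedy forward selection: add the most significant attribute until no gain."""
    remaining = list(range(len(X[0])))
    reduct, best = [], 0.0
    while remaining:
        gains = [(dependency(X, y, reduct + [a], delta) - best, a) for a in remaining]
        # Largest gain wins; ties go to the attribute with the lower index.
        gain, attr = max(gains, key=lambda g: (g[0], -g[1]))
        if gain <= eps:
            break
        reduct.append(attr)
        remaining.remove(attr)
        best += gain
    return reduct
```

Calling `forward_reduct` with different values of `delta` gives attribute subsets at different granularities, which is one simple way to obtain several subsets for training base classifiers.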
The significance of the attribute $c$ in $B$ relative to the decision $d$ is given by formula (4).
Based on the advantages of the KELM with its high learning speed, an online self-learning extreme learning machine classifier of the shearer drum height regulating (OSL-KELM-SDHR) is proposed, which is combined into an ensemble classifier for adjusting the cutting height of the shearer drum.
In this section, the OSL-KELM-SDHR model is discussed in detail.

1) ELM-SDHR BASE CLASSIFIER
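Let the training samples of the field data be $M = \{(x_i, T_i)\}_{i=1}^{N}$, where $x_i \in R^d$ is the input attribute vector and $T_i \in R$ is the output value corresponding to $x_i$ (different values represent different adjustment patterns). The ELM model is defined as follows [31].
$$\min_{\beta}\ \frac{1}{2}\|\beta\|^2 + \frac{c}{2}\sum_{i=1}^{N}\xi_i^2 \quad \text{s.t.}\quad h(x_i)\beta = T_i - \xi_i,\quad i = 1, \ldots, N \tag{5}$$
where $h(x_i)$ is the attribute mapping from the $d$-dimensional input to the $L$-dimensional hidden layer attributes, $\xi_i$ is the training error corresponding to the $i$th sample and $c$ is the regularization parameter, $c \in R^+$.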
Based on the KKT optimization conditions, the optimization problem of formula (5) is solved, and the weight vector parameters $\beta_{BELM}$ of the shearer shearing path adjustment model are calculated as
$$\beta_{BELM} = H^T\left(\frac{I}{c} + HH^T\right)^{-1} T \tag{6}$$
where $T = [T_1, T_2, \ldots, T_N]^T$ is the target value vector of the input samples, and $H = [h(x_1)^T, \ldots, h(x_N)^T]^T$ is the mapping matrix of the input samples. Assuming that the input vector of a new sample is $x_p$, the height regulating mode $T_{scp}$ of the shearer drum can be calculated by formula (7).
$$T_{scp} = h(x_p)\,\beta_{BELM} \tag{7}$$

2) KELM-SDHR BASE CLASSIFIER
To further improve the accuracy of the base classifier and enhance its generalization performance under complex coal seam conditions, we propose a KELM-SDHR classifier model based on KELM. Given a new input vector $x_p$, the output $T_{scp}$ is calculated according to the weight vector parameter $\beta_{RELM}$.
According to the theory of the inner product of the kernel function [32], we can directly use the kernel function instead of the explicit nonlinear mapping $HH^T$ of the ELM hidden layer nodes to improve the robustness and generalization ability of the ELM. Formula (9) is the classification model of the KELM-SDHR. The model requires neither an explicit hidden layer attribute mapping function $h(x)$ nor the number of hidden layer neurons $L$.
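As an illustration of the kernel substitution, the sketch below implements the widely used kernel ELM form $f(x) = [K(x, x_1), \ldots, K(x, x_N)](I/c + \Omega)^{-1}T$ with $\Omega_{ij} = K(x_i, x_j)$. It is not the authors' MATLAB implementation: the RBF kernel, the one-vs-rest encoding of the three regulating modes, the parameter values and the small Gaussian-elimination solver are all assumptions chosen to keep the example self-contained.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two attribute vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


class KELM:
    """Kernel ELM sketch: scores are K(x, X) (I/c + Omega)^-1 T."""

    def __init__(self, c=100.0, gamma=1.0):
        self.c, self.gamma = c, gamma

    def fit(self, X, labels):
        self.X = X
        self.classes = sorted(set(labels))
        n = len(X)
        # A = I/c + Omega, where Omega is the kernel matrix of the training samples.
        A = [[rbf_kernel(X[i], X[j], self.gamma) + (1.0 / self.c if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        # One output column per class, with one-vs-rest targets in {-1, +1}
        # (an assumption; the text uses a single real-valued output T_i).
        self.alpha = [solve(A, [1.0 if t == k else -1.0 for t in labels])
                      for k in self.classes]
        return self

    def predict(self, x):
        k_vec = [rbf_kernel(x, xi, self.gamma) for xi in self.X]
        scores = [sum(k * a for k, a in zip(k_vec, alpha)) for alpha in self.alpha]
        return self.classes[scores.index(max(scores))]
```

No hidden layer size appears anywhere in the sketch, which mirrors the point made above: only the kernel and the regularization parameter have to be chosen.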

3) ONLINE SELF-LEARNING KELM-SDHR BASE CLASSIFIER
The boundary of a coal seam in a fully mechanized coalface is complex and changeable, but it follows certain laws. As the working face advances, new structural information about the coal seam is exposed. However, the KELM-SDHR is unable to learn the newly acquired structural information of the coal seam online, so its accuracy may decline over time. It is therefore necessary to provide a dynamic update mechanism to maintain the prediction accuracy of the KELM-SDHR. In view of this, this paper proposes a dynamic OSL-KELM-SDHR model based on the KB-IELM [33]. The labeled data are recorded by a remote control system in real time while the shearer is operating. The labels of new samples are based on the manual control and the ensemble controller. The main steps of the OSL-KELM-SDHR are as follows.
Step 1: Initialization, where $N_0$ is the number of initial samples and $K_t = [K(x, x_1), \ldots, K(x, x_{N_0})]$.
Step 2: Online self-learning. Repeat the following process. It is assumed that $k$ new samples with labels are obtained in the working process of the shearer. A new sample set is formed from the initial samples and the new samples, and it is used to train a new classifier. Here $K_{t+1} = [K(x, x_{N_0+1}), \ldots, K(x, x_{N_0+k})]$ is the kernel mapping of $x$ for the newly added training samples. Using the block matrix inverse formula [34], $A_{t+1}^{-1}$ can be obtained, and the incremental updating basis for the classification model OSL-KELM-SDHR is obtained by incorporating formula (11) into formula (10).
Step 3: Online use. The application of the OSL-KELM-SDHR includes two stages.
(1) Self-learning optimization of the basic classifier. The learning cycle and the number of newly added samples are determined by considering the working cycle of the shearer and the structure of the coal seam. Assume that $N_i$ samples are obtained within the learning time period $t$. Because of local sudden changes of the coal seam, the newer the samples, the better the generalization of the trained classifier. Therefore, a forgetting mechanism is needed to enhance the impact of new samples on the classifier and to remove the old samples. Assume that $N_{max}$ is the maximum number of training samples in the model. When the model is updated for the $(m + 1)$th time, the OSL-KELM-SDHR discards $d$ samples according to their historical timing so that the number of samples does not exceed $N_{max}$.
(2) Decision-making using the basic classifier. Suppose that the total number of samples is $M \le N_{max}$ after several updates; these samples are then used to calculate the parameter $A_M^{-1}$. When the input sample is $x_{test}$, the shearer drum performs the action
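The forgetting mechanism can be sketched as a fixed-size sliding window over the labelled samples. This is only the sample bookkeeping, under the assumption that the oldest samples are the ones discarded; it does not perform the block-matrix update of the inverse, and the class name `ForgettingBuffer` is hypothetical.

```python
from collections import deque


class ForgettingBuffer:
    """Sliding window of labelled samples; the oldest are dropped beyond n_max."""

    def __init__(self, n_max):
        self.samples = deque(maxlen=n_max)

    def update(self, new_samples):
        """Add newly labelled samples and return how many samples were discarded."""
        overflow = max(0, len(self.samples) + len(new_samples) - self.samples.maxlen)
        self.samples.extend(new_samples)  # deque evicts from the left (oldest first)
        return overflow

    def training_set(self):
        """Return the attribute vectors and labels currently kept in the window."""
        xs = [x for x, _ in self.samples]
        ys = [y for _, y in self.samples]
        return xs, ys
```

After each call to `update`, the base classifier would be retrained or incrementally updated on `training_set()`, so that recent coal seam conditions dominate the model.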

D. SELECTIVE ENSEMBLE
The base classifiers trained with the reducts and the OSL-KELM-SDHR algorithm are expected to have good generalization power. Moreover, the classifiers trained with different reducts should be diverse. Constructing an ensemble model from the neighborhood rough set-based reducts therefore seems a good solution. However, experimental analysis shows that there are usually hundreds of reducts for the field data. This raises the question of whether all the reducts need to be used. If only a small number of base classifiers are combined, the diversity of the base classifiers is difficult to reflect fully, and the ensemble classification accuracy is insufficient. If all base classifiers are combined, the classification accuracy also decreases because of the characteristics of ensemble learning. However, choosing the base classifiers remains difficult [35]. Although a large number of metrics and selection algorithms have been proposed, no selective ensemble method is consistently better than the others in engineering applications [36]. Therefore, we introduce an accuracy-guided forward search and post-pruning strategy, FS-PP [26], which selects part of the base classifiers to construct an efficient and effective ensemble system for shearer drum lifting prediction.
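A minimal sketch of an accuracy-guided forward search followed by post-pruning is given below. It is not the FS-PP procedure of [26] itself: majority voting, the rule of accepting any classifier that does not lower the validation accuracy, and the pruning order are assumptions made for illustration. The input `all_preds` is a hypothetical list holding each base classifier's predictions on a validation set.

```python
from collections import Counter


def vote(predictions):
    """Majority vote per sample over a list of per-classifier prediction lists."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]


def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def ensemble_accuracy(members, all_preds, truth):
    """Validation accuracy of the majority vote of the selected members."""
    return accuracy(vote([all_preds[m] for m in members]), truth)


def fs_pp(all_preds, truth):
    """Forward search, then post-pruning, both guided by validation accuracy."""
    order = sorted(range(len(all_preds)),
                   key=lambda m: accuracy(all_preds[m], truth), reverse=True)
    selected, best = [order[0]], accuracy(all_preds[order[0]], truth)
    for m in order[1:]:  # forward search: keep a classifier if accuracy does not drop
        acc = ensemble_accuracy(selected + [m], all_preds, truth)
        if acc >= best:
            selected, best = selected + [m], acc
    for m in list(selected):  # post-pruning: drop members whose removal does not hurt
        rest = [s for s in selected if s != m]
        if rest and ensemble_accuracy(rest, all_preds, truth) >= best:
            selected, best = rest, ensemble_accuracy(rest, all_preds, truth)
    return selected, best
```

In this sketch the forward pass tends to over-select and the pruning pass removes members that add nothing, so the final ensemble is usually much smaller than the pool of candidates.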

2) DATA PREPROCESSING
The prediction of the adjustment mode of a shearer shearing drum is a complex classification problem based on data mining. The important premise behind machine learning theory is that we must have enough sample data with high quality. The quality of the data determines the quality of the model extracted by the classification algorithm or trained. When the sample quality is poor, it is difficult for the proposed model to achieve the expected results. Therefore, in the data preprocessing stage, the raw data set is processed to remove missing values and outliers, and then the data are normalized to construct a high-quality data set.
There are some data missing in the original sample data, including one-dimensional and multidimensional data. For samples with missing one-dimensional data, this article uses Lagrangian interpolation to impute the missing values; for samples with missing multidimensional data, to ensure accuracy, the sample is deleted. Some shearer shearing drum adjustment mode records are outliers; due to electrical failure or sensor failure, encountering some abnormal values is inevitable. According to the actual characteristics of these abnormal data, they are treated as missing values or corrected by using average values, and some are deleted.
To eliminate the dimensional interactions between the attributes, the original sample data are subjected to min-max normalization [37], which retains the relationships in the original data and maps each attribute value to [0, 1]. After preprocessing and normalization, the data set contains 23,827 samples. Among the output variables, there are 6,995 samples in which the shearer drum rises, 7,728 samples in which the shearer drum falls, and 9,104 samples in which the height remains the same. The data set composed of the attribute set {$c_1$-$c_{14}$, $d$} is called data set I, and the data set composed of the attribute set {$c_1$-$c_{26}$, $d$} is called data set II. The statistical results of the output variables in the two data sets are shown in Fig. 4. The sample distribution is slightly imbalanced.
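Min-max normalization maps each attribute value $v$ to $(v - v_{min}) / (v_{max} - v_{min})$ per attribute. A minimal sketch follows; the function name is hypothetical, and mapping a constant attribute to 0.0 is an assumption, since the text does not discuss that case.

```python
def min_max_normalize(rows):
    """Scale every attribute (column) of rows linearly to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    # A constant column has zero span; it is mapped to 0.0 (an assumption).
    return [[(v - lo) / span if span else 0.0
             for v, lo, span in zip(row, lows, spans)] for row in rows]
```

Because the mapping is linear within each attribute, the ordering and relative spacing of the original values are preserved.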

3) DATASET ESTABLISHMENT
This study uses 10-fold cross-validation to evaluate the performance of the proposed model. The attribute subsets are obtained based on the field data. For each attribute subset, a corresponding sample subset is generated, and the number of samples is the same as that of the original data (duplicate samples are not removed). The samples are split into a training subset containing 70% of the samples and a subtest set containing the remaining 30%. The whole training (testing) set is composed of multiple training (subtest) subsets, and the training (subtest) subsets depend on the attribute reduction subsets. All training subsets are partitioned into 10 folds for cross-validation (one fold for validation and nine folds for training), and this is done 10 times on each dataset, that is, each classifier is trained 10 times (Fig. 5). The validation sets in all training sets are used to obtain the optimal hyperparameters for the base classifiers.
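The splitting scheme can be sketched with index lists only. The helper names, the fixed random seeds and the shuffling before splitting are assumptions for illustration; the text does not state whether the field samples were shuffled.

```python
import random


def holdout_split(n_samples, train_ratio=0.7, seed=0):
    """Shuffle indices and split them into training and subtest parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_ratio)
    return indices[:cut], indices[cut:]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]
```

Hyperparameters would be chosen by averaging the validation accuracy over the ten `(train_idx, valid_idx)` pairs drawn from the 70% training part, leaving the 30% subtest part untouched until the final evaluation.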

B. PERFORMANCE MEASURES
To accurately and comprehensively evaluate the ensemble classification model for shearer shearing drum height regulating mode, four evaluation metrics have been used: accuracy, precision, recall, and F1-scores [38] which are the most popular metrics to measure the performance of a classifier. A confusion matrix helps to intuitively compare and understand the classification performances of the proposed model. In the confusion matrix, predicted categories are represented by rows, and actual categories are represented by columns. Table 2 is the confusion matrix for classifications of the shearer shearing drum height regulating mode, where U, D and S represent increasing its height, decreasing its height and holding. UU, DD, and SS represent the number of samples with correct predictions, and other combinations represent the number of samples with incorrect predictions.
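The four metrics can be computed directly from such a confusion matrix. The sketch below follows the row and column convention stated above (rows are predicted classes, columns are actual classes) and uses macro averaging over the three modes; computing the F1-score per class and then averaging is an assumption, and the function name is hypothetical.

```python
def classification_metrics(cm):
    """Accuracy and macro-averaged precision, recall and F1.

    cm[i][j] counts samples predicted as class i whose actual class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        predicted_k = sum(cm[k])                    # row total
        actual_k = sum(cm[i][k] for i in range(n))  # column total
        p = cm[k][k] / predicted_k if predicted_k else 0.0
        r = cm[k][k] / actual_k if actual_k else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Macro averaging weights the three regulating modes equally, which is a reasonable choice when the class distribution is only slightly imbalanced.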
Based on the confusion matrix, the classification performance evaluation indexes of this study are defined as follows.
$$\text{Accuracy} = \frac{UU + DD + SS}{p}$$
where $p$ is the total number of test samples. Generally, precision and recall are applied together in the F-measure to assess the prediction accuracy of a classification model. For the height prediction problem of the shearer drum, these measures are averaged over the three regulating modes; for precision,
$$\text{Precision} = \frac{1}{3}\sum_{k \in \{U, D, S\}} \text{Precision}_k$$
where $\text{Precision}_k$ is the precision of mode $k$.
In this study, we use MATLAB R2018b for program development and perform our experiments on a hardware platform with an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz, 16.0 GB RAM, and the Windows 7 64-bit operating system.

2) MODEL PARAMETERS
The parameters of the proposed model fall into two categories: the neighborhood radius and the number of seeds K of the neighborhood rough set attribute reduction (NRS-AR), and the parameters of the base classifiers ELM-SDHR and KELM-SDHR. We obtain the attribute subsets by varying K while the neighborhood radius lies in the range [0.1, 0.6] [39]. The initial value of K is 50. All hyperparameters of the base classifiers are selected by 10-fold cross-validation, keeping the values that yield the highest accuracy. The hyperparameters of the different classifiers in the proposed model are listed in Table 3.
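To illustrate what the neighborhood radius controls, the sketch below computes the dependency degree used in standard neighborhood rough sets: the fraction of samples whose whole neighborhood carries the same decision label. This is the generic textbook definition, not the paper's NRS-AR algorithm, and it does not model the seed number K; the function names, the Euclidean distance, and the toy data are assumptions made for the example.

```python
import math

def neighborhood(data, i, attrs, delta):
    """Indices of samples within distance delta of sample i on the chosen attributes."""
    xi = data[i]
    return [j for j, xj in enumerate(data)
            if math.sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs)) <= delta]

def dependency(data, labels, attrs, delta):
    """Fraction of samples whose entire delta-neighborhood shares their label."""
    consistent = 0
    for i in range(len(data)):
        if all(labels[j] == labels[i] for j in neighborhood(data, i, attrs, delta)):
            consistent += 1
    return consistent / len(data)

# Toy samples with two normalised attributes; labels U / D / S are illustrative.
data = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90), (0.85, 0.80), (0.50, 0.50)]
labels = ["U", "U", "D", "D", "S"]
small_radius = dependency(data, labels, attrs=(0, 1), delta=0.1)
large_radius = dependency(data, labels, attrs=(0, 1), delta=0.6)
```

With the small radius every neighborhood is pure, so the dependency is high; with the large radius the neighborhoods mix classes and the dependency falls, which is why the radius affects which attributes a reduction keeps.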

3) COMPARISON APPROACH
To evaluate the proposed method, the proposed SDHRM-SEoKELMRS is compared from three different perspectives. (1) Three base classifiers, listed in Table 4, are used. Many learning algorithms can currently serve as base classifiers for ensemble classification; we select three of the most popular classification algorithms, each from a different category, to compare with the base classification algorithm selected in this article. All of these algorithms are taken from the classification learning algorithm package of MATLAB R2018b. They are simple, effective, and widely used in ensemble classification problems. There is no unified theory or guiding method for selecting the hyperparameters of the algorithms listed in Table 4, so this paper uses 10-fold cross-validation experiments to determine the optimal hyperparameters of each base classifier.
(2) Two traditional integrated classifiers are used: the bagging and random subspace methods. To show the superiority of the proposed FS-PP ensemble classification method, we use the bagging and random subspace methods to establish the same number of base classifiers as in the FS-PP and integrate them to carry out related comparative experiments.
(3) Two original attribute sets are used: the original attribute set containing transfer attributes and the original attribute set without transfer attributes. To show the influence of the transfer attributes on the classification of shearer drum adjustment modes, the FS-PP ensemble classifier proposed in this paper is used to compare the classification results on the two original attribute sets.
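The two traditional ensemble methods in item (2) differ in what they perturb, which the following Python sketch makes explicit: bagging resamples the rows (samples) with replacement, whereas the random subspace method resamples the columns (attributes). The function names and sizes are hypothetical, and the sketch only builds the index views that each base classifier would be trained on; FS-PP instead takes its attribute subsets from the neighborhood rough set reduction.

```python
import random

def bagging_views(n_samples, n_attrs, n_models, seed=0):
    """Bagging: each model sees all attributes but a bootstrap sample of the rows."""
    rng = random.Random(seed)
    return [([rng.randrange(n_samples) for _ in range(n_samples)],
             list(range(n_attrs)))
            for _ in range(n_models)]

def random_subspace_views(n_samples, n_attrs, n_models, subspace_size, seed=0):
    """Random subspace: each model sees all rows but a random subset of attributes."""
    rng = random.Random(seed)
    return [(list(range(n_samples)),
             sorted(rng.sample(range(n_attrs), subspace_size)))
            for _ in range(n_models)]

# Illustrative sizes only.
bag = bagging_views(n_samples=100, n_attrs=12, n_models=5)
rsm = random_subspace_views(n_samples=100, n_attrs=12, n_models=5, subspace_size=6)
```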
Our comparison proceeds as follows. In Section D.1, we show the attribute reduction subsets obtained with the neighborhood-granulated rough set; in Section D.2, we report the classification results of different base classifiers on the field data set; in Section D.3, we show the relationship between the number of base classifiers and the ensemble classification performance indicators; finally, in Section D.4, we compare the ensemble classification method proposed in this paper with the traditional ensemble classification methods and present the classification results of the ensemble system under the two attribute spaces.

1) ATTRIBUTE SUBSET
With the neighborhood threshold in the interval [0.1, 0.6], Algorithm [26] is used to perform attribute reduction on the field data set by adjusting the number of granulated seeds K in the neighborhood. To ensure that the classification information of the attribute reduction subsets is complementary, we select as many attribute subsets as possible. In addition, to ensure the diversity of the base classifiers, the attribute subsets should differ from one another in at least 30% of their attributes. Combining these two requirements, Table 5 shows the final numbers of reduced subsets without and with transfer attributes.
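One way to enforce the 30% diversity requirement is sketched below in Python. The text does not say how the difference between two attribute subsets is measured, so the sketch assumes the share of attributes in the union that the two subsets do not have in common; the function names, the greedy order, and the attribute labels are likewise hypothetical.

```python
def difference(a, b):
    """Share of attributes in the union of a and b that are not common to both."""
    a, b = set(a), set(b)
    union = a | b
    return len(a ^ b) / len(union) if union else 0.0

def select_diverse(subsets, min_diff=0.30):
    """Greedily keep subsets that differ from every kept subset by at least min_diff."""
    kept = []
    for s in subsets:
        if all(difference(s, k) >= min_diff for k in kept):
            kept.append(s)
    return kept

candidates = [
    {"a1", "a2", "a3", "a4"},
    {"a1", "a2", "a3", "a4", "a5"},   # differs from the first by 1/5: dropped
    {"a1", "a2", "a6", "a7"},         # differs from the first by 4/6: kept
]
kept = select_diverse(candidates)
```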

2) THE PERFORMANCE OF SINGLE CLASSIFIERS
Using the optimal parameters of each classification algorithm given in Table 3 and Table 4, we analyze the classification performance of each algorithm on the shearer drum height adjustment data set. The base classifiers listed in Table 3 are evaluated independently on the field data set (data set I does not contain transfer attributes and data set II contains them), and the accuracy, precision, recall and F1-score of each base classifier are shown in Table 6 and Table 7. Table 6 contains the prediction results for the data set without transfer attributes, and Table 7 contains those for the data set with transfer attributes. It can be seen that the KELM has the best prediction performance on both data sets, with the highest prediction accuracy rates of 71.99 and 73.21, respectively. In contrast, the ELM performs poorly on both data sets, with prediction accuracy rates of 57.10 and 58.81, respectively. Among the five classification algorithms, the classification performances of the SVM and KELM are relatively close. All classification algorithms perform better on the data set that contains transfer attributes than on the one that does not: the prediction accuracy improves by 1.80% on average, and that of the CART algorithm improves by up to 2.98%.

3) THE NUMBER OF BASE CLASSIFIERS
The number of base classifiers determines the final generalization performance of the ensemble algorithm. Although increasing the number of base classifiers improves the ensemble classification performance to a certain extent, it also enlarges the data scale and the ensemble, and the heavy consumption of computing resources eventually causes the classification performance to decline. In this experiment, first, on the basis of the original data set I and the original data set II, the 85 attribute subsets without transfer attributes and the 128 attribute subsets with transfer attributes generated by reduction are used to generate the training data sets of the base classifiers. Then, the generated data sets are applied to five classification algorithms, namely the KNN, CART, SVM, ELM, and KELM methods, to obtain two groups of base classifiers. Each group contains five types of base classifiers, with 85 attribute subsets each in data set I and 128 each in data set II. Finally, the selective ensemble strategy in Algorithm 2 is used to determine the number of base classifiers from each group to be finally integrated. The effect of the number of selectively ensembled base classifiers on the classification accuracy of the ensemble system is shown in Fig. 6. It can be seen that as the number of base classifiers increases, the recognition accuracy of the shearer drum height adjustment mode first gradually increases and then gradually decreases. The ensemble accuracy rises during the initial stage because, at the beginning of the ensemble sorting process, few base classifiers are included in the ensemble and the differences among them are small, which directly leads to low ensemble accuracy. As new base classifiers with better comprehensive performance are continuously added, the diversity of the base classifiers increases, so the ensemble accuracy gradually increases.
After the ensemble accuracy reaches a certain value, it gradually decreases because, once enough base classifiers have been added, base classifiers with poor comprehensive performance enter the ensemble system. In data set I, which does not contain transfer attributes, the shearer drum height adjustment pattern is recognized with high accuracy when the number of KELM base classifiers is between 20 and 25. Increasing the number further (beyond 60 base classifiers) does little to improve recognition, and adding too many weak learners even lowers the final accuracy. Therefore, in the subsequent comparative experiments, the final number of ensemble base classifiers for data set I is 20. In data set II, which contains transfer attributes, the shearer drum height mode is recognized with high accuracy when the number of KELM base classifiers is between 35 and 45, and the number of base classifiers is set to 40. The ensemble sizes of the other base classifiers are shown in Table 8.
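The rise-then-fall behaviour described above can be reproduced with a simplified accuracy-guided forward search, sketched below in Python. This is not the paper's Algorithm 2 and omits post-pruning: it assumes that base classifiers are ranked by their individual accuracy, added one at a time, and combined by majority vote, and that the ensemble size with the highest accuracy is kept. All names and the toy predictions are hypothetical.

```python
from collections import Counter

def vote(predictions):
    """Majority vote over the per-classifier predictions for each sample."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def forward_select(all_preds, truth):
    """Add classifiers in order of individual accuracy; return the best prefix."""
    order = sorted(range(len(all_preds)),
                   key=lambda i: accuracy(all_preds[i], truth), reverse=True)
    best_size, best_acc, curve = 0, -1.0, []
    for size in range(1, len(order) + 1):
        acc = accuracy(vote([all_preds[i] for i in order[:size]]), truth)
        curve.append(acc)
        if acc > best_acc:
            best_size, best_acc = size, acc
    return order[:best_size], best_acc, curve

# Toy validation labels and predictions of four base classifiers (illustrative).
truth = ["U", "D", "S", "U", "D", "S"]
all_preds = [
    ["U", "D", "S", "U", "D", "U"],
    ["U", "D", "S", "U", "S", "S"],
    ["U", "D", "S", "D", "D", "S"],
    ["D", "S", "U", "D", "S", "U"],   # a weak learner
]
selected, best_acc, curve = forward_select(all_preds, truth)
```

In this toy case the accuracy curve peaks with three classifiers and drops when the weak fourth one is added, mirroring the trend reported for Fig. 6.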

4) COMPARISON BETWEEN SDHRM-SEoKELMRS AND OTHER CLASSIFIER ENSEMBLES
Finally, we compare the classification performance of the SDHRM-SEoKELMRS with that of the traditional ensemble classifiers, the bagging and random subspace methods, on data set I and data set II. The traditional ensemble classification methods and the SDHRM-SEoKELMRS algorithm contain the same number of base classifiers. The test set is used to predict the shearer drum height regulating pattern, as shown in Fig. 5. The base classifier algorithms are the ELM, KNN, CART, SVM, KELM, and OSL-KELM. The results are reported in Tables 9 and 10. The prediction accuracy of the SDHRM-SEoKELMRS method on the two test sets (83.78 and 85.68, respectively) is higher than the prediction accuracies of other base classifiers, which are comparable to that of a single OSL. Compared with the KELM-based classifier, the accuracy of the ensemble method increases by 15.60% and 13.86%, respectively. The prediction accuracy of each prediction model on test set II is generally higher than that on test set I, which we believe shows that the transfer attribute set effectively improves the accuracy of classification prediction for the shearer drum. The proposed SDHRM-SEoKELMRS ensemble classifier achieves the best predictive performance on all four metrics (accuracy = 85.68, precision = 84.08, recall = 85.89 and F1-score = 84.97) on test set II. The FS-PP and random subspace methods both obtain multiple different groups of base classifiers by perturbing the sample attribute space, but for the industrial test data set of shearer drum height adjustment modes, the FS-PP method uses the redundancy of the neighborhood rough set attribute reduction technology to effectively improve the generalization performance of the ensemble system.
To better compare the prediction accuracy and computing time of the classification algorithms and their ensemble systems on the industrial test set of shearer drum height adjustment modes, we present the results graphically in Fig. 7 and Fig. 8. Although our method does not achieve completely accurate prediction, its results are acceptable in comparison with those of the other approaches. The response time of the FS-PP+ELM combination is as short as 0.957 s, but its prediction accuracy rate drops by 4.72%. The FS-PP+SVM combination has the highest prediction accuracy rate, 86.97, but the long time it needs to make a decision makes it difficult to meet real-time performance requirements. In industrial production, the actual operating speed of the shearer is between 0 and 0.08 m/s. Considering both decision-making time and prediction accuracy, the proposed SDHRM-SEoKELMRS method has the best performance. In addition, we can adjust the maximum number of training samples of the SDHRM-SEoKELMRS algorithm or optimize the adaptive calculation of the decision-making parameter to further improve the real-time decision-making of the algorithm and adapt it to the working conditions. This is also the focus of our future work. The SDHRM-SEoKELMRS method adopts incremental learning, which enhances the adaptability of the shearer to the randomly changing coal roof in the coal mine and can effectively predict the adjustment mode of the drum, preventing the shearer from being damaged or involved in a safety accident.
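To put the decision times into context, the short Python sketch below multiplies the maximum operating speed quoted above by a decision time to obtain the distance the shearer travels while one decision is being computed. The function name is hypothetical; 0.957 s is the FS-PP+ELM response time quoted above, and 1.23 s is the decision time reported for SDHRM-SEoKELMRS in the conclusion. No acceptable travel distance is stated in the text, so the sketch only reports the distances.

```python
def travel_during_decision(speed_m_per_s, decision_time_s):
    """Distance in metres covered by the shearer while one decision is computed."""
    return speed_m_per_s * decision_time_s

MAX_SPEED = 0.08  # m/s, upper end of the operating range quoted in the text

for label, t in (("FS-PP+ELM", 0.957), ("SDHRM-SEoKELMRS", 1.23)):
    print(label, round(travel_during_decision(MAX_SPEED, t), 4), "m")
```

At the maximum speed the shearer therefore moves less than 0.1 m during one decision for both combinations.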

IV. CONCLUSION
This research proposes a novel selective classifier ensemble method that uses the OSL-KELM classifier and neighborhood rough set attribute reduction to classify the height regulating pattern of the shearer drum. To improve the classification accuracy of the model, we introduce transfer attributes that describe the coal shearing process of the shearer. Multigranular attribute reduction based on neighborhood rough sets is used to perturb the attribute space and generate different base classifiers. In addition, ten-fold cross-validation is used to tune the hyperparameters of the individual classifiers. The performance of the classifier ensemble on the field database is evaluated using accuracy, precision, recall and F1-score. The main results of this study are summarized as follows.
(1) The experimental results demonstrate the superior performance of SDHRM-SEoKELMRS in classifying the height regulating pattern of the shearer drum, with a prediction accuracy of 84.68% and a decision time of 1.23 s.
(2) The generalization performance of the ensemble methods on data sets with transfer attributes is higher than that on data sets without them, with the accuracy increasing by up to 4.5%.
(3) The classification accuracy of SDHRM-SEoKELMRS, which uses the proposed neighborhood rough reduction and selective ensemble methods to effectively remove irrelevant attributes and base classifiers, is 2.75% higher than that of the random subspace model.
(4) Among the six base classifiers compared, the OSL-KELM method achieves the shortest time (1.23 s) on test set II, which is 65.16% shorter than that of the SVM classifier.
In summary, based on the observations above, SDHRM-SEoKELMRS shows great potential and clear advantages for shearer drum height regulating. In other words, the method improves the prediction accuracy of the change law of the coal-rock interface and reduces the classification error through selective ensemble. Moreover, the ensemble method can easily be extended to other classification problems in the fully mechanized coalface. It is worth mentioning that although the proposed method has advantages in computing speed and accuracy, it still needs to be improved to meet the requirements of unmanned fully mechanized coal mining. More complete industrial data may well improve the generalization ability of the ensemble model. Future work should therefore focus on enlarging the current database and investigating other combination methods to improve the generalization ability of the ensemble model.

WEI GUO received the M.Eng. degree in mechatronic engineering from the Xi'an University of Science and Technology, Xi'an, China, in 1988. Since 1993, he has been a Professor with the Mechanical Engineering Department, Xi'an University of Science and Technology. Since 2012, he has been a Ph.D. Supervisor with the Xi'an University of Science and Technology. He has authored over 60 articles and conference proceedings in a well-known publishing platform. His current research interests include coal mine machinery design, mining equipment intelligent control, and intelligent mining technology. He was an excellent teacher in Shaanxi province. He received two provincial science and technology awards, the Municipal Science and Technology Award, and two provincial teaching achievement awards.