Hepatitis C Virus Detection Model by Using Random Forest, Logistic-Regression and ABC Algorithm

This study proposes an automatic classifier for estimating the multiclass probabilities of hepatitis C virus (HCV) incidence based on patients’ blood attributes. The purpose of this study is to establish an artificial intelligence-based model that can identify HCV patients and detect the disease at an early stage to support future treatment. The model can be applied to clinical data and maintains its performance on imbalanced datasets. The innovation of this article lies in addressing the “unbalanced data” problem that exists in medical record-based clinical data, for which the synthetic minority oversampling technique (SMOTE) algorithm was employed. The objective was achieved using a cascade two-stage method combining the random forest (RF) and logistic regression (LR) algorithms. Two models were trained by applying RF (Model 1) and LR (Model 2) to raw and preprocessed data, respectively. The artificial bee colony (ABC) algorithm was then used to search for the optimal threshold value required for filtering and separation, that is, the critical value that determines the optimal combination of the two models. The two-stage mixing algorithm combines algorithms of different search dimensions, thus integrating their strengths. After 10-fold Monte Carlo cross-validation experiments were conducted 50 times (to obtain mean values), data from the recent pandemic were used to verify the proposed method. To evaluate the quantitative results, indicators such as prediction accuracy, precision, recall, F1-score, and Matthews correlation coefficient were compared with those of the latest algorithms used in relevant fields. The results indicate that the proposed model, named Cascade RF-LR (with SMOTE) using the ABC algorithm, can be used to detect the multiclass probabilities of HCV incidence, thereby improving the effectiveness of relevant treatments.


I. INTRODUCTION
In medical research, redefining the influence of medical care data can improve medical care quality. In the medical care field, data centers that compile patients' medical records and examination results will serve as crucial factors for improving the quality of medical care for patients [1]. When knowledge is extracted from the mining of medical record data, various perspectives can be adopted to explore disease incidence, progression, and spreading, and such exploration can provide valuable information for ascertaining the diagnosis and treatment of diseases. Therefore, data mining can uncover the underlying relationships, trends, and patterns between data, and in turn enhance the accurate identification of diseases [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Huiyan Zhang.
In this study, the researchers targeted patients diagnosed with liver diseases and defined hepatitis as liver inflammation from any cause that results in damage to liver cells. When external substances or pathogens invade the human body, the immune system activates inflammatory cells (e.g., lymphocytes), which infiltrate tissues and release immune substances to fight the invaders. This condition is known as the inflammatory response, or inflammation in layman's terms. Hepatitis is mainly divided into two types: viral and non-viral. The types of viral hepatitis include hepatitis A, B, C, D, and E. Chronic hepatitis C virus (HCV) infection is one of the main causes of liver cirrhosis and hepatocellular carcinoma worldwide [3]. It increases the mortality and incidence rate of hepatic and extrahepatic diseases, particularly in patients with HCV viremia. Furthermore, alcoholic liver disease, which progresses from mild liver disease to alcoholic hepatitis and finally to cirrhosis, is the main cause of global hepatitis incidence and mortality [4]. In Taiwan, the prevalence rate of HCV is 2.1%, which translates to a population of 489,000 patients with HCV viremia [5]. Moreover, during viral pandemics, patients with chronic liver diseases pose a huge challenge to medical health care systems [6]. HCV belongs to the Flaviviridae family (genus Hepacivirus), and hepatitis C is caused by HCV infection. After an acute infection, approximately 20%-30% of patients exhibit clinical symptoms such as fever, fatigue, loss of appetite, slight abdominal discomfort, nausea, vomiting, jaundice, and other related symptoms [7]. The severity of HCV-related diseases can range from barely noticeable symptoms to deadly fulminant hepatitis.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Therefore, preventing the transmission of HCV is crucial, for which blood tests and screening for other variables are highly beneficial.
Currently, numerous liver disease diagnostic methods are based on machine learning. Several methods that have been used to examine pathological changes in hepatitis are described as follows. In [8], computed tomography was used to automatically locate the healthy segment of the liver and the segment with lesions using a modified method called CALOFCM, which combines fast fuzzy C-means (FCM), chaos theory, and the bioinspired ant lion optimizer (ALO). The chaos theory-based ALO prevented FCM from falling into local minima, enhanced the calculation performance, improved stability, reduced the sensitivity of the iteration process, and allowed FCM to use the optimal cluster centers. In [9], ultrasound images of chronic liver diseases, laboratory examination results, and clinical records were used to automatically classify chronic liver disease stages. Specifically, a clinical-based classifier was first used to separate healthy conditions from pathological conditions. When an unhealthy condition was detected, this method classified the results into three mutually exclusive pathologies: (1) chronic hepatitis, (2) compensated cirrhosis, and (3) decompensated cirrhosis. The features and classifiers (Bayes, Parzen, support vector machine [SVM], and k-nearest neighbor [KNN]) were optimally selected for each stage [9]. In addition, many powerful optimization algorithms are used across research fields, such as the dragonfly algorithm [10], ant lion algorithm [10], modified firefly algorithm [11], modified ABC algorithms [11], modified ant colony optimization [12], and enhanced firefly algorithm [13]. In [13], the enhanced firefly algorithm is applied to SoC-based test dispatching and timing in order to reduce the time and cost spent. The performance of these algorithms is validated by the experimental results reported in the respective studies.
In [14], liver disease datasets were used to evaluate models, data mining models were compared to select critical features for predicting liver diseases, and the extraction, loading, transformation, and analysis method was used to compare different models, namely, random forest (RF), multilayer perceptron (MLP) neural network, Bayesian network, SVM, and particle swarm optimization. In [15], a machine-learning model was constructed based on 2009 clinical data to predict fatty liver disease (FLD). FLD is a clinical complication that commonly occurs during the early phase of chronic liver inflammation (chronic FLD may lead to the chronic inflammation of the liver). The classification models for FLD include RF, naive Bayes (NB), artificial neural networks, and logistic regression (LR). In [16], patients with HCV were analyzed using clinical traits such as age at first HCV screening, insurance at first HCV screening, race, gender, presence of fibrosis and/or cirrhosis, presence of other liver disease, presence of ascites, transplanted liver, presence of other types of liver cancer, presence of steatosis, presence of liver cell carcinoma, and ethnicity. Three care methods, namely linkage to nursing care, initiation of antiviral treatments, and virologic cure, were modeled using decision trees and random forests. Furthermore, in response to the worldwide threat posed by COVID-19, clinical studies on the use of machine learning algorithms to combat the spread of the COVID-19 virus have applied virtual filters and machine learning algorithms to identify new drug candidates [17] and to repurpose anti-hepatitis C drug derivatives for COVID-19 treatment [18]. Some articles [19], [20] also applied several artificial intelligence techniques to liver disease detection. In addition, [21] surveyed recent breast cancer studies that used machine learning methods.
In addition, in exploring the diagnosis of Alzheimer's disease [22], volumetric feature-based sMRI data of hippocampal slices were used, and a convolutional neural network and a deep neural network were adopted. In [23], the latest machine learning and deep learning methods applied to detect four brain diseases (Alzheimer's disease (AD), brain tumor, epilepsy, and Parkinson's disease) were reviewed, taking into account the different methods, models, and datasets used.
The main contributions of this study are as follows: (1) A two-stage joint model is constructed in which the RF and LR models (Model 1 and Model 2, respectively) are integrated using the artificial bee colony (ABC) algorithm; the synthetic minority oversampling technique (SMOTE) and a feature selection method are also used to improve the fit of Model 2. (2) The proposed method addresses the problem of imbalanced data that inevitably appears in clinical data, which was not considered by other algorithms. (3) A diversified range of indicators is used to evaluate the proposed model under different evaluation needs. (4) The model is verified using the mean values obtained from 10-fold Monte Carlo cross-validation experiments performed 50 times, and the scores are compared with those obtained using the latest algorithms.
The remainder of this article is organized as follows: Section II explains the major components of the proposed algorithms. Section III presents the proposed methodology in detail. Section IV provides the experimental results and discussion. Finally, Section V presents the conclusions of the present study.

II. RELATED WORK
This section introduces the methods used in the present study and their latest medical and clinical applications. These include cascade classifiers and the conventional RF, LR, ABC, SMOTE, and feature selection methods.

A. CASCADE CLASSIFIERS
A cascade classifier is a classification method that combines several classifiers in sequence and is often used in image object detection. A cascade classifier can rapidly discard the background of an image so that more computational resources are spent on the more promising target regions; cascading can thus be regarded as a target-specific focus mechanism [24]. A cascade classification model is a type of joint classification model that combines a set of state-of-the-art classifiers to improve the results they produce, and information is shared between tasks through the linkage of the component classifiers [25].
A cascade classifier is well suited to processing extremely imbalanced data (i.e., data with too many negative samples and too few positive samples [26]). One of the most recent studies [27] investigated the design of complexity-aware cascade pedestrian detectors.
In the field of biochemistry, cascade classifiers have been used in physical biochemistry networks. In a case study [28], the researchers proposed a cascade learning framework that incorporated semantic features from a knowledge embedding model and graph features from a graph embedding model. This framework combined the features into a single architecture that fully utilized the advantages of the two feature types. The case study empirically demonstrated the value of this framework in identifying potential relationships between diseases, drugs, genetics, and treatment methods. In neurology and clinical studies, cascade classifiers were used in the auto-evaluation of subjects' neurocognitive performance [29], which was achieved through the analysis of electroencephalographic signals. The cascade framework was composed of two long short-term memory recurrent neural networks.

B. RF
The RF method proposed by Breiman in 2001 [30] has achieved great success as a general classification and regression method. This supervised learning procedure operates according to a simple but effective divide-and-conquer principle: first, the data are sampled into fragments, and a randomized tree predictor is ''grown'' on each fragment; then, predictions are made by aggregating the outputs of these predictors (e.g., by averaging them). The RF method has become popular owing to its applicability to an extensive range of prediction problems. Apart from being simple and easy to use, this method is well known for its accuracy and its competency in handling small samples and high-dimensional feature spaces. Moreover, it is easily parallelized, endowing it with the potential to handle large-scale real-world systems [31].
In medical applications, RF is used to extract the important features of electrocardiogram signals for the classification of different arrhythmias [32]. RF is also used in the correct classification of Cushing's syndrome. In particular, it is used in promoting treatments and improving prognosis for patients with Cushing's syndrome. A relevant study indicated that RF is the most suitable method for classifying the syndrome [33]. Regarding the high costs involved in the prediction of treatment fees for patients with asthma, the frequently used comorbidity portfolio design involves the recombination of comorbidities in different budgets, where the training for comorbidity portfolio design includes the training of RF prediction models [34]. To resolve the class-imbalanced data problem in data classification, especially the lack of identification for minority groups, the class-weight RF method was introduced to assign a single weight value for each class [35].

C. MULTICLASS LR
Multiclass LR is an algorithm that is particularly suitable for discovering the associations between features and certain specific results. LR is a probabilistic, discriminative classifier, which differentiates it from generative classifiers such as NB. In natural language processing, LR is the baseline supervised machine-learning algorithm for classification, and it is closely related to neural networks.
Neural networks can be regarded as a series of LR classifiers stacked on top of one another [36]. Assume there are $n$ training instances, each an input/output pair $(x^{(i)}, y^{(i)})$, and let feature $j$ of instance $i$ be denoted $x_j^{(i)}$. The classification problem is solved by learning a weight vector and a bias term from the training set. The sigmoid function (or the softmax function in the multiclass case) is used to calculate the probability $p(y|x)$, which is then used to estimate the class $y$.
The classifier multiplies each feature $x_i$ by its weight $w_i$, sums the weighted features, and adds the bias term $b$, thus obtaining the weighted sum $z$ for the class. The learning objective is to minimize the error on the training samples to the greatest extent, and the loss used for this purpose is the cross-entropy loss $L_{CE}$, which is expressed in Eq. (3).
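For reference, the quantities described above take the following standard textbook forms in the binary case. This is a sketch under assumed notation ($\hat{y}$ for the predicted probability and $\sigma$ for the sigmoid are not taken from the original equations); the last line is the usual cross-entropy loss, which is presumably what the text refers to as Eq. (3).

```latex
\begin{align}
  z &= \sum_{i} w_i x_i + b = \mathbf{w}\cdot\mathbf{x} + b \\
  \hat{y} &= \sigma(z) = \frac{1}{1 + e^{-z}} \\
  L_{CE}(\hat{y}, y) &= -\bigl[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\bigr]
\end{align}
```

In the multiclass case, the sigmoid is replaced by the softmax, $\hat{y}_k = e^{z_k} / \sum_{j} e^{z_j}$, and the loss becomes $-\sum_{k} y_k \log \hat{y}_k$ with a one-hot target $y$.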

D. ABC ALGORITHM
The ABC algorithm is based on swarm intelligence and is often used to solve optimization problems; it is inspired by the food foraging behavior of bees [37]. Specifically, the locations of food sources represent possible solutions. The search mechanism involves three types of bees, namely employed bees, onlooker bees, and scout bees, which work together so that the locations of food sources are refined over the iterations of ABC. The employed bees and onlooker bees each constitute half of the population, and their roles are interconvertible; these foragers play different roles in the ABC algorithm [38]. The employed bees represent the exploitation part of the ABC algorithm; they search around their target food sources. Onlooker bees represent the greedy mechanism of ABC: based on the estimated probability of the food sources, they select the food sources to exploit, and the exploitation work is allocated according to these selections. Once an onlooker bee has been assigned to a food source, it acts as an employed bee. Scout bees are sent out only when a food source has been exploited for a continuous period without yielding a better food source. Scout bees represent the exploration mechanism of the ABC algorithm; sending them to explore brand new food sources ensures that the ABC algorithm can break out of local optima.
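The three phases can be illustrated with a minimal one-dimensional sketch. It is not the implementation used in the study: the function name `abc_minimize`, the parameters `n_sources` and `limit`, and the fitness transform are illustrative assumptions that follow the commonly described form of ABC.

```python
import random


def abc_minimize(objective, lower, upper, n_sources=10, limit=5, n_iter=50, seed=0):
    """Minimal 1-D artificial bee colony search (illustrative sketch only)."""
    rng = random.Random(seed)
    sources = [rng.uniform(lower, upper) for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources            # iterations without improvement per source
    best_cost = min(costs)
    best = sources[costs.index(best_cost)]

    def try_neighbor(i):
        nonlocal best, best_cost
        # Perturb source i relative to a randomly chosen different source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        candidate = sources[i] + rng.uniform(-1.0, 1.0) * (sources[i] - sources[k])
        candidate = min(max(candidate, lower), upper)
        cost = objective(candidate)
        if cost < costs[i]:             # greedy selection: keep only improvements
            sources[i], costs[i], trials[i] = candidate, cost, 0
            if cost < best_cost:
                best, best_cost = candidate, cost
        else:
            trials[i] += 1

    for _ in range(n_iter):
        # Employed-bee phase: one local search around every food source.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker-bee phase: better sources are selected with higher probability.
        fitness = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_neighbor(i)
        # Scout-bee phase: abandon sources that have stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = rng.uniform(lower, upper)
                costs[i] = objective(sources[i])
                trials[i] = 0
                if costs[i] < best_cost:
                    best, best_cost = sources[i], costs[i]
    return best, best_cost
```

For example, `abc_minimize(lambda x: (x - 0.3) ** 2, 0.0, 1.0)` searches the interval [0, 1] for the minimizer of a simple quadratic. In the setting of this paper, the objective would instead be a validation error as a function of the separation threshold.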

E. SMOTE
SMOTE is an oversampling technique [39] used for resolving problems caused by imbalanced data, and it is often included as part of a machine learning pipeline. SMOTE creates new minority-class examples from the nearest neighbors of minority-class samples. Each new instance is created by interpolating between a minority-class sample and one of its k nearest neighbors (KNNs) in the minority class, and this process leaves the original source data unchanged.
This method can be used to eliminate the harmful effect of a skewed distribution [40]. Because the new examples are created from the features of the original dataset, they are similar to the original minority examples, which also prevents the occurrence of sampling bias [41], [42]. This method has been used to synthesize minority samples in the medical field [43]; a further explanation of the technique is presented in Fig. 1, and the concept is expressed in Eq. (6).
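The interpolation step is commonly written as $x_{new} = x_i + \lambda\,(x_{nn} - x_i)$ with $\lambda \in [0, 1)$, where $x_{nn}$ is one of the k nearest minority-class neighbors of $x_i$. The following sketch illustrates that rule in plain Python; the function name `smote_sketch` and its parameters are hypothetical, and the code is not the SMOTE implementation used in the study.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor lambda in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Every synthetic point lies on the line segment between two existing minority samples, which is why the new examples resemble the original minority class.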

F. FEATURE SELECTION
In many classification tasks, feature selection is a crucial preprocessing step for reducing the dimensionality of data, because irrelevant and redundant features can mislead the learning process; the extent of this effect depends on the chosen method.

III. METHOD
This section explains how the two models were generated and, most crucially, the method proposed in this study, namely Cascade RF-LR (with SMOTE) using the ABC algorithm. Model 1 was an RF model trained with raw data, whereas Model 2 was an LR model trained on raw data that had been processed with feature selection and SMOTE. The cascade two-stage method uses the ABC algorithm to search for the optimal threshold value that connects the two models, that is, to determine their optimal combination.

A. DATA USED TO CREATE MODEL 1 BY RF ALGORITHM
The purpose of training the RF model with raw data is to obtain the estimated confidence probability of entities within the verification data and the weights of the data features. At each decision tree node, a random subset of features was considered for splitting; consequently, model training was efficient when the sample features were highly dimensional. After training, the model could also yield the importance of each feature to the output; this information was used again in preparing the training data of the LR model. The RF model is expressed in Eq. (7). Although the RF model already provides a certain level of performance, data imbalances inevitably appear in the clinical data of medical cases. The RF model did not perform optimally in the subsequent performance evaluation experiments and was prone to overfitting when processing specific samples with high noise levels. Consequently, the raw data were preprocessed so that the second-stage training data differed from the training data for Model 1 and so that the corresponding relationships between features could be identified; preprocessing also solved the problem of data imbalance. Data preprocessing therefore comprised the application of feature selection and SMOTE to the raw data.

1) FEATURE SELECTION BY THE RF MODEL
When the bagging method was applied to the component classifiers during RF model training, different training datasets were generated using bootstrap sampling for the purpose of constructing different classifiers. The samples that are not drawn into a given bootstrap dataset are known as the out-of-bag (OOB) data for that classifier. The OOB data were used to calculate the importance of each feature. After the data were subjected to the feature selection process, they were passed to the LR model for model construction.
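The relationship between a bootstrap sample and its OOB data can be sketched as follows. The helper name `bootstrap_with_oob` is hypothetical; the snippet only shows how the two index sets arise and does not compute feature importance.

```python
import random


def bootstrap_with_oob(n_samples, seed=0):
    """Draw one bootstrap sample (with replacement) and return the in-bag
    indices together with the out-of-bag indices that were never drawn."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob
```

On average, a fraction of about $(1 - 1/n)^n \approx e^{-1} \approx 0.37$ of the samples is left out of each bootstrap draw, so every tree has held-out data on which feature importance can be estimated without a separate validation set.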

2) SOLVING LR-MODEL SKEWNESS DISTRIBUTION USING SMOTE
Training the second-stage model on data in which the minority classes are under-represented would have markedly impaired accuracy. To eliminate the harmful effects of the skewed distribution, an oversampling technique was used to supplement the data for the minority classes. SMOTE is one of the most renowned techniques for resolving this problem because it enables a model to be established on balanced data.
After the raw data were preprocessed through the aforementioned steps, the LR model (i.e., Model 2) was built using the resulting samples and the LR method. In contrast to the random sampling that is conducted when the bagging method is applied in the RF model, LR is an algorithm that is particularly suitable for identifying the associations between features and specific results. LR is a type of probabilistic classifier, and it is one of the most widely applied machine learning algorithms. Logistic regression is also a straightforward algorithm to understand and to apply when combining two different types of models, and its computational cost is low. Model 2 is expressed in Eq. (7).

C. CASCADE RF-MLR BY THE ABC ALGORITHM
The cascade two-stage model identifies the optimal threshold value using the ABC algorithm, such that this value can act as the linkage between the models, as shown in Fig. 2. First, two models were trained using the RF and LR methods on the original and preprocessed training data, respectively (Algorithm 1, line 3); the preprocessed data were obtained using feature selection and the SMOTE method (Algorithm 1, line 5). The optimal combination of the two models was then identified using the ABC algorithm (Algorithm 1, line 15). Imbalanced data inevitably appear in the clinical data collected from medical cases, and this way of combining models has not been considered in other combination methods. The estimated confidence probability was first obtained by applying the RF model to the verification data. The separation condition is expressed in Eq. (10) as $\mathrm{model\ probabilities}_{RF}(D_{val}) \le i$, where $D_{val}$ represents the verification data and $i$ is the estimated value of the optimal confidence probability selected by the ABC algorithm. The threshold value was used as the basis for data separation, and the separated data were passed to the LR model for further judgment. The ABC algorithm screened the samples whose RF probability was lower than or equal to $i$ and searched for the value of $i$ that yielded the optimal predicted classification for the cascade two-stage model. The value of $i$ was bounded as $L \le i \le U$ (Algorithm 1, line 11); these bounds were obtained directly through the tests performed during the experiment. The RF model and the optimal separation threshold value $i^*$ were obtained at this stage. As shown in Fig. 2, after these two models are combined, the iteration process is completed when 95% training accuracy is reached.
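One reading of this routing rule is sketched below. The names `cascade_predict`, `rf_predict_proba`, and `lr_predict` are hypothetical callables standing in for the trained Model 1 and Model 2; the sketch assumes that the "confidence probability" is the largest class probability returned by the RF model, which the text does not state explicitly.

```python
def cascade_predict(samples, rf_predict_proba, lr_predict, threshold):
    """Route each sample through Model 1 and defer to Model 2 whenever
    Model 1's top class probability is at or below `threshold`."""
    predictions = []
    for x in samples:
        probs = rf_predict_proba(x)               # list of class probabilities
        top_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[top_class] <= threshold:
            predictions.append(lr_predict(x))     # low confidence: use Model 2
        else:
            predictions.append(top_class)         # high confidence: keep Model 1
    return predictions
```

Under this reading, the optimizer's task is to choose `threshold` within the bounds L and U so that the combined predictions score best on the verification data; any one-dimensional search could play that role, with ABC being the one chosen in this study.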

IV. EXPERIMENTAL RESULTS AND DISCUSSION
This section describes the multi-classification database used for validation and the experiment setup; it also reviews performance measurement and compares the multiclass indicators used in the present study with the latest algorithms used in other relevant studies. These algorithms include RF, deep forest (gcForest) [44], [45], extreme Gradient boosting (XGBoost) [46], decision tree (DecisionTree), KNN, Gaussian NB (GaussianNB), and partial least squares two-block regression (PLS2Regression) [47], [48].

Algorithm 1 (excerpt): $D_{\mathrm{preprocess}} = f_{\mathrm{SMOTE}}(f_{\mathrm{feature\_sele}}(D))$ ← processing the raw data with feature selection and the synthetic minority oversampling technique (SMOTE) to form the training data for the LR model using Eq. (8).
The HCV data used in this study were taken from the Machine Learning Repository of the University of California, Irvine (UCI) [49], [50], [51]. The dataset originally contained a total of 615 instances, four classes, and 14 attributes. The elimination of instances with missing values resulted in 582 remaining instances. This dataset has a clear data-imbalance problem; specifically, a great discrepancy exists between the sample sizes of the largest and smallest classes, giving the dataset an imbalance ratio (IR) of 43.83. Two additional datasets were used as comparison datasets to verify the proposed method; datasets containing multiple classes and minority classes were specially selected for this purpose. The IR values and other attributes of the datasets are presented in Table 1, and these data were also retrieved from the Machine Learning Repository of UCI.
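The IR is not defined in this section; a common definition, assumed here, is the size of the largest class divided by the size of the smallest class. The sketch below uses that definition with made-up labels, not the actual class counts of the HCV dataset.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical labels for illustration only.
example_labels = ["a"] * 90 + ["b"] * 10 + ["c"] * 30
print(imbalance_ratio(example_labels))
```

A perfectly balanced dataset has an IR of 1, and larger values indicate a more severe imbalance.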

B. EXPERIMENT SETUP
The selected verification dataset was first checked for samples with missing features, which were then removed. Next, k-fold cross-validation was performed. Specifically, the data were randomly divided into k sets, of which one was selected as the testing data and the others were designated as training data. These steps were repeated until each set had been designated as the testing data, that is, until k tests had been performed. With k set to 10 in the experiment, 10-fold cross-validation was performed. This validation was run 50 times to determine the averages and to verify the robustness of the proposed method. In addition, during the training process, one-tenth of the training data were used as validation data, which were used to evaluate the performance of the overall validation method. Table 2 presents the algorithms and parameter settings.
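The splitting scheme can be sketched as a generator of index sets. The function name `repeated_kfold_indices` and the reshuffling of the data at the start of every run are assumptions made for illustration; the study's exact splitting code is not shown in this section.

```python
import random


def repeated_kfold_indices(n_samples, k=10, n_runs=50, seed=0):
    """Yield (run, fold, train_idx, test_idx) for repeated shuffled k-fold CV."""
    rng = random.Random(seed)
    for run in range(n_runs):
        order = list(range(n_samples))
        rng.shuffle(order)                        # new random partition per run
        folds = [order[f::k] for f in range(k)]   # k nearly equal-sized folds
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g in range(k) if g != f for i in folds[g]]
            yield run, f, train_idx, test_idx
```

With `k=10` and `n_runs=50`, the generator produces 500 train/test splits; averaging a score over all of them gives the kind of mean and standard deviation reported in the experiments.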

C. PERFORMANCE MEASUREMENT
In machine learning, a classification task with more than two classes is known as a ''multiclass classification'' task. The dataset used in this study was based on multiclass classification, and the problem of minority groups was taken into consideration. This section presents the performance measurement standards selected by the researchers, which were used to assess the proposed multiclass classifier. The selected measurement methods, which are presented in Table 3, were as follows: accuracy, precision, recall, F1-score, and Matthews correlation coefficient (MCC).
Accuracy refers to the probability of the model making a correct prediction; in whole-sample prediction, it measures how often the model predicts correctly across all classes. The precision and recall indexes need to be calculated separately for each class. Precision is a measure of exactness; in other words, it indicates how many of the samples labeled as positive are correctly labeled [52] (i.e., the positive predictive value). When the cost of false positives is high and their occurrence must be minimized, improving precision should be emphasized. Furthermore, this indicator reflects the precision level of each response class, and it is therefore suitable for use on minority classes [53].
Recall is an indicator of how many positive samples are correctly labeled. When the cost of false negatives is high and their occurrence must be minimized, improving recall should be emphasized. In multiclass classification, recall represents the percentage of positives in class K that are correctly identified (i.e., the true positive rate). In other words, recall is an indicator of completeness (as plotted in Fig. 3). The F-score combines precision and recall into a single indicator, namely their weighted harmonic mean. In cases of uneven class distribution, the F-score is often more useful than accuracy.
MCC is a correlation coefficient [54] indicating the correlation between the observed and predicted classifications; it returns a value that ranges between −1 and +1. A coefficient of +1 represents a perfect prediction, a coefficient of 0 represents a prediction that is no better than random, and a coefficient of −1 represents total disagreement between the prediction and observation results. Furthermore, MCC can be used when the size difference between categories is huge. In recent years, it has become an extensively utilized measurement standard in the testing of machine-learning performance [55], and it is suitable for use in targeting different classes under a multiclass condition [56].
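The per-class indicators and the multiclass MCC can be computed directly from the true and predicted labels. The sketch below uses one-vs-rest counts for precision, recall, and F1, and the commonly used multiclass generalization of MCC; the function names are illustrative, and the paper's own formulas (Table 3) are not reproduced in this section.

```python
import math
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {class: (precision, recall, f1)} using one-vs-rest counts."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[c] = (precision, recall, f1)
    return scores


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label counts."""
    s = len(y_true)                                  # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))  # correctly predicted samples
    t_k = Counter(y_true)                            # true occurrences per class
    p_k = Counter(y_pred)                            # predictions per class
    numerator = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    denominator = math.sqrt((s * s - sum(v * v for v in p_k.values()))
                            * (s * s - sum(v * v for v in t_k.values())))
    return numerator / denominator if denominator else 0.0
```

For instance, with `y_true = [0, 0, 0, 1, 1, 2]` and `y_pred = [0, 0, 1, 1, 1, 2]`, class 0 has a precision of 1.0 and a recall of 2/3, and the MCC is 17/22.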

1) ACCURACY SCORE
As shown in Fig. 4, the accuracy indicator only takes into consideration the ratio of correctly identified instances to all instances; it does not consider the class differences. A discrepancy of 2.04% was observed between the average performance of the proposed method and that of the second-best algorithm (i.e., XGBoost), and a discrepancy of 1.38% was observed between the best performance of the proposed method and that of the second-best algorithm. This result indicated that the proposed method outperformed the other methods in terms of accuracy scores.

2) PRECISION SCORE
The weighted-average precision (as plotted in Fig. 5) is the weighted mean of the per-class precision values. To account for class differences, the indicator was calculated separately for each class and then averaged according to the weight of each class (i.e., the total sample size of each class), as shown in Eq. (11). In this regard, the performance of some algorithms was slightly superior to their accuracy score once the multiclass factor was considered. If the class sizes are not taken into account, the unweighted mean values are used instead (as plotted in Fig. 6). The results revealed that only three algorithms scored higher than 0.8 on the vertical axis. Moreover, when the sample size of a class was small, the overall score dropped, and the average performance of Cascade RF-LR (with SMOTE) using the ABC algorithm was only 77.11%. For further discussion on the numerical values of other classes, please refer to Table 4. The results presented in this table indicate that in class 1, the proposed method outperformed the other algorithms, whereas SVC and gcForest exhibited the greatest performance in class 3.
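The difference between the two averaging schemes can be made concrete with a short sketch. The function name `macro_and_weighted` and the example scores are hypothetical; the weighting by class sample size follows the description above.

```python
from collections import Counter


def macro_and_weighted(per_class_score, y_true):
    """Return (unweighted mean, support-weighted mean) of a per-class score."""
    support = Counter(y_true)                     # number of true samples per class
    classes = list(support)
    macro = sum(per_class_score[c] for c in classes) / len(classes)
    weighted = sum(support[c] * per_class_score[c] for c in classes) / len(y_true)
    return macro, weighted


# Hypothetical scores: a large class scored well, a small class scored poorly.
scores = {"major": 1.0, "minor": 0.5}
labels = ["major"] * 9 + ["minor"]
print(macro_and_weighted(scores, labels))
```

In this made-up case the unweighted mean is 0.75 while the weighted mean is 0.95, which illustrates why unweighted averages tend to be lower on imbalanced data: a poorly predicted minority class counts as much as the majority class.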

3) RECALL SCORE
In multiclass classification, recall represents the proportion of the instances of a class that are correctly identified. Similarly, weighted-average recall (as plotted in Fig. 7) is the mean of the per-class recall values weighted by class size. Because this is a weighted mean, large differences in class performance can be masked by large differences between class weights. According to the derivation based on Eq. (13), the weighted-average recall score is consistent with the accuracy score. To account for the unbalanced data of each class, the unweighted mean values were also used (as plotted in Fig. 8); the maximum value on the vertical axis dropped from 1.0 to 0.75. The results indicated that the average performance of the proposed method (i.e., Cascade RF-LR (with SMOTE) using the ABC algorithm) was 71.53%, which exceeded the average indicator of the second-best algorithm by 9.2%. For further discussion on the numerical values of other classes, please refer to Table 5.
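The agreement between weighted-average recall and accuracy follows from a short calculation. The notation is assumed here, since Eq. (13) is not reproduced in this section: $K$ is the number of classes, $N$ the total number of samples, $n_k = TP_k + FN_k$ the number of true instances of class $k$, and the class weight is taken to be $n_k / N$.

```latex
\begin{align}
  \mathrm{Recall}_{\mathrm{weighted}}
    &= \sum_{k=1}^{K} \frac{n_k}{N}\,\mathrm{Recall}_k
     = \sum_{k=1}^{K} \frac{n_k}{N}\cdot\frac{TP_k}{n_k} \\
    &= \frac{1}{N}\sum_{k=1}^{K} TP_k
     = \mathrm{Accuracy}
\end{align}
```

The class sizes cancel, so the weighted mean reduces to the fraction of all samples that are correctly classified. The unweighted mean has no such cancellation, which is why it can differ markedly from accuracy on imbalanced data.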

4) F1 SCORE
The F1-score combines two measures, namely precision and recall; in essence, it is their harmonic mean. The results are reported as weighted mean values (Fig. 9) and unweighted mean values (Fig. 10). As depicted in both figures, the performance of the proposed algorithm surpassed that of the other algorithms. For the numerical values of the individual classes, please refer to Table 6.

5) MCC SCORE
In essence, MCC is a correlation coefficient that ranges between −1 and +1; a correlation coefficient of +1 indicates a perfect prediction, whereas −1 indicates an inverse prediction. As depicted in Fig. 11, the average performance of Cascade RF-LR (with SMOTE) using the ABC algorithm was 78.84%, which surpassed the performance of the second-best algorithm by 10.33%.
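As an illustration, the following sketch computes a multiclass MCC with the commonly used generalization based on the counts of correct, true, and predicted labels per class. The text does not state which multiclass formulation was used, so this choice and the function name are assumptions.

```python
from collections import Counter
from math import sqrt

def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label counts."""
    s = len(y_true)                                   # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_k = Counter(y_true)                             # true occurrences per class
    p_k = Counter(y_pred)                             # predicted occurrences per class
    classes = set(t_k) | set(p_k)
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in classes)
    cov_tt = s * s - sum(t_k[k] ** 2 for k in classes)
    cov_pp = s * s - sum(p_k[k] ** 2 for k in classes)
    denom = sqrt(cov_tt * cov_pp)
    # Convention assumed here: return 0.0 when the coefficient is undefined,
    # e.g. when every sample is assigned to a single class.
    return cov_tp / denom if denom else 0.0

# Hypothetical labels for three classes.
print(multiclass_mcc([1, 1, 2, 2, 3, 3], [1, 1, 2, 2, 3, 3]))  # perfect prediction
print(multiclass_mcc([1, 1, 2, 2, 3, 3], [1, 1, 1, 1, 1, 1]))  # constant prediction
```

Unlike accuracy, a classifier that always predicts the majority class scores 0 under this measure, which is why MCC is informative for imbalanced data.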

D. K-FOLD MONTE CARLO CROSS-VALIDATION
We performed 50 runs of 10-fold Monte Carlo cross-validation to compare the indicators of the proposed method with those of the other algorithms. In each turn, 9/10 of the data were used for training, whereas the remaining 1/10 were used for verifying the chosen methods. The held-out 1/10 of the data was drawn randomly every turn to avoid overfitting and selection bias. The k-fold values for each turn were saved, and the final data after 50 runs are illustrated in the figures below. In addition, the final mean and standard deviation values of the 50 runs are presented.
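The protocol can be sketched as follows. This is one possible reading of the description above, in which every turn draws a fresh random 1/10 hold-out set; the function names, the seed, and the way scores are summarized are assumptions rather than the authors' implementation.

```python
import random
from statistics import mean, stdev

def monte_carlo_splits(n_samples, n_runs=50, n_turns=10, test_fraction=0.1, seed=0):
    """Yield (run, turn, train_idx, test_idx) with a random hold-out per turn."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for run in range(n_runs):
        for turn in range(n_turns):
            idx = list(range(n_samples))
            rng.shuffle(idx)                  # new random partition every turn
            yield run, turn, idx[n_test:], idx[:n_test]

def summarize(scores_per_run):
    """Mean and standard deviation over runs of each run's mean score."""
    run_means = [mean(scores) for scores in scores_per_run]
    return mean(run_means), stdev(run_means)

# 50 runs x 10 turns on a hypothetical dataset of 100 samples.
splits = list(monte_carlo_splits(n_samples=100))
print(len(splits))
```

In practice, a model would be trained on `train_idx` and scored on `test_idx` for every split, and `summarize` would then give the mean and standard deviation reported per indicator.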

E. COMPARISON WITH OTHER DATASETS
For the performance on the other two datasets, please refer to Table 7 and Table 8. Across the indicators, the proposed method exhibited the best performance among the tested methods. The only exception was macro-average precision. In the thyroid dataset, a discrepancy of 5.06% existed between the proposed method and the best-performing algorithm on that indicator. In the page block dataset, the corresponding discrepancy was 20.53%. Notably, the algorithm that performed best on this indicator was not the same in the two datasets. Among the weighted-average precision indicators, however, the proposed method exhibited the best performance.

F. DISCUSSION AND ANALYSIS
Fig. 4 compares the proposed method with the original RF algorithm. The accuracy indicator only considers the ratio of samples correctly identified by category to the overall number of samples; it does not consider differences between categories. The average performance results indicate that the proposed method outperformed the original RF algorithm by 2.17%. On the average accuracy indicator in Fig. 4, the proposed method was therefore superior to the original RF algorithm as well as to the other algorithms compared.
Precision refers to the proportion of samples marked as positive that are truly positive. The weighted averages (Fig. 5) indicate that the proposed method outperformed the original RF algorithm by 2.96% on average. However, when the data-imbalance problem was not considered and the unweighted averages were used (Fig. 6), that is, when the sum of all the category values was divided by the number of categories, the original RF algorithm outperformed the proposed method by 9.8% in terms of average performance. Table 4 was generated to analyze the individual imbalance problems and the differences between the indicators in Fig. 5 and Fig. 6, and to further discuss the effects of categories with fewer samples. This table compares the precision of each category. Together with Table 1, which lists the number of samples in each category, Table 4 indicates that Class 1 had the greatest number of samples in the HCV dataset and that Class 3 had the fewest; the IR was 43.8. A comparison of the algorithms revealed that the proposed method performed the best in Class 1 but the second worst in Class 3. However, a comparison of the recall indicators (Table 5) indicated that the proposed method had the best performance. Its performance in Class 2 and Class 3, which had the fewest samples, was superior to that of the other algorithms compared.
This demonstrates that in situations involving insufficient sampling, the proposed method has a precision of approximately 50% and a recall that is slightly greater than 50%. By contrast, the other methods have precision values that are greater than 50% and recall values that are mostly less than 30%. Table 6 (the harmonic means of precision and recall) reveals that the proposed method produced the best F1 performance in Class 3. The proposed method tends to be conservative in situations involving undersampling, but it does not perform poorly in such situations.

G. ABLATION EXPERIMENTS
The proposed model consists of several submethods, and ablation experiments can explain the significance of each. In the ablation experiments, the performance of four submethod combinations ("LR", "LR with feature selection", "LR with feature selection and SMOTE", and "RF") was validated by 10-fold Monte Carlo cross-validation experiments repeated 50 times. The final results of the ablation experiments in terms of accuracy, weighted-average F1, unweighted-average F1, and MCC score are illustrated in Figs. 12-15.
The accuracy of a multi-category classifier is not the only indicator for performance evaluation. Therefore, the F1-score, which combines the precision and recall indicators, is also adopted in the experiments. It is noteworthy that the difference between the performance of "LR with feature selection" and "LR with feature selection and SMOTE" is quite small in Fig. 13. However, the difference between these two submethods increases by 5% in Fig. 14. This result indicates that feature selection and SMOTE benefit the multi-category classification performance. RF is selected as the base classifier because its standard deviation is smaller than that of the other models in the experiments.

V. CONCLUSION
In this study, the researchers proposed Cascade RF-LR (with SMOTE) using the ABC algorithm to detect the multiclass probabilities of HCV incidence. This objective was achieved using a cascade two-stage method combining the RF and LR algorithms. The final results were a combination of the results obtained from the two models. The critical threshold value for separating Model 1 and Model 2 was obtained through optimized searching using the ABC algorithm. The proposed model was evaluated using various performance measurement indicators, including prediction accuracy, precision, recall, F1-score, and MCC. In addition, the proposed model was compared against the latest algorithms.
The mean values obtained from 50 runs of 10-fold Monte Carlo cross-validation experiments were used as the retrieved values.
The results indicated that Cascade RF-LR (with SMOTE) using the ABC algorithm can be used to detect the multiclass probabilities of HCV, indicating that this model can be used to improve the effectiveness of relevant treatments. Despite the presence of imbalanced data in the clinical data of medical cases, the method in this combination was not considered in other combinations. Finally, improvements to prediction accuracy in situations involving insufficient IRs and samples will be explored in future research, such that the proposed method can be applied to the collection of complex data relating to rare diseases and medical treatments in clinical practice.