Prediction Model of Shield Performance During Tunneling via Incorporating Improved Particle Swarm Optimization Into ANFIS

This paper proposes a new computational model to predict the earth pressure balance (EPB) shield performance during tunnelling. The proposed model integrates an improved particle swarm optimization (PSO) with an adaptive neurofuzzy inference system (ANFIS) based on the fuzzy C-means (FCM) clustering method. In particular, the proposed model uses shield operational parameters as inputs and computes the advance rate as the output. Prior to modeling, critical operational parameters are identified through principal component analysis (PCA). The hybrid model is applied to the prediction of the shield performance in the tunnel section of Guangzhou Metro Line 9 in China. The prediction results indicate that the improved PSO-ANFIS model shows high accuracy in predicting the EPB shield performance in terms of the multiobjective fitness function [i.e. root mean square error <inline-formula> <tex-math notation="LaTeX">$(RMSE) = 0.07$ </tex-math></inline-formula>, coefficient of determination (<inline-formula> <tex-math notation="LaTeX">$R^{2}) = 0.88$ </tex-math></inline-formula>, variance account <inline-formula> <tex-math notation="LaTeX">$(VA) = 0.84$ </tex-math></inline-formula> for testing datasets, respectively]. The good agreement between the actual measurements and predicted values demonstrates that the proposed model is promising for predicting the EPB shield tunnel performance with good accuracy.


I. INTRODUCTION
With the progress of manufacturing technology, larger and increasingly complex tunnel projects are being constructed in many Chinese cities [1]-[4]. In tunnel excavation projects, one of the main aims is to optimise the performance of the drilling system. Therefore, accurate prediction of the tunnel boring machine (TBM) performance can be employed to reduce the risks associated with high costs and time consumed during the tunnelling process [5]. Conversely, overestimating the performance can negatively affect the utilization of project resources [6]. Thus, if the tunnelling process is addressed in an appropriate manner, the risks related to tunnelling projects will be decreased considerably [7]-[12]. In general, the TBM performance can be represented by the penetration rate and advance rate. The penetration rate is the linear distance excavated per unit time when the machine is in production. The advance rate is the rate of the machine face advancing forward, including both the production time and downtime [13]. As the advance rate determines the total construction time and the overall cost of a tunnelling project, one of the most essential efforts in tunnel construction design is to estimate the advance rate.
The associate editor coordinating the review of this manuscript and approving it for publication was Jenny Mahoney.
The performance analysis of the TBM and the development of accurate prediction models have been the goal of several studies and are still under development. In most of the previous studies, both empirical and theoretical approaches were developed for predicting the TBM performance. Typical input parameters can be categorised as follows: i) geological conditions [e.g. uniaxial compressive strength (UCS) and geological strength index (GSI)], and ii) operational parameters [e.g. thrust force (TF), cutterhead torque (CT), and tunnel diameter (D)] [14]-[17]. Owing to large uncertainties in geological environments and construction processes, empirical and theoretical approaches cannot capture the nonlinear and dynamic behaviour of the TBM performance. By utilizing a large amount of field data, artificial intelligence (AI) models may overcome this limitation. AI-based models emerged two decades ago to serve as an acceptable solution to several geotechnical problems; many comprehensive reviews have summarised the effectiveness of using AI models in widespread applications. To estimate the TBM performance, some AI models have been developed, including artificial neural networks (ANN) [18]-[20], fuzzy logic (FL) [21], and adaptive neurofuzzy inference systems (ANFIS) [22], [37]. ANN and non-linear multiple regression models have been used for estimating tunnel boring machine performance as a function of rock properties. However, to date, studies have mostly used only geological data in the prediction of TBM performance. Yin et al. [20] conducted a comparative study for identifying the soil parameters using different optimisation techniques such as genetic algorithms, particle swarm optimization, simulated annealing, differential evolution, and the artificial bee colony. Results showed that the differential evolution had the highest search ability but the slowest convergence speed.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
On the other hand, the dynamic features of construction make the tunnelling process a nonlinear problem with large uncertainties, which challenges construction management and makes accurate predictions difficult. In this respect, ANN and FL can be used to address such challenges. However, there has been an argument as to whether AI models can yield reasonable and robust solutions when addressing nonlinear problems with uncertainties [23], [24]. The other argument is that AI models may give distorted and/or inadequate explanations owing to problems with local minima and inferior generalization. As a result, hybrid models have been developed by incorporating optimization algorithms into AI-based models. Salimi et al. [11] discussed the applicability of artificial intelligence to the design of sewage transfer systems. The obtained results illustrated that the ANFIS had better performance for estimating the water hammer phenomenon in UPVC pipes, while the PSO-ANFIS was found to be more suitable for metal pipes. Azad et al. [8] performed a comparative study for optimizing the performance of an ANFIS model in simulating monthly rainfall magnitudes using different algorithms such as genetic algorithms, particle swarm optimization, differential evolution, and the artificial bee colony. The results showed that the hybrid models had better accuracy than the simple ANFIS model in escaping local optima [15]. The basic idea of the hybrid approaches, which have become predominant in recent years, is to address the shortcomings of a single approach and generate a synergy effect in prediction. For example, Elbaz et al. [25] proposed a hybrid model of a multiobjective genetic algorithm with ANFIS for predicting the shield performance, demonstrating better prediction accuracy than that from the traditional ANFIS technique. In spite of the available hybrid optimization techniques, attempts to propose new ones are still ongoing.
This study aims to propose a hybrid multiobjective optimisation model for the prediction of shield machine performance during the tunnelling process. The proposed model is constructed using a fuzzy rule-based system optimised by an improved PSO algorithm, which simultaneously adjusts both antecedent and consequent variables. Principal component analysis is applied to examine the effect of different parameters on the advancement rate of the shield machine. To evaluate the performance of the proposed model, the prediction results are compared with the results of the ANFIS-FCM model.
The remaining content is organised as follows. Section II presents the AI technique and the proposed model. To verify the effectiveness of the proposed model, it is applied for predicting the tunnelling performance of a tunnel section in Guangzhou. Section III presents the real-time field monitoring data. Section IV describes a shield tunnelling performance database and presents the principal component analysis. Section V presents prediction results with a technical discussion. The last section concludes the study.

II. ARTIFICIAL-INTELLIGENCE BASED MODELING
A. ANFIS MODEL
ANFIS, developed by Jang [26], is a multilayer adaptive network-based fuzzy inference system that maps relations between inputs and outputs. ANFIS is useful for solving complex problems with large uncertainties by creating a fuzzy inference system (FIS) with adjusted parameters of the membership function (MF). In particular, it uses neuro-adaptive learning methods to adjust membership function parameters until reaching the optimal solution. In this way, ANFIS combines the reasoning capacities of fuzzy logic principles with the learning capabilities of the ANN system to solve complicated and nonlinear issues. Fig. 1 shows the ANFIS architecture, with two input parameters ($x$, $y$) and one output parameter ($f$), using the Takagi-Sugeno fuzzy inference system.
The following content briefly describes the five layers of the ANFIS model. Further details of ANFIS can be found in other literature such as [26]. In Layer 1, each node $i$ has an MF of a linguistic variable. The output of each node is calculated according to the following equation:
$$O_{1,i} = \mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - \nu_i}{\sigma_i} \right|^{2 b_i}}$$
where $x$ is the input value of node $i$, $A_i$ is the linguistic variable associated with this node, and $\sigma_i$, $\nu_i$, and $b_i$ are function parameters with $b_i > 0$. The parameters in this layer are defined as premise parameters. In Layer 2, every node computes the firing strength for each rule by multiplying the received signals:
$$O_{2,i} = w_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2$$
where $B_i$ is the linguistic variable associated with input $y$. In Layer 3, every node computes the ratio of the $i$th rule's firing strength to the sum of the firing strengths of all rules. The outputs are normalized firing strengths:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_{j} w_j}$$
Layer 4 contains the adaptive node:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left( p_i x + q_i y + r_i \right)$$
where $\bar{w}_i$ is the output of Layer 3; $p_i$, $q_i$, and $r_i$ are the consequent parameters. Layer 5 calculates the summation of all input signals as the overall output:
$$O_{5} = \sum_{i} \bar{w}_i f_i = \frac{\sum_{i} w_i f_i}{\sum_{i} w_i}$$
B. FUZZY C-MEANS CLUSTERING
Fast and robust data clustering, which assigns a set of data into groups, is essential to extract beneficial structures from large data. Fuzzy C-means (FCM) clustering is a powerful algorithm for clustering overlapped datasets. In FCM, the grade of a data point belonging to a cluster is identified by a membership. The membership shows a large value for data near the cluster centre and a small value for data far away from the cluster centre. FCM divides the collection of $n$ vectors $x_i$ ($i = 1, 2, \ldots, n$) into fuzzy sets and determines the cluster centre for each set to minimise the fitness function. The FCM clustering method works in the following procedure. Given $n$ data points ($x_1, x_2, x_3, \ldots, x_n$), the centre of the $i$th cluster is randomly chosen as $c_i$ ($i = 1, 2, \ldots, C$), where $C$ is the total number of clusters ($C \le n$).
The membership matrix $U$ can be calculated as follows:
$$\mu_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{d_{ij}}{d_{kj}} \right)^{2/(m-1)}}$$
where $d_{ij} = \lVert c_i - x_j \rVert$ is the Euclidean distance between the $i$th cluster centre and the $j$th data point, $\mu_{ij}$ is the coefficient of membership matrix $U$, and $m$ is the fuzziness index.
The objective function can be computed as follows:
$$J(U, c_1, \ldots, c_C) = \sum_{i=1}^{C} \sum_{j=1}^{n} \mu_{ij}^{m}\, d_{ij}^{2}$$
Finally, the new fuzzy cluster centres $c_i$ ($i = 1, 2, \ldots, C$) can be calculated by utilizing the following equation:
$$c_i = \frac{\sum_{j=1}^{n} \mu_{ij}^{m}\, x_j}{\sum_{j=1}^{n} \mu_{ij}^{m}}$$
The PSO algorithm initialises a set of particles randomly scattered in the space of the objective function. Then, it updates generations to find the optimum among all candidate solutions (so-called particles). Each particle is defined by its position and velocity, which are updated based on two best fitness values: pbest and gbest. pbest is the best solution found by each particle so far, whereas gbest is the global best solution found by any particle in the population tracked by PSO. According to the pbest and gbest values, all particles update their velocities and positions until the optimal solution is reached.
As an optimization method, PSO is easy to understand and implement. It is computationally efficient and maintains the diversity of the swarm. Assuming the position $x_i^t = (x_{i1}^t, \ldots, x_{in}^t)$ and velocity $v_i^t = (v_{i1}^t, \ldots, v_{in}^t)$ of the $i$th particle in the $t$th iteration, the particle optimises its location in the $(t + 1)$th iteration by utilizing the following equations [27]:
$$v_i^{t+1} = w\, v_i^{t} + c_1 r_1 \left( p_i^{t} - x_i^{t} \right) + c_2 r_2 \left( g^{t} - x_i^{t} \right)$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
where $p_i^t$ is the best location of the $i$th particle up to the $t$th iteration, $g^t$ is the global best location up to the $t$th iteration, $r_1$ and $r_2$ are random values in the range of [0, 1], $w$ is the inertia weight where $0 \le w \le 1$, and parameters $c_1$ and $c_2$ are the cognitive acceleration rate and social coefficient, respectively.
The inertia weight $w$ greatly influences the contribution of the velocity from the previous step to the velocity at the next step. A traditional strategy of adjusting the inertia weight is applied as follows:
$$w = w_{initial} - \left( w_{initial} - w_{final} \right) \frac{T}{G_{max}}$$
where $T$ is the iteration number, $T \in [0, G_{max}]$; $G_{max}$ is the maximum number of iterations; $w_{initial}$ is the initial inertia weight; and $w_{final}$ is the value at the maximum iteration.
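The basic PSO update with an iteration-dependent inertia weight can be illustrated with the following minimal sketch. It is not the authors' improved PSO (no constriction factor is included); it assumes a linearly decreasing inertia weight, and the function name `pso_minimise`, the `sphere` test objective, the search bounds, the velocity clamp, and the default values of `w_initial` and `w_final` are all illustrative assumptions.

```python
import random


def pso_minimise(objective, dim, n_particles=30, max_iter=200,
                 w_initial=0.9, w_final=0.4, c1=2.0, c2=2.0,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a basic PSO (hypothetical helper)."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.2 * (hi - lo)  # velocity clamp: an illustrative assumption
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(max_iter):
        # Inertia weight decreases linearly from w_initial towards w_final.
        w = w_initial - (w_initial - w_final) * t / max_iter
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            val = objective(x[i])
            if val < pbest_val[i]:          # update the particle's own best
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:         # update the swarm's global best
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Simple convex test objective with its minimum at the origin."""
    return sum(c * c for c in p)


best_pos, best_val = pso_minimise(sphere, dim=3)
```

In this sketch the global best is refreshed as soon as any particle improves on it, which is one common variant; a synchronous update after each full sweep of the swarm would be an equally reasonable choice.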
Because the feature vector usually has high dimensions, PSO particles may easily be trapped in local optima rather than reach global optima [28]. Therefore, Clerc [29] added a constriction factor $k$ into PSO to ensure convergence as follows:
A rule of thumb is that the constriction factor should be a convex function in early iterations to avoid premature convergence to local minima, and a concave function in late iterations to change slowly until reaching a global optimum. Based on this rule, the constriction factor function is built as follows [30]:
Because the inertia weight influences the magnitude of the particle velocity and the constriction factor affects the convergence performance of PSO, the following content explains the improvement obtained by varying the inertia weight and the constriction factor synchronously.
The improved PSO has both the inertia weight and constriction factor varying synchronously. Integrating equations (12) and (13), we get the following equation:
After a number of iterations, the particle may be close to the global optima; the inertia weight becomes smaller to allow the particle to retain its original speed and search for the optima in a smaller range. If the particle does not reach the accurate minima, then the inertia weight becomes greater to retain its original velocity for the global optima search.
Derivation details of this approach can be found in Lu et al. [44]. By solving these derivations, we can get the equation
According to this equation, the inertia weight $w$ should be no less than the maximum value of the right-hand side. On the right-hand side of Eq. (15), $-1/k$ reaches its maximum value. When the values of $c_1$ and $c_2$ are equal to 2, then $w_{final} = 2 - 1/(3.5/4) \approx 0.857$. As the inertia weight $w$ lies in [0, 1], this study sets the initial value of the inertia weight $w_{initial} = 1$. Thus, the inertia weights can be presented as follows:
D. STOPPING CRITERIA
Stopping criteria are specified as the conditions required to terminate the iterative search algorithm when there is no obvious improvement over a number of iterations. As usual, termination criteria include the expected value of accuracy and the maximum iteration number. To determine the appropriate number of iterations, a useful approach suggested by Zielinski and Rainier [24], which is based on comparing the results of various iteration numbers, is applied. In this work, the maximum number of iterations is set as a termination criterion. To determine an appropriate iteration number, we conducted trial computations by varying the population size of the improved PSO model based on the root mean square error (RMSE) [49].
To control the overfitting, a global validation strategy was implemented according to the definition by Mitchell [45]. We assume that $c_j^{*}$ and $c_j^{*\prime}$ are the best-performing candidate groups found by computing the error rate $\varepsilon$ for every element of $P(c)$ in the optimization set ($op$) and in the validation set ($\nu$), respectively. $P(c)$ represents the powerset of classifiers $c = \{c_1, c_2, \ldots, c_n\}$ determining the population of all potential candidates $c_j$. The ranking error of the optimization set is denoted by $\varepsilon(\nu, c_j^{*})$, and the ranking error of the validation set is denoted by $\varepsilon$ . In this way, overfitting is defined as

E. HYBRID MODEL OF IMPROVED PSO-ANFIS
In order to predict the tunnelling performance with good accuracy, this study introduces an improved PSO-ANFIS model. In this hybrid model, the aforementioned improved PSO helps to tune and achieve the optimal values of ANFIS parameters through training. Fig. 2 shows a flowchart of the improved PSO-ANFIS model. The improved PSO-ANFIS model works using the following procedure. Initially, all datasets are preprocessed for training the model, including the operational shield parameters and the corresponding advance rate. With the preprocessed data, the initial ANFIS model is produced with all parameters randomly initialised. To achieve accurate prediction, the ANFIS model needs to be supported by an appropriate number of clusters. The initial ANFIS model utilises the FCM clustering approach to optimise the result by extracting a set of rules that model the datasets and form the FIS. Then, the premise parameters ($\sigma_i$, $\nu_i$, $b_i$) and consequent parameters ($p_i$, $q_i$, $r_i$) of the ANFIS model are extracted in this step to estimate the dimensions of every particle for setting up the PSO algorithm in the next step.
The corresponding parameters for each MF are extracted iteratively to form a vector. In this vector, the parameters constitute the variables to be optimised by PSO; therefore, the length of every particle in PSO can be determined. Once the PSO parameters are specified, the initial population is generated. After initializing all particles, the improved PSO updates the velocity and the position of each particle in the swarm until a convergence is obtained to get the optimal values of the variables. The objective function of each particle is computed, and the best new values are updated accordingly. The last step assigns these optimal values as antecedent and consequent parameters to the final ANFIS model.
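The encoding of ANFIS parameters as a PSO particle can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: it assumes one rule per cluster, one generalised bell MF per input and rule with premise parameters (sigma, nu, b), and a first-order Sugeno consequent per rule. The names `bell_mf`, `decode_particle`, and `anfis_predict`, and the parameter ordering inside the particle, are hypothetical.

```python
def bell_mf(x, sigma, nu, b):
    """Generalised bell membership grade: width sigma, centre nu, shape b > 0."""
    return 1.0 / (1.0 + abs((x - nu) / sigma) ** (2.0 * b))


def decode_particle(particle, n_inputs, n_rules):
    """Split a flat particle into (premise, consequent) parameters per rule."""
    n_premise = 3 * n_inputs      # (sigma, nu, b) for each input
    n_conseq = n_inputs + 1       # linear coefficients plus a constant term
    per_rule = n_premise + n_conseq
    if len(particle) != n_rules * per_rule:
        raise ValueError("particle length does not match the rule layout")
    rules = []
    for r in range(n_rules):
        chunk = particle[r * per_rule:(r + 1) * per_rule]
        premise = [tuple(chunk[3 * k:3 * k + 3]) for k in range(n_inputs)]
        conseq = chunk[n_premise:]
        rules.append((premise, conseq))
    return rules


def anfis_predict(particle, inputs, n_rules):
    """Evaluate the five ANFIS layers for one input vector."""
    rules = decode_particle(particle, len(inputs), n_rules)
    firing, outputs = [], []
    for premise, conseq in rules:
        w = 1.0
        for x, (sigma, nu, b) in zip(inputs, premise):
            w *= bell_mf(x, sigma, nu, b)            # Layers 1 and 2
        f = sum(c * x for c, x in zip(conseq, inputs)) + conseq[-1]
        firing.append(w)
        outputs.append(f)
    total = sum(firing)                              # Layer 3 normaliser
    return sum(w * f for w, f in zip(firing, outputs)) / total  # Layers 4-5


# Two rules, one input: constant consequents 1 and 3, centres at 0 and 2.
particle = [1.0, 0.0, 1.0, 0.0, 1.0,
            1.0, 2.0, 1.0, 0.0, 3.0]
prediction = anfis_predict(particle, [1.0], n_rules=2)
```

At the midpoint between the two centres both rules fire equally, so the output is the average of the two consequents. In a PSO loop, the fitness of a particle would be an error measure of `anfis_predict` over the training set.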

III. PROJECT DESCRIPTION
As one type of TBM, an earth pressure balance (EPB) shield machine is suitable for digging tunnels in unstable ground such as clay, silt, and sand. In EPB shield tunnelling, there are an increasing number of computational models for predicting the cutting rate [31], torque and thrust [32], and advance speed [11], [25]. However, no studies have integrated improved PSO with ANFIS to predict the EPB shield performance.
This study completes this work by applying the proposed model of improved PSO-ANFIS to a field tunnelling project in Guangzhou, China [33], [34]. In the Guangzhou metro tunnels, an EPB shield machine with a diameter of 6.25 m was used to excavate the tunnel section between Maanshan Park Station and Liantang Station for Guangzhou Metro Line No. 9 [35], [36]. This case study is selected to verify the applicability of the proposed model. Also, this case represents a new project in the urban area of Guangzhou city, which needs to be carefully considered based on the existing infrastructures. The main specifications of the utilised machine are summarised in Table 1. Fig. 3 shows a plan view of the construction site. The tunnel alignment is approximately 1280 m in length, with a burial   Fig. 4. This study collects inputs of operational parameters and geological conditions from the monitoring and testing results along the tunnel alignment.
The shield machine mostly encountered silty clay soil in the studied section. The properties are listed in Table 2.
Field investigations showed that the void ratio ranges between 0.7 and 0.85, and the maximum cohesion value is 40 kPa. The silty clay soils have a plasticity of over 10 and a uniaxial compressive strength (UCS) of less than 2. The soils have a consistency index of less than 1.0, categorised as low-plasticity clay (CL) according to Casagrande's plasticity chart. In a standard penetration test (SPT), N values of the soils are over 10.

IV. DATA PREPROCESSING
In this project, the EPB shield machine has a built-in data acquisition system in which the actual data are collected by the sensors of every subsystem. The collected data are stored in the shield machine computer and transferred to the laboratory server over a fibre-optic network. This system is provided to simplify data collection during the tunnelling process and serves as a decision-making tool for tunnel engineers. Prior to the data analysis, raw data from the shield tunnelling are preprocessed based on the dimensional data. It is noteworthy that the data acquisition system records a wide diversity of shield operating data related to the tunnel performance, including the thrust force, cutterhead torque, and soil and grouting pressure. To adjust the dimensions of the monitoring data for the selection of shield parameter data, the following two criteria are adopted: (1) The monitoring data should have meaningful values and be collected in the daily reports. Engineers refer to the daily reports to analyse the shield performance during the tunnelling process. (2) The shield tunnelling parameters are examined and selected by tunnel experts so that the selected parameters can reflect the actual relationship between the shield tunnelling performance and the different tunnel parameters. Operation data usually include a certain number of outliers, which affect the quality of the data [48]. Zhao et al. find that the K-nearest neighbour (KNN)-based outlier detection method is appropriate for detecting the outliers from a large amount of data [47]. Inappropriate raw data were screened as outliers based on the K-nearest neighbour algorithm proposed in [46]. This study adopted the distance-based method of the KNN technique summarised in Algorithm 1 to detect outliers. In this manner, a total of 200 operating datasets were selected for the prediction of tunnel performance.
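A distance-based KNN outlier screen of the kind referred to above can be sketched as follows. This is only an illustrative variant, not a reproduction of Algorithm 1: it assumes that the outlier score of a record is its mean Euclidean distance to its `k` nearest neighbours, and that a record is flagged when its score exceeds the mean score by `n_std` standard deviations. The function names, `k`, and `n_std` are hypothetical choices.

```python
import math


def euclid(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_outlier_scores(points, k=3):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(euclid(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(points, k=3, n_std=2.0):
    """Flag points whose score exceeds mean + n_std * std of all scores."""
    scores = knn_outlier_scores(points, k)
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [s > mean + n_std * std for s in scores]


# Nine tightly grouped records plus one record far from the rest.
records = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
records.append((10.0, 10.0))
flags = flag_outliers(records)
```

In practice the operating parameters would be scaled to comparable ranges before computing distances, since otherwise the parameter with the largest numerical range dominates the score.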

A. SHIELD TUNNELLING PERFORMANCE DATABASE
Following the aforementioned data preprocessing, this study established databases of the shield tunnelling performance, focusing on shield machine specifications and operational parameters. The shield machine specifications were collected from the manufacturer's documents. Operational parameters were directly extracted from a built-in data acquisition system in the EPB shield machine. In total, there are nine parameters: cutterhead torque (CT), thrust force (TF), soil pressure (SP), rotational speed of the screw (SC), cutterhead rotation speed (CR), grouting pressure (GP), grouting amount (GA), excavation depth (H), and advance rate (AR). Among them, AR is closely interrelated with the other eight parameters.  result from frequent changes of the machine status during construction. Table 3 displays statistics of the nine parameters.

B. PRINCIPAL COMPONENT ANALYSIS
Principal component analysis (PCA) is a conventional multivariate statistical approach used for classification and regression in various fields of study [37]-[39]. PCA can be applied to reduce complex sets of predictor variables to a lower dimension. During the analysis, PCA can provide a few linear combinations of the parameters that can be used to summarize the data without losing much information. This method uses an orthogonal transformation to transform observations of possibly correlated variables to linearly uncorrelated variables. It reduces the dimensionality while keeping most of the informational value of the input data. PCA has been widely used for selecting independent variables and eliminating duplicate or highly associated variables. Fig. 6 shows a two-variable dataset, originally measured in the X-Y coordinate system. In another coordinate system, the U axis refers to the principal direction of this dataset, and the V axis refers to the second most important direction. The V axis is orthogonal to the U axis, and the covariance between the U and V variables is equal to zero. That means that all data are decorrelated by transforming from (X, Y) coordinates to (U, V) coordinates through an orthogonal transformation. PCA computes new variables as a linear combination of the original variables by calculating the covariance/correlation matrix of the data. When the variation of a dataset is caused by a natural property or a random experimental error, the variables are likely to follow normal distributions.
FIGURE 6. Principal components for data representation [38].
A linear transformation converts the input data into a set of components that are arranged according to their variance. The first principal component is the direction along which the data have the most variance. PCA projects the input data onto a $k$-dimensional eigenspace of $k$ eigenvectors that are computed from the covariance matrix of the data $N = [N_1, \ldots, N_n]$, where $N_i$ is the $i$th $d$-dimensional data sample and $n$ refers to the number of samples. PCA chooses the $k$ eigenvectors, with $k < d$, having the largest eigenvalues, which represent the main components of the dataset. The selected eigenvectors are arranged as the columns of a matrix, where the first column corresponds to the largest eigenvalue. Eventually, PCA computes and determines the feature vector $v$ from the data in the matrix [37].
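For the two-variable case illustrated in Fig. 6, the rotation from (X, Y) to (U, V) coordinates can be written in closed form. The sketch below is a minimal illustration, assuming sample covariances with an n - 1 denominator; the function name `pca_2d` and the sample values are hypothetical. It returns the decorrelated coordinates together with the share of the total variance carried by the U axis.

```python
import math


def pca_2d(xs, ys):
    """Rotate centred (x, y) data onto its principal axes (u, v)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Angle of the first eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    us = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    vs = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    var_u = sum(u * u for u in us) / (n - 1)
    var_v = sum(v * v for v in vs) / (n - 1)
    return us, vs, var_u / (var_u + var_v)


# Hypothetical, nearly collinear sample data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
us, vs, variance_ratio = pca_2d(xs, ys)
cov_uv = sum(u * v for u, v in zip(us, vs)) / (len(us) - 1)
```

The covariance between the rotated variables vanishes up to rounding error, which is the decorrelation property described above. For more than two variables, the same idea requires a full eigendecomposition of the covariance matrix.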
Given the inputs of several parameters, this study uses PCA to identify critical input parameters that have the greatest impact on the advance rate (Fig. 7). This figure shows that the three inputs of CT, SC, and CR (i.e. the cases of 2, 3, and 4) are the most critical parameters, with the highest variance ratio of 93%. Therefore, CT, SC, and CR are selected as input parameters for developing the predictive model in this study, and the advance rate is considered as a function of these three parameters.

V. RESULTS AND DISCUSSION
As previously mentioned regarding PCA, this study uses three input parameters (CT, SC, and CR) to predict AR within the hybrid improved PSO-ANFIS model. Additionally, an ANFIS model is established to compare the prediction accuracy of the hybrid model. For general computation procedures of the ANFIS-based FCM model and the PSO-ANFIS model, please refer to section II. Both ANFIS and PSO-ANFIS are implemented in MATLAB. This study has a total of 200 datasets, randomly divided into two subsets, of which 80% of the datasets are the training set and the other 20% are the testing set, following the recommendation of Swingler [40].

A. ANFIS-FCM MODEL
In the ANFIS model, all datasets are normalised to simplify the computational procedure using the following equation [41]:
$$X_n = \frac{X - X_{min}}{X_{max} - X_{min}}$$
where $X$ and $X_n$ are the measured and normalised data, respectively; $X_{min}$ and $X_{max}$ are the minimum and maximum data of $X$, respectively.
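The normalisation and the random 80%/20% partition of the datasets can be sketched as follows. This is a minimal illustration, assuming min-max scaling of each parameter to [0, 1]; the helper names, the fixed seed, and the integer stand-in records are hypothetical.

```python
import random


def min_max_normalise(values):
    """Scale a list of measurements linearly so that min -> 0 and max -> 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into training and testing sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


normalised = min_max_normalise([2.0, 4.0, 6.0])
train_set, test_set = train_test_split(list(range(200)))
```

With 200 records this yields 160 training and 40 testing records. One design point worth noting is that the minimum and maximum used for scaling are commonly taken from the training set only, so that no information from the testing set enters the model.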
The ANFIS model in MATLAB requires users to determine the number and the type of membership functions (MFs). As there is no explicit method or formula to predict the necessary MF numbers [42], this study estimates the number of MFs by trial and error. The best estimates are obtained when using three Gaussian MFs. Table 4 lists the employed parameters in the developed model. The Takagi-Sugeno method is applied as FIS owing to its high accuracy and good computational effectiveness in developing a systematic approach for constructing fuzzy rules from the input-output dataset. MATLAB with the genfis3 function is implemented to construct the initial FIS structure of the model. More ANFIS settings based on the FCM clustering are listed in Table 4.
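The FCM clustering that underlies the initial FIS can be sketched in a few lines. This is a plain illustration of the standard algorithm (membership update followed by centre update), not a reimplementation of `genfis3`; the function names, the tolerance, the seed, and the sample points are assumptions. Note that the membership table is stored here as `U[j][i]`, with the data point as the first index.

```python
import math
import random


def euclid(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def memberships(points, centres, m):
    """Return U[j][i]: degree to which point j belongs to cluster i."""
    power = 2.0 / (m - 1.0)
    table = []
    for x in points:
        d = [max(euclid(c, x), 1e-12) for c in centres]  # avoid division by zero
        table.append([1.0 / sum((d_i / d_k) ** power for d_k in d) for d_i in d])
    return table


def fcm(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-means with centres initialised at randomly chosen data points."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, n_clusters)]
    dim = len(points[0])
    for _ in range(max_iter):
        table = memberships(points, centres, m)
        new_centres = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in table]
            total = sum(weights)
            new_centres.append(
                [sum(w * x[k] for w, x in zip(weights, points)) / total
                 for k in range(dim)])
        shift = max(euclid(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:   # stop once the centres no longer move
            break
    return centres, memberships(points, centres, m)


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, U = fcm(points, n_clusters=2)
```

Each row of `U` sums to one, and each resulting cluster would correspond to one fuzzy rule in an FCM-based FIS.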
To improve the model accuracy, four different cases were designed to evaluate the impact of using different numbers of clusters in the FIS function on the computational results with ANFIS. The four cases use 5, 7, 10, and 14 clusters, respectively. Table 5 presents the computational results from different cases for the training set and the testing set in terms of the coefficient of determination (R 2 ), root mean square error (RMSE), and variance account (VA).
The variations in the statistics can quantify the impact of changing the cluster number on the network result for the ANFIS model. Small variations of $R^2$, RMSE, and VA indicate that the number of clusters in the ANFIS model only slightly affects the prediction accuracy. Among the four cases, the third case of the ANFIS model using 10 clusters has the best prediction accuracy. Therefore, the following ANFIS model uses 10 clusters in the FIS function to predict the tunnel performance. Fig. 8 shows the correlations between the measured and predicted advance rates for the training set and the testing set. This figure shows a better correlation in the training set than in the testing set for the ANFIS model.
In Fig. 9, it can be seen that the predicted values of the advance rate are relatively close to the measured values. In addition, the absolute and relative error indicators in Fig. 10 show that the predicted data can successfully follow the measured data with small discrepancies in the range of ±25%.

B. IMPROVED PSO-ANFIS MODEL
To develop the hybrid model, the improved PSO is used to obtain optimum parameters for the ANFIS model. In this study, Gaussian functions are applied as the membership functions (MFs), as suggested by several researchers [14], [25]. In this hybrid model, PSO helps to establish closer relationships between the input and output. To determine the optimal PSO parameters, a trial-and-error approach is applied to find the maximum iteration number, $c_1$, and $c_2$ [48]. These three parameters are eventually 300, 2, and 2, respectively (Table 4). As indicated in Fig. 11, the PSO algorithm converged to the optimal fitness value after approximately 55 iterations, and then it settled at a constant level. This shows that PSO reached the optimal solution and that the search operation could be stopped.
The network performance results of the improved PSO-ANFIS model with different population sizes are displayed in Table 6. From this table, it can be concluded that the improved PSO-ANFIS model with a population size equal to 100 leads to the best prediction capacity. In the present study, the architecture of the fourth model (No. 4) was chosen as the best model to predict the tunnel performance, as shown in Table 6. In this study, the convergence speed of the proposed model is also considered [16]. Results showed that the 300 iterations required about 20 minutes to train the system. Overall, the computational cost of the proposed model is acceptable for the accuracy achieved. The comparison results in the estimation of the advance rate for the improved PSO-ANFIS model in the training set and the testing set are displayed in Fig. 12.
Scattered data in both plots are close to the line of equality (shown as a dashed line), demonstrating the good accuracy of the improved PSO-ANFIS model. To give a visual sense of the improved PSO-ANFIS model, Fig. 13 shows the relation between the measured and predicted AR for all datasets. From Fig. 13, it can be seen that the predicted AR values are close to the measured values for almost all of the data. For more clarification, the absolute and relative errors of the outputs for the improved PSO-ANFIS model are plotted against the measured advance rate data, as depicted in Fig. 14.
The relative error of AR varies around zero, mostly in a smaller range (±15%) than the range (±25%) of the ANFIS-FCM model. This indicates that the improved PSO-ANFIS model has better accuracy in the prediction of tunnel boring machine performance when compared to the ANFIS-FCM model.

C. DISCUSSION
Applying computational techniques such as AI to the prediction of tunnelling performance has become increasingly popular. Previous studies on the prediction of TBM performance have mostly performed structural analyses under static load conditions far from actual working conditions [3]. Furthermore, predicting the TBM performance is a nonlinear and multivariable complex problem that cannot be accurately predicted using simple models.
In this respect, this study presented a hybrid multiobjective optimization technique to improve the prediction of TBM performance based mainly on dynamic operational factors. The dynamic operational factors, unlike the geological conditions, are controllable and thus can be manipulated by changing the machine orientation functions and optimal subsystems. Theoretically, a perfect prediction model would have RMSE = 0, $R^2$ = 1, and VA = 1. A small RMSE and large values of the coefficient of determination $R^2$ and of VA indicate good prediction accuracy. To assess the performance of the proposed model, this study uses a multiobjective fitness function with the objective of decreasing the RMSE and increasing both $R^2$ and VA.
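The three metrics are commonly defined as below. The weighted sum in the last line is one plausible form of the multiobjective fitness function, given here as an assumption rather than as the exact expression used in this study; under this form, a smaller $f$ indicates a better model.

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{mea,i}-x_{pre,i}\right)^{2}} \\
R^{2} &= 1-\frac{\sum_{i=1}^{n}\left(x_{mea,i}-x_{pre,i}\right)^{2}}{\sum_{i=1}^{n}\left(x_{mea,i}-x_{m}\right)^{2}} \\
\mathrm{VA} &= 1-\frac{\operatorname{var}\left(x_{mea}-x_{pre}\right)}{\operatorname{var}\left(x_{mea}\right)} \\
f &= Z_{1}\,\mathrm{RMSE}+Z_{2}\left(1-R^{2}\right)+Z_{3}\left(1-\mathrm{VA}\right) \quad \text{(assumed form)}
\end{align}
```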
where $x_{mea}$, $x_{pre}$, and $x_{m}$ are the measured, predicted, and mean of the $x$ values, respectively; and $n$ is the total number of datasets. $Z_1$, $Z_2$, and $Z_3 \in [0, 1]$, satisfying $Z_1 + Z_2 + Z_3 = 1$.
To reach the optimal model, the values of $Z_1$, $Z_2$, and $Z_3$ were set to 0.4, 0.31, and 0.29, respectively. To understand the impact of the input parameters (CT, SC, and CR) on the response (AR) more fully, three-dimensional surface graphs were studied. Fig. 15 shows surface graphs of the AR predicted by the improved PSO-ANFIS model as a function of two input parameters while the third input parameter is held constant. As expected, the AR mainly follows an approximately linear increasing trend with increasing SC, CT, and CR. However, there is a sharp decrease followed by a sudden increase in a local region (0.4 < SC < 0.6, 0 < CT < 0.2). The fluctuations of the AR values in this region probably indicate either a sudden instability at the tunnel face or sudden changes in the local geological features.
This interpretation was supported by our discussion with the TBM operator, who reported clearly noticing sudden changes in local regions when operating the machine. For instance, when the soil extracted by the screw conveyor system changed noticeably, or when the amount of extracted soil differed greatly from the estimated quantity, the TBM performed differently and the operation parameters changed immediately. Therefore, while AI-based models should always seek a good trade-off between model complexity and data dimensionality, the challenge of representing unexpected operations under different field conditions should also be considered.
Overall, the variation of the AR with CT, SC, and CR is intuitive and agrees with previous research. For instance, a similar model for the prediction of shield machine performance was developed by the author and his colleagues [12], [25] based on data from the Ma-Lian section of Guangzhou Metro Line No. 9.
Their model integrates a genetic algorithm (GA) with the adaptive neurofuzzy inference system (ANFIS) based on a multiobjective fitness function. The results of their model are compared with those of the improved PSO-ANFIS model in Fig. 16, which displays the assessment results of the ANFIS-FCM model, the improved PSO-ANFIS model, and the GA-ANFIS model proposed in [25]. With a smaller RMSE and greater values of $R^2$ and VA, the improved PSO-ANFIS model outperforms the GA-ANFIS and ANFIS-FCM models. These analyses show that the proposed PSO-ANFIS model can predict the advance rate and represent its statistical features with reasonable accuracy.
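The comparison above rests on three metrics and a weighted fitness value. The following minimal sketch computes them using standard definitions of RMSE, $R^2$, and VA. The weighted form in `fitness` (lower is better) is an assumption made for illustration, the weights are those reported in this section, and the sample values are hypothetical.

```python
import numpy as np

def rmse(x_mea, x_pre):
    """Root mean square error between measured and predicted values."""
    x_mea, x_pre = np.asarray(x_mea, float), np.asarray(x_pre, float)
    return float(np.sqrt(np.mean((x_mea - x_pre) ** 2)))

def r2(x_mea, x_pre):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    x_mea, x_pre = np.asarray(x_mea, float), np.asarray(x_pre, float)
    ss_res = np.sum((x_mea - x_pre) ** 2)
    ss_tot = np.sum((x_mea - np.mean(x_mea)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def va(x_mea, x_pre):
    """Variance account, 1 - var(residual) / var(measured)."""
    x_mea, x_pre = np.asarray(x_mea, float), np.asarray(x_pre, float)
    return float(1.0 - np.var(x_mea - x_pre) / np.var(x_mea))

def fitness(x_mea, x_pre, z=(0.4, 0.31, 0.29)):
    """Assumed weighted form: each term is 0 for a perfect model."""
    z1, z2, z3 = z
    return (z1 * rmse(x_mea, x_pre)
            + z2 * (1.0 - r2(x_mea, x_pre))
            + z3 * (1.0 - va(x_mea, x_pre)))

# Hypothetical normalised advance rates, for illustration only
measured = [0.2, 0.4, 0.6, 0.8]
model_a = [0.22, 0.38, 0.61, 0.79]   # close to the measurements
model_b = [0.30, 0.30, 0.70, 0.60]   # further from the measurements
print(fitness(measured, model_a), fitness(measured, model_b))
```

Under this assumed form, the model with the smaller fitness value is preferred, which matches the ranking by smaller RMSE and larger $R^2$ and VA.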
In practical applications, shield parameters such as CT, CR, and SC can be used as inputs to predict the tunnelling performance of the EPB shield machine. Notably, the proposed model can provide initial estimates of shield performance, especially the advance rate of the shield machine, at the project planning stage. With the advance rate determined from the proposed model, project durations can be estimated, which facilitates time allocation when developing construction plans. In brief, the improved model is expected to support engineers in predicting shield-tunnelling advancement and offers an intelligent option for achieving acceptable predictions of TBM performance.

VI. CONCLUSION
This study presented an AI-based model to predict the shield machine performance during the tunnelling process. In this regard, the most influential parameters were identified through PCA, and an improved PSO-ANFIS model was established to predict the advance rate of the EPB shield tunnelling. The proposed model was applied to a case study of the Guangzhou Metro Line 9 tunnelling project. For validation, prediction results from the improved PSO-ANFIS model were compared with the prediction results from an ANFIS-FCM model. Major conclusions were obtained as follows:
• The improved PSO-ANFIS model can predict the shield performance in terms of the advance rate, in good agreement with the measured advance rate for both the training and testing sets. The improved PSO-ANFIS model uses computation parameters tailored to the studied tunnel section for predicting the advance rate. Based on a multiobjective fitness function, the values of $R^2$, RMSE, and VA of 0.88, 0.07, and 0.84 for the testing datasets indicate that the proposed model is accurate.
• The proposed model demonstrates better prediction accuracy than the ANFIS-FCM and GA-ANFIS [25] models based on a multiobjective fitness function. Prediction results from this study can help decision-makers estimate the project duration and construction cost of EPB shield tunnels. This supports efficient construction management, particularly when developing construction plans.
• The relative error of the improved PSO-ANFIS model was mostly within ±15%, whereas the ANFIS-FCM model showed a wider error range of ±25%. This demonstrates the higher precision of the improved model in predicting tunnel boring machine performance. Therefore, the improved model can be utilised to guide construction practices in a more meaningful way.
• The proposed model is general and can be used for analysing different tunnelling systems in other types of geological conditions. To improve the robustness, more tunnelling data should be collected for calibration and validation of the proposed model.