Nonlinear Model Predictive Control Using Feedback Linearization for a Pressurized Water Nuclear Power Plant

The present work introduces a nonlinear control scheme that combines intelligent feedback linearization (FBL) and model predictive control (MPC) for a pressurized water reactor (PWR). The nonlinear plant model considered in this study is derived from first principles and consists of 38 state variables. First, system identification using a dynamic neural network (DNN) structure is performed to obtain a standard affine nonlinear system. The quasi-Newton algorithm is employed to find the best DNN model. Then, an FBL law is formulated to address the nonlinearity of the DNN model. An MPC controller is developed based on the feedback-linearized system to improve the system performance. To evaluate its performance, the designed controller is compared with a linear MPC controller that is based on state-space models. The proposed approach improves the load-following operation and offers better disturbance rejection capability than the conventional MPC. In addition, numerical measures are employed to compare and analyze the performances of the two control strategies.


I. INTRODUCTION
Nuclear power plants (NPPs) are characterized as complex, nonlinear, time-varying, and constrained systems. Controlling an NPP is a substantial challenge due to parameter variations that are caused by fuel burnup and internal reactivity feedback, among other factors. Moreover, the daily load cycle variations that are due to the load-following mode can significantly degrade plant performance. Conventional controllers such as proportional-integral and linear-quadratic controllers are unable to control a plant effectively and robustly in an uncertain environment [1], [2]. Thus, it is necessary to improve the regulation and control strategies to strengthen the security, reliability, and operability of NPPs.
The associate editor coordinating the review of this manuscript and approving it for publication was Bing Li.
Receding horizon control, better known as model predictive control (MPC), is a widely employed control method that handles constraints effectively in multi-input multi-output (MIMO) systems. MPC uses an explicit model to predict the system output at future time instants and solves an online optimization problem to obtain the future control input that brings the system as close as possible to the reference [3]-[5]. MPC has received considerable attention in nuclear plant control over the last two decades. An MPC controller is proposed in [6] for the distribution and power control of a pressurized water reactor (PWR). A nonlinear MPC controller is designed in [7] to control the power of a research reactor in the presence of disturbances. Eliasi et al. [8], [9] developed a robust nonlinear MPC controller to perform a load-following operation. One of the main drawbacks of these studies is that they require a precise mathematical model for control design. Although an approximate nuclear reactor model can be obtained using first-principles techniques, it is expensive to develop and specific to the process. Recently, subspace-based MPC techniques have been developed for PWR control [10]-[14]. These methods use linear subspace models that are obtained directly from the input-output data of the system. However, because they rely on linear models, they are ineffective over a broad operating range, and the control performance can degrade substantially. Multiple predictive control is proposed in [15] to address the nonlinearities that are associated with the control of a movable PWR, but the implementation of such a control strategy is nevertheless challenging due to its complex structure.
In the past three decades, soft computing techniques, such as neural networks (NNs), fuzzy logic (FL), and genetic algorithms (GAs), have been employed to control NPPs. For instance, the FL and NN approaches are employed to regulate the temperature and power of a nuclear reactor in [16]-[19]. A fuzzy adaptive robust optimal controller and an NN-based controller are designed to control a reactor during load-following operation [20], [21]. Some studies have applied fuzzy logic-based PID controllers to enhance the effectiveness of NPP control [22], [23]. An intelligent reactor core controller is developed by combining fuzzy approaches and dynamic neural networks [24]. In [25], a GA-optimized PID controller is applied to the control of the power level. In a similar context, the particle swarm optimization (PSO) algorithm is used to enhance the control performance [26]. In a recent paper, Elsisi and Abdelfattah [27] employed a new optimization algorithm, the lightning search algorithm, to find the optimal parameters of a variable structure controller. Fuzzy reasoning techniques and neural network architectures are also used to enhance MPC performance. For instance, Liu et al. [28], [29] and Na et al. [30] proposed MPC strategies based on nonlinear fuzzy models. Recently, a neural-network-based MPC controller was employed to control the thermal power of a nuclear superheated-steam supply system [31]. The application of machine learning techniques in the context of optimal MPC tuning has recently been reported in the literature. In [32], the PSO algorithm is used for the optimal tuning of a multivariable MPC. In [33], the authors proposed a tuning method for the MPC parameters for robotic manipulators. In another recent work [34], the authors put forward a framework that uses a new optimization algorithm, the social ski driver (SSD) algorithm, to tune a nonlinear MPC; the efficacy of the technique is validated on an autonomous vehicle.
One of the main drawbacks of the nonlinear configurations is the high computational burden that is associated with the online solution of the nonlinear programming problem. The feedback linearization (FBL) approach is employed to reduce the computational burden of nonlinear techniques. FBL-based techniques that use empirical nonlinear models with linear MPC have been proposed in the literature [35], [36]. Recently, the FBL approach has been combined with other techniques to improve the control performance during the load-following operation. For instance, an FBL-based robust controller is developed for controlling core power peaking during load-following operation [37]. A partial FBL-based linear active disturbance rejection control is proposed to improve the transient response during power control [38]. Additionally, a robust observer-based FBL controller is presented for addressing the disturbances and uncertainties to which the system is subject [39]. In this study, an FBL-based integrated nonlinear MPC technique is developed by employing a dynamic neural network (DNN). For the first time, a DNN-based FBL is proposed in the context of nonlinear control of a PWR-type nuclear reactor. The control structure consists of an FBL law that is based on an identified DNN model and an MPC controller that controls the linearized system.
Most studies in the NPP control design literature do not consider the coupling effects among the various subsystems, nor do they consider the model equations of sensors and actuators. A realistic study should incorporate control schemes for the entire NPP process. However, only a few papers have discussed the control design for the whole NPP system [1], [2], [40], [41]. In this respect, the proposed FBL-based MPC control strategy is applied to the different subsystems of the integrated NPP model. The efficacy of the proposed controller is validated in the MATLAB/Simulink environment. The proposed control approach is compared with a state-space-based classical MPC controller. The main contributions of this paper are as follows:
• Nonlinear model predictive control using feedback linearization based on dynamic neural networks is proposed to enhance the control performance of a PWR.
• Control of the reactor core, steam generator (SG), pressurizer, and turbine is studied.
• A comparison with state-space MPC is performed.
The remainder of the paper is organized as follows: The plant model is presented in Section II. Section III provides a brief introduction to the neural network structure and presents the system identification approach that is used to estimate the dynamics of the PWR model. Section IV introduces the design of the hybrid control strategy. Section V presents and discusses the simulation results. The conclusions are presented in Section VI.

II. NON-LINEAR PWR MODEL
A schematic diagram of the various components of a typical PWR plant is shown in Fig. 1. The plant comprises two main loops: the primary and secondary loops. The primary loop consists of a reactor core, steam generator, pressurizer, and reactor coolant pump. The reactor core is described using a point kinetics model and is controlled using control rods. The coolant that is heated by the reactor core is pumped to the steam generator, where steam is produced and supplied to the secondary loop. The pressurizer is modelled with pressure and level equations. It maintains the primary system pressure so that there is no boiling in the primary system. The secondary loop consists of a turbine, moisture separators, and steam reheaters. The turbine model consists of high-, intermediate-, and low-pressure turbines. For brevity, only a summary of the reactor core model is provided here. For a complete description of the entire NPP model, readers are referred to [40]-[42].

A. POINT KINETICS REACTOR CORE MODEL
The core-neutronic system models the normalized power and the normalized precursor concentrations of six groups of delayed neutrons. In the model equations, $P_r$ and $C_{ir}$ denote the normalized power and the normalized delayed neutron precursor concentration of group $i$, respectively; $\lambda_i$ and $\beta_i$ are the decay constant and the fraction of delayed neutrons of group $i$, respectively; $\Lambda$ is the prompt neutron lifetime; and $\rho_t$ denotes the total reactivity.
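For reference, a commonly used normalized six-group point-kinetics form that is consistent with the symbols defined above can be written as follows. This is a sketch based on the standard textbook model, in which $\beta = \sum_i \beta_i$ is assumed to denote the total delayed-neutron fraction; it is not necessarily the exact formulation used for the plant model of [40]-[42].

```latex
\begin{align}
\frac{dP_r}{dt} &= \frac{\rho_t - \beta}{\Lambda}\,P_r
                 + \sum_{i=1}^{6}\frac{\beta_i}{\Lambda}\,C_{ir},\\
\frac{dC_{ir}}{dt} &= \lambda_i P_r - \lambda_i C_{ir},
\qquad i = 1,\dots,6 .
\end{align}
```

With this normalization, $P_r = C_{ir} = 1$ and $\rho_t = 0$ give a steady state, since the two terms of the power equation cancel.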

B. THERMAL-HYDRAULICS MODEL
The core thermal-hydraulics model comprises two lumps that represent the coolant nodes and one lump that represents the fuel node. In the model equations, $T_f$ denotes the fuel temperature; $T_{c1}$ and $T_{c2}$ denote the temperatures at coolant nodes 1 and 2, respectively; $H_f$ and $H_c$ are proportionality constants; and $\tau_f$, $\tau_c$, and $\tau_r$ are system time constants.

C. REACTIVITY MODEL
Variations in the fuel and coolant temperatures introduce internal reactivity feedback into the system. The reactivity is controlled using control rods, and hence, the total reactivity $\rho_t$ is the combination of $\rho_{rod}$, $\rho_f$, $\rho_{c1}$, and $\rho_{c2}$, which are the reactivities that are related to the control rod, the fuel temperature, and the coolant temperatures at nodes 1 and 2, respectively; $\alpha_f$ and $\alpha_c$ are the temperature coefficients of reactivity due to the fuel and coolant, respectively. Definitions of the various inputs and outputs for the loops of the PWR are presented in Table 1. The values of the system parameters are obtained from [40]-[42].

III. IDENTIFICATION OF THE PWR USING DYNAMIC NEURAL NETWORK

A. DNN STRUCTURE
The dynamic neural network that was introduced in [43] has notable potential for learning the dynamics of complex nonlinear systems, whereas static NNs are limited and fail to realize acceptable modelling and mapping performance. A dynamic neuron model consists of internal dynamics that are added to a static neuron, which cause the activity of the neuron to depend on its internal states. Fig. 2 illustrates the DNN structure that is used in this work. In the DNN equation (7), $\beta_i$, $\omega_i$, and $\gamma_i$ are adjustable weights; $x_i$ is the state of the system; and $u_j$ is the input signal. In the vectorized form of (7), $x$ is the vector of state coordinates, and $\beta \in \mathbb{R}^{N \times N}$ is a diagonal matrix with diagonal elements $\beta_1, \dots, \beta_N$.
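To make the structure concrete, the following minimal Python sketch simulates a DNN of the kind described above. It assumes the vectorized form $\dot{x} = -\beta x + \omega\,\sigma(x) + \gamma u$ with $\sigma = \tanh$, which is a common choice in the DNN literature but is an assumption here, as are the forward-Euler integration and the illustrative weights `BETA`, `OMEGA`, and `GAMMA`; none of these values come from the identified models.

```python
import math


def dnn_derivative(x, u, beta, omega, gamma):
    """Right-hand side of the assumed DNN: dx/dt = -beta*x + omega*tanh(x) + gamma*u.

    beta is the list of diagonal elements; omega and gamma are row-major matrices.
    """
    n = len(x)
    sig = [math.tanh(xj) for xj in x]
    dx = []
    for i in range(n):
        recurrent = sum(omega[i][j] * sig[j] for j in range(n))
        forced = sum(gamma[i][j] * u[j] for j in range(len(u)))
        dx.append(-beta[i] * x[i] + recurrent + forced)
    return dx


def simulate_dnn(x0, inputs, dt, beta, omega, gamma):
    """Forward-Euler simulation; returns the initial state plus one state per input sample."""
    x = list(x0)
    trajectory = [list(x)]
    for u in inputs:
        dx = dnn_derivative(x, u, beta, omega, gamma)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical second-order example (N = 2 states, one input); values are illustrative only.
BETA = [1.0, 2.0]
OMEGA = [[0.0, 0.5], [0.3, 0.0]]
GAMMA = [[1.0], [0.0]]
```

Calling `simulate_dnn([0.0, 0.0], [[1.0]] * 100, 0.01, BETA, OMEGA, GAMMA)` would return the state trajectory for a unit step input held for 100 samples.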

B. TRAINING DNN
Assume that data have been collected from the process for training the DNNs. Consider a training data set with $M$ input-output pairs and sampling time $T_s$, in which $y(t_k)$ is the measured output, $u(t_k)$ is the measured input, and $k$ is the sampling index. The identification procedure that uses DNNs is based on a comparison of the measured plant output and the simulated output. The objective is to adjust the model such that the dynamic behaviour of the model matches that of the real plant. To this end, a cost function is defined and minimized, in which $\hat{y}(t_k|\theta)$ is the estimated output and $\theta$ is the parameter vector. The training problem is formulated as a nonlinear unconstrained optimization problem (12), which can be solved using local optimization techniques; the quasi-Newton algorithm is chosen here. Quasi-Newton methods are developed to solve nonlinear optimization problems. The main principle of this algorithm is to progress step by step from an initial point $\theta_0$ along line search directions $h_k$ until the optimal point $\theta_{opt}$ is obtained. At each iteration, an approximation $H_k$ of the inverse of the Hessian matrix is updated and used to obtain the search direction.
The matrix $H_k$ is obtained using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) formula (13) [45], in which $s_k = \theta_{k+1} - \theta_k$ is the parameter change between two iterations and $q_k = \nabla f(\theta_{k+1}) - \nabla f(\theta_k)$ is the gradient change between two iterations. The basic approach of the algorithm is described as follows:
1) Choose a starting point $\theta_0$ and initialize $H_0$.
2) Compute the gradient $\nabla f(\theta_k)$ and the search direction $h_k = -H_k \nabla f(\theta_k)$.
3) Perform a line search from $\theta_k$ in the direction $h_k$ using the following equation: $\theta_{k+1} = \theta_k + T_s h_k$, where $T_s$ here denotes the line-search step size (distinct from the sampling time used above).
4) Compute $H_{k+1}$ using the BFGS formula (13) and return to step 2.
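The four steps above can be sketched in Python as follows. This is a minimal illustration, not the training code used in this work: it assumes the standard BFGS update of the inverse-Hessian approximation written in terms of $s_k$ and $q_k$, a backtracking (Armijo) line search for the step size, and $H_0 = I$; the function name `bfgs_minimize` and all tolerances are hypothetical.

```python
def bfgs_minimize(f, grad, theta0, tol=1e-8, max_iter=200):
    """Minimize f starting from theta0 with a quasi-Newton (BFGS) iteration."""
    n = len(theta0)
    theta = list(theta0)
    # Step 1: starting point and H_0 = identity (assumed initialization).
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(theta)
    for _ in range(max_iter):
        if max(abs(gi) for gi in g) < tol:
            break
        # Step 2: search direction h_k = -H_k * grad f(theta_k).
        h = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # Step 3: backtracking line search for the step size (Armijo condition).
        f0 = f(theta)
        slope = sum(gi * hi for gi, hi in zip(g, h))
        step = 1.0
        while step > 1e-12:
            trial = [t + step * hi for t, hi in zip(theta, h)]
            if f(trial) <= f0 + 1e-4 * step * slope:
                break
            step *= 0.5
        theta_new = [t + step * hi for t, hi in zip(theta, h)]
        g_new = grad(theta_new)
        # Step 4: BFGS update of the inverse-Hessian approximation.
        s = [a - b for a, b in zip(theta_new, theta)]
        q = [a - b for a, b in zip(g_new, g)]
        sq = sum(si * qi for si, qi in zip(s, q))
        if sq > 1e-16:  # skip the update if the curvature condition fails
            Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]
            qHq = sum(qi * Hqi for qi, Hqi in zip(q, Hq))
            scale = (1.0 + qHq / sq) / sq
            H = [[H[i][j] + scale * s[i] * s[j]
                  - (Hq[i] * s[j] + s[i] * Hq[j]) / sq
                  for j in range(n)] for i in range(n)]
        theta, g = theta_new, g_new
    return theta
```

In a DNN training setting, `f` would be the identification cost evaluated by simulating the model for a given $\theta$, and `grad` its gradient, obtained analytically or by finite differences.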

C. SYSTEM IDENTIFICATION
The technique that was explained in the previous subsection is used to identify five different subsystems: a reactor core-power loop, a steam generator loop, a pressurizer-pressure loop, a pressurizer-level loop, and a turbine-speed loop. For simplicity, only the identification of the reactor core-power loop is discussed here. The remaining loops can be identified similarly using the corresponding input and output of the loop. Reactor power loop identification is carried out using 3000 input-output data samples, and the sampling time $T_s$ is 1 second. The control rod speed, driven by a random step signal, is taken as the input, and the reactor power is taken as the output. DNN training is performed using the quasi-Newton algorithm. The optimal model that is obtained is a second-order system. The identified DNN model that is obtained for the reactor is presented in (14)-(16). The evolution curve of the cost function is shown in Fig. 3. Fig. 4 shows the training and validation outputs that are employed for the system identification exercise. It is observed that the obtained DNN model accurately tracks the reactor power. A similar procedure is applied to all other loops, and the DNN models that are obtained for each subsystem are also found to be of second order. The inputs and outputs that are used for the training of the DNNs are presented in detail in Table 2. The DNN models for the other loops are presented in Appendix A. In (14)-(16), $x_1$ and $x_2$ are the output and hidden state, respectively, of DNN1, and $\beta_r$, $\omega_r$, and $\gamma_r$ are the adjustable weights of DNN1. The parameter values of DNN1 are represented as follows:

A. MODEL PREDICTIVE CONTROL
In this study, the MPC approach [35] is employed to control the feedback-linearized system. It uses a discrete-time linear state-space model of the open-loop process. In the resulting process model, $x_r(t)$ represents the state vector at time $t$, $u_r(t)$ represents the vector of inputs, and $y_r(t)$ and $z_r(t)$ are the vector of the measured outputs and the vector of the outputs that are to be controlled, respectively. In this study, the controller employs a steady-state Kalman filter, in which $\hat{x}_r(t+1|t)$ represents the estimated state vector that depends on the conditions at time $t$; $\hat{y}_r(t|t-1)$ is the estimated output vector; $K_r$ is the Kalman gain matrix, which is chosen to minimize the estimation error variance; and $\hat{e}_r(t|t)$ is the estimated error, which is expressed as $\hat{e}_r(t|t) = y_r(t) - \hat{y}_r(t|t-1)$. At each sampling instant $t$, the MPC algorithm computes the control move that minimizes a cost function subject to the control constraints. In the cost function, $N_p$ and $N_u$ are the prediction and control horizons, respectively; $i = 1, 2, \dots, N_u$ and $j = 1, 2, \dots, N_p$; $\hat{z}_r(t+i|t)$ is the predicted controlled output; $r_r(t+i|t)$ is the reference; and $R_r(i)$ and $Q_r(i)$ represent the input and output weight matrices, respectively. The prediction horizon $N_p$ is increased to enhance the system performance. The control horizon $N_u$ is set such that $N_u \le 0.2\,N_p$ [46].
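The receding-horizon idea can be illustrated with a deliberately small Python sketch. It is not the controller used in this work: it assumes a scalar model $x(t+1) = a\,x(t) + b\,u(t)$ with $y = x$, a fully measured state (so no Kalman filter), a control horizon $N_u = 1$ with $N_p = 10$ (which satisfies $N_u \le 0.2\,N_p$), and simple input bounds. All names and numerical values are hypothetical. Because only one input increment is optimized, the bounded optimum is obtained exactly by clipping the unconstrained solution.

```python
def mpc_move(x, u_prev, r, a, b, n_p=10, q=1.0, rho=0.1, u_min=0.0, u_max=1.5):
    """One receding-horizon step for x(t+1) = a*x(t) + b*u(t), y = x.

    Minimizes sum_j q*(y_hat(t+j) - r)^2 + rho*du^2 over a single input
    increment du, with the input held at u_prev + du over the horizon n_p.
    """
    num = 0.0
    den = rho
    a_pow = 1.0  # becomes a**j inside the loop
    s = 0.0      # becomes 1 + a + ... + a**(j-1) inside the loop
    for _ in range(n_p):
        s += a_pow
        a_pow *= a
        gain = s * b                          # effect of the held input on y_hat(t+j)
        free = a_pow * x + gain * u_prev      # prediction if du = 0
        num += q * gain * (r - free)
        den += q * gain * gain
    du = num / den
    # Scalar convex problem: clipping to the bounds gives the constrained optimum.
    return min(max(u_prev + du, u_min), u_max)


def simulate_mpc(r=1.0, a=0.9, b=0.1, steps=200):
    """Closed-loop simulation from rest; returns output and input histories."""
    x, u = 0.0, 0.0
    ys, us = [], []
    for _ in range(steps):
        u = mpc_move(x, u, r, a, b)
        x = a * x + b * u
        ys.append(x)
        us.append(u)
    return ys, us
```

Only the first move of each optimization is applied before the problem is solved again at the next sample, which is what distinguishes receding-horizon control from a one-shot open-loop plan.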

B. PROPOSED MPC-FBL-DNN CONTROL APPROACH
The FBL uses DNN models to produce a control law that removes the system nonlinearities. The MPC controller that was discussed previously is applied afterwards to the feedback-linearized plant to achieve improved performance. Fig. 5 presents a schematic diagram of the designed controller. The FBL approach is based on DNN models in affine form, in which $x$ represents the state vector, $u$ is the control input, and $y$ is the controlled output; $f(x)$ and $g(x)$ are vector fields in the state space; and $h(x)$ is a scalar function of $x$. The Lie derivative of the function $h(x)$ with respect to $f(x)$ is defined as in [43], and the Lie derivative with respect to the vector field $g(x)$ is defined likewise. The relative degree re of the system is determined from a condition on these Lie derivatives. Based on this relative degree condition and DNN1 (14)-(16), the system relative degree is found to be re = 1. Hence, the system is feedback linearizable, and the FBL control law is expressed in terms of the virtual control input $v$ and the functions $R(x)$ and $S(x)$, which are given by (37) and the expression that follows it. In these expressions, the parameter $\lambda_{1k}$ corresponds to an arbitrary value. The parameters $\lambda_0$ and $\lambda_1$ are tuned so that the linearized system has a static gain and a time constant similar to those of DNN1. Then, a linear MPC controller is added to the outer control loop, as illustrated in the block diagram in Fig. 5.
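As an illustration of feedback linearization for a relative-degree-one system, the Python sketch below uses a hypothetical second-order affine model $\dot{x} = f(x) + g\,u$, $y = h(x) = x_1$, with tanh nonlinearities. The control law $u = \left(v - \lambda_0 h(x) - \lambda_1 L_f h(x)\right) / \left(\lambda_1 L_g h(x)\right)$ is one common form and is an assumption here, not necessarily the exact $R(x)$ and $S(x)$ of this work; the weights `BETA`, `OMEGA`, and `GAMMA` are illustrative and are not the identified DNN1 parameters.

```python
import math

# Hypothetical second-order affine model (illustrative weights only).
BETA = (1.0, 2.0)
OMEGA = ((0.2, 0.5), (0.3, 0.0))
GAMMA = (1.0, 0.0)  # GAMMA[0] != 0, so y = x1 has relative degree one


def f(x):
    """Drift vector field f(x) of the assumed model."""
    s1, s2 = math.tanh(x[0]), math.tanh(x[1])
    return (-BETA[0] * x[0] + OMEGA[0][0] * s1 + OMEGA[0][1] * s2,
            -BETA[1] * x[1] + OMEGA[1][0] * s1 + OMEGA[1][1] * s2)


def fbl_input(x, v, lam0, lam1):
    """Assumed FBL law: gives lam1*dy/dt + lam0*y = v for y = h(x) = x1."""
    lf_h = f(x)[0]   # Lie derivative L_f h, since dh/dx = [1, 0]
    lg_h = GAMMA[0]  # Lie derivative L_g h
    return (v - lam0 * x[0] - lam1 * lf_h) / (lam1 * lg_h)


def simulate_fbl(v=1.0, lam0=1.0, lam1=5.0, dt=0.01, t_end=60.0):
    """Forward-Euler simulation of the feedback-linearized loop; returns y samples."""
    x = [0.0, 0.0]
    ys = [x[0]]
    for _ in range(int(round(t_end / dt))):
        u = fbl_input(x, v, lam0, lam1)
        fx = f(x)
        x = [x[0] + dt * (fx[0] + GAMMA[0] * u),
             x[1] + dt * (fx[1] + GAMMA[1] * u)]
        ys.append(x[0])
    return ys
```

Under these assumptions the map from $v$ to $y$ behaves as a first-order linear system with static gain $1/\lambda_0$ and time constant $\lambda_1/\lambda_0$, which is the property an outer linear MPC loop can exploit.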

V. SIMULATION RESULTS
Simulations are carried out to assess the performance of the proposed controller on the nonlinear PWR model. The plant is initially presumed to operate at a steady state. The five control loops of the PWR are studied here: the reactor, steam generator, pressurizer pressure and level, and turbine loops. In all cases, the MPC-FBL-DNN controller is compared with the state-space MPC (SS-MPC) controller. In this study, external disturbances are considered and are injected into the control signal for each loop. The considered disturbance $\zeta(t)$ is a sinusoidal signal with a magnitude of $\zeta_0$; the disturbance model is taken from [2]. To analyse the control performance, two numerical measures are computed: the percentage root mean square error (PRMSE), which quantifies the error between the system state and the reference trajectory, and the total variation of input (TVI), which quantifies the variation in the control input that results from the control action.
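The two measures can be sketched in Python as follows. The exact definitions are not reproduced above, so the formulas here are assumptions: PRMSE is taken as 100 times the root mean square of the relative tracking error, and TVI as the sum of absolute differences between consecutive input samples, which is a common definition of total variation. The function names `prmse` and `tvi` are hypothetical.

```python
import math


def prmse(outputs, references):
    """Assumed PRMSE: 100 * sqrt(mean(((y_k - r_k) / r_k)^2)); references must be nonzero."""
    if not outputs or len(outputs) != len(references):
        raise ValueError("outputs and references must be non-empty and of equal length")
    total = sum(((y - r) / r) ** 2 for y, r in zip(outputs, references))
    return 100.0 * math.sqrt(total / len(outputs))


def tvi(inputs):
    """Assumed TVI: sum over k of |u_{k+1} - u_k|."""
    return sum(abs(b - a) for a, b in zip(inputs, inputs[1:]))
```

A smaller PRMSE indicates tighter tracking, and a smaller TVI indicates a smoother control signal and therefore less actuator movement.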

A. REACTOR POWER LOOP
The reactor loop is assessed in the context of a load-following procedure in the presence of disturbances. A disturbance $\zeta(t)$ with a magnitude of $\zeta_0 = 10^{-3}$ is injected into the rod speed. The reference setpoint to be followed by the reactor is as follows: the plant is initially assumed to be at fractional full power (FFP). The demand is maintained at FFP for 200 seconds. Then, it is reduced to 0.8 FFP in 100 seconds and maintained at 0.8 FFP for 300 seconds. Finally, it returns to its initial value. Fig. 6 shows the tracking performances of the controllers. The reactor power response is shown in Fig. 6a. It is observed that the SS-MPC controller fails to reject the disturbances, as its response presents bounded variations with a constant amplitude that remains within 0.25% of the reference. The MPC-FBL-DNN controller realizes a better tracking accuracy and does not exhibit oscillations. The control rod speed and reactivity variations are shown in Fig. 6b and 6c, respectively. It is observed that the control input of the SS-MPC controller is affected by the disturbances because it contains oscillations. Such oscillations are undesirable and can damage the actuator [40]. Table 3 presents the values of the PRMSE and TVI measures that are computed to evaluate the performance of the controllers. The PRMSE measure is computed using the measured reactor power and the power set-point. TVI is computed from the control rod speed. It is observed that the MPC-FBL-DNN controller has a lower PRMSE value than the SS-MPC controller, with a difference of approximately one order of magnitude between the PRMSE values of the two controllers. Both controllers require similar control effort with respect to TVI, although the MPC-FBL-DNN controller is subject to fewer variations in the control signal.
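The load-following set-point described at the start of this subsection can be generated with a short Python function. This is a sketch under stated assumptions: the initial level is taken as 1.0 FFP, and the duration of the final return ramp, which is not specified in the text, is assumed to be 100 seconds; the function name `reference_power` and its parameters are hypothetical.

```python
def reference_power(t, initial=1.0, low=0.8, t_hold1=200.0, t_ramp=100.0,
                    t_hold2=300.0, t_return=100.0):
    """Piecewise-linear power demand (in FFP) at time t (in seconds).

    initial and t_return are assumptions; the other values follow the text.
    """
    t1 = t_hold1            # end of the initial hold
    t2 = t1 + t_ramp        # end of the ramp down
    t3 = t2 + t_hold2       # end of the hold at the low level
    t4 = t3 + t_return      # end of the assumed return ramp
    if t < t1:
        return initial
    if t < t2:
        return initial + (low - initial) * (t - t1) / t_ramp
    if t < t3:
        return low
    if t < t4:
        return low + (initial - low) * (t - t3) / t_return
    return initial
```

Sampling this function at the controller sampling time gives the reference sequence that is passed to the MPC over its prediction horizon.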

B. STEAM GENERATOR LOOP
The proposed controller is tested for a reference change in the steam pressure in the presence of disturbances. A disturbance $\zeta(t)$ with a magnitude of $\zeta_0 = 5 \times 10^{-4}$ is added to the steam generator actuator. Fig. 7 shows the performances of the two controllers. The steam pressure and control signal responses are shown in Fig. 7a and 7b, respectively. Although both controllers are able to follow the reference value, the MPC-FBL-DNN controller does so 3 seconds sooner and its response is free of disturbance-induced oscillations. The SS-MPC controller is observed to be affected by the disturbances, as it exhibits sustained oscillations with a constant amplitude that remains within 0.8% of the reference. The control efforts of both controllers are comparable; however, oscillations in the control signal are noticeable for the SS-MPC controller. For this case, the PRMSE measure is computed based on the measured steam pressure and the pressure reference. TVI is computed from the control signal to the turbine governor valve. The control efforts of both controllers are similar in terms of TVI. However, the MPC-FBL-DNN controller offers better tracking performance and produces less error in terms of PRMSE: its PRMSE is approximately half an order of magnitude lower than that of the SS-MPC controller.

C. PRESSURIZER PRESSURE LOOP
The pressurizer pressure is controlled by the actuation of a bank of heaters. The controllers are evaluated for a reference change in the pressurizer pressure and in the presence of disturbances. A disturbance $\zeta(t)$ with a magnitude of $\zeta_0 = 5 \times 10^{-1}$ is introduced into the pressurizer actuator.
The performances of the controllers are shown in Fig. 8, and the variation in the pressurizer pressure is depicted in Fig. 8a. The SS-MPC controller is observed to be slower, as it needs 5 more seconds to settle, and it presents a peak overshoot ratio of 2%. The control signal response is shown in Fig. 8b. The control effort is found to be greater for the SS-MPC controller, as its control signal is subject to bounded variations. The PRMSE and TVI values confirm that the MPC-FBL-DNN controller outperforms the SS-MPC controller. In addition to producing less error, the MPC-FBL-DNN controller exerts significantly less control effort and rejects the disturbances successfully.

D. PRESSURIZER LEVEL
A pressurizer level controller aims to maintain the water level of the reactor coolant system. The pressurizer level controllers are evaluated in the presence of disturbances.
The same disturbance signal as in Section V-C is injected into the pressurizer level actuator. Fig. 9 shows the responses of the controllers to a set-point change in the pressurizer level. The pressurizer level response is shown in Fig. 9a. The MPC-FBL-DNN controller has a better tracking accuracy and reaches the demand faster than the SS-MPC controller. In addition to being slower, the SS-MPC controller presents a peak overshoot ratio of 0.6% and settles with residual oscillations. The control signal variation is shown in Fig. 9b. The SS-MPC controller presents bounded variations, whereas the MPC-FBL-DNN controller exerts less control effort. Based on Table 3, the MPC-FBL-DNN controller achieves better setpoint tracking, with the smaller PRMSE value: its PRMSE is lower than that of the SS-MPC controller by an order of magnitude. The proposed controller also exerts slightly less control effort with respect to TVI and is found to produce half as much control variation as the SS-MPC controller. The PRMSE is computed from the pressurizer level and the level demand. The TVI measure is computed using the mass surge flow rate.

E. TURBINE SPEED LOOP
The performance of the turbine speed loop is evaluated in the load-following mode and the load-rejection mode in the presence of disturbances. A disturbance $\zeta(t)$ with a magnitude of $\zeta_0 = 10^{-4}$ is added to the actuator. Fig. 10 shows the simulation results of the controllers in the load-following mode. The turbine output trajectory is shown in Fig. 10a. The turbine output for the MPC-FBL-DNN controller tracks the reference steadily, whereas the turbine output for the SS-MPC controller requires more time to track the setpoint. The SS-MPC controller reaches 0.9 FFP at 485 seconds, whereas the MPC-FBL-DNN controller reaches it 85 seconds earlier. The simulation results for the load-rejection mode are shown in Fig. 11. The response of the turbine output is shown in Fig. 11a. In addition to being slower, the SS-MPC controller is subject to an undershoot ratio of 0.6% at 500 seconds. The responses of the turbine speed and control signal are shown in Fig. 11 and 11b, respectively. The MPC-FBL-DNN controller exerts more control effort so that the system output follows the power level change faster and with minimum error. The mechanical power demand and the turbine output are used to compute the PRMSE measure. TVI is computed from the control signal to the turbine governor valve. The PRMSE and TVI values from Table 3 confirm the better performance of the proposed controller. The MPC-FBL-DNN controller has a better tracking precision in terms of PRMSE: its PRMSE value is half an order of magnitude lower than that of the SS-MPC controller. In addition, the MPC-FBL-DNN controller is found to produce fewer variations in the control signal.

VI. CONCLUSION
A hybrid control technique that integrates a DNN-based FBL approach with MPC for the control of a PWR-type nuclear plant has been presented. The simulation results show that the proposed controller offers improved tracking performance and enhanced robustness under bounded disturbances. The efficacy of the proposed control architecture has been validated for the reactor core, steam generator, pressurizer pressure and level, and turbine subsystems. The proposed controller has been compared with the conventional state-space-based MPC, and the control performance has been assessed using two numerical measures. The proposed technique has a better tracking accuracy and exerts less control effort than the conventional technique. Furthermore, the designed control law has been shown to handle the disturbances effectively, whereas the conventional MPC fails to do so. Future work will involve the development of a fault-tolerant control scheme that can accommodate failures of sensors and actuators.

APPENDIX A
The obtained DNN models are described in detail as follows: DNN2: