Combined Feedforward Control and Disturbance Rejection Control Design for a Wafer Stage: A Data-Driven Approach Based on Iterative Parameter Tuning

This article presents a data-driven algorithm that combines the advantages of iterative feedforward tuning and disturbance rejection control to satisfy the precision requirements and ensure the extrapolation capability of wafer scanning. The proposed algorithm differs from pre-existing algorithms in its low dependence on a system model, high extrapolation capability for non-repetitive trajectory tracking tasks, and high tracking precision. The feedforward controller is tuned based on instrumental variables. It utilizes tracking errors from past iterations to eliminate reference-induced errors without requiring a system model. Meanwhile, the system inverse is approximated during the iterative process, and a disturbance rejection controller based on iterative tuning is then constructed to compensate for disturbance-induced errors. The proposed algorithm is applied to a wafer stage. The experimental results validate the effectiveness and superiority of the proposed algorithm.


I. INTRODUCTION
Gene sequencing is an important scientific technology. In the production steps of the next-generation gene sequencing platform, a two-dimensional spot array is attached to a patterned wafer substrate, and each spot contains a DNA nanoball, which is an amplified DNA cluster. The nanoball is imaged using an optical system while it moves in a constant-velocity scanning motion. The DNA nanoball is positioned using a wafer stage. The control performance of the wafer stage determines the imaging efficiency and quality of the gene sequencing. Therefore, the following requirements are imposed on the controller: 1) high tracking precision; 2) strong resistance to disturbance; 3) high extrapolation capability for non-repetitive trajectory tracking tasks. Feedforward control is widely used in motion systems, in particular in systems requiring high precision in trajectory tracking tasks. Feedforward controllers can compensate for errors induced by known reference trajectories and repetitive disturbances; therefore, they can improve system performance significantly using a priori knowledge. Traditional feedforward control approaches, including iterative learning control (ILC), model-based feedforward control, and iterative feedforward tuning (IFFT), each address these requirements only in part, as discussed below.
The associate editor coordinating the review of this manuscript and approving it for publication was Jesus Felez.
Systems with a repetitive nature, such as a wafer stage, lithography machine, and mechanical arm, have repeating trajectories that are known a priori. Iterative learning control is a feedforward control method designed for repetitive tasks. It was proposed by Arimoto et al. [1] in 1984 and in the past three decades has been extended by researchers to different schemes, such as inverse systems [2] and the stochastic method [3], [4]. Furthermore, ILC can be combined with other control methods, including adaptive control [5], [6] and robust control [7], [8], to achieve better performances based on system characteristics and requirements.
The ILC algorithm can effectively reduce repetitive disturbance and achieve high control precision [9]. However, non-repetitive disturbance may accumulate and negatively affect the convergence of ILC [10], [11]. Furthermore, when extrapolating to different tasks, changes in the reference trajectory can result in performance deterioration. Poor extrapolation results in strict restrictions when applied to industrial plants.
By contrast, model-based feedforward control results in high control precision and high extrapolation capability for a class of tasks [12]. The model-based control method requires a system model and uses its inverse or a sensitivity function as a learning filter to compensate for system information. Hence, the control performance is highly dependent on the quality and accuracy of the system model obtained from system identification. Moreover, complicated dynamics and unfavorable environmental conditions render it difficult to achieve effective control [13]-[15]. The requirement of a high-quality model conflicts with the industrial expectation of a simple, model-free, data-driven approach [16].
To combine the high control precision of ILC with the extrapolation capability of model-based feedforward control and remove the requirement of system models, many approaches have been investigated. In 2010, van de Wijdeven and Bosgra proposed a method that introduced basis functions [17]. With basis functions, a signal is projected onto the space spanned by the basis functions, and a feedforward controller can be used to approximate the system inverse. Furthermore, the feedback controller can be parameterized, as shown in [18]. By introducing the basis functions, the requirement of system model identification is transformed into an IFFT problem of a parameterized feedforward controller [19]. In [20], a parameterized feedforward controller was tuned using a data-driven gradient-descent learning algorithm. In [20]-[22], it was shown that using a least-squares algorithm resulted in bias errors. To eliminate those bias errors, a least-squares algorithm with an instrumental variable (IV) was proposed, which resulted in an unbiased parameter estimation. An accuracy analysis was provided in [23]. Song et al. [24] proposed a high-order approach based on an IV using error data from all past iterations to achieve a high tolerance for disturbance. In [25], [26], a rational basis function was introduced to achieve a higher precision.
A feedforward controller can compensate for errors induced by known reference trajectories, but not for errors induced by non-repetitive disturbances. To eliminate non-repetitive disturbance-induced errors, several types of disturbance rejection control have been investigated, such as disturbance-observer-based (DOB) control [27], active disturbance rejection control (ADRC) [28], and the extended state observer (ESO) [29]. Herein, a new data-driven, iterative-tuning-based compound control strategy that combines feedforward control and disturbance rejection control is proposed to improve the motion performance of the wafer stage. The proposed disturbance rejection controller aims to eliminate the disturbance-induced error, whereas the feedforward controller aims to compensate for the reference trajectory-induced error. Meanwhile, when the reference trajectory changes, the algorithm still maintains an acceptable precision. Additionally, the proposed algorithm is data-driven and does not require a priori information of the system; therefore, the algorithm is applicable to different plants without requiring system identification in advance.
Compared to previous studies, the main contributions of this study are as follows:
1) A data-driven algorithm that combines feedforward control and disturbance rejection control is proposed, thereby satisfying control precision and extrapolation capability simultaneously without requiring a system model or sensitivity function.
2) Both feedforward control and disturbance rejection control can be constructed via iterative tuning. The design approach is simple and easy to implement.
3) Experiments are performed on a wafer stage with varying reference trajectories, and the results are compared with those of existing ILC and IFFT methods. The validity of the proposed algorithm is demonstrated experimentally.
The paper is organized as follows. Section II describes the control configuration of the system and the control problem considered herein. In Section III, a detailed statement of the components used and the procedure of the proposed algorithm are provided. The experimental setup employed to assess the performance of the proposed algorithm and the experimental results are provided in Section IV. Conclusions are presented in Section V.

II. PROBLEM FORMULATION

A. CONTROL CONFIGURATION
Let the superscript $k$ represent the trial number. Consider the control configuration depicted in Fig. 1, where $P$ is the true unknown system, which is assumed to be single-input single-output and linear time-invariant. $C_{fb}$ is a feedback controller designed to stabilize $P$ by introducing a feedback control signal. $C_{ff}^k$ represents the feedforward controller, which is also part of the disturbance observer. Furthermore, the known reference trajectory $r$ is designed as a fourth-order polynomial trajectory with constraints on the first three derivatives. $y^k$ denotes the output signal of the system, $u_{ff}^k$ the feedforward control signal produced by $C_{ff}^k$, $u^k$ the output signal of the controller, $d^k$ the unknown disturbance, whose mean is assumed to be zero, and $\hat{d}^k$ the estimate of the unknown disturbance. All signals considered are discrete-time signals.
Based on Fig. 1, the measured signal $y^k$ is expressed in terms of the sensitivity function of the system, $S = (I + PC_{fb})^{-1}$.
VOLUME 8, 2020
FIGURE 1. Block diagram of the control configuration. In the controller parameter tuning procedure, $C_{ff}^k$ is determined based on the reference trajectory $r$ and the measured signals $e^{k-1}$ and $y^{k-1}$ from the $(k-1)$th task. Subsequently, disturbance rejection control based on $C_{ff}^k$ is constructed to compensate for disturbance-induced errors.
The tracking error $e^k$, formulated in (4), comprises two components: a reference trajectory-induced error $e_r^k$ and a disturbance-induced error $e_d^k$. From (3) and (4), it is clear that the estimate of the unknown disturbance satisfies $\hat{d} = d$ and the tracking error is equal to 0 if the condition in (7) holds. In (7), the feedforward controller $C_{ff}^k$ is introduced to approximate the inverted plant dynamics such that reference-induced errors can be compensated when the reference passes through $C_{ff}$.
It is noteworthy that $P$, $S$, and $d^k$ are unknown; therefore, $e_r^k$ and $e_d^k$ cannot be constructed from the measured tracking error $e^k$. Hence, a data-driven iterative optimization problem is formulated in the next section to update $C_{ff}^k$. Throughout, for a transfer function $G$ and a discrete signal $\mu$, the signal obtained by filtering $\mu$ with $G$ is written as $y = G\mu$.
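For orientation, the two error components can be written out under one common set of conventions that the surviving text does not confirm: the disturbance is assumed to enter at the plant input, and the estimate is assumed to be subtracted from the control signal, so that $u^k = C_{fb} e^k + C_{ff}^k r - \hat{d}^k$ and $y^k = P(u^k + d^k)$. The reference-induced term agrees with the expression used in Definition 2; the sign of the disturbance term depends on the assumed conventions.

```latex
\begin{align*}
e^k &= r - y^k
     = r - P\left(C_{fb}\, e^k + C_{ff}^k\, r - \hat{d}^k + d^k\right), \\
(I + PC_{fb})\, e^k &= \left(1 - PC_{ff}^k\right) r - P\left(d^k - \hat{d}^k\right), \\
e^k &= \underbrace{S\left(1 - PC_{ff}^k\right) r}_{e_r^k}
       \;\underbrace{-\, SP\left(d^k - \hat{d}^k\right)}_{e_d^k},
\qquad S = (I + PC_{fb})^{-1}.
\end{align*}
```

Under these assumptions, $e_r^k$ vanishes when $C_{ff}^k = P^{-1}$ and $e_d^k$ vanishes when $\hat{d}^k = d^k$, which matches the two conditions stated in the text.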

B. FEEDFORWARD CONTROLLER PARAMETRIZATION
The traditional feedforward controller introduced to approximate the system inverse is designed from system information obtained a priori through system identification, before the control algorithm is initialized. Whenever the algorithm is applied to a new system, system identification must be conducted using methods such as frequency sweeping and correlation analysis, which increases the complexity of the control process. To reduce this complexity and integrate the control process into a single algorithm, a parameterized feedforward controller $C_{ff}(\theta)$ is applied to generate the feedforward control signal, thereby reducing trajectory-induced errors.
As shown in Fig. 1, the feedforward signal $u_{ff}^{k+1}$ is parameterized with respect to the reference trajectory $r$ and the feedforward controller $C_{ff}^{k+1}(\theta^{k+1})$ as $u_{ff}^{k+1} = C_{ff}^{k+1}(\theta^{k+1})\, r$. Subsequently, the update of $C_{ff}$ based on the measured signals $e^{k+1}$ and $y^{k+1}$ is obtained from an optimization problem, in which the controller structure is designed a priori as in Definition 1.

Definition 1: The feedforward controller $C_{ff}$ is parameterized in terms of the basis functions as $C_{ff}(\theta) = \Psi_{ff}\,\theta$. Here, $\theta \in \mathbb{R}^{n \times 1}$, the parameter vector of $C_{ff}$, is expressed as $\theta = [\theta_1, \theta_2, \ldots, \theta_n]^T$, and the polynomial basis functions are defined as $\Psi_{ff} = [\psi_1(q), \psi_2(q), \ldots, \psi_n(q)]$. Each $\psi_i(q)$ is a function of $q$ and of $T_s$, where $T_s$ denotes the sampling time of the system, and $n$ represents the number of basis functions introduced into the parameterization. For simplicity, $q$ is omitted hereinafter. The optimal parameter vector of the feedforward controller is a parameter vector satisfying the following definition.
Definition 2: The feedforward controller parameter vector $\theta^*$ is optimal if the feedforward controller designed with the corresponding parameter vector, $C_{ff}^* = \Psi_{ff}\,\theta^*$, satisfies $e_r^* = S(1 - PC_{ff}^*)\, r = 0$. The parameterized feedforward controller is designed to satisfy the following goal.
Goal 1: Determine $C_{ff}$ built based on $\theta$ such that the criterion $V(\theta^k)$ in (10) is minimized in the trajectory tracking process, where $V(\theta^k)$ is formulated in terms of the tracking error $e^k$ in (4).
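The parameterization can be illustrated with a minimal sketch. The exact basis functions are not recoverable from the text, so the code assumes backward-difference bases, $\psi_{vel} = (1 - q^{-1})/T_s$ and $\psi_{acc} = \psi_{vel}^2$, in line with the acceleration and velocity feedforward terms described in Section IV-B. The function names, the rest-before-start initial condition, and the numerical values are hypothetical.

```python
def backward_difference(signal, ts):
    """Apply (1 - q^-1)/ts to a sampled signal.

    Assumption: the signal is at rest before the first sample, so the
    first output sample is zero.
    """
    out = []
    previous = signal[0] if signal else 0.0
    for value in signal:
        out.append((value - previous) / ts)
        previous = value
    return out


def feedforward_signal(reference, theta, ts):
    """Compute u_ff = theta_acc * acc(r) + theta_vel * vel(r).

    theta = (theta_acc, theta_vel) follows the ordering psi_1 (acceleration),
    psi_2 (velocity) used in the experimental section.
    """
    velocity = backward_difference(reference, ts)
    acceleration = backward_difference(velocity, ts)
    theta_acc, theta_vel = theta
    return [theta_acc * a + theta_vel * v
            for a, v in zip(acceleration, velocity)]


TS = 1e-4  # sampling time reported in the experimental setup
reference = [(k * TS) ** 2 for k in range(10)]  # toy reference r(t) = t^2
u_ff = feedforward_signal(reference, theta=(3.0, 0.5), ts=TS)
```

For the toy reference $r(t) = t^2$, the second difference settles at 2 after two samples, so the acceleration term contributes $2\theta_{acc}$ and the velocity term grows linearly with time.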

III. ALGORITHM
In this section, a new algorithm that combines IV-based IFFT and disturbance rejection control (IFFT-DRC) is proposed to deal with the problems that arise during reference tracking.
A theoretical formulation and a detailed procedure of the proposed algorithm are provided in the following.

A. ITERATIVE PARAMETER TUNING APPROACH
With respect to the parametrization in Section II-B, the measured signals $e^k$ and $y^k$ can be expanded using the parameter vector $\theta^k$ as in (16) and (17). From (17), $r$ can be expressed as in (18). Evaluating the expected value of (18) yields (19). By combining (16) and (19), the gradient with respect to the parameter vector $\theta$ is obtained as in (20).
Remark 1: If $(C_{ff}^k(\theta^k) + C_{fb})^{-1}$ is unstable, then the stable inversion procedure in [21] can be employed to obtain the value of $(C_{ff}^k(\theta^k) + C_{fb})^{-1}$.
Using the gradient-descent optimization algorithm, $\theta^{k+1}$ can be calculated iteratively, where $\frac{de}{d\theta}$ is given by (20). By iteratively calculating $\theta^k$, the algorithm converges to an optimal parameter vector $\theta^*$.
Although gradient-descent IFFT is an effective method to calculate the optimal parameter vector $\theta$ rapidly, it has been shown that this method is affected by noise and disturbances [20] and therefore yields a biased estimate of $\theta$. In [21], an IV was introduced to eliminate the contribution of the noise signal and achieve an unbiased converged optimal $\theta$. More detailed information can be found in [19], [22]. In this study, the IV $Z$ was set as in (24). By introducing the IV $Z \in \mathbb{R}^{N \times n}$, the criterion in (10) is transformed into $V_z(\theta^k)$ in (25).
Remark 2: $V(\theta^k)$ is changed to $V_z(\theta^k)$ to achieve an unbiased estimate of $\theta$. This does not change the main criterion of the entire control process.
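The bias mechanism and its removal by an instrumental variable can be seen in a scalar toy problem. This is a sketch under stated assumptions, not the wafer-stage tuning law: the regressor is corrupted by noise (as happens when a gradient is built from measured signals), while the instrument is a noise-free signal correlated with the regressor (the role played by the known reference). All names, noise levels, and the seed are hypothetical.

```python
import math
import random


def least_squares(regressor, output):
    """Scalar least-squares estimate: sum(x*y) / sum(x*x)."""
    return (sum(x * y for x, y in zip(regressor, output))
            / sum(x * x for x in regressor))


def instrumental_variable(instrument, regressor, output):
    """Scalar IV estimate: sum(z*y) / sum(z*x)."""
    return (sum(z * y for z, y in zip(instrument, output))
            / sum(z * x for z, x in zip(instrument, regressor)))


random.seed(0)
THETA_TRUE = 2.0
N = 20000

# Noise-free signal; it serves as the instrument.
clean = [math.sin(0.01 * t) for t in range(N)]
# Regressor as it would be measured: clean signal plus noise.
noisy_regressor = [x + random.gauss(0.0, 1.0) for x in clean]
# Output generated by the true parameter plus independent output noise.
output = [THETA_TRUE * x + random.gauss(0.0, 0.1) for x in clean]

theta_ls = least_squares(noisy_regressor, output)
theta_iv = instrumental_variable(clean, noisy_regressor, output)
```

Because the regressor noise inflates the denominator of the least-squares estimate, `theta_ls` is pulled toward zero, whereas the instrument is uncorrelated with the noise and `theta_iv` stays close to the true value.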
Based on the criterion in (25) and $\frac{de}{d\theta}$ calculated in (20), the optimization problem can be solved by setting the learning matrix $L \in \mathbb{R}^{n \times N}$ as in (26), where $\rho$ is the learning gain. Subsequently, the update law is set as in (27) to calculate the converged $\theta$ using an iterative tuning algorithm.
Remark 3: As shown in (20) and (24), the IV $Z$ and $\frac{de^k}{d\theta^k}$ are correlated. Consequently, $Z^T \frac{de^k}{d\theta^k}$ is nonsingular.
Theorem 1: For the measured signals $e^k$ and $y^k$, consider the update law in (27). An unbiased estimate of $\theta^*$ is obtained as $k \to \infty$, i.e., $\lim_{k \to \infty} E(\theta^k) = \theta^*$.

Proof: From (4), the tracking error is expressed as in (29). From Definition 2, $S = SPC_{ff}^* = SP\Psi_{ff}\theta^*$; substituting this into (29) yields (30). Next, substituting (30) into (27) and using $LSP\Psi_{ff}\, r = -\rho$ yields a recursion in $\theta^k$, which is then evaluated as $k \to \infty$. Because of the design of the IV $Z$ in (24), and because the known reference trajectory $r$ and $e_d$ are uncorrelated, the disturbance term vanishes in expectation, which indicates that $E(\theta^k)$ converges to $\theta^*$ as $k \to \infty$. This completes the proof.
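The recursion behind the proof can be written out under two assumptions that the surviving text does not confirm: that the update law has the form $\theta^{k+1} = \theta^k - L e^k$, and that the tracking error after substituting $S = SP\Psi_{ff}\theta^*$ reads $e^k = SP\Psi_{ff}\, r\,(\theta^* - \theta^k) + e_d^k$, where $\Psi_{ff}$ denotes the row vector of basis functions. The identity $LSP\Psi_{ff}\, r = -\rho$ is taken from the proof.

```latex
\begin{align*}
\theta^{k+1} &= \theta^k - L\left[SP\Psi_{ff}\, r\,(\theta^* - \theta^k) + e_d^k\right]
              = (1 - \rho)\,\theta^k + \rho\,\theta^* - L e_d^k, \\
E(\theta^{k+1}) - \theta^* &= (1 - \rho)\left[E(\theta^k) - \theta^*\right]
   \qquad \text{if } E(L e_d^k) = 0, \\
E(\theta^{k}) - \theta^* &= (1 - \rho)^k \left[\theta^0 - \theta^*\right] \to 0
   \qquad \text{for } 0 < \rho < 1 .
\end{align*}
```

Under these assumptions the expected parameter error contracts by a factor $1 - \rho$ per trial. With $\rho = 0.8$, the residual after two trials is $(1 - \rho)^2 = 0.04$ of the initial error, which is consistent with the near-convergence after the second iteration reported in Section IV-B.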

B. FEEDFORWARD AND DISTURBANCE REJECTION CONTROL BASED ON PARAMETER TUNING
Based on (9) and (10), $C_{ff}^{k+1}$ is obtained as in (34). Based on Fig. 1, the estimate of the disturbance, $\hat{d}^k$, is given by (35), and the controller is then expressed as in (36). The feedforward controller alone eliminates only the trajectory-induced error $e_r^k$, not the disturbance-induced error $e_d^k$. Disturbance rejection control based on iterative parameter tuning addresses the latter.
Theorem 2: For the measured signals $e^k$, consider the control law (36). Then the following holds: $\lim_{k \to \infty} e^k(t) = 0$.
Proof: As shown in Fig. 1, when $Q = 1$, the disturbance estimate $\hat{d}^k$ is obtained from the observer constructed with $C_{ff}^k$. From Theorem 1, $\theta^*$ is an optimal parameter vector of the feedforward controller. It follows that the condition in (7) is approached as $k \to \infty$, so that $\hat{d} = d$. Hence, the disturbance-induced error $e_d^k$ vanishes, and consequently the tracking error $e^k \to 0$ as $k \to \infty$. This completes the proof.
For the above calculations, the steps to solve the optimization problem (11) can be summarized as follows:

Procedure 1. Estimation of $\theta^k$ in the $k$th iteration.

Initialization.
(1) Determine the appropriate learning gain $\rho$, $0 < \rho < 1$, to obtain a trade-off between the convergence speed and the noise tolerance.
(2) Determine the appropriate basis functions $\Psi_{ff}$ to describe the feedforward controller $C_{ff}$ and the reference trajectory $r$.
(3) Determine the appropriate initial filter coefficients $\theta^0$.
(4) Set the initial values of $e^k$, $\hat{d}^k$, and $y^k$ to 0.

Main procedure.
(1) Perform an experiment using (36), and measure $e^k$ and $y^k$. If the tracking error $e^k$ meets the accuracy requirement, terminate the iterations; otherwise, continue.
(2) Calculate $\frac{de^k}{d\theta^k} = \Psi_{ff}(C_{ff}^k + C_{fb})^{-1} y^k$ using (13), (27), and (34), and return to Step (1).
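The structure of Procedure 1 can be sketched as a loop over trials for the two-parameter case. This is an illustration under assumptions, not the authors' implementation: the update is taken as $\theta^{k+1} = \theta^k - \rho\,(Z^T G)^{-1} Z^T e^k$ with $G = de^k/d\theta^k$, the experiment is replaced by a hypothetical noise-free linear model $e = G(\theta - \theta^*)$, and the instrument is set equal to the gradient. All function names and numbers are hypothetical.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def iv_update(theta, error, gradient, instrument, rho):
    """One tuning step: theta - rho * (Z^T G)^-1 Z^T e (assumed form)."""
    ztg = [[sum(z[i] * g[j] for z, g in zip(instrument, gradient))
            for j in range(2)] for i in range(2)]
    zte = [sum(z[i] * e for z, e in zip(instrument, error))
           for i in range(2)]
    step = solve_2x2(ztg, zte)
    return [t - rho * s for t, s in zip(theta, step)]


def tune(run_trial, theta0, rho=0.8, tolerance=1e-9, max_trials=50):
    """Repeat: run a trial, check the error, update theta."""
    theta = list(theta0)
    for trial in range(max_trials):
        error, gradient, instrument = run_trial(theta)
        if max(abs(e) for e in error) < tolerance:
            return theta, trial  # accuracy requirement met
        theta = iv_update(theta, error, gradient, instrument, rho)
    return theta, max_trials


# Hypothetical stand-in for the experiment: e = G (theta - theta_star).
THETA_STAR = [2.0, 0.3]
GRADIENT = [[-math.sin(0.1 * t), -math.cos(0.1 * t)] for t in range(50)]


def toy_trial(theta):
    error = [sum(g * (th - ts) for g, th, ts in zip(row, theta, THETA_STAR))
             for row in GRADIENT]
    return error, GRADIENT, GRADIENT  # instrument = gradient (noise-free toy)


theta_hat, trials_used = tune(toy_trial, theta0=[0.0, 0.0])
```

In this noise-free toy the parameter error shrinks by the factor $1 - \rho$ on every trial, so the loop terminates well before `max_trials`. On a real plant, `run_trial` would run the experiment, measure $e^k$ and $y^k$, and build the gradient and instrument from data.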

Remark 4: The traditional disturbance observer assumes that the controlled plant is minimum phase and is designed using the inverse model of the controlled plant. In this study, the proposed disturbance rejection control is designed using parameters obtained through iterative tuning, and neither the model of the plant nor the sensitivity function is required. The entire design process is data driven.
Remark 5: To avoid amplifying the measurement noise, the filter $Q$ is designed as a low-pass filter that removes high-frequency noise. The passband of the filter should cover the frequency range of the disturbance as far as possible, while the cut-off frequency of $Q$ must also be selected according to the performance and stability requirements. In general, a high cut-off frequency provides better anti-disturbance performance but increases noise sensitivity, thereby reducing the system stability. The frequency characteristics of the signal differ between the stages of the motion; therefore, the cut-off frequency of $Q$ was designed adaptively via time-frequency analysis. The details are provided in Section IV.
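A low-pass $Q$ of the kind described in Remark 5 can be sketched as follows. The filter expressions used in the experiments are not recoverable from the text, so this sketch assumes a second-order Butterworth design discretized with the bilinear transform and frequency prewarping; only the cut-off frequencies (460 Hz and 90 Hz) and the sampling period (0.0001 s) are taken from Section IV. The function names are hypothetical.

```python
import cmath
import math


def butterworth_lowpass_2nd(cutoff_hz, sample_hz):
    """Second-order Butterworth low-pass via the prewarped bilinear transform.

    Returns (b, a) for H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2).
    """
    k = math.tan(math.pi * cutoff_hz / sample_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b = [k * k * norm, 2.0 * k * k * norm, k * k * norm]
    a = [1.0, 2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm]
    return b, a


def gain_at(b, a, freq_hz, sample_hz):
    """Magnitude of the frequency response at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)


def apply_filter(b, a, signal):
    """Filter a sequence with the difference equation, zero initial state."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


SAMPLE_HZ = 1.0 / 0.0001                         # T_s = 0.0001 s
q1 = butterworth_lowpass_2nd(460.0, SAMPLE_HZ)   # acceleration/deceleration
q2 = butterworth_lowpass_2nd(90.0, SAMPLE_HZ)    # uniform velocity
step_response = apply_filter(q2[0], q2[1], [1.0] * 2000)
```

Switching between `q1` and `q2` according to the motion phase would mirror the adaptive cut-off selection described in the remark; how the switching is scheduled in the experiments is not specified here.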

IV. EXPERIMENT
The proposed algorithm was assessed using the experimental setup described in Section IV-A. In Section IV-B, the convergence and performance of IV-based IFFT are evaluated. In Section IV-C, the proposed algorithm implemented on the experimental setup is described, and the performance of the algorithm based on reference trajectory changes is validated to prove its effectiveness.

A. EXPERIMENTAL SETUP
The experimental linear motor system depicted in Fig. 2 was employed to validate the theoretical results and the algorithm proposed herein. The servo control system comprises an upper computer, a motion control card, a motor driver, and a NEWPORT IDL225 linear motor. The position of the linear motor was measured using a Heidenhain linear incremental encoder with a resolution of 20 µm; subsequently, it was interpolated with 2000 intervals to obtain a resolution of 10 nm. The motor was driven by a digital servo drive. The proposed algorithm was implemented in C on a PowerPMAC controller. The sampling period of the control system was $T_s = 0.0001$ s. The frequency response of the plant $P$ is depicted in Fig. 3. The closed-loop servo system was excited using two types of fifth-order point-to-point reference trajectories, r1 and r2, as depicted in Fig. 4. The number of setpoints in both references was $N = 6450$. The feedback controller used in the following experiments was a linear PD controller formulated as $C_{fb} = K_p + K_d \frac{1 - q^{-1}}{T_s}$, with $K_p = 0.3$ and $K_d = 0.0005$. Based on the reference trajectories r1 and r2 and the feedback controller $C_{fb}$, experiments were performed to evaluate the performances of the algorithms.
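The PD feedback law and the encoder arithmetic quoted above can be checked with a short sketch. The gains, sampling period, and encoder figures are taken from the text; the class name, the zero initial error, and the unit step input are hypothetical.

```python
class DiscretePD:
    """Discrete PD law u = Kp*e + Kd*(1 - q^-1)/Ts * e (backward difference)."""

    def __init__(self, kp, kd, ts):
        self.kp = kp
        self.kd = kd
        self.ts = ts
        self.previous_error = 0.0  # assumption: zero error before the first sample

    def step(self, error):
        derivative = (error - self.previous_error) / self.ts
        self.previous_error = error
        return self.kp * error + self.kd * derivative


# Encoder: 20 um subdivided into 2000 interpolation intervals.
COARSE_RESOLUTION_M = 20e-6
INTERPOLATION_INTERVALS = 2000
resolution_m = COARSE_RESOLUTION_M / INTERPOLATION_INTERVALS  # about 1e-8 m, i.e. 10 nm

controller = DiscretePD(kp=0.3, kd=0.0005, ts=0.0001)
first_output = controller.step(1.0)   # step in error: derivative term is active
second_output = controller.step(1.0)  # constant error: only the proportional term
```

With these gains the derivative path has an effective gain of $K_d / T_s = 5$ per unit change in error between samples, which dominates the proportional gain of 0.3 at the instant of a step.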

B. RESULTS OF THE PARAMETRIZATION CONTROLLER
In the linear motor, the force ripple is a highly undesirable problem that can seriously deteriorate the servo performance. To clarify the nature of the force ripple, the force applied by the feedback controller was measured while the motor was moving at a constant velocity. As the acceleration is approximately equal to zero and the friction force is assumed to be constant, any chattering in the applied force compensates for the force ripples [30]. By performing several experiments in the negative and positive directions, the offset owing to the constant friction can be removed, and an estimated force ripple can be obtained, as shown in Fig. 5.
The frequency response function of the considered plant is depicted in Fig. 3, which shows rigid-body dynamics below 100 Hz. The time-frequency analysis of the first-trial tracking error is shown in Fig. 6. As shown, the error characteristics during the acceleration and deceleration phases differed from those during the uniform velocity phase. The dominant frequency during the acceleration and deceleration phases was within the range of 500 Hz. Therefore, the cut-off frequencies of the Q-filter during the acceleration and deceleration phases and the uniform velocity phase were set as 460 and 90 Hz, respectively. Two second-order low-pass filters, $Q_1$ and $Q_2$, were used for the acceleration and deceleration phases and the uniform velocity phase, respectively. The basis function $\Psi_{ff}$ was set as $\Psi_{ff} = [\psi_1, \psi_2]$, with the corresponding parameter vector $\theta = [\theta_1, \theta_2]^T$.
The parameterized controller $C_{ff}(\theta) = \Psi_{ff}\,\theta$ comprises acceleration feedforward and velocity feedforward, in which $\psi_1\theta_1$ contributes the acceleration feedforward compensating for the rigid-body dynamics, whereas $\psi_2\theta_2$ contributes the velocity feedforward compensating for viscous friction. Considering the trade-off between the convergence speed and the stability of the algorithm, the learning gain $\rho$ was set to 0.8. The estimation results of $\theta_1$ and $\theta_2$ at each iteration are depicted in Fig. 7. As shown, their values changed only slightly after the 2nd iteration and satisfied the convergence criterion; this implies that the algorithm converged in two steps, thereby confirming that the operating time of the proposed algorithm was acceptable.

C. PERFORMANCE ASSESSMENT OF PROPOSED ALGORITHM
The proposed algorithm, which combines feedforward control and disturbance rejection control, was validated on a wafer stage, as shown in Fig. 2. To verify the effectiveness of the feedforward controller in the proposed IFFT-DRC method, 10 iterations of the experiment were performed. The tracking error at the 10th iteration is compared with that at the 1st iteration in Fig. 8. As shown in Fig. 8, the tracking error during the acceleration and deceleration phases was reduced significantly. This indicates that the compensation of the feedforward controller effectively eliminated the trajectory-induced error.
To verify the high control precision and strong anti-disturbance capability of the proposed method, comparison experiments with the existing PD-type ILC and IFFT methods were implemented. The updating law of the PD-type ILC is expressed as $u_{ff}^{k+1} = u_{ff}^k + L_p e^k + L_d \dot{e}^k$, with $L_p = 0.1$ and $L_d = 0.00005$. The trajectory r1 in Fig. 4 was used as the common reference trajectory for the comparison experiment, and 10 iterations were performed for ILC and IFFT separately. The tracking errors and control input signals of the three methods are shown in Fig. 9 and Fig. 10, respectively. In addition, the mean and root mean square (RMS) errors of the three methods are shown in Table 1. The following were observed based on Fig. 9 and Table 1:
(1) ILC and IFFT eliminated almost all the trajectory-induced errors. However, a heavy fluctuation was observed because these two methods could not eliminate the disturbance-induced error.
(2) The control precision of the proposed method was much better than those of the existing ILC and IFFT. The proposed method can eliminate both the trajectory- and disturbance-induced errors.
To verify the high extrapolation capability of the proposed method, comparison experiments with different reference trajectories were performed. The feedforward control signals and feedforward parameters obtained with r1 in the previous comparison experiment were applied to r2. Figs. 11 and 12 show the tracking errors and control input signals, respectively, of the three methods under reference trajectory r2. The following were observed based on Fig. 11 and Table 1:
(1) The performance of the PD-type ILC was highly dependent on the specified reference trajectory. The tracking precision deteriorated severely when the feedforward signal learned with r1 was applied to r2.
(2) The performance of the IFFT and the proposed IFFT-DRC method did not deteriorate when the reference changed, as shown in Fig. 11. Meanwhile, the control precision of the proposed method was much better than that of the IFFT. This confirms that the proposed IFFT-DRC method has a high extrapolation ability for different reference trajectories.
(3) Because IFFT is used to calculate $\theta^*$ such that the data-based approximation of the system inverse compensates for the reference-induced error, and $C_{ff}$ is used to construct a disturbance observer that compensates for the disturbance-induced error, the algorithm is data-driven and does not require a system model.
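The PD-type ILC baseline and the error metrics used in the comparison can be sketched as follows. The learning gains and sampling period are taken from the text; approximating $\dot{e}^k$ by a backward difference with a zero first sample is an assumption, and the function names and sample data are hypothetical.

```python
import math


def pd_type_ilc_update(u_ff, error, lp, ld, ts):
    """Return u_ff^{k+1} = u_ff^k + Lp*e^k + Ld*de^k/dt, sample by sample.

    Assumption: de/dt is a backward difference, set to zero at the first sample.
    """
    updated = []
    for t, (u, e) in enumerate(zip(u_ff, error)):
        e_dot = 0.0 if t == 0 else (e - error[t - 1]) / ts
        updated.append(u + lp * e + ld * e_dot)
    return updated


def mean_and_rms(error):
    """Mean and root-mean-square value of an error sequence."""
    n = len(error)
    mean = sum(error) / n
    rms = math.sqrt(sum(e * e for e in error) / n)
    return mean, rms


u_next = pd_type_ilc_update(
    u_ff=[0.0, 0.0, 0.0],
    error=[1.0, 1.0, 2.0],
    lp=0.1, ld=0.00005, ts=0.0001,
)
error_mean, error_rms = mean_and_rms([3.0, -4.0])
```

Because the ILC update stores a feedforward signal indexed by sample, the learned signal is tied to the trajectory it was learned on, which is the dependence on the reference noted in observation (1) for r2.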

V. CONCLUSION
Herein, a new algorithm that combines IV-based iterative parameter tuning with disturbance rejection control was proposed to increase the extrapolation ability and stability in tasks with varying references. The algorithm is data-driven and does not require a priori knowledge of the system; the system inverse is estimated and the disturbance is compensated for through iterative parameter tuning. The experimental results validated the effectiveness of the proposed algorithm. Furthermore, the algorithm was compared with pre-existing approaches. It was demonstrated that the proposed algorithm achieved (i) high tracking precision, (ii) strong anti-disturbance capability, and (iii) high extrapolation ability in industrial variable-trajectory tracking tasks.