Closed-Loop Subspace Identification for Stable/Unstable Systems Using Data Compression and Nuclear Norm Minimization

This paper provides a subspace method for closed-loop identification that clearly specifies the model order from noisy measurement data. The method can handle long I/O data of the target system, which makes it noise-tolerant, and determines the model order via nuclear norm minimization. First, the proposed method compresses the long data by projecting them onto an appropriate low-dimensional subspace; it then obtains a low-order model whose order is specified by a combination of data compression and nuclear norm minimization. Its effectiveness is demonstrated through detailed numerical examples.


I. INTRODUCTION
It is essential to construct mathematical system models in control system design and analysis. Among various system modeling problems, closed-loop identification is often necessary for many engineering applications. For example, when the target system itself is unstable, we must collect its I/O data in a stabilized closed-loop setting. Even when the target system itself is stable, feedback control may be required to collect the data at certain operating points. Thus, data acquisition under feedback control is often required in practice for safety and economic reasons. In addition, in networked control systems or large-scale interconnected systems, it is difficult to identify the whole system simultaneously. Instead, we have to identify each subsystem in the presence of feedback from the other subsystems. Hence, closed-loop identification has become increasingly important. Since it is well known that closed-loop identification is difficult due to the correlation between the measurement noise and the input, various methods have been extensively developed (see survey papers, e.g., [1], [2]).
In the prediction error framework [3], closed-loop identification methods may be classified into three categories: (a) direct identification, (b) indirect identification, and (c) joint input-output methods [2]. The first approach uses only the I/O data of the target system while ignoring the presence of feedback; hence the controller may be high-order or nonlinear. Instead, accurate noise modeling is necessary, which is difficult in practice. The second approach requires exact knowledge of the feedback controller (except for the Dual-Youla approach mentioned later), and the obtained model tends to be of high order. The third approach identifies both the target system and the controller in the loop from the I/O data together with the excitation signal information, but both the target system and the controller must be linear. One common weak point of most methods in (a), (b), and (c) is that they may not be suitable for the identification of (open-loop) unstable systems due to numerical problems. This point may not be well recognized except in [4]. As for the identification of unstable systems, the Dual-Youla method [5] is known to be quite effective; it is an indirect method that transforms the closed-loop identification problem (for a possibly unstable system) into an open-loop problem (for a stable system). Furthermore, the accuracy of the identified model does not depend much on the feedback controller model. This method has been further developed by [6] and [7], which may accept inaccurate (and possibly nonlinear) controller models. However, one big practical issue is how to specify the model order in advance, since the accuracy of the identified model can strongly depend on this choice.
Meanwhile, within the framework of subspace identification, various methods have been developed to handle closed-loop data (see, e.g., [8]-[11]). The joint input-output subspace approach can be found in [12], [13]. The innovation estimation approach was developed in [14]. The two-stage methods proposed in [15]-[17] may provide unbiased models. However, these methods may not be suitable for open-loop unstable systems, or the results may not be reliable when the feedback controllers are high-order or nonlinear. On the other hand, SSARX (State Space Auto Regressive with eXogenous input) [18] and its variant PBSID (Prediction-Based Subspace IDentification) [19] are direct methods and do not have such drawbacks. These methods work for stable/unstable systems irrespective of the controller complexity. The key is to use the innovation form and identify a high-order vector ARX model first, after which some model reduction steps are employed. This approach can be regarded as one of the most promising methods (see [10]); its origin can be traced back to [8]. In subspace identification, the system order is usually selected from the singular values of some matrix (related to the so-called extended observability matrix). Unfortunately, this order selection is not automatic (i.e., not so straightforward) in most cases. In addition, this approach also requires an accurate noise model to obtain a bias-free plant model, which is not easy at all in practice.
As for automatic model order selection, nuclear norm minimization has been employed in the recent subspace identification literature (e.g., [20]-[24]). Nuclear norm minimization is known as a powerful convex relaxation of the rank minimization problem; hence it is useful in determining the system order. In particular, N2SID (Nuclear Norm Subspace Identification) [24] employs the innovation form and provides an interesting identification framework. However, most papers employing nuclear norm minimization focus on open-loop identification, except [25], which combines PBSID with N2SID for closed-loop subspace identification. Hence it is still not clear whether this approach works in closed loop. More importantly, one disadvantage of nuclear norm minimization is its heavy computational burden. Therefore, the existing methods can handle only very short I/O data, and the SNR (signal-to-noise ratio) must be very high to obtain an accurate plant model. Consequently, it is not clear whether nuclear norm minimization is useful enough in practical identification problems with noisy data.
The purpose of this paper is to provide a subspace method for closed-loop identification, which clearly specifies the model order from noisy measurement data in the presence of unknown feedback controllers. For this purpose, the method should be able to handle arbitrarily long data in order to be noise-tolerant (i.e., robust against colored noise and low SNR), and it should employ the innovation form to handle unstable as well as stable systems. First, we compress the data by projecting them onto a low-dimensional subspace that is not correlated with the measurement noise. Then, based on nuclear norm minimization with the compressed data, the system order is selected automatically, and the model of the selected order is obtained immediately. In order to evaluate the effectiveness of the proposed method, detailed simulation results are given. In the simulation, various stable/unstable systems are accurately identified in the presence of heavy (and colored) measurement noise, under which an existing reliable subspace identification method may not work.
An earlier version of this work can be found in [26]. This paper provides an improved algorithm, which requires no tuning parameter, and contains detailed numerical examples for validation. We use the following notation. H^{n×m}_{k×l} denotes the set of block Hankel matrices consisting of n row blocks and m column blocks, where each block is a k×l matrix. T^{n×m}_{k×l} denotes the set of block Toeplitz matrices consisting of n row blocks and m column blocks, where each block is a k×l matrix. Note that each element of H^{n×m}_{k×l} (or T^{n×m}_{k×l}) is an nk×ml matrix. N(µ, σ²) denotes the normal distribution with mean µ and variance σ². U(a, b) denotes the uniform distribution between a and b.
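The block Hankel notation can be made concrete with a small sketch. The helper name below is ours, not the paper's; it builds an element of H^{s×(N−s+1)}_{k×1} from a sequence of k-dimensional samples, so the result is an (sk)×(N−s+1) matrix as stated above.

```python
import numpy as np

def block_hankel(samples, s):
    """Stack s row blocks of k-dimensional samples into a block Hankel
    matrix in H^{s x (N-s+1)}_{k x 1}: the result is (s*k) x (N-s+1)."""
    N = len(samples)
    k = len(samples[0])
    cols = N - s + 1
    H = np.zeros((s * k, cols))
    for i in range(s):
        for j in range(cols):
            # block (i, j) holds sample i + j -- constant along anti-diagonals
            H[i * k:(i + 1) * k, j] = samples[i + j]
    return H

# scalar samples (k = 1), N = 6, s = 3 -> a 3 x 4 Hankel matrix
sig = [np.array([float(t)]) for t in range(6)]
H = block_hankel(sig, 3)
print(H.shape)  # (3, 4)
```

For k = 1 this reduces to an ordinary Hankel matrix whose (i, j) entry is sample i + j.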

II. PROBLEM SETTING
We consider the stabilized closed-loop system shown in Fig. 1. The plant P is described by a linear state-space model for k = 1, 2, . . ., where x_p(k) ∈ R^n, u(k) ∈ R^m and y(k) ∈ R^p are the state, the input and the measured output, respectively. The plant output measurement is contaminated by the noise η(k) ∈ R^p, which is given by η(k) = H(q)w(k) with some white noise w(k) and an unknown noise shaping filter H which is stable. Here, q denotes the forward shift operator (i.e., qw(k) = w(k + 1)). The stabilizing controller K is unknown and may be high-order or nonlinear. The vector r(k) ∈ R^m is the external excitation signal, which is not correlated with the measurement noise and is specified by the user. We assume that both u(k) and r(k) are rich enough to identify the plant. The purpose is to determine the minimal system order n and to identify the plant P without any knowledge of K and H.

III. SUBSPACE IDENTIFICATION
In this section, we briefly describe the subspace identification framework used in this paper. We employ the innovation form

x(k + 1) = A x(k) + B u(k) + K_e e(k),
y(k) = C x(k) + D u(k) + e(k),

where x(k) ∈ R^n denotes the estimated state, and A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m} and K_e ∈ R^{n×p} are the matrices to be determined. The vector e(k) ∈ R^p represents the innovation process, which is assumed to be zero-mean white, and K_e is the Kalman gain. Now define the block Hankel matrix U ∈ H^{s×(N−s+1)}_{m×1} from {u(k)}, and similarly Y ∈ H^{s×(N−s+1)}_{p×1} from {y(k)} and E ∈ H^{s×(N−s+1)}_{p×1} from {e(k)}. Here, N is the available data length, and s is chosen by the user and should be larger than the state dimension n. Then, the data equation

Y = O_s X + T_u U + T_y E    (9)

is obtained, where O_s ∈ R^{ps×n} is the extended observability matrix, X ∈ R^{n×(N−s+1)} is the state sequence, and T_u ∈ T^{s×s}_{p×m} and T_y ∈ T^{s×s}_{p×p} are block Toeplitz matrices containing the Markov parameters of the model.
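The data equation can be checked numerically on a toy model. The sketch below uses arbitrary stand-in matrices (not the paper's example systems): it simulates the innovation form, builds the block Hankel matrices, and verifies that Y = O_s X + T_u U + T_y E holds exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# a tiny innovation-form model (n = 2, m = p = 1); illustrative values only
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.2]])
Ke = np.array([[0.4], [0.1]])

N, s, n = 50, 4, 2
u = rng.standard_normal(N)
e = rng.standard_normal(N)

# simulate x(k+1) = A x(k) + B u(k) + Ke e(k), y(k) = C x(k) + D u(k) + e(k)
x = np.zeros((N + 1, n))
y = np.zeros(N)
for k in range(N):
    y[k] = (C @ x[k])[0] + D[0, 0] * u[k] + e[k]
    x[k + 1] = A @ x[k] + B[:, 0] * u[k] + Ke[:, 0] * e[k]

def hankel(v, s):
    cols = len(v) - s + 1
    return np.array([v[i:i + cols] for i in range(s)])

U, Y, E = hankel(u, s), hankel(y, s), hankel(e, s)

# extended observability matrix and the state sequence
Os = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s)])
X = x[:N - s + 1].T

def toeplitz(g0, Bm):
    # lower block Toeplitz of Markov parameters g0, C Bm, C A Bm, ...
    T = np.zeros((s, s))
    for i in range(s):
        T[i, i] = g0
        for j in range(i):
            T[i, j] = (C @ np.linalg.matrix_power(A, i - j - 1) @ Bm)[0, 0]
    return T

Tu, Ty = toeplitz(D[0, 0], B), toeplitz(1.0, Ke)

# the data equation Y = Os X + Tu U + Ty E holds exactly on simulated data
print(np.allclose(Y, Os @ X + Tu @ U + Ty @ E))  # True
```

Note how the diagonal blocks of T_y are identity (here the scalar 1), since e(k) enters y(k) directly.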
Various subspace identification methods have been proposed to calculate O_s, T_u and T_y from the I/O data U and Y via the data equation (9). We propose our method in the next section. Once O_s, T_u and T_y are obtained, it is easy to determine {A, B, C, D, K_e}, as shown below.
Partition O_s into its row blocks, and let t_u and t_y be the first column blocks of T_u and T_y, respectively. Then C is obtained as the first row block of O_s, and A is determined by solving the least-squares problem implied by the shift invariance of O_s, where the residual is measured in the Frobenius norm ‖·‖_F. From t_u, the matrices D and B are obtained, and K_e follows from t_y. Hence, the model {A, B, C, D, K_e} of the system is assembled.
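The extraction of C and A from the extended observability matrix can be illustrated as follows. The pair (A_true, C_true) is an arbitrary observable example of ours; on noise-free O_s the shift-invariance least-squares problem recovers A exactly.

```python
import numpy as np

# a small observable pair (illustrative values, not from the paper)
A_true = np.array([[0.6, 0.2], [-0.1, 0.4]])
C_true = np.array([[1.0, 1.0]])
s, p = 5, 1

# extended observability matrix O_s = [C; CA; ...; CA^{s-1}]
Os = np.vstack([C_true @ np.linalg.matrix_power(A_true, i) for i in range(s)])

# C is the first (block) row of O_s
C_hat = Os[:p, :]

# shift invariance: O_s shifted down one block row equals the upper part
# times A, solved here in the least-squares (Frobenius-norm) sense
A_hat, *_ = np.linalg.lstsq(Os[:p * (s - 1), :], Os[p:, :], rcond=None)

print(np.allclose(A_hat, A_true), np.allclose(C_hat, C_true))  # True True
```

With noisy data the same least-squares step still applies; the recovered A is then only an estimate.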

IV. PROPOSED METHOD
This section proposes an identification method which focuses on the following two points.
• Property (A): The system order can be selected automatically.
• Property (B): The method is robust against heavy (possibly colored) measurement noise.

As for property (A), nuclear norm minimization is known to be effective, and we will exploit it. As for property (B), one simple approach is to use long data. Since data storage is getting cheaper these days, this is a reasonable option. However, the computational burden of nuclear norm minimization becomes serious as the data length N grows. Hence, in the following subsections, we show how to take advantage of long I/O data while exploiting nuclear norm minimization.

A. DATA COMPRESSION
We will compress the long data by projecting them onto some low-dimensional subspace S. Let s_J (n ≤ s_J ≪ N) be its dimension. Each element of S is an (N − s + 1)-dimensional row vector. Let T be the matrix whose rows coincide with an orthonormal basis of S; namely, it satisfies T T^⊤ = I. Then, by projecting the data matrices U and Y onto S, the data equation is compressed as

Y T^⊤ = O_s X T^⊤ + T_u U T^⊤ + T_y E T^⊤.

It is crucial to choose an appropriate subspace S, which should not be correlated with the measurement noise but should be strongly related to the deterministic part of the I/O data; the noise term E T^⊤ then becomes negligible. In what follows, we give one way to find such an S.
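The compression step can be sketched numerically. Here the subspace S is spanned by random rows purely for illustration (the paper builds it from the excitation signal and its filtered versions); the point is that an orthonormal basis T with T T^⊤ = I shrinks the column dimension from N − s + 1 down to s_J.

```python
import numpy as np

rng = np.random.default_rng(1)
ncols, s_J = 1000, 8          # N - s + 1 columns compressed down to s_J

# rows spanning the chosen subspace S (random stand-in for this sketch)
R = rng.standard_normal((s_J, ncols))

# orthonormal basis T of the row space of R, so that T @ T.T = I
T = np.linalg.qr(R.T)[0].T    # shape (s_J, ncols)

U = rng.standard_normal((30, ncols))   # a long data matrix
Zu = U @ T.T                            # compressed data: 30 x s_J
print(Zu.shape)  # (30, 8)
```

After compression, the optimization below works with 30 × 8 matrices instead of 30 × 1000 ones, which is what makes long records tractable.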
Then, we choose S as the space spanned by the rows of R and X_F. Note that the dimension of S is given by s_J = ms + n_J. Here, it would be natural to include R in the above choice, while the inclusion of X_F may look strange. This term implies that the projected space contains filtered signals with various modes (i.e., poles of F_J(q)). Thus, the projected space consists of not only the excitation signal but also its filtered versions. As demonstrated later in the simulation section, this choice of S works well.

B. NUCLEAR NORM MINIMIZATION
Now we focus on property (A). Assuming that both X and X T^⊤ have full rank, the system order is given by n = rank(O_s X T^⊤). Note that this equation holds for almost all T. This implies that we have to minimize the rank of O_s X T^⊤ in order to obtain a low-order model. From (22), we have

O_s X T^⊤ = Y T^⊤ − T_u U T^⊤ − T_y E T^⊤.

Based on the above observation, we propose to solve the following minimization problem.
where ‖·‖_* denotes the nuclear norm (i.e., the sum of all singular values). Note that we can always obtain the (globally) optimal solution, because this is a convex relaxation of the rank minimization problem. Let T_u^* and T_y^* be the optimal solution, and define the matrix that can be regarded as an estimate of O_s X; calculate its singular value decomposition (SVD), where Σ_n is the diagonal matrix consisting of the first n singular values (i.e., the dominant part), and U_n and V_n are the dominant unitary parts corresponding to Σ_n. Then, we determine O_s^* from this dominant part. From O_s^*, T_u^* and T_y^*, we can calculate (A, B, C, D, K_e) as shown in the previous section.
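The order-selection step can be sketched on synthetic numbers. The matrix below is a rank-2 "signal" with singular values 5 and 3 plus small noise (all values are illustrative, and the threshold is chosen by inspection rather than by the paper's procedure): the nuclear norm is the sum of the singular values, the dominant singular values stand clear of the noise floor, and the truncated SVD U_n Σ_n V_n^⊤ gives the rank-n estimate.

```python
import numpy as np

rng = np.random.default_rng(2)

# rank-2 "noise-free" matrix with singular values 5 and 3, plus small noise
U0 = np.linalg.qr(rng.standard_normal((10, 2)))[0]
V0 = np.linalg.qr(rng.standard_normal((40, 2)))[0]
M = U0 @ np.diag([5.0, 3.0]) @ V0.T
M_noisy = M + 0.01 * rng.standard_normal((10, 40))

U, sv, Vt = np.linalg.svd(M_noisy, full_matrices=False)
nuclear_norm = sv.sum()        # ||M_noisy||_* = sum of singular values

# order selection: count singular values well above the small, uniform
# noise floor (threshold chosen by inspection for this sketch)
n = int((sv > 1.0).sum())
print(n)  # 2

# dominant part U_n Sigma_n V_n^T gives the rank-n estimate
M_n = U[:, :n] * sv[:n] @ Vt[:n, :]
print(np.linalg.norm(M_noisy - M_n) < 0.5)  # True
```

In a Frobenius-norm fit the small singular values would all be kept; penalizing the nuclear norm instead pushes them toward zero, which is what makes the order gap visible.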
Remark 1: In N2SID [24], the problem is to find E ∈ H^{s×(N−s+1)}_{p×1}, T_u ∈ T^{s×s}_{p×m} and T_y ∈ T^{s×s}_{p×p} minimizing (32). Here, what we have to look for are e(k) (k = 1, 2, · · · , N) and the first (block) columns of T_u and T_y, rather than the whole matrices E, T_u and T_y. This is a nice idea. The above minimization also avoids the use of instrumental variables to remove the effect of the innovation term E. This can be an advantage when the available data is short, as claimed in [24]. However, when the available data is long enough, the use of instrumental variables (or the introduction of a projection) works well, and hence we adopt (28). In addition, when we adopt (32), the obtained model strongly depends on the choice of the weight λ, and it is really difficult to choose an appropriate λ in practice. In order to avoid such difficulty, we eliminate this weight in (28), which is very convenient for the user. Besides, if λ → ∞ in (32), then eventually we have e(k) → 0, which corresponds to the optimization of (28) up to the projection. Hence, we may consider that the proposed method tries to minimize (32) with an extremely large weight λ. However, note that the projection plays a crucial role in handling long data, which makes a big difference from N2SID.
Remark 2: Most subspace methods try to minimize the Frobenius norm of O_s X. In this case, it is often very difficult to select the appropriate system order, especially when the measurement noise is significant. As shown later in the numerical examples, the nuclear norm minimization is really effective even in such a case.
Remark 3: Compared to the earlier version of this work [26], two points are different. One is the minimization problem. In [26], the problem is to find Ẑ_Y ∈ R^{ps×s_J}, T_u ∈ T^{s×s}_{p×m} and T_y ∈ T^{s×s}_{p×p} minimizing a criterion with a tuning parameter γ, where Z_U := U T^⊤ and Z_Y := Y T^⊤. This minimization corresponds to N2SID after the data compression. Although the solution strongly depends on the choice of γ, it is very difficult to choose γ properly in advance. On the other hand, this paper adopts (28) as the minimization problem, which overcomes this problem. This turns out to be a big advantage in practice. In addition, we optimize just the first columns of T_u and T_y in (28), which implies that the number of parameters to optimize is much smaller than in [26]. As a result, the computational burden decreases. The other difference is the choice of S. In [26], one first calculates {u_J(k)} (k = 1, 2, · · · , N) by projecting the input data {u(k)} (k = 1, 2, · · · , N) onto the excitation signal space span(R), then obtains the output sequence {y_J(k)} (k = 1, 2, · · · , N) by injecting {u_J(k)} into an initial plant model P_J(q). From {u_J(k), y_J(k) | k = 1, 2, · · · , N}, the Hankel data matrices U_J and Y_J are constructed, and the space S is given by their row spaces. Furthermore, an iterative procedure to update the initial plant model P_J(q) is given so that the updated model becomes closer to the true plant P(q). Consequently, the procedure in [26] is complicated, whereas the choice (25) in this paper is much simpler.

V. SIMULATION
In this section, three examples are given to illustrate the features of the proposed method. For each configuration, 30 trials based on different random-number realizations were conducted to reveal the statistical performance. The resulting models were evaluated by comparing their frequency responses and step responses to those of the true plant. For simplicity of presentation, we treat SISO systems only. The data generation system in each example is configured as shown in Fig. 1, and the settings for the plant P and controller K are summarized in Table 1.
As the excitation signal, we used a multisine signal, which is commonly used in system identification [29], and the number of data points is set to N = 200 000. The frequency characteristics of noise in real problems often cannot be represented by low-order models, and their characteristics vary widely. To reproduce the difficulty caused by this, we use a 100th-order FIR filter with random coefficients to generate the noise. The noise source w is Gaussian white noise with zero mean, and its variance is adjusted for each example to obtain the desired SNR. In the following, 30 trials with different random realizations of the coefficients h_0, . . . , h_100, the noise source w, and the randomized phases ϕ_1, . . . , ϕ_{N/2} of the excitation are performed for each example.
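The data generation above can be sketched as follows. The record length, amplitude and frequency grid of the multisine are illustrative choices of ours (the paper uses N = 200 000 and its own settings); the colored noise is white Gaussian noise passed through a 100th-order FIR filter with random coefficients h_0, . . . , h_100, as described in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2000  # shortened from the paper's 200 000 for illustration

# multisine excitation with randomized phases (grid/amplitudes illustrative)
J = 50
k = np.arange(N)
phases = rng.uniform(0.0, 2.0 * np.pi, J)
r = np.zeros(N)
for j in range(J):
    r += np.sin(2.0 * np.pi * (j + 1) / N * k + phases[j])

# colored measurement noise: white Gaussian w through a 100th-order
# FIR filter with random coefficients h_0, ..., h_100
h = rng.standard_normal(101)
w = rng.standard_normal(N + 100)
eta = np.convolve(w, h, mode="valid")   # length N
print(r.shape, eta.shape)  # (2000,) (2000,)
```

Randomizing the FIR coefficients per trial is what makes the noise spectrum differ across the 30 trials.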
In applying the proposed method, the parameters are set as s = 30 and n_J = 20. For comparison purposes, we also applied SSARX, a well-established subspace identification method that can handle the closed-loop setting; the implementation in the n4sid function of the MATLAB System Identification Toolbox is used. For simplicity, the initial states of all systems are set to zero, and SSARX uses that information. Other settings of SSARX were left at their default values. In SSARX, the order of the model, including the noise model, is automatically selected based on the Hankel singular values, and the final model is obtained by reducing the non-noise part of the model to the order of the true plant using the balanced truncation method.
In the following, we also discuss the ease of order selection by showing plots of the Hankel singular values used for order selection in SSARX and the singular values of O_s X used for order selection in the proposed method.
A. EXAMPLE 1: OPEN-LOOP BENCHMARK [27]
First, the characteristics of the proposed method are demonstrated through a benchmark problem of open-loop system identification [27]. The input and output data, where the SNR is about 3 dB, are shown in Fig. 2.
The thick red lines in the figure show the response without noise (where η(k) = 0), and the thin blue lines show the response with noise. Note that this example is an open-loop setup (K = 0), so no noise appears in the input signal. The singular value plot obtained in this setting is shown in Fig. 3. In methods that rely on the identification of a noise model, such as SSARX, the dynamics of the noise and the target system are not separated. Therefore, when the noise has complex dynamics, there is no clear threshold in the singular value plot, as seen in Fig. 6(b), and a sufficiently high-order model, which is difficult to identify, has to be constructed. Indeed, the final model obtained by reducing the order of the non-noise part to the correct order does not agree well with the target system (Figs. 4(b) and 5(b)).
On the other hand, in the proposed method, the noise component is reduced by the projection and dimensionality reduction, and it appears as a sequence of small, uniform singular values, as can be seen in Fig. 6(a). The dynamics of the target system correspond to a small number of dominant singular values, and by extracting these, a good model is obtained, as shown in Fig. 4(a).
In the existing studies on system identification, high-order noise such as (39) has not been taken into account, but it is known that such noise is ubiquitous [30]. Therefore, the proposed method is expected to be effective in many practical situations. In addition, it is difficult to overcome this problem by increasing the number of data in methods relying on a noise model. Fig. 6 shows the average of the singular value plots over 30 experiments with an increasing number of data N. As shown in Fig. 6(a), the noise part is reduced by increasing the number of data in the proposed method, while in SSARX (Fig. 6(b)), increasing the number of data does not directly contribute to the ease of separation.

B. EXAMPLE 2
The input and output data obtained with a noisy setting are shown in Fig. 7. As in the previous figures, the red line shows the response without noise. The SNR is difficult to evaluate in the closed-loop setting because the noise-derived components can also carry information. Here, we call the ratio of the contribution by r to that by η the SNR:

SNR_y := 20 log_10 ( sqrt(Σ_{k=1}^{N} y_0(k)²) / sqrt(Σ_{k=1}^{N} (y_0(k) − y(k))²) ) = 5.9 dB    (40)

where u_0(k) and y_0(k) are the input and output signals obtained in a noise-free simulation, respectively. The frequency responses of the models obtained with this setup are shown in Fig. 8, and the step responses of the closed-loop systems P̂(q)K(q)/(1 + P̂(q)K(q)) with the obtained models P̂(q) are shown in Fig. 9. Note that the step response is taken in closed loop because the target system here is unstable, and the step response in open loop does not make sense. From the figures, it can be seen that the model obtained by SSARX is strongly affected by the colored noise, while the proposed method produces a good model.
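The SNR measure used here is just the ratio of the noise-free output norm to the norm of the noise contribution, in decibels. A minimal sketch (the toy signals are ours):

```python
import numpy as np

def snr_y_db(y0, y):
    """SNR_y = 20 log10(||y0|| / ||y0 - y||), y0 being the noise-free output."""
    return 20.0 * np.log10(np.linalg.norm(y0) / np.linalg.norm(y0 - y))

# toy check: a noise contribution one tenth of the signal gives 20 dB
y0 = np.ones(100)
y = y0 + 0.1 * np.ones(100)
print(round(snr_y_db(y0, y), 1))  # 20.0
```

Computing it this way requires a separate noise-free simulation for y_0, which is why the quantity is only available in simulation studies.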
When the noise dynamics are complex and the SNR is low, as in this example, noise-model-dependent methods such as SSARX do not perform well. Theoretically, the desired model should be obtained by using a sufficiently high-order model that can accommodate the noise model and then reducing the non-noise portion of the obtained model to an appropriate dimension. In practice, however, system identification based on such high-order models is difficult, and MATLAB's automatic order selection does not select orders greater than 10. On the other hand, the proposed method, which suppresses the noise components, works better in such a situation. Since high-SNR data are difficult to obtain from operating plants for economic and safety reasons, such a situation is typical in closed-loop system identification.
The singular value plots used for order estimation are shown in Fig. 10. As shown in Fig. 10(b), the plant dynamics are difficult to distinguish in SSARX because they are buried in the higher-order dynamics of the noise. In the proposed method (see Fig. 10(a)), the boundary between the noise and the plant dynamics is clearer because the noise is compressed by the projection and the nuclear norm minimization.

C. EXAMPLE 3: UNSTABLE PLANT WITH ROBUST CONTROLLER
The third example is the closed-loop identification of an unstable plant and is based on the model of the actual magnetic levitation system in [28].
An example of the input and output data is shown in Fig. 11. As seen in the figure, the effect of noise on the input is particularly large due to the robust controller used. The singular value plots for order selection and the frequency responses of the models obtained in this setting are shown in Figs. 12 and 13, respectively. As in the previous example, the step responses of the closed-loop systems P̂(q)K(q)/(1 + P̂(q)K(q)) with the obtained models P̂(q) are shown in Fig. 14. The results show the same trend as in the previous examples, indicating that the dynamics of the plant and the noise are well separated by the proposed method even for an unstable plant under strong control. In addition, the change of the singular value plot as the number of data N increases is shown in Fig. 15. The figure shows the average of the singular values over 30 trials. As can be seen from the figure, the trend in the closed-loop identification of unstable systems is similar to that of Example 1, which again confirms that the proposed method is effective for large amounts of data with low SNR, a situation typical in closed-loop identification.
Remark 4: Note that the noise models are highly complex (of order 100) and that 30 different noise models are used in each example here, unlike [26]. As a result, the identification results demonstrate that the proposed method is quite insensitive to the noise dynamics. More importantly, Figs. 6 and 15 show that, in the proposed method, the noise contribution to the singular value plots is clearly separated from that of the target plant dynamics as the data length grows. This contrasts with existing methods such as SSARX.

VI. CONCLUSION
This paper proposed a subspace method for closed-loop identification, which clearly specifies the model order in the presence of noisy measurement data. The method identifies both stable and unstable systems without any knowledge of noise models in the presence of unknown (possibly nonlinear) feedback controllers. This is achieved through a combination of nuclear norm minimization and data compression, while exploiting long I/O data and the innovation form in a subspace identification framework. Its effectiveness was fully demonstrated by various numerical examples for stable/unstable systems in the presence of heavy colored noise. Future work includes validation through experiments and theoretical analysis of the consistency of the proposed method.