Computational Efficiency of Multi-Step Learning Echo State Networks for Nonlinear Time Series Prediction

The echo state network (ESN) is a representative model for reservoir computing, which has been mainly used for temporal pattern recognition. Recent studies have shown that multi-reservoir ESN models constructed with multiple reservoirs can enhance the potential of the ESN-based approach. In the present study, we investigate the computational performance and efficiency of the multi-step learning ESN, which is one of the multi-reservoir ESN models and is characterized by step-by-step learning processes. We show that the time complexity of the training algorithm of the multi-step learning ESN is equal to or smaller than that of the standard ESN. Our numerical experiments demonstrate that the multi-step learning ESN can achieve better or comparable performance with much less computational time compared to the standard ESN in nonlinear time series prediction tasks. Moreover, we reveal how the model architecture of the multi-step learning ESN is effective in comparison with other possible variant models. The step-by-step learning is applicable to general multi-reservoir systems and hardware for enhancement of their computational ability and efficiency.


I. INTRODUCTION
Recurrent neural networks (RNNs) and their variants have been widely used for pattern recognition with time series data, such as nonlinear time series prediction and classification. The echo state network (ESN) [1], [2], originally derived as a special type of RNN, is a representative reservoir computing (RC) model consisting of a reservoir and a readout [3]. In the ESN, the reservoir is typically given by a sparse random recurrent neural network and the readout by a linear regressor or classifier. Since the reservoir is fixed and only the readout is adjusted with a simple learning algorithm, the ESN can be trained much faster than other RNN-based models with gradient-based learning algorithms. In addition to this merit in software computation, the RC approach has a high potential for developing machine-learning hardware by exploiting a variety of physical phenomena for dynamic reservoirs [4].

[The associate editor coordinating the review of this manuscript and approving it for publication was Mostafa Rahimi Azghadi.]
The ESN transforms a given input time series into a high-dimensional feature space using an RNN-based reservoir and subsequently extracts desired information from the high-dimensional reservoir state in the readout [3], [4]. The ESN has universal approximation properties as a discrete-time temporal filter under certain conditions [5]. In practice, its computational performance highly depends on the design of the unadaptable reservoir part [6]. In general, the computational ability of the ESN is enhanced by an increase in the reservoir size (i.e. the number of nodes in the reservoir), because a larger reservoir provides more trainable parameters in the readout and therefore enables the model to approximate a wider class of dynamical systems. However, some studies empirically demonstrated that the performance of the ESN often peaks out when the reservoir size is increased in several computational tasks [7], [8]. In addition, an ill-posed problem in the training of the readout with linear regression can occur when the number of reservoir nodes is much larger than the length of the input time series data [9]. A possible option to overcome the above-mentioned limitation of the standard single-reservoir ESN is to consider multi-reservoir ESN models consisting of multiple RNN-based reservoir modules. The reservoir modules can be connected serially as in the deep ESN models [10]–[12], hierarchically as in the hierarchical reservoir model [13], or in parallel as in the grouped ESN [14], [15]. In these existing models, the model output is given by a weighted sum of the states from all the reservoir modules, and the readout training is conducted in one step.

[VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/]
As another approach, we have proposed the multi-step learning ESN for time series prediction [16], inspired by the hybrid model combining a knowledge-based model and an ESN [7]. The multi-step learning ESN consists of a series of multiple ESN modules with additional connections, which are trained step by step to reduce the output error. After the step-by-step training, the final output error can be much smaller than the initial output error at the first ESN module. The aim of the multi-step learning ESN is to keep the prediction error as small as possible in each step, which is particularly beneficial to prevent a rapid expansion of the error in autoregressive prediction for chaotic dynamical systems.
In our previous study, we conducted the performance analysis only for the 2-step case with two ESN modules [16]. We also derived a method to compute Lyapunov exponents of the 2-step learning ESN and evaluated the model performance considering the memory-nonlinearity tradeoff [17]. The remaining issues are to clarify how the computational performance and training cost of the N-step learning ESN change with increasing N and to reveal its difference from other variant models. We tackle these issues in the present study. We evaluate the computational performance and the training cost of the N-step learning ESN in three benchmark tasks of nonlinear time series prediction. The first task is to predict chaotic time series data generated by the Lorenz system [2], [18], [19]. The second task is to predict nonlinear time series data generated by the nonlinear autoregressive moving average (NARMA) model. The third task is to predict the time series data in the Santa Fe Laser Dataset. First, we demonstrate that the multi-step learning ESN can achieve better or comparable performance with a much lower training cost compared to the standard ESN when the total reservoir size is sufficiently large. Then, we show that the model architecture of the multi-step learning ESN is more effective than that of other possible variants.
The rest of this paper is organized as follows. In Sec. II, we introduce the standard ESN and then describe the multi-step learning ESN. We also compare the time complexity of the training process between them. In Sec. III, we show the computational performance and efficiency of the multi-step learning ESN in three nonlinear time series prediction tasks. In Sec. IV, we analyze the effectiveness of the architecture of the multi-step learning ESN in comparison with other variants. In Sec. V, we conclude this work.

FIGURE 1. Architecture of the ESN model [1]. The input weight matrix W_in and the reservoir weight matrix A are fixed. Only the output weight matrix W_out is trained to minimize the error between the target output y and the model output ŷ. Figure reproduced with permission from [16].

II. MODELS

A. ECHO STATE NETWORK
We briefly explain the standard ESN [1], [3], since it is the basis of the multi-step learning ESN introduced in Sec. II-B. The model architecture is illustrated in Fig. 1. The ESN consists of an input layer, a reservoir, and a readout. The reservoir is normally given by a sparse random RNN. The randomization approach for neural network models has merits in simplicity of implementation, fast learning, and applicability of linear learning algorithms [20], [21]. In time series prediction, the aim is to construct an ESN model that well approximates a filter transforming a given input time series u(t) ∈ R^{N_u} into a desired output time series y(t) ∈ R^{N_o} for a certain range of time t, where N_u and N_o denote the dimensions of the input and output, respectively. The state vector r(t) ∈ R^D of a reservoir consisting of D nodes is updated as follows:

r(t + Δt) = tanh(A r(t) + W_in u(t)),   (1)

where Δt denotes the update time interval, tanh represents the element-wise hyperbolic tangent function, A ∈ R^{D×D} denotes the internal weight matrix of the RNN-based reservoir, and W_in ∈ R^{D×N_u} denotes the input weight matrix. Each element of W_in is randomly generated from a uniform distribution in [−σ, σ] with σ > 0. The reservoir network structure is given by a sparse random graph whose mean degree (i.e. the average number of connections per node) is d [22]. Each nonzero element of A is randomly generated from a uniform distribution in [−1, 1], and then all the elements are scaled so that the spectral radius of A is equal to a given value ρ. The output vector ŷ(t) is obtained as follows:

ŷ(t) = f_out(W_out r(t)),   (2)

where W_out ∈ R^{N_o×D} denotes the output weight matrix in the readout and f_out represents an element-wise output function.
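As a concrete illustration, the reservoir construction and the state update of Eq. (1) can be sketched in Python as follows. This is a minimal sketch; the sizes, hyperparameter values, and function names are illustrative and not those used in the experiments:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_u = 100, 3              # reservoir size and input dimension (illustrative)
sigma, rho, d = 0.5, 0.9, 3  # input scale, spectral radius, mean degree

# Input weight matrix W_in: elements uniform in [-sigma, sigma]
W_in = rng.uniform(-sigma, sigma, size=(D, N_u))

# Sparse reservoir matrix A: nonzero elements uniform in [-1, 1],
# about d connections per node, rescaled to spectral radius rho
A = rng.uniform(-1.0, 1.0, size=(D, D))
A[rng.random((D, D)) >= d / D] = 0.0
A *= rho / np.max(np.abs(np.linalg.eigvals(A)))

def update_state(r, u):
    """One reservoir step: r(t + dt) = tanh(A r(t) + W_in u(t))."""
    return np.tanh(A @ r + W_in @ u)
```

The rescaling line makes the spectral radius of A exactly ρ, which is the usual way of controlling the echo state property in practice.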
In the training process for time period −T ≤ t < 0, the output weight matrix is adjusted so as to minimize the output error by ridge regression (or the least squares method with Tikhonov regularization) [23] as follows:

Ŵ_out = arg min_{W_out} [ Σ_{−T ≤ t < 0} ||W_out r(t) − y(t)||² + β ||W_out||² ],   (3)

where ||W_out||² represents the sum of squared elements of W_out and β > 0 denotes the regularization parameter. By constructing the state collection matrix R = [r(−T), …, r(−Δt)] and the teacher collection matrix Y = [y(−T), …, y(−Δt)], a solution to Eq. (3) is obtained as follows:

Ŵ_out = Y R^T (R R^T + β I)^{−1},   (4)

where I is the D × D identity matrix. The time complexity of this calculation is O(D²(T + D)) if D ≫ N_u and D ≫ N_o, as in most cases [9].
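The closed-form ridge solution can be written compactly in Python; the following sketch assumes the state collection matrix R (D × T) and the teacher collection matrix Y (N_o × T) have already been built, and the function name is ours:

```python
import numpy as np

def train_readout(R, Y, beta):
    """Ridge-regression readout: W_out = Y R^T (R R^T + beta I)^(-1).

    R : (D, T) state collection matrix
    Y : (N_o, T) teacher collection matrix
    """
    D = R.shape[0]
    return Y @ R.T @ np.linalg.inv(R @ R.T + beta * np.eye(D))
```

In practice, solving the transposed linear system with np.linalg.solve is numerically preferable to forming the explicit inverse, but the formula above mirrors the equation in the text.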
In the prediction (or inference) process for a certain time period (t > 0), the model output is generated from Eq. (2), where the reservoir state r is obtained by Eq. (1) with an unknown input time series and W_out = Ŵ_out. For autoregressive prediction, the model output approximates the time-shifted input, i.e. ŷ(t) ∼ u(t + Δt). Therefore, the model output at time t can be used as the input at time t + Δt. By inserting the right-hand side of Eq. (2) into Eq. (1), we can derive the following recursive equation:

r(t + Δt) = tanh(A r(t) + W_in f_out(Ŵ_out r(t))).

The reservoir states can be produced with iterations of the above equation in a free-running way, given only an initial state.
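In code, the free-running (closed-loop) iteration can be sketched as follows, assuming for simplicity an identity output function f_out (names and shapes are illustrative):

```python
import numpy as np

def free_run(A, W_in, W_out, r0, n_steps):
    """Autoregressive prediction: feed the readout output back as the next input.

    Implements r(t + dt) = tanh(A r(t) + W_in W_out r(t)) and records
    y_hat(t) = W_out r(t) at each step (identity output function assumed).
    """
    r, outputs = r0, []
    for _ in range(n_steps):
        y_hat = W_out @ r
        outputs.append(y_hat)
        r = np.tanh(A @ r + W_in @ y_hat)
    return np.array(outputs)
```

Only the initial state r0 is needed; no external input is consumed during the run, which is exactly the closed-loop setting used for the Lorenz task below.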

B. MULTI-STEP LEARNING ECHO STATE NETWORK
The multi-step learning ESN is one of the multi-reservoir ESN models, which consists of multiple ESN modules and additional connections as illustrated in Fig. 2 [16]. The N-step learning ESN contains N ESN modules, each including an input layer, a reservoir, and a readout. In addition, the output of the (n − 1)th module is fed into both the input layer and the readout of the nth ESN module for n = 2, …, N. The N readouts are trained step by step, unlike other multi-reservoir ESN models with one-step training. The size of the reservoir of the nth module is denoted by D_n. The procedure of time series prediction with the N-step learning ESN is divided into an initialization process, N training processes, and a prediction process.
In the initialization process, we set the input and reservoir weight matrices of all the modules randomly as described in Sec. II-A. For simplicity, we use the same values of the hyperparameters σ, d, and ρ in all the modules in this study.
In the training processes for time period −T ≤ t ≤ 0, the output weight matrices W^(n)_out are trained one by one for n = 1, …, N. The first module is equivalent to the standard ESN. With an input time series u(t), the state vector r_1(t) ∈ R^{D_1} of the reservoir of the first module is updated as follows:

r_1(t + Δt) = tanh(A_1 r_1(t) + W^(1)_in u(t)),

where W^(1)_in ∈ R^{D_1×N_u} and A_1 ∈ R^{D_1×D_1} denote the input and reservoir weight matrices of the first module, respectively. The output weight matrix of the first module is trained as follows:

Ŵ^(1)_out = arg min_{W^(1)_out} [ Σ_{−T ≤ t ≤ 0} ||W^(1)_out r_1(t) − y(t)||² + β_1 ||W^(1)_out||² ],

where β_1 is the regularization parameter. The output of the first module is given by

ŷ_1(t) = f_out(Ŵ^(1)_out r_1(t)),

where f_out is an element-wise output function for n = 1. For the nth (2 ≤ n ≤ N) module, the input, reservoir, and output weight matrices are denoted by W^(n)_in, A_n, and W^(n)_out, respectively. The reservoir state vector and the output vector of the nth module are denoted by r_n and ŷ_n, respectively. In the nth training step, the input u(t) and the output ŷ_{n−1}(t) of the (n − 1)th module are concatenated and then fed into the input layer of the nth module. The reservoir state vector r_n(t) ∈ R^{D_n} for n ≥ 2 is updated as follows:

r_n(t + Δt) = tanh(A_n r_n(t) + W^(n)_in [u(t); ŷ_{n−1}(t)]),

where [u(t); ŷ_{n−1}(t)] denotes the concatenation of u(t) and ŷ_{n−1}(t). With the reservoir states r_n(t) computed for −T ≤ t ≤ 0, the output weight matrix of the nth module is determined as follows:

Ŵ^(n)_out = arg min_{W^(n)_out} [ Σ_{−T ≤ t ≤ 0} ||W^(n)_out [r_n(t); ŷ_{n−1}(t)] − y(t)||² + β_n ||W^(n)_out||² ],

where β_n is the regularization parameter. The reservoir state vector of the nth module and the output ŷ_{n−1}(t) of the (n − 1)th module are concatenated to produce the output of the nth module. The output of the nth module is obtained with the optimized output weight matrix as follows:

ŷ_n(t) = g_out(Ŵ^(n)_out [r_n(t); ŷ_{n−1}(t)]),

where g_out represents an element-wise output function for n ≥ 2.
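The step-by-step training procedure can be sketched in Python as follows. This is a simplified illustration under assumptions that are ours, not prescribed by the original formulation: identity output functions, zero initial reservoir states, dense reservoir matrices, and illustrative helper names:

```python
import numpy as np

def run_reservoir(A, W_in, U):
    """Drive one reservoir with an input sequence U of shape (T, dim_in)."""
    r, states = np.zeros(A.shape[0]), []
    for u in U:
        r = np.tanh(A @ r + W_in @ u)
        states.append(r)
    return np.array(states)  # (T, D_n)

def train_multistep(modules, U, Y, beta=1e-6):
    """Train the modules one by one, as in the multi-step learning ESN.

    modules : list of (A_n, W_in_n) pairs; W_in_n of module n >= 2 must accept
              the concatenation [u(t); y_hat_{n-1}(t)]
    U       : (T, N_u) input sequence;  Y : (T, N_o) target sequence
    Returns the trained readout matrices and the final output sequence.
    """
    readouts, y_prev = [], None
    for n, (A, W_in) in enumerate(modules):
        X = U if n == 0 else np.hstack([U, y_prev])   # input-side concatenation
        S = run_reservoir(A, W_in, X)
        F = S if n == 0 else np.hstack([S, y_prev])   # readout-side concatenation
        # Ridge regression for this module's readout only
        W_out = Y.T @ F @ np.linalg.inv(F.T @ F + beta * np.eye(F.shape[1]))
        y_prev = F @ W_out.T                          # output fed to the next module
        readouts.append(W_out)
    return readouts, y_prev
```

Because the readout of module n ≥ 2 receives ŷ_{n−1} as a feature, it can always at least reproduce the previous module's output, so the training error is non-increasing over the steps (up to the small regularization penalty).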
In the prediction process for a certain time period (t ≥ 0), a forward propagation of an unknown input time series through the whole model with the N trained output weight matrices Ŵ^(n)_out (n = 1, …, N) is performed to determine the output at the last ESN module, which is regarded as the model output:

ŷ(t) = ŷ_N(t).

Particularly in autoregressive prediction, the target output is set as y(t) = u(t + Δt). In this case, the model output ŷ(t) ∼ u(t + Δt) at time t can be used as the input at time t + Δt. Once the initial condition ŷ(0) = u(Δt) is given, the trained model can autonomously produce the predicted values of the output time series. In this case, the model is called a closed-loop model. We repeat the following three steps:

1) Compute the output of the first module as follows:

ŷ_1(t) = f_out(Ŵ^(1)_out r_1(t)).
2) Compute the output of the nth module iteratively for n = 2, …, N as follows:

ŷ_n(t) = g_out(Ŵ^(n)_out [r_n(t); ŷ_{n−1}(t)]).

3) Obtain the model output as follows:

ŷ(t) = ŷ_N(t).

C. TIME COMPLEXITY
We estimate the time complexity of the training process of the N-step learning ESN model. In most cases, we can assume D_n ≫ N_u and D_n ≫ N_o for n = 1, …, N [3]. The time complexity for the training process of the first module is given by O(D_1²(T + D_1)), as explained in Sec. II-A. Since we feed the output of the (n − 1)th module into the nth module for n ≥ 2, it is computationally efficient to store the predicted output ŷ_{n−1} of the (n − 1)th module for training the nth module. The time complexity for training the nth module is given by O(D_n²(T + D_n)). Consequently, the total time complexity of the N-step learning ESN is represented as O(T Σ_{n=1}^N D_n² + Σ_{n=1}^N D_n³). We compare this with the time complexity of the training process of the standard ESN, i.e. O(D²(T + D)). When the total number of reservoir nodes is the same in both models, i.e. Σ_{n=1}^N D_n = D, the following inequalities hold for positive reservoir sizes:

Σ_{n=1}^N D_n² ≤ (Σ_{n=1}^N D_n)² = D²,  Σ_{n=1}^N D_n³ ≤ (Σ_{n=1}^N D_n)³ = D³,

and therefore T Σ_{n=1}^N D_n² + Σ_{n=1}^N D_n³ ≤ T D² + D³. This means that the learning cost of the N-step learning ESN is equal to or smaller than that of the standard single-reservoir ESN if the total reservoir size is the same. Therefore, the computational efficiency can be enhanced by dividing a large reservoir into multiple smaller reservoirs and applying the above-mentioned step-by-step training method.
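The cost comparison can be checked with a few lines of arithmetic; the concrete values of T, D, and N below are illustrative:

```python
# Leading-order training-cost terms for one reservoir of size D versus
# N reservoirs of size D/N each, with training length T.
T, D, N = 5000, 1000, 10
single = D**2 * (T + D)                           # O(D^2 (T + D))
sizes = [D // N] * N                              # D_n = D/N, so sum D_n = D
multi = sum(T * d_n**2 + d_n**3 for d_n in sizes)
assert multi <= single                            # the inequality in the text
print(f"estimated speed-up factor: {single / multi:.1f}")
```

For equal-sized modules the dominant T-term shrinks by a factor of about N, which matches the roughly linear (rather than superlinear) growth of the training time observed in the experiments below.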

III. NONLINEAR TIME SERIES PREDICTION
We apply the multi-step learning ESN to three nonlinear time series prediction tasks and compare its predictive performance and computational efficiency with those of the standard ESN. In Sec. III-A, we perform a prediction of the Lorenz system by using a trained closed-loop model. In Sec. III-B, we conduct a prediction of the NARMA system. In Sec. III-C, we deal with a prediction task using real time series data related to laser dynamics. Numerical experiments were performed on a computer with an Intel Xeon E5-2650 CPU (2.20 GHz).

A. THE LORENZ SYSTEM
We evaluate the predictive performance of the multi-step learning ESN in a chaotic time series prediction task with the Lorenz system [24], which is described as follows:

dx/dt = 10(y − x),
dy/dt = x(28 − z) − y,
dz/dt = xy − (8/3)z.

This three-dimensional autonomous dynamical system exhibits chaotic behavior, for which a long-term prediction is difficult. The practical aim of the task is to predict the future state of the system accurately for as long as possible. We generated the time series data from the Lorenz system. The time lengths of the initial transient, training, and testing data were set at T_init = 2, T_train = 100, and T_test = 25, respectively. With the sampling interval Δt = 0.02, the numbers of points in the initial transient, training, and testing data are 100, 5000, and 1250, respectively.
The input data fed into the model is represented as a three-dimensional vector u(t) = [x(t), y(t), z(t)] and the target output is u(t + Δt). Following the method in Pathak et al. [7], we set the output functions of the multi-step learning ESN so that the readout weights act on a transformed state vector r*:

ŷ = W_out r*,  with r*_j = r_j for odd j and r*_j = r_j² for even j,

where r_j and r*_j represent the jth components of r and r*, respectively. We fixed the mean degree of the reservoir network at d = 3 [7]. We optimized the hyperparameters of the ESN models based on the 10-step walk-forward validation [25], considering the candidate values in Table 1.
To measure the predictive performance, we used the Valid Time [7], which is defined as the elapsed time t_valid until the normalized prediction error E(t) first exceeds a given threshold ε (0 < ε < 1), where

E(t) = ||ŷ(t) − y(t)|| / ⟨||y(t)||²⟩_t^{1/2}.

We set ε = 0.4 in this study, following [7]. This measure provides the duration of sufficiently accurate prediction made by the trained model. An example of chaotic time series prediction with the 10-step learning ESN is demonstrated in Fig. 3. The total number of reservoir nodes is 1000. The size of each reservoir is given by D_n = 100 for n = 1, 2, …, 10. The prediction is successful until t = t_valid ∼ 8.76, after which the difference from the target output expands due to the nature of chaos.
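For reference, the Valid Time can be computed from a predicted and a target trajectory as in the following sketch (function and variable names are ours, not from the original implementation):

```python
import numpy as np

def valid_time(y_pred, y_true, dt, eps=0.4):
    """Elapsed time until E(t) = ||y_pred - y_true|| / <||y_true||^2>^(1/2)
    first exceeds the threshold eps; both arrays have shape (T, dim)."""
    norm = np.sqrt(np.mean(np.sum(y_true**2, axis=1)))
    E = np.linalg.norm(y_pred - y_true, axis=1) / norm
    exceeded = np.nonzero(E > eps)[0]
    # If the error never exceeds eps, the whole test window counts as valid
    return (exceeded[0] if exceeded.size else len(E)) * dt
```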
In the multi-step learning ESN, we can choose different reservoir sizes under a fixed total size. To systematically examine the effect of each reservoir size on the predictive performance, we computed the Valid Time of the 2-step learning ESN for different combinations of D_1 and D_2, as shown in Fig. 4. When D_1 is relatively small, i.e. D_1 ≤ 100, the predictive performance is drastically improved by increasing D_2. In contrast, if D_2 is relatively small, i.e. D_2 ≤ 100, an increase in D_1 does not significantly improve the prediction accuracy. This result means that better predictive performance can be achieved by making the second reservoir larger than the first one.
Next, we compared the N-step learning ESN with the standard ESN in terms of predictive performance and learning cost (i.e., the program execution time for the training process). The total reservoir size was fixed at D in both models. For a given D = 100N, we used the N-step learning ESN where each reservoir size is D_n = 100 for n = 1, …, N. We performed 10 trials with different network realizations for each parameter setting. The Valid Time averaged over the trials is plotted against the total reservoir size D in Fig. 5(a). The predictive performance of the multi-step learning ESN increases with the number N of ESN modules when D is relatively small. The average Valid Time of the multi-step learning ESN is larger than that of the standard ESN. Fig. 5(b) shows that the program execution time for the training process of the multi-step learning ESN is much shorter than that of the standard ESN, especially for large D values. The computational time for training the multi-step learning ESN increases linearly with the total reservoir size, whereas that for the standard ESN increases superlinearly. This numerical result agrees with the time complexity analysis in Sec. II-C. As observed in Fig. 4, the predictive performance of the multi-step learning ESN may be further improved by setting D_n (n ≥ 2) larger than D_1 under a fixed total size D.

B. THE NARMA MODEL
We consider another prediction task with the NARMA model, which has often been used as a benchmark for the performance evaluation of RC models [26]–[29]. The NARMA model of order m is described as follows:

y(t + 1) = α y(t) + β y(t) Σ_{i=0}^{m−1} y(t − i) + γ u(t − m + 1) u(t) + δ,   (28)

where t is the discrete time and u(t) is the input signal, randomly generated from a uniform distribution in [0, 0.5].
The parameter values were set at α = 0.3, β = 0.05, γ = 1.5, and δ = 0.1 as in the previous studies. The order m controls how distant past states and inputs influence the current state in Eq. (28). The aim of this task is to predict the output time series y(t) from the input time series u(t) for a certain time period. This is often called the NARMA-m task.
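A NARMA-m sequence following Eq. (28) can be generated with the following sketch (the seed and sequence length are illustrative):

```python
import numpy as np

def narma(T, m=10, alpha=0.3, beta=0.05, gamma=1.5, delta=0.1, seed=0):
    """Generate a NARMA-m input/output pair following Eq. (28)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 0.5, size=T)  # input uniform in [0, 0.5]
    y = np.zeros(T)
    for t in range(m - 1, T - 1):
        y[t + 1] = (alpha * y[t]
                    + beta * y[t] * np.sum(y[t - m + 1:t + 1])  # m past outputs
                    + gamma * u[t - m + 1] * u[t]
                    + delta)
    return u, y
```

The slice y[t − m + 1 : t + 1] covers exactly the m past outputs y(t − i) for i = 0, …, m − 1 appearing in Eq. (28).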
To predict a NARMA model with a larger m, an ESN-based model needs to have more memory capacity [6].
We performed the NARMA-10 task by generating the time series data from Eq. (28) with m = 10. The lengths of the initial transient, training, and testing data were set at 100, 3000, and 1000, respectively.
In the multi-step learning ESN, we set the output functions as follows:

f_out(x) = x,   (29)
g_out(x) = x,   (30)

i.e. both output functions are the identity function. The mean degree of the reservoir network was fixed at d = 3 [7]. The other hyperparameter values were optimized considering the candidate values in Table 1, based on the 10-step walk-forward validation [25]. We evaluated the predictive performance for the testing data using three error indices, the root mean square error (RMSE), the normalized RMSE (NRMSE), and the mean absolute error (MAE), which are defined as follows:

RMSE = ⟨(ŷ(t) − y(t))²⟩_t^{1/2},   (31)
NRMSE = RMSE / ⟨(y(t) − ȳ)²⟩_t^{1/2},   (32)
MAE = ⟨|ŷ(t) − y(t)|⟩_t,   (33)

where ŷ(t) is the model output, y(t) is the target output, ȳ is the temporal average of y(t), and ⟨·⟩_t represents a temporal average. Fig. 6 demonstrates an example of a successful prediction with the 10-step learning ESN where D_n = 200 for n = 1, …, 10. In this case, the RMSE, NRMSE, and MAE values for 1000 testing steps are 1.1 × 10⁻², 1.0 × 10⁻¹, and 9.1 × 10⁻³, respectively.
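The three error indices can be implemented directly; the function names below are ours:

```python
import numpy as np

def rmse(y_hat, y):
    """Root mean square error."""
    return np.sqrt(np.mean((y_hat - y) ** 2))

def nrmse(y_hat, y):
    """RMSE normalized by the standard deviation of the target."""
    return rmse(y_hat, y) / np.sqrt(np.mean((y - np.mean(y)) ** 2))

def mae(y_hat, y):
    """Mean absolute error."""
    return np.mean(np.abs(y_hat - y))
```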
Next, we compare the predictive performance between the 10-step learning ESN and the standard ESN. The total number of reservoir nodes is given by D. For a given D = 200N, we used the N-step learning ESN where D_n = 200 for n = 1, …, N. The results are shown in Fig. 7. The advantage of the multi-step learning ESN over the standard ESN in predictive performance is clearly observed in Fig. 7(a) (see Supplementary Figure 1 for the RMSE and MAE). This is because the multi-step learning ESN is suited for approximating dynamical systems with strong nonlinearity rather than those with long-term memory, as suggested in our previous study [16]. The computational efficiency of the multi-step learning ESN is confirmed in Fig. 7(b). Fig. 8 shows how the output error of the first ESN module is reduced by the subsequent modules in the 5-step learning ESN with D_n = 100 for n = 1, …, 5. The prediction error gradually decreases as the training process proceeds, and thus, the final prediction error is significantly smaller than the output error of the first module. This result demonstrates that the successive training processes in the multi-step learning ESN are effective for improving the prediction accuracy.

C. LASER DATASET
We deal with a prediction task using the Santa Fe Laser Dataset [30], which is one of the benchmark datasets for time series prediction [27]. The aim of this task is to make a one-step-ahead prediction for an experimental recording of the output power of a far-infrared laser in a chaotic regime. We used a part of the laser dataset. The lengths of the training and testing data were set at 4000 and 1000, respectively.
In the multi-step learning ESN, we set the output functions as in Eqs. (29) and (30). The mean degree of the reservoir network was fixed at d = 3 [7]. The other hyperparameter values were optimized considering the candidate values in Table 1, based on the 10-step walk-forward validation [25]. Fig. 9 demonstrates an example of a successful prediction with the 10-step learning ESN where D_n = 100 for n = 1, …, 10. In this case, the RMSE, NRMSE, and MAE values for 1000 testing steps are 3.4 × 10⁻³, 6.3 × 10⁻², and 1.8 × 10⁻³, respectively.
We compared the predictive performance between the 10-step learning ESN and the standard ESN. The total number of reservoir nodes is given by D in both models. For a given D = 100N, we used the N-step learning ESN where D_n = 100 for n = 1, …, N. The results are shown in Fig. 10. As shown in Fig. 10(a), the performance of the multi-step learning ESN is significantly improved by increasing the system size (see Supplementary Figure 2 for the RMSE and MAE). The multi-step learning ESN yields better predictive performance than the standard ESN. In addition, Fig. 10(b) shows that the multi-step learning ESN is more computationally efficient than the standard ESN.

IV. ANALYSIS OF THE MULTI-STEP LEARNING ESN
We analyze the role of network architecture in the multi-step learning ESN model. Similar network architectures with multiple reservoirs are used in other models such as the DeepESN model [10], [11] and the hierarchical reservoir model [13]. In the former model, multiple reservoirs are stacked, and their states are collected to train a single readout. In the latter model, multiple ESN modules, each including an input layer, a reservoir, and a readout, are stacked, and the training of the multiple readouts is sequentially conducted. Therefore, our model is close to the hierarchical reservoir model, but it has additional network connections as illustrated in Fig. 11 for the 2-step case. One is Connection-1, via which the input data is fed into all the ESN modules. The other is Connection-2, via which the model output of an ESN module is fed into the output layer of the subsequent module.

FIGURE 11. Comparison of the network architecture between (a) the two-step learning ESN and (b) the two-layer hierarchical reservoir model [13]. Both models have two readouts to be trained. The difference between them is that the two-step learning ESN has Connection-1 and Connection-2.
To clarify the roles of Connection-1 and Connection-2 in nonlinear time series prediction tasks, we compared the predictive performance of the following five variant models:

- ESN: the standard single-reservoir ESN.
- Multi-step ESN: the 2-step learning ESN with both Connection-1 and Connection-2.
- Multi-step ESN (wo1): the 2-step learning ESN without Connection-1.
- Multi-step ESN (wo2): the 2-step learning ESN without Connection-2.
- Multi-step ESN (wo1,2): the 2-step learning ESN without Connection-1 and Connection-2, which is equivalent to the hierarchical reservoir model [13].

The performance was evaluated in the three time series prediction tasks used in Sec. III. The total reservoir size was fixed at D in all the variant models. In the 2-reservoir models, the sizes of the reservoirs were set at D_1 = D_2 = D/2. For each parameter setting of each model, we performed 10 simulation trials. The other experimental conditions are the same as those in Sec. III. Fig. 12(a) shows the results of the comparison among the five variant models in the Lorenz system prediction. The Valid Time was used to evaluate the predictive performance, as in Sec. III-A. In Fig. 12(a), the best performance is obtained by the Multi-step ESN, having both Connection-1 and Connection-2, when the reservoir size is sufficiently large (i.e. D ≥ 400). The Multi-step ESN (wo1) and the Multi-step ESN (wo2) yield mildly degraded performance. The Multi-step ESN (wo1,2) produces largely degraded performance. The performance of the ESN is better than that of the other models only when the reservoir size is small (i.e. D ≤ 300). This result indicates the importance of the two additional connections in our model, which are absent in the hierarchical reservoir model. These connections are considered to play an essential role in augmenting the data used for training the readout of the subsequent ESN module, contributing to the enhancement of the prediction ability of the model. Fig. 12(b) shows similar comparative results in the NARMA-10 task. The NRMSE defined in Eq. (32) was used for the evaluation of the predictive performance, as in Sec. III-B. It is clearly observed that the Multi-step ESN and the Multi-step ESN (wo2) equally yield the best results.
This means that the removal of Connection-2 is not harmful in this task. The removal of Connection-1 (Multi-step ESN (wo1)) and that of both connections (Multi-step ESN (wo1,2)) largely reduce the predictive performance. The performance of the ESN is the second best when the total reservoir size is relatively small, but it quickly saturates with increasing D. In this prediction task, where the input and output time series are different, the concatenation of the original input and the output of the first ESN module is significant for data augmentation. The original inputs fed via Connection-1 homogenize the inputs to all the ESN modules, and therefore, we can use the same hyperparameter values in all the modules without exhaustive optimization efforts. Fig. 12(c) shows the results for the laser task. The NRMSE defined in Eq. (32) was used for the evaluation of the predictive performance. The experimental setting is the same as that in Sec. III-C, but the length of the training data was increased from 4000 to 8000 in order to highlight the difference between the five models. In this case, the Multi-step ESN and the Multi-step ESN (wo2) are comparable and much better than the other three models for most D values, as in the NARMA-10 task. The advantage of these two models suggests that Connection-1 is much more important than Connection-2.

V. CONCLUSION
We have demonstrated that the multi-step learning ESN is an effective and efficient multi-reservoir ESN model. In the training of the multi-step learning ESN, each ESN module is trained step by step with linear regression. We have analytically and numerically shown that this training method requires a computational cost equal to or smaller than that of the standard ESN with the same total reservoir size. In addition, we have shown in the tested tasks that the computational performance of the multi-step learning ESN is better than that of the standard ESN when the total reservoir size is sufficiently large. In short, our results indicate that it is effective to separate a single large-scale reservoir into multiple reservoirs and apply the step-by-step learning method instead of one-shot learning, for enhancement of the computational performance and efficiency of the ESN-based approach. This strategy would be applicable to improving the performance of other reservoir computing systems and hardware [4], particularly when exhaustive efforts are needed for implementing a large-scale reservoir.
In nonlinear time series prediction, the task difficulty depends on the memory and nonlinearity of the dynamical system behind the time series data. The task is more difficult when the data have longer-term correlations and exhibit stronger nonlinearity. Considering the results of our previous study [16] and the present study, we conclude that the multi-step learning ESN is suited for predicting highly nonlinear behavior such as chaotic dynamics. If time series data with long-term correlations are handled, the reservoir size of each ESN module needs to be enlarged so that the echo of input information is maintained for a long term within the reservoir. A future work is to apply the multi-step learning ESN to the prediction of higher-dimensional chaotic systems [18] and other real-world data. Another remaining task for practical applications is a performance comparison between the multi-step learning ESN and other state-of-the-art machine learning models in time series prediction.