Modeling and Evaluation of Echo-State Networks Using Spin Torque Nano-Oscillators

An echo state network (ESN), capable of processing time-series data with high accuracy, is designed and benchmarked using spin torque nano-oscillators (STNOs) with easy-plane anisotropy. An ESN belongs to the category of reservoir computers, where the reservoir comprises a randomly initialized, recurrently connected, and untrained pool of neurons and acts as a high-dimensional expansion of the input signal. The readout function is used to glean a meaningful output representation. Here, we use STNOs as the basic building block of the ESN and apply the ESN to predict the Mackey–Glass (MG) time-series data. The design parameters of the STNO and the input data representation are selected to yield prediction errors as low as $4\times 10^{-3}$. We also quantify the short-term memory (STM) and the parity-check (PC) capacity of the ESN and obtain metrics that are comparable to or better than existing spintronics-based ESNs, as well as ESNs employing “tanh” neurons. The peak STM is found to be approximately 8.8, while the peak PC capacity is found to be approximately 3.9. The impacts of thermal fluctuations and process variability on ESN performance are systematically quantified. Although the ESN’s prediction and memory capability remain robust with temperature variations, a 10% variation in the dimensions of the STNO free layer can lead to an increase of around 40% in its prediction error for the MG time-series data.


I. INTRODUCTION
The human brain is widely regarded as the ultimate computing engine with extremely high energy efficiency, reliability, and learning and cognitive capabilities [1]. Although CMOS devices are traditionally used for artificial neural networks (ANNs), there exists a mismatch between their properties and those of the brain [2]. Emerging devices, such as resistive memory [3], phase change memory [4], electrochemical memory [5], ferroelectrics [6], and spintronics [7], have been used to realize physical ANNs. Among these devices, spintronics technology offers distinct neuro-inspired functionality that can be controlled via external knobs (e.g., electric fields, currents, and magnetic fields) [8]. Spintronics devices can display coherent and stochastic oscillations [9], [10], memristive switching [11], probabilistic switching [12], and high-frequency signal modulation [13]. Spintronics devices such as two-terminal (2T) and three-terminal (3T) magnetic tunnel junctions (MTJs) [14] can be densely interwoven in large networks or reservoirs to generate highly optimized, energy-efficient non-von Neumann compute engines [15], [16].
Among various ANN implementations, echo state networks (ESNs), a type of reservoir computer with recurrent connections, are capable of processing temporal information, require only a simple statistical model for training, and are, therefore, a good choice for IoT/edge applications [17]. ESNs can be used for signal generation and forecasting, modeling of biological systems, filtering, and several other applications where time-series data prediction is key [18].
In prior work, spintronics devices have been successfully used to realize physical reservoir computers [19]. A spin torque nano-oscillator (STNO) reservoir was experimentally shown to perform a human-voice recognition task with 99.6% accuracy [20]. In [21], the memory capacity of a recurrent neural network using a vortex-type STNO was experimentally evaluated and quantified to be 1.8. A reservoir computer using 5-7 MTJs was theoretically shown to have performance similar to that of an ESN employing 20-30 “tanh” neurons [22]. Dipolarly coupled STNOs for reservoir computing have been analyzed numerically in [23], where the highest performance was obtained when the array operated at the boundary between the synchronized and disordered states. Voltage-controlled magnetic anisotropy (VCMA) in MTJs was used to realize a reservoir computer that handles the nonlinear autoregressive moving average (NARMA) task, and a normalized mean square error on the order of 10^-6 was demonstrated via numerical simulations [24]. The task-independent information processing capacity of STNOs was experimentally evaluated in [25]. It was shown that the total capacity reaches a maximum of 5.6 at the edge of the echo-state property. The effect of feedback on the vortex-core dynamics of a nanomagnet and its subsequent use for reservoir computing were presented in [26]. Numerical simulations showed that the short-term memory (STM) of the reservoir peaked when the delay time was not an integer multiple of the pulsewidth.
Despite the significant progress in the field of nanomagnetism and spintronics over the last decade, spintronics-based neuromorphic computing, particularly the reservoir computing model, is in its nascent stage, and its application to real-world temporal tasks is not fully quantified. In this article, we implement an ESN using STNOs and test it on the Mackey-Glass (MG) time-series prediction task. We also evaluate the network's figures of merit, including its accuracy, memory capacity, and nonlinearity, with respect to the STNO's physical properties and input signal representation. We also benchmark the impact of thermal noise and process variability on the ESN's performance.
Compared to prior works, this article establishes a clear connection between the STNO's dynamics (i.e., nonlinearity and relaxation) and the network's ability for accurate time-series prediction as well as its capacity for STM and parity check (PC). Signal pulsewidth adjustment is utilized to enhance the nonlinearity of the STNO's output relative to the input current. The proposed STNO-based ESN offers competitive metrics with respect to prediction accuracy compared to other similar networks designed with emerging devices, including memristors and ferroelectric tunnel junctions [21], [22], [24], [27], [28]. In addition, unlike most previous works in spintronics reservoir computing that ignore device-level nonidealities, we compute the statistical performance of the ESN under thermal noise and variations in the dimensions of the STNO.

A. IMPLEMENTATION OF ESNs
An ESN is composed of an input layer, a reservoir with recurrently connected nodes, and a trainable output layer [29], as shown in Fig. 1. The input of an ESN is time-series data, u(n) ∈ R^{N_u}, where n (= 1, 2, . . ., T) is the time index and T is the total number of samples. The reservoir of size N_r maps the input data to a higher-dimensional space according to

q(n) = W_ir [1; u(n)] + W_rr x(n − 1)
x(n) = (1 − α) x(n − 1) + α f(q(n))    (1)

where q(n) and x(n) are the input and output of the reservoir, respectively, and α ∈ [0, 1] is the leaking rate, chosen to match the speed of the input and output dynamics. α is set to 0.95 in this article (see the supplementary document). f is the activation function of the neurons, for example, the hyperbolic tangent. The matrices W_ir ∈ R^{N_r × (1+N_u)} and W_rr ∈ R^{N_r × N_r} correspond to the weights of the input-reservoir connections and the recurrent connections, respectively. Both weight matrices are initialized with a uniform distribution in the range of [−1, 1] and remain invariant in a given network. In addition, W_rr is normalized such that its spectral radius ρ(W_rr) is smaller than unity (see the supplementary document). Then, the output is an inner product of W_ro and the extended state [1; u(n); x(n)], expressed as

y(n) = W_ro [1; u(n); x(n)]    (2)

where y(n) ∈ R^{N_y} is the output of the ESN and W_ro ∈ R^{N_y × (1+N_u+N_r)} is the output weight matrix. W_ro is the only trainable matrix in an ESN, and it is optimized to minimize the error between y(n) and the target output, y_target(n). The latter is prepared as

y_target(n) = u(n + k)    (3)

where k is the time switch step. That is, the network is designed to predict the time series k steps in the future. The root mean squared error (RMSE) of the prediction task is given as

RMSE = sqrt( (1/T) Σ_{n=1}^{T} [y(n) − y_target(n)]² ).    (4)

During the training phase, (2) can be written in matrix notation as

Y = W_ro X    (5)

where Y ∈ R^{N_y × T} and X ∈ R^{(1+N_u+N_r) × T} are the horizontal concatenations of y(n) and [1; u(n); x(n)], respectively. The training process amounts to solving an overdetermined system (N_r < T)

Y_target = W_ro X    (6)

where Y_target is the horizontal concatenation of y_target(n). To solve the system of linear equations in (6), we use a stable and universal ridge regression method

W_ro = Y_target X^T (X X^T + β I)^{−1}    (7)

where β is a regularization coefficient that stabilizes the solution and I is the identity matrix. ESNs benefit from the fixed W_ir and W_rr together with the linearly solvable W_ro, which lowers the training difficulty and computational complexity.
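The training pipeline described above — run the fixed reservoir forward, collect the extended states, and solve the ridge regression for W_ro — can be sketched in a few lines of NumPy. This is a toy illustration with a random input series and tanh neurons, not the STNO reservoir or the exact hyperparameters of this article:

```python
import numpy as np

rng = np.random.default_rng(0)
N_u, N_r, T, alpha, beta = 1, 50, 1000, 0.95, 1e-8

# Fixed weights, uniform in [-1, 1]; W_rr rescaled to spectral radius 0.9 < 1.
W_ir = rng.uniform(-1, 1, (N_r, 1 + N_u))
W_rr = rng.uniform(-1, 1, (N_r, N_r))
W_rr *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_rr)))

u = rng.uniform(-1, 1, T)          # toy input series
y_target = np.roll(u, -1)          # predict one step ahead (k = 1)

# Leaky-integrator reservoir update: x(n) = (1-a)x(n-1) + a*f(q(n)).
x = np.zeros(N_r)
X = np.zeros((1 + N_u + N_r, T))   # columns are [1; u(n); x(n)]
for n in range(T):
    q = W_ir @ np.array([1.0, u[n]]) + W_rr @ x
    x = (1 - alpha) * x + alpha * np.tanh(q)
    X[:, n] = np.concatenate(([1.0, u[n]], x))

# Ridge-regression readout: W_ro = Y_target X^T (X X^T + beta*I)^-1.
W_ro = (y_target[None, :] @ X.T) @ np.linalg.inv(X @ X.T + beta * np.eye(1 + N_u + N_r))
y = (W_ro @ X).ravel()
rmse = np.sqrt(np.mean((y - y_target) ** 2))
```

Only `W_ro` is learned; the loop and the fixed matrices are identical at training and inference time, which is what keeps the training cost low.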

B. EVALUATION OF ESNs
In this article, three different tasks are implemented to evaluate the performance of the network: 1) MG time-series prediction; 2) STM capacity evaluation; and 3) PC capacity evaluation.
The training sample includes 5000 data points, while the test sample includes 2000. The default reservoir size and time switch step k are 50 and 1, respectively, unless otherwise specified.

1) MG TIME-SERIES PREDICTION
The MG series is a popular time series that provides a valuable framework for studying the dynamics of nonlinear, time-delayed systems, offering insights into the behavior of complex real-world phenomena [30]. The system is described by the following delay differential equation:

dX(t)/dt = β X(t − m) / [1 + X(t − m)^τ] − γ X(t)    (8)

where X(t) is the state variable, and the parameters β, γ, and τ are set to 0.2, 0.1, and 10, respectively, in this article. The series shows chaotic behavior for m > 16.8 [31]. In this work, the delay m is a randomly generated integer in the range of [20, 30]. The goal of this task is to use X(t) as the input data and X(t + k) as the target output and measure the prediction capability of ESNs for different values of the time step k.
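A minimal generator for the series, using simple Euler integration of the delay equation with the parameter values above (the step size and initial condition are illustrative choices, not specified in this article):

```python
import numpy as np

def mackey_glass(T=5000, m=25, beta=0.2, gamma=0.1, tau_exp=10, dt=1.0, x0=1.2):
    """Euler integration of the Mackey-Glass delay equation:
    dX/dt = beta*X(t-m)/(1 + X(t-m)**tau_exp) - gamma*X(t),
    with delay m (chaotic for m > 16.8) and exponent tau_exp = 10."""
    n_delay = int(m / dt)
    x = np.full(T + n_delay, x0)      # constant history as initial condition
    for t in range(n_delay, T + n_delay - 1):
        x_d = x[t - n_delay]          # delayed state X(t - m)
        x[t + 1] = x[t] + dt * (beta * x_d / (1 + x_d**tau_exp) - gamma * x[t])
    return x[n_delay:]
```

For the prediction task, the generated series would then be standardized to zero mean and unit variance before being fed to the reservoir.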

2) STM CAPACITY EVALUATION
The STM capacity refers to the network's ability to retain and utilize information from recent inputs for a limited duration of time [32]. Unlike the target output for prediction tasks given in (3), the target output for STM is

y_target(n) = u(n − d)    (9)

where d (= 1, 2, 3, . . .) is the delay, while the input u(n) in the STM evaluation is a random signal composed of binary integers, 0 or 1. The memory capacity is quantified as

C_STM = Σ_{d=1}^{∞} C_STM,d = Σ_{d=1}^{∞} cov²[y(n), y_target(n)] / [σ²(y(n)) σ²(y_target(n))].    (10)

In real experiments, it is impossible to sweep d from 1 to infinity; truncating the sum at d = 20 is sufficient because C_STM,d≥20 ≪ 1.
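The per-delay capacity and its sum can be illustrated as follows. The readout here is mocked by a noisy copy of the delayed input, standing in for a trained ESN output, so the resulting numbers are purely illustrative:

```python
import numpy as np

def delay_capacity(y, u, d):
    """C_{STM,d}: squared correlation between output y(n) and target u(n-d)."""
    c = np.cov(y[d:], u[:-d])                 # align output with delayed input
    return c[0, 1] ** 2 / (c[0, 0] * c[1, 1])

rng = np.random.default_rng(1)
u = rng.integers(0, 2, 5000).astype(float)    # random binary input stream
C_stm = 0.0
for d in range(1, 21):                        # truncate the sum at d = 20
    # Mock readout: recalls u(n-d) with additive noise (first d samples junk).
    y = np.concatenate([rng.standard_normal(d),
                        u[:-d] + 0.1 * rng.standard_normal(len(u) - d)])
    C_stm += delay_capacity(y, u, d)
```

Each C_STM,d lies in [0, 1] by the Cauchy-Schwarz inequality, so the truncated sum is bounded by 20.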

3) PC TASK
The PC is another task that can be used to evaluate an ESN's ability to handle binary time-series data [33]. The input is the same as that for evaluating the STM capacity, that is, a random sequence of 0s and 1s. The aim is to check the parity of a sequence with delay d. Like the STM capacity, the PC capacity quantifies the number of target data the reservoir can store, although the target output for the PC task is a nonlinear transformation of the input data and is given as

y_target(n) = [u(n) + u(n − 1) + · · · + u(n − d)] mod 2.    (11)

The performance evaluation of the PC capacity is the same as that of the STM capacity expressed in (10), and d = 8 is enough to ensure C_PC,d≥8 ≪ 1.
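Generating the PC target from a binary input stream is straightforward; a small sketch of the mod-2 windowed sum:

```python
import numpy as np

def parity_target(u, d):
    """PC-task target: y(n) = [u(n) + u(n-1) + ... + u(n-d)] mod 2.
    The first d samples have no full window and are left as 0."""
    u = np.asarray(u, dtype=int)
    y = np.zeros(len(u), dtype=int)
    for n in range(d, len(u)):
        y[n] = u[n - d:n + 1].sum() % 2
    return y
```

Unlike the STM target, this output cannot be produced by any linear function of the delayed inputs, which is why the PC capacity probes the reservoir's nonlinearity rather than its linear memory.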

A. PHYSICS AND MODELING
The structure of an STNO, which is a 2T MTJ, is illustrated in Fig. 2(a). It consists of a free ferromagnetic layer with in-plane magnetization, while the reference ferromagnetic layer is pinned in the out-of-plane direction. The free and reference layers are separated by a tunneling barrier, typically made of MgO. The reference layer's primary function is to act as a spin polarizer for the electric current flowing through the device.
The free layer absorbs the spin angular momentum of the spin-polarized current flowing through it, which can result in a dynamic instability of the free layer's magnetization vector.
Depending on the magnitude of the spin torque and the design parameters of the MTJ, different dynamics can be excited in the MTJ. In our case, we focus on the regime wherein the spin torque excites an out-of-plane oscillation of the free layer's magnetization, with the z-component of the magnetization increasing monotonically with the input spin current density.
The magnetization dynamics are described by the Landau-Lifshitz-Gilbert (LLG) equation [34]

dm/dt = −γ_e μ0 (m × H_eff) + α_g (m × dm/dt) + (γ_e ħ I)/(2 e M_s V) m × (m × m_p)    (12)

where γ_e, μ0, ħ, and e are the gyromagnetic ratio, the vacuum permeability, the reduced Planck constant, and the elementary charge, respectively, α_g is Gilbert's damping coefficient, m (= M/M_s) is the normalized magnetization, M_s is the saturation magnetization, I is the input spin current, V is the volume of the free layer, and m_p is the unit vector in the direction of the reference layer's magnetization. The charge current, I_c, that produces the spin current, I, is given as I_c = I/η, where η is the STT efficiency. All analyses conducted in our work are based on I, and thus η is not directly invoked. H_eff is the effective magnetic field, including the demagnetization field, H_d = (0, 0, −M_s m_z), and the crystalline anisotropy field. The anisotropy field for analytical calculations is assumed to be finite in the x- and z-directions, H_an = (H_ax, 0, H_az). In the presence of thermal noise, H_eff must be amended with a Langevin field representing Gaussian white noise. Section IV-B.3 discusses the impact of thermal noise on the ESN's performance. Additional details on thermal simulations can be found in the Supplementary Material.
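A bare-bones macrospin integrator for the LLG equation can clarify how such dynamics are simulated numerically. This sketch uses explicit Euler in the equivalent Landau-Lifshitz form and passes the spin-torque term in precomputed; the parameter values are illustrative, not those of Table 1:

```python
import numpy as np

GAMMA = 2.211e5        # gamma_e * mu_0 in m/(A*s), illustrative value
ALPHA = 0.02           # Gilbert damping alpha_g, illustrative value

def llg_step(m, H_eff, tau_stt, dt):
    """One explicit Euler step of the macrospin LLG equation.
    tau_stt is the precomputed spin-torque term
    ~ (gamma_e*hbar*I / (2*e*Ms*V)) * m x (m x m_p), in 1/s."""
    prec = -GAMMA / (1 + ALPHA**2)
    dmdt = prec * (np.cross(m, H_eff) + ALPHA * np.cross(m, np.cross(m, H_eff)))
    dmdt = dmdt + tau_stt
    m_new = m + dt * dmdt
    return m_new / np.linalg.norm(m_new)   # renormalize to keep |m| = 1

# Example: precession about a +z field; m tilts toward +z while circling it.
m = np.array([1.0, 0.0, 0.0])
for _ in range(100):
    m = llg_step(m, np.array([0.0, 0.0, 1e5]), np.zeros(3), 1e-13)
```

Production micromagnetic solvers use higher-order schemes (e.g., Heun or RK45) and a per-cell effective field, but the torque structure is the same.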
To excite oscillations in the free layer's magnetization [35], [36], a critical spin current density, J_min, must be applied to the free layer

J_min = (2 e μ0 M_s t_f / ħ) m_xc m_yc H_ax    (13)

where m_xc and m_yc are, respectively, the x- and y-components of the free layer's magnetization at the critical point where m leaves the easy plane and t_f is the thickness of the free layer. The oscillations in m_x and m_y are shown in Fig. 2(b). For spin current density exceeding an upper bound, J_max, the free layer's magnetization orients along the fixed layer's direction. J_max is given as

J_max = (2 e α_g μ0 M_s² t_f) / ħ.    (14)

Fig. 2(c) shows the reorientation of the z-component of the free layer, m_z, along the fixed layer's polarization for J = 2 × 10^11 A/m². Derivations of J_min and J_max are presented in the supplementary document.
The free layer's z-component of the magnetization, m_z, is utilized as the output state variable of the STNO and can be measured via the tunneling magnetoresistance (TMR). The conductance of the STNO, G, is linearly dependent on m_z [37]. Thus, with a readout voltage V_read, the output current I_out = V_read G also becomes linearly dependent on m_z.

B. RESULTS AND DISCUSSION
The relationship of m_z versus J is depicted in Fig. 3(a). For this simulation, we assume the free layer to possess an anisotropy field along the x-direction, H_ax, which is varied from 10 to 40 kA/m. For finite H_ax, the m_z-J curve displays a discontinuity because of the presence of a finite J_min defined in (13). The scaling of J_min with H_ax, obtained analytically, matches well with the results obtained from micromagnetic simulations, as shown in Fig. 3(b). To ensure that m_z varies smoothly with J, a vanishing in-plane anisotropy field (H_ax, H_ay ≈ 0) is required. For materials with negligible in-plane crystalline anisotropy, a circular cross section of the free layer with an easy-plane energy landscape can render the anisotropy negligible. Our motivation for reducing the discontinuity in the m_z-J curve is to obtain optimal ESN performance with respect to prediction accuracy, as discussed in Section IV-B.
Unless otherwise stated, the STNO parameters used for all micromagnetic simulations are listed in Table 1. These parameters are taken from the experimentally measured STNO of Houssameddine et al. [35]. Note that the choice of M_s will affect the transition region of m_z, that is, the region where m_z changes from −1 to +1 as the input spin current density is varied. However, the conclusions drawn here remain qualitatively the same for a different M_s value. The weak crystalline anisotropy of 800 A/m, reported in [35], is neglected in our simulations, while the reported shape anisotropy of 4-8 kA/m is reduced to zero by using a circular cross section of the free layer. Furthermore, the weak crystalline anisotropy can be counteracted by tailoring the shape anisotropy, while the effect of slight anisotropy induced by shape variations is discussed in Section IV-B.3.
For oscillatory dynamics, the energy supplied by the spin current equals the energy consumed via damping over one time period of oscillations, leading to the following condition:

∫_0^T (dE/dt) dt = 0    (15)

where E is the energy of the free layer and T denotes the time period of oscillations. By solving (15) for easy-plane ferromagnets with negligible anisotropy, we obtain

m_z = (ħ J) / (2 e α_g μ0 M_s² t_f) = J / J_max    (16)

which highlights the continuous and linear dependence of m_z on the spin current density in a steady state. This relationship holds as long as the spin current density is lower than the upper bound J_max defined in (14).
For the parameters noted in Table 1, the micromagnetic simulation results for the m_z-J relationship are shown in Fig. 3(c). For this simulation, we select input current densities of different pulsewidths, t_pw, ranging from 0.3 to 2 ns. For all cases, the m_z-J relationship is nearly linear as long as J is lower than J_max. However, for shorter t_pw, the m_z-J behavior can depart from linearity over a broader range of J. This nonlinearity is typically important for reservoir nodes and ESN applications.

IV. SIMULATION OF ESNs
A. SIMULATION SETUP
Fig. 4 depicts the flowchart of the implementation of an ESN utilizing the STNO. For the simulation, both the input and output are scalar values; thus, N_u = N_y = 1. By default, the reservoir size N_r is set to 50, unless specified otherwise. The fixed weight matrices are initialized as described in Section II-A.
The input signals undergo a preprocessing step where their mean is adjusted to zero and their standard deviation is normalized to one. Moreover, in the case of continuous input signals, additional discretization is required, employing an appropriate sampling rate, as exemplified in Fig. 5(a). The resulting signal, denoted as q(n), is subsequently treated as the input signal for the STNO. To match the input signal's range (approximately 1) with the range of current density required for the STNO (around 10^11 A/m²), a conversion process is necessary. The conversion process is expressed as

J(n) = γ q(n)    (17)

where γ, in A/m², is an essential parameter that significantly influences the performance of the ESN. The dynamics of the STNO depend on the choice of γ along with the input pulsewidth, t_pw, as shown via micromagnetic simulations reported in Fig. 5(b). For these results, γ = 1.8 × 10^11 A/m² and t_pw = 0.5 ns. Furthermore, we can allow the STNO to relax toward its equilibrium state (i.e., m_z = 0) after each input pulse, as shown in Fig. 5(c). Here, the relaxation time, t_relax, is set to 1 ns. Two distinct simulation methods are adopted to balance the demands of simulation accuracy and computational efficiency. The first approach, ''Real-Time Simulation,'' conducts a complete micromagnetic simulation of the STNO at each time step n. The initial state of the STNO at time step n is determined by its final state at time step n − 1, followed by relaxation for a duration t_relax. While this method can simulate the ESN performance for a range of t_relax values, it requires significant computational resources. Hence, it is primarily utilized in Section IV-B.1 to assess the influence of the relaxation time on ESN performance. The second approach, ''Dictionary-Search Simulation,'' is based on a pregenerated J-m_z database, called the dictionary, where the initial magnetization state of the STNO lies in the easy xy-plane. For each input current density, the simulation searches for the closest J in the dictionary and uses the corresponding m_z as the output. This method operates under the assumption that the STNO has been given sufficient relaxation time, allowing it to start from m_z = 0 at each time step. Therefore, it is applied in Section IV-B.2 to provide the most accurate, yet fast, estimation of the performance of the STNO-based ESNs.
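The dictionary-search approach amounts to the signal conversion followed by a nearest-neighbor table lookup. A sketch, where the linear-with-saturation table is a toy stand-in for the pregenerated micromagnetic J-m_z dictionary (J_max = 1.72 × 10^11 A/m² is taken from Fig. 3):

```python
import numpy as np

# Pregenerated lookup table on a uniform J grid (1e9 A/m^2 spacing).
J_table = np.linspace(-3e11, 3e11, 601)          # A/m^2
mz_table = np.clip(J_table / 1.72e11, -1.0, 1.0)  # toy stand-in for simulation data

def stno_output(q, gamma=1.5e11):
    """Convert normalized reservoir input q(n) to J = gamma*q(n),
    then return the dictionary m_z for the nearest tabulated J."""
    J = gamma * np.atleast_1d(np.asarray(q, dtype=float))
    idx = np.abs(J_table[None, :] - J[:, None]).argmin(axis=1)
    return mz_table[idx]
```

In the full simulation, each reservoir node would call such a lookup once per time step in place of the micromagnetic solve, which is what makes the method fast while preserving the simulated J-to-m_z nonlinearity.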

B. RESULTS AND DISCUSSION 1) IMPACT OF RELAXATION TIME
We implement ''Real-Time Simulation'' in this section to evaluate the influence of t_relax on the performance of ESNs. The output of the STNO is a joint contribution of its initial state, the input current density, and the pulsewidth. In the STNO-based ESN simulation, the initial state is the final state of the STNO at the previous time step. Thus, a longer t_relax makes the STNO less dependent on its input history. The impact of t_relax on the ESN's performance is tested for both the MG time-series prediction and STM tasks.
Fig. 6(a) illustrates the dependence of the RMSE on t_relax for the MG time-series prediction task. Simulations with two different t_pw values show that a longer relaxation time leads to a lower error for the prediction task. The RMSE without relaxation is more than twice as large as that with t_relax = 2 ns. The STM capacity also increases with a longer relaxation time, as shown in Fig. 6(b), which is consistent with the RMSE results for the MG time-series prediction task. From micromagnetic simulations, t_relax > 1 ns is sufficient to return m_z to its equilibrium state and ensure that the network maintains high performance for the various evaluation tasks.

2) IMPACT OF SIGNAL CONVERSION
We utilize the ''Dictionary-Search Simulation'' method to investigate the influence of γ, t_pw, the reservoir size, and the time switch step on the ESN performance, which is gauged by the RMSE between the target output y_target and the network output y_out. Fig. 7(a) presents the results of the MG time-series prediction task, employing γ = 1.5 × 10^11 A/m² and t_pw = 0.5 ns. In Fig. 7(b) and (c), we evaluate the impact of the reservoir size and the time switch step, respectively, on the ESN's RMSE. Results are shown for various t_pw values. For comparison, we also show the network's performance when the activation function is a hyperbolic tangent. From these results, we can infer that the RMSE is reduced for a larger reservoir size, although the improvement tends to saturate for reservoirs with more than 50 neurons. Moreover, the STNO-based ESN has a lower RMSE than an ESN employing a hyperbolic-tangent activation function. Finally, we observe that pulsewidths shorter than 1 ns are desirable for reducing the RMSE.
A positive correlation between the RMSE and larger time switch values (k) can be seen in Fig. 7(c). The RMSE stabilizes as k approaches 10, suggesting that the ESN has limited capability to predict time-series data for k > 10. For all k values analyzed here, the STNO-based ESN has an RMSE similar to that of a tanh-ESN. Fig. 7(d) reports the STM capacity as a function of delay for various t_pw values. C_STM increases linearly until d = 7 and saturates beyond d = 10. The maximum C_STM reaches a high value of 8.88, which is significantly greater than that reported for spintronics-based ESNs [21], [22], [24]. Furthermore, the memory capacity of the STNO-based ESN is equal to or better than that of a tanh-ESN, making STNOs an attractive candidate technology for ESN-based applications.
The simultaneous impact of γ and t_pw on the RMSE, STM capacity, and PC capacity is shown in Fig. 7(e)-(g), respectively. Fig. 7(e) shows the RMSE of MG time-series prediction for k = 1. The RMSE reaches as low as 4 × 10^−3 when the mean and standard deviation of the MG time series are 0 and 1, respectively. The RMSE increases monotonically with γ ∈ [6 × 10^10, 3 × 10^11] A/m² for a fixed t_pw > 1.2 ns. In the low-t_pw regime, the optimal γ ranges from 1.2 × 10^11 to 2.1 × 10^11 A/m², with lower optimal γ values resulting for shorter t_pw. Our simulations suggest a practical rule for choosing γ: if the current densities applied to the STNO after signal conversion fall in the nonlinear region of Fig. 3(c), the ESN's performance will be close to optimal.
For the STM capacity assessment, reported in Fig. 7(f), the training and testing data are randomly generated binary values of 0 or 1, while the delay is swept from 1 to 20. The ESN performance remains nearly constant for pulsewidths greater than 1 ns for a given γ. To maximize C_STM, we can identify optimal combinations of γ and t_pw; for example, for t_pw = 1 ns, γ_opt = 1.2 × 10^11 A/m², yielding C_STM = 8.86, while a lower γ_opt results for t_pw = 0.3 ns. Fig. 7(g) depicts the PC capacity obtained by sweeping γ from 1.5 × 10^10 to 3 × 10^11 A/m², while t_pw is varied from 0.1 to 2.0 ns. The PC capacity is maximized for smaller γ and t_pw values, which is distinct from the optimal γ and t_pw values identified for the MG time-series prediction and STM capacity evaluation tasks [Fig. 7(e) and (f)]. Thus, our results indicate that the design of the STNO-based ESN must take into account the task-specific details. The PC capacity reaches a maximum of 3.88 with γ = 3 × 10^10 A/m² and t_pw = 0.8 ns.
The tradeoff between C_STM and C_PC is explored in Fig. 8. The memory-nonlinearity tradeoff has been studied previously in many works [23], [38]. For the various t_pw values explored here, the product C_STM × C_PC reaches its peak value of 22.3 for γ = 1.2 × 10^11 A/m² and t_pw = 1.5 ns. However, with increasing γ, the product gradually degrades to values less than 10. The C_STM-C_PC tradeoff might arise from the completeness property of the information processing capacity; further exploration is beyond the scope of this work.

3) IMPACT OF THERMAL NOISE AND PROCESS VARIABILITY
In this section, we consider the influence of thermal noise on the dynamics of STNOs, as well as fabrication process variability. The thermal noise is included as a thermal field, H_th, in (12), which is given as

H_th = ξ_t sqrt( (2 α_g k_B T) / (γ_e μ0² M_s V Δt) )    (18)

where ξ_t ∼ N(0, 1) is a standard Gaussian vector, k_B is the Boltzmann constant, T is the temperature, and Δt = 0.1 ps is the time step in the micromagnetic simulations. The thermal field imparts stochasticity to the dynamics of STNOs. To account for this, we conduct 20 micromagnetic simulations with the same ESN setup and STNO parameters and record the statistical variations in the ESN performance.
The impact of thermal noise on the performance of the ESN is analyzed in Fig. 9(a). Here, we assume that the magnetic properties of the STNO are insensitive to temperature, which is justified as long as the Curie temperature of the magnetic materials employed in the STNO is much greater than 400 K [39]. For a given pulsewidth of the input signal, the RMSE exhibits a positive correlation with temperature within the range of 0-400 K, as depicted in Fig. 9(a). The RMSE at 400 K increases by ≈30%-42% compared to its value at 0 K. We note that even in the presence of thermal fluctuations, the RMSE is lower for longer relaxation times. The impact of thermal noise on the RMSE and the relative standard deviation (RSD) of the ESN for different input pulsewidths is discussed in the Supplementary Material.
As mentioned in Section IV-B, a circular cross section of the STNO ensures easy-plane anisotropy of the free layer, which results in a continuous J-m_z curve [see Fig. 3(a)]. Size variations during fabrication will induce shape anisotropy in either the x- or y-direction. The impact of process variations is modeled by first calculating the demagnetizing coefficients of the thin-film ferromagnet according to [40]. Subsequently, the demagnetization coefficients are used to estimate the in-plane anisotropy field, H_ax/ay = (N_y/x − N_x/y) × M_s. The anisotropy field is found to be H_ax = 1757 A/m for r_x = 55 nm, while H_ay = 2147 A/m for r_x = 45 nm. Here, r_x denotes the radius of the free layer along the x-axis. Fig. 9(b) shows the RMSE of the ESN for the MG time-series prediction task when the x-axis diameter is varied. We find that for a 10% process variability, the RMSE increases by 36% for t_pw = 0.5 ns and by 24% for t_pw = 1.0 ns. Yet, the absolute value of the RMSE remains under 1.4 × 10^−2 with process variability, which could be acceptable for many error-tolerant applications.

V. CONCLUSION
We explored the use of STNOs with easy-plane anisotropy to implement an ESN that possesses high prediction accuracy, nonlinearity, and memory capacity. Using real-time simulations combined with analytic models of the STNO physics, we identified the representation of the input data signal, its pulsewidth, and the relaxation time of the STNO that achieve high accuracy for the MG time-series prediction while endowing the reservoir with both memory and nonlinearity. We found that an input pulsewidth below 1 ns, an input-signal-to-current-density conversion factor, γ, around 1.5 × 10^11 A/m², and a relaxation time of 1 ns are preferred to obtain a low prediction error (∼4 × 10^−3) and a peak STM of 8.8. The compound metric, the product of the STM and PC capacities, exhibits a maximum in the range of 20-23, depending on the choice of γ and t_pw. We also showed that increasing the reservoir size from 10 to 50 STNO neurons reduces the prediction error by more than a factor of 2, although further increases in reservoir size yield diminishing returns. We quantified the impact of thermal fluctuations and process variability on the ESN's prediction ability. Our statistical analysis revealed an increase in the RMSE of the prediction task by as much as 42% when the temperature increased from 0 to 400 K. Likewise, a 10% variation in the STNO's free layer dimensions yielded almost a 40% increase in the RMSE. Nonetheless, our STNO-based ESN offers competitive performance compared to reservoir computers based on other emerging devices and is thus an excellent candidate for a physical reservoir computer with low training costs.

FIGURE 1 .
FIGURE 1. Overview of the ESN implementation. The reservoir is made using STNOs. The meaning of the symbols is explained in the text.

FIGURE 2 .
FIGURE 2. (a) Schematic of the STNO, where a tunneling barrier is sandwiched between an in-plane free layer and an out-of-plane polarizer. (b) Oscillatory dynamics of the STNO for J = 8 × 10^10 A/m². (c) Reorientation of the free layer along the fixed layer's magnetization for J = 2 × 10^11 A/m². Simulation parameters for these results are given in Table 1.

FIGURE 3 .
FIGURE 3. (a) m_z-J relationship for different H_ax values. Dashed vertical lines represent the threshold current density J_min. J_max, shown by the dotted vertical lines, is 1.72 × 10^11 A/m². (b) J_min versus H_ax obtained analytically and validated against micromagnetic simulations. Here, m_xc m_yc = 0.4. (c) m_z-J relationship for different input pulsewidths t_pw.

FIGURE 4 .
FIGURE 4. Flowchart depicting the ESN implementation and simulation setup.

FIGURE 5 .FIGURE 7 .
FIGURE 5. (a) Example of continuous input u(t) and discrete samples u(n). (b) STNO's current density signal generated from u(n) with γ = 1.8 × 10^11 A/m² and t_pw = 0.5 ns, and the corresponding output m_z. (c) STNO's current density generated from u(n) with t_pw = 0.5 ns (solid lines) together with a 1.0-ns relaxation time (dashed lines), and the corresponding output m_z.

FIGURE 8 .
FIGURE 8. Product of the STM and PC capacity of the ESN for different signal conversion ratios and pulse widths.

FIGURE 9 .
FIGURE 9. (a) RMSE versus temperature with 20 simulations per data point for the MG time-series prediction task; t_pw = 0.5 ns and t_relax = 0.3, 0.5, and 1.0 ns. (b) RMSE versus free layer diameter variation for the MG time-series prediction task. Here, γ = 1.5 × 10^11 A/m² and t_pw = 0.5 and 1.0 ns.