Capacity Prediction and Validation of Lithium-Ion Batteries Based on Long Short-Term Memory Recurrent Neural Network

Capacity prediction of lithium-ion batteries is an important function of battery management systems. Conventional machine learning-based methods for capacity prediction are inefficient at learning the long-term dependencies that arise during capacity degradation. This paper investigates a deep learning method for lithium-ion battery capacity prediction based on the long short-term memory recurrent neural network, which is employed to capture the latent long-term dependence of the degraded capacity. The neural network is adaptively optimized by the Adam optimization algorithm, and the dropout technique is exploited to prevent overfitting. Based on the offline cycling aging data of batteries, the capacity prediction performance is validated and evaluated. The experimental results demonstrate that the proposed algorithm can accurately track the nonlinear degradation trend of capacity over the whole lifespan with a maximum error of only 2.84%.


I. INTRODUCTION
Lithium-ion batteries have been widely deployed in electric vehicles (EVs) and energy storage systems of power grids due to their high energy/power density, absence of memory effect and long lifespan [1]. However, with cyclic charging and discharging operations, the battery's capacity degradation and electrical performance deterioration can affect vehicle operating performance and safety. In particular, when the capacity decreases below 80% of its initial value, lithium-ion batteries become unstable and degrade faster than before, implying that they have reached end of life (EOL) [2], and continued operation may lead to irreversible damage. As such, accurate diagnosis of the battery health condition becomes an indispensable task [3]. In practice, a serviceable battery management system (BMS) is essential to ensure operating efficacy and battery safety [4]. One main function of the BMS is to estimate the internal states of batteries, such as state of charge (SOC) and state of health (SOH) [5]. Accurate capacity information provides an important foundation for SOC and SOH estimation, and also supplies valuable indexes to end-users and battery manufacturers [6].
To date, extensive research has been conducted to improve the capacity prediction accuracy of batteries. The conventional capacity estimation methods can be categorized into two types: model-based methods and data-driven methods [7]. Reference [8] proposes a two-stage scheme for battery capacity estimation according to the variation of thermal dynamics. In the first stage, the battery core temperature and heat generation are estimated, and then a joint estimation of both SOC and capacity is performed in the following stage. Reference [9] builds a capacity fading model based on the sample entropy, which is employed to calculate the battery surface temperature in the charging process. By considering the influence of heat generation on capacity attenuation, the particle filter (PF) is exploited to estimate the remaining battery capacity. Reference [10] presents a data-driven diagnostic technique for capacity estimation based on Gaussian process regression (GPR), in which the voltage measurement over a short period of the galvanostatic phase is considered as the model input. In [11], considering the consistency among cells connected in series, the variation characteristics of voltage are extracted from two different cycles by conducting the dynamic time warping algorithm. Based on the extracted feature, a three-step capacity estimation method with the theoretical foundation of shape invariance of the charging voltage is proposed to calculate the capacity difference between two adjacent cells. In these prediction methods, the estimated capacity value serves as an intermediary for other state estimation. Considering the coupling relationship, joint estimation of capacity with other battery states, such as co-estimation of capacity and SOC, is usually carried out sequentially.
For the co-estimation of capacity and SOC, some joint algorithms have attracted attention due to their satisfactory precision and robustness [12]. Based on effective electrical models such as the equivalent circuit model (ECM), a number of advanced filter algorithms (such as the Kalman filter (KF) and PF) can be adopted to conduct the joint estimation of battery status and model parameters [13]. In [14], a second-order resistance-capacitance (RC) ECM is established, and the square root cubature KF is employed to estimate the SOC. Meanwhile, the capacity, as one of the key parameters of the model, is identified by the genetic algorithm (GA). Reference [15] proposes a multiscale dual H-infinity filter to estimate the SOC and capacity of the battery in real time, with different timescales for the slowly varying battery capacity and the fast varying battery state. To address the different variation rates of model parameters, [16] presents a joint algorithm that integrates the KF and the recursive least square (RLS) method to estimate SOC and capacity, in which the model parameters are adaptively updated by a vector-type RLS. To enhance the estimation precision, [17] constructs a serially connected battery pack model based on a second-order RC ECM of the cell. Then, a multiscale extended KF algorithm is employed to accurately estimate the SOC, model parameters and capacity of a single cell in battery packs.
In addition to SOC, accurate estimation of capacity, as mentioned above, is also important for battery health diagnosis. As an indicator of the battery degradation status, SOH is usually indexed by the ratio of the current maximum usable capacity to the rated value [18]. In [19], a fusion method incorporating partial incremental capacity (IC) analysis and a dual GPR model is proposed to estimate the SOH of lithium-ion batteries. To improve the SOH estimation accuracy and reliability, [20] extracts four feature vectors representing the degradation status of the battery from the charging voltage curves. The SOH prediction is then attained via the well-tuned GPR model with the extracted features as the inputs. By incorporating the critical features derived from the battery operation data set, [21] proposes a real-time estimator for remaining useful life (RUL) prediction based on the support vector machine (SVM) model. In [22], the SVM model is constructed with a radial basis function kernel, and the feature variables are extracted from partial charging voltage curves to construct the training dataset. In addition, the kernel parameters of the SVM model are optimized by the grid search method. Reference [23] leverages the conjugate gradient method and multi-island GA to optimize the hyperparameters of the GPR model, and the characteristic parameters of the constant-current charging process are extracted as the health features by the IC analysis method. On this basis, the SOH estimation is attained by combining the extracted features and the optimized GPR model. In short, all of these data-driven methods need health features to establish a mapping relationship between SOH and the feature variables. In other words, a reliable SOH estimation strongly depends on proper feature extraction [24].
However, lithium-ion battery degradation is continuous and generally involves hundreds to thousands of cycles, and the later degradation evolution is highly related to the earlier degradation information throughout these cycles. Moreover, the health features extracted from the charging and discharging profiles also show a specific variation trend with aging. These variables can be regarded as a time series signal, of which the current values may exhibit long-term dependencies on historical values. Nevertheless, the conventional data-driven methods, such as SVM and GPR, are inefficient at learning the long-term dependencies; thus, it remains challenging to maintain high estimation accuracy for long-term capacity prediction [25].
Presently, deep learning networks have received wide attention and have been progressively applied in language modeling [26] and image recognition [27]. As a kind of deep learning network, the long short-term memory recurrent neural network (LSTM-RNN) is employed to solve problems with long-term dependences. The LSTM-RNN can retain the key information from the degradation data via effective learning of long-term dependence based on its specific gates [28]. Given the long-term characteristic of battery degradation, the LSTM-RNN may be a suitable solution to learn the long-term degradation trend of capacity variation. Reference [29] exploits the LSTM-RNN to learn the long-term dependence of the degraded capacity of a supercapacitor, and the experimental results show that the LSTM-RNN can predict the RUL of the supercapacitor on the remaining testing data with a 2.61% root mean square error (RMSE). Reference [30] employs the LSTM-RNN to predict the RUL of lithium-ion batteries. The elastic mean squared back-propagation algorithm and Monte Carlo simulation are respectively applied to adaptively optimize the network and generate a probabilistic RUL prediction. However, the LSTM-RNN in [29], [30] is trained based on the historical capacity degradation data, and then one- and multi-step forward RUL prediction is performed. In addition, [29] reveals that the capacity degradation trajectory of lithium-ion batteries is approximately linear, so the decline rate of capacity over the whole cycle life is similar. Nonetheless, the degradation rate of the battery is significantly different in the beginning and ending stages of the whole lifespan. Therefore, the prediction accuracy of capacity based on only partial degradation data for model training needs to be further analyzed. Furthermore, when only incomplete offline data is available, whether the LSTM-RNN can still accurately predict the remaining battery capacity over the whole lifespan needs to be investigated and validated.
Motivated by this, the capacity prediction of lithium-ion batteries based on the LSTM-RNN is carefully conducted. Firstly, the LSTM-RNN is optimized based on the Adam optimization algorithm, and the dropout technique is employed to prevent the network from overfitting. Then, the optimized LSTM-RNN is exploited to achieve the capacity prediction of lithium-ion batteries. Subsequently, this paper validates and compares the capacity prediction effectiveness of the LSTM-RNN for lithium-ion batteries from the following four aspects. (1) The influence of aging factors on the performance of capacity prediction for lithium-ion batteries based on the LSTM-RNN is discussed. (2) The capacity prediction results based on the LSTM-RNN are compared with the prediction results of SVM, GPR and Elman NN. (3) With one battery's whole cycle life data used for model training, the other batteries' data are used to examine the prediction performance of the built LSTM-RNN model. (4) The leave-one-out cross validation (LOOCV) method is applied to evaluate the performance of the LSTM-RNN with the aging factors as model inputs.
The remainder of this paper is structured as follows. The battery life cycle test is introduced, and the experimental data are analyzed in Section II. Section III illustrates the detailed capacity prediction process of lithium-ion batteries based on LSTM-RNN. The validation and comparison of prediction results are elaborated in Section IV, and Section V concludes the study.

II. BATTERY AGING EXPERIMENT AND DATA ANALYSIS
In this section, the cycle life experiment and the degradation data acquired from a large cycling data repository are introduced. Based on the cycling data, the aging factors are extracted to represent the battery capacity variation. After that, the framework and process of capacity prediction are illustrated in a schematic diagram.

A. BATTERY AGING EXPERIMENT AND CAPACITY DEGRADATION DATA
In this study, the cyclic aging data of lithium-ion batteries are obtained from an open-source experimental data repository [31], which collects the cycle life tests of a variety of commercial LFP/graphite batteries (nominal capacity of 1.1 Ah and rated voltage of 3.3 V). The upper and lower cut-off voltages of the battery are 3.6 V and 2.0 V, respectively. The charging policy follows the form C1(Q1)-C2, where C1 and C2 denote the first and second constant current stages, and Q1 denotes the SOC at which the current changes. The second current step ends at 80% SOC, after which the cell is charged in 1C constant current (CC)-constant voltage (CV) mode (1C corresponds to the rated capacity, i.e., a current of 1.1 A), and the cells are discharged with 4C current. During the experiment, the surface temperature and internal resistance are measured and recorded. Note that the internal resistance measurement is conducted during charging at 80% SOC by imposing 10 pulses of ±3.6C current with a duration of 33 ms [31]. Four cells' data (labeled as Cells 1 to 4) are selected from this dataset to investigate the performance and effectiveness of the LSTM-RNN model for capacity prediction.
The capacity degradation curves are shown in Fig. 1, which highlights that the degradation trajectories of the four cells remain almost the same, indicating that the degradation mechanism is nearly consistent for the same type of lithium-ion batteries. The cycle life experiments for all batteries are terminated when the batteries reach 80% of nominal capacity, i.e., 0.88 Ah. It can also be found that the degradation slope is relatively flat before 90% SOH. However, when the SOH drops below 90%, the capacity degradation shows an exponential decline trend with a faster dropping speed. Besides, while the electric characteristics gradually deteriorate during the aging process, the thermal characteristics of batteries also vary with aging [32]. Next, the aging factors will be extracted from the variation of the electric and thermal characteristics of the battery.
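As a quick arithmetic check of the test conditions quoted above, the C-rates can be converted to currents and the end-of-test capacity recomputed. This is only an illustrative sketch; the function names are hypothetical and not part of the original work.

```python
NOMINAL_CAPACITY_AH = 1.1  # nominal capacity stated in the text


def c_rate_to_current(c_rate, capacity_ah=NOMINAL_CAPACITY_AH):
    """Return the current in amperes for a C-rate (1C empties the nominal capacity in one hour)."""
    return c_rate * capacity_ah


def end_of_test_capacity(capacity_ah=NOMINAL_CAPACITY_AH, fraction=0.8):
    """Capacity threshold at which the cycling test is terminated (80% of nominal in the text)."""
    return fraction * capacity_ah


if __name__ == "__main__":
    for label, rate in [("CC-CV charge", 1.0), ("discharge", 4.0), ("IR pulse", 3.6)]:
        print(f"{label}: {rate}C = {c_rate_to_current(rate):.2f} A")
    print(f"end-of-test capacity: {end_of_test_capacity():.2f} Ah")
```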

B. AGING FACTORS EXTRACTION AND ANALYSIS
From the perspective of electric characteristics, one main change during degradation is that the internal resistance (IR) gradually increases. During the battery aging process, the formation and thickening of the solid electrolyte interphase (SEI) film, the cathode electrolyte interface (CEI) formation and the internal structure disordering can lead to the increase of IR. However, these phenomena cannot be measured directly; by contrast, the measured IR, as a representative variable, varies in a nonlinear manner related to the capacity degradation [32]. As can be seen from Fig. 2, the IR shows no significant variation in the cycle range [1, 800] but follows an exponential variation trend after cycle 800. To some extent, the IR increase reflects the capacity degradation in an inversely proportional manner. In other words, the more obviously the IR increases, the faster the capacity declines. Therefore, the battery IR, denoted by $F_1$, can be selected as an aging factor.
Considering the battery's thermal characteristics, the surface temperature at each moment is recorded during the experiment. Due to the IR increase and the loss of contact of active materials caused by current collector corrosion, binder decomposition and electrolyte loss, the generation of ohmic heat and the heat distribution inside the battery differ greatly under the same charge/discharge C-rate as the battery ages [33]. Fig. 3 shows the temperature variation curves of Cell 1 at different cycles. It can be seen that the temperature shows an increasing trend with the cycle number. The average temperature of each cycle is therefore calculated to analyze the thermal characteristics of the battery. The variation of the average temperature with the cycle number is shown in Fig. 4. It is clearly observed that the average temperature increases progressively as the cycling experiment proceeds. Based on the variation of capacity and average temperature with the cycle number, it can be concluded that the average temperature can also represent the capacity degradation. Consequently, the average temperature of each cycle can be selected as another aging factor, denoted by $F_2$.
Apart from the IR and temperature, aging factors can also be extracted from the charge/discharge voltage profiles. Since the experimental battery in this study is discharged with constant current, the incremental capacity analysis is conducted on the discharging process. The discharging incremental capacity (DIC) curves at different cycles for Cell 1 are shown in Fig. 5. In addition, Fig. 6 shows the variation of the absolute peak value with the cycle number for Cells 1 to 4. As can be seen, the absolute value of the peak decreases gradually with the increase of cycle number, implying that the absolute value of the DIC peak can effectively characterize the battery degradation. Hence, the absolute value of the DIC peak can also be considered as one aging factor, denoted by $F_3$.
Next, the implicit relationships between the aging factors and capacity will be analyzed.
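The per-cycle extraction of the average temperature and the DIC peak can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function names, the `min_dv` voltage step (used here in place of the interpolation mentioned later in the paper) and the sample values are all assumptions.

```python
def average_temperature(temperatures):
    """Mean surface temperature over the samples recorded in one cycle."""
    return sum(temperatures) / len(temperatures)


def dic_peak_abs(voltage, capacity, min_dv=1e-3):
    """Largest |dQ/dV| over one constant-current discharge.

    The derivative is approximated by finite differences between samples
    whose voltages differ by at least min_dv volts, which avoids dividing
    by a near-zero voltage step on flat parts of the curve (assumption).
    """
    peak = 0.0
    last = 0  # index of the last sample used as a difference anchor
    for k in range(1, len(voltage)):
        dv = voltage[k] - voltage[last]
        if abs(dv) >= min_dv:
            dq = capacity[k] - capacity[last]
            peak = max(peak, abs(dq / dv))
            last = k
    return peak


if __name__ == "__main__":
    # Synthetic samples for illustration only (not measured data).
    volts = [3.4, 3.3, 3.2, 3.1]
    discharged_ah = [0.0, 0.1, 0.6, 0.7]
    temps_c = [30.0, 31.0, 33.0, 34.0]
    print("average temperature:", average_temperature(temps_c))
    print("absolute DIC peak:", dic_peak_abs(volts, discharged_ah))
```

A finer voltage grid or smoothing would normally be needed on measured data, since numerical differentiation amplifies measurement noise.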

C. CORRELATION ANALYSIS OF AGING FACTORS BASED ON GRA
As discussed previously, the IR, the average temperature and the absolute value of the DIC peak, denoted as $F_1$, $F_2$ and $F_3$, are selected as the aging factors to characterize the capacity degradation. To further analyze the relationship between the aging factors and capacity, Cell 1 is taken as an example, and the variation relationships between the aging factors and capacity with respect to cycle life are shown in Fig. 7 and Fig. 8, where the color scale represents the cycle life. As can be found, $F_1$ and $F_2$ increase and $F_3$ decreases with the capacity degradation. Additionally, in the early and middle phases of cycle life (1 to 800 cycles), the capacity degrades slowly and $F_1$ remains almost unchanged; in contrast, $F_2$ increases obviously and $F_3$ gradually decreases with the increase of cycle number. Comparatively, in the ending phase of cycle life (800 to 1100 cycles), the capacity degradation and the increase of $F_1$ are faster, and the increase rate of $F_2$ becomes slower and more stable; however, $F_3$ still decreases obviously. It can be concluded that the change of $F_1$ is not obvious, whereas the variation of $F_2$ is relatively larger in the early cycle life. In the later cycle life stage, the changes of $F_1$ and $F_2$ are opposite to those of the early stage. Furthermore, $F_3$ changes obviously throughout the whole cycle life. To sum up, a mapping relationship between the aging factors and capacity exists in the different cycle life phases.
In this study, the correlation between the aging factors and battery capacity is further evaluated by grey relational analysis (GRA). As a crucial method based on grey system theory, GRA evaluates the correlation among elements according to the similarity and dissimilarity of their variation trends. The quantitative analysis based on GRA obtains the correlations between the reference and comparative sequences, as detailed in [34].
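For readers unfamiliar with GRA, the following sketch computes grey relational grades in one commonly used form. It is not taken from [34]: the min-max normalization, the distinguishing coefficient `rho = 0.5` and the function names are assumptions, and the exact variant used in the paper may differ.

```python
def min_max(sequence):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(sequence), max(sequence)
    if hi == lo:
        return [0.0 for _ in sequence]
    return [(value - lo) / (hi - lo) for value in sequence]


def grey_relational_grades(reference, comparisons, rho=0.5):
    """Return one grey relational grade per comparison sequence.

    reference   : reference sequence (for example, capacity per cycle)
    comparisons : list of comparative sequences (for example, aging factors)
    rho         : distinguishing coefficient, assumed to be 0.5 here
    """
    ref = min_max(reference)
    deltas = [
        [abs(r - c) for r, c in zip(ref, min_max(comp))]
        for comp in comparisons
    ]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every comparison sequence coincides with the reference.
        return [1.0 for _ in comparisons]
    grades = []
    for row in deltas:
        coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coefficients) / len(coefficients))
    return grades


if __name__ == "__main__":
    # Synthetic sequences for illustration only.
    reference = [1.0, 2.0, 3.0, 4.0]
    same_trend = [2.0, 4.0, 6.0, 8.0]
    opposite_trend = [4.0, 3.0, 2.0, 1.0]
    print(grey_relational_grades(reference, [same_trend, opposite_trend]))
```

With this normalization, a factor that moves opposite to the reference receives a low grade, so factors that rise while capacity falls would need to be inverted or normalized differently before comparison.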
Through GRA, the correlation grades between the aging factors and the capacity of each cell are acquired, as shown in Table 1. In particular, the correlation grade of $F_2$ is greater than 0.75 for all the cells, which indicates that the selected aging factors are feasible for capacity estimation.
Fig. 9 shows the framework and flowchart of capacity prediction based on the LSTM-RNN model. As can be seen, the whole prediction process contains the experimental data processing, the model construction and the capacity prediction modules. In the data processing module, the aging factors data set $X_i = [F_{1i}, F_{2i}, F_{3i}]$, where the subscript $i$ represents the cycle number, is structured based on the characteristic features extracted from the aging experimental data. The sample set is divided into the training set and the test set. In the model construction and optimization module, the architecture and network layers of the LSTM-RNN are first designed, and the model parameters are initialized. Then, the aging factors data set $X_i$ and the corresponding capacity value $y_i$ in the training set are considered as the LSTM-RNN model's input and output, respectively. The optimal model parameters are searched via testing and cross validation. In the capacity prediction and error analysis module, similarly, the aging factors data set $X_i^*$ in the test set is fed into the well-tuned model, and the output $\hat{y}_i$ is collected as the predicted value of battery capacity. By calculating the evaluation criteria and comparing the predicted value $\hat{y}_i$ with the observed value $y_i^*$, the prediction effectiveness of the LSTM-RNN model is assessed.

III. METHODOLOGIES
This section elaborates the mechanism and derivation of the model and algorithms applied for the capacity prediction, including the LSTM-RNN, the Adam optimization algorithm and the dropout technique. In addition, the evaluation criteria and the LOOCV method used to evaluate the performance of the LSTM-RNN based capacity estimation algorithm are addressed.

A. THE ARCHITECTURE OF LSTM-RNN
LSTM-RNN is a kind of specialized RNN for solving the vanishing gradient and gradient explosion problems associated with long-term dependency [35]. Compared with the simple RNN, the LSTM-RNN adds a state $c$ in the hidden layer to keep the long-term state, and this newly added state $c$ is called the cell state [29]. The structure of the LSTM-RNN is shown in Fig. 10. Note that the subscript $t$ of each vector represents the moment state, which reflects the generality of LSTM-RNN applications. For the capacity prediction of lithium-ion batteries, the moment state means the cycle number. At moment $t$, there are three inputs for the LSTM-RNN: the input variable $x_t$ of the network at the current time, the output value $h_{t-1}$ and the cell state $c_{t-1}$ of the previous step. Meanwhile, the LSTM-RNN has two outputs: the output value $h_t$ and the cell state $c_t$ at the current moment $t$.
Similar to classic RNNs, the LSTM-RNN is composed of the input layer, hidden layer and output layer. However, the hidden layer in the LSTM-RNN has a specialized memory mechanism instead of a general neuron. The internal state of the LSTM-RNN at moment $t$ is called $c_t$; it is critical to the network and sits at the heart of each neuron, where it is linearly activated. The internal state can be regarded as a carrier to which information can be added or from which information can be removed. This information processing is carefully regulated by the so-called gates [30]. The gate is a distinctive feature of the LSTM-RNN and is in fact a fully connected layer. There are three gates in an LSTM-RNN architecture, namely, the input gate, the forget gate and the output gate. Any read or modification operation can be achieved by controlling these three gates. Additionally, the information selection of a gate is mainly conducted by the sigmoid function, the tanh function or matrix multiplication [24].
It can be seen from Fig. 10 that the first step of applying the LSTM-RNN is to decide what information should be discarded by the forget gate, which reads $h_{t-1}$ and $x_t$ and outputs a value $f_t$ between 0 and 1. The upper bound 1 indicates that the information should be kept entirely, whereas the lower bound 0 means that it should be discarded completely. The next step is to determine what information should be stored in the cell state. One part of the input gate, called the sigmoid layer, produces $i_t$ and decides what information should be updated; another part, called the tanh layer, creates the candidate vector $a_t$, which is added to the current cell state. Finally, by means of the updated cell state $c_t$ and the value $o_t$ of the output gate, the output of the LSTM-RNN can be calculated. Based on the previous output $h_{t-1}$ and the current input $x_t$, the state values of the three gates and the candidate vector $a_t$ can be formulated as:
\[
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
a_t &= \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right) \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)
\end{aligned}
\]
where $h_{t-1}$ is the output of the previous step, $x_t$ is the current cell input, $\sigma$ represents the sigmoid function, $W_f$ is the weight matrix of the forget gate, and $b_f$ is the bias of the forget gate; $W_i$ and $W_c$ denote the weight matrices of the sigmoid layer and tanh layer of the input gate, respectively; $b_i$ and $b_c$ represent the biases of the sigmoid layer and tanh layer of the input gate; $W_o$ and $b_o$ denote the weight matrix and bias of the output gate. Once the state values of each gate are determined, the current cell state $c_t$ and the output of the LSTM-RNN can be calculated as:
\[
\begin{aligned}
c_t &= f_t \odot c_{t-1} + i_t \odot a_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
\]
where $\odot$ denotes element-wise multiplication.
Based on the above discussion, the LSTM-RNN can learn the long-term dependences of capacity degradation and perform one- or multi-step forward prediction. Next, the training algorithm used to search the optimal weight matrices and biases for capacity prediction will be detailed.
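The forget, input and output gate computations described above can be sketched as a single forward step in plain Python. This is a minimal illustration of a standard LSTM cell, not the authors' implementation; the dictionary layout of `params`, the helper names and the toy dimensions are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Return W v + b, with W given as a list of rows."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new output h_t and cell state c_t."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(u) for u in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i_t = [sigmoid(u) for u in affine(params["W_i"], z, params["b_i"])]    # input gate
    a_t = [math.tanh(u) for u in affine(params["W_c"], z, params["b_c"])]  # candidate vector
    o_t = [sigmoid(u) for u in affine(params["W_o"], z, params["b_o"])]    # output gate
    c_t = [f * c + i * a for f, c, i, a in zip(f_t, c_prev, i_t, a_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def zero_params(hidden_size, input_size):
    """All-zero weights and biases, used only to keep the example deterministic."""
    rows = lambda: [[0.0] * (hidden_size + input_size) for _ in range(hidden_size)]
    params = {}
    for gate in ("f", "i", "c", "o"):
        params["W_" + gate] = rows()
        params["b_" + gate] = [0.0] * hidden_size
    return params


if __name__ == "__main__":
    params = zero_params(hidden_size=2, input_size=1)
    h, c = lstm_step([0.7], [0.0, 0.0], [1.0, -2.0], params)
    print(h, c)
```

With all-zero parameters every gate equals 0.5 and the candidate vector is zero, so the new cell state is simply half of the previous one, which makes the step easy to check by hand.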

B. OPTIMIZATION TRAINING FOR LSTM-RNN
In this study, the Adam optimization algorithm is employed to optimize the parameters of the LSTM-RNN. The Adam algorithm is a first-order gradient-based method for optimizing stochastic objective functions based on adaptive estimates of lower-order moments. Compared with traditional stochastic gradient descent algorithms, it offers higher computational efficiency, lower memory occupation, less tuning effort and advantages in solving large-scale parameter optimization problems. Reference [36] experimentally validates that the Adam algorithm is more efficient in solving deep learning problems than the RMSprop [37] and AdaGrad [38] algorithms. The parameter updating process of the Adam algorithm is detailed as follows. Firstly, at step $t$, the gradient of the optimization objective is calculated as:
\[
g_t = \nabla_{\theta} J(\theta_{t-1})
\]
where $J(\theta)$ represents the objective function with parameters $\theta$, and $g_t$ denotes the gradient evaluated at $\theta_{t-1}$. At step $t$, the exponential moving averages of the gradient and the squared gradient, $m_t$ and $v_t$, are respectively calculated as:
\[
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2
\end{aligned}
\]
where $\beta_1$ and $\beta_2$ denote the exponential decay factors that weight the past gradients and the past squared gradients, respectively. In general, the initial values $m_0$ and $v_0$ are set to zero, so $m_t$ and $v_t$ are biased toward zero in the initial stage of the training process. Thus, a correction is applied to reduce the training error, as:
\[
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}
\]
where $\hat{m}_t$ and $\hat{v}_t$ denote the corrected values of $m_t$ and $v_t$. The parameters are updated as:
\[
\theta_t = \theta_{t-1} - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \varepsilon}
\]
where $\alpha$ denotes the learning rate, and $\varepsilon$ is a smoothing term that prevents the denominator from being zero. The parameters of the Adam algorithm are set to $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\alpha = 0.001$ and $\varepsilon = 10^{-8}$.
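The update procedure described above can be sketched for a plain list of parameters as follows. This follows the standard Adam update with the hyperparameter values quoted in the text; the function name and the quadratic toy objective are assumptions for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step counter.

    Returns the updated parameters together with the updated first and
    second moment estimates, all as new lists.
    """
    m = [beta1 * mi + (1.0 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(v, grad)]
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]  # bias-corrected first moment
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]  # bias-corrected second moment
    theta = [
        th - alpha * mh / (math.sqrt(vh) + eps)
        for th, mh, vh in zip(theta, m_hat, v_hat)
    ]
    return theta, m, v


if __name__ == "__main__":
    # Toy objective J(theta) = (theta - 3)^2 with gradient 2 * (theta - 3).
    theta, m, v = [1.0], [0.0], [0.0]
    for step in range(1, 6):
        grad = [2.0 * (theta[0] - 3.0)]
        theta, m, v = adam_step(theta, grad, m, v, step)
        print(step, theta[0])
```

Because of the bias correction, the very first update has a magnitude close to the learning rate regardless of the gradient scale.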

C. DROPOUT TECHNOLOGY TO PREVENT LSTM-RNN FROM OVERFITTING
Overfitting refers to the situation in which a model fits the training data set well but shows an inferior fitting effect on the test data set [29]. To address this issue, the dropout technique is employed to prevent the LSTM-RNN from overfitting [39]. Generally, the error back-propagation method is applied to iteratively adjust the parameters for each mini-batch in the RNN training process. The key idea of the dropout technique is that it removes neurons from the layers of the RNN during the training process to prevent the model from overfitting. The removed neurons, along with all their connections, are temporarily discarded from the network, as shown in Fig. 11. It is essentially a random process in which neurons are stochastically selected for removal; each neuron is retained with a fixed probability $p$, which is set to 0.4 in this paper. It can be seen from Fig. 11 (b) that applying the dropout technique to the NN model is equivalent to sampling a condensed network from it. The condensed network consists of all remaining neurons and their connections after the discarded neurons are removed. Hence, training a neural network with dropout can be regarded as training many condensed networks with extensive weight sharing, where each condensed network is rarely trained [30]. In this manner, the network becomes less sensitive to the specific weights of neurons, which in turn gives the network a better generalization capability.
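The random retention of neurons can be illustrated with a short sketch. The text only specifies the retention probability of 0.4; the rescaling of the kept activations by `1 / keep_prob` (often called inverted dropout) and the function signature are assumptions of this example.

```python
import random


def dropout(activations, keep_prob=0.4, training=True, rng=random):
    """Randomly zero activations during training.

    Each activation is kept with probability keep_prob and divided by
    keep_prob so that its expected value is unchanged (an assumption; the
    paper does not state which scaling convention is used). At test time
    the activations pass through untouched.
    """
    if not training:
        return list(activations)
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    hidden = [0.2, -0.5, 0.9, 0.1, -0.3]
    print("training:", dropout(hidden, rng=rng))
    print("testing: ", dropout(hidden, training=False))
```

Each call during training samples a different condensed network, which is the weight-sharing view of dropout given in the text.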

D. LEAVE-ONE-OUT CROSS VALIDATION
In this paper, the LOOCV method is employed to verify the performance of the LSTM-RNN for capacity prediction [40]. The schematic diagram of LOOCV applied in this study is illustrated in Fig. 12. The complete feature data set contains four subsets $X_{C1}$, $X_{C2}$, $X_{C3}$ and $X_{C4}$, which are built from the aging factors of the four cells extracted from the experimental data. Each subset is composed of three feature vectors $\{F_1, F_2, F_3\}$, namely the IR, the average temperature and the absolute value of the DIC peak. As illustrated in Fig. 1, the degradation curves of the four cells are similar, indicating that the degradation mechanism is consistent for one type of battery. Therefore, it is feasible to train the model with one cell's data and test it with the other cells' data to validate the prediction effectiveness of the proposed LSTM-RNN model. In this work, one cell's data is taken as the test dataset and the other three cells' data are compiled together as the training dataset, as shown in Fig. 12.
Since the validation datasets are not used in the training process, the trained model can provide an approximately unbiased estimation [13]. The training and test process is repeated four times so that each battery cell is used once as the test dataset, which is equivalent to performing a 4-fold cross validation for the LSTM-RNN model. After each iteration, the prediction error and evaluation criteria are calculated to assess the model performance. Next, the evaluation criteria applied in this study are introduced.
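The four-fold procedure described above amounts to holding out one cell at a time. The following sketch generates those splits; the dictionary layout, the cell names and the placeholder samples are hypothetical.

```python
def leave_one_cell_out(cell_data):
    """Yield (test_cell, train_samples, test_samples) for each cell in turn.

    cell_data maps a cell name to its list of samples; the training set of
    each fold pools the samples of all remaining cells.
    """
    for test_cell in cell_data:
        train = [
            sample
            for name, samples in cell_data.items()
            if name != test_cell
            for sample in samples
        ]
        yield test_cell, train, list(cell_data[test_cell])


if __name__ == "__main__":
    # Placeholder samples: (features, capacity) pairs would go here.
    cells = {
        "Cell 1": ["c1_a", "c1_b"],
        "Cell 2": ["c2_a", "c2_b"],
        "Cell 3": ["c3_a", "c3_b"],
        "Cell 4": ["c4_a", "c4_b"],
    }
    for test_cell, train, test in leave_one_cell_out(cells):
        print(test_cell, len(train), len(test))
```

Splitting by cell rather than by individual sample keeps every cycle of the held-out cell unseen during training.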

E. THE PERFORMANCE EVALUATION CRITERIA
To assess the prediction performance, the mean absolute error (MAE), mean square error (MSE), RMSE and goodness-of-fit $R^2$ are considered as the evaluation criteria. MAE, MSE and RMSE evaluate the average prediction performance, for which a smaller value implies better prediction precision. By contrast, $R^2$, varying within [0, 1], evaluates the correctness of the trained model, and a higher value (closer to 1) indicates that the prediction result is closer to the real value. These four criteria are formulated as:
\[
\begin{aligned}
\mathrm{MAE} &= \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \\
\mathrm{MSE} &= \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
\end{aligned}
\]
where $n$ represents the total sample number; $y_i$ and $\hat{y}_i$ are the real value and predicted value of the target variable for the $i$th sample, respectively; and $\bar{y}$ represents the average of the real values. In the next step, a series of validations are conducted, followed by the detailed comparison and discussions.
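The four criteria can be computed as in the sketch below. MAE is taken here as the mean absolute error, in keeping with the statement that it measures average performance; the function name and the sample values are assumptions.

```python
import math


def evaluate(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Illustrative capacities in Ah, not measured data.
    observed = [1.05, 1.02, 0.98, 0.93]
    predicted = [1.04, 1.03, 0.97, 0.94]
    print(evaluate(observed, predicted))
```

Note that $R^2$ computed this way can fall below zero for a model that fits worse than the mean of the observations, and it is undefined when all observed values are identical.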

IV. RESULTS AND DISCUSSION
In this study, four cells' data are employed to validate the effectiveness of the proposed LSTM-RNN model for capacity prediction. The capacity prediction results under different conditions are discussed, including the influence of aging factors for model inputs, the comparisons of LSTM-RNN with traditional SVM, GPR and Elman NN, as well as the prediction results in terms of different cells' data for training.

A. INFLUENCE OF AGING FEATURES AS MODEL INPUT ON THE PREDICTION PERFORMANCE
To analyze the influence of taking the aging factors as model inputs on the LSTM-RNN model, the historical capacity degradation data and the extracted aging factors are respectively employed as the inputs for model training. When the historical capacity data are taken as model inputs during training, the model output is the observed capacity value of the cycle following the current input cycle. In contrast, when the aging factors are taken as model inputs, the model output is the observed capacity value of the current cycle. Therefore, the lengths of the prediction results differ by one cycle between the two kinds of model input. To keep the prediction results of the different model inputs consistent, the prediction with historical capacity data as model input starts from the last cycle of the training set. In addition, when the LSTM-RNN model executes one- or multi-step forward prediction with historical capacity data as model input, it obtains different predicted values depending on which variable is used to update the network state. For comparison, the observed value and the predicted value of the current cycle are respectively exploited to update the network state for the next cycle's prediction. The prediction results obtained with these two state updates are then compared with the prediction results obtained with the aging factors as model input.
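The difference between updating with the observed value and updating with the predicted value can be illustrated with a simplified rollout. In this sketch the network state is reduced to the last capacity value and `predict_next` is a hypothetical stand-in for the trained model, so it only conveys the control flow, not the behaviour of an actual LSTM-RNN.

```python
def rollout(predict_next, last_training_value, observed_future, use_observed):
    """Predict one cycle at a time over the test horizon.

    predict_next        : callable mapping the latest capacity to the next one
    last_training_value : capacity of the last cycle in the training set
    observed_future     : observed capacities of the test cycles
    use_observed        : True feeds back the observation after each step,
                          False feeds back the model's own prediction
    """
    predictions = []
    latest = last_training_value
    for observed in observed_future:
        predicted = predict_next(latest)
        predictions.append(predicted)
        latest = observed if use_observed else predicted
    return predictions


if __name__ == "__main__":
    # Stand-in model: assumes a fixed fade of 0.01 Ah per cycle (illustrative only).
    fade_model = lambda capacity: capacity - 0.01
    observed = [0.99, 0.97, 0.94]
    print("observed feedback: ", rollout(fade_model, 1.0, observed, True))
    print("predicted feedback:", rollout(fade_model, 1.0, observed, False))
```

Feeding back predictions lets errors accumulate over the horizon, while feeding back observations corrects the input at every cycle.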
Taking Cell 1 as an example, 60% of the cycle life data is employed for model training, and the remaining 40% is utilized for test. The predicted results and corresponding errors are shown in Figs. 13 and 14. When the predicted value is employed to update the network state, all the predicted results remain almost the same, indicating that the model cannot identify the degradation pattern in this case. When the observed value is employed to update the network state, the predicted results show a slight degradation trend in the global view but distinctly deviate from the observed capacity degradation trajectory. When the aging factors are taken as the model input, the capacity degradation trend is well tracked by the LSTM-RNN model, and the maximum prediction error is less than 2%, as shown in Fig. 14. Note that the battery degradation can be divided into two stages according to the capacity decline rate in this study. One stage is a linear degradation with a slower decline rate, e.g., the cycle range [1, 800], and the other stage is an exponential degradation with a faster decline rate, such as cycles 800 to 1100. The experimental results show that when the degradation rate of capacity differs distinctly between the early and later periods of cycle life, the LSTM-RNN cannot identify the battery degradation pattern with the historical capacity data as the model input. However, as long as effective aging factors such as the IR and temperature of the battery can be extracted as the input for model training, the LSTM-RNN can predict the remaining capacity of the battery with good accuracy and strong robustness.
To further analyze the influence of aging factors on the capacity prediction, the absolute value of the DIC peak is extracted as one aging factor, namely, ... trajectory. As can be seen from Fig. 16 (b), when F3 is added as the model input, the maximum prediction error is 1.78%, which is not much lower than the maximum error with only F1 and F2 as the model inputs. Moreover, when the model input is only F3, the maximum prediction error reaches 3.02%, as shown in Fig. 16 (c). Note that the extraction of F3 requires differential and interpolation calculation, significantly increasing the computation burden and the algorithm's complexity. Furthermore, constant current charging/discharging operations are rarely encountered in practical applications. Compared with F1 and F2, which can be directly measured, the extraction of F3 is more complex. To sum up, the subsequent discussion of capacity prediction in this study is based on only F1 and F2 as the model inputs.
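To illustrate why extracting a DIC peak involves interpolation and differentiation, the following sketch resamples a charge-voltage curve onto a uniform voltage grid and takes central differences. It assumes the DIC curve is the derivative dQ/dV of charge with respect to voltage and that the voltage samples are strictly increasing; the function names, the grid size and the synthetic tanh-shaped curve are hypothetical, and real measurements would normally need smoothing as well.

```python
import math


def interpolate(x, xs, ys):
    """Linear interpolation of the samples (xs, ys) at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the sampled range")


def dic_peak(voltage, charge, n_grid=201):
    """Largest |dQ/dV| on a uniform voltage grid, using central differences."""
    v_min, v_max = voltage[0], voltage[-1]
    dv = (v_max - v_min) / (n_grid - 1)
    grid = [v_min + k * dv for k in range(n_grid)]
    grid[-1] = v_max  # guard against rounding past the last sample
    q = [interpolate(v, voltage, charge) for v in grid]
    dqdv = [(q[k + 1] - q[k - 1]) / (2 * dv) for k in range(1, n_grid - 1)]
    return max(abs(d) for d in dqdv)


# Synthetic curve Q(V) = tanh(10 (V - 3.7)); its exact slope peaks at 10 for V = 3.7.
voltage = [3.4 + 0.001 * i for i in range(601)]
charge = [math.tanh(10 * (v - 3.7)) for v in voltage]
peak = dic_peak(voltage, charge)
```

Even in this noise-free case, every cycle requires a full resampling and differencing pass, whereas the IR and average temperature are available directly from measurement.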

B. COMPARISON OF PREDICTION RESULTS WITH DIFFERENT METHODS
To further evaluate the performance of the LSTM-RNN model, the single GPR, SVM and Elman NN algorithms are respectively applied for the capacity prediction of Cell 2. For the sake of fair comparison, 60% of the cycle data (cycles 1 to 686) are utilized to train each model, and the remaining 40% (cycles 687 to 1144) are employed to verify the precision. The aging factors are taken as the model inputs, and the predicted results and errors are shown in Figs. 17-18 and Table 2. It can be seen from Fig. 17 that the LSTM-RNN precisely tracks the degradation trajectory of capacity over the whole test dataset and achieves high prediction accuracy. Although the other three methods can roughly reflect the variation trend of capacity, their prediction errors are far larger than that of LSTM-RNN. From ... During the model training and optimization process, the time consumed by each iteration is recorded, and the average time consumed by each method in model training is calculated, as shown in Table 2. As can be seen, the time cost of Elman NN is the shortest, at 32.93 s, followed by the LSTM-RNN at 56.46 s. Owing to the calculation of kernel functions and the optimization of complex hyperparameters, the SVM and GPR respectively cost 193.55 s and 153.56 s for model training, which are much longer than that of LSTM-RNN. It is worth noting that the capacity degradation rate gradually increases during the cycling experiment, and the prediction errors of GPR, SVM and Elman NN also gradually increase, as shown in Fig. 18. The results indicate that the GPR, SVM and Elman NN are not well suited to time series prediction with large sample data and long-term dependence. To sum up, the proposed LSTM-RNN algorithm exhibits higher prediction accuracy than the three compared methods, trains faster than SVM and GPR, and shows more robustness in predicting capacity degradation with long-term dependence.
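The 60%/40% division of Cell 2 stated above can be reproduced with a simple chronological split. The sketch below is illustrative only: the function name is hypothetical, and rounding the training length down is an assumption that happens to match the cycle numbers given in the text.

```python
def chronological_split(n_cycles, train_fraction=0.6):
    """Split cycles 1..n_cycles into an early training range and a later test range."""
    n_train = int(n_cycles * train_fraction)  # assumption: round down
    train_cycles = range(1, n_train + 1)
    test_cycles = range(n_train + 1, n_cycles + 1)
    return train_cycles, test_cycles


# Cell 2 has 1144 cycles according to the text.
train_cycles, test_cycles = chronological_split(1144)
```

No shuffling is applied, so the test range always lies later in the cycle life than the training range, which is the setting in which long-term dependence matters.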

C. CAPACITY PREDICTION WITH SINGLE BATTERY DATA
To further validate the performance of LSTM-RNN for capacity prediction, the experimental data of another three cells, i.e., Cells 5, 6, and 7, are analyzed. Similarly, 60% of the cycle data are employed to train the model for each single battery; in other words, the prediction of Cells 5 to 7 starts at cycles 617, 554 and 564, respectively. Note that only the IR and average temperature are used as the aging factors for the LSTM-RNN model input. The prediction results and errors of Cells 5 to 7 are shown in Figs. 19 to 21 and listed in Table 3. As can be seen, the maximum prediction error of these three batteries is 2.54%, which is acceptable for capacity prediction. It can also be seen from Figs. 19 to 21 that the prediction error gradually increases as the capacity degrades but declines quickly at the EOL of the battery. The prediction error reveals that the LSTM-RNN can predict the battery capacity well over the whole cycle lifespan with the aging factors as the model inputs. In addition, the MSE values of these three cells are 1.29 × 10⁻⁴, 5.24 × 10⁻⁵ and 2.22 × 10⁻⁴, and the RMSE values are 1.14%, 0.72% and 1.49%, respectively, showing that the proposed LSTM-RNN model achieves good prediction performance. The R² values of Cells 5 to 7 are respectively 0.9704, 0.9797 and 0.9241, which illustrates that the predicted values are consistent overall with the real capacity. To sum up, by using the aging factors as the model inputs, the LSTM-RNN can predict the battery capacity with good accuracy.
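As a quick arithmetic check, the reported RMSE values should equal the square roots of the reported MSE values. The sketch below assumes that the MSE is computed on a normalized (fractional) capacity error, so that its square root multiplied by 100 gives the RMSE in percent; the variable names are hypothetical.

```python
import math

# (cell, reported MSE, reported RMSE in percent), taken from the text above.
reported = [
    ("Cell 5", 1.29e-4, 1.14),
    ("Cell 6", 5.24e-5, 0.72),
    ("Cell 7", 2.22e-4, 1.49),
]

# RMSE in percent derived from each reported MSE.
derived = {cell: 100 * math.sqrt(mse) for cell, mse, _ in reported}

# True when every derived value agrees with the reported one to two decimals.
consistent = all(abs(derived[cell] - rmse) < 0.005 for cell, _, rmse in reported)
```

Under this assumption the three pairs agree to the two decimal places given in the text.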

D. CAPACITY PREDICTION WITH MULTIPLE BATTERY DATA
To analyze the capability of the LSTM-RNN model to identify the degradation mechanism of batteries of the same type, we employed the whole cycle life data of Cell 1 as the training data and the other cells' data for test. Fig. 22 and Table 4 present the prediction results and corresponding errors. As shown in Fig. 22, the capacity degradation trajectories of ... Table 4, we can find that the MSE and RMSE of the Cell 3 estimation are the largest, i.e., 3.61 × 10⁻⁵ and 0.60%, whereas its R² is the smallest, at 0.9690. It can be seen from Fig. 2 that the IR value of Cell 3 is the largest among the four cells when reaching its EOL, about 0.001 ohm larger than that of the other three cells. It can also be found from Fig. 4 that the average temperature of Cell 3 is the lowest at every cycle. These slight differences in the aging factors result in a larger capacity prediction error for Cell 3 than for Cells 2 and 4. Nevertheless, the prediction errors of Cells 2 and 4 are mostly less than 1%, except at some individual points where the error is relatively larger. The MSE and RMSE of Cells 2 and 4 are less than 2.70 × 10⁻⁵ and 0.51%, which can be regarded as good accuracy for capacity prediction. Moreover, the R² values of Cells 2 and 4 are 0.9872 and 0.9835, demonstrating that the predicted values are highly consistent with the observed values. In summary, the experimental results show that even when only the complete cycle life data of one cell are employed to train the model, the LSTM-RNN can still accurately predict the capacity of other batteries of the same type.

E. THE VALIDATION OF CAPACITY PREDICTION BASED ON LOOCV
According to the LOOCV principle shown in Fig. 12, the data of the four cells are randomly combined and divided into four data groups, each of which contains a training dataset and a test dataset. In this study, the training dataset is assembled from three cells' data, and the remaining cell's data is utilized for test. This validation process is repeated four times until each cell has been employed for test in turn. Therefore, a 4-fold cross validation is performed for the LSTM-RNN model.
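The grouping described above can be sketched as a leave-one-cell-out loop. The function name and the cell labels are hypothetical placeholders, and model training and evaluation are left out.

```python
def leave_one_out_folds(cells):
    """Return one (training_cells, test_cell) pair per cell, holding each cell out once."""
    folds = []
    for held_out in cells:
        training = [cell for cell in cells if cell != held_out]
        folds.append((training, held_out))
    return folds


folds = leave_one_out_folds(["Cell 1", "Cell 2", "Cell 3", "Cell 4"])
# folds[2] trains on Cells 1, 2 and 4 and tests on Cell 3.
```

With four cells this yields four folds, each training on three cells and testing on the remaining one, which corresponds to the 4-fold cross validation mentioned in the text.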
The validation results of capacity prediction are shown in Fig. 23, and the corresponding errors are illustrated in Fig. 24 and Table 5. As can be seen from Fig. 24, the maximum prediction errors of Cells 1, 2 and 4 are lower than 2%, whereas that of Cell 3 reaches 2.84%. The MSE, RMSE and R² of Cell 3 are 2.84 × 10⁻⁵, 0.50% and 0.9787, respectively. It is worth noting that the prediction error of Cell 3 is the largest among the four cells. These prediction results are in line with the previous conclusion, drawn in Section IV-D, that slight differences in the aging factors can lead to a larger prediction error. Compared with the prediction results of Cell 3 with only the data of Cell 1 for model training, when the data of Cells 1, 2 and 4 are employed for training, the prediction accuracy is not significantly improved, as shown in Fig. 24 (c) and Table 5. It can therefore be concluded that increasing the amount of training data does not distinctly improve the prediction accuracy. Furthermore, the experimental results indicate that when the prediction model is fixed, the capacity prediction accuracy is not much related to the amount of training data but depends on the effectiveness of the extracted aging factors. It can be seen from Fig. 23 that the LSTM-RNN model accurately predicts the global trend of capacity degradation, while the predicted values fluctuate in the vicinity of the observed values. In addition, the R² values of Cells 1, 2, and 4 are 0.9957, 0.9955 and 0.9960, which are quite close to 1, highlighting that the predicted values are very similar to the observed values. Note that the capacity prediction for Cells 1 to 4 is attained based on different datasets for model training, and the maximum prediction error is less than 3%, highlighting that the proposed model is stable and reliable.
To sum up, the experimental results show that when effective aging factors are extracted for model training, the LSTM-RNN model can precisely learn the degradation pattern of the battery and predict the battery capacity with good accuracy.

V. CONCLUSION
The key challenge of capacity prediction for lithium-ion batteries based on data-driven methods lies in the effective extraction of key aging factors and the accurate modeling of the long-term dependences of capacity degradation. In this paper, the LSTM-RNN algorithm is employed to construct a data-driven capacity prediction model for lithium-ion batteries. To improve the prediction performance of the LSTM-RNN model, the Adam optimization algorithm is leveraged to find the optimal model parameters, and the dropout technique is exploited to prevent the network from overfitting. The reliability and robustness of LSTM-RNN for capacity prediction are validated based on leave-one-out cross validation. The experimental results confirm that the LSTM-RNN model can track the nonlinear capacity degradation trajectory well. Meanwhile, even when only one battery's data is employed for model training, the capacity prediction error of other cells is still less than 2%. Moreover, two conclusions can be drawn based on the leave-one-out cross validation. Firstly, when different training and test datasets are employed, the LSTM-RNN model can accurately predict the battery capacity with a maximum error of 2.84%, showing that the proposed method has good prediction accuracy and strong robustness. Secondly, when the model can learn the capacity degradation pattern over the whole lifespan of the battery, increasing the amount of training data does not distinctly reduce the prediction error. The prediction accuracy mainly depends on the reliability and validity of the extracted aging factors. This work highlights the feasibility of applying the LSTM-RNN to predict the capacity of lithium-ion batteries.