Gaussian Process Regression With Automatic Relevance Determination Kernel for Calendar Aging Prediction of Lithium-Ion Batteries

Battery calendar aging prediction is of extreme importance for developing durable electric vehicles. This article derives machine learning-enabled calendar aging prediction for lithium-ion batteries. Specifically, the Gaussian process regression (GPR) technique is employed to capture the underlying mapping among capacity, storage temperature, and state-of-charge. By modifying the isotropic kernel function with an automatic relevance determination (ARD) structure, highly relevant input features can be effectively extracted to improve prediction accuracy and robustness. Experimental battery calendar aging data from nine storage cases are utilized for model training, validation, and comparison, which is more meaningful and practical than using the data from a single condition. Illustrative results demonstrate that the proposed GPR model with ARD Matern32 (M32) kernel outperforms its counterparts and achieves reliable prediction results for all storage cases. Even for the partial-data training test, multistep prediction test, and accelerated aging training test, the proposed ARD-based GPR model is still capable of extracting the useful features, therefore offering good generalization ability and accurate prediction results for calendar aging under various storage conditions. This is the first known data-driven application that utilizes GPR with an ARD kernel to perform battery calendar aging prognosis.


I. INTRODUCTION
LITHIUM-ION (Li-ion) batteries are promising candidates for electric vehicle (EV) applications, owing to their impressive features such as high energy density, high efficiency, and environmental friendliness [1]. However, reliable calendar aging prediction is still a bottleneck for the performance enhancement of EVs. In real automotive applications, Li-ion batteries generally degrade in both calendar and cyclic modes. Considering that more than 75% of battery life is spent in parking mode for EVs [2], calendar aging prediction therefore becomes a prerequisite for battery service life diagnosis.
Calendar aging for most Li-ion batteries is mainly caused by the growth of solid electrolyte interface (SEI) during storage [3]. Specifically, when a battery is stored, the reduction of its electrolyte solvents such as ethylene carbonate would cause the formation of Li-based products, further resulting in the generation of SEI on the anode particle of battery [4]. In such cases, Li-ion battery capacity would decrease over time [5]. The corresponding capacity aging rate is highly dependent on several key factors including the storage temperature and battery state-of-charge (SOC) [6]. Therefore, a key but challenging issue for calendar aging prediction is to simultaneously take these factors into account. It is vital to develop suitable models to capture capacity degradation dynamics under various storage conditions. Several physics-based models have been reported in the literature to explain battery calendar aging behaviors [7], [8]. Although electrochemical dynamics of batteries during storage have been analyzed in the simulation environment, these models are highly time-consuming and complex to parametrize, making them overly expensive for real-time calendar aging prediction on a long period scale.
To overcome the above challenges, calendar aging prediction approaches based on semiempirical models have been designed. For instance, Schmalstieg et al. [9] proposed an Arrhenius-based semiempirical model to capture calendar cell aging. Petit et al. [10] developed an empirical capacity loss model to evaluate the effects of SOC and temperature on storage lifetime of Li-ion batteries. In [11], instead of using Arrhenius acceleration model, a semiempirical approach based on the Eyring acceleration model was adopted to predict battery calendar aging, while the SOC drifting was also taken into account. By considering the effects of temperature and storage conditions, De Hoog et al. [12] proposed a semiempirical combined model to estimate the calendar lifetime for a nickel-manganese-cobalt oxide battery. In [13], by taking the initial surface layer caused by cell formation into account, an extended semiempirical model was proposed to improve the calendar aging predictive ability. These referred works belong to open-loop models without strong generalization abilities; in a way, their performance highly depends on the quality of test experiments.
Data-driven models, which do not assume any mechanism a priori, are also gaining increasing attention in battery state-of-health (SOH) estimation and remaining useful life diagnosis [14]. Different intelligent techniques such as support vector regression [15], [16], Bayesian prediction [17], [18], and artificial neural networks [19]-[21] have been successfully applied to build data-driven models for battery cyclic aging prediction. On the one hand, some review papers have summarized these state-of-the-art applications [22], [23], concluding that several limitations still exist: 1) data-driven approaches are mainly used to capture battery cyclic aging states, but very few attempts have been made at calendar aging diagnosis; 2) most publications fit the model on aging data obtained under constant operating conditions, ignoring various cases of temperature and SOC. Such models are infeasible for predicting capacity under different conditions. On the other hand, a previous publication presented a critical review of various data-driven models in the battery aging domain, in which Gaussian process regression (GPR) is identified as one of the most powerful techniques. For detailed comparisons of different machine learning techniques, see Table 5 of [14] and the corresponding discussions. In fact, beyond its simple structure and computationally acceptable predictions, GPR enjoys the significant merits of being nonparametric and able to quantify the uncertainty of predicted values. Through formulating specific input features, GPR-based models have been applied successfully in both academic and industrial domains [24]-[26]. However, to the best of our knowledge, there is still a lack of research using GPR in the battery calendar aging prediction domain.
Besides, most existing works just use single conventional kernels to develop their GPR techniques without considering the correlations of multidimensional input variables. In the light of this, it could be a promising way through developing an improved GPR technique with the multidimensional kernel structure to capture the battery capacity degradation dynamics under different temperature and SOC storage conditions. Based on the above discussions, this article is concerned with machine learning-enabled calendar aging prediction for Li-ion batteries, where both the corresponding storage temperature and battery SOC can be taken into account simultaneously. Several key original contributions are made in this article. First, nine cases of experimental calendar aging data are collected under various storage temperatures and SOC levels over 480 days, constituting a well-rounded database to train and validate the calendar aging model. Second, because the battery calendar aging involving local fluctuations over storage time is a highly nonlinear process, a framework based on the GPR model is proposed to efficiently capture the capacity degradation dynamics with reliable confidence ranges. Third, due to the input features involving storage temperature and cell SOC, the isotropic kernel function of GPR is modified with an automatic relevance determination (ARD) structure, which brings the benefits that irrelevant inputs can be removed by fixing large length scales. Meanwhile, various predictors can be formulated to improve prediction accuracy and robustness. Finally, based on our dataset, the prediction performance of our proposed GPR model is investigated in terms of different kernel functions, and compared with a regression calendar-life (RCL) model. This is the first known data-driven application by utilizing GPR with ARD kernel to handle battery calendar aging predictions. 
Owing to its mechanism-free character, the proposed GPR+ARD model can be readily extended to other battery types for calendar aging prognosis.
The rest of this article is organized as follows. Section II presents the calendar aging experiments and the collected dataset. Section III introduces the developed model framework and several key quantitative metrics, followed by the description of the ARD-based GPR model. Section IV analyzes the comparison and verification results. Finally, Section V concludes this article.

II. CALENDAR AGING TEST
Fig. 1 illustrates the equipment used for conducting the battery calendar aging tests under different conditions. The cells were stored in Votsch VT3050 thermal chambers and operated by Bitrode MVC 16-100-5 cell cyclers. The generated battery data were monitored and stored by a computer. Commercial Panasonic NCRBD cells, obtained from a commercial automotive company, were used to study the calendar aging characteristics of Li-ion batteries. The battery has a 3 Ah nominal capacity, a 2.5 V lower cut-off voltage, and a 4.2 V upper cut-off voltage. Because the rate of degradation can be minimized by keeping the SOC at a low or medium level and lowering the battery temperature [27], all cells were stored at 10 °C and a moderate 50% SOC prior to any tests.
The calendar aging test was performed under various storage temperatures (10 °C, 25 °C, and 45 °C) and SOC levels (20%, 50%, and 90%) for a storage time of 480 days. All batteries were placed in the temperature chambers in open-circuit status during storage. For each storage temperature and SOC, three cells were studied to obtain the average battery capacity and minimize any battery-to-battery discrepancies.
Periodic check-ups were performed every 30 days to obtain the capacity information during storage. For all tests, 1 C-rate is equal to 3 A. At each check-up, the temperature chambers were set to 25 °C. Each cell was then charged by a constant-current constant-voltage (CC-CV) pattern with 1/2 C-rate in the CC phase until the terminal voltage reached 4.2 V, followed by a CV phase until the current dropped below 1/10 C-rate. After resting for 3 h, the cells were discharged by a CC pattern with 1/3 C-rate until the lower cut-off voltage of 2.5 V. The average discharge capacity (over the three cells) was selected as the battery capacity for each condition. Before calibrating the SOC of the cells, the CC-CV pattern was applied again to recharge all batteries to their fully charged states. After another 3 h rest period, the batteries were discharged to their specified SOC setpoints by a well-controlled coulomb counting method. Fig. 2 illustrates the open-circuit voltage (OCV)-SOC curve for the adopted NCRBD battery. The OCV-SOC points at which the batteries were stored are highlighted in red. Subsequently, each temperature chamber was readjusted to its specified storage condition.
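The currents used in the check-up protocol follow directly from the stated convention that 1 C-rate equals 3 A. The short sketch below only illustrates that arithmetic; the function and variable names are hypothetical and are not part of the test software.

```python
# Convert the C-rates quoted in the check-up protocol to currents,
# using the convention stated in the text (1 C-rate = 3 A).
ONE_C_AMPS = 3.0


def c_rate_to_amps(c_rate: float) -> float:
    """Return the current in amperes for a given C-rate."""
    return c_rate * ONE_C_AMPS


cc_charge_current = c_rate_to_amps(1 / 2)      # CC charging phase
cv_cutoff_current = c_rate_to_amps(1 / 10)     # CV termination threshold
cc_discharge_current = c_rate_to_amps(1 / 3)   # CC discharging phase

print(cc_charge_current, cv_cutoff_current, cc_discharge_current)
```

Under this convention the CC charge runs at 1.5 A, the CV phase ends once the current falls below 0.3 A, and the discharge runs at 1 A.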
Following this procedure, the calendar aging dataset that contains nine storage cases was obtained, as shown in Fig. 3. Five cases (Case 1, Case 3, Case 5, Case 7, and Case 9) are labeled as "Group 1" and another four cases (Case 2, Case 4, Case 6, and Case 8) are labeled as "Group 2." Detailed capacity aging curves with the standard deviations versus time for various storage conditions are illustrated in Fig. 4. Several of the capacity fade trends show an initial rapid capacity fade followed by a more linear decrease. This phenomenon is attributed to the presence of excess anode electrode area in comparison to the cathode electrode area. Known as anode "overhang" in the literature [28], [29], an outflow of Li-ions can occur from the active regions of the anode to its excess passive regions, leading to the initial rapid capacity fade. The subsequent linear capacity fade is then attributed to the irreversible capacity fade due to SEI growth.
The initial battery capacities $C_{ini}$ for these cases are all different from each other, as described in Table I. It can also be seen that the initial measured capacities do not start from the nominal capacity of the cell, which is a practical and likely scenario.

III. TECHNIQUE
This section elaborates the modeling methodology as well as the corresponding quantitative metrics. Additionally, the fundamentals of the GPR technique with ARD kernel are presented, followed by a brief description of an RCL model for comparison purposes.

A. Model Development and Quantitative Metrics
On the basis of the tested calendar aging dataset, machine learning-based techniques can be developed to capture capacity degradation dynamics under various storage conditions. Compared with the existing data-driven models that normally consider just capacity information, an innovative model structure, which also takes both the storage temperature term and the battery SOC term into account, is developed for calendar aging prediction, as shown in Fig. 5. This model framework can be mainly divided into two parts. For the prediction of the next capacity point, the output $C_{sto}(t+1)$ is predicted after using GPR to learn the underlying mappings among all input terms, including the historical capacities $C_{sto}(t), C_{sto}(t-1), \ldots, C_{sto}(t-k)$, the storage SOC level $SOC_{sto}$, and the temperature $T_{sto}$. In GPR, these mappings are reflected within the covariance functions. Detailed pseudo-code of this mapping can be found in Section II of [30]. For the multistep prediction, as illustrated by the green dashed line of Fig. 5, an iteration process that uses the previously predicted capacity as the next input to further predict a new capacity value is conducted until the $j$th capacity value is obtained. Here $j$ and $k$ represent the horizons of future and previous calendar capacities, respectively. In order to use our collected dataset and verify the prediction performance of the proposed model, the capacity prediction in this article is conducted in steps of 30 days (720 h) for a total time duration of 480 days (11 520 h). Through a trial-and-error method, the case of "k = 2" is selected because it offers a good trade-off between computational effort and prediction accuracy.
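The input-target structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the feature ordering, the function name `build_pairs`, and the capacity values are hypothetical and are not taken from the article's dataset or code.

```python
from typing import List, Tuple


def build_pairs(capacity: List[float], soc: float, temp: float,
                k: int = 2) -> Tuple[List[List[float]], List[float]]:
    """Build one-step-ahead input-target pairs for a single storage case.

    Each input is [C(t), C(t-1), ..., C(t-k), SOC_sto, T_sto] and the
    target is C(t+1). The ordering of the features is an assumption.
    """
    inputs, targets = [], []
    for t in range(k, len(capacity) - 1):
        history = [capacity[t - lag] for lag in range(k + 1)]
        inputs.append(history + [soc, temp])
        targets.append(capacity[t + 1])
    return inputs, targets


# Synthetic capacities (Ah) at successive 30-day check-ups; not measured data.
caps = [2.90, 2.88, 2.87, 2.85, 2.84]
X, y = build_pairs(caps, soc=0.5, temp=25.0, k=2)
print(X)
print(y)
```

With k = 2, each training sample needs three past capacities, so a case with N check-up points yields N - 3 input-target pairs.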
After constructing the proper input-target pairs from the battery calendar aging dataset, the GPR technique is employed to study the potential mapping mechanism, giving rise to the capacity prediction model that considers various storage conditions. Based on our collected dataset as shown in Fig. 3, in order to ensure enough aging information can be learned for pure machine-learning techniques, the dataset from "Group 1" (green cases) that covers all temperatures and SOCs is applied for model training purpose. After training, the dataset from "Group 2" (yellow cases) is used to verify the effectiveness of the proposed model.
Moreover, to evaluate the prediction performance of the data-driven model, several key quantitative metrics are adopted in this article [31]. Here, $N$ is the total number of predicted points, and $y_j$ and $\tilde{y}_j$ stand for each actual capacity value and each predicted value, respectively.
1) Maximum absolute error (MAE): Defined by (1), MAE illustrates the maximum difference between the predicted and real test values. The larger the MAE, the poorer the prediction accuracy [31].

$$\mathrm{MAE} = \max_{1 \le j \le N} \left| y_j - \tilde{y}_j \right| \tag{1}$$

2) Root-mean-square error (RMSE): RMSE is another widely used metric to measure the overall difference between the predicted values and real test values. Defined by (2), the closer RMSE is to 0, the better the prediction accuracy [31].

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( y_j - \tilde{y}_j \right)^2} \tag{2}$$

3) Fit-goodness ($R^2$): $R^2$ is defined by (3) to measure how well a model matches the real test data [32]

$$R^2 = 1 - \frac{\sum_{j=1}^{N} \left( y_j - \tilde{y}_j \right)^2}{\sum_{j=1}^{N} \left( y_j - \bar{y} \right)^2} \tag{3}$$

where $\bar{y}$ is the mean of the actual values. As $R^2$ approaches 1, the corresponding model better describes the variability of the target.
4) Calibration score (CS): Defined by (4), CS reflects the frequency with which the real data lie within the obtained confidence range [24]

$$\mathrm{CS} = \frac{1}{N} \sum_{j=1}^{N} \left[ \left| y_j - \tilde{y}_j \right| < 2\sigma_j \right]_I \tag{4}$$

where $[\cdot]_I$ represents the Iverson bracket and $\sigma_j$ is the predictive standard deviation at point $j$. For a GPR model, 95.4% is a general confidence range with the interval corresponding to $\pm 2\sigma$ [24]. Therefore, the ideal CS should be close to 0.954: a value smaller or larger than this indicates that the developed model is overconfident or underconfident, respectively.
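The four metrics can be computed as sketched below. This is an illustrative implementation rather than the article's code: the function names and the sample values are hypothetical, $R^2$ is taken in its conventional form with the mean of the measured values as the reference, and CS counts points whose absolute error is below twice the predictive standard deviation.

```python
import math
from typing import Sequence


def max_abs_error(y: Sequence[float], y_pred: Sequence[float]) -> float:
    """Maximum absolute error (the article abbreviates this as MAE)."""
    return max(abs(a - p) for a, p in zip(y, y_pred))


def rmse(y: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error over all predicted points."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_pred)) / len(y))


def r_squared(y: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, referenced to the mean of measured values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def calibration_score(y: Sequence[float], y_pred: Sequence[float],
                      sigma: Sequence[float]) -> float:
    """Fraction of measured points inside the +/- 2 sigma band."""
    inside = sum(1 for a, p, s in zip(y, y_pred, sigma) if abs(a - p) < 2.0 * s)
    return inside / len(y)


# Synthetic values for illustration only (not from the article's tables).
y_true = [3.00, 2.90, 2.80, 2.70]
y_hat = [3.00, 2.92, 2.79, 2.75]
y_std = [0.015, 0.015, 0.015, 0.015]

print(max_abs_error(y_true, y_hat), rmse(y_true, y_hat))
print(r_squared(y_true, y_hat), calibration_score(y_true, y_hat, y_std))
```

In this toy example the last point misses the band by more than $2\sigma$, so three of the four points are covered.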

B. GPR Technique With ARD Kernel Structure
Derived from the Bayesian framework, GPR is able to undertake nonparametric regression with the Gaussian process. The mean function $m(i)$ and covariance function $\kappa(i, i')$ of a real process $f(i)$ are defined as

$$m(i) = \mathbb{E}[f(i)], \qquad \kappa(i, i') = \mathbb{E}\left[ (f(i) - m(i)) (f(i') - m(i')) \right]$$

and the probability distribution of GPR can be specified by [30]

$$f(i) \sim \mathcal{GP}\left( m(i), \kappa(i, i') \right).$$

Supposing that the same Gaussian distribution exists between the training set $i$ and the new dataset $i'$, the corresponding output $y'$ can be calculated by the conditional distribution as [24], [33]

$$p(y' \mid i, y, i') = \mathcal{N}\left( y' \mid \bar{y}', \mathrm{cov}(y') \right)$$

with

$$\bar{y}' = \kappa(i, i')^{T} \left[ \kappa(i, i) \right]^{-1} y, \qquad \mathrm{cov}(y') = \kappa(i', i') - \kappa(i, i')^{T} \left[ \kappa(i, i) \right]^{-1} \kappa(i, i')$$

where $y'$, $\bar{y}'$, and $\mathrm{cov}(y')$ are the GPR posterior prediction, its corresponding mean, and its covariance, respectively. $\mathcal{N}(\cdot)$ indicates a normal distribution; $\kappa(i, i)$, $\kappa(i', i')$, and $\kappa(i, i')$ are the covariance matrices between just training inputs, just validation inputs, as well as training and validation inputs, respectively; $y$ denotes the training output vector. It should be noted that the uncertainty quantification of GPR in this article consists of confidence boundaries that reflect the "scope compliance" uncertainty of predicted values. This uncertainty is caused by differences between the modeled context and the application context, and is not the same as the standard deviation of measurements [30]. The performance of GPR is fully determined by its $m(i)$ and $\kappa(i, i')$, indicating that the corresponding kernel function must be selected and learned carefully from the training dataset. Among various kernel types, several simple but effective kernel functions are particularly noteworthy.
The squared-exponential (SE) function is a widely used kernel function given as [30]

$$\kappa_{SE}(i, i') = \sigma_{SE}^{2} \exp\left( -\frac{\lVert i - i' \rVert^{2}}{2 l_{SE}^{2}} \right)$$

where $\sigma_{SE}$ and $l_{SE}$ are hyperparameters that control the amplitude and length scale. The SE kernel is a stationary kernel in that the correlations between different points are purely affected by the term $i - i'$, leading to a smooth distribution. This would be too strict for capacity degradation data with many local fluctuations; therefore, an alternative is the Matern32 (M32) kernel function [30]

$$\kappa_{M32}(i, i') = \sigma_{M32}^{2} \left( 1 + \frac{\sqrt{3} \lVert i - i' \rVert}{l_{M32}} \right) \exp\left( -\frac{\sqrt{3} \lVert i - i' \rVert}{l_{M32}} \right)$$

where $\sigma_{M32}$ and $l_{M32}$ represent the hyperparameters that control the function amplitude and smoothness, respectively. In practice, because a single length scale is shared by all inputs, the isotropic SE and M32 kernels would provide unreliable predictive results for nonlinear mappings that involve multidimensional input variables. For the calendar aging model, the inputs should not only contain the capacity terms, but also involve the storage temperature and cell SOC. In order to extract these features and improve accuracy, the isotropic SE and M32 kernels are modified with the ARD structure [34], as denoted by (11) and (12)

$$\kappa_{ARD\text{-}SE}(i, i') = \sigma_{SE}^{2} \exp\left( -\frac{r^{2}}{2} \right) \tag{11}$$

$$\kappa_{ARD\text{-}M32}(i, i') = \sigma_{M32}^{2} \left( 1 + \sqrt{3}\, r \right) \exp\left( -\sqrt{3}\, r \right) \tag{12}$$

with

$$r = \sqrt{ \frac{(T_{sto} - T_{sto}')^{2}}{l_{T}^{2}} + \frac{(SOC_{sto} - SOC_{sto}')^{2}}{l_{SOC}^{2}} + \frac{\lVert \mathbf{C} - \mathbf{C}' \rVert^{2}}{l_{C}^{2}} }$$

where $\mathbf{C}$ collects the capacity inputs, and the hyperparameters $l_{T}$, $l_{SOC}$, and $l_{C}$ determine the relevancies of the temperature, SOC, and capacity inputs with respect to the regression results, respectively. Generally, a large value leads to a low relevancy. For GPR, "learning" implies the optimization of hyperparameters within the covariance function, using the training dataset. In this article, a standard gradient descent optimizer is used to fit the hyperparameters of GPR by maximizing the log marginal likelihood [30]. Here the threshold of the gradient descent optimizer is 2 × e−5, which is defined by the MATLAB GPR toolbox. Therefore, the ARD structure can be seen as a powerful tool for input feature extraction.
By using the ARD kernel for calendar aging prediction, irrelevant input features among capacity, storage temperature, and SOC would be effectively removed by fixing large length scales for them, yielding a sparse and explanatory subset of features. Besides, various predictors with different length scales are generated to improve prediction accuracy and robustness.
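The following NumPy sketch illustrates how an ARD Matern32 kernel and the resulting GPR posterior can be evaluated. It is a minimal sketch under stated assumptions, not the MATLAB toolbox implementation used in the article: the feature ordering [T, SOC, C(t), C(t-1), C(t-2)], the zero prior mean, the noise-free form with a small jitter term, and all names and sample values are hypothetical. The hyperparameters are fixed at the initial values quoted in the article rather than optimized.

```python
import numpy as np


def ard_matern32(X1, X2, length_scales, sigma=0.1):
    """ARD Matern-3/2 covariance between the rows of X1 and X2."""
    ls = np.asarray(length_scales, dtype=float)
    A = np.atleast_2d(np.asarray(X1, dtype=float)) / ls   # scale each feature
    B = np.atleast_2d(np.asarray(X2, dtype=float)) / ls
    diff = A[:, None, :] - B[None, :, :]
    r = np.sqrt(np.sum(diff ** 2, axis=-1))                # scaled distance
    return sigma ** 2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def gp_posterior(X_train, y_train, X_new, length_scales, sigma=0.1, jitter=1e-10):
    """Posterior mean and covariance of a zero-mean, noise-free GP."""
    K = ard_matern32(X_train, X_train, length_scales, sigma)
    K_s = ard_matern32(X_train, X_new, length_scales, sigma)
    K_ss = ard_matern32(X_new, X_new, length_scales, sigma)
    L = np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, np.asarray(y_train, dtype=float)))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov


# Two inputs that differ only in storage temperature (synthetic values).
a = [10.0, 0.5, 2.90, 2.90, 2.90]
b = [45.0, 0.5, 2.90, 2.90, 2.90]
k_relevant = ard_matern32(a, b, [2.0, 2.0, 1.0, 1.0, 1.0])[0, 0]
k_irrelevant = ard_matern32(a, b, [1e6, 2.0, 1.0, 1.0, 1.0])[0, 0]
print(k_relevant, k_irrelevant)

# Tiny synthetic training set; targets are centered because of the zero mean.
X_train = np.array([[10.0, 0.2, 2.90, 2.91, 2.92],
                    [25.0, 0.5, 2.85, 2.87, 2.88],
                    [45.0, 0.9, 2.70, 2.74, 2.78]])
y_train = np.array([0.01, -0.02, 0.03])
mean, cov = gp_posterior(X_train, y_train, X_train, [2.0, 2.0, 1.0, 1.0, 1.0])
print(mean, np.diag(cov))
```

With a very large temperature length scale, the two inputs become almost perfectly correlated despite a 35 °C difference, which is how ARD effectively switches off an irrelevant input. The posterior evaluated at the training inputs reproduces the training targets with near-zero variance, as expected for a noise-free model.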

C. Regression Calendar-Life Model
In order to demonstrate the effectiveness of the proposed GPR+ARD model, a simplified RCL model is also adopted for comparison. This RCL model is a typical semiempirical model that has been applied in [11]. Generally, the capacity loss $\Delta Q$ during storage is expressed as a function of the storage SOC $SOC_{sto}$, storage temperature $T_{sto}$, and storage time duration $t$ as [11]

$$\Delta Q = f\left( SOC_{sto}, T_{sto}, t \right). \tag{13}$$

Then, (13) can be simplified by assuming decoupling. First, both the battery SOC and $T_{sto}$ are taken as constant versus time, implying that the calendar degradation trend is similar over time and can be shaped by a coefficient $C(SOC_{sto}, T_{sto})$, as in (14) [11]. In the literature, the shaping coefficient $C(SOC_{sto}, T_{sto})$ is usually assumed to follow the Arrhenius relation, allowing $SOC_{sto}$ and $T_{sto}$ to be decoupled as in (15) [11]. It is noteworthy that the activation energy, denoted $E_a$ in (15), can be approximated by an affine dependence on SOC [11], as expressed in (16).
For this RCL model, the input parameters include the storage SOC, temperature, and time duration. Following the same data split as for GPR, all data from "Group 1" (green cases) are used to train the RCL model, while all data from "Group 2" (yellow cases) are used to validate the trained RCL model. In this article, five parameters ($\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$, and $p$) need to be identified. An advanced heuristic method named biogeography-based optimization (BBO) is adopted to calculate these parameters by minimizing the RMSE between the predicted values and the real test data through 20 independent runs. More details regarding the BBO technique and the identification procedure can be found in [35]. The corresponding identified parameters are shown in Table II.

A. Performance Comparisons
In this subsection, two comparisons are first conducted to quantify the improvement by using GPR+ARD model for calendar aging prediction.

1) Comparisons of Various Kernel Functions:
To evaluate the performance of different kernel functions in the calendar aging prediction domain, four covariance functions, namely solo SE, solo M32, SE with ARD kernel (ARD+SE), and M32 with ARD kernel (ARD+M32), are compared with respect to their training and prediction performance. The initial values of all GPR models' hyperparameters are set using the MATLAB GPR toolbox and defined as follows: for the solo SE and M32 kernels, $\sigma_{SE} = \sigma_{M32} = 0.1$ and $l_{SE} = l_{M32} = 2$; for the ARD+SE and ARD+M32 kernels, $\sigma_{SE} = \sigma_{M32} = 0.1$, $l_{T} = l_{SOC} = 2$, and $l_{C} = 1$. Here, Case 5 with the worst training results and Case 6 with the largest one-step prediction errors are selected for performance comparisons. Fig. 6 illustrates the training results obtained with different kernel functions for the Case 5 data. It is evident that the experimental capacity presents a nonlinear declining trend during storage at 0.5 SOC and 25 °C. Although the solo SE function can capture the overall degradation trend with the largest confidence range, some points are still mismatched, especially at the beginning. After modifying SE with the ARD structure, as shown in Fig. 6(c), the training performance is effectively improved. Here the MAE for the ARD+SE case is 0.0091, which is 33.1% less than that of the solo SE case. For the solo M32 and ARD+M32 kernels, better training results are obtained in both cases. After using the ARD structure, the MAE for the ARD+M32 case becomes 0.0078, which is 8.3% less than that of the solo M32 case. These satisfactory training results are mainly due to the variable length scales of the ARD structure. We can conclude that, with the same calendar aging dataset, the training performance can be improved by using ARD-based kernel functions.
After training, the GPR models with different kernel functions are applied to predict the future capacity in different storage conditions. Fig. 7 and Table III illustrate the prediction results  and the corresponding quantitative metrics for Case 6 data. It can be observed that by using solo SE kernel, although the obtained 95% confidence range almost covers the overall degradation trend, the mean prediction values still mismatch the real experimental data in most time points (here the CS and RMSE values are 0.929 and 0.0100, respectively). For solo M32 kernel, the prediction values are all lower than the actual values (here MAE, RMSE, and R 2 become the worst ones as 0.0300, 0.0228, and 0.927, respectively). Besides, the corresponding 95% confidence range distributes in a wide region, implying that high uncertainty is achieved in this case. These prediction failures are mainly caused by the overfitting problem, implying the poor generalization ability of solo kernel structure. In comparison, the predicted values get closer to the real capacity data by using the ARD-based SE kernel, indicating the effectiveness of ARD structure. But several mismatch points still exist, especially after 8000 h points for this case, which means the SE kernel cannot capture the overall capacity degradation dynamics. In Fig. 7(d), by using the ARD-based M32 kernel, the capacity trend is well captured as desired. Quantitatively, the RMSE for Case 6 data here is just 0.0054, which is 76.3% and 34.1% less than the solo M32 case and ARD+SE case, respectively. Besides, the 95% confidence range distributes in a narrow region for such a case, indicating a high credibility for the prediction results. This satisfactory performance is caused by the strong feature extraction abilities of ARD and high robustness of M32 kernel. Accordingly, ARD-based M32 kernel is selected for predicting calendar aging in the following studies.
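The relative improvement quoted for Case 6 can be checked from the two RMSE values stated in the text (0.0228 for solo M32 and 0.0054 for ARD+M32). The helper below is only an illustrative calculation; its name is hypothetical.

```python
def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# RMSE values for Case 6 quoted in the text: solo M32 vs. ARD+M32.
reduction_vs_m32 = percent_reduction(0.0228, 0.0054)
print(round(reduction_vs_m32, 1))  # 76.3
```

The result agrees with the 76.3% reduction reported relative to the solo M32 kernel.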

2) Comparisons of Training Results for GPR and RCL Models: Next, in order to further evaluate the effectiveness of the proposed GPR+ARD model, its training results are compared with those of the RCL model. Fig. 8 and Table IV illustrate the training results and the corresponding quantitative metrics for different storage conditions, respectively. It is worth noticing that the RCL model gives a general trend of capacity aging without direct uncertainty quantification for the predicted values. Even for the best fitting results, obtained under 0.2 SOC and 10 °C storage temperature, the MAE is larger than that of the GPR model. For the remaining three cases of "Group 2" validation, the corresponding fitting results also present large differences from the measured data, implying that this RCL model is inadequate to capture the capacity degradation of our dataset. The main reason for the poor performance of the RCL model is likely the unrecorded initial capacity fading in this dataset. In comparison, by using the ARD-based GPR model, both the overall capacity decline trend and the local nonlinear fluctuations are well fitted as desired. From Table IV, the MAE and RMSE for all cases using the GPR model are within 0.005 and 0.0012, respectively. Moreover, the CS values are all equal to 0.954, indicating the high training accuracy and good generalization ability of our proposed GPR+ARD data-driven model.
Fig. 9 and Table V present the prediction results and the corresponding quantitative metrics for "Group 2" cases after full-data training based on the "Group 1" cases. It can be seen that the predicted capacity values for all cases match the actual data well. The trained ARD-based GPR model captures the overall capacity degradation trends well, as the RMSEs of all predicted samples are less than 0.006. Besides, among all samples, the maximum MAE value is 0.0122, obtained for the Case 6 data. This is mainly caused by insufficient training data, as only Case 5 covers the 0.5 SOC condition. However, this MAE is still less than 0.5% of the capacity range, indicating that high accuracy is also achieved for such a case.
For Case 8 with 0.9 SOC and 25 °C storage temperature, the corresponding CS value reaches 0.954, which means that the actual results are all covered within the obtained confidence range. Interestingly, the CS values for the other cases are all 0.929, implying that the corresponding confidence ranges are also reliable. Therefore, it can be concluded that the full-data-trained GPR model with ARD structure is effective and highly accurate for battery calendar aging prediction under various storage conditions.

C. Partial-Data Training Results
For GPR technique, the inclusion of a larger number of relevant data could lead to explain the data better and learn more underlying mapping information, further resulting in more accurate prediction results, and narrower confidence boundaries. However, collecting calendar aging data under various storage conditions is an extremely time-consuming process in real-world applications. In such a case, it is meaningful to develop a reliable model with a satisfactory accuracy level based on partial training data. To evaluate the partial-data training results and the corresponding prediction performance of the proposed ARD-based GPR model, the capacity data before 8000 h of all "Group 1" cases (nearly 7/3 split) are chosen as the dataset for the training part, while the remaining data are employed as the validation set. Fig. 10 illustrates the results for "Group 1" cases based on the partial-data training. From Fig. 10, it is observed that for various cases with different storage conditions, the capacity values are highly similar to the real data in training phase, indicating that an accurate fitting result is obtained for our GPR model. After 8000 h, apart from Case 9 that still presents the highly similar trend with just 0.0032 MAE, other cases all show more or less differences. Specifically, Case 1 and Case 7 obtain the MAE values of 0.0116 and 0.0154 at 11 520 h, respectively. Case 3 achieves 0.0123 MAE at 10 800 h. It therefore proves that decreasing the training data will result in the information loss of capacity fade in calendar aging, further reducing the extrapolation and generalization performance of the trained model. Even so, by using the GPR model with the ARD structure, all MAE values are still less than 0.5% capacity range, which means that the training results are still reliable. To further evaluate the prediction results of partial-data training, all data from "Group 2" cases are then employed as the validation set.
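The "nearly 7/3 split" can be reproduced from the check-up schedule. The sketch below assumes that each case contains one capacity point every 720 h from 0 h to 11 520 h, including an initial measurement at 0 h; that inclusion is an assumption, and the variable names are hypothetical.

```python
CHECKUP_HOURS = 720     # one check-up every 30 days
TOTAL_HOURS = 11520     # 480 days of storage
SPLIT_HOURS = 8000      # training uses the data before this time

# Assumed time grid: initial point at 0 h plus one point per check-up.
times = list(range(0, TOTAL_HOURS + 1, CHECKUP_HOURS))
train_times = [t for t in times if t < SPLIT_HOURS]
valid_times = [t for t in times if t >= SPLIT_HOURS]

train_fraction = len(train_times) / len(times)
print(len(train_times), len(valid_times), round(train_fraction, 2))
```

Under this assumption, 12 of the 17 points fall before 8000 h, which is about 71% of each case and is consistent with the stated split.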
After training the GPR model based on the partial data from "Group 1" cases, the prediction results for "Group 2" cases are presented in Fig. 11. Moreover, detailed quantitative metrics are examined in Table VI. It is seen that Case 4 presents much higher accuracy in the whole validation process with the smallest values of MAE (0.0112) and RMSE (0.0057), respectively. For Case 2, the corresponding RMSE is 0.0065, indicating that satisfactory overall capacity prediction is also achieved. Here the MAE is 0.0128, caused by a short-period mismatch around 11 000 h. From Fig 11(d), the predicted values present more fluctuations in comparison with those in the full-data training case. Quantitatively, here the MAE and RMSE for Case 8 become 0.0118 (55.3% increase) and 0.0061 (32.6% increase). Even so, the result of Case 8 still presents a satisfactory capacity prediction. In comparison, Case 6 has the worst prediction due to several mismatches occurring after 7000 h. The RMSE for  Case 6 reaches 0.0084, which is 61.5% more than that under the full-data training. This result is reasonable due to the decreased capacity characteristics covered by using partial-data training. However, the MAE for Case 6 is still less than 0.6% capacity range (here is 0.0167), which means that the corresponding predicted accuracy is also acceptable. Besides, the CS values for all cases are all larger than 0.896, implying that the confidence levels are reliable. In conclusion, these facts signify that with a suitable partial-data training, the proposed GPR model is also capable of excavating the useful information, therefore providing reliable and accurate prediction results for calendar aging under various conditions.

D. Multistep Calendar Aging Prediction
Multistep calendar aging prediction is more meaningful in real-world applications as it can provide the entire future trend of capacity degradation. To evaluate the multistep prediction performance of our proposed ARD-based GPR model, a multistep prediction test is conducted for all "Group 2" cases in comparison with the RCL model.
In this test, after a new capacity value is predicted by our GPR+ARD model, a recursive process is conducted iteratively to predict future capacity until the last point is reached. It should be noted that, due to the structure illustrated in Fig. 5, the GPR+ARD model requires the first k + 1 historical capacity points (k = 2 in this article), while the RCL model requires only the initial capacity point. Here the comparison between the two models is conducted after the (k + 1)th capacity point. Fig. 12 and Table VII illustrate the multistep prediction results and the corresponding quantitative metrics for both the RCL and GPR+ARD models, respectively. From Fig. 12, it can be observed that relatively large prediction mismatches exist for the RCL model (the worst MAE reaches 0.0472 for Case 6 at 11 520 h), indicating the poor generalization ability of the RCL model on our dataset. In comparison, with the GPR+ARD model, although several mismatches occur, especially around the large local fluctuations of Case 4 and Case 6, the overall capacity decline trends are still captured reliably for all cases. These increased local mismatches are reasonable because prediction errors accumulate in multistep prediction. Here the RMSE values for Case 4 and Case 6 become 0.0102 and 0.0104, which are nearly twice those of the one-step prediction cases. These results are mainly caused by the limited training data for the 0.5 SOC condition (only Case 5 contains information on 0.5 SOC). However, the worst MAE is still within 0.7% of the capacity range (0.0152 for Case 6), indicating that the multistep prediction results are acceptable for all cases. Moreover, all the obtained uncertainty ranges cover the local fluctuations. It can be concluded that even for multistep prediction, the developed GPR+ARD model captures the overall capacity degradation trends well with an acceptable confidence level.
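The recursive procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this article: `multistep_predict` and `linear_trend_step` are hypothetical names, and the one-step predictor is a toy stand-in for the trained GPR+ARD mean prediction, which in practice would also take the storage temperature and SOC as inputs.

```python
from typing import Callable, List, Sequence


def multistep_predict(
    history: Sequence[float],
    one_step: Callable[[List[float]], float],
    n_steps: int,
    k: int = 2,
) -> List[float]:
    """Recursively predict n_steps future capacities.

    The window holds the latest k + 1 capacity points; each prediction is
    appended to the window and the oldest point is dropped, so later
    predictions are built on earlier predicted values.
    """
    if len(history) < k + 1:
        raise ValueError("need at least k + 1 historical capacity points")
    window = list(history[-(k + 1):])
    predictions = []
    for _ in range(n_steps):
        nxt = one_step(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]   # slide the window forward by one step
    return predictions


def linear_trend_step(window: List[float]) -> float:
    """Toy one-step predictor (hypothetical): extend the average slope."""
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return window[-1] + slope


# Made-up normalized capacities for the first k + 1 = 3 points.
preds = multistep_predict([1.0, 0.999, 0.998], linear_trend_step, n_steps=3)
print(preds)
```

Because every step feeds on earlier predictions rather than measurements, any one-step error is carried forward, which is the error accumulation noted above.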

E. Prediction at New Condition Through Accelerated Aging Data Training
In the real world, batteries experience a wide range of storage temperatures and SOCs. Developing a lifetime model that converts accelerated aging data to predict new degradation cases is another promising research topic [32], [36]. To evaluate the corresponding performance of the proposed GPR+ARD model, a test on predicting an entirely new condition is conducted in this subsection.
In this test, the GPR+ARD model is trained using accelerated aging data. Specifically, the aging data under relatively high SOCs and temperatures from Case 5, Case 6, Case 8, and Case 9 are utilized for model training. After that, data from Case 1, which represents an entirely new storage condition, are used to validate our proposed model. Fig. 13 and Table VIII illustrate the prediction results for Case 1 after accelerated aging data training and the corresponding quantitative metrics, respectively. One obvious observation is that the obtained uncertainty bounds are wider than those in the tests of the previous subsections. This is hardly surprising given that the temperature and SOC in Case 1 both differ from those of the accelerated aging data used for training. In such situations, the covariance values calculated by the kernel function are smaller, leading to broader confidence boundaries. However, these uncertainty bounds still cover the real data. The overall capacity degradation trend of Case 1 is captured by the predicted capacity values, indicating that the proposed GPR+ARD model is also effective in such a case.
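The link between smaller covariance values and broader confidence boundaries can be illustrated with a small numerical sketch. The kernel below is the standard ARD Matern 3/2 form, k(x, x') = sigma_f^2 (1 + sqrt(3) r) exp(-sqrt(3) r) with r^2 = sum_d (x_d - x'_d)^2 / l_d^2, and the variance is the standard GP posterior variance k(x*, x*) - k*^T (K + sigma_n^2 I)^(-1) k*. Everything else is an assumption made for illustration: the normalized (temperature, SOC) inputs, the length scales, and the noise level are hypothetical and are not the values fitted in this article.

```python
import math


def ard_matern32(x, y, length_scales, sigma_f=1.0):
    """ARD Matern 3/2 kernel with one length scale per input dimension."""
    r = math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales)))
    s = math.sqrt(3.0) * r
    return sigma_f ** 2 * (1.0 + s) * math.exp(-s)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def posterior_variance(x_star, X_train, length_scales, sigma_f=1.0, sigma_n=0.05):
    """GP posterior variance of the latent function at x_star."""
    K = [
        [ard_matern32(a, b, length_scales, sigma_f) + (sigma_n ** 2 if i == j else 0.0)
         for j, b in enumerate(X_train)]
        for i, a in enumerate(X_train)
    ]
    k_star = [ard_matern32(x_star, a, length_scales, sigma_f) for a in X_train]
    alpha = solve(K, k_star)
    prior = ard_matern32(x_star, x_star, length_scales, sigma_f)
    return prior - sum(k * a for k, a in zip(k_star, alpha))


# Hypothetical normalized (temperature, SOC) training conditions at the
# high-temperature, high-SOC end of the input space.
X_train = [(0.5, 0.5), (0.5, 1.0), (1.0, 0.5), (1.0, 1.0)]
scales = (0.5, 0.5)   # assumed ARD length scales

var_near = posterior_variance((0.9, 0.9), X_train, scales)   # close to training data
var_far = posterior_variance((0.0, 0.0), X_train, scales)    # entirely new condition
print(var_near, var_far)
```

In this sketch the query far from the training conditions has small kernel values against every training point, so little of the prior variance is removed and the predictive band stays wide. Increasing a length scale in `scales` makes the kernel less sensitive to that input dimension, which is how the ARD structure down-weights less relevant features.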

F. Further Discussions
Because machine learning-based approaches for calendar aging prediction have rarely been exploited in the existing published work, this article focuses, for the first time, on developing the GPR technique with an ARD kernel to achieve satisfactory capacity prediction under various storage conditions. In this article, the calendar aging dataset is acquired from an OEM automotive company, with some initial degradation due to the reduced begin-of-life (BOL) capacity of the battery. Consequently, the observed trends would inevitably degrade the prediction performance of the utilized RCL model, while favoring the step-by-step GPR model. Based upon our test results, several useful observations can be made.
1) In order to take full advantage of the semiempirical model, a well-designed aging test that covers the battery's nominal capacity and considers overhang effects is recommended [28].
2) In real-world applications, missing data related to the usage and subsequent degradation of a cell is a practical and likely scenario. In such circumstances, our proposed GPR+ARD model outperforms the RCL model with regard to prediction performance and uncertainty quantification.
3) To avoid the underfitting problem of a pure data-driven technique, a dataset covering enough useful information is suggested in the training phase [14], [30].
Future work could include an effective combination of the proposed GPR technique with battery electrochemical knowledge or electrothermal models, as well as performance improvements in research areas such as the conversion of accelerated aging data to predict entirely new degradation cases and holistic aging prediction covering both calendar and cycling modes.

V. CONCLUSION
In this article, effective capacity prognosis under various storage conditions for Li-ion batteries was presented. The GPR technique with an ARD kernel was employed to synthesize a data-driven model for battery calendar aging prediction. Using the GPR toolbox of MATLAB 2018 on a 2.40 GHz Intel Pentium 4 CPU, our proposed GPR model can be trained within 10 s. Illustrative results corroborated that the ARD+M32 kernel outperforms the other kernels in both the training and validation processes (here the MAE and RMSE are less than 0.011 and 0.0055 for all cases). On our measured dataset, this GPR model exhibits improved prediction performance with higher accuracy and better generalization ability. Moreover, the uncertainty level of the predicted results can be quantified simultaneously. Even for the partial-data training test, multistep prediction test, and accelerated aging training test, the predicted results were satisfactory in terms of accuracy (here the worst RMSE was less than 0.0105) and provided a reliable confidence range for various storage conditions. It is worth noting that, because it requires no electrochemical knowledge, the proposed model can be easily extended to other battery types for resilient calendar aging prognosis.