A Physics-Informed Machine Learning Approach for Estimating Lithium-Ion Battery Temperature

The physics-informed neural network (PINN) has drawn much attention as it can reduce training data size and eliminate the need for physics equation identification. This paper presents the implementation of a PINN with adaptive normalization in the loss function to predict lithium-ion battery cell temperature. In particular, the PINN was trained with the actual battery test data, and a lumped capacitance lithium-ion battery thermal relationship was applied to the loss function with the addition of a pre-layer and connection layer to the neural network architecture. The PINN architecture shows the most accurate battery temperature prediction compared with the fully connected neural network (FCN) and its variants evaluated in this study. The proposed PINN architecture has a mean square prediction error of 0.05 °C with a limited number of training data and without battery thermal model identification.


I. INTRODUCTION
As mitigation of global warming has become one of the critical social agendas and because transportation accounts for 14% of global carbon emissions, the electrification of motor vehicles has accelerated [1]. In recent vehicle electrification, vehicle manufacturers have selected lithium-ion batteries as the primary energy storage system to power vehicles because of their higher energy density, safer and easier use, and lower cost compared with other energy storage and conversion devices such as supercapacitors and fuel cells. However, a poorly managed and operated lithium-ion battery poses performance and safety risks that could cause early life degradation and hazardous events such as thermal runaway. Because of these characteristics, the lithium-ion battery requires fine control during operation and storage. Recent studies on the battery management system (BMS) have proposed methods to improve the control of the lithium-ion battery by estimating battery behavior and states such as state of charge (SOC) [2], [3] and state of health (SOH) [4], [5] to operate the battery within a safe and efficient range.
The associate editor coordinating the review of this manuscript and approving it for publication was Shunfeng Cheng.
In addition to the SOC and SOH, battery temperature is one of the significant factors influencing the safety and performance of the lithium-ion battery. For most lithium-ion batteries, the best performance in terms of efficiency and safety can be achieved in the temperature range between 20 °C and 40 °C [6]. Any battery operation at a low temperature will result in performance degradation due to its high resistance, and an extremely high temperature may potentially induce hazardous events such as thermal runaway [7]. The BMS monitors and controls the battery temperature to reduce the risk of operating the lithium-ion battery at an undesired temperature. However, due to physical space constraints and cost-effectiveness, the number of temperature sensor locations in a battery pack is often limited. Therefore, developing a method to predict the temperature evolution of lithium-ion batteries is necessary to control the battery usage and to design the battery pack structure and cooling systems.
With the recent advances in data-driven methods, highlighted by deep learning, many works in the literature have proposed data-driven methods to solve the technical challenges related to lithium-ion battery technology. In the literature, neural networks have been used in SOC prediction [8], [9], [10], [11], [12], [13], SOH prediction [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], model parameter identification [24], [25], [26], [27], [28], abnormality diagnosis [29], and voltage estimation [30]. For battery temperature prediction, a combined fully connected neural network (FCN) and long short-term memory (LSTM) network was implemented to estimate the battery surface temperature [31]. Also, a gated recurrent unit (GRU) network alone was applied to estimate the core battery temperature [32]. The data-driven methods presented in the literature showed high prediction accuracy and required less computation power during prediction than other numerical methods. Furthermore, data-driven methods are free from the system identification that model-based methods require. However, the aforementioned data-driven methods rely solely on battery test or simulation data to learn the battery behaviors. The drawbacks of this approach are the time and cost of acquiring and generating data and the loss of prediction accuracy when training data scarcity forces the model to extrapolate during prediction. Also, it is sometimes not practically feasible to obtain the test data due to test setup limitations. To overcome these drawbacks, the physics-informed neural network (PINN) has recently been introduced [33]. This method combines data and physics laws to solve the learning problem by applying a learning bias to the loss functions, constraints, or inference algorithms.
Table 1 compares the characteristics of the model-based methods based on physics or first principles, the conventional data-driven methods based solely on the data, and the PINN that bridges the two conventional prediction methods. In the literature, applications of the PINN have already been published in fluid dynamics [34], [35], [36], [37], [38], solid mechanics [39], optics [40], metallurgy [41], and earth system science [42] with successful predictions.
This study proposes a novel PINN model to predict lithium-ion battery cell temperatures. This work extends the PINN developed in [44] to predict the battery cell temperature. The PINN proposed in [44] targets engineering manufacturing applications and does not consider heat generation. However, in battery temperature prediction, the heat generation during battery operation is a significant factor. In the proposed PINN, the lumped capacitance lithium-ion battery thermal model is incorporated in the loss function with the adaptive coefficient, and a pre-layer and connection layer are added to the neural network architecture. This PINN model is beneficial for the following reasons:
1. Accurate prediction with data scarcity
2. Improved prediction accuracy with simple physics
3. No need for model identification
The rest of the paper is organized as follows. Background information regarding the battery thermal model and its PINN is provided in Section II and Section III, respectively. Section IV includes the methodology of the study, including the battery test and neural network training. Section V provides a comparison study between PINN and FCN. Also, in the second half of this section, the activation function of the pre-layer in the PINN architecture is further optimized to improve the lithium-ion battery cell temperature prediction. Lastly, the conclusion of the paper is presented in Section VI.

II. BATTERY THERMAL MODEL
This study implements a lumped capacitance thermal model in a PINN to predict the battery cell temperature during the battery cell test. This thermal model assumes a uniform temperature distribution within the thermal object. During the test, low-rate charge and discharge operation and air cooling were applied so that the battery cell did not develop a large temperature gradient. With this assumption, the energy balance equation around the entire body of the lithium-ion battery cell during the battery cell test is provided as follows:
$$ m C_p \frac{dT}{dt} = \dot{Q} + hA\,(T_{amb} - T) \tag{1} $$
In (1), $m$ is the mass of the lithium-ion battery cell, $C_p$ is the heat capacity of the battery cell, $T$ is the battery cell temperature, $t$ is time, $\dot{Q}$ is the heat generation during the battery operation, $h$ is the convective heat transfer coefficient, $A$ is the surface area, and $T_{amb}$ is the ambient temperature inside the climate chamber.
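As a numerical illustration of the lumped capacitance energy balance described above, the following minimal Python sketch integrates the common form m C_p dT/dt = Q + hA(T_amb - T) with a forward-Euler step. The function name, the argument names, and all parameter values are hypothetical and chosen only for illustration; they are not the properties of the tested cell.

```python
def simulate_lumped_temperature(heat_w, t_amb_c, dt_s, mass_kg,
                                cp_j_per_kg_k, h_w_per_m2_k, area_m2,
                                t0_c=None):
    """Forward-Euler integration of m*Cp*dT/dt = Q + h*A*(T_amb - T).

    heat_w is a sequence of heat generation values (W), one per time step.
    Returns the temperature history (degC), including the initial value.
    """
    # Assumption: the cell starts at ambient temperature unless t0_c is given.
    temp = t_amb_c if t0_c is None else t0_c
    history = [temp]
    thermal_mass = mass_kg * cp_j_per_kg_k  # J/K
    for q in heat_w:
        dT_dt = (q + h_w_per_m2_k * area_m2 * (t_amb_c - temp)) / thermal_mass
        temp = temp + dt_s * dT_dt
        history.append(temp)
    return history


# Hypothetical values: 1 W of constant heating for 100 s at 25 degC ambient.
profile = simulate_lumped_temperature(
    heat_w=[1.0] * 100, t_amb_c=25.0, dt_s=1.0,
    mass_kg=1.0, cp_j_per_kg_k=1000.0, h_w_per_m2_k=10.0, area_m2=0.05)
```

With constant heating the simulated temperature rises toward the steady-state value T_amb + Q/(hA), which is a quick sanity check on the sign of each term.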
88118 VOLUME 10, 2022
For the heat generation term in the energy equation, only the irreversible heating of the battery cell is considered as the primary heating source. Other minor heat sources, such as reversible heating, are not included in the thermal equation, which leaves the physics model incomplete. However, the PINN still learns the effects of these minor heat sources from the battery test data. The irreversible heat of the battery cell is formulated as follows:
$$ \dot{Q} = I\,(V - V_{ocv}) \tag{2} $$
In (2), $V$ is the battery voltage, $V_{ocv}$ is the open-circuit voltage, and $I$ is the current applied to the battery cell during the battery cell test.
For implementation in the physics-informed neural network, (1) and (2) are combined and rearranged as follows:
$$ f = \frac{dT}{dt} - \lambda_1 I\,(V - V_{ocv}) - \lambda_2\,(T_{amb} - T) \tag{3} $$
In (3), $f$ is the final physics equation applied to the PINN, $T$ is the battery cell temperature, $t$ is time, $V$ is the battery voltage, $V_{ocv}$ is the open-circuit voltage, $I$ is the current applied to the battery cell during the battery operation, $T_{amb}$ is the ambient temperature inside the climate chamber, and $\lambda_1$ and $\lambda_2$ are the coefficients after combining the two equations.
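The physics residual can be illustrated with a short Python sketch. It assumes the residual takes the form f = dT/dt - λ1·I·(V - V_ocv) - λ2·(T_amb - T), and it approximates dT/dt with a forward finite difference, which is a stand-in chosen here for simplicity (a PINN would normally obtain this derivative by automatic differentiation). The function and argument names are hypothetical.

```python
def physics_residual(time_s, temp_c, current_a, voltage_v, ocv_v,
                     t_amb_c, lam1, lam2):
    """Residual f of the combined thermal equation on each time interval.

    Assumed form: f = dT/dt - lam1*I*(V - V_ocv) - lam2*(T_amb - T).
    dT/dt is approximated by a forward difference (illustrative only).
    """
    residuals = []
    for k in range(len(time_s) - 1):
        dT_dt = (temp_c[k + 1] - temp_c[k]) / (time_s[k + 1] - time_s[k])
        heat = current_a[k] * (voltage_v[k] - ocv_v[k])   # irreversible heat
        cooling = t_amb_c - temp_c[k]                     # exchange with ambient
        residuals.append(dT_dt - lam1 * heat - lam2 * cooling)
    return residuals


# A cell resting at ambient temperature with no current gives a zero residual.
rest = physics_residual(
    time_s=[0.0, 1.0, 2.0], temp_c=[25.0, 25.0, 25.0],
    current_a=[0.0, 0.0, 0.0], voltage_v=[3.7, 3.7, 3.7],
    ocv_v=[3.7, 3.7, 3.7], t_amb_c=25.0, lam1=1.0, lam2=1.0)
```

A temperature profile that satisfies the thermal model yields residuals close to zero, which is what the physics loss term rewards during training.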

III. PHYSICS-INFORMED NEURAL NETWORK
In this paper, three approaches are considered to construct physics-informed neural networks: a loss function containing the physics equations, the use of an adaptive normalization factor in the loss function, and a PINN architecture derived from the analytical solution of the physics law.

A. LOSS FUNCTION WITH PHYSICS INFORMATION
To bias the learning toward solutions of the lumped capacitance thermal model, the loss function used in the neural network training must include the energy equation. Therefore, in physics-informed neural networks, multiple loss functions are present to minimize both the residual between the predictions and the true values and the estimation error of the lumped capacitance thermal model. For the first loss function, related to the residual between the neural network predictions and the true values, the mean square error is selected as the loss, as in other regression neural network cases. This loss function is presented as follows:
$$ Loss_r = \frac{1}{N}\sum_{i=1}^{N}\left(T^{i}_{pre} - T^{i}\right)^2 \tag{4} $$
In (4), $N$ is the number of training data, $T^{i}_{pre}$ is the temperature predicted by the neural network, and $T^{i}$ is the true temperature from the battery cell test data.
For the second loss function, related to minimizing the estimation error of the lumped capacitance thermal model, $f$ in (3) is applied to the loss function as follows:
$$ Loss_f = \frac{1}{N}\sum_{i=1}^{N}\left|f^{i}\right|^2 \tag{5} $$
In (5), $N$ is the number of training data and $f$ is defined in (3).
In addition to these two loss functions, this study also includes one additional loss function that is related to the initial condition in which the current is not applied to the battery cell and the temperature of the battery is equal to the ambient temperature. This loss function is formulated as follows: In (6), $f(t = 0)$ is the initial value of $f$, which is defined in (3), and $T_{amb}$ is the ambient temperature inside the climate chamber.
In sum, the loss function minimized during the PINN training combines (4), (5), and (6). The combined loss function is formulated as follows:
$$ Loss_{total} = Loss_r + \alpha\,Loss_f + \beta\,Loss_{initial} \tag{7} $$
In (7), $\alpha$ and $\beta$ are the scaling factors applied to normalize the loss terms in the loss function. The following section discusses the method to estimate the scaling factors.
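The combination of the loss terms can be sketched in Python as follows. The residual and physics losses are taken as mean square values, as described above. The `initial_error` argument is a hypothetical stand-in for the initial-condition mismatch, because the exact form of that loss term is not reproduced here; all function and argument names are illustrative.

```python
def mean_square(values):
    """Mean of the squared entries of a sequence."""
    return sum(v * v for v in values) / len(values)


def total_loss(pred, true, residuals, initial_error, alpha, beta):
    """Loss_total = Loss_r + alpha * Loss_f + beta * Loss_initial.

    initial_error is an assumed scalar mismatch at t = 0 (hypothetical).
    Returns the total loss and the three individual terms.
    """
    loss_r = mean_square([p - t for p, t in zip(pred, true)])  # data residual
    loss_f = mean_square(residuals)                            # physics term
    loss_initial = initial_error ** 2                          # initial condition
    total = loss_r + alpha * loss_f + beta * loss_initial
    return total, (loss_r, loss_f, loss_initial)


total, terms = total_loss(pred=[1.0, 2.0], true=[1.0, 4.0],
                          residuals=[1.0, 1.0], initial_error=0.5,
                          alpha=2.0, beta=4.0)
```

Returning the individual terms alongside the total makes it easy to monitor whether one term dominates the others, which is the situation the scaling factors are meant to correct.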

B. ADAPTIVE NORMALIZATION FACTOR
Unlike the loss function with a single loss term found in most neural network methods, the loss function of the PINN consists of multiple loss terms: the residual loss and the losses related to the physics and the initial condition. In this study, the learning rate annealing algorithm proposed in [43] was implemented to estimate the scaling factors.
In the first step of the learning rate annealing algorithm, the instant scaling factor is calculated by computing the ratio between the maximum backpropagation gradient of the residual loss and the mean backpropagation gradient of the other loss terms. In this study, the instant scaling factors $\hat{\alpha}$ and $\hat{\beta}$ are computed as follows:
$$ \hat{\alpha} = \frac{\max\left(\left|\nabla Loss_r\right|\right)}{\overline{\left|\nabla Loss_f\right|}} \tag{8} $$
$$ \hat{\beta} = \frac{\max\left(\left|\nabla Loss_r\right|\right)}{\overline{\left|\nabla Loss_{initial}\right|}} \tag{9} $$
In (8) and (9), $\hat{\alpha}$ and $\hat{\beta}$ are the instant scaling factors, and $\nabla Loss$ is the backpropagation gradient of the loss terms with respect to the weights in the neural network layers. In the second step of the algorithm, the scaling factors are computed from the moving average between the previous scaling factors and the instant scaling factors as follows:
$$ \alpha = (1 - \gamma)\,\alpha_{previous} + \gamma\,\hat{\alpha} \tag{10} $$
$$ \beta = (1 - \gamma)\,\beta_{previous} + \gamma\,\hat{\beta} \tag{11} $$
In (10) and (11), $\gamma$ is a tunable hyperparameter, which is recommended to be 0.9 in the study conducted in [43]. This recommended value is implemented in this study because it produced the expected convergence and reduction of the loss function. After computing the scaling factors, a gradient descent method is used to update the neural network weights and biases during the training process. In the training process, the recommended learning rate is 0.001.
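The two steps of the scaling factor update can be sketched in Python as follows. The gradients are represented as flat lists of numbers, which is a simplification made here; in practice they would come from the training framework. The function names and the example gradient values are hypothetical.

```python
def instant_factor(grad_residual, grad_other):
    """Ratio of the max |gradient| of the residual loss to the mean
    |gradient| of another loss term (assumes the mean is nonzero)."""
    max_residual = max(abs(g) for g in grad_residual)
    mean_other = sum(abs(g) for g in grad_other) / len(grad_other)
    return max_residual / mean_other


def update_factor(previous, instant, gamma=0.9):
    """Moving average between the previous and the instant scaling factor."""
    return (1.0 - gamma) * previous + gamma * instant


# Hypothetical gradients of the residual loss and the physics loss.
alpha_hat = instant_factor([0.1, -0.4, 0.2], [0.1, -0.1])
alpha = update_factor(previous=1.0, instant=alpha_hat, gamma=0.9)
```

With gamma set to 0.9, the updated factor follows the instant value closely while still retaining a small contribution from the previous value.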

C. PHYSICS INFORMED NEURAL NETWORK ARCHITECTURE
In addition to the loss function modification and adaptive normalization factor implementation, the architecture of the PINN is another way to improve the prediction accuracy of the network. For instance, in the research work published in [44], the 1D heat transfer equation was solved by a PINN. In [44], the authors altered the architecture of the FCN with a pre-layer structure based on the analytical solution of the 1D heat transfer equation to improve the prediction accuracy. However, the thermal model in [44] does not contain a heat generation term. In this paper, the thermal model contains the heat generation term due to battery heating during operation, and various neural network architectures, including various forms of the pre-layers, are evaluated to find the architecture with the most accurate prediction. Table 2 lists all neural network architectures reviewed in this study. Figure 1 shows a visual presentation of the various neural network architectures.
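To make the pre-layer and connection layer idea more concrete, the following Python sketch shows one possible reading of the structure: each input passes through its own small pre-layer with a sine or exponential activation, and a connection layer then merges the pre-layer outputs by concatenation or by element-wise multiplication. This layout, the layer widths, the weights, and all names are assumptions made for illustration; the architectures actually evaluated are those defined in Table 2 and Figure 1.

```python
import math


def pre_layer(x, weights, biases, activation):
    """Per-input pre-layer: activation(w * x + b) for each unit."""
    return [activation(w * x + b) for w, b in zip(weights, biases)]


def connect(pre_outputs, mode):
    """Connection layer that merges the per-input pre-layer outputs."""
    if mode == "concatenate":
        return [v for out in pre_outputs for v in out]
    if mode == "multiply":
        # Element-wise product; all pre-layers must have the same width.
        merged = list(pre_outputs[0])
        for out in pre_outputs[1:]:
            merged = [m * v for m, v in zip(merged, out)]
        return merged
    raise ValueError("mode must be 'concatenate' or 'multiply'")


# Hypothetical normalized inputs and a choice of pre-layer activations.
inputs = {"time": 0.0, "current": 0.5, "voltage": 0.2, "ocv": 0.1}
activations = {"time": math.exp, "current": math.sin,
               "voltage": math.exp, "ocv": math.exp}
weights, biases = [1.0, -1.0], [0.0, 0.0]  # two units per pre-layer
pre_outputs = [pre_layer(inputs[k], weights, biases, activations[k])
               for k in inputs]
features_cat = connect(pre_outputs, "concatenate")  # width 4 * 2 = 8
features_mul = connect(pre_outputs, "multiply")     # width 2
```

Concatenation keeps every pre-layer feature separate for the following layers, whereas multiplication compresses them into products, so the two choices pass differently shaped information downstream.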

IV. METHODOLOGY
This study obtained the training data from the battery cell test. In this section, the description of the battery cell test is provided. Then the data preprocessing and hyperparameters used in the training process follow. This section will be concluded with the prediction evaluation.

A. LITHIUM-ION BATTERY CELL TEST
In this study, battery test data such as battery voltage, battery current, battery temperature, and chamber temperature were collected and applied to the PINN as the inputs and output. Open-circuit voltage (OCV), another input in this study, was estimated from an open-circuit voltage and state-of-charge table provided by the battery cell manufacturer. In the battery test, a prismatic lithium-ion battery cell was discharged and charged with 5 A current pulses with 20 minutes of rest time at 25 °C. The details of the battery cell are provided in Table 3. The test specimen was placed in the climate chamber in the battery test setup as presented in Figure 2.
The thermocouples, voltage sensors, and power lines were connected to the battery cycler and data acquisition system in the test setup. Figure 4 shows the charge and discharge cycles performed during the battery test. 35% of this battery cycle test was allocated for training the physics-informed neural network.

B. TRAINING
After the test data were collected in the battery cycle test, the data were preprocessed before being fed to the neural network for training. The normalization conducted in the preprocessing was performed with the equation provided as follows:
$$ \hat{x} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{12} $$
In (12), $\hat{x}$ is the scaled data, $x$ is the data, $x_{min}$ is the minimum of the data, and $x_{max}$ is the maximum of the data. After the preprocessing of the data, the first 35% of the data was reserved for training the neural networks. This training data size was selected to create a data-scarce training condition. Details of the data scarcity will be revisited in a later section of this paper.
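The preprocessing described above can be sketched in Python as min-max scaling followed by reserving the first 35% of the samples for training. The function names and the sample values are hypothetical, and the sketch assumes the data are not constant (so the maximum differs from the minimum).

```python
def min_max_scale(data):
    """Scale data to [0, 1] with (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(data), max(data)
    return [(x - x_min) / (x_max - x_min) for x in data]


def split_first_percent(data, train_percent=35):
    """Reserve the first train_percent % of the samples for training."""
    n_train = len(data) * train_percent // 100
    return data[:n_train], data[n_train:]


# Hypothetical temperature samples in degC.
samples = [25.0 + 0.1 * k for k in range(20)]
scaled = min_max_scale(samples)
train, remainder = split_first_percent(scaled, train_percent=35)
```

Because the split takes the first portion of the time series rather than a random subset, later parts of the profile are never seen in training, which matches the data-scarce setting used in this study.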
For the hyperparameter tuning, this paper refers to another research paper that implemented PINNs in a thermal application [44]. For the hyperparameters for which no reference is available, such as the number of training iterations, the hidden layer, and the pre-layer, a design of experiments was conducted to find the best combination of the hyperparameters. In the results and discussion section, the activation functions in the pre-layer are tuned with a full factorial design of experiments. Table 4 provides the list of the hyperparameters selected in this study.

C. PREDICTION EVALUATION
After the training of the neural networks, the entire dataset from the battery cycle test was fed into the neural network to evaluate its prediction accuracy. For this evaluation, mean absolute error (MAE) and maximum absolute error (Max AE) are applied in this study, and they are formulated as follows:
$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - x_i\right| \tag{13} $$
$$ \mathrm{Max\,AE} = \max_{1 \le i \le n}\left|y_i - x_i\right| \tag{14} $$
In (13) and (14), MAE is the mean absolute error, Max AE is the maximum absolute error, $n$ is the total number of data points, $y_i$ is the prediction, and $x_i$ is the true value.
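The two evaluation metrics can be computed with a few lines of Python. The function names and the sample temperatures below are hypothetical and serve only to illustrate the definitions.

```python
def mean_absolute_error(pred, true):
    """MAE: average of |y_i - x_i| over all data points."""
    return sum(abs(y - x) for y, x in zip(pred, true)) / len(true)


def max_absolute_error(pred, true):
    """Max AE: largest |y_i - x_i| over all data points."""
    return max(abs(y - x) for y, x in zip(pred, true))


# Hypothetical predicted and measured temperatures in degC.
predicted = [25.0, 26.0, 27.5]
measured = [25.0, 25.5, 27.0]
mae = mean_absolute_error(predicted, measured)
max_ae = max_absolute_error(predicted, measured)
```

MAE summarizes the typical error over the whole profile, while Max AE exposes the worst single deviation, such as a missed temperature peak.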

V. RESULTS AND DISCUSSION
In this section of the paper, the comparison study between PINN and FCN is presented to show the benefits of the PINN over the conventional neural network method, FCN. Also, PINNs with various pre-layer and connection layer designs are reviewed in order to propose the neural network topology that enhances the prediction outcome.

A. PINN VS. FCN
To review the effectiveness of the PINN, a comparison study was conducted to evaluate the prediction accuracy of PINN and FCN with a limited training data size. FCN is a conventional neural network method whose loss function contains only the mean square error between the prediction and the actual data, with a hidden layer structure located between the input (time, current, voltage, and OCV) and output (battery cell temperature) layers. The proposed PINN in this study has three aspects of improvement over the FCN. First, the loss function of PINN has additional terms involving physics laws. Second, the loss function of PINN has the adaptive coefficient. Third, pre-layer and connection layer structures are added to the neural network architecture. In the evaluation, five cases are reviewed to conduct the comparison study and analyze the significance of implementing the three PINN aspects. Table 5 includes a list of all five study cases. For assessing the prediction performance, the prediction evaluation techniques discussed in the previous section are applied together with a qualitative observation of the prediction accuracy at the peaks of the battery temperature profile. The peaks are the areas with abrupt slope changes, where FCN requires dense training data to make a high-quality prediction [44], [45]. This characteristic of the FCN limits its prediction ability when training data are scarce. In this study, the training data scarcity is created by using only 35% of the entire data for the training. One prominent peak and two small peaks are in the training data. Three prominent peaks and four small peaks are placed in the test data. FCN and PINN are challenged to make predictions at these peaks with the small training data.
In the test results of the FCN case, Figure 5(a), FCN has a limited prediction accuracy that is the second-worst prediction of all five cases. In the prediction, no peak location is correctly predicted. This imprecision is due to the lack of training in the peak area, which demands a rich amount of data [44], [45]. In the results of the FCN with the loss function containing physics law-based loss terms with unit coefficients, Figure 5(b), a prediction inaccuracy similar to that of the FCN case is observed. No peak location is identified. In the results of the FCN with the loss function with adaptive coefficients, Figure 5(c), some prediction improvements in the small peak locations are observed, but the prediction error is still high due to the prediction divergence at the large peak areas. The results of the two cases with the loss function modification, which is the most popular aspect of PINN, show that incorporating the physics laws into the loss function is not enough to develop an accurate PINN. This outcome is also presented in another study of the thermal application of the PINN [44]. In the test results of the PINN cases, Figures 5(d) and 5(e), both prediction accuracy and peak location identification are better than in the previous cases with FCN and its variants. The case of the PINN with the concatenated connection layer has the best prediction accuracy among all five cases. It also correctly identifies all peaks in the test data. The PINN with the multiply connection layer is less accurate than the former case, and its temperature prediction deviates from the true temperature profile at the prominent peak. However, the prediction accuracy is considered to be at an acceptable level because the mean absolute error and maximum absolute error are less than 0.5 °C, which is the measurement tolerance of the thermocouple. Table 5 contains the mean absolute error and maximum absolute error of all five cases considered in this study.
In summary, the test results in this section demonstrate the benefits of PINN over FCN when a limited amount of training data is available. This study also indicates that all three PINN aspects should be applied together to enhance the prediction accuracy. However, the PINN architecture developed in this section still needs to be optimized since the test results show some areas requiring accuracy improvement. To address this concern, this paper conducts another study to find the optimum pre-layer combination for the PINN architecture.

B. EFFECT OF PRE-LAYER
As discussed in the previous section, the proposed PINN architectures still show inaccuracy during the tests. In this section, a new full factorial design of experiment study is conducted to find the optimum pre-layer and connection layer combination. In [44], the sine and exponential activation functions in the pre-layers are recommended based on the analytical solution of the thermal equation. The design of experiment theory outlines sixteen possible experiment combinations with the two possible connection layer structures, as shown in Table 6.
In the test results for the cases with the pre-layers connected to the multiply layer in PINN, the predictions made by most of the architectures do not converge and show significant prediction errors. However, for cases 6 and 8, the prediction errors are within the reasonable mean square prediction error size, which is less than 0.5 °C. In case 6, however, the predicted temperature diverges from the true value and follows a different trend from the true profile. Case 8 shows a poorly trained portion at the vertices of the profile, which leads to inaccurate prediction during the test. Table 7 shows the mean absolute error and maximum absolute error of the first eight cases in Table 6 when the multiply layer is implemented to connect the hidden layer in PINN. The battery temperature profile and predicted battery temperature profiles are presented in Figure 6.
In the test results for the cases with the pre-layers connected to the concatenate layer in PINN, case numbers 10, 12, 13, 14, 15, and 16 have a prediction error of less than 0.5 °C. Among them, cases 14 and 16 show good convergence to the true value with low mean and maximum absolute errors. Among all cases studied in this section, the pre-layer architecture presented in case 14 (bolded in Table 8) has the most accurate battery temperature prediction with the lowest absolute prediction errors. Table 8 shows the mean absolute error and maximum absolute error of the last eight cases in Table 6. The battery temperature profile and predicted battery temperature profiles are presented in Figure 7.
Based on the test results and evaluation of the battery temperature prediction accuracy, this study proposes a PINN architecture with the pre-layers containing exponential activation functions for the time, battery voltage, and open-circuit voltage and containing sine activation functions for the current input (case 14), with the concatenated layer as the connection layer for battery temperature prediction in the case of applying the lumped capacitance thermal model to the PINN.

VI. CONCLUSION
This study proposes a PINN to predict lithium-ion battery cell temperature, which is a piece of essential information for safe and robust lithium-ion battery operation. The main contributions of this paper are summarized as follows: (1) we developed a PINN by inserting the energy balance law into the loss function, implementing adaptive normalization to the loss function, and improving the neural network architecture with the pre-layer and connection layer; (2) we conducted a comparative study between PINN and FCN, which is the conventional neural network method, to prove that PINN is superior to FCN in predicting the battery temperature with limited data size and unidentified physics equations; and (3) we further investigated various pre-layer and connection layer architectures to find the PINN architecture with the highest battery temperature prediction accuracy. The results show that a PINN architecture with pre-layers containing exponential activation functions for the time, battery voltage, and open-circuit voltage and containing sine activation functions for the current input (case 14) and with a concatenated layer outperforms other architectures for battery temperature prediction with the highest prediction accuracy of 0.05 °C.

VII. FUTURE WORK
In future work, the PINN will be studied on a battery pack with actual vehicle driving profiles at various chamber temperatures. This future study will analyze the effect of the driving profile, chamber temperature, and larger battery pack size on PINN implementation.