An LSTM-PINN Hybrid Method to Estimate Lithium-Ion Battery Pack Temperature

Physics-based models for battery temperature prediction are often not suitable for online applications due to the large number of fitted parameters, low-fidelity results from parameter inaccuracy and unaccounted model dynamics, limited high-quality experimental data, and slow convergence of predictions from unknown initial conditions. On the other hand, data-driven models require much less computational power but need a large dataset to learn the time-dependent behavior. This paper proposes a physics-informed neural network (PINN) to take full advantage of both physics-based and data-driven models. Four comparative studies were performed to investigate the effectiveness of including chamber temperature with two different activation functions, the optimal number of neurons and hidden layers, the dependence on reversible heat generation, and the performance of the long short-term memory (LSTM)-PINN model. The results show that the LSTM-PINN with chamber temperature as one of the inputs delivers better prediction accuracy. The LSTM-PINN with an exponential activation function for the chamber temperature has a more accurate prediction for the direct current fast charge (DCFC) test profile and similar prediction accuracy for the grade load (GL) 100 test profile. The root mean square errors (RMSEs) of the LSTM-PINN are 0.57°C for DCFC and 0.52°C for GL 100, respectively. In addition, having 55 neurons and 4 hidden layers gives the lowest prediction error. Furthermore, the improvement from including reversible heat generation is negligible. Finally, the LSTM-PINN model has less prediction error than the LSTM model, especially when the battery temperature range is wide.

order non-uniform model to trace the temperature gradient inside a module under different load profiles. They found that the maximum temperature location for most of the cases is the middle of the module. In addition, as the current magnitude increases, the hottest cell moves to the positive side of the module. Furthermore, the temperature difference between the hottest and coldest cells is inversely related to the current magnitude. Similar studies were done in [8] and [9] using both the heat transfer equation and the energy conservation equation.

Another factor that plays an important role in the temperature distribution is cell voltage deviation. A group of researchers used three voltage balancing strategies to minimize the temperature gradient of a battery array for high depth of discharge applications in [7]. They believe that if cells touch one another, heat usually accumulates in the middle of the module and the distribution is mainly affected by the arrangement of the cells. However, if there are air gaps between cells, power losses determine the temperature distribution of the module.

In addition to the current magnitude and cell voltage deviations, the spacing among cells was investigated in [10] and [11]. Some researchers studied the temperature distribution of an 8-cell array by varying the air gap spacing and the height of the air inlet and outlet [10]. The eight cells were discharged under normal driving conditions and an easily applied optimization method was employed to find the optimum. Heat transfer equations were used in this study. Similarly, another group of researchers investigated the temperature distribution for square and rectangular cell arrangements with forced convection [11]. The heat transfer equation and energy conservation equation were used in the models.
It was found that the temperature distribution varies between these two arrangements. The square arrangement had a higher temperature difference than the rectangular one.

The design of cooling channels is the last factor reviewed in this paper. Ref. 11 studied how the temperature distribution is affected by cooling channel parameters. With the assumptions that the thermal conductivity of the battery is anisotropic and other properties are homogeneous, an energy conservation equation and a heat transfer equation were used in their models. Experimental data was used to fit the heat generation rate and state of charge formulas. They found that increasing the discharge current reduces the temperature uniformity and increases the maximum temperature of the pack.

Modeling heat distribution is challenging because it requires high-order physics-based models. A common approach to address this is order reduction. Thus, a reduced lumped model to estimate the temperature distribution was proposed in [13]. In their work, partial differential equations and ordinary differential equations were reduced to algebraic equations.

In summary, works [5], [6], [7], [8], [9], [10], [11], [12], [13] use physics-based models such as the equivalent circuit model, heat transfer equations and/or the energy conservation equation to predict temperature distribution inside an array. However, this approach has three shortcomings. The first one is that the number of fitted parameters is usually high [13].

The shortcoming of using data-driven models is the need for a large dataset to learn the time-dependent behavior.

The remainder of the paper is organized as follows:

One of the most common and simple forms of neural networks used in practice is known as artificial neural networks (ANN). These networks are made up of multiple layers where each layer is constructed with numerous neurons.
These neurons map the input data to output data by applying a set of weights and offsets to the data to minimize the error between an expected outcome or prediction and a known value. In this case, information only flows in one direction and these types of ANNs are known as feedforward neural networks (FNNs). Other network architectures exist, such as bidirectional networks, but these are not considered in this study because they require more computational power than FNNs. Combining a physics-based model with a bidirectional network does not reduce the computational cost of the bidirectional network.

The cell structure utilized in this analysis is illustrated in Figure 1. The ANN cell takes a vector of inputs at time $t$ denoted as $x_t$. As information flows through the ANN, the input vector is multiplied by a matrix of weights labeled as $W_{xi}$ and an offset vector $b_i$ is applied. Lastly, the data is passed through an activation layer which dictates what information is allowed to flow to the output vector $Y_t$. Equations 1 and 2 illustrate the mathematical operations inside the ANN cell to map data from $x_t$ to $Y_t$.

Taking the cell structure of the ANN, the FNN is created with an input layer, hidden layers, and output layer. The hidden layers are fully connected to all inputs in the vector $x_t$ and each layer is composed of numerous ANN cells. Figure 2 gives a high-level illustration of a generic FNN.
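Read literally, the cell description above is an affine map followed by an activation. A minimal sketch in LaTeX of that reading is given here; the intermediate variable $z_t$ and the activation symbol $\sigma$ are notation assumed for illustration, and the paper's own Equations 1 and 2 may be written differently.

```latex
\begin{align}
z_t &= W_{xi}\, x_t + b_i \\
Y_t &= \sigma(z_t)
\end{align}
```

Stacking several such cells, with the output of one layer used as the input of the next, gives the fully connected FNN structure described in the text.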
where $Y_t$ is a vector containing all outputs at time $t$, $W_{xi}$ is

(PINN) in this study to predict battery temperature. In this thermal model, the energy balance is described in (4).
where $m$ is the mass of the battery module in g, $C_p$ is the heat

irreversible heat generation.
where $I$ is the battery current in A, $V_{ocv}$ is the open-circuit voltage (OCV) of the module in V, and $V$ is the battery voltage in V. The term for the reversible heat generation is described below:

The above term represents entropic heating and is related to electrochemical reactions with Li-ion insertion and extraction between the cathode and anode. The term for the irreversible heat generation is expressed below.
This term describes ohmic loss.
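The two heat terms described above resemble the widely used Bernardi-type heat generation model. The following is a minimal Python sketch under that assumption: it takes the irreversible term as $I(V_{ocv} - V)$ and the reversible (entropic) term as $-I\,T\,\partial V_{ocv}/\partial T$, with discharge current counted as positive. Sign conventions vary between references, and the function names and the `docv_dt` argument are illustrative rather than taken from the paper.

```python
def irreversible_heat(current_a, v_ocv, v_terminal):
    """Ohmic/polarization heat in W, assuming Q_irr = I * (V_ocv - V).

    With discharge current positive, this is non-negative for both
    charge (I < 0, V > V_ocv) and discharge (I > 0, V < V_ocv).
    """
    return current_a * (v_ocv - v_terminal)


def reversible_heat(current_a, temp_k, docv_dt):
    """Entropic heat in W, assuming Q_rev = -I * T * dV_ocv/dT.

    temp_k is the absolute temperature in K and docv_dt is the entropic
    coefficient in V/K (an assumed input, usually measured separately).
    """
    return -current_a * temp_k * docv_dt


def total_heat(current_a, v_ocv, v_terminal, temp_k, docv_dt,
               include_reversible=True):
    """Total heat generation, with the reversible term optional."""
    q = irreversible_heat(current_a, v_ocv, v_terminal)
    if include_reversible:
        q += reversible_heat(current_a, temp_k, docv_dt)
    return q
```

The `include_reversible` flag mirrors the comparison between a model with only irreversible heat generation and a model with both terms.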

This study explores the prediction of total heat generation in two different ways. One way includes only irreversible heat generation presented in (8). The other way is shown in (9) to include both reversible and irreversible heat generation.
where $f_{ir}$ and $f_{re\_ir}$ are the parameterized heat transfer models for irreversible heat generation and for both reversible and irreversible heat generation, respectively. $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$, and $\lambda_5$ are the coefficients for heat generation, thermal conductivity between the center of the module and the outboard cross-section of the module, thermal conductivity between the center of the module and the inboard cross-section of the module, thermal conductivity between the outboard cross-section of the module and air, and thermal conductivity between the inboard cross-section of the module and air, respectively. In PINN, no additional effort is required to find these coefficients. The Adam optimization method obtains the optimum coefficients during the training process.
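To illustrate how an optimizer such as Adam can recover a physical coefficient from data, the toy sketch below fits a single scalar `lam` in a hypothetical relation `temp_rate ≈ lam * heat_input`. It is a pure-Python illustration with synthetic data, not the paper's training code: the one-coefficient model, the function name, and all numeric values are assumptions, and in the actual PINN the coefficients would be trained together with the network weights.

```python
import math


def fit_coefficient_with_adam(heat_inputs, temp_rates, lr=0.05, steps=2000,
                              beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit lam in temp_rate ~ lam * heat_input by minimizing the MSE with Adam."""
    lam = 0.0        # initial guess for the trainable coefficient
    m = 0.0          # first-moment (mean) estimate of the gradient
    v = 0.0          # second-moment (uncentered variance) estimate
    n = len(heat_inputs)
    for step in range(1, steps + 1):
        # Gradient of mean((lam * q - r)^2) with respect to lam.
        grad = sum(2.0 * (lam * q - r) * q
                   for q, r in zip(heat_inputs, temp_rates)) / n
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad * grad
        m_hat = m / (1.0 - beta1 ** step)   # bias-corrected first moment
        v_hat = v / (1.0 - beta2 ** step)   # bias-corrected second moment
        lam -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return lam


# Synthetic data generated with a known coefficient of 2.0 (not from the paper).
heat = [0.5, 1.0, 1.5, 2.0]
rates = [2.0 * q for q in heat]
lam_fit = fit_coefficient_with_adam(heat, rates)
```

The design point is that the coefficient is treated like any other trainable parameter, so no separate identification experiment is needed in this toy setting.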

PINN conducts learning from the data and physics laws. To accommodate this objective, the loss function of PINN contains three terms to bias learning to the physics law. The mathematical expression of the loss function in the physics-informed neural network is defined as follows:

where $Loss_r$ is the loss term associated with the residual between the predicted and the measured temperature, $Loss_f$ is the loss term related to the physics law, $Loss_i$ is the loss term estimated from the initial condition, and $\alpha$ and $\beta$ are the normalized coefficients, respectively. The mathematical expression for each loss term is provided in (11)-(13).

where $N$ is the number of training data points, $\hat{x}_i$ is the predicted temperature, $x$ is the measured temperature, $f$ is the rearranged physical law equation, $f(t = 0)$ is the value estimated from the physics law when the time is 0, and $x_i$ is the initial measured temperature.

(15) and (16).
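The three-term loss described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the total is taken as $Loss_r + \alpha\,Loss_f + \beta\,Loss_i$, the data and physics terms are mean squared values, and the initial-condition term is a single squared difference. The exact weighting and per-term expressions in the paper's equations may differ, and the function and argument names are hypothetical.

```python
def mean_square(values):
    """Mean of squared values; used for both residual-type loss terms."""
    return sum(v * v for v in values) / len(values)


def pinn_loss(predicted, measured, physics_residuals,
              initial_estimate, initial_measured, alpha=1.0, beta=1.0):
    """Return (total_loss, (loss_r, loss_f, loss_i)).

    loss_r: data residual between predicted and measured temperature.
    loss_f: residual of the rearranged physics equation f, ideally zero.
    loss_i: mismatch between f(t = 0) and the initial measured temperature.
    alpha, beta: weighting coefficients (assumed to scale loss_f and loss_i).
    """
    loss_r = mean_square([p - m for p, m in zip(predicted, measured)])
    loss_f = mean_square(physics_residuals)
    loss_i = (initial_estimate - initial_measured) ** 2
    total = loss_r + alpha * loss_f + beta * loss_i
    return total, (loss_r, loss_f, loss_i)


total, parts = pinn_loss(
    predicted=[25.0, 26.0], measured=[25.5, 25.5],
    physics_residuals=[0.1, -0.1],
    initial_estimate=25.0, initial_measured=25.2,
    alpha=0.5, beta=2.0,
)
```

Returning the individual terms alongside the total makes it easy to monitor whether the data term or the physics term dominates during training.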
where $\gamma$ is a tunable hyperparameter and 0.9 is used in

Figure 6). After that, the predictions of the LSTM model (Temp1 and Temp2 in Figure 6) along with the other inputs feed into the PINN model to generate Temp3 in Figure 6. The PINN then predicts the temperature in a different position of the module of interest. The overview of the LSTM-PINN hybrid model is provided in Figure 6.

To generate the test data used for training and testing the LSTM-PINN, various drive cycles listed in Table 2 were conducted on the battery pack using the test equipment outlined previously. These drive cycles are multi-cycle test (MCT), Vmax, US06, federal test procedure (FTP) 20, direct current fast charge (DCFC), and grade load (GL) 100. All data was recorded at 1 Hz for use in the LSTM-PINN model. These drive cycles with different charge and discharge patterns were selected to evaluate the ability of the LSTM-PINN model to learn the effects of reversible heat generation. Also, the test profiles contain a rest time and a large current pulse, which could disturb the temperature prediction, to evaluate the robustness of the LSTM-PINN model during these events. The initial temperature column of Table 2

where $N$ is the total number of points in the test data, $i$ is the index, $x_i$ is the true value in the test data, and $\hat{x}_i$ is the predicted value.
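The symbols defined above match the standard root mean square error, $\mathrm{RMSE} = \sqrt{\tfrac{1}{N}\sum_{i=1}^{N}(x_i - \hat{x}_i)^2}$. A minimal Python sketch of that standard definition is shown here; the function name and input checks are illustrative.

```python
import math


def rmse(true_values, predictions):
    """Root mean square error between measured and predicted temperatures."""
    if len(true_values) != len(predictions) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_values)
    squared_error = sum((x - x_hat) ** 2
                        for x, x_hat in zip(true_values, predictions))
    return math.sqrt(squared_error / n)
```

Because the error is squared before averaging, a few large deviations (for example during a large current pulse) weigh more heavily than many small ones.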

In this section, the results of the four comparative studies

, the chamber temperature of the application is assumed to be constant. Therefore, their PINN architecture does not have the chamber temperature as one of the inputs to the PINN. However, the chamber temperature often varies as heat is exchanged between the chamber air and the pack. Therefore, this paper treats the chamber temperature as a variable. By using it as one of the inputs to the LSTM-PINN model, the effect of chamber temperature is analyzed. The analysis in this section compares three different predictions from the three different LSTM-PINN structures. One is the LSTM-PINN without the chamber temperature input layer. The other two are the LSTM-PINN with the chamber temperature input layer but different activation functions: sine and exponential. Figures 9 and 10 provide the prediction outcomes with the mean square errors in Tables 4 and 5.

The results show that the LSTM-PINNs with the chamber temperature in the input layer have a more accurate and stable prediction. For the prediction made by the LSTM-PINN without the chamber temperature input layer, in both test profiles, the initial temperature prediction has the most significant

FIGURE 10. With and without chamber temperature inputs study for GL100 test profile.

Table 6 and Table 7 show the outcome of the test results.

neurons are the best performers for 3, 4, 5, and 6 hidden layers, respectively. 4, 3, 4, 5, and 3 hidden layers are the best performers for 45, 50, 55, 60, and 64 neurons, respectively. However, the prediction accuracy table for the GL100 test profile shows no large error difference due to changes in the number of neurons and hidden layers. Thus, the subsequent two comparative studies used 55 neurons per hidden layer and 4 hidden layers.

One of PINN's benefits is its tolerance of an incomplete physical law [15]. Two battery thermal models were incorporated into the proposed LSTM-PINN to analyze the effect of incompleteness in the physical law on the LSTM-PINN. One battery thermal model only contains irreversible heat generation. The other battery thermal model includes both irreversible and reversible heat generation to describe the physics better. The two battery thermal models are provided in (8) and (9). The results of the temperature prediction with the test profiles are provided in Figure 11 and Figure 12.

LSTM model which solely relied on data to train the prediction method, there is a higher prediction error where the geometric location of the prediction is different from the location of the input temperature in the LSTM model. The proposed LSTM-PINN model overcomes the drawback of the LSTM model by implementing the physical law along with the loss function modification, the adaptive coefficient in the loss function, and the PINN architecture in the neural network structure.