Virtual Thermal Sensor for Real-Time Monitoring of Electronic Packages in a Totally Enclosed System

A monitoring system is essential for controlling temperatures under safe levels of operation. It is often challenging to attach temperature sensors directly to drive chips owing to the operating environment or geometric challenges. Based on this motivation, we present a model-based virtual thermal sensing technique for the real-time temperature monitoring of the electronics package. A few real sensors located far from the target position are utilized in this virtual sensing system. These are then connected to a well-tuned finite element model for data augmentation utilizing an inverse heat conduction framework. Therefore, the virtual sensor allows us to estimate the temperature without the aid of a sensor installed inside. However, this technique has a stability issue because it is classified into an inverse problem (i.e., an ill-posed problem). We propose a Tikhonov regularization method to address this challenge, including an efficient ridge estimator. The ridge estimator is used to select an optimal regularization parameter so that we can obtain the stable and reliable inverse solution. Since conventional ridge estimators rely on total transient errors, they require a significant computation. The proposed estimator is based on the bias and variance errors, not the total errors, which allow us to efficiently find the optimal parameter. In this paper, the thermal model is modeled using the finite element method, and the Krylov subspace-based model order reduction is employed to reduce the computational burden. Finally, the proposed virtual thermal sensor was experimentally validated utilizing a sealed cylindrical structure in which the commercial servo drive operated.


I. INTRODUCTION
This study presents a virtual thermal sensor (VTS) for the real-time monitoring of electronic packages. Recently developed electronic packages are compact enough to fit in various limited spaces; thus, no control cabinet is required. Accordingly, drive modules integrated with the electronics package have been designed, leading to more critical thermal challenges; therefore, it is necessary to monitor temperatures of electronics package. However, it is challenging to attach temperature sensors directly (or appropriately located) to the drive chips because of the enclosed The associate editor coordinating the review of this manuscript and approving it for publication was Jiajie Fan . structure or operating environment. The VTS can be utilized to provide cost-effective alternatives to costly and/or impractical physical measurement instruments. The VTS solely requires a few temperature sensors installed at an easily accessible location in the machine to estimate the thermal quantity of interest. It can be applied in several applications, such as electric vehicle batteries, electric motors, semiconductor manufacturing thermal processors, transformers, etc. [1]- [4]. For a similar purpose, the virtual sensor technique has also been studied for structural vibration problems [5]- [8].
The proposed VTS is based on the inverse heat conduction problem (IHCP) [9], [10]. The inverse problem is the process of calculating from a set of measurements and is VOLUME 10, 2022 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ widely used in image reconstruction [11]- [14] and virtual sensing [5]- [8]. Developing the VTS starts from modeling the heat transfer model. Updating thermal parameters and the filtering measurement errors are particularly added for accuracy and stability, respectively. It is also essential to determine the location and number of sensors. When estimating only one heat source, the closer the sensor is to the heat source, the better it is. Unconditionally increasing the number of sensors rather amplifies the measurement noise and should be avoided. When estimating multiple heat sources, it is desirable to select the location and number of sensors to have linearly-independent sensitivity coefficients with large magnitude [9], [10]. For heat transfer modeling, the finite element method (FEM) [15] is applied here. FEM provides highly accurate numerical results, but it is challenging to be employed for real-time monitoring because of its high computational burden. This can be alleviated by using the lumped thermal model [1], [3], [4], [16], but the lumped model may cause significant estimation errors in the IHCP. Another popular approach to reduce the computational cost is the model order reduction based on the dominant basis [17], [18]. The eigenvectors are commonly used as the basis in structural problems because they have physical meaning, i.e., structural mode shapes. Since the responses of the domain in the frequency band of interest are described principally by dominant mode shapes in its frequency band, this basis is useful in structural vibration. However, unlike structural problems, thermal modes do not exhibit resonance and hence, the behavior of a thermal system at a frequency is generally not dominated by a few number of eigenvectors. Therefore, for thermal problems, it is recommended to use a Krylov basis [19]- [22] rather than the eigenvector. The Krylov subspace technique is the moment-matching method. This method requires locating samples where the moments are matched, and the performance of this method in thermal problems has been validated by many studies.
During the FE modeling process, the thermal model parameters of the numerical model are tuned to reflect the real model well. Test measurement data are used to revise the FE model. This task is commonly referred to as the model update [23], [24]. Most of this research has focused on two approaches in thermal problems: the steady-state method [25] and the transient method [26]- [29]. Steady-state methods commonly require long duration experimental runs and it is difficult to calibrate the specific heat capacity. Therefore, the transient approach based on the IHCP with a relatively brief duration experimental run is employed in this work to achieve the real-time monitoring. The conjugate gradient method based on the whole-time domain [10], [26], [27] was used to solve the IHCP. Three thermal model parameters, namely, thermal conductivity, specific heat capacity, and thermal contact conductance, are updated in the target electronic packages. It is difficult to adjust the three parameters simultaneously. Thus, at the component level, the thermal conductivity and the specific heat capacity are concurrently revised first, and the thermal contact conductance is adjusted in the assembly model.
In addition, measurement errors should be carefully controlled in VTS because small errors are dramatically amplified through the inverse problem. To address this issue, many researchers have developed the stabilizers such as the sequential function specification method (SFSM) [9], regularization method (RM) [30], [31], iterative regularization (IR) [32], combined SFSM-RM [33], and randomized singular value decomposition [34]. These methods provide good stabilization performance, but those may require additional computational burden at every time step to reduce the ill-conditioning of the gain coefficient matrix. However, the Tikhonov regularization method [30], [31] only has a computational load when selecting the optimal Tikhonov parameter in pre-processing, not real-time sensing. Therefore, the Tikhonov method is considered in the proposed VTS.
The optimal Tikhonov parameter can be selected by the ridge estimator, such as the Morozov discrepancy principle [35], [36], generalized cross-validation (GCV) [37], [38], and L-curve rule [39], [40]. The most commonly used method is the Morozov discrepancy principle, which is relatively accurate and easy to use. This method can only be used when prior information is known [35], [36]. Currently, the noise level of temperature is well known; thus, the Morozov discrepancy principle is one of the best methods to recover the temperature history as accurately as possible, but it cannot guarantee the heat source estimation [41]. In addition, if the system size and data measurement time are increased, the conventional ridge estimator requires significant computational costs. This is because the discrepancy principle needs to calculate transient solutions at each regularization parameter discretized over a specified range. To improve this problem, we utilized an efficient ridge estimator proposed in the optimal hybrid parameter selection algorithm [42].
The key advantage of an efficient ridge estimator does not depend on FE transient solutions, unlike the conventional ridge estimator. Thus, regardless of the operating time of the system, the optimal Tikhonov parameter can be estimated quickly. This is possible because it is not based on the total square error but the sum of the bias and the variance errors. Each of the two errors is a function of the maximum change in heat flux and the noise level of the sensor, not transient solutions. In this study, a new ridge estimator for temperature estimation, rather than heat flux, is derived and used to maximize the performance of the real-time temperature sensing of the VTS.
Finally, the proposed VTS can be applied for the real-time monitoring of electrical machines with complicated heat flows and is accurate and stable. The methodology is demonstrated using a commercial servo drive, Gold Solo Twitter, which delivers up to 5 kW . The commercial servo drive operates inside a sealed cylindrical housing. This paper is organized as follows. Section II reviews the thermal modeling and presents a solution algorithm that includes the block diagonal formulation based on the Schur complement. Section III introduces the model update. Section IV suggests the virtual sensing algorithm using the Tikhonov regularization method and the ridge estimator. Section V evaluates the feasibility and performance of the proposed VTS. Section VI presents the conclusions.

II. THERMAL MODELING
This section briefly reviews the finite element model of the heat equation and the time integration scheme. We then introduce a model order reduction technique based on the Krylov subspace techniques [19], [22] for real-time sensing.

A. FINITE ELEMENT MODEL
The governing equations for the temperature fields can be approximated by the discretization method as the following algebraic differential equation [15]: where C and K are the heat capacity and heat transfer matrices, respectively. T and q are the temperature and thermal loading vectors, respectively. If the total number of degrees of freedom (DOF) is defined as N g , the matrix size of C and K is (N g × N g ). The vector size of T and q is (N g × 1). The relationship between the thermal parameters and the system matrices is presented in TABLE 1.
To solve (1), we employ the Backward Euler method, which is given: where subscript m denotes a time step, and time t is represented as the discretized form t m . t denotes the time step size (or measurement time interval) and is defined as ( t = t m − t m−1 ). Then, we can obtain as follows: We can now construct a measure of the residual error with T m and the measured temperature data to estimate the unknown parameters.

B. MODEL ORDER REDUCTION
The Krylov model reduction method is based on the conservation of the transfer function of the system. Assuming K is non-singular, Equation (1) can be rewritten as follows: In the Laplace domain, the transfer function H(s) can be defined with the Laplace variable s and its Taylor expansion around s = 0 leads to where the vector m k+1 is the moments of the transfer function and defined as m k+1 = M k b. Therefore, the Krylov subspace of a N g -dimensional vector space has a l-dimensional subspace where l ≤ N g . The projection matrix V for model reduction is defined as: where colspan denotes the span of the columns. It can be efficiently computed using Arnoldi [43] or Lanczos algorithms [44]. Finally, we can obtain the reduced-order model using V as follow: where the upper tilde( ) refers to the reduced quantity.

III. MODEL UPDATE
This section introduces a method to calibrate the thermal conductivity, specific heat capacity, and thermal contact conductance. The thermal conductivity and specific heat capacity are tuned simultaneously, and the thermal contact conductance is calibrated in an assembled state. The thermal parameters can be updated by minimizing the squared residual errors, and the sum of the squared residual errors is defined as: where Y m andT m are the vectors containing the measured and calculated temperatures at a certain time step m, respectively, VOLUME 10, 2022 and, L is the Boolean matrix that maps the vector to the components corresponding to the sensor positions. The superscript T indicates the transpose of the vector. I , N , and M are the number of sensors, the number of unknown parameters, and the number of time steps, respectively. To minimize S(P), we implement an iterative process based on the Conjugate Gradient method [10], [27], and the iteration step consists of the following steps: 1) Initialization of unknown parameters.
2) Construct the finite element heat transfer model: where superscript (r) denotes the number of iterations of the CG method, and sets r to 1 at first. 3) Determine the sensor information and compute the temperature using (3). 4) Calculate the gradient direction by differentiating (8a) with respect to the unknown parameters: where ∇S (r) j and X (r) m j , j = 1, · · · , N , are the j-th gradient direction and the j-th sensitivity coefficient vector, respectively. 5) Compute the conjugation coefficient, which is given by the Fletcher-Reeves [45] expression in the form: 6) Calculate the direction of descent d (r) : Determine the search step size β (r) : 8) Unknown parameters are updated as: 9) By the discrepancy principle [36], [46], the iterative procedure is stopped when the following criterion is satisfied: where ε is the tolerance. In general, the value of ε can be obtained by the standard deviation of the measurement noise using the assumption for the temperature residuals in the discrepancy principle. 10) Increase r by one and go to the second step.

A. THERMAL CONDUCTIVITY AND SPECIFIC HEAT CAPACITY
The thermal conductivity and the specific heat capacity are temperature-dependent parameters in practice. If the linearity of the material is strong within the operating temperature range of the machine, these two parameters can be assumed to be constant. In this case, the sensitivity coefficient matrix in (10) can be defined as follows: with where the sizes of X

B. THERMAL CONTACT CONDUCTANCE
The factors influencing the thermal contact conductance include contact pressure, interstitial material, surface deformation, surface roughness, waviness, and flatness [47], [48]. Since the quantification of these factors can only be measured in a statistical sense, it is very challenging to consider all. Therefore, we assume thermal contact conductance as a constant and estimate the constant based on the inverse heat conduction problem. To do this, the following assumptions are considered to address the inherent complexity of the thermal contact problems: 1) Contact surfaces are clean.
2) Only bulk material cases are considered, not thin-film material.

3) Contacting solids have isotropic thermal and physical
properties. 4) Contact surfaces are not warped or broken in the operating temperature range, and thermal resistance changes due to increased temperature may be ignored. 5) The contact pressure and thermal interface material are uniformly applied in the defined contact surfaces. The sensitivity coefficient matrix is now constructed to calibrate the value of the thermal contact conductance and is defined as: in which where h (j) , j = 1, · · · , N c , is the thermal contact conductance specified on the j-th contact surface. N c denotes the number of contact surfaces.

IV. VIRTUAL THERMAL SENSOR
The updated numerical model described in the previous section is used to estimate the unknown heat source and temperature distribution. Such time-varying function estimation is obtained by minimizing the ordinary squares norm based on the sequential time-domain. The estimated solutions based on the sequential time-domain method are extremely sensitive to measurement errors, causing severe oscillation in estimated solutions. We use a Tikhonov regularization method to reduce this instability. Therefore, we incorporate the zeroth-order regularization term into (8) and then obtain [9]: where g k,m is the internal heat rate generated inside the k-th body at time t m , and f l,m is the surface heat flux imposed on the l-th surface at time t m . S m andT m are the functions of P m . The second term on the right-hand side of (20a) is a zeroth-order regularization term. α is a regularization parameter. By solving the minimization problem in (20), the matrix normal equation and estimated solutions are derived as follows: in which whereT m is the virtual temperature under a uniform load condition (P m = P m−1 ), and P m is the estimated heat source vector. T m is the estimated temperature vector that contains information on the regions without sensors attached.
(∂q m /∂P m ) is the unit heat flux matrix and does not change over time in linear problems (i.e., the thermal material properties are constants).

A. REGULARIZATION PARAMETER FOR STABILITY
The regularization parameter α in (21a) controls the degree of filtering of the amplified measurement noise. If the value of α is too large, a significant amount of information in the solution is lost. Therefore, it is important to select the optimal α. In this work, we propose an efficient ridge estimator for the temperature estimation. The efficient ridge estimator is based on the sum of the squares of the bias and variance errors. It is already well known that the Tikhonov regularization method [30], [31] is a bias-variance trade-off approach. The square of the total error can be calculated as: where T , D and V are the total, the bias, and the variance errors, respectively, and are functions of α. First, the square of the bias error is defined as: in which δT bias,m is the bias error vector. To calculate the bias error vector, the gradient of (20a) is set to zero, which yields whereT m can be defined as follows: Here, δT bias,m is the bias error vector of the temperature corresponding to the sensor positions and is a function of α. When α is zero, δT bias,m becomes a zero vector, which indicates thatT m is over-fitted to Y m . Substituting (26) into (25) gives where X s X T s + is the Moore-Penrose generalized inverse of X s X T s . The inverse solution of the heat source tends to deviate initially and to become parallel to the exact solution. Using this characteristic, we assume that P m equals the exact heat source change when ignoring the deviated error. We then consider only the maximum heat source change to calculate the maximum bias error. Note that it is not reasonable to ignore the deviated error when the sensor position is far from the heat source [42]. Therefore, a sensitivity analysis is essential to find appropriate sensor location [9]. Equations (24) and (27) can be redefined as: δT bias = α XX T +X P max , (28b) Second, the square of the variance error is defined as: VOLUME 10, 2022 where δT var,m is a variance error vector of the temperature. To calculate the propagation and accumulation of the measurement error over time, Equation (21a) can be rewritten using LT m instead ofT m , as follows: where T 0 and q 0 are the initial temperature and thermal loading vectors, respectively. The initial errors of T 0 and q 0 are negligible if the thermal boundary condition is clearly defined with a well-established experimental environment. Equation (30a) could then be defined using only the measurement error: where δ P var,m is the variance error of the heat source change. δY m+1 is the measurement error vector and is normally distributed with zero mean and standard deviation, denoted by δY m+1 ∼ N (0, σ 2 ), where σ is an I -dimensional standard deviation (i.e., the measurement noise level of the sensors) vector.
The summation term in (31) represents the propagation and accumulation of errors, and it entails high computational resources. To solve this problem, we assume δY m as a regular oscillation: By substituting (32) into (31), we can obtain the time-independent variance error of the heat source change: where δ P var is the vector of the time-independent variance error of the heat source. M a is a time step at the moment when the norm of A M a is less than one. Equation (33) is derived when M a is odd, and in a similar manner, the equation can be derived when M a is even. Equation (21c) indicates that the variance error of the temperature is determined by the variance error of the heat source change. To compensate for the assumption in (32), the variance error of the temperature can be defined as follows: where δT s,m and δ P m are the temperature and the heat source change vectors with zero mean, respectively, and are calculated by δY m , which is the measurement noise data with zero mean. This equation can be calculated more efficiently through the Krylov model reduction method introduced in (7). Using (34), we can redefine (29) as follows: Finally, the total error of the temperature is calculated as the sum of D 2 and V 2 defined in (28a) and (35), respectively. Note that Equations (28a) and (35) do not depend on the temperature history, and thus, this approach can be utilized for the initial estimation without collecting measurement data in advance. It is also important that the bias error is a function of the heat source change and α, and the variance error is a function of the measurement noise and α.

V. EXPERIMENTAL RESULTS
In this Section, the proposed model-based virtual thermal sensor is experimentally validated using a sealed cylindrical structure in which a commercial servo drive operates. The cylindrical structure comprises three parts that are assembled by the thermal contact: the middle and bottom parts are made of aluminum 6061 T-6, and the top part is made of 303 stainless steel. A commercial servo drive is located inside this structure with the heat-sink as illustrated in Figure 1 and is used to control the EC motor, brushless, 400 W . We utilized a Gold Solo Twitter (Elmo Motion Control, Petah Tikva, Israel) that delivers up to 5 kW power in an average 1.87 in 3 compact package, and a flat heat-sink provided by Elmo Inc. was used to dissipate the heat. The Gold Solo Twitter (GTWI) has six MOSFETs located on the metal PCB, shown in Figure 1b. It is necessary to monitor the temperature of the MOSFET because most of the heat is generated there. However, it is not easy to attach the temperature sensor directly. Even if a sensor can be inserted inside, the GTWI is designed to be compact,  making it difficult to attach the bead of the sensor directly to the MOSFET. Therefore, we use a virtual thermal sensor to estimate the temperature and heat source of the MOSFET without the aid of a sensor installed inside.

A. EXPERIMENTAL SET-UP AND NUMERICAL MODELING
This study strictly controls the experimental environment to eliminate unpredictable error factors. First, the heat dissipation by convection and jig should be well established. Convection is the source of non-white noise, which is difficult to filter. Therefore, it is desirable to remove the effects of convection using a vacuum chamber; thus, we utilized the chamber and vacuum pump in Figure 2.
In all the experiments, the degree of vacuum and the chamber's internal (= ambient) temperature are set at 100mbar    and about 50 • C, respectively. In these chamber conditions, the convective heat transfer coefficient, τ , is about 6.16W /m 2 K [49], [50]. We minimize the heat dissipation by the jig using a thread with low thermal conductivity, as illustrated in Figure 4. T-type thermocouples (Copper-Constantan) were used, and the sampling rate is 2Hz for all experiments. This type of thermocouple (TC) has a  continuous temperature range from −250 up to 350 • C and maximum error of ±1.0 • C or ±0.75%. For the numerical modeling of the VTS, we consider the three-dimensional heat transfer problem and implement finite element modeling with eight-node hexahedral elements. The information for the FE models is listed in TABLE 2, and herein, the DOF per node is one. The experimental setup conditions for the model update process and the virtual sensing are listed in TABLE 3, and The detailed spatial information of sensor location is described in Figure 3.

B. MODEL UPDATE
The model update process was performed to minimize modeling errors. The model parameters were tuned in the range of 50 • to 65 • C considering the operating temperature range of the housing. After calibrating the thermal conductivity and the specific heat capacity before assembly, the thermal contact conductance was tuned by assembling. The experimental setup for tunning the two parameters is illustrated in Figures 4 and 5, and the Figure 6 describes the experimental setting for updating the thermal contact conductance. The mean squared error (MSE) of the sensors is adopted to evaluate the performance of the model update and is defined as: where i and m denote the sensor number and the time step. The initial and updated values of the two parameters for each material are listed in

C. VIRTUAL THERMAL SENSING
The updated model is utilized to construct the processing unit of the VTS. Algorithm is programmed using MATLAB      One thermocouple (TC 16) is used for virtual sensing, and the location of the sensor is illustrated in Figure 10. This measured temperature corresponds to Y m in (21a). The regularization parameter α used for stability is 2.16 × 10 −17 , obtained by the proposed ridge estimator in Section IV-A. The optimal α value selected by the proposed ridge estimator is compared with the Morozov discrepancy principle to prove that it is reasonable. The results are shown in Figure 11 and indicate that the optimal α values estimated by the two methods are similar. The average elapsed times to estimate α by the Morozov and the proposed methods are 16.13 hours and 4.85 min, respectively.  First, we estimate the heat source of the MOSFET using the VTS with one thermocouple (TC 16) attached to the floor, and the result of the estimated heat source is shown in Figure 12. Finally, four thermocouples (TC 12∼15) are used to prove that the estimated temperature distribution by the VTS is reliable. Three thermocouples (TC 13∼15) are attached to the housing, one to the heat sink (TC 12), and the locations of attachment are illustrated in the Figures 6 and 10. The results measured by thermocouple (TC 12) and virtual sensor (TC 12 and MOSFET) are shown in Figure 13a, and Figure 13b shows the MSE results evaluated by the four thermocouples (TC 12∼15). The temperature distribution of the unmeasured position are shown in Figure 15. Finally, Figure 14 describes the computational cost of implementing the virtual sensor. The processing time per step is the time required for m to increase by one. The Krylov basis used in the reduced-order modeling is three, and the processing time per step is about 270 times faster than the Full model.

VI. CONCLUSION
This study developed a virtual thermal sensor for the real-time monitoring of electronic packages with sensor attachment limitations. Minimizing modeling errors and measurement errors were the most important in developing reliable VTS, and real-time virtual sensing requires programming algorithms that can be implemented using affordable processors. A finite element model and its updating technique were used to minimize the uncertainties of the thermal parameters that arise during the manufacturing process. The computational burden of the finite element model was reduced using the Krylov model reduction method. This implementation saved the processing time per time steps about 270 times rather than the full model used. In addition, the Tikhonov regularization method was utilized to reduce the sensitivity to measurement errors, and we proposed an estimator for efficiently selecting the regularization parameter. The proposed estimator was verified by comparing it with the widely used Morozov discrepancy principle. The time required for each method to estimate the optimal α was 16.13 hours and 4.85 min, respectively, demonstrating efficiency.
Finally, the proposed VTS was implemented and experimentally evaluated with an enclosed structure in which the commercial servo drive, Gold Solo Twitter, operates. The VTS monitored that MOSFETs, the main heat source of Gold Solo Twitter, operates near 85 • C, generating about 2.5 W of heat. The advantage of the VTS is that it can estimate the amount of heat generated without electrical knowledge. In addition, taking advantage of the finite element method, the proposed VTS can also be installed in systems with more complicated heat flows. Therefore, the proposed VTS can be sufficiently applied to other applications, such as robot actuators, electric vehicle batteries, electric motors, semiconductor manufacturing thermal processors, and transformers.