How Can a Cutting-Edge Gallium Nitride High-Electron-Mobility Transistor Encounter Catastrophic Failure Within the Acceptable Temperature Range?

Commercial gallium nitride (GaN) high-electron-mobility transistors used in power electronics applications outperform silicon (Si)-based transistors. Combined with their better radiation hardness, this makes them key candidates for high-performance power systems in harsh environments such as space. For this purpose, however, it is essential to understand the potential failure mechanisms (FMs) of the devices in depth. Here, we demonstrate how repeated thermomechanical stress in a power cycling (PC) test within the specified operating conditions destroys the GaN device. Based on leakage current localization analysis, we identify an FM whose root cause was previously unknown. Using emission microscopy, focused ion beam cutting, and scanning electron microscopy, we reveal that multilayer cracks in the GaN die are triggered by a leading commercial package structure that otherwise shows excellent capability under frequent thermomechanical stress. Multiphysics simulations show that the structural factors behind the strong performance of the components inside the package are directly related to the failure pattern. This article is accompanied by a video comparing the dynamic thermal distribution measured by thermography in a practical experiment with a multiphysics simulation result during a single PC of a PC test. It is also accompanied by a supplementary figures file showing the test environment, the specimen preparation process, and the reverse engineering results for the simulation model.


I. INTRODUCTION
Gallium nitride (GaN) high-electron-mobility transistors (HEMTs) have drawn considerable attention as an alternative to silicon (Si)-based transistors in power electronics [1], [2]. Their electrical performance in practice already exceeds the theoretical limit of Si-based transistors [3]. In addition, because they withstand radiation and high temperature better than Si transistors, they could be used more stably and efficiently in new applications such as electric vehicles, aerospace, and renewable energy [4]-[6]. Despite these advantages, the market has steadily raised questions about their reliability risk [1], [7], [8]. Because GaN devices are new components, reliability standards for them are still being prepared [9], and phenomena not observed in Si devices have been reported for GaN devices [10]-[16]. Hence, for GaN devices to be widely selected by designers, research on their reliability is becoming increasingly important.
Most field failures of power switching devices in power conversion circuits are caused by thermomechanical stress [9], [17]-[19]. The failure mechanisms (FMs) of commercial GaN devices under thermomechanical stress have been studied in power cycling (PC) test environments [20]. Some of the revealed FMs are similar to failure phenomena of Si devices [21]-[24], while others, such as the drain-to-source OFF-state leakage current (IDSS) failure reported by our group in PC tests, have remained unexplained in terms of root cause [25], [26]. This failure phenomenon leads to short-circuit operation in a power conversion circuit, which could destroy the entire power conversion system as well as the GaN device. To prevent this risk, in-depth research into this IDSS FM is required.
Among various potential root causes, only a few candidates were found to be consistent with observation [27]. The mechanisms are summarized in Fig. 1, and the analysis points to an as-yet unclarified FM: either overheating or a dielectric crack. The previous study could not pinpoint the cause of the failure for two reasons: the test conditions exceeded the maximum allowed operating temperature of 150°C, and the analysis techniques available could not reach defect locations deep inside a complex package [28]. To find the root cause of this failure, we conducted an additional PC test within the acceptable temperature range and applied high-level analysis techniques that can visualize deep cracks inside the package without destroying them.
[Fig. 1: summary of candidate failure mechanisms for the IDSS failure in the PC test [21]. When the IDSS failure was confirmed, the threshold voltage (V th ) and the on-state resistance (R DS(ON) ) were in the normal range. These three parameters were characterized at room temperature. (Words a : remaining possible failure mechanisms that could lead to the IDSS failure in the PC test; words b : failure mechanisms proven not to be the cause of the IDSS failure in the PC test.)]
In this article, we describe the leakage current FM of a 650-V cutting-edge commercial GaN device (model: GS66508P of GaN Systems Inc. [29]) in an accelerated degradation test that cycles the power of the component to induce temperature swings from 24.1°C to 124.1°C. After failure, a photon emission microscope (EMMI) and a focused ion beam (FIB) are used in the failure analysis (FA) to identify the exact defect locations without additional damage. The analysis reveals that repeated PC induces dielectric cracks in the GaN die, which are consistent with the leakage current failure. To further corroborate the FA, a detailed multiphysics finite element method (FEM) simulation is employed. Based on the FA and the simulation, we show how structural factors of the advanced package without a bond wire affect these cracks.
The knowledge obtained in this article will help not only designers to use this GaN device safely but also manufacturers to make more reliable GaN devices.

II. POWER CYCLING TEST
To examine whether the IDSS failure is induced by overheating, an additional PC test was carried out in which the temperature stays below 150°C, the maximum junction temperature specified by the manufacturer, up to and including the point of failure. Two online parameters, the maximum ON-state resistance (R DS_MX ) and the temperature swing (ΔT _meas ), are monitored during the PC test. In the test, the temperature (T _meas ) is measured on the top surface of the discrete GaN device with an infrared (IR) camera. The device is painted matte black to achieve higher emissivity. This measurement environment was verified with an emissivity value of 0.96 from 25°C to 170°C, as described in detail in Fig. S1 in the supplementary material. The slight difference between T _meas and the highest temperature expected inside the die is explained with the FEM simulation in Section IV. R DS_MX is calculated from the drain-to-source current (I DS ) and the maximum drain-to-source voltage (V DS_MX ) measured in each PC. I DS is calculated from the voltage across the shunt resistor connected in series between the source node of the GaN device and ground. At the beginning of the test, the loading current was set to 17.0 A, which produces the desired temperature swing ΔT _meas from 24.1°C to 124.1°C. This is 68% of the maximum continuous current guaranteed by the manufacturer at a case temperature of 100°C. The principle and operating strategy of the test are described in detail in Fig. S2.
As a result, two failure phenomena are observed: thermal conductivity degradation and IDSS failure. Fig. 2(a) and (b) exhibit the two online parameters (R DS_MX and ΔT _meas ) monitored in the PC circuit and the offline parameters (V th , R DS(ON) , and IDSS) measured by a curve tracer at room temperature, respectively. The details of the measurement are elaborated in Fig. S2. The first failure phenomenon is indicated by the increase of ΔT _meas and R DS_MX after the take-off point in Fig. 2(a), together with the unchanged R DS(ON) in Fig. 2(b) at the end of the test [25], [26]. In line with previous studies, the cause of this failure is solder joint fatigue. Fig. 3(b) and (c) show the solder joint images obtained with a scanning acoustic microscope (Model: KSI V-8 from IP Holding). Comparing them, solder delamination can be seen underneath TPAD, where most of the heat is concentrated during the test. It is worth noting that the temperature swings employed are well within the acceptable operating conditions, yet the leakage current failure still develops. This indicates that the overheating scenario in Fig. 1 cannot explain the IDSS failure. Thus, the remaining dielectric crack hypothesis in Fig. 1 is investigated extensively with additional structural FA.
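The online calculation of R DS_MX described above can be sketched as follows. This is a minimal illustration, not the authors' acquisition code: the shunt resistance, shunt voltage, and V DS_MX reading are hypothetical values chosen only so that I DS equals the 17.0 A quoted in the text. The continuous current rating is not stated in the text; it is merely implied by the 68% figure.

```python
def drain_source_current(v_shunt: float, r_shunt: float) -> float:
    """I_DS from the voltage across the series shunt resistor (Ohm's law)."""
    if r_shunt <= 0:
        raise ValueError("shunt resistance must be positive")
    return v_shunt / r_shunt


def max_on_resistance(v_ds_max: float, i_ds: float) -> float:
    """R_DS_MX from the maximum drain-to-source voltage and the load current."""
    if i_ds <= 0:
        raise ValueError("load current must be positive")
    return v_ds_max / i_ds


# Hypothetical readings (assumptions, not measured data from the article).
R_SHUNT = 0.010  # ohm
V_SHUNT = 0.170  # volt, gives I_DS = 17.0 A
V_DS_MX = 1.36   # volt

i_ds = drain_source_current(V_SHUNT, R_SHUNT)
r_ds_mx = max_on_resistance(V_DS_MX, i_ds)

# Rating implied by "17.0 A is 68% of the maximum continuous current".
implied_rating = 17.0 / 0.68

print(f"I_DS = {i_ds:.1f} A, R_DS_MX = {r_ds_mx * 1e3:.0f} mOhm")
print(f"Implied continuous current rating: {implied_rating:.1f} A")
```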

A. Defect Localization
Backside photon emission microscopy (EMMI) is used to find the leakage locations of the failed device [30], [31]. The indium gallium arsenide camera used in the EMMI can not only visualize patterns of the die but also detect minute photon emission in the regions where leakage current flows. This analysis gives a more accurate localization result when performed on the backside because the GaN layer and the Si substrate are transparent in the near-IR range that the camera can see [32]-[35]. For the backside EMMI analysis, the PC-tested GaN device was partially decapsulated on the bottom side with fuming nitric acid. The preparation process of this sample is shown in Fig. S6. After the decapsulation, the bottom side of the GaN-on-Si die is exposed without any damage to the GaN HEMTs, copper (Cu) vias, and Cu plates on the upper side of the package.
An EMMI analysis can be divided into two processes: patterning and photon emission. The patterning visualizes the active structure on a thick Si substrate [34]. The analysis is conducted in a dark chamber to eliminate interference from visible light. In the patterning, the ohmic contact regions between the AlGaN and the metallization structure in the source and drain, the metal, and the gate structures are visualized. We can recognize the active structures of the GaN HEMTs from the patterning image [33], [35]. After the patterning, the photon emission is performed under the electrical conditions at which the IDSS failure was identified: V GS = 0 V and V DS = 5 V. Fig. 4(a) exhibits the prepared EMMI analysis environment for IDSS defect localization. Fig. 4(b) shows a patterning image of the defective device analyzed in the backside EMMI. Various abnormal patterns suspected to be defects are observed in this image, and they can be classified into two types. The first conspicuous defective pattern is a long horizontal crack marked with blue arrows in Fig. 4(b). This crack lies at the vertical center of the GaN die and runs from the center of the die to the left edge. In addition, several dot-like defect patterns marked with yellow arrows in Fig. 4(b) are observed around the horizontal crack. They are also located around the vertical center of the die. All defects found in the patterning process lie in the defective zone marked in Fig. 4(b). They might be directly related to the IDSS failure. These results are examined together with the photon emission analysis conducted under the IDSS measurement conditions. Fig. 4(b) visualizes the defective positions directly contributing to the IDSS failure. In general, photon emission may occur even in normal operation, so the analysis of the defective device should be compared with a normal device that has not undergone a PC test.
The overlapping positions of the crack patterns and the photon emission in these two images indicate where the direct cause of the IDSS failure lies. In the following sections, cross-sectional analysis is performed at Points A and B marked in Fig. 4(d) to understand the mechanism of the IDSS failure.
The IDSS failure mode without a change of R DS(ON) and V th identified in Fig. 2 and Fig. S4 can be explained by these local defects. The GaN die consists of multiple unit devices connected in parallel to one gate electrode, one drain electrode, and one source electrode. Because R DS(ON) is defined at V GS = 6 V and I DS = 9 A, it represents the resistance of the total source-to-drain channel width. The defects can hardly influence ON-state parameters such as R DS(ON) and V th because the defective channel width observed in Fig. 4(b) is very small compared with the entire channel width of the die. These local defects, on the other hand, are sufficient to serve as a leakage current path in the OFF state. Therefore, the IDSS failure phenomenon accompanied by normal R DS(ON) and V th can be explained by this consideration.
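One simple way to see why a local defect can dominate the OFF-state leakage while leaving R DS(ON) unchanged is to treat the cracked region as a resistive path in parallel with the healthy channel. The sketch below is purely illustrative: the lumped model and every numeric value except V DS = 5 V are assumptions, not data from the article.

```python
def parallel(r_a: float, r_b: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


# Hypothetical values (assumptions for illustration only).
R_ON_HEALTHY = 0.050    # ohm, on-state resistance of the whole die
R_LEAK = 1.0e4          # ohm, resistance of the crack-induced leakage path
I_OFF_HEALTHY = 1.0e-6  # A, off-state leakage of an undamaged die
V_DS_OFF = 5.0          # V, the V_DS used for photon emission in the text

# ON state: the leakage path is in parallel with a far smaller resistance,
# so the measured on-state resistance barely moves.
r_on_damaged = parallel(R_ON_HEALTHY, R_LEAK)
on_state_change = abs(r_on_damaged - R_ON_HEALTHY) / R_ON_HEALTHY

# OFF state: the channel is blocked, so the leakage path carries almost
# all of the current and the measured leakage rises sharply.
i_off_damaged = I_OFF_HEALTHY + V_DS_OFF / R_LEAK
off_state_ratio = i_off_damaged / I_OFF_HEALTHY

print(f"Relative change of on-state resistance: {on_state_change:.1e}")
print(f"Off-state leakage increase factor: {off_state_ratio:.0f}")
```

With these assumed values the on-state resistance changes by only a few parts per million, while the off-state leakage rises by more than two orders of magnitude.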

B. Cross-Sectional Analysis
At the two defective locations [Fig. 4(d)], a cross-sectional analysis is carried out with FIB cutting and scanning electron microscopy (SEM). FIB cutting can not only navigate precisely to the defective location but also avoid mechanical stress during cutting, which improves the credibility of the FA results [33]. For the cutting, the upper side of the device was chemically decapsulated. The specimen preparation process is explained in Fig. S8. Cross-sectional surfaces are prepared by FIB perpendicular to the gate finger at both points. After sample preparation, SEM is employed for defect investigation.
In the cross-sectional analysis, defects relevant to the IDSS failure were found at both selected points. Fig. 5(a) and (b) display a top-view image and a cross-sectional image around FIB-cut Point A, respectively. The long horizontal crack observed in Fig. 4(b) is also seen in Fig. 5(a). In Fig. 5(b), we can identify a vertical crack extending from the GaN HEMT core structure into multiple dielectric layers. Fig. 5(c) and (d) show a top-view image and a cross-sectional image at FIB-cut Point B, respectively. This location corresponds to a dot-like pattern observed in the EMMI patterning image [Fig. 4(b)]. In Fig. 5(d), we can also see severe cracks extending horizontally from multiple metallization layers to the GaN HEMT structure. Although the crack at Point B propagates horizontally in the cross-sectional image, unlike the crack at Point A, both cracks cross directly from the drain to the source of the GaN HEMTs. These severe cracks can act as a critical leakage path during the IDSS measurement.
We have described the mechanism of the IDSS failure phenomenon, uniquely reported in the PC test of the cutting-edge GaN device, using advanced FA techniques. The multiple dielectric cracks revealed by EMMI and SEM cross-sectional analysis fully underpin the IDSS FM. In Fig. 2(b), the IDSS failure was observed at the end of the PC test, while there was no noticeable change in V th and R DS(ON) . Based on this observation, the cracks directly affect the OFF-state characteristics but not the ON-state characteristics, because the area affected by the cracks is likely negligible compared with the entire on-channel width. The EMMI analysis showed that the crack positions follow a pattern. Looking at this pattern as a whole, we can infer that the mechanical stress is concentrated at the center of the die. The thermomechanical stress that leads to failure in a PC test may be generated by the thermal expansion mismatch among the constituents of a GaN device [36], and this stress is intimately linked with temperature [20], [37]-[39], especially the temperature gradient. Hence, this observation can also be examined with a thermal FEM simulation.

IV. FEM SIMULATION
An FEM simulation of the GaN device in the PC test can help clarify the failure mode revealed by the preceding FA. A stress FEM simulation requires the initial strain conditions of each material and interface to give good results. This information is difficult to obtain from structural analysis alone, and it is not easy to verify. For these reasons, a temperature FEM simulation is carried out here instead, because it can be verified against the temperature measured in the test and can directly represent the thermomechanical stress. In the PC test, the top surface of the die is the primary heat source, so the magnitude of the temperature change there represents the thermomechanical stress on the die [20], [21], [40]. Hence, the simulated temperature on this surface helps explain why the cracks are concentrated in certain areas.
A three-dimensional (3-D) model of the GaN device, a heat boundary condition, and a heat flux condition are needed for the FEM simulation. To build a model close to the real structure, the results of cross-sectional structure analysis [26] and layer-by-layer structural pattern analysis of the GaN device were applied to the model. Fig. S9 shows the structural analysis results. The exact thickness information is confirmed from the SEM cross-sectional image in Fig. S9. The geometry of the Cu vias and plates in the package was extracted by microscopy and layer-by-layer decapsulation. The specific package structure is fully reflected in the 3-D model.
We applied a simplified die model for this FEM simulation. Considering the GaN HEMT structure (Fig. S10), the 2DEG layer should be the hottest point during the PC test. In our model, the relatively thin interconnection layer, which contains SiN x layers, tungsten vias, and Cu metal plates, and the AlGaN barrier are represented together as a 2-D heat source. Ignoring these structures may cause a temperature error between the die surface and the actual junction temperature. This error can be estimated by a conservative calculation. We assume that the interconnection layer consists of 10 μm of SiN x , which has a relatively low thermal conductivity among the materials constituting the interconnection layer compared with the Cu plates or tungsten vias. For the AlGaN barrier, the thickest value reported in other studies, 25 nm, is used in this calculation [41], [42]. The thermal conductivities of SiN x and AlGaN are 90 and 3.1 W/(m·°C), respectively [43], [44]. The active area is taken as the total die area of 12.3 mm 2 because all unit GaN devices are fully turned on during the test. Under these conditions, assuming that 50% (13.8 W) of the highest power flows through the upper interconnection layer and the AlGaN barrier during PC, a temperature difference of 0.13°C between the junction and the die top surface is predicted. This is not significant compared with the measured maximum temperature of 124.1°C. The temperature at the die top can therefore represent the junction temperature with a small error in the simulation, as discussed below by comparing the actual measurement with the FEM simulation results.
For better simulation results, the position and condition of the heat boundaries must reflect the actual operating state of the tested device as closely as possible. The heat boundary and the heat flux boundary are defined as the top surface of the die and the Cu pads on the bottom side of the package, respectively. The actual power consumption curve measured in the PC test is applied in the simulation as the heat boundary condition to closely replicate the experimental conditions. Fig. S11 shows the two boundaries and the heat boundary condition. To obtain a correct value of the heat flux boundary condition on the cooling pad, the temperature measured in a steady-state test and the temperature from the simulation were compared with each other.
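The conservative estimate above can be reproduced with a one-dimensional conduction calculation, treating the SiN x layer and the AlGaN barrier as two thermal resistances in series over the full die area. This is a sketch of the arithmetic only, under the assumption that simple 1-D conduction (thickness divided by conductivity times area) is what the estimate uses. All input values are taken from the paragraph above.

```python
def layer_thermal_resistance(thickness_m: float, conductivity: float,
                             area_m2: float) -> float:
    """1-D conduction resistance of a slab in degC/W: R = t / (k * A)."""
    return thickness_m / (conductivity * area_m2)


AREA = 12.3e-6     # m^2, total die area (12.3 mm^2)
POWER = 13.8       # W, 50% of the highest power during PC

T_SINX = 10e-6     # m, assumed SiNx interconnection layer thickness
K_SINX = 90.0      # W/(m.degC), value quoted in the text
T_ALGAN = 25e-9    # m, thickest AlGaN barrier considered
K_ALGAN = 3.1      # W/(m.degC), value quoted in the text

r_sinx = layer_thermal_resistance(T_SINX, K_SINX, AREA)
r_algan = layer_thermal_resistance(T_ALGAN, K_ALGAN, AREA)

# Series stack: the same heat flow crosses both layers.
delta_t = POWER * (r_sinx + r_algan)

print(f"R_SiNx  = {r_sinx * 1e3:.2f} mdegC/W")
print(f"R_AlGaN = {r_algan * 1e3:.2f} mdegC/W")
print(f"Junction-to-die-top difference = {delta_t:.2f} degC")  # 0.13 degC
```

The result of roughly 0.13°C matches the value stated in the text, and the SiN x layer accounts for most of it.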
It is critical in the simulation to find the heat flux condition that accurately reflects the experimental environment. For this, we first measured the temperature of the device at specific power consumption levels in the PC test environment and then derived the heat flux condition that reproduces those temperatures at the corresponding power levels in the simulation. In the PC test environment, temperatures of 23.8°C and 25.8°C were measured at static power dissipation levels of 1.099 and 1.653 W, respectively. Through the FEM simulations, a heat flux condition of 1.55 × 10^4 W/(m^2·°C) satisfying both measured results was extracted. The condition can be verified again by comparing the transient temperature of the simulation with that of the real PC test. Fig. 6(a) displays the temperature measured on the package top surface in the test (T _meas_pack_top ) and the temperature on the package top surface in the simulation (T _sim_pack_top ). The two temperature curves in Fig. 6(a) show very good quantitative agreement. Moreover, we compared the spatial spreading of heat in the simulation with the real PC test during a single PC, recorded as a movie with a thermal camera monitoring the device surface. This comparison is supplied as a supplementary video. The only fitted model parameter was the heat flux condition, obtained from the simpler steady-state conditions, yet the simulation is very close to the actual test result in its dynamic predictions as well. The proposed simulation is therefore confirmed to be very close to real operation in both constant and transient conditions, which supports its credibility.
This good agreement also shows that the simplified model proposed above is acceptable. In the simplified die model, the interconnection layers and the AlGaN barrier between the junction and the die top surface were ignored. If their thermal resistance were significant, it would produce a meaningful difference between the measured temperature and the FEM simulation results. We do not see any significant difference between them, and the conservatively calculated temperature difference between the junction and the die top was 0.13°C. Hence, the die model can be simplified by excluding the interconnection layer and the AlGaN layer in the FEM simulation, and the die top temperature reflects the junction temperature very closely. The lines used for the temperature measurement are shown in Fig. 6(a).
The measured temperature is the most direct stress parameter that we can monitor during the PC test. For this reason, the smaller the difference between this temperature and the actual junction temperature, the better. For devices using a package without wire bonds, the difference between the junction temperature and the temperature measured by an IR camera has not been addressed for transient conditions in the previous literature. Fig. 6(b) shows the difference between the temperature on the package top surface (T _sim_pack_top ) and the temperature on the die top surface (T _sim_die_top ) in the FEM simulation during the 1-s heating period of the PC test. Both temperatures are defined as the average value along the line at the same position shown in Fig. 6(a). Their trends are very similar, and the maximum temperature difference, at the end of the period, is only 1.26°C. T _sim_die_top represents the junction temperature in the simulation. Fig. 6(c) exhibits a strong correlation between the two temperatures in the simulation. The simulation thus indicates, indirectly, that the temperature measured on the package top surface accurately reflects the junction temperature with a slight negative offset.
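As a quick sanity check on the two steady-state points quoted above, one can fit a straight line through them and read off an effective thermal resistance and a zero-power intercept. This is only a lumped, two-point sketch under the assumption that temperature rises linearly with dissipated power. It is not the FEM-based extraction of the heat flux condition used in the article, and the function name is hypothetical.

```python
def fit_line(p1: float, t1: float, p2: float, t2: float) -> tuple[float, float]:
    """Return (slope, intercept) of the line through (p1, t1) and (p2, t2)."""
    if p1 == p2:
        raise ValueError("the two power levels must differ")
    slope = (t2 - t1) / (p2 - p1)
    intercept = t1 - slope * p1
    return slope, intercept


# Steady-state measurements quoted in the text: (power in W, temperature in degC).
P1, T1 = 1.099, 23.8
P2, T2 = 1.653, 25.8

r_th_eff, t_zero_power = fit_line(P1, T1, P2, T2)

print(f"Effective thermal resistance: {r_th_eff:.2f} degC/W")
print(f"Zero-power intercept: {t_zero_power:.1f} degC")
```

Under this linear assumption the two points correspond to an effective resistance of about 3.6°C/W between the measured surface and the coolant side.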
This FEM simulation allows us to understand the thermomechanical stresses generated inside a device under test (DUT) during the PC test. In a PC-tested or operating power device, the stresses are induced by two basic mechanisms: 1) coefficient of thermal expansion (CTE) mismatch among the materials in the DUT, and 2) global or local temperature gradients in the DUT. These stresses occur both during expansion due to heating and during contraction due to cooling in the PC test, and they are directly related to the temperature distribution. Hence, detailed knowledge of the thermal distribution helps in understanding the thermomechanical stress that is the root cause of PC test failure.
The locations of the cracks found in the preceding FA can be discussed with the FEM simulation. We can divide the cracks found in Fig. 4(b) into two groups: the long crack that propagated from the center of the chip to the left edge, and the dot-type cracks scattered around the center. For the long crack, we can infer, working backward, that there was strong vertical mechanical stress in the PC test. Fig. 7(a) shows the distribution of temperature swings at the height of the die top surface during PC. A nonuniform temperature distribution across a homogeneous body can generate mechanical stress [45]. In Fig. 7(a), the center of the die has the highest temperature swing, while the upper left end has the lowest. This imbalance can produce overall thermomechanical stress in the die during the PC test. The Cu vias are not symmetrical at the left and right edges of the die in Fig. 7(b), which leads to a temperature imbalance between the upper and lower parts at both edges. This imbalance induces stress on the surface of the die, which can lead to defects such as the long crack.
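The second crack group, the dot-type defects in Fig. 4(b), can be analyzed by combining the EMMI patterning image [Fig. 4(b)], the design of the Cu vias inside the package that are directly bonded to the die, and the temperature gradient of the die (T _die_sim_grad ) extracted from the FEM simulation at the highest-temperature point of a PC. Fig. 7(b) exhibits an overlay of these three items. T _die_sim_grad is expressed as (1).
Referring to Figs. S9 and S10, the temperature gradient on the die top surface is significantly influenced by the Cu structure inside the package. In particular, T _die_sim_grad is relatively high along the Cu vias in the center of the die surface. A larger temperature gradient is expected around the boundary between Cu and FR4 because Cu has much better heat transfer capability than FR4, which fills most of the package [38]. In Fig. 7(b), the cracks indicated by the yellow arrows are located mainly near the Cu vias within the center area of the die. In this area, severe mechanical stress is expected from the CTE differences among the materials (Cu, FR4, and the GaN die) at these interfaces as well as from the larger temperature gradient. Thus, the dot-type defects also correlate with the Cu via design in the package.
With the FEM simulation, we investigated the FM of the cutting-edge GaN device under the PC test in more depth, following the preceding physical FA. We presented a simulation model, built from structure analysis of the GaN device, that is very close to the actual PC test environment. The simulation indirectly predicted that the junction temperature of the GaN device differs by only about 1% from the temperature on the active surface monitored by an IR camera. Based on the FEM simulation, the backside EMMI analysis, and the structure analysis results, we discussed that the multilayer cracks causing the IDSS failure in the PC test correlate strongly with a design factor: the up-down asymmetric layout of the Cu vias in the advanced package.
In summary, the IDSS failure of the cutting-edge GaN device induced by the PC test was unveiled based on the FA and the FEM simulation. The PC test was performed within a temperature swing from 24.1°C to 124.1°C, which means this failure can be reproduced in actual operation. Systems need to be prepared for this risk because a sudden IDSS failure can trigger short-circuit operation. Advanced FA techniques were introduced to find defects in a GaN device that uses a package without a bond wire. These methods can be applied to other FA work. An FEM simulation of a discrete GaN device without a bond wire was attempted for the first time. Despite the minute and complex package structure, the credibility of the FEM simulation was established by comparison with the actual PC test results. The simulation indicated that the defects found in the FA are strongly correlated with the Cu structure layout of the package. Compared with other PC test results for packages using bond wires under similar or lower stress conditions [46]-[48], this device shows better reliability performance. Hence, by optimizing the package design factors pointed out in this article, this packaging technique can reach much higher reliability. As an extension of this study, further research can investigate the optimal design conditions of the package against thermomechanical stress using the proposed FEM simulation.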
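Equation (1) is not reproduced in this text. Assuming that T _die_sim_grad denotes the magnitude of the in-plane temperature gradient of the simulated die top temperature, which is an assumption rather than the authors' confirmed definition, it would take the conventional form:

```latex
T_{\mathrm{die\_sim\_grad}}(x, y)
  = \left\lVert \nabla T_{\mathrm{sim\_die\_top}}(x, y) \right\rVert
  = \sqrt{\left(\frac{\partial T_{\mathrm{sim\_die\_top}}}{\partial x}\right)^{2}
        + \left(\frac{\partial T_{\mathrm{sim\_die\_top}}}{\partial y}\right)^{2}}
```

Here x and y are hypothetical in-plane coordinates on the die top surface, and the result has units of °C per unit length.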
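The "about 1%" figure can be checked against the numbers reported earlier: the maximum simulated offset of 1.26°C between the package top and the die top, and the temperature swing from 24.1°C to 124.1°C. The text does not state which reference the percentage uses, so the sketch below computes it both ways. Taking either reference is an assumption.

```python
T_MIN = 24.1    # degC, lower end of the temperature swing
T_MAX = 124.1   # degC, upper end of the temperature swing
OFFSET = 1.26   # degC, maximum simulated package-top to die-top difference

swing = T_MAX - T_MIN

# Offset relative to the maximum temperature and relative to the swing.
rel_to_max = OFFSET / T_MAX * 100.0
rel_to_swing = OFFSET / swing * 100.0

print(f"Offset relative to maximum temperature: {rel_to_max:.2f} %")
print(f"Offset relative to temperature swing:   {rel_to_swing:.2f} %")
```

Both references give a value close to 1%, consistent with the statement in the text.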
In summary, the IDSS failure of the cutting edge GaN device induced by PC test was unveiled based on the FA and the FEM simulation. The PC test was performed within the temperature swing from 24.1°C to 124.1°C, which means this failure can be reproduced in actual operation. The system needs to be prepared for this risk because the sudden IDSS failure can trigger short circuit operation. Advanced FA techniques had been introduced to find defects in a GaN device using the package without a bond wire. These methods can be applied to another FA. An FEM simulation of the discrete GaN device without a bond wire was first attempted. Despite a minute and complex package structure, the credibility of FEM simulation has been qualified by comparison with the actual PC test results. It was discussed with the simulation that the defects found in FA are strongly correlated to the Cu structure layout of the package. Compared to other PC test results of a package using bond wires with similar or lower stress conditions [46]- [48], this device shows better reliability performance. Hence, by optimizing the design factors of the package pointed out in this article, this packaging technique can exhibit a much higher reliability, if exploited. As an extension of this study, in further research, the optimal design conditions of the package can be investigated against thermomechanical stress using the proposed FEM simulation.

V. CONCLUSION
This article has investigated a novel FM of the cutting-edge GaN device in a PC test by applying state-of-the-art physical analysis tools combined with a highly detailed FEM model. It has been revealed that thermomechanical stress induced by the PC test can cause multilayer cracks inside a GaN HEMT die even within the allowed temperature swing range. Backside EMMI and cross-sectional analysis using FIB and SEM were used for the FM analysis. Severe cracks able to trigger IDSS failure of GaN HEMTs have been visualized through advanced FA. Moreover, based on the FEM simulation, the crack locations were shown to correlate strongly with design factors of the package. This article provides not only a key to understanding this dangerous failure phenomenon in advanced GaN devices but also a foundation for improving their reliability.