Modeling of Breakdown-Limited Endurance in Spin-Transfer Torque Magnetic Memory Under Pulsed Cycling Regime

Perpendicular spin-transfer torque (p-STT) magnetic memory is gaining increasing interest as a candidate for storage-class memory, embedded memory, and possible replacement of static/dynamic memory. All these applications require extended cycling endurance, which should be based on a solid understanding and accurate modeling of the endurance failure mechanisms in the p-STT device. This paper addresses cycling endurance of p-STT memory under pulsed electrical switching. We show that endurance is limited by the dielectric breakdown of the magnetic tunnel junction stack, and we model endurance lifetime by the physical mechanisms leading to dielectric breakdown. The model predicts STT endurance as a function of applied voltage, pulsewidth, pulse polarity, and delay time between applied pulses. The dependence of the endurance on sample area is finally discussed.

cache, which takes advantage of the nonvolatile behavior to reduce the OFF-state power consumption [6], [7]. Also, MRAM technology and spintronics in general are gaining considerable interest for non-von Neumann computing architectures, such as low-power hybrid MTJ/CMOS logic circuits [8] and beyond-CMOS brain-inspired neuromorphic circuits [9].
The state-of-the-art implementation of MRAM relies on the magnetic tunnel junction (MTJ), namely a metal-insulator-metal stack consisting of a MgO dielectric barrier (t_MgO ≈ 1 nm) between two CoFeB ferromagnetic electrodes. Of these two ferromagnets (FMs), the pinned layer (PL) has fixed magnetic polarization, whereas the free layer (FL) can change its polarization between parallel (P) and antiparallel (AP) with respect to the PL. The MTJ resistance depends on the relative orientation of the magnetic polarization in the two FMs due to the tunnel magnetoresistance (TMR) effect [10], where the P state has a relatively low resistance R_P, while the AP state has a relatively high resistance R_AP. Switching from P to AP and vice versa takes place by spin-transfer torque (STT), where the spin polarization of the electron flow across the MTJ is transferred to the FL ferromagnetic polarization by angular momentum conservation [11], [12]. The perpendicular STT (p-STT) concept, where the ferromagnetic polarization lies out of the MTJ plane, allows a smaller switching current at a given retention time, thus enabling low-power operation and improved scalability [13].
To drive the switching current across the MTJ, bipolar voltage pulses are applied, which might induce degradation and time-dependent dielectric breakdown (TDDB) in the long term. Although the cycling endurance of STT-MRAM is generally referred to as virtually infinite [14], the repeated electrical stress during cycling induces a breakdown-limited endurance lifetime, which limits the applicability of STT-MRAM as working memory or as an in-memory computing element. Despite the need for high endurance, the characterization methodology, the physical understanding, and the simulation models for breakdown-limited endurance are not yet well established.
In this paper, we address the endurance of p-STT-based memory. We study endurance failure for various pulse amplitudes, polarities, and pulsewidths. Then, we present a model for breakdown-limited endurance based on defect generation, activation, and diffusion, capable of predicting STT-MRAM lifetime under different cycling conditions. Finally, we discuss the endurance dependence on device area.
A preliminary study of the modeling of STT-MRAM endurance was reported previously in [15]. Here, we provide a fully detailed report, with a deeper investigation of the fundamental mechanisms of defect generation/activation, direct evidence for polarity-dependent activation, and a study of area-dependent endurance.

II. SAMPLES AND METHODOLOGY
We used the p-STT memory devices sketched in Fig. 1(a), consisting of a CoFeB PL [bottom electrode (BE)] and FL [top electrode (TE)] with a crystalline MgO dielectric layer. The device cross-sectional area was 47 nm × 47 nm. Fig. 1(a) also shows the experimental setup for the pulsed characterization of STT devices, including a TGA 12102 waveform generator (TTi) to apply triangular pulses for set (transition from AP to P under positive voltage) and reset (transition from P to AP under negative voltage) processes, while the applied voltage V_TE and the current I across the MTJ were monitored by a 600-MHz LeCroy Waverunner oscilloscope. Fig. 1(b) shows a typical sequence of set, read, reset, and read operations. Each triangular pulse had a width of t_P = 100 ns and a pulse delay of t_D = 20 ns, except where noted. The maximum positive voltage during set was V_+, while the maximum negative voltage for reset was V_−. The read current in Fig. 1(b) confirms the different states of the device, namely the P state after set and the AP state after reset. Fig. 1(c) shows the I-V curve obtained from the collected V and I data [16]. By monitoring the I-V curves at each cycle, we could observe possible degradation phenomena and the exact event of endurance failure. This technique thus reproduces the actual device operating conditions in real time more accurately than constant or ramped stress [17], [18]. Also, the pulsed signal of Fig. 1(b) enables a comprehensive analysis with respect to various parameters, such as voltage (V_+ and V_−) and time (t_P and t_D). All measurements were carried out at room temperature.
Endurance failure is marked by an abrupt drop of resistance, corresponding to a hard breakdown of the MgO dielectric layer, after a number N_C of cycles. The resistance values R_P and R_AP are constant throughout the lifetime, thus indicating no obvious cycling-induced resistance degradation [19]. Cycle-by-cycle observation of the I-V characteristics, enabled by the triangular stress pulses, likewise revealed no clear evidence of degradation. Even though some preliminary studies [20] suggested a possible gradual dielectric breakdown for a relatively thick MgO layer, an abrupt breakdown event is typically observed in a nanometer-thick tunnel barrier [19]. Breakdown could take place at either voltage polarity, e.g., during the positive sweep for V_+ > |V_−| [see Fig. 2(b)] or during the negative sweep for |V_−| > V_+ [see Fig. 2(c)]. Breakdown can be explained by defect generation in MgO, inducing a percolative path and thermal runaway, as sketched in Fig. 2(d) [21], [22]. After breakdown, the device shows a TMR of 0% and a constant resistance R ≈ 300 Ω, which we attribute to the metal contacts and interfaces. No other kinds of cycling-induced failure, such as a degradation of the magnetoresistance ratio due to cycling-induced degradation of the ferromagnetic layers, were observed, although this might be possible as a result of the thermal runaway right after dielectric breakdown. Dielectric breakdown was always responsible for device failure, consistent with the high electric field stressing the MgO barrier.
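The failure criterion described above, an abrupt collapse of both read resistances to the post-breakdown value, can be sketched as a simple scan over the per-cycle read data. This is only an illustration: the function name, the `margin` threshold factor, and the synthetic resistance values are assumptions and are not taken from the measurement setup.

```python
from typing import Optional, Sequence


def find_failure_cycle(
    r_p: Sequence[float],
    r_ap: Sequence[float],
    r_short: float = 300.0,   # post-breakdown resistance quoted in the text (ohms)
    margin: float = 1.5,      # hypothetical tolerance factor on r_short
) -> Optional[int]:
    """Return the index of the first cycle at which both read resistances
    have dropped to about the post-breakdown value, or None if no failure."""
    limit = margin * r_short
    for cycle, (rp, rap) in enumerate(zip(r_p, r_ap)):
        # After hard breakdown the P and AP reads coincide (TMR = 0%).
        if rp <= limit and rap <= limit:
            return cycle
    return None


# Synthetic, illustrative traces (not measured data): TMR = 50% until cycle 5.
r_p_trace = [2000.0] * 5 + [300.0] * 3
r_ap_trace = [3000.0] * 5 + [300.0] * 3
failure_cycle = find_failure_cycle(r_p_trace, r_ap_trace)
```

In a real experiment the inputs would be the median read resistances per cycle, and the threshold would be chosen from the measured post-breakdown resistance.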

III. CYCLING ENDURANCE
Fig. 3(a) shows the measured cycling endurance N_C as a function of the applied voltage with a pulsewidth t_P = 100 ns and a pulse delay t_D = 20 ns. Three cycling conditions are compared in Fig. 3(a), i.e., symmetric bipolar stress with V_+ = |V_−|, positive unipolar stress with V_− = 0 V, and negative unipolar stress with V_+ = 0 V. N_C data for positive and negative unipolar stress show similar behaviors, suggesting a high polarity symmetry of the MTJ structure with respect to degradation and breakdown processes. N_C shows a steep exponential voltage dependence with a slope ≈ 50 mV/decade for the three regimes in Fig. 3(a). Extrapolating this trend [15] indicates an estimated N_C ≈ 10^18 at V = 0.3 V and t_P = 100 ns, which is high enough to comply with most storage-class memory (SCM) and dynamic random access memory applications. Data indicate a higher N_C value, hence reduced degradation, for unipolar stress compared with the bipolar stress condition. This can be interpreted by considering the MgO-CoFeB interfaces to be the regions of maximum generation of stress-induced defects: unipolar stress predominantly creates damage at a single interface, whereas both interfaces are affected by bipolar stress-induced degradation. No other input patterns were explored, e.g., a mixed unipolar/bipolar regime, although we expect that the failure mechanism would not change and that the endurance would be intermediate between the unipolar and bipolar cases.
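The exponential voltage dependence can be turned into a quick extrapolation helper. The sketch below assumes that log10(N_C) falls linearly with voltage at the quoted ≈ 50 mV/decade over the whole range of interest; the function name and the choice of anchor point are illustrative, and the anchor used in the example is simply the N_C ≈ 10^18 at 0.3 V estimate quoted above.

```python
def extrapolate_endurance(v, v_ref, n_ref, slope_v_per_decade=0.050):
    """Estimate N_C at voltage v from a reference point (v_ref, n_ref),
    assuming one decade of endurance is lost per slope_v_per_decade volts."""
    decades_gained = (v_ref - v) / slope_v_per_decade
    return n_ref * 10.0 ** decades_gained


# Example: starting from the quoted estimate at 0.3 V, every additional
# 50 mV of stress voltage removes one decade of cycles under this assumption.
n_at_0v35 = extrapolate_endurance(0.35, v_ref=0.30, n_ref=1e18)
n_at_0v80 = extrapolate_endurance(0.80, v_ref=0.30, n_ref=1e18)
```

Such an extrapolation is only as good as the assumption that a single slope holds down to low voltage, which the measured range cannot confirm by itself.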

IV. ENDURANCE MODEL
We developed a semiempirical model of endurance, which describes the dependence of N_C on the voltage amplitude and pulsewidth of the applied signal. In the model, N_C is inversely proportional to the defect concentration within the MgO layer, namely N_C = N_C0 (n_D/n_D0)^−1, where N_C0 and n_D0 are constants and n_D is calculated as n_D = n_D,TE + n_D,BE, where n_D,TE and n_D,BE are the defect concentrations originating from the TE interface and the BE interface, respectively. In this physical picture, defects are mostly generated near the interfaces, where electrons have the highest kinetic energy and where the structure might contain degradation precursors, e.g., dangling bonds or oxygen vacancies. For example, an incomplete Mg oxidation could allow unoxidized atoms to move more easily toward the anode by electromigration, thus increasing the Mg/O vacancy concentration close to the interface [23]. In addition, boron (B) diffusing from the electrodes toward the tunnel barrier might initiate the creation of pinholes that might short-circuit the tunnel conduction [24]. A relatively high density of initial degradation precursors also plays a key role in lowering the electron transport barrier height in the MTJ [25].
Even though process variability is of great importance for STT-MRAM design [27], we did not take such effects into account, given the relatively low device-to-device variation of conduction and switching among our samples. The observed variation in cycling endurance for a given voltage might result from the intrinsic variability of both the position in the oxide layer and the number of generated defects.

A. Defect Generation
Fig. 5(a) shows the defect generation mechanism in our model. Electrons injected from one interface reach the other with a kinetic energy E given by the difference of the Fermi levels in the two electrodes, i.e., E = E_F,TE − E_F,BE = qV_+ for a positive voltage applied to the TE and hence electrons injected from the BE. The release of the energy E induces lattice vibrations and defect generation at the TE interface by bond breaking. Even though the ionic bond between Mg and O is strong, bond breaking is possible due to the extremely high local field and polarization that the bond experiences, leading to significant bond distortion [28]. This condition can be explained by considering the high dielectric susceptibility and dipole moment of MgO [29].
In our model, the defect generation probability is assumed to increase exponentially with the energy E, and thus, the generation rate is given by R_TE = R_0 e^(αV_+), where α is a constant, in agreement with the E-model of dielectric breakdown [30], [31]. Similarly, the generation rate at the BE interface can be written as R_BE = R_0 e^(α|V_−|).
To test the defect generation model in Fig. 5(a), Fig. 6(a) shows the calculated N_C value for asymmetric bipolar cycling, compared with data from Fig. 4. We assumed α = 42 V^−1 in the calculations. The model correctly describes the steep decay of N_C in region A; however, it fails to predict the weak voltage dependence in region B. In fact, due to the exponential voltage dependence of R_BE and R_TE, the defect generation model attributes degradation only to the largest voltage, in contrast to the experimental evidence in Figs. 3 and 4.
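The behavior of the generation-only model can be checked numerically from the relations already stated: n_D = n_D,TE + n_D,BE, exponential generation rates with α = 42 V^−1, and N_C inversely proportional to n_D. The sketch below works with relative quantities only (N_C/N_C0 and n_D/n_D0) and ignores any pulsewidth prefactor; the function names are illustrative.

```python
import math

ALPHA = 42.0  # V^-1, value quoted in the text


def generation_only_relative_defects(v_plus, v_minus_abs, alpha=ALPHA):
    """n_D/n_D0 = R_TE/R_0 + R_BE/R_0 with exponential generation rates."""
    return math.exp(alpha * v_plus) + math.exp(alpha * v_minus_abs)


def relative_endurance(v_plus, v_minus_abs, alpha=ALPHA):
    """N_C/N_C0 = (n_D/n_D0)^-1 for the generation-only model."""
    return 1.0 / generation_only_relative_defects(v_plus, v_minus_abs, alpha)


# Voltage change per decade of N_C implied by alpha (region A slope).
slope_mv_per_decade = 1000.0 * math.log(10.0) / ALPHA

# Region B check: with |V_-| = 1 V fixed, lowering V_+ from 0.8 V to 0.4 V
# barely changes the predicted endurance, because the larger voltage dominates.
ratio_region_b = relative_endurance(0.4, 1.0) / relative_endurance(0.8, 1.0)
```

The implied slope of about 55 mV/decade is consistent with the measured ≈ 50 mV/decade in region A, while the nearly unit ratio in region B illustrates why generation alone cannot reproduce the observed ≈ 600 mV/decade dependence.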

B. Defect Activation
To account for the impact of the smaller voltage on the MgO degradation, we considered the defect activation mechanism displayed in Fig. 5(b).
To further confirm that the activation process consists of a displacement rather than a thermal effect, e.g., a temperature-induced stabilization of the generated defect, we compared the asymmetric bipolar stress (fixed V_+ = 1 V and variable negative V_−) and the asymmetric unipolar stress, where both the fixed and variable voltages were positive. Data shown in Fig. 7 indicate a larger N_C value and a rather flat behavior in region B for the asymmetric unipolar stress, thus suggesting that a positive voltage is not effective in displacing O_i^2−. These data confirm that the activation process requires bipolar stress.
To complete our model, we included defect generation and activation at the BE side with the same parameters used for the TE side, in view of the high symmetry of our MTJ stack. We also included an explicit dependence on the pulsewidths t_+ and t_− of the positive and negative pulses, respectively. The total defect density due to generation and activation is thus written as (1), where t_0 = 10^−30 s is a constant. The model parameters are summarized in Table I. Fig. 8(a) shows the measured and calculated N_C value for both constant V_+ with variable V_− and constant V_− with variable V_+. Our model is able to predict the different slopes in regions A and B, where n_D can be approximated as (2a) and (2b), respectively.

V. PULSE-TIME DEPENDENCE OF ENDURANCE
A. Impact of Pulsewidth t_P
To test the impact of the pulsewidth t_P on endurance, Fig. 9(a) shows the measured and calculated N_C value for symmetric bipolar stress (V_+ = |V_−|) for increasing t_P from 100 ns to 100 μs. Data indicate that N_C decreases at increasing t_P as N_C ∼ t_P^−1, as also summarized in Fig. 9(b) for stress at V_+ = |V_−| = 0.8 V. Calculations accurately account for the t_P dependence as a result of the dependence on t_+ and t_− in (1), which describes the increase of the defect density with increasing stress time.
To study the distinct impacts of t_+ and t_− in (1), Fig. 10(a) and (b) shows N_C for asymmetric bipolar cycling for fixed V_− = −1 V and two distinct values of V_+, namely V_+ = 1.05 V corresponding to region A and V_+ = 0.4 V corresponding to region B. In these two regions, we measured N_C as a function of t_+ for constant t_− = 100 ns or as a function of t_− for constant t_+ = 100 ns. Data in region A [see Fig. 10(a)] indicate that N_C decreases as N_C ∼ t_+^−1, while t_− has no impact on N_C. On the other hand, N_C decreases as N_C ∼ t_−^−1 in region B [see Fig. 10(b)], with no role of t_+. Calculations by (2a) and (2b) are also shown, thus demonstrating that our model can predict the distinct dependence on t_+ and t_−.
From our data, N_C shows a dependence only on the width of the pulse with the largest voltage, namely the one that generates defects in the MgO [see Fig. 5(a)]. The duration of the activation pulse instead does not affect degradation. This is in agreement with a physical picture where activation behaves like a binary event, i.e., resulting in either failure or success.
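The observed pulsewidth rule, namely that N_C scales inversely with the width of the larger-voltage pulse only, can be written as a small rescaling helper. This is a sketch of the empirical trend rather than of the full model: the function name is illustrative, the reference point is assumed to have been measured with both pulsewidths equal to t_ref, and for the symmetric case the two widths are assumed equal.

```python
def rescale_endurance_for_pulsewidth(
    n_ref, t_ref, v_plus, v_minus_abs, t_plus, t_minus
):
    """Rescale a reference endurance n_ref (measured with t_+ = t_- = t_ref)
    to new pulsewidths, using N_C ~ 1/t of the larger-voltage pulse."""
    # Region A (V_+ > |V_-|): the positive pulse generates the defects.
    # Region B (|V_-| > V_+): the negative pulse generates the defects.
    dominant_width = t_plus if v_plus > v_minus_abs else t_minus
    return n_ref * t_ref / dominant_width


# Illustrative numbers only: n_ref = 1e6 cycles at t_ref = 100 ns.
region_a = rescale_endurance_for_pulsewidth(1e6, 100e-9, 1.05, 1.0, 1e-6, 100e-9)
region_a_long_reset = rescale_endurance_for_pulsewidth(1e6, 100e-9, 1.05, 1.0, 100e-9, 1e-6)
region_b = rescale_endurance_for_pulsewidth(1e6, 100e-9, 0.4, 1.0, 100e-9, 1e-6)
```

In region A, lengthening the reset pulse leaves the estimate unchanged, whereas in region B it reduces the estimate by the same factor as the pulsewidth increase.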
where t_D0 and γ are the constant parameters shown in Table I.
Calculations by (3) are shown in Fig. 11(a), in close agreement with the experimental results. The results also suggest that the gap between unipolar and bipolar endurance decreases for increasing t_D, which is fully taken into account by our diffusive model. In fact, as t_D increases, defects efficiently diffuse toward the opposite interface, thus making the difference between unipolar and bipolar stress increasingly negligible. Note that the weak dependence of bipolar endurance on t_D is consistent with previous results in [32]. On the other hand, our data for unipolar stress show no dramatic dependence on t_D, in contrast to [32], which might be explained by a different structure or etch damage profile in our MgO layer.

VI. AREA DEPENDENCE
The reduction of device area A in p-STT-MRAM devices allows the switching current to be decreased, which is required to minimize the cell area limited by the driving transistor in 1T-1MTJ structures [14], [19]. In addition to reducing the footprint and power consumption, area scaling also improves cycling endurance due to the Poisson area scaling of TDDB [14], [18]. To study the area dependence, Fig. 12 shows the measured N_C value for bipolar cycling as a function of V_+ = |V_−| for increasing area, namely A = (47 nm)^2, (75 nm)^2, and (105 nm)^2.
Data in Fig. 12(a) show an unexpected nonmonotonic behavior, which is summarized in the inset of Fig. 12(a) for the case V_+ = |V_−| = 0.65 V. Here, N_C decreases with area but shows an anomalously large value for the largest area. This result was attributed to a series resistance effect, where the actual voltage drop V′ across the MTJ decreases with increasing device area. In fact, V′ is given by V′ = V − R_s I, where V is the applied voltage, R_s is the series resistance associated with the contacts and interfaces, and I is the current. As the device area increases, I also increases, thus causing a decrease of V′. We estimated V′ by assuming R_s = 300 Ω, corresponding to the resistance after breakdown in Fig. 2(a). Fig. 12(b) shows N_C as a function of V′, evidencing a correct monotonic decrease of N_C with area.
Fig. 13(a) shows N_C as a function of device area for V_+ = |V_−| = 0.65 V, evidencing a decrease according to the power law N_C ∼ A^−2. Based on Poisson area scaling, the exponent in the power law should be equal to the inverse of the Weibull shape factor of N_C, namely the slope of the cumulative distribution in the Weibull plot [33]. The latter is shown in Fig. 13(b) for various A, indicating an area-independent Weibull shape factor η = 1.35 in the formula log(−log(1 − F)) = η log(N_C/N_C0). Such a value of the shape parameter η can be explained by intrinsic TDDB processes, such as defect generation controlled by the electrical stress, in contrast to extrinsic breakdown processes for η < 1 [14], [34]. From Poisson area scaling, we calculate a theoretical slope in Fig. 13(a) of −1/η ≈ −0.75, in contrast with the experimental slope ≈ −2. This disagreement might be explained by the etching process having beneficial effects on the device lateral surface, resulting in a low probability of breakdown initiation there. This effect results in a reduced effective area for the breakdown process, which appears as a stronger area dependence for relatively small devices [18]. Also, Joule heating effects might contribute to TDDB, thus causing a deviation from the field-driven Poisson area scaling for relatively small device areas.
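The series-resistance correction amounts to subtracting the drop on R_s from the applied voltage. The sketch below shows both the direct form, using the measured current, and the equivalent voltage-divider form, which follows from the same relation when the MTJ resistance is known. The function names and the example MTJ resistance are assumptions for illustration; only R_s = 300 Ω and the 0.65 V stress level come from the text.

```python
R_S = 300.0  # ohms, series resistance assumed in the text


def mtj_voltage_from_current(v_applied, current, r_series=R_S):
    """Actual voltage across the MTJ: applied voltage minus the R_s*I drop."""
    return v_applied - r_series * current


def mtj_voltage_from_resistance(v_applied, r_mtj, r_series=R_S):
    """Same correction written as a voltage divider, since I = V / (R_s + R_MTJ)."""
    return v_applied * r_mtj / (r_mtj + r_series)


# Illustrative case: 0.65 V applied to a hypothetical 1 kOhm junction.
v_applied = 0.65
r_mtj_example = 1000.0                      # hypothetical MTJ resistance (ohms)
i_example = v_applied / (R_S + r_mtj_example)
v_direct = mtj_voltage_from_current(v_applied, i_example)
v_divider = mtj_voltage_from_resistance(v_applied, r_mtj_example)
```

Because the MTJ resistance scales inversely with area at fixed resistance-area product, larger devices lose a larger fraction of the applied voltage on R_s, which is the trend invoked to explain the anomalous endurance of the largest device.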
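To make the comparison between the Poisson prediction and the observed trend concrete, the sketch below evaluates both power laws for the three device sizes, normalized to the smallest device. It relies on the standard weakest-link result that a Weibull-distributed lifetime with shape factor η scales with area as A^(−1/η); the function names are illustrative and the normalization to the 47 nm device is an arbitrary choice.

```python
import math

ETA = 1.35  # Weibull shape factor quoted in the text


def poisson_area_exponent(eta=ETA):
    """Exponent of N_C versus area expected from Poisson (weakest-link) scaling."""
    return -1.0 / eta


def scaled_endurance(n_ref, area_ref, area, exponent):
    """Power-law rescaling N_C = n_ref * (A / A_ref)^exponent."""
    return n_ref * (area / area_ref) ** exponent


def weibull_ordinate(f):
    """Weibull-plot ordinate ln(-ln(1 - F)) for a cumulative failure fraction F."""
    return math.log(-math.log(1.0 - f))


sides_nm = [47.0, 75.0, 105.0]
areas_nm2 = [s * s for s in sides_nm]

# Relative endurance versus area, normalized to the smallest device.
poisson_trend = [
    scaled_endurance(1.0, areas_nm2[0], a, poisson_area_exponent()) for a in areas_nm2
]
observed_trend = [scaled_endurance(1.0, areas_nm2[0], a, -2.0) for a in areas_nm2]
```

Between the smallest and the largest device, the Poisson law predicts a reduction of endurance by roughly a factor of three, whereas the observed A^−2 trend corresponds to a reduction of more than a factor of twenty, which quantifies the disagreement discussed above.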

VII. CONCLUSION
We presented a comprehensive study of breakdown-limited cycling endurance in p-STT-MRAM devices. Cycling endurance was experimentally monitored as a function of the pulse amplitude, polarity, and timing. We developed a semiempirical model based on generation, activation, and diffusion of defects in the MgO tunnel barrier. The model accounts for the dependence of endurance lifetime on applied voltage, pulsewidth, and pulse delay. Finally, the area scaling of endurance was experimentally analyzed and discussed.

Fig. 1. (a) Experimental setup for real-time monitoring of the I-V curves during ac cycling of the p-STT devices. (b) Measured waveforms of voltage and current. (c) Measured I-V characteristic.

Fig. 2. (a) P and AP measured resistances during cycling, showing TMR ≈ 50% and endurance failure after 1.5 × 10^5 cycles; median values over 10 reads are shown. After the MgO breakdown, the device showed a resistance of 300 Ω, corresponding to the contact resistance. Breakdown happened during (b) the positive sweep if V_+ > |V_−| or (c) the negative sweep if |V_−| > V_+. The apparent current clamping is due to the oscilloscope limiting the visible range. (d) Endurance failure is attributed to the increased defect concentration in the MgO structure after the application of a stress voltage V_applied.

Fig. 2(a) shows the measured resistance during a typical pulsed experiment under symmetric switching (V_+ = |V_−|) as in Fig. 1, as a function of the number of cycles. Data show clearly separate P and AP states with a TMR = ΔR/R_P ≈ 50%, where ΔR = R_AP − R_P. Endurance failure is marked

Fig. 3. (a) Number of cycles at endurance failure N_C as a function of the applied stress voltage for symmetric bipolar and for positive/negative unipolar cycling. Measured N_C for asymmetric bipolar cycling at (b) variable V_+ and constant V_− = −1 V and (c) variable V_− and constant V_+ = 1 V.

Fig. 4. Measured N_C for asymmetric bipolar cycling for an increasing constant positive/negative voltage of (a) 0.8, (b) 0.9, and (c) 1 V. (d) Color map plot summarizing the measured N_C value as a function of V_+ and |V_−| for t_P = 100 ns.

Fig. 5. Schematic of the semiempirical model of MgO breakdown, comprising (a) the defect generation phase and (b) defect activation. (c) Defects could be considered to be Frenkel pairs of O_i^2− and V_O^2+.

Fig. 3(b) shows cycling endurance for asymmetric bipolar stress, with variable V_+ and constant V_− = −1 V. The voltage dependence of N_C shows two distinct regions, namely: 1) region A for V_+ > |V_−|, where data show a steep slope ≈ 50 mV/decade and positive-voltage breakdown, i.e., failure occurs during the set pulse, and 2) region B for |V_−| > V_+, with a relatively low slope ≈ 600 mV/decade and negative-voltage breakdown, i.e., failure occurs during the reset pulse. Even though breakdown polarity is dictated by the largest applied voltage, surprisingly V_+ influences breakdown in region B, where V_+ < |V_−|. This remarkable evidence is further confirmed by Fig. 3(c), showing N_C for asymmetric bipolar stress with variable V_− and constant V_+ = 1 V and indicating the same qualitative behavior as in Fig. 3(b). The same behavior is evidenced by Fig. 4(a)-(c), showing N_C for asymmetric bipolar stress with fixed V_+ and variable V_− or fixed V_− and variable V_+, with the constant voltage equal to 0.8, 0.9, and 1 V. Note that, in each figure, N_C for fixed positive voltage and N_C for fixed negative voltage overlap almost exactly, again supporting the high symmetry of the MTJ stack with respect to voltage stress. The presence of two distinct regions A and B is confirmed in all three cases. Fig. 4(d) summarizes the measured N_C as a function of V_+ and |V_−| in a color map plot, again confirming that the smaller voltage, i.e., |V_−| for |V_−| < V_+ or V_+ for V_+ < |V_−|, also contributes in dictating endurance lifetime.

Fig. 6. (a) Measured and calculated N_C taking into account only the defect generation process. (b) Calculated cycling endurance also considering the defect activation process, demonstrating good agreement with experimental data.

Fig. 7. Measured N_C as a function of the applied voltage for asymmetric bipolar and asymmetric unipolar stress. The different voltage dependence supports the picture in which defect activation consists of a defect displacement rather than a thermal effect.

Defect concentrations are given by n_D,TE = n_D0 R_TE/R_0 and n_D,BE = n_D0 R_BE/R_0, where R_TE and R_BE are the generation rates describing the cycling-induced degradation at the TE and BE interfaces, respectively, while R_0 is a constant. For our crystalline MgO layer, defects might be attributed, e.g., to Frenkel pairs of O vacancies V_O^2+ and O interstitials O_i^2− [26]. As shown in Fig. 5, tunneling electrons are considered to have a primary role in MgO degradation according to a two-stage mechanism, including: 1) defect generation [see Fig. 5(a)] and 2) defect activation [see Fig. 5(b)], as detailed in the following.

Fig. 8. Measured and calculated N_C for (a) asymmetric cycling at constant V_− and constant V_+ and (b) symmetric bipolar and unipolar cycling. (c) Color plot of N_C as a function of V_+ and |V_−| for t_P = 100 ns, obtained from model calculations.

After a positive pulse of voltage V_+, the application of a negative pulse with amplitude |V_−| < V_+ can activate the defects generated by the positive semicycle, e.g., by displacing an interstitial O_i^2− away from the corresponding O vacancy in the newly created Frenkel pair, as shown in Fig. 5(c), with a rate R_a/R_0 = k e^(β|V_−|), where k and β are constants with β < α. The activation causes additional damage to the dielectric layer during the low-voltage semicycle, since the separation of the two constituents of the Frenkel pair leads to: 1) a reduced probability of recombination and 2) an increased defect concentration in the bulk of the MgO, supporting the formation of a percolative path [22]. Calculation results from the generation/activation model with k = 1 and β = 4 V^−1 are shown in Fig. 6(b), indicating better agreement with data in both regions A and B.
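As a rough numerical illustration of how activation changes the picture, the sketch below combines the stated generation and activation rates and evaluates the local voltage slope of the endurance. The way the two rates are combined is an assumption made here for illustration: each interface contributes its generated density multiplied by (1 + R_a/R_0) evaluated at the opposite-polarity voltage, and both pulsewidths are taken as equal so that time prefactors cancel. Only α = 42 V^−1, β = 4 V^−1, and k = 1 are taken from the text.

```python
import math

ALPHA, BETA, K = 42.0, 4.0, 1.0  # values quoted in the text


def relative_defects(v_plus, v_minus_abs):
    """Relative defect density with generation and activation.
    ASSUMPTION: activation enters as a factor (1 + k*exp(beta*V_opposite))
    on the density generated at each interface; equal pulsewidths assumed."""
    te = math.exp(ALPHA * v_plus) * (1.0 + K * math.exp(BETA * v_minus_abs))
    be = math.exp(ALPHA * v_minus_abs) * (1.0 + K * math.exp(BETA * v_plus))
    return te + be


def local_slope_mv_per_decade(v_plus, v_minus_abs, dv=1e-3):
    """Change of V_+ (in mV) per decade of N_C, by central difference.
    Since N_C is proportional to 1/n_D, a decade of n_D is a decade of N_C."""
    lo = math.log10(relative_defects(v_plus - dv, v_minus_abs))
    hi = math.log10(relative_defects(v_plus + dv, v_minus_abs))
    return 1000.0 * (2.0 * dv) / (hi - lo)


# |V_-| fixed at 1 V: one point in region A (V_+ > |V_-|), one in region B.
slope_region_a = local_slope_mv_per_decade(1.1, 1.0)
slope_region_b = local_slope_mv_per_decade(0.6, 1.0)
```

Under this assumed form, the region A slope stays close to ln(10)/α and the region B slope close to ln(10)/β, in line with the measured ≈ 50 and ≈ 600 mV/decade; a different way of combining the rates would shift the region B value.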

Fig. 9. (a) Measured and calculated N_C for symmetric bipolar cycling for different applied pulsewidths t_P. (b) Corresponding data and calculations for N_C as a function of t_P for V_+ = |V_−| = 0.8 V.

Slopes in regions A and B can be directly related to α and β. The model is able to account for N_C for unipolar (positive and negative) and symmetric bipolar stress (i.e., V_+ = |V_−|), as shown in Fig. 8(b). Fig. 8(c) shows the simulated voltage-dependent endurance for t_P = 100 ns and t_D = 20 ns.

Fig. 10. Measured and calculated N_C as a function of t_+ and t_− for asymmetric bipolar cycling in (a) region A and (b) region B. The maximum number of cycles depends only on the pulsewidth of the highest voltage pulse, which is responsible for the generation step in Fig. 5(a).

Fig. 11. (a) Measured and calculated N_C as a function of t_D for unipolar and bipolar stress. (b) Schematic of the defect diffusion while no voltage is applied to the device.

Fig. 12. (a) Measured N_C as a function of the applied voltage for different sample areas under symmetric bipolar stress; the inset shows N_C for three different areas at V_+ = |V_−| = 0.65 V. (b) Similar cycling failure data presented as a function of the actual voltage drop on the MTJ (V′), as shown in the inset.

Fig. 13. (a) Measured and calculated N_C as a function of device area for three different device areas. The applied waveform was symmetric bipolar with t_P = 100 ns and V_+ = |V_−| = 0.65 V. (b) Corresponding Weibull plot for measured and calculated N_C. Endurance data can be well reproduced by Weibull statistics even though the area dependence is stronger than the one predicted by the Poisson scaling approach: TDDB ∼ A^(−1/η).

TABLE I
SUMMARY OF ENDURANCE MODEL PARAMETERS IN (1)-(3)