Endurance of 2 Mbit Based BEOL Integrated ReRAM

In this work, we experimentally characterize the endurance of 2 Mbit resistive switching random access memories (ReRAMs) from a 16 Mbit test-chip. Very rare failure events are observed in which memory cells become stuck in the low-resistive state (LRS). As this failure mechanism limits the endurance of this ReRAM implementation, extensive investigations are conducted and presented. The experimental findings are explained via a voltage divider model, illustrating why memory cells can become stuck in the LRS. It is proposed that an insufficient voltage drop over the cell, caused by an unfavorable combination of cell and transistor resistances, is responsible for stuck-at-LRS bits. Furthermore, possible origins of these suboptimal combinations are discussed. Additionally, a one-dimensional kinetic Monte Carlo (KMC) model has been developed that allows a statistical investigation of large numbers of cells with regard to rare random events. Our proposed explanation for the observed failure mechanism is supported by the simulation and evaluation of the switching process of the memory. All simulations are in very good agreement with the experimental data. Finally, based on our findings, we give suggestions for the improvement of switching algorithms.


I. INTRODUCTION
Driven by the strongly increasing demand for highly scaled, non-volatile memories in many modern applications such as smartphones, research in this field has grown strongly over the last years [1], [2], [3]. Here, resistive switching random access memories (ReRAMs) are very promising candidates for future industrial applications as well as for neuromorphic computing [4], [5], [6]. ReRAMs offer many attractive operation features such as scalability, fast switching, reliability and complementary metal-oxide-semiconductor (CMOS) compatibility [7], [8], [9] and are thus expected to replace flash technology in the future [10], [11], [12].
The associate editor coordinating the review of this manuscript and approving it for publication was Giambattista Gruosso.
One of the most promising candidates in the broad field of ReRAM technology are bipolar switching memristive devices based on the valence change mechanism (VCM). VCMs typically consist of a transition metal-oxide as switching layer placed between two metal electrodes [13], [14], [15]. By applying appropriate voltages to the electrodes, oxygen can be extracted from the metal-oxide layer in a so-called forming step. Thereby, oxygen vacancies are left in the switching layer building a conducting filament through the initially insulating metal-oxide. With suitable applied voltages, the filament can be ruptured or rebuilt depending on the polarity of the voltage and so the device can be switched between a high resistive state (HRS) and a low resistive state (LRS) [16], [17]. The switching from HRS to LRS, in our case with the same voltage polarity as the forming step, is called SET, the opposite process is called RESET. The present state of the cell can be easily read out non-destructively by applying a small voltage to the device.
Independent of the specific application of ReRAMs, reliability is a major issue that has to be investigated and optimized [18], [19], [20]. In particular, endurance, meaning the ability to sustain a huge number of consecutive switching cycles without faults, is of great interest and is therefore the focus of our work [21], [22].
In the following, experimental results of 2 Mbit VCM cells are presented, predominantly showing great endurance. Nevertheless, when large statistics with many cells and many switching cycles are considered, a few cells become stuck in the LRS. This rare failure event is the main effect limiting the endurance and is therefore further investigated experimentally. Whereas Yang et al. [23] connect this kind of failure mechanism to the presence of a second filament, we present an alternative explanation. Using a simple voltage divider model, we propose that a too low voltage drop over the cell is responsible for the RESET failure. Additionally, a newly developed one-dimensional kinetic Monte Carlo (KMC) model is presented. This model allows us to simulate the RESET behavior with high statistics and with regard to rare random events. These simulations support our proposed explanation for the physical origin of the experimentally observed failure mechanism.

II. EXPERIMENTAL DETAILS
Our experimental endurance study was conducted on a test-chip with 16 Mbit of VCM-type ReRAM, shown in Fig. 1, a). The cells are integrated back-end-of-line (BEOL) in a 1-transistor-1-resistor (1T-1R) configuration in a 28 nm CMOS technology. Programming and readout are performed via on-chip circuitry. From this test vehicle, the highlighted block of 2 Mbit is cycled.
A schematic of a single 1T-1R cell with the respective biases is shown in Fig. 1, b). The select transistor is opened by applying a voltage V WL to its gate. In order to electroform or SET a bit, a rectangular voltage pulse V BL is applied to the bitline which is connected to the ohmic electrode (OE) of the ReRAM cell. For the RESET operation, a rectangular voltage pulse V SL is applied to the source of the transistor (while its drain is connected to the active electrode (AE) of the ReRAM cell). To ensure reliable programming, electroforming, SET, and RESET are performed with a program-verify algorithm [24]. Here, the resistance of each bit is determined by a read pulse after each programming step and compared to a target resistance. If the target is reached, the programming operation is terminated and considered successful. Otherwise, the programming step is repeated with increased pulse length, V BL , V SL or V WL . Here, increasing V BL or V SL provides a higher voltage across the ReRAM element during SET or RESET, respectively. As the transistor not only selects the cell to be programmed but also acts as current limiting element, increasing V WL allows for higher current through the device. A good control of this current is crucial during electroforming and SET, as it strongly affects the resulting resistance of the cell [25], [26], [27]. In contrast, V WL is desired to be comparatively high during RESET in order to allow sufficient current for this process. A current limitation is not needed for this operation, because the increasing cell resistance naturally limits the current.
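The program-verify scheme described above can be sketched as a simple loop. This is a minimal illustration, not the on-chip implementation: the names `PulseStep`, `program_verify` and `ToyCell`, as well as all voltages, pulse widths and the 5 µA threshold, are hypothetical placeholders chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class PulseStep:
    """One programming step; all values are illustrative placeholders."""
    voltage: float   # V_BL (SET/forming) or V_SL (RESET), in volts
    v_wl: float      # word-line (gate) voltage, in volts
    width_ns: float  # pulse length, in nanoseconds


def program_verify(apply_pulse: Callable[[PulseStep], None],
                   read_current: Callable[[], float],
                   target_reached: Callable[[float], bool],
                   steps: Sequence[PulseStep]) -> Optional[int]:
    """Apply the steps in order and stop once the verify read passes.

    Returns the index of the step that succeeded, or None if the bit
    still misses the target after the last step (a fail-bit).
    """
    for index, step in enumerate(steps):
        apply_pulse(step)
        if target_reached(read_current()):
            return index
    return None


class ToyCell:
    """Hypothetical stand-in for a 1T-1R bit that RESETs once V_SL >= 2.0 V."""

    def __init__(self) -> None:
        self.current_ua = 30.0  # starts in the LRS (high read current)

    def apply(self, step: PulseStep) -> None:
        if step.voltage >= 2.0:
            self.current_ua = 2.0  # switched to the HRS (low read current)

    def read(self) -> float:
        return self.current_ua


# Escalating RESET steps: each step raises the voltage, V_WL or pulse width.
steps = [PulseStep(1.8, 2.5, 100.0),
         PulseStep(2.0, 2.5, 100.0),
         PulseStep(2.2, 3.0, 200.0)]
cell = ToyCell()
used = program_verify(cell.apply, cell.read, lambda i_ua: i_ua < 5.0, steps)
print("RESET succeeded at step index:", used)
```

A bit for which `program_verify` returns `None` corresponds to a fail-bit in the sense used in this study.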

III. EXPERIMENTAL RESULTS
A. ENDURANCE
Fig. 2, a) shows the cumulative HRS and LRS distributions read after different cycle numbers up to 500k. SET and RESET are performed using our standard algorithm, which comprises multiple steps with increasing pulse length, V BL , V SL and V WL . This results in a very good endurance in the investigated interval of 500k cycles. Whereas the LRS distribution only slightly broadens upon cycling, the HRS distribution is observed to drift towards lower read current, which has a positive effect on the read window between HRS and LRS. However, at higher cycle numbers (300k-500k) a tail appears in the HRS distribution, consisting of bits which were not RESET successfully but are stuck in the LRS. Although this affects only a few ppm of bits, this failure has to be investigated, which is the main subject of this study.

B. STUCK-AT-LRS FAILURE
The inset in Fig. 2, a) shows the number of failed bits in the SET (red) or RESET (blue) operation over the cycle number. Up to 250k cycles, both SET and RESET are comparatively stable, although each cycle carries the chance of single failed bits. Whereas the SET operation remains stable within the tested cycles, an increasing trend of failed bits is observed for the RESET operation. Here, the number of fail-bits increases linearly with the cycle number to a maximum of 22 at 516k cycles, which accounts for approx. 10 ppm.
FIGURE 2. a) Starting from approx. 250k cycles, the number of fail-bits in HRS increases linearly. b) Adjusted algorithm, intentionally provoking the observed stuck-at-LRS failure. After 50k cycles, a tail of stuck-at-LRS bits is observed at −4 σ. The tail grows to −3.5 σ after 100k cycles.
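As a quick check of the fail-bit fraction quoted above, the arithmetic can be written out as follows. The block size of 2 · 2^20 bits is an assumption about how "2 Mbit" is counted; with 2 · 10^6 bits the result would be 11 ppm, which is still consistent with "approx. 10 ppm".

```python
failed_bits = 22            # maximum number of RESET fail-bits at 516k cycles
block_bits = 2 * 2**20      # assumption: 2 Mbit counted as 2 * 2^20 bits

# Fraction of failed bits expressed in parts per million.
ppm = failed_bits / block_bits * 1e6
print(f"{ppm:.1f} ppm")     # prints "10.5 ppm"
```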
Despite the significance of this failure mechanism, it affects only very few individual ReRAM cells in the shown experiment. Further, no clusters of stuck-at-LRS bits are observed; the failed bits are randomly distributed over the memory block. In order to gain deeper insight into the underlying mechanism, the endurance experiment was repeated with a programming algorithm that was slightly altered in a direction which provokes this failure and thus generates more stuck-at-LRS bits. Cumulative distributions of HRS and LRS after 50k-100k cycles with the suboptimal algorithm are shown in Fig. 2, b). Here, from 50k to 51k cycles, the read current of each bit in each cycle was measured directly, resulting in consecutive traces containing the states right after SET or RESET over 1000 cycles.
In order to understand the origin of this failure, exemplary traces out of this data set are shown in Fig. 3. In the five columns, selected bits with different numbers of failed cycles are depicted. The first column (Fig. 3, a), b)) shows reference bits for which no failures have been observed. The remaining traces are sorted by the number of cycles that a bit was stuck in LRS within the monitored 1k cycles. This ranges from only a few cycles in Fig. 3, c), d) to more than 600 cycles in Fig. 3, i), j). The traces show that bits can become stuck in LRS spontaneously, as in Fig. 3, c), or that the HRS drifts towards LRS from cycle to cycle until the bit becomes stuck (cf. Fig. 3, h)). Although cells can remain stuck over many cycles (> 1000), this is not primarily a permanent device failure. Rather, the individual cells end up in a state from which a successful RESET becomes less likely. Thus, bits are typically observed to recover after a few or several cycles. Other bits seem to continuously alternate between functioning and faulty, as in Fig. 3, g). It may be noted that the cases in the different columns are not equally likely. RESET failures with long lifetime are much less likely than failures over single or very few cycles.
However, two general observations are made: Firstly, stuck bits usually exhibit a comparatively high HRS current before failing or after recovery. Secondly, the stuck-at-LRS bits typically exhibit a rather high read current in the problematic LRS. Whereas the first observation might be an indicator of how bits become stuck, the second observation of low LRS resistances hints towards a possible explanation of the failure mechanism: In the interplay of the ReRAM cell and the series resistance of the periphery (including the access transistor), a too low cell resistance might lead to an insufficient electric field across the cell, as the major part of the voltage drops across the periphery. In the following section we discuss this mechanism using a simple model before we compare our experimental findings to the results of KMC simulations.

C. PHENOMENOLOGICAL MODEL
The origin of the RESET failure can be explained by a simple model. In Fig. 1 an extract of a single memory element in a typical 1T-1R configuration was already presented. The transistor shown on the right can be controlled by the word line voltage V WL , the ReRAM can be read or switched by the applied bit line and source line voltages V BL and V SL . During the RESET, the resistance of the transistor is nearly constant. Therefore, the transistor as well as the line resistances and other parts of the circuitry are summed up to a resistance of the periphery R per . Thus, as can be seen in Fig. 4, a series connection of the ReRAM cell with a cell resistance R cell and the resistance of the periphery R per is taken to represent the 1T-1R element. This results in a voltage divider where the totally applied voltage V tot splits up into the voltage dropping over the periphery V per and the voltage dropping over the ReRAM cell V cell . This voltage V cell is crucial with regard to the switching process, as the switching time depends exponentially on it [28]. According to the voltage divider, V cell can be calculated to
$$V_{\mathrm{cell}} = V_{\mathrm{tot}} \cdot \frac{R_{\mathrm{cell}}}{R_{\mathrm{cell}} + R_{\mathrm{per}}},$$
which therefore depends on three parameters: The first parameter is the externally applied voltage V tot , which is in good approximation constant for all times and all devices. The second parameter is the resistance of the periphery R per , which is constant over time for a single device, but has a certain variance from device to device. The last parameter is the resistance of the cell R cell , which varies during switching and is different for each device after every switching cycle. In particular, R cell in the LRS depends strongly on the preceding SET event. The influence of both varying parameters, R cell and R per , on V cell is shown in Fig. 5. It can be seen that, depending on the possible cell and periphery resistances, V cell can vary significantly. Realistic values were chosen for the shown ranges of R cell and R per . It is observed that V cell can lie in a range of approximately 0.9 V to 1.4 V. As an example, a cell with R per = 3600 Ω is marked for a common cell resistance of R cell = 3500 Ω (A) and a low-ohmic state with R cell = 2500 Ω (B) in Fig. 5. Between these two states, a change in V cell of about 0.2 V can be observed. Since it has been demonstrated that a decrease of V cell by 0.1 V can increase the RESET time by one order of magnitude [28], this huge variance in V cell leads to a strongly varying switching behavior from cell to cell and from cycle to cycle.
FIGURE 3. The traces are sorted column-wise regarding the number of failed RESETs. a), b) show good reference bits which were not observed to be stuck in this interval. c), d) comprise few failed cycles, which increases towards i) with more than 600 and j) with all 1000 failed cycles. Stuck bits can occur spontaneously or appear with a gradually increasing HRS current. The failure is not permanent, but can be recovered. Stuck bits typically come from higher HRS current and comprise high LRS current when they are stuck.
FIGURE 5. Cell voltage in dependence of cell resistance and periphery resistance. As an example, a cell A with R per = 3600 Ω and R cell = 3500 Ω is shown. For cell B, with the same R per , but programmed into a low-ohmic state with R cell = 2500 Ω, V cell is lowered by approx. 0.2 V.
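The example of cells A and B can be reproduced numerically with the voltage divider. The following is a minimal sketch under two assumptions: V tot = 2.4 V is borrowed from the RESET pulse used in the simulation section, and the resistances are taken in ohms (only their ratio matters for V cell). The one-decade-per-0.1 V rule is the dependence cited from [28]; the function name `cell_voltage` is hypothetical.

```python
def cell_voltage(v_tot: float, r_cell: float, r_per: float) -> float:
    """Voltage across the ReRAM cell in a series connection with R_per."""
    return v_tot * r_cell / (r_cell + r_per)


V_TOT = 2.4     # volts; assumption: RESET pulse height from the simulation section
R_PER = 3600.0  # periphery resistance of the marked example cell

v_a = cell_voltage(V_TOT, 3500.0, R_PER)  # cell A: common cell resistance
v_b = cell_voltage(V_TOT, 2500.0, R_PER)  # cell B: low-ohmic state
delta = v_a - v_b

# Cited rule of thumb: 0.1 V less cell voltage -> RESET time up by one decade.
slowdown = 10 ** (delta / 0.1)

print(f"V_cell(A) = {v_a:.3f} V, V_cell(B) = {v_b:.3f} V")
print(f"difference = {delta:.3f} V, RESET roughly {slowdown:.0f}x slower for B")
```

Under these assumptions the difference is close to the 0.2 V stated in the text, which corresponds to a RESET time that is about two orders of magnitude longer for cell B.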
Most of the cases are not problematic, as the RESET conditions are nicely adjusted. But in rare cases of a very high periphery resistance R per in combination with a very low cell resistance R cell , the voltage dropping over the cell V cell is too low to drive the RESET process within the applied pulse width. This failure process becomes even rarer, as programming algorithms increasing the applied voltage V tot or the time of the switching pulse are able to switch devices that failed or only partly switched in the first try. Only for very extreme configurations, the devices are stuck in the LRS.
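How rarely such unfavorable combinations occur can be illustrated by sampling both resistances and counting the cells whose V cell falls below a minimum value. This is only a sketch: the Gaussian distributions, their means and widths, and the 1.0 V threshold `v_min` are hypothetical values, not measured parameters of the test-chip.

```python
import random


def stuck_fraction(n_cells: int, v_tot: float, v_min: float,
                   r_cell_mu: float, r_cell_sigma: float,
                   r_per_mu: float, r_per_sigma: float,
                   seed: int = 0) -> float:
    """Fraction of sampled cells whose divider voltage stays below v_min."""
    rng = random.Random(seed)
    stuck = 0
    for _ in range(n_cells):
        # Clamp to 1 ohm so that a far-out sample never becomes non-physical.
        r_cell = max(rng.gauss(r_cell_mu, r_cell_sigma), 1.0)
        r_per = max(rng.gauss(r_per_mu, r_per_sigma), 1.0)
        if v_tot * r_cell / (r_cell + r_per) < v_min:
            stuck += 1
    return stuck / n_cells


# Hypothetical spreads for the LRS cell resistance and the periphery resistance.
params = dict(v_min=1.0, r_cell_mu=3500.0, r_cell_sigma=400.0,
              r_per_mu=3000.0, r_per_sigma=300.0)

f_low = stuck_fraction(100_000, v_tot=2.4, **params)
f_high = stuck_fraction(100_000, v_tot=2.6, **params)
print(f"V_tot = 2.4 V: {f_low * 1e6:.0f} ppm below v_min")
print(f"V_tot = 2.6 V: {f_high * 1e6:.0f} ppm below v_min")
```

Because both runs use the same seed, they evaluate identical cells, so the second run shows directly how a higher applied voltage in a later algorithm step removes part of the tail.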
During the SET operation, the periphery resistance determines to a large extent in which resistance the cell ends up. The resulting LRS is expected to be in the range of R cell ≈ R per , as the applied voltage would drop mainly across the periphery resistance as soon as the cell resistance becomes significantly smaller. For RESET, the transistor is opened further to ensure a sufficiently large voltage drop over the cell. However, the body effect has to be considered. Since the access transistor needs to be operated in two polarities (for SET and RESET), its bulk and source cannot be connected. In SET direction, the voltage is applied at the BL with the transistor source connected to ground. However, during RESET the effective source of the transistor is located at the middle node in Fig. 1, b), between the active electrode of the ReRAM element and the transistor channel. This reduces the effective voltage between gate and source and thus increases the resistance of the channel. As this results in a higher periphery resistance during RESET compared to SET, the body effect strengthens our prediction for the cause of the RESET failure.

D. GENERATION
Having established the reason for the RESET failure, the question arises how the unfortunate combination of low cell resistance and high periphery resistance is generated. One reason is the body effect, as discussed above. During SET, the cell is programmed to a state determined by the series resistance of the periphery. If this resistance increases upon attempted RESET, this process is already more challenging. As another possible origin of the RESET failure, the formation of an additional filament was reported [23]. However, neither explanation accounts for the increasing number of failed bits with higher cycle numbers (cf. inset in Fig. 2, a)). In order to understand the generation of stuck bits, the read currents of the last successful SET and RESET in our experimental data are traced and compared to the probability to become stuck in the following cycle. As depicted in Fig. 6, a), this probability increases for cells with a high read current in HRS, close to the read window. This fits the qualitative observation in Fig. 3 that problematic bits are often accompanied by high HRS current before becoming stuck or after recovery. Additionally, Fig. 6, a) shows that an LRS read current close to the read window indicates a higher probability to become stuck in the next cycle. This is counter-intuitive, as bits are typically observed to become stuck in a high current LRS. However, note that the figure only contains the last good cycle. Bits that are already in the problematic high current state would not appear here. In both cases (HRS and LRS), the bits close to the read window are likely to become stuck afterwards. This hints towards the applied programming algorithm as a possible cause of bits becoming stuck, as the decision whether further programming pulses are executed depends on a threshold current in the read window.
It seems likely that in particular those bits that hardly reach the specified threshold endure further attempts by the programming algorithm, where the later steps typically comprise longer pulse width, V BL or V WL . To understand the impact of the programming algorithm, an exemplary algorithm for the SET operation is studied experimentally in Fig. 6, b). The algorithm consists of four steps with increasing V WL and thus increasing maximum SET current. Between the steps, all bits are read and the number of bits which failed to reach the threshold current of a successful SET are evaluated. As expected, it can be seen that this number decreases with each executed step. However, the study also shows that the number of fail-bits increases over the number of cycles. Note that here, fail-bits are those which did not SET successfully. This means that the SET operation becomes more difficult at higher cycle numbers. In order to reach a fixed threshold current, the algorithm would therefore more often execute the later steps with higher V WL , which increases the likelihood to end up in the unfortunate combination of R cell < R per . Especially, an increasing V WL seems to be an operation to more likely generate these bad combinations. It is conceivable that over several steps the limiting factor is the switching kinetics and thus the pulse length and height (V BL ). If, nevertheless, the gate voltage V WL is increased, the cell might endure a comparatively high SET current as soon as the switching event occurs.

IV. 1D KMC SIMULATION
To further support our theory and to investigate the origin and behavior of the presented RESET failure on another level, a one-dimensional kinetic Monte Carlo (KMC) simulation has been used. Basically, the JART VCM 1.0 model by Torre [29], [30] was adapted by adding typical KMC methods for the central transition process [31]. On the one hand, the high performance of the compact model allows investigating high cell statistics with a reasonable amount of computation time. On the other hand, the KMC extension enables the investigation of statistical effects with respect to the influence of random processes.

A. MODEL
The physics-based compact model is based on several general assumptions. The model assumes a preexisting filament connecting the two electrodes, which shows a high number of oxygen vacancies. The filament has a time-invariant radius r and is divided into a plug and a disc region with a uniform oxygen vacancy concentration N disc and N plug each. The lengths of both regions l disc and l plug are also constant, whereas the oxygen vacancy concentrations in the regions are the only state variables changing over time. The total number of oxygen vacancies in the cell N cell = N disc + N plug is constant. Oxygen vacancies are treated as doubly positively charged donors that can be moved via drift in an electric field enhanced by temperature, which is assumed constant over the whole filament. Diffusion and thermodiffusion are neglected. Additionally, the metal/oxide interfaces are modelled as a Schottky contact at the active top electrode and as an ohmic contact at the bottom electrode. An equivalent circuit diagram of the model is presented in Fig. 7. Here, the five elements of our model can be seen, namely the Schottky diode, the disc and plug resistances R disc and R plug , the series resistance R ser at the ohmic electrode and the periphery resistance R per including the transistor. To make the connection to the phenomenological model introduced before clear, the first four elements have been summed up to the total cell resistance R cell in Fig. 7. As a central part of the model, Kirchhoff's law is solved, with the current being denoted by I . The calculation of R disc and R plug is based on a band conduction mechanism with a temperature-dependent mobility [29]. It involves the cross-sectional area of the filament A = πr², the charge number of the oxygen vacancies relative to the perfect crystal z Vo , the temperature-independent prefactor of the mobility µ n0 and a small activation energy E ac modelling the temperature dependence of the mobility.
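For orientation, a resistance expression that is consistent with the symbols listed above would take the following form. This is an assumed reconstruction in the style of the compact model of [29], not an expression confirmed by the present text; e denotes the elementary charge, k B the Boltzmann constant and T the filament temperature, and N disc/plug is read here as a concentration.

```latex
R_{\mathrm{disc/plug}}
  = \frac{l_{\mathrm{disc/plug}}}
         {z_{V_\mathrm{O}}\, e\, N_{\mathrm{disc/plug}}\, \mu_n\, A},
\qquad
\mu_n = \mu_{n0} \exp\!\left(-\frac{E_{\mathrm{ac}}}{k_\mathrm{B} T}\right)
```

In this form, a higher vacancy concentration in the disc lowers R disc and thereby R cell, which is the dependence exploited in the simulations when N disc is increased.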
The temperature is calculated from the Joule heating, which is described by an effective thermal resistance R th,eff , with T 0 denoting the ambient temperature. The detailed calculation of V Schottky can also be found in [29]. The oxygen exchange between the disc and the plug region via the ionic current is calculated by typical KMC methods. For both directions, i.e., a jump of an oxygen vacancy from the plug to the disc region and vice versa, jumping rates are calculated from the characteristic vibration frequency ν 0 and the hopping barrier. For the hopping barrier W A , a typical value for the present metal-oxide is used and modulated by the Genreith-Schriever approach as described in [29] and [32], depending on the jump direction. A factor γ modifies the hopping barrier; it depends on the hopping distance a and on the electric field, which is calculated for the RESET. Then, randomly but weighted by the probabilities, one of the two processes is chosen. The time of the process t jump is calculated using r 1 , a random number between 0 and 1. Finally, the simulation time is increased by t jump and the chosen process is executed by updating N disc/plug , which directly influences R disc/plug and thus the current through the ReRAM cell. The simulation parameters are given in Table 1.
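The KMC step described above can be sketched as follows. This is an illustration under stated assumptions rather than the model used in this work: the barrier modulation follows the field-dependent form commonly associated with the Genreith-Schriever approach, the rates are taken as ν 0 · exp(−W/(k B T)) per region, the waiting time uses the standard KMC expression t jump = −ln(r 1 )/Γ tot , and the values of W A , a, ν 0 , the field and the temperature are placeholders. The names `barriers` and `kmc_step` are hypothetical.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def barriers(w_a: float, field: float, a: float = 0.25e-9, z: int = 2):
    """Field-modified hopping barriers in eV (along, against the field).

    Assumed form: gamma = z*a*E / (pi*W_A); with W_A in eV, a in m and
    E in V/m the product z*a*E is directly an energy in eV.
    """
    gamma = z * a * field / (math.pi * w_a)
    if not 0.0 <= gamma < 1.0:
        raise ValueError("field outside the validity range of the barrier model")
    common = math.sqrt(1.0 - gamma ** 2) + gamma * math.asin(gamma)
    half_turn = gamma * math.pi / 2.0
    return w_a * (common - half_turn), w_a * (common + half_turn)


def kmc_step(n_disc: int, n_plug: int, field: float, temperature: float,
             rng: random.Random, w_a: float = 0.9, nu0: float = 2e13):
    """One KMC event: move one vacancy and return (n_disc, n_plug, t_jump)."""
    w_along, w_against = barriers(w_a, field)
    # Assumption for RESET: the field drives vacancies from the disc to the plug.
    rate_out = nu0 * math.exp(-w_along / (K_B * temperature)) if n_disc > 0 else 0.0
    rate_in = nu0 * math.exp(-w_against / (K_B * temperature)) if n_plug > 0 else 0.0
    total = rate_out + rate_in
    if total == 0.0:
        raise ValueError("no vacancy available for a jump")
    # Choose one of the two processes, weighted by its rate.
    if rng.random() * total < rate_out:
        n_disc, n_plug = n_disc - 1, n_plug + 1
    else:
        n_disc, n_plug = n_disc + 1, n_plug - 1
    # Waiting time; 1 - random() lies in (0, 1], so the logarithm is defined.
    t_jump = -math.log(1.0 - rng.random()) / total
    return n_disc, n_plug, t_jump


rng = random.Random(1)
n_d, n_p, t = kmc_step(100, 50, field=1e8, temperature=600.0, rng=rng)
print(n_d, n_p, t)
```

After each such step, the simulation time would be advanced by `t_jump` and the resistances recomputed from the updated vacancy numbers, as described in the text.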

B. RESULTS
In the first step, the known parameters R ser and l cell , as well as several material parameters, are implemented. Furthermore, the unknown values of the filament geometry l disc , l plug and the radius r are chosen in a reasonable way. The number of oxygen vacancies in the plug region N plug is chosen with regard to typical oxygen vacancy densities used in simulations for VCM cells [30], [33]. The key parameter is the number of oxygen vacancies in the disc region N disc , which is chosen quite high since the LRS is investigated in the beginning. Additionally, a variability known from the experiment is added to the values of R per , and an estimated variability (σ ≈ 25) is added to N plug and N disc , as VCM cells typically show a cell-to-cell and a cycle-to-cycle variability leading to a certain width of the current or resistance distributions. The RESET is performed by applying a voltage V tot to the active electrode. Here a pulse sequence of a short read pulse with a low voltage of V read = −0.2 V, a longer RESET pulse of V tot = 2.4 V and again a short read pulse is applied. The initial LRS current distribution and the HRS current distribution after RESET are presented as dark blue curves in Fig. 8 and fit nicely to the experimental data. We are interested in very rare events that only occur when very high numbers of cells and large numbers of switching cycles are considered in the experiment. Due to the high amount of computing resources needed, simply simulating even more cells would be very inefficient. Furthermore, increasing the variability of several simulation parameters is not reasonable, as the current distributions of LRS and HRS are very stable overall, except for the single cells becoming stuck at high currents. From our simple voltage-divider model above, the RESET failure is assumed to occur for bad combinations of R per and R cell .
As the properties of R per are well known from the experiment and expected to be stable for a single cell, R cell has to be changed in our simulations. Here, the parameter of choice is the number of oxygen vacancies in the disc region N disc . As can be seen in the color gradient of Fig. 8, N disc has been increased continuously and, concomitantly, R cell has been reduced. Thereby, the probability to observe bad combinations of R per and R cell is steadily increased. As a result, more and more bits that are not completely switched to the HRS can be observed in the light blue and green curves of Fig. 8. In the bottom right corner of the orange and red curves, many cells even appear that did not change their resistance, or changed it only slightly, after the RESET pulse. In Fig. 9, the read currents of all cells in the LRS before the RESET pulse I LRS and after the RESET pulse I RESET are presented. It can be seen that cells with lower I LRS can be easily switched to the HRS. At higher I LRS , many cells can be observed that do not or only partly switch towards the HRS. From the color of the data points, N disc can be read out similarly to the color gradient in Fig. 8. Here again, it is visible that the cells with high N disc are much more prone to the RESET failure.
In a second step, we have a closer look at the properties of the cells that have not or only been partly switched from the LRS towards the HRS. In Fig. 10, the probability of a cell to not switch properly is presented in dependence on R per and N disc . N disc is strongly correlated to the resistance of the cell R cell , leading to a lower R cell the higher N disc is. In Fig. 10, a), all cells that did not switch properly to the HRS are presented. It can be seen that at least a combination of a medium N disc and a high R per or a high N disc and a medium R per is needed to obtain the failure mechanism. In Fig. 10, b) only cells that are stuck very deep in the LRS above a high current level are shown. To obtain these bits that do not or hardly switch, a combination of high N disc and high or very high R per is necessary, which fits well to our simple model idea from above. In the next step, we want to have a closer look at typical cells that only partly switched and typical cells that did not switch under regular conditions. Hence, cells with a medium N disc and a medium to high R per on the one hand (cell-type A) and cells with a high N disc and a very high R per on the other hand (cell-type B) are taken. Now, additionally to the regular switching conditions, the switching behavior of the cells under longer switching pulses or switching pulses with higher V tot is investigated. Thus, both cell-types are pulsed with a second pulse after the first normal RESET pulse. The second pulse either has an increased pulse time or an increased applied voltage V tot = 2.6 V. The current distributions of the initial LRS distributions as well as the current distributions after the normal and the second longer or stronger RESET pulse are presented in Fig. 11. As expected, it can be seen that the cells of type A (red) can partly be switched towards HRS, whereas the cells of type B (blue) are nearly completely stuck in the LRS after a normal pulse (solid line). With a second, longer pulse (dashed line), the majority of the cells of type A can be switched to the HRS, whereas the cells of type B remain stuck in the LRS. Alternatively, with a second stronger pulse, nearly all cells of type A can be switched to the HRS and even some cells of type B switch towards the HRS. On the one hand, this explains why, in the experiment as in Fig. 2, a), nearly no cells occur that only partly switched towards the HRS. The cells that initially only partly switch towards the HRS can be completely switched via additional pulses by a programming algorithm. The cells that cannot even be switched via additional pulses are typically stuck at a deep LRS state. On the other hand, the possibility to recover many of the cells initially stuck at the LRS via programming algorithms explains the rare occurrence of this failure mechanism in the experiment. Furthermore, it can be mentioned that the increase of V tot during the programming algorithm seems to be more effective than increasing the pulse time.
FIGURE 11. On the right, the LRS current distributions of cells of type A (red) and B (blue) are presented. The solid lines show the current distributions after a normal RESET pulse, as before. As expected, cell-type A only partly switches towards HRS and cell-type B is completely stuck at LRS. With a second longer pulse (dashed line) or pulse with higher voltage (dotted line) the RESET is tried again. The stronger pulse has a much higher impact than the longer pulse. Cell-type A has a higher chance to reach the HRS with a second pulse, whereas type B tends to become stuck in the LRS.
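Why a stronger pulse can outperform a longer one may be estimated from the voltage divider together with the dependence cited from [28], namely one decade in RESET time per 0.1 V of cell voltage. The sketch below treats R cell as constant during the pulse, which is a simplification, and the resistance values are hypothetical; `equivalent_pulse_stretch` is an illustrative name.

```python
import math


def equivalent_pulse_stretch(delta_v_tot: float, r_cell: float, r_per: float,
                             volts_per_decade: float = 0.1) -> float:
    """Pulse-length factor that matches a rise of V_tot by delta_v_tot.

    Only the share R_cell / (R_cell + R_per) of the extra voltage reaches
    the cell; each `volts_per_decade` of cell voltage is assumed to speed
    up the RESET by a factor of ten.
    """
    delta_v_cell = delta_v_tot * r_cell / (r_cell + r_per)
    return 10 ** (delta_v_cell / volts_per_decade)


# Raising V_tot from 2.4 V to 2.6 V for two hypothetical cells:
balanced = equivalent_pulse_stretch(0.2, 3000.0, 3000.0)  # R_cell = R_per
low_ohmic = equivalent_pulse_stretch(0.2, 2500.0, 3600.0) # R_cell < R_per
print(f"balanced cell:  like a {balanced:.1f}x longer pulse")
print(f"low-ohmic cell: like a {low_ohmic:.1f}x longer pulse")
```

In this estimate, a 0.2 V higher V tot acts like a pulse that is several times longer, and the benefit shrinks as R cell drops relative to R per, which is in line with cells of type B being the hardest to recover.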
Although we are able to reproduce and explain most of the experimentally observed properties of the stuck-at-LRS failure mechanism, our phenomenological model and our 1D KMC model are limited. So far, our models cannot thoroughly explain the increase of the read current of single cells during RESET when becoming stuck. In the 1D KMC model, an increase of the current can indeed be observed due to single oxygen vacancies jumping randomly from the plug to the disc region during the RESET. But, in comparison with the experiment, this increase in current is too low. Thus, we propose to consider and investigate additional effects like diffusion, thermodiffusion or an additional oxygen exchange at the AE in future works [20], [28], [34]. Furthermore, in our model, so far, we assumed N cell to be constant. Nevertheless, e.g., due to oxygen exchange at the interface of the OE and the oxide or thermodiffusion, N cell could increase during cycling. This comes along with an increased probability of reaching unfortunate combinations of R cell and R per . Once reaching such a bad combination with high currents, thermodiffusion or oxygen exchange are even more encouraged. This would explain why the observed failure events increase over time in the experiment.

V. CONCLUSION
In our work, we presented the generally great endurance of 2 Mbit ReRAM based on VCM integrated in a 1T-1R configuration in 28 nm CMOS technology. Nevertheless, looking at high cell statistics and high switching cycle numbers, a rare failure mechanism is observed after RESET. As the failure events increase over the cycle number, we investigated them in more detail. Hence, a phenomenological model simplifying the ReRAM structure into a voltage divider consisting of a cell and a periphery resistance was introduced. With this simple model, the RESET failure was proposed to occur for very rare and unlucky combinations of low cell and high periphery resistances. In the next step, we dealt with the question of how these unlucky combinations arise and why they become more frequent over time. Here, the SET event and the dedicated program-verify algorithm play an important role. In the last step, a 1D KMC model was introduced. The high performance of compact modelling was combined with the ability to investigate random processes by the integration of KMC methods. This allowed us to simulate the RESET process with huge statistics and with special regard to rare processes. In our simulation, the RESET process could be nicely reproduced, and the failure events could be provoked by increasing the number of oxygen vacancies in the disc region. We saw that, as we assumed, cells with a bad combination of cell and periphery resistance are prone to this failure mechanism. Finally, we looked at the impact of possible programming algorithms and the outcome of longer or stronger RESET pulses. These programming algorithms have to be further optimized to prevent permanent RESET failures of the cells.