Investigation on the implementation of stateful minority logic for future in-memory computing

In-memory computing is one of the most promising ways to overcome the latency and power consumption problems of traditional von Neumann architectures in the era of the Internet of Things and big data. In-memory computing based on memristors is being widely studied because of its simple structure, high integration density, and compatibility with CMOS technology. Minority logic is considered well suited to realizing computing functions, and a memristor-based basic cell implementing minority logic has been proposed. However, our analysis shows that this cell places requirements on the electrical characteristics of the device. Even after fabricating resistive random access memory (RRAM) devices that meet these requirements, the cell could not be demonstrated successfully. Theoretical derivation and simulation indicate that resistance variation is the main reason for the wrong results. Moreover, with existing technology, the high variation of devices makes the basic logic cell challenging to demonstrate.


I. INTRODUCTION
Traditional von Neumann architecture faces significant power-consumption challenges caused by the separation between memory and CPU. Exploring in-memory computing (IMC) using beyond-CMOS devices is considered a promising pathway towards low-power computation. Among these, IMC based on RRAM [1], [2] has attracted great interest because of its simple structure, high density, good scalability, and CMOS compatibility. RRAM-based IMC can be divided into two types: stateful logic and memristor-transistor logic. Several stateful logic methods have been proposed to realize logic functions, and further logic circuits, in the memory array, e.g., material implication (IMPLY) [3] and memristor-aided logic (MAGIC) [4]. In 2008, HP Labs proposed the IMPLY structure and were the first to use the resistance state of RRAM as a logical variable. However, IMPLY and its variants [5] require additional, stable resistors. Moreover, complex logic functions take many operation steps, and the output overwrites an input after each operation. In 2012, the MAGIC structure was proposed to remove the external resistor and avoid overwriting the inputs. Design practices for logic circuits based on the MAGIC NOR gate were studied and effective synthesis methods were reported [6]. However, complex logic functions still require many operation steps because the NOR function has low flexibility. In addition, the stringent requirement on VSET/VRESET [7], which most RRAM devices cannot satisfy, is a severe challenge. Because more peripheral circuits are needed to realize logic operations in the memory array [8], stateful logic usually suffers from higher latency and power consumption.
Memristor-transistor logic (MTL) is another promising technology, analogous to conventional CMOS logic [9]. The CMOS-like gates have a structure similar to their CMOS counterparts, with RRAM devices that have threshold voltages used in place of NMOS/PMOS transistors. Voltages represent both the logic input and logic output variables. MTL also has the advantage of being better suited to existing EDA tools for circuit design optimization. Its disadvantage, however, is an extremely slow speed compared with CMOS.
Recent research [10]-[13] has confirmed that majority logic can implement arithmetic-intensive circuits with fewer gates, and majority-based logic has been shown to achieve up to a 33% reduction in logic depth compared with AND-based logic [13]. For example, compared with IMPLY, the computation efficiency of addition doubles with majority logic, which is one particular case of threshold logic [14]. Various realizations of majority logic have been demonstrated on RRAM arrays, using either voltage/resistance [15] or resistance/voltage [16] as the input/output state variables.
The key challenges are: 1) these are non-stateful logic schemes in which conversion between voltage and resistance is required, which complicates both the operation procedure and the circuit design; 2) majority logic is functionally incomplete and requires a separate logic NOT, making a homogeneous implementation with a simple design impossible.
Minority logic, which is inverted majority logic, is functionally complete and can therefore be an ideal alternative. A quantum-dot cellular automata (QCA)-based minority logic gate has been fabricated, and circuit modules based on minority logic, such as a 2-to-4 decoder and a multiplexer, have been designed and simulated. Recently, a theoretical design for stateful minority logic using RRAM, the so-called fast and energy-efficient logic-in-memory (FELIX), was proposed by S. Gupta et al. [17]. The aforementioned challenges can be tackled if FELIX is realized with practical RRAM technologies. In this paper, we derive the requirements that the implementation of FELIX places on the asymmetry of the RRAM I-V curve. Cu/ZnO/Pt devices that meet these requirements are fabricated. By testing the basic circuit cell with different input patterns and checking the logic output, we find that the output of one case differs from the theoretical value. Based on the analysis, we show for the first time that FELIX is difficult to realize because of unavoidable intrinsic variation.

II. BASIC LOGIC CELL
The resistance state of the RRAM is used as the logical variable; here we define the high resistance state (HRS) and low resistance state (LRS) of the memristor as logic 0 and logic 1, respectively. By using Ohm's law and Kirchhoff's laws, both data storage and logic computing can be realized within the memory array, which addresses the "storage wall" problem caused by the separation of memory and CPU described above.
The structure of the basic minority logic cell is shown in Fig. 1(a), in which the initial resistance states of RRAMs A, B, and C are the logical inputs, while the final resistance state of Y after the operation is the logical output. The operation consists of two steps: the first step initializes Y to LRS, and the second step applies Vo to the top terminals of A, B, and C while grounding the top terminal of Y. The voltage Vx at the shared bottom-electrode node is then determined by the stored logic values, i.e., the resistance states of the input RRAMs. All possible inputs can be divided into four categories: LLL, LLH, LHH, and HHH, where each character denotes the state of one input RRAM. Different inputs produce different Vx, which in turn affects the switching behaviour of Y. Since mature RRAMs have a large resistance ratio, an RRAM in HRS can be treated as an open circuit when it is in parallel with RRAMs in LRS. Thus, Vx is controlled only by the RRAMs in LRS. The requirements on the switching voltage are elaborated below:
• Inputs with more than one RRAM in LRS (LLH or LLL): the total resistance of the input RRAMs is smaller than that of Y, so more than half of Vo falls on Y. Specifically, Vx = 3/4 Vo under LLL and Vx = 2/3 Vo under LLH. Y must be RESET to HRS in both cases.
• Inputs with more than one RRAM in HRS (LHH or HHH): the total resistance of the input RRAMs is equal to or larger than that of Y, so Vx does not exceed half of Vo. Specifically, Vx = 1/2 Vo under LHH and Vx ≈ 0 under HHH. In either case, no RESET should happen, and Y keeps its initial state.
According to the analysis above, the truth table for this basic cell is shown in Fig. 1(b), and the derived logic function is written as Eq. (1):
$Y = \overline{AB + BC + AC}$  (1)
The logic function can also be rewritten by a simple derivation, for example as $Y = \overline{AB + C(A + B)}$, from which we can see that if C is preset to logical 1 (LRS written to the memory cell), a NOR gate is formed; if C is preset to logical 0 (HRS written to the memory cell), a NAND gate is formed. Furthermore, when B and C are preset to 1 and 0, respectively, a NOT gate for input A is formed.
Because both the NAND gate and the NOR gate are functionally complete, any digital circuit can be realized with this basic logic function, so realizing logic operations within the memory structure is feasible. Compared with other logic cells realized in existing in-memory computing architectures, this logic function has more degrees of freedom, which may help when realizing complex digital circuits. For example, the output of the proposed structure is the complement of the carry-out signal Cout of a full adder, Cout = AB + BCin + ACin.
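The behaviour of the basic cell described above can be checked with a short numerical sketch. It is an idealized illustration only: the resistance ratio of 1000 and the RESET threshold of 7/12 Vo (the middle of the 1/2 Vo to 2/3 Vo window) are assumptions chosen for the example, and the function names are hypothetical.

```python
from itertools import product


def parallel(resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def cell_output(a, b, c, r_lrs=1.0, ratio=1000.0, reset_fraction=7 / 12):
    """Idealized minority cell. Bits use the paper's mapping: LRS = 1, HRS = 0.

    Assumptions for illustration: Y starts in LRS, HRS = ratio * LRS, and Y
    resets whenever Vx/Vo exceeds reset_fraction (i.e. |VRESET| / Vo).
    """
    r_hrs = ratio * r_lrs
    r_up = parallel([r_lrs if bit else r_hrs for bit in (a, b, c)])
    r_down = r_lrs                       # output RRAM Y, initialised to LRS
    vx = r_down / (r_up + r_down)        # Vx normalised to Vo
    return 0 if vx > reset_fraction else 1   # RESET -> HRS -> logic 0


def minority(a, b, c):
    """Minority of three bits: 1 when at most one input is 1."""
    return int(a + b + c <= 1)


for a, b, c in product((0, 1), repeat=3):
    y = cell_output(a, b, c)
    assert y == minority(a, b, c)
    assert y == 1 - ((a & b) | (b & c) | (a & c))   # complement of Cout

for a, b in product((0, 1), repeat=2):
    assert cell_output(a, b, 1) == int(not (a or b))    # C = 1 gives NOR
    assert cell_output(a, b, 0) == int(not (a and b))   # C = 0 gives NAND

for a in (0, 1):
    assert cell_output(a, 1, 0) == 1 - a                # B = 1, C = 0 gives NOT
```

The assertions cover all eight input patterns, so the sketch confirms that the voltage-divider picture and the Boolean minority function agree under the stated assumptions.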

The expression and operating steps of the 1-bit adder are derived in Table 1, and the proposed circuit structure, which consists of three minority logic gates, is shown in Fig. 1(c). In a ripple-carry adder, the generation and propagation of Cout form the critical path, which determines the speed and power consumption, so an N-bit adder can be faster and consume less power than with other in-memory computing methods. However, the circuit places requirements on the electrical characteristics of the RRAM. On the one hand, when input RRAMs in LRS dominate, the minority logic requires that the output RRAM, Y, resets to HRS when Vx = 2/3 Vo and remains in LRS when Vx = 1/2 Vo. Therefore, the magnitude of the reset voltage, |VRESET|, must lie in between. As a result, the operation voltage Vo should be set to satisfy:
$\tfrac{3}{2}\,|V_{\mathrm{RESET}}| < V_o < 2\,|V_{\mathrm{RESET}}|$  (2-2)
On the other hand, when input RRAMs in HRS dominate, Vx is either 1/2 Vo or 0, neither of which triggers the RESET process. In this situation, however, the input RRAMs are positively biased at 1/2 Vo or more, which can lead to an undesirable SET operation. Therefore, the set voltage, VSET, should lie above the maximum possible bias, which is Vo under the HHH condition. The requirement on VSET can be written as:
$V_{\mathrm{SET}} > V_o$  (2-3)
The relationship between VSET, VRESET, and Vo is shown in Fig. 2. According to (2-2) and (2-3), not only is the range of Vo restricted, but a requirement on the device characteristics also arises, as written in (2-4):
$V_{\mathrm{SET}} > \tfrac{3}{2}\,|V_{\mathrm{RESET}}|$  (2-4)
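To make the voltage conditions above concrete, the following sketch computes the admissible range of Vo for a nominal, variation-free device. The function name is hypothetical; the example values VSET = 2 V and VRESET = -1.3 V are the typical switching voltages reported in Section III.

```python
def vo_window(v_set, v_reset):
    """Return (low, high) bounds on Vo, or None if no Vo satisfies all conditions.

    Conditions taken from the text (nominal devices, no variation):
      |VRESET| < 2/3 Vo  ->  Vo > 1.5 * |VRESET|   (Y must reset under LLH)
      |VRESET| > 1/2 Vo  ->  Vo < 2.0 * |VRESET|   (Y must not reset under LHH)
      VSET > Vo                                    (inputs must not set under HHH)
    """
    vr = abs(v_reset)
    low = 1.5 * vr
    high = min(2.0 * vr, v_set)
    return (low, high) if low < high else None


print(vo_window(2.0, -1.3))   # a narrow window just below 2 V
print(vo_window(1.5, -1.3))   # VSET too close to |VRESET|: no valid Vo
```

For the typical device, the window spans only about 1.95 V to 2.0 V, which already hints at how little room is left once variation is taken into account.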
The range ΔV = 2/3 Vo − 1/2 Vo is an approximation valid when HRS >> LRS, so the general case should be analyzed explicitly. Defining the ratio of HRS to LRS as m, this range is a function of m, which can be written as:
$\Delta V = \left(\frac{2m+1}{3m+1} - \frac{m+2}{2m+2}\right) V_o = \frac{m(m-1)}{2(m+1)(3m+1)}\,V_o$
When m > 20, this range is close to its maximum value of Vo/6. Because of device resistance variation, the high and low resistances have upper and lower limits, defined as RH_max, RH_min, RL_max, and RL_min. Considering the worst case, RH_min > m·RL_max is needed to ensure that each basic cell works normally.
So far, researchers have found that electrochemical metallization (ECM) RRAM has a more asymmetric I-V curve than other memristor types. Fig. 3 collects reported VSET and VRESET values for ECM RRAM [18]-[26], valence change mechanism (VCM) RRAM [27]-[34], ferroelectric tunneling junctions (FTJ) [35]-[38], and magnetic tunneling junctions (MTJ) [39]-[42], together with the line VSET = 1.5|VRESET|. The statistics indicate that, owing to the different conduction mechanism, using Cu or Ag as an electrode leads to a more asymmetric I-V curve.
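The dependence of the voltage window on the resistance ratio can be checked numerically. The sketch below assumes that Y is in LRS, that every HRS equals exactly m times the LRS, and that the window is the difference between Vx for LLH and Vx for LHH; the function name is hypothetical.

```python
def window_fraction(m):
    """(Vx_LLH - Vx_LHH) / Vo for a finite HRS/LRS ratio m, with Y in LRS."""
    vx_llh = (2 * m + 1) / (3 * m + 1)   # pull-up: L || L || mL
    vx_lhh = (m + 2) / (2 * m + 2)       # pull-up: L || mL || mL
    return vx_llh - vx_lhh


for m in (2, 5, 10, 20, 100, 1000):
    # Express the window as a fraction of its HRS >> LRS limit of 1/6.
    print(m, round(window_fraction(m) * 6, 3))
```

The printed values increase monotonically with m and approach 1, i.e. the ideal window of Vo/6.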

III. DEVICE FABRICATION
Here, we fabricated Pt/ZnO/Cu devices, as shown in the inset of Fig. 4, which are intended to serve as ECM devices. A 100 nm Cu layer was deposited on a commercial SiO2/Si substrate by RF magnetron sputtering at room temperature. An 80 nm ZnO film was deposited by plasma-enhanced atomic layer deposition (PEALD), using H2O and diethylzinc as precursors, at a deposition rate of about 0.2 nm per cycle. Sputter-deposited platinum top contacts with diameters of 50 µm to 300 µm were patterned through a shadow mask using a commercial sputter coater. After film deposition, a wet etch was used to expose the Cu bottom electrode for electrical testing. The current-voltage (I-V) characteristics of the Pt/ZnO/Cu/SiO2/Si devices were measured at room temperature in air using a Keithley 4200A-SCS semiconductor parameter analyzer connected to a Kelvin ST-500 probe station. The bias was applied to the top electrode.
The typical I-V curve is shown in Fig. 4. The curve is asymmetric under positive and negative voltage sweeps because the set and reset processes rely on different mechanisms. When a positive voltage is applied to the Cu TE, Cu atoms are oxidized and migrate into the ZnO film as ions under the applied electric field. The Cu ions are eventually reduced at the Pt BE, and a conductive filament (CF) of Cu atoms forms from BE to TE. As a result, a large current flows through the filament and immediately reaches the compliance current (Icc); the corresponding voltage is VSET. Once the bias polarity is reversed and the compliance current removed, the Cu filament is broken by electrochemical dissolution or Joule heating so that only a low current can flow, corresponding to the high resistance state; the corresponding voltage is VRESET. The red line shows the forming process: the initial resistance of the fresh device is 4 kΩ, and it switches to LRS = 70 Ω at VFORM = 2.64 V with a compliance current Icc = 8 mA, which means that a Cu conductive filament first forms through the reduction of Cu ions originating from the top electrode.
The device switches to HRS = 1.5 kΩ at VRESET = -1.3 V, which means that the CF ruptures only partially, leaving a gap between the undissolved CF and the Cu electrode. As a result, the CF forms much more easily in the next positive voltage sweep, corresponding to a lower switching voltage of VSET = 2 V. After a negative voltage is applied, the CF ruptures again, and the device switches back to HRS with a resistance ratio >20. These experimental results imply that the device has |VSET/VRESET| > 1.7 and HRS/LRS > 20, which satisfies the requirements of the basic logic cell described above. Fig. 5 shows the sweep results of 20 cycles on one fresh RRAM device, and the variation characteristics are summarized in Fig. 6. As shown in the upper part of Fig. 6, VSET varies from 1.8 V to 2.4 V while VRESET varies from -0.8 V to -1.2 V, and this range still satisfies the requirement in (2-4). The resistance variation is shown in the lower part of Fig. 6; it also satisfies the requirement of a resistance ratio over 20, with σR = 0.3.
We simulated the basic logic cell in Cadence Spectre based on the VTEAM model. As shown in Fig. 7, a Verilog-A model was designed to fit the experimental data. The simulated logic output is shown in Fig. 8. In each cycle, the inputs change, and after the operation the result is saved as the resistance state of RRAM Y; the results correspond to the truth table from top to bottom. The simulated power consumption for different input patterns is listed in Table 2.

IV. VARIATION ANALYSIS
To verify the functional correctness of the basic logic cell with the fabricated devices, we tested the cell using the structure in Fig. 1, with four probes on the TEs of the RRAMs and one at the intermediate node; the results are shown below. We chose four RRAM cells from the sample and applied a triangular wave to the TEs of the input RRAMs through the probes, with another probe on the ZnO film to record the current through the output RRAM Y. For inputs LLL or LLH, the output RRAM Y should finally reset to HRS because Vx is larger than |VRESET|, as shown in Fig. 9(a) and Fig. 9(b). For inputs LHH or HHH, the output RRAM Y should finally remain in LRS because Vx is smaller than |VRESET|. However, as shown in Fig. 9(c), for input LHH, RRAM Y outputs HRS instead of the correct state of LRS. The output for HHH is correct, as shown in Fig. 9(d).
Here we discuss the reason for the wrong result observed in Fig. 9(c) by analyzing the variation. For simplicity, we first consider only the variation of VRESET. Denoting the mean and standard deviation of VRESET as μVRESET and σVRESET, the following two requirements must be met to guarantee the circuit design with a target yield of three sigma:
• For LLH, the maximum |VRESET| = (1 + 3σVRESET/μVRESET)·μVRESET should not exceed L/(L/2+L)·Vo = 2/3 Vo.
• For LHH, the minimum |VRESET| = (1 − 3σVRESET/μVRESET)·μVRESET should be higher than L/(L+L)·Vo = 1/2 Vo.
The maximum allowable σVRESET occurs when μVRESET is in the middle of the window, i.e., μVRESET = 7/12 Vo, as shown in Fig. 10(a). This gives:
$\frac{\sigma_{V_{\mathrm{RESET}}}}{\mu_{V_{\mathrm{RESET}}}} \le \frac{1}{3}\cdot\frac{1/12}{7/12} = \frac{1}{21} \approx 4.8\%$
As we can see, in this simplified condition, the maximum normalized variation, σVRESET/μVRESET, is a constant. If we further consider the resistance variation, the tolerance is expected to become even lower, as shown in Fig. 10(b).
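The three-sigma bound for the simplified case, where only VRESET varies, can be reproduced with exact fractions. This is a minimal sketch; the function name and its arguments are hypothetical, and voltages are normalised to Vo.

```python
from fractions import Fraction


def max_relative_sigma(lower, upper, n_sigma=3):
    """Largest sigma/mu such that mu +/- n_sigma*sigma stays inside
    [lower, upper] when mu sits at the centre of the window."""
    mu = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return half_width / (n_sigma * mu)


# Window for |VRESET| in units of Vo: 1/2 (LHH) to 2/3 (LLH).
limit = max_relative_sigma(Fraction(1, 2), Fraction(2, 3))
print(limit, float(limit))
```

Because both the window edges and its centre scale with Vo, the resulting bound does not depend on the operating voltage, which is why the normalized tolerance is a constant.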
When the input case is LLH, the left boundary (LB) of the Vx level is determined by the maximum pull-up resistance, which changes from μL/2 to (1+3σL/μL)·(μL/2), and the minimum pull-down resistance, which changes from μL to (1−3σL/μL)·μL. Because the left boundary of Vx must stay above the upper boundary of |VRESET|, a joint constraint on σL and σVRESET is derived in (4-4).
When the input case is LHH, the right boundary of the Vx level is determined by the minimum pull-up resistance, which changes from μL to (1−3σL/μL)·μL, and the maximum pull-down resistance, which changes from μL to (1+3σL/μL)·μL. Because the right boundary of Vx must stay below the lower boundary of |VRESET|, the corresponding expression is given in (4-5).
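The two worst-case conditions can be explored numerically. The sketch below is a reconstruction from the worst-case resistances stated above, not the paper's own expressions: it assumes μVRESET is centred at 7/12 Vo, treats HRS as an open circuit, and uses a hypothetical function name.

```python
def max_reset_sigma(s_l, n_sigma=3):
    """Largest sigma_VRESET/mu_VRESET allowed for a given s_l = sigma_L/mu_L.

    Returns (limit from the LLH case, limit from the LHH case).
    Assumptions: mu_VRESET = 7/12 Vo, HRS is an open circuit, and every LRS
    takes its worst-case value mu_L * (1 +/- n_sigma * s_l).
    """
    d = n_sigma * s_l
    mu_v = 7 / 12                                    # mu_VRESET / Vo
    # LLH: largest pull-up (two LRS in parallel), smallest pull-down.
    vx_llh_min = (1 - d) / ((1 + d) / 2 + (1 - d))
    # LHH: smallest pull-up (one LRS), largest pull-down.
    vx_lhh_max = (1 + d) / ((1 - d) + (1 + d))
    limit_llh = (vx_llh_min / mu_v - 1) / n_sigma    # Vx_min >= max |VRESET|
    limit_lhh = (1 - vx_lhh_max / mu_v) / n_sigma    # Vx_max <= min |VRESET|
    return limit_llh, limit_lhh


for s_l in (0.0, 0.01, 0.02, 0.05):
    print(s_l, [round(x, 4) for x in max_reset_sigma(s_l)])
```

Under these assumptions, both limits start at 1/21 for zero resistance variation, the LHH limit is always the tighter one, and it drops to zero at σL/μL = 1/18, far below the resistance variation reported for the fabricated devices.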
From (4-4) and (4-5), a relationship between σL/μL and σVRESET/μVRESET can be estimated, as shown in Fig. 11(a). The area below both curves contains the combinations of σL/μL and σVRESET/μVRESET that satisfy both equations and implement the logic correctly. We also found that the curve from (4-4) lies above that from (4-5). This is because, when the input case is LLH, the variation is lower due to the parallel connection of two RRAMs; as a result, the spread of the voltage level is smaller than that of LHH, which makes (4-4) easier to satisfy. The points marked in Fig. 11(b) are other kinds of devices taken from the references; they show that it is hard to realize FELIX with existing devices. To show the trend, we simulate the correctness for device parameters with different σL/μL and σVRESET/μVRESET in Fig. 12. As the gradual change of color shows, the correctness gradually decreases to 70%.
Similarly, for the input RRAMs, the voltage drop between TE and BE is (Vo − Vx), which reaches its maximum value Vo in the HHH case, since Vx = 0. As a result, the minimum value of VSET should not be lower than Vo, so that the resistance states of the input RRAMs are maintained in the HHH case, as shown in Fig. 13(a). Since the ideal value of μVRESET is 7/12 Vo, the value of μVSET can be defined through the ratio μVSET/μVRESET, denoted r, so that the requirement can be written as:
$\mu_{V_{\mathrm{SET}}}\left(1 - \frac{3\sigma_{V_{\mathrm{SET}}}}{\mu_{V_{\mathrm{SET}}}}\right) \ge V_o,\quad \mu_{V_{\mathrm{SET}}} = r\,\mu_{V_{\mathrm{RESET}}} = \tfrac{7}{12}\,r\,V_o \;\Rightarrow\; \frac{\sigma_{V_{\mathrm{SET}}}}{\mu_{V_{\mathrm{SET}}}} \le \frac{1}{3}\left(1 - \frac{12}{7r}\right)$
The relationship between the switching voltage ratio r and the requirement on σVSET is shown in Fig. 13(b). For example, with r = 2, the maximum tolerable σVSET ≈ 0.05, and a device with a more asymmetric I-V curve can tolerate more variation of the SET voltage. Simulation results for resistance variation σL = 0.3 and σVRESET = 0.1 are shown in Fig. 14, with partial errors for the middle two cases.
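The trend of correctness against variation can be illustrated with a small Monte Carlo sketch. It is not the Cadence Spectre simulation used in the paper: it assumes Gaussian variation of the LRS resistances and of |VRESET|, treats HRS as an open circuit, centres μVRESET at 7/12 Vo, and only checks the two critical cases LLH and LHH. All names are hypothetical.

```python
import random


def correctness(s_l, s_v, trials=20000, seed=0):
    """Fraction of trials in which LLH resets Y and LHH leaves Y unchanged.

    s_l: relative sigma of the LRS resistance (sigma_L / mu_L)
    s_v: relative sigma of |VRESET| (sigma_VRESET / mu_VRESET)
    """
    rng = random.Random(seed)
    mu_v = 7 / 12                     # mu_VRESET in units of Vo (assumption)
    ok = 0
    for _ in range(trials):
        # LRS values of two inputs and of Y, in units of mu_L (kept positive).
        a, b, y = (max(rng.gauss(1.0, s_l), 1e-3) for _ in range(3))
        v_reset = rng.gauss(mu_v, s_v * mu_v)
        vx_llh = y / (a * b / (a + b) + y)    # two LRS inputs in parallel
        vx_lhh = y / (a + y)                  # a single LRS input
        if vx_llh > v_reset and vx_lhh < v_reset:
            ok += 1
    return ok / trials


for s_l, s_v in ((0.0, 0.0), (0.01, 0.01), (0.1, 0.05), (0.3, 0.1)):
    print(s_l, s_v, correctness(s_l, s_v))
```

With no variation every trial is correct, while the setting σL = 0.3 and σVRESET = 0.1 produces a substantial fraction of errors in the two middle cases, in line with the trend described above.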
According to Table 3, most existing ECM devices have a resistance variation of over 20%. This is because the conductive filament (CF) breaks at different positions, and the metal ions drift and collide under the action of the electric field. The metal ions are then reduced and accumulate at different positions on the bottom electrode or on branches of the CF [43]. Device-to-device variability is attributed to discrepancies in the fabrication process, such as variation in the switching oxide thickness, the surface roughness of the electrodes, and etching damage [44]. These effects eventually lead to uncertainty in the shape of the metal filament and in its contact area with the electrode in each switching process, which is reflected in the variation of resistance and switching voltage.

V. CONCLUSION
This paper analysed the advantages of minority logic over other basic Boolean logic for implementing in-memory computing. Based on the analysis, we found that certain requirements on the RRAM device must be met to realize the basic logic cell in a circuit array. Since there is consensus that ECM RRAM has asymmetric VSET and VRESET, we fabricated Cu/ZnO/Pt devices, whose I-V curve satisfies the requirements. To demonstrate the feasibility of the basic minority logic cell, a cell of four RRAMs was tested. However, one input case failed, and we found that resistance variation narrows the window that separates the different cases. In summary, after exploring several emerging memory technologies that have strong potential to be commercialized in the future, we find that FELIX is incompatible with them because of its tight voltage and resistance variation requirements.