OCCAM: An Error Oblivious CAM

Content addressable memories (CAMs) are widely used in general-purpose computer microarchitecture, networking, and domain-specific hardware accelerators. In addition to storing and reading data, CAMs enable simultaneous comparison of a query dataword with the entire memory content. Like SRAM and DRAM, CAMs are prone to errors and faults. While error correcting codes (ECCs) are widely used in DRAM and SRAM, they are not directly applicable to CAM: if a dataword that is supposed to match a query is altered by an error, it will falsely mismatch even if it is ECC-encoded. We propose OCCAM, an error oblivious CAM, which combines ECC and approximate search (matching) to tolerate a large and dynamically configurable number of errors. We manufactured the OCCAM silicon prototype in a commercial 65-nm process and verified its error tolerance capabilities through silicon measurements. OCCAM tolerates an 11% error rate (7 bit errors in each 64-bit memory row) with 100% sensitivity and specificity.

result in multiple-node upsets [4], as well as to multiple-cell upsets (MCUs) [5], [6]. The ability to tolerate multiple-node upsets and MCUs is especially critical in deep submicron technologies (such as 7 and 5 nm) and under aggressive voltage scaling [7]. Hence, a CAM design that tolerates multiple bit errors per row and at the same time allows voltage scaling for low-power operation is highly advantageous.
In this letter, we design, fabricate in a commercial 65-nm process, and evaluate in silicon OCCAM, an error-oblivious CAM capable of tolerating a large number of errors per row. OCCAM is based on two notions: 1) the stored and query data are encoded using a BCH ECC and 2) the search is approximate, i.e., Hamming distance (HD) tolerant. The BCH ECC [8] guarantees a min codeword HD d, thus ensuring that up to k = ⌊(d − 1)/2⌋ bit changes do not turn one coded dataword into another legitimate coded dataword. In RAM, such a min codeword HD of d allows correcting k = ⌊(d − 1)/2⌋ errors. In OCCAM, such errors are tolerated through approximate, or k-bit-HD tolerant, search, as long as datawords that differ from a query by no more than k bits are identified by OCCAM as matches. This means that OCCAM remains effectively oblivious to up to k errors per row. Based on silicon measurements, OCCAM can be configured to tolerate up to 7 bit errors in each 64-bit memory row with 100% sensitivity and specificity.
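The relation between the min codeword HD d and the tolerated error count k, together with the resulting match rule, can be sketched in a few lines of Python. This is only a behavioral illustration, not the OCCAM circuit: the function names are hypothetical, and memory rows are modeled as plain integers.

```python
def max_tolerable_errors(d: int) -> int:
    """Largest k for which k bit flips cannot reach another codeword: floor((d - 1) / 2)."""
    return (d - 1) // 2


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two words (given as integers) differ."""
    return bin(a ^ b).count("1")


def approximate_match(stored: int, query: int, k: int) -> bool:
    """k-bit-HD tolerant match: the row matches if it differs from the query in at most k bits."""
    return hamming_distance(stored, query) <= k


stored_row = 0b1011001110001111
query = stored_row ^ 0b0000100000000100  # the same word with two bits flipped

print(max_tolerable_errors(9), max_tolerable_errors(15))
print(approximate_match(stored_row, query, k=4))  # two errors, within tolerance
print(approximate_match(stored_row, query, k=0))  # exact search rejects the same row
```

With k = 0 the rule degenerates to the exact match of a conventional CAM, which is why a single bit error in an ECC-encoded row still causes a false mismatch there.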
The main contributions of our work are as follows: 1) to the best of our knowledge, this is the first error-oblivious CAM design that tolerates more than one or two errors per CAM row without memory duplication; 2) OCCAM tolerates upsets of both multiple-node and multiple-cell kind, and retains error tolerance ability with voltage scaling; and 3) we fabricated OCCAM using a commercial 65-nm process and verified its error obliviousness by silicon measurements.

A. Conventional Content-Addressable Memory
Fig. 1(a) shows the architecture of a conventional n × m CAM. A typical NOR-type CAM bitcell is illustrated in Fig. 1(b). It is based on a pair of cross-coupled inverters for storing the data. The bitcell is accessed for write and read similarly to a standard 6T cell, by using the wordline to enable the row access, and driving the searchline (SL) and inverse searchline to opposite values for write, or precharging them for read. The associative search operation is implemented using the M1-M3 transistors. First, the matchline (ML) is precharged. Then, the search word is loaded onto the searchlines. If the value stored in the cell matches the value on the SL, M1 and M2 keep the gate of M3 low, cutting off the ML discharge path. In consequence, the ML remains high, which represents a match. If the SL value differs from the value in the storage cell, M3 turns on and discharges the ML, which yields a mismatch. When the entire n-bit word is considered [see Fig. 1(c)], the ML remains high only if all storage cells match the search pattern, resulting in a word match. Conversely, a single bit mismatch is enough to discharge the ML, resulting in a word mismatch.
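The exact-match behavior described above can be modeled at the logic level as follows. This is a minimal sketch that abstracts away precharge timing and analog effects; the names `matchline_stays_high` and `exact_search` are hypothetical, and rows are modeled as lists of 0/1 values.

```python
from typing import List, Sequence


def matchline_stays_high(stored_bits: Sequence[int], search_bits: Sequence[int]) -> bool:
    """Return True (match) if no bitcell opens a discharge path on the matchline."""
    if len(stored_bits) != len(search_bits):
        raise ValueError("stored word and search word must have the same width")
    # A cell whose stored value differs from the searchline value turns on M3
    # and discharges the precharged matchline.
    discharging_cells = [s != q for s, q in zip(stored_bits, search_bits)]
    return not any(discharging_cells)


def exact_search(rows: List[Sequence[int]], search_bits: Sequence[int]) -> List[int]:
    """Compare the search word against every row and return the indices of matching rows."""
    return [i for i, row in enumerate(rows) if matchline_stays_high(row, search_bits)]


rows = [
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
print(exact_search(rows, [1, 0, 1, 1]))  # only row 0 is identical to the search word
```

In hardware all rows are compared simultaneously; the loop over rows here only stands in for that parallel comparison.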
In this letter, the NOR-type CAM bitcell is modified to support approximate matching, as presented hereafter.

B. Soft-Error Tolerant CAM
Several techniques have been proposed in the literature [2], [9], [10], [11], [12], [13], [14], [15] to enable error resilience in CAMs using: 1) ECC; 2) duplication; and 3) HD-of-1 tolerance, or combinations thereof. The common limitation of these techniques is their inability to tolerate a significant number of errors caused, for example, by multiple-node or multiple-cell upsets. A soft error tolerant CAM was proposed in [2]. It encodes the CAM content and the query patterns using a Hamming code (which ensures a codeword HD of 3), and modifies the sensing scheme to tolerate a one-bit mismatch. This allows tolerating a single bit error in the data stored in a CAM row. Another error-immune CAM design, based on SRAM-based ternary CAM (TCAM) and ECC-protected embedded DRAM, is proposed in [9]. In [10], error detection is achieved by replicating the CAM module and comparing the outputs of the two modules. Several designs employ parity bits and dedicated sensing schemes [11], [12]. However, they can only handle a small HD, typically tolerating one bit error. ECC and duplication are used to achieve error tolerance in [13]. Redundancy is applied to ensure error tolerance in TCAMs by [14]. Bloom filters are used for error detection and correction in CAM by [15].

III. OCCAM DESIGN
OCCAM, the error-oblivious CAM proposed in this letter, is based on NOR CAM. It achieves error obliviousness by enabling approximate rather than exact search, achieved through circuit design and a new user-tunable configuration voltage source. OCCAM assumes that the data and queries are encoded using a code that guarantees a certain min codeword HD (e.g., ECC, such as BCH [8]).

A. Bitcell Design and New Configuration Voltage Source
OCCAM differs from conventional NOR CAM by its bitcell design, which is presented in Fig. 1(d). A new addition to the bitcell, an evaluation transistor (M4), regulates the discharge rate of the ML; M4 is driven by a new configurable voltage source called V_eval. OCCAM performs approximate search when V_eval < V_DD, while conventional exact-match CAM operation is enabled when M4 is driven by the full voltage level, V_eval = V_DD.
Fig. 2 shows the OCCAM approximate search operation. During the precharge step (PC = "0"), the ML is precharged to V_DD. It is followed by the evaluation step (PC = "1"), in which the HD (error) tolerance level is controlled by two voltages. The first is the sense amplifier reference voltage V_ref: lowering V_ref increases the number of errors the CAM can tolerate. In practice, however, even by significantly reducing V_ref, we can confidently achieve a tolerance of no more than one or two errors. To alleviate this fundamental limitation, OCCAM adds another control voltage source, V_eval, which effectively controls the ML discharge rate. The discharge rate is reduced by tuning down V_eval, such that the ML is sampled at a higher voltage level, which results in a match. The lower the V_eval, the higher the OCCAM error tolerance. Setting V_eval sufficiently low allows tolerating a large number of errors, which is impossible by adjusting V_ref only.
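The interplay of the two control voltages can be illustrated with a toy first-order model. Everything in it is an assumption made for illustration and none of it comes from the OCCAM circuit or its measurements: each mismatching cell is assumed to lower the ML voltage at the sampling instant by an amount that follows a square-law dependence on the M4 overdrive, and the threshold voltage `V_TH` and lumped constant `GAIN` are hypothetical values. The absolute numbers the model produces are therefore not meaningful; only the trends are.

```python
V_DD = 1.2   # supply voltage in volts
V_TH = 0.4   # hypothetical threshold voltage of M4, in volts
GAIN = 1.0   # hypothetical lumped constant (1/V): ML drop per mismatching cell at sampling time


def ml_voltage_at_sampling(mismatches: int, v_eval: float) -> float:
    """Toy estimate of the ML voltage when the sense amplifier samples it."""
    overdrive = max(v_eval - V_TH, 0.0)
    drop_per_mismatch = GAIN * overdrive ** 2   # assumed square-law discharge strength
    return max(V_DD - mismatches * drop_per_mismatch, 0.0)


def is_match(mismatches: int, v_eval: float, v_ref: float) -> bool:
    """The row is reported as a match if the sampled ML voltage is still at or above V_ref."""
    return ml_voltage_at_sampling(mismatches, v_eval) >= v_ref


def tolerated_errors(v_eval: float, v_ref: float, row_bits: int = 64) -> int:
    """Largest number of mismatching bits that the toy model still reports as a match."""
    tolerated = 0
    for mismatches in range(1, row_bits + 1):
        if not is_match(mismatches, v_eval, v_ref):
            break
        tolerated = mismatches
    return tolerated


print(tolerated_errors(v_eval=V_DD, v_ref=V_DD))  # exact-match configuration
print(tolerated_errors(v_eval=V_DD, v_ref=0.5))   # lowering V_ref only
print(tolerated_errors(v_eval=0.6, v_ref=0.8))    # lowering V_eval as well
```

In this sketch, lowering V_ref alone buys little tolerance because a single mismatching cell at full V_eval already removes most of the ML charge, whereas lowering V_eval shrinks the per-mismatch drop and lets many mismatches accumulate before the ML falls below V_ref.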

B. Monte Carlo Analysis: Local and Global Variations
Fig. 3 shows the OCCAM susceptibility to process variations, for different V_ref and error rates, when operating at V_eval of 0.55 V. Sensitivity and specificity results are obtained by 1000 Monte Carlo simulations for typical-typical (TT), slow-slow (SS), and fast-fast (FF) corners. The OCCAM sensitivity and specificity, and therefore the confident error tolerance, change across different corners due to local/global variations. However, these variations can be counterbalanced by adjusting V_eval and/or V_ref to achieve the desired error tolerance while maintaining 100% sensitivity and specificity, as experimentally shown in Section IV.

A. OCCAM Silicon and Test Chip
We designed and manufactured in a commercial 65-nm process a 128-Kbit OCCAM bank divided into four 512-row × 64-bit memory subarrays.
Area: The layout of the OCCAM is shown in Fig. 4(a), and presents a total area of 0.21 mm². The inset shows the layout of the OCCAM cell with an area of 3.24 µm². OCCAM comprises the memory array and the ML sensing circuitry, which includes a replica row [shown in Fig. 4(a)] whose purpose is generating the SA enable signal. The OCCAM sensing scheme accounts for only 2% of the total OCCAM area (0.21 mm²). Compared to conventional CAM designs and state-of-the-art multiple-cell-upset tolerant CAMs, such as [5], OCCAM presents an area overhead of less than 2%.
Power: The average OCCAM search power consumption in the exact and approximate match modes is about 0.90 mW and 0.66 mW, respectively. This 27% power reduction in approximate match mode is mainly because the M4 transistor limits the ML discharge current. These power figures refer to a single 512-row × 64-bit subarray operating at room temperature, an operating frequency of 150 MHz, and a supply voltage of 1.2 V.
Test Chip: Fig. 4(b) shows the test board with the fabricated test chip, nicknamed "LEO-II." The layout of the fabricated chip is provided on the right side of Fig. 4(b), with the four OCCAM subarrays highlighted among the various SoC components and other research projects integrated within the chip (the overall core area is 4 mm²).

B. Methodology and Measurement Results
Evaluation Setup: First, we create a random dataset, encoded using BCH(m,n,k) (where n is the original dataword length, m is the coded dataword length, and k is the tolerable number of errors), and store it in the OCCAM array. Specifically, we use BCH(63,39,4) and BCH(63,24,7), which provide min codeword HDs of 9 and 15, and enable tolerating 4 and 7 bit errors, respectively. Second, we build a number of query datasets by injecting a predefined number of errors into the coded dataset (i.e., a certain number of bit errors in random positions in every memory row). The number of random bit errors per row varies between 0 and 30 (the coded dataword length is 63), hence the number of query datasets in our evaluation is 31. Third, we configure the OCCAM error tolerance level k (e.g., 4 or 7) using the V_eval and V_ref voltages.
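The construction of the query datasets can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' test code: the helper names are hypothetical, the seed is arbitrary, and the stored rows are random 63-bit integers standing in for BCH codewords (the BCH encoding step itself is omitted).

```python
import random
from typing import Dict, List

WORD_BITS = 63  # coded dataword length


def inject_errors(codeword: int, n_errors: int, rng: random.Random) -> int:
    """Flip exactly n_errors distinct, randomly chosen bit positions of a coded dataword."""
    for position in rng.sample(range(WORD_BITS), n_errors):
        codeword ^= 1 << position
    return codeword


def build_query_datasets(stored_rows: List[int], max_errors: int = 30,
                         seed: int = 0) -> Dict[int, List[int]]:
    """Build one query dataset per error count 0..max_errors (31 datasets by default)."""
    rng = random.Random(seed)
    return {
        n_errors: [inject_errors(row, n_errors, rng) for row in stored_rows]
        for n_errors in range(max_errors + 1)
    }


rng = random.Random(1)
stored_rows = [rng.getrandbits(WORD_BITS) for _ in range(8)]  # stand-ins for BCH codewords
datasets = build_query_datasets(stored_rows)
print(len(datasets))
```

Sampling positions without replacement guarantees that each query differs from its stored row in exactly the intended number of bits, so the HD on the horizontal axis of the measurement plots is known by construction.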
Online Test: Each query dataset is searched in the OCCAM, and the number of matches in every search is recorded.
Evaluation Criteria: We use sensitivity and specificity to evaluate the OCCAM error obliviousness capabilities. Every intended match (i.e., when the HD between the dataword and the query does not exceed k) is a true positive result. If such a dataword accidentally mismatches, this is a false negative result. Every intended mismatch (i.e., when the HD between the dataword and the query is above k) is a true negative result. If such a dataword accidentally matches, this is a false positive result. Using these definitions, we are able to measure the sensitivity and specificity of the OCCAM's error tolerance.
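These definitions translate directly into the standard formulas sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below applies them to a list of search outcomes; the function name and the input format (pairs of HD and observed match flag) are assumptions chosen for illustration.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(results: Iterable[Tuple[int, bool]], k: int) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from (hamming_distance, matched) pairs for tolerance k."""
    tp = fn = tn = fp = 0
    for hd, matched in results:
        if hd <= k:              # intended match
            if matched:
                tp += 1
            else:
                fn += 1
        else:                    # intended mismatch
            if matched:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


ideal = [(0, True), (4, True), (5, False), (9, False)]
faulty = [(3, False), (4, True), (6, True), (8, False)]
print(sensitivity_specificity(ideal, k=4))
print(sensitivity_specificity(faulty, k=4))
```

A configuration is error oblivious for a given k only when both values equal 1.0, which is the "100% sensitivity and specificity" criterion used throughout the measurements.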
Silicon Measurement: Results are provided in Fig. 5. We show the sensitivity and specificity as functions of the number of errors (i.e., the HD between the queries and the datawords) for a predefined tolerance level k (set by adjusting V_eval and V_ref). The highest number of errors (7) in this example is tolerated when V_ref = 0.8 V and V_eval = 0.6 V, marked by a star in Fig. 5(a).
The relation between the min codeword HD d and the number of tolerable per-row errors k is demonstrated by the black curve (V_ref = 0.8 V) in Fig. 5(b). Here, k marks the highest number of errors (four in this example) at which the sensitivity is still 100% (i.e., zero false results), and d marks the lowest min codeword HD at which the specificity is still 100% (10 in this example), which is mandatory to tolerate k errors. The theoretical relation k = ⌊(d − 1)/2⌋ holds experimentally. The case of an exact-matching CAM (V_eval = V_ref = V_DD), which tolerates no errors, is marked in Fig. 5(c) by a yellow circle.
Fig. 6 presents the OCCAM PVT variability. The number of errors (k) tolerated with 100% sensitivity and the corresponding min codeword HD values (d) that guarantee 100% specificity are measured across six different chips, under V_DD variation, and over a wide range of temperatures. Dynamic adjustment of the OCCAM V_eval and V_ref makes it possible to effectively counterbalance the effects of PVT variations. We optimize V_eval and V_ref through the aforementioned procedure.
Fig. 7(a) shows a shmoo plot presenting pairs k/d of the max number of tolerable errors and the corresponding min codeword HD for different values of V_eval and V_ref. While the highest number of errors can be tolerated when both V_eval and V_ref are at their lowest levels (marked by a green square), no errors are tolerated when OCCAM is operated in a typical CAM exact search mode (V_eval = V_ref = V_DD, marked by a red square). The top-left corner (marked by a yellow square) shows the best error tolerance a CAM can achieve when only the sense amplifier reference voltage V_ref can be tuned (such a design requires no modification to the CAM bitcell). In this case, the error tolerance is limited to two to three errors.
Voltage scaling increases the error susceptibility of memories [7]. OCCAM retains its error tolerance capabilities under voltage scaling, as demonstrated by the shmoo plot in Fig. 7(b). While k = ⌊(d − 1)/2⌋ no longer holds in practice (a higher HD d is required to tolerate the same number of errors under voltage scaling), OCCAM is still able to tolerate errors (7 errors per row in this example).

V. CONCLUSION
We propose OCCAM, an error oblivious CAM design based on approximate search and configurable minimum HD coding. The latter is achieved by using error correction codes, such as BCH. Approximate search capability is attained by augmenting the CAM bitcell with an nMOS device that allows flexible control of the matchline discharge rate. Our design was fabricated as a part of a 65-nm test chip and evaluated through post-silicon testing and measurements. OCCAM achieves 100% sensitivity and specificity while tolerating an error rate of 11% (7 bit errors per 64-bit memory row). It retains its error obliviousness across process, voltage, and temperature variations.

Fig. 2. OCCAM match and mismatch timing controlled by V_eval and V_ref.

Fig. 4. (a) OCCAM subarray layout with zoom into an OCCAM bitcell; (b) LEO-II SoC board along with a top-level view of the SoC layout highlighting the OCCAM bank divided into four subarrays.

Fig. 7. Measurement results: a shmoo plot showing the max number of tolerable errors k and the corresponding min codeword HD d for different values of V_eval and V_ref; (a) V_DD = 1.2 V and (b) V_DD = 1 V.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/