Skip to Main Content
The dual effects of larger die sizes and technology scaling, combined with aggressive voltage scaling for power reduction, increase the error rates for on-chip memories. Traditional on-chip memory reliability techniques (e.g., ECC) incur significant power and performance overheads. In this paper, we propose a low-power-and-performance-overhead Embedded RAID (E-RAID) strategy and present Embedded RAIDs-on-Chip (E-RoC), a distributed dynamically managed reliable memory subsystem. E-RoC achieves reliability through redundancy by optimizing RAID-like policies tuned for on-chip distributed memories. We achieve on-chip reliability of memories through the use of distributed dynamic scratch pad allocatable memories (DSPAMs) and their allocation policies. We exploit aggressive voltage scaling to reduce power consumption overheads due to parallel DSP AM accesses, and rely on the E-RoC manager to automatically handle any resulting voltage-scaling-induced errors. Our experimental results on multimedia benchmarks show that E-RoC's fully distributed redundant reliable memory subsystem reduces power consumption by up to 85% and latency up to 61 % over traditional reliability approaches that use parity/cyclic hybrids for error checking and correction.