The tremendous increase in device density of present-day designs is accompanied by a corresponding increase in transistor failures (hard faults), posing a major challenge to current fault-tolerance techniques and tools. We propose a novel "design-time" fault-tolerance methodology at the architecture/circuit level to improve the reliability of applications in which computations can be classified into two categories: (i) those whose failure merely degrades output quality, and (ii) those whose failure causes total system failure. The proposed scheme enhances system reliability by making appropriate trade-offs between area, output quality (measured as signal-to-noise ratio or mean-squared error), and fault tolerance. This low-overhead, generic methodology is suitable not only for scaled CMOS technologies but also for future nanotechnologies (e.g., carbon nanotubes), where such defects are expected to be prevalent. We evaluated the technique on a widely used DSP system, the Finite Impulse Response (FIR) filter, where minor degradation in output quality can be tolerated. Results show that our technique reduces total system failure probability by 73.4% to 450% under iso-redundancy compared to conventional fault-tolerance techniques.
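The two-category classification can be illustrated with a toy FIR filter. In the sketch below (all names, tap values, and the fault model are illustrative assumptions, not the authors' design), a hard fault that zeroes a low-magnitude filter tap only raises the output's mean-squared error slightly, i.e., graceful quality degradation rather than total failure:

```python
def fir(x, h):
    """Direct-form FIR filter: y[n] = sum_k h[k] * x[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

def mse(a, b):
    """Mean-squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

# Illustrative tap set; the last tap carries the least weight.
h_good = [0.25, 0.5, 0.25, 0.05]
h_faulty = h_good[:]
h_faulty[3] = 0.0  # hypothetical hard fault: the smallest tap is stuck at zero

x = [1.0, 0.0, -1.0, 0.5, 0.25, -0.5, 1.0, 0.0]
y_ref = fir(x, h_good)
y_bad = fir(x, h_faulty)

# Small, nonzero MSE: quality degrades but the filter still produces output.
quality_loss = mse(y_ref, y_bad)
```

By contrast, a fault in logic every output sample depends on (e.g., the accumulator) would corrupt all samples, which is the kind of computation the methodology would classify as failure-critical and protect with redundancy.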