Designing systems that meet stringent quality, reliability, and availability requirements is becoming increasingly difficult in advanced technologies. The current design paradigm, which assumes that no gate or interconnect will ever operate incorrectly within the lifetime of a product, must change to cope with this situation. Future systems must be designed with built-in mechanisms for failure tolerance, prediction, detection, and recovery during normal system operation. This tutorial will focus on models and metrics for designing reliable systems and on algorithms and tools for modeling and evaluating such systems. It will discuss a broad spectrum of techniques for building such systems with support for concurrent error detection, failure prediction, error correction, recovery, and self-repair. The complex interplay among power, performance, and reliability requirements in future systems, and the associated constraints, will also be discussed.