Exploiting coarse-grain verification parallelism for power-efficient fault tolerance

4 Author(s)
M. W. Rashid ; Dept. of Electr. & Comput. Eng., Rochester Univ., NY, USA ; E. J. Tan ; M. C. Huang ; D. H. Albonesi

As device dimensions continue to be aggressively scaled, microprocessors are becoming increasingly vulnerable to the impact of undesired energy, such as that of a cosmic particle strike, which can cause transient errors. To prevent operational failure due to these errors, system-level techniques such as redundant execution will be increasingly required for fault detection and tolerance in future processors. However, the need for redundancy is directly opposed to the growing need for more power-efficient operation. Conventional techniques that use multi-core microarchitectures to provide whole-thread duplication generally incur significant energy overhead, which can exacerbate the already severe problem of power consumption and heat dissipation at a given throughput requirement. In the future, approaches that supply the necessary level of robustness at a given throughput level must also be power-aware. We propose a thread-level redundant execution microarchitecture that significantly reduces the energy overhead of replication without unduly impacting performance. Our approach exploits the fact that, with appropriate hardware support, the verification operation can be parallelized and run on a chip multiprocessor with support for frequency scaling together with supply voltage scaling and/or body biasing. To further improve the efficiency of verification, we exploit the information obtained by the leading thread to assist the trailing verification threads. We discuss in detail the required architectural support and show that our approach can be highly energy-efficient: using two checkers, fully replicated execution costs only an average of 28% extra energy over non-redundant execution, with virtually no performance loss.
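
The intuition for why parallel, slower checkers can save energy can be sketched with a first-order model. This is not the paper's evaluation model. It assumes that dynamic energy per operation scales with the square of the supply voltage, that attainable frequency scales roughly linearly with supply voltage, and that voltage cannot drop below a floor. The function name `checker_energy_overhead` and the parameter `v_min_ratio` are hypothetical, and the sketch ignores leakage, body biasing, and the hints passed from the leading thread.

```python
def checker_energy_overhead(num_checkers: int, v_min_ratio: float = 0.6) -> float:
    """Relative dynamic energy of verification under a first-order model.

    A result of 1.0 corresponds to whole-thread duplication on one core at
    nominal voltage and frequency. All parameters are illustrative
    assumptions, not values taken from the paper.
    """
    if num_checkers < 1:
        raise ValueError("num_checkers must be at least 1")
    if not 0.0 < v_min_ratio <= 1.0:
        raise ValueError("v_min_ratio must be in (0, 1]")

    # Each checker verifies 1/N of the work, so it can run at 1/N of the
    # leading thread's frequency and still keep up overall.
    freq_ratio = 1.0 / num_checkers

    # Assumption: frequency scales linearly with supply voltage, so voltage
    # can drop by the same ratio, but never below the floor.
    v_ratio = max(freq_ratio, v_min_ratio)

    # Dynamic energy per operation is proportional to C * V^2, and the total
    # amount of verification work is unchanged by splitting it up.
    return v_ratio ** 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, checker_energy_overhead(n))
```

Under these assumptions the benefit saturates once the voltage floor is reached, which is one reason a small number of checkers can already capture most of the savings in such a model.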

Published in:

14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)

Date of Conference:

17-21 Sept. 2005