Using a fixed temperature threshold for thermal throttling is pessimistic. Reduced aging during periods of low temperature can compensate for accelerated aging during periods of high temperature, so runtime tracking of the temperature-dependent aging rate allows throttling to be engaged only when necessary to maintain reliability. In this article, we show that, in terms of reliability, cool (low-temperature) phases can offset hot (high-temperature) phases. Existing dynamic thermal management (DTM) techniques ignore the effects of temperature fluctuations on chip lifetime and can therefore impose unnecessary performance penalties during hot phases. Using electromigration as the targeted failure mechanism, we apply a dynamic reliability model and propose a dynamic reliability management (DRM) technique that tracks the consumption of chip lifetime during operation.
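The idea of tracking lifetime consumption at runtime can be illustrated with a minimal sketch. It is not the article's model: it assumes that electromigration aging follows only the Arrhenius temperature term of Black's equation at constant current density, and the activation energy of 0.9 eV, the reference temperature, and the names `aging_rate` and `LifetimeBank` are hypothetical choices made for illustration. The sketch accumulates consumed lifetime relative to a reference temperature and requests throttling only when consumption runs ahead of elapsed time.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def aging_rate(temp_k, ref_temp_k, activation_ev=0.9):
    """Relative aging rate at temp_k versus ref_temp_k.

    Assumption: only the Arrhenius term of Black's equation is modeled
    (constant current density); activation_ev is an illustrative value.
    """
    exponent = (activation_ev / BOLTZMANN_EV) * (1.0 / ref_temp_k - 1.0 / temp_k)
    return math.exp(exponent)


class LifetimeBank:
    """Hypothetical tracker of consumed versus budgeted lifetime."""

    def __init__(self, ref_temp_k):
        self.ref_temp_k = ref_temp_k
        self.elapsed = 0.0   # wall-clock time observed so far
        self.consumed = 0.0  # equivalent time aged at the reference temperature

    def record(self, temp_k, dt):
        """Account for an interval of length dt spent at temperature temp_k."""
        self.elapsed += dt
        self.consumed += aging_rate(temp_k, self.ref_temp_k) * dt

    def should_throttle(self):
        """Throttle only when aging has outpaced the reference budget."""
        return self.consumed > self.elapsed


bank = LifetimeBank(ref_temp_k=358.15)  # 85 degrees C reference (assumed)
bank.record(temp_k=338.15, dt=10.0)     # cool phase builds up slack
bank.record(temp_k=368.15, dt=5.0)      # hot phase spends part of it
print(bank.should_throttle())
```

In this sketch a fixed-threshold policy would throttle throughout the hot phase, whereas the tracker throttles only once the accumulated consumption exceeds the budget implied by the reference temperature.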
Published in: Micro, IEEE (Volume: 25, Issue: 6)
Date of Publication: Nov.-Dec. 2005