By Topic

Thermal and Energy Management of High-Performance Multicores: Distributed and Self-Calibrating Model-Predictive Controller

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Bartolini, A. ; Dept. of Electron., Comput. Sci. & Syst., Univ. of Bologna, Bologna, Italy ; Cacciari, M. ; Tilli, A. ; Benini, L.

As result of technology scaling, single-chip multicore power density increases and its spatial and temporal workload variation leads to temperature hot-spots, which may cause nonuniform ageing and accelerated chip failure. These critical issues can be tackled by closed-loop thermal and reliability management policies. Model predictive controllers (MPC) outperform classic feedback controllers since they are capable of minimizing performance loss while enforcing safe working temperature. Unfortunately, MPC controllers rely on a priori knowledge of thermal models and their complexity exponentially grows with the number of controlled cores. In this paper, we present a scalable, fully distributed, energy-aware thermal management solution for single-chip multicore platforms. The model-predictive controller complexity is drastically reduced by splitting it in a set of simpler interacting controllers, each one allocated to a core in the system. Locally, each node selects the optimal frequency to meet temperature constraints while minimizing the performance penalty and system energy. Comparable performance with state-of-the-art MPC controllers is achieved by letting controllers exchange a limited amount of information at runtime on a neighborhood basis. In addition, we address model uncertainty by supporting learning of the thermal model with a novel distributed self-calibration approach that matches well the controller architecture.

Published in:

Parallel and Distributed Systems, IEEE Transactions on  (Volume:24 ,  Issue: 1 )