Analyzing Potential Throughput Improvement of Power- and Thermal-Constrained Multicore Processors by Exploiting DVFS and PCPG

Authors: Jungseob Lee (Dept. of Electr. & Comput. Eng., Univ. of Wisconsin, Madison, WI, USA); Nam Sung Kim

Process variability from a range of sources is growing as technology is scaled below 65 nm, increasing variations of transistor delay and leakage current both within a die and across dies. This, in turn, negatively impacts maximum operating frequency and total power consumption of processors. Meanwhile, manufacturers have integrated more cores in a single die to improve the throughput of processors running highly-parallel workloads. However, many existing workloads do not have high enough parallelism to exploit multiple cores in a processor. First, in this paper, we maximize the throughput of power- and thermal-constrained multicore processors using per-core power gating and dynamic voltage/frequency scaling. When we do not have enough parallelism to effectively use all cores, we turn off some cores using per-core power gates that are already available in commercial multicore processors. This provides extra power and thermal headroom, and allows active cores to run faster through voltage/frequency scaling within power, thermal, and voltage scaling limits. Our analysis using a 32 nm predictive technology model demonstrates that jointly optimizing the number of active cores and maximum operating frequency can improve the throughput of a 16-core processor running workloads with limited parallelism by up to 14%. Second, we extend our throughput analysis and optimization to consider the impact of within-die spatial process variations that lead to considerable core-to-core frequency and leakage power variations in multicore processors. Our analysis shows that exploiting core-to-core frequency variations can improve the throughput of a 16-core processor by up to 57%.
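The joint choice of active core count and operating frequency described in the abstract can be illustrated with a toy model. The sketch below is not the authors' analysis: it assumes Amdahl's-law throughput, a supply voltage that scales linearly with frequency (so dynamic power grows as f^3 and leakage as f), zero power for gated cores, and a chip-wide power budget equal to all 16 cores running at nominal frequency. The function names, the 70/30 dynamic/leakage split, and the 1.3x frequency cap are all hypothetical, and thermal limits and process variation are ignored.

```python
def core_power(f, p_dyn=0.7, p_leak=0.3):
    """Relative power of one active core at relative frequency f.

    Assumption: voltage scales linearly with f, so dynamic power ~ f**3
    and leakage ~ f. The 0.7/0.3 split is illustrative only.
    """
    return p_dyn * f**3 + p_leak * f


def max_frequency(n_active, budget, f_max=1.3, tol=1e-6):
    """Highest frequency at which n_active cores fit in the power budget.

    f_max stands in for the voltage scaling limit (hypothetical value).
    core_power is monotonic in f, so bisection is sufficient.
    """
    if n_active * core_power(f_max) <= budget:
        return f_max
    lo, hi = 0.0, f_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if n_active * core_power(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def throughput(n_active, f, parallel_fraction):
    """Amdahl's-law throughput relative to one core at nominal frequency."""
    return f / ((1 - parallel_fraction) + parallel_fraction / n_active)


def best_configuration(total_cores=16, parallel_fraction=0.9):
    """Search every active-core count; gated cores are assumed to draw no power."""
    budget = total_cores * core_power(1.0)  # all cores at nominal frequency
    best = None
    for n in range(1, total_cores + 1):
        f = max_frequency(n, budget)
        t = throughput(n, f, parallel_fraction)
        if best is None or t > best[2]:
            best = (n, f, t)
    return best


if __name__ == "__main__":
    for p in (0.5, 0.9, 1.0):
        n, f, t = best_configuration(parallel_fraction=p)
        print(f"parallel fraction {p}: {n} active cores at {f:.2f}x, throughput {t:.2f}")
```

Under these assumptions, a fully parallel workload keeps all 16 cores active at nominal frequency, while a workload with limited parallelism favors gating some cores and spending the freed power budget on a higher frequency for the rest. The magnitudes this toy model produces should not be compared with the figures reported in the abstract.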

Published in:

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume: 20, Issue: 2)