State-of-the-art general-purpose graphics processing units (GPGPUs) offer very high computational throughput for general-purpose, highly parallel applications by using hundreds of on-chip cores. However, as technology scales below 65 nm, each core's maximum frequency varies significantly because of increasing within-die variations. This, in turn, diminishes the throughput improvement that GPGPUs gain from technology scaling, because the maximum frequency of the processor is often limited by its slowest core. In this paper, we investigate two techniques that can mitigate the impact of frequency variations on GPGPU throughput: 1) running each core at its own maximum frequency independently and 2) disabling the slowest cores. Both can maximize GPGPU frequency at either the individual-core or the entire-processor level. Our experimental results, obtained with a GPGPU simulator and a 32 nm technology, show that the first and second techniques can improve the throughput of compute- and problem-size-bounded applications by up to 32% and 19%, respectively.
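As an illustration only, the two techniques can be compared with a toy model. The sketch below is not the paper's simulator: it assumes a compute-bound workload whose throughput scales linearly with the clock frequency of each active core, and the function names and per-core frequencies (in MHz) are hypothetical.

```python
def baseline_throughput(freqs):
    """All cores share one clock, limited by the slowest core."""
    return len(freqs) * min(freqs)


def per_core_throughput(freqs):
    """Technique 1: every core runs at its own maximum frequency."""
    return sum(freqs)


def best_disable_throughput(freqs):
    """Technique 2: disable the slowest cores, run the rest at a shared clock.

    Returns (throughput, number_of_disabled_cores) for the best choice.
    Ties are resolved in favor of disabling fewer cores.
    """
    ordered = sorted(freqs)  # slowest core first
    best_throughput, best_disabled = 0, 0
    for disabled in range(len(ordered)):
        remaining = ordered[disabled:]
        # The shared clock is set by the slowest core still enabled.
        throughput = len(remaining) * remaining[0]
        if throughput > best_throughput:
            best_throughput, best_disabled = throughput, disabled
    return best_throughput, best_disabled


# Hypothetical maximum frequencies (MHz) for a four-core chip with one slow core.
core_freqs_mhz = [600, 1200, 1300, 1400]

print(baseline_throughput(core_freqs_mhz))      # 2400
print(per_core_throughput(core_freqs_mhz))      # 4500
print(best_disable_throughput(core_freqs_mhz))  # (3600, 1)
```

In this toy model, disabling cores pays off only when the slowest core is far slower than the rest; with a narrow frequency spread, keeping every core enabled gives the higher aggregate throughput. The model ignores memory bandwidth, power limits, and problem-size effects, all of which matter for real workloads.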
Date of Conference: 10-12 April 2011