GAARP: a power-aware GALS architecture for real-time algorithm-specific tasks

4 Author(s)
Swarup Bhunia ; Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA ; Animesh Datta ; Nilanjan Banerjee ; Kaushik Roy

Reducing the energy consumption of a real-time system has emerged as an important design concern. In this paper, we propose GAARP, an adaptive scalable architecture targeted toward algorithm-specific tasks for just-in-time performance using the right amount of power. The architecture consists of Globally Asynchronous and Locally Synchronous (GALS) building blocks, where the processing hardware is realized by a set of smaller slices of similar structure, each running synchronously with independent clocks. We demonstrate that, for different real-time commercial applications with algorithm-specific jobs like online transaction processing, digital filtering, Fourier transform, etc., the proposed architecture allows dynamic load-balancing and adaptive intertask voltage scaling based on the load in each of the processing units. Compared to a synchronous implementation of the same functionality, we show that the proposed hardware can achieve higher efficiency in terms of power and performance by exploiting the flexibility to balance the load and change the supply voltage. The architecture also lends itself to process tolerance since it can detect process-shifts for the individual processing units and determine the appropriate operating voltage/frequency for each unit. Simulation results for two representative applications show that, for a modest system configuration and random job distribution, we obtain up to 67 percent improvement in MOPS/W (millions of operations per second per watt) over a fully synchronous implementation.
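
The load-balancing and per-unit voltage scaling described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the greedy balancer, the operating-point table, the one-operation-per-cycle timing, and the C·V² energy model (leakage ignored) are all assumptions chosen for illustration, and every name below is hypothetical.

```python
# Illustrative sketch only; values and policy are assumptions, not from the paper.

# Hypothetical (supply voltage in V, clock in MHz) pairs, sorted by voltage.
OPERATING_POINTS = [(0.8, 100.0), (1.0, 200.0), (1.2, 300.0)]
CAPACITANCE = 1.0  # arbitrary switched-capacitance unit


def balance(jobs, n_slices):
    """Assign jobs (in millions of operations) to slices, largest job first,
    always to the currently least-loaded slice. Returns per-slice loads."""
    loads = [0.0] * n_slices
    for ops in sorted(jobs, reverse=True):
        target = loads.index(min(loads))
        loads[target] += ops
    return loads


def pick_point(load_mops, deadline_s):
    """Return the lowest-voltage operating point that finishes the load by
    the deadline, assuming one operation per clock cycle."""
    for volts, mhz in OPERATING_POINTS:
        if load_mops / mhz <= deadline_s:
            return volts, mhz
    raise ValueError("load cannot meet the deadline at any operating point")


def ops_per_energy(loads, deadline_s):
    """Total operations divided by total dynamic energy (proportional to
    MOPS/W), with each slice running at its own just-in-time voltage."""
    energy = 0.0
    for load in loads:
        volts, _ = pick_point(load, deadline_s)
        energy += CAPACITANCE * volts * volts * load
    return sum(loads) / energy


if __name__ == "__main__":
    loads = balance([30, 10, 20, 40], n_slices=2)
    print(loads, ops_per_energy(loads, deadline_s=0.5))
```

In this toy model, evenly balanced slices can each settle on a lower voltage than a single heavily loaded unit would need, which is the intuition behind the efficiency gain the abstract reports. The actual figures depend on the paper's hardware and workloads, not on this sketch.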

Published in:

IEEE Transactions on Computers (Volume: 54, Issue: 6)