
SCRATCH: An End-to-End Application-Aware Soft-GPGPU Architecture and Trimming Tool


Abstract:

Applying advanced signal processing and artificial intelligence algorithms is often constrained by power and energy consumption limitations in high-performance and embedded, cyber-physical, and supercomputing devices and systems. Although Graphics Processing Units (GPUs) helped to mitigate the throughput-per-Watt performance problem in many compute-intensive applications, dealing more efficiently with the autonomy requirements of intelligent systems demands power-oriented customized architectures that are specially tuned for each application, preferably without manual redesign of the entire hardware and capable of supporting legacy code. Hence, this work proposes a new SCRATCH framework that aims at automatically identifying the specific requirements of each application kernel, regarding instruction set and computing unit demands, allowing for the generation of application-specific and FPGA-implementable trimmed-down GPU-inspired architectures. The work is based on an improved version of the original MIAOW system (here named MIAOW2.0), which is herein extended to support a set of 156 instructions and enhanced to provide a fast prefetch memory system and a dual-clock domain. Experimental results with 17 highly relevant benchmarks, using integer and floating-point arithmetic, demonstrate that we have been able to achieve an average of 140× speedup and 115× higher energy-efficiency levels (instructions-per-Joule) when compared to the original MIAOW system, and a 2.4× speedup and 2.1× energy-efficiency gains compared against our optimized version without pruning.
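The trimming idea described above can be sketched as a simple set operation: given the disassembled instruction stream of an application kernel, compute which instructions of the full ISA are actually used, so that decoder entries and execution units for the remainder can be pruned from the generated design. The sketch below is illustrative only; the mnemonics and the ISA table are hypothetical placeholders, not the actual MIAOW2.0 instruction set or the SCRATCH tool's implementation.

```python
# Illustrative sketch of SCRATCH-style ISA trimming (hypothetical ISA).
# A kernel's assembly listing is scanned for the mnemonics it uses; the
# complement of that set within the full ISA marks the prunable hardware.

# Placeholder 156-entry ISA plus a few readable example mnemonics.
FULL_ISA = {f"op{i}" for i in range(156)} | {"v_add_f32", "v_mul_f32", "s_branch"}

def trim_isa(kernel_asm, full_isa):
    """Return (required, prunable) instruction subsets for one kernel."""
    used = {line.split()[0] for line in kernel_asm if line.strip()}
    required = used & full_isa
    prunable = full_isa - required
    return required, prunable

kernel = [
    "v_add_f32 v0, v1, v2",
    "v_mul_f32 v3, v0, v0",
    "s_branch label0",
]
required, prunable = trim_isa(kernel, FULL_ISA)
print(len(required), len(prunable))  # 3 instructions kept, 156 prunable
```

In the real flow this analysis would drive the generation of a trimmed-down, FPGA-implementable architecture rather than a mere instruction count.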
Date of Conference: 14-17 October 2017
Date Added to IEEE Xplore: 11 April 2019
Conference Location: Boston, MA, USA

1 Introduction

The evolution of General-Purpose Processing on GPU (GPGPU) has been aided by the emergence of parallel programming frameworks, especially the Compute Unified Device Architecture (CUDA) [8, 27] and the Open Computing Language (OpenCL) [6, 13]. These tools allow programmers to easily use the hardware resources of massively parallel processors for their applications, processing large amounts of data in far less time than on previous Central Processing Unit (CPU)-based architectures. However, although GPUs provide high throughput performance, off-the-shelf devices demand high power levels to operate (200W to 300W per device) and have fixed designs that cannot be adapted towards the specific needs of the target applications.
