Graphics processing units (GPUs) are an increasingly popular platform for applications that require high computational throughput. They are limited, however, by memory bandwidth and power and, as such, cannot always achieve their full potential. This paper presents the PUMA architecture, a domain-specific accelerator designed for medical imaging applications but with sufficient generality to remain programmable. The goal is to closely match the performance achieved by GPUs in this domain at a fraction of the power consumption. The results are promising: on a floating-point- and memory-intensive MRI reconstruction algorithm, PUMA achieves up to 2X the performance of a modern GPU architecture and up to 54X better efficiency.