Skip to Main Content
To move High-Performance Computing (HPC) closer to forward operating environments and missions, the Army Research Laboratory is developing approaches using hybrid, asymmetric core computing. By blending capabilities found in Graphics Processing Units (GPUs) and traditional von Neumann multicore Central Processing Units (CPUs), approaches are being developed and optimized to provide at or near real-time processing speeds for research project applications. Algorithms are designed to partition work to resources best designed to handle the processing load. The use of commodity resources allows the design to be flexible throughout the life cycle without the costly and time-consuming delays associated with Application-Specific Integrated Circuit (ASIC) development. This paradigm allows for rapid technology transfer to end users. In this paper, we describe a synchronous impulse reconstruction radar imaging algorithm that has been designed for hybrid CPU-GPU processing. We discuss various optimizations such as asynchronous task partitioning between the CPU and GPU as well as data movement reduction. We also discuss analysis and design of the algorithms within the context of two programming models: NVIDIA's CUDA and AMD's ATI Brook+. Finally, we report on the speedup achieved by this approach that allowed us to take a code once restricted to postprocessing and transform it into one that exceeds real-time performance requirements.