The wide application of General-Purpose Graphics Processing Units (GPGPUs) has led to substantial manual effort in porting and optimizing algorithms for them. However, most existing approaches to automatic GPGPU code generation neither apply optimization strategies tailored to a specific computation nor reuse constantly evolving manual optimizations. In this paper, we present a computation-pattern-driven approach to computation-specific GPGPU code generation and optimization, which in turn reuses manual optimizations to a certain extent. We propose language extensions to OpenMP in the form of high-level data structure attributes, which assist computation pattern matching and give users intuitive performance tuning parameters expressed in terms of those attributes. We illustrate the feasibility of this approach with three important computation dwarfs in scientific computing: dense matrix, sparse matrix, and structured mesh computations. We also build a prototype OpenMP-to-CUDA translator that consists of computation pattern recognition and code template instantiation. The experimental results demonstrate the performance benefits of the computation-pattern-driven method. To the best of our knowledge, this is the first work on reusing manual optimizations for GPGPUs through a computation-pattern-driven approach.
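As a rough illustration of the kind of input such a translator might consume, the following minimal sketch shows a sparse matrix-vector product written as an ordinary OpenMP loop in C. The function name `spmv_csr`, the choice of the CSR storage format, and the attribute annotation shown in the comment are assumptions made for this example only; they are not the syntax or the templates defined in the paper.

```c
#include <assert.h>

/* Sparse matrix-vector product y = A * x with A stored in CSR form.
 * row_ptr has n_rows + 1 entries; col_idx and val hold the nonzeros. */
void spmv_csr(int n_rows, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    assert(n_rows >= 0);

    /* HYPOTHETICAL annotation (illustration only, not the paper's syntax):
     * a data structure attribute along the lines of
     *     attribute(A: sparse_matrix, format(csr))
     * could tell a translator that this loop nest matches a sparse
     * matrix pattern, so a hand-tuned kernel template can be selected. */
    #pragma omp parallel for
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        /* Accumulate the nonzeros of row i. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```

The loop itself carries no information about how the matrix is stored beyond its index arithmetic, which is why a declared attribute, rather than loop analysis alone, is a plausible way to drive pattern matching and to expose tuning parameters to the user.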