Abstract:
GPU devices are currently one of the trending topics in parallel computing. Commonly, GPU applications are developed with programming tools based on compiled languages, such as C/C++ and Fortran. This paper presents a performance and programming effort analysis employing the Python high-level language to implement the NAS Parallel Benchmark kernels targeting GPUs. We used the Numba environment to enable CUDA support in Python, a tool that allows GPU applications to be implemented with pure Python code. Our experimental results showed that, for most NPB kernels, the Python applications reached performance similar to that of C++ programs employing CUDA and better than that of C++ using OpenACC. Furthermore, the Python codes required fewer operations related to the GPU framework than CUDA, mainly because Python requires fewer statements to manage memory allocations and data transfers. However, our Python versions demanded more operations than the OpenACC implementations.
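For illustration, a minimal sketch (not taken from the paper) of the kind of pure-Python CUDA kernel that Numba enables; the vec_add kernel, its sizes, and the launch configuration are hypothetical, and the implicit host-device transfer of NumPy arrays shown here is one reason fewer memory-management statements are needed than in CUDA C++.

    from numba import cuda
    import numpy as np

    # Hypothetical element-wise vector addition written entirely in Python
    @cuda.jit
    def vec_add(a, b, c):
        i = cuda.grid(1)      # absolute index of this thread in the grid
        if i < c.size:        # guard threads that fall outside the array
            c[i] = a[i] + b[i]

    n = 1_000_000
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    c = np.zeros_like(a)

    threads_per_block = 256
    blocks = (n + threads_per_block - 1) // threads_per_block
    # Passing NumPy arrays directly lets Numba handle the device copies,
    # so no explicit cudaMalloc/cudaMemcpy-style statements are written.
    vec_add[blocks, threads_per_block](a, b, c)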
Published in: 2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)
Date of Conference: 09-11 March 2022
Date Added to IEEE Xplore: 18 April 2022