GPU computing follows the GPGPU trend, driven by innovations in both hardware and programming languages that have been made available to non-graphics programmers. Since some problems take a long time to solve or involve data volumes that do not fit on a single GPU, the logical next step is to use multiple GPUs. To use a multi-GPU environment in a general way, our paper presents an approach in which each card is driven by either a heavyweight MPI process or a lightweight OpenMP thread. We compare the two models in terms of performance, implementation complexity and particularities, as well as the overhead introduced by the mixed code. We show that the best performance is obtained with OpenMP. We also note that using pinned (page-locked) host memory further improves the execution time. The next objective is to create a three-level multi-GPU environment with internode communication (processes, distributed memory), intranode GPU management (threads, shared memory), and computation inside the GPU cards.
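As a rough illustration of the thread-per-card model, the sketch below splits an array across a given number of devices and lets one OpenMP loop iteration handle each device's share. It is a host-only sketch under stated assumptions, not the paper's implementation: the names `Chunk`, `chunk_for_device`, `scale_chunk` and `scale_on_all_devices` are hypothetical, and the per-GPU work is replaced by a plain CPU loop so that the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range [begin, end) of the elements assigned to one device.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Split n elements across num_devices as evenly as possible; the first
// (n % num_devices) devices each receive one extra element.
Chunk chunk_for_device(std::size_t n, int num_devices, int device) {
    assert(num_devices > 0 && device >= 0 && device < num_devices);
    const std::size_t g = static_cast<std::size_t>(num_devices);
    const std::size_t d = static_cast<std::size_t>(device);
    const std::size_t base = n / g;
    const std::size_t extra = n % g;
    const std::size_t begin = d * base + (d < extra ? d : extra);
    const std::size_t len = base + (d < extra ? 1 : 0);
    return {begin, begin + len};
}

// Hypothetical stand-in for the per-GPU work: a host loop, not a real kernel.
void scale_chunk(std::vector<float>& data, Chunk c, float factor) {
    for (std::size_t i = c.begin; i < c.end; ++i) {
        data[i] *= factor;
    }
}

// One loop iteration per device. When compiled with OpenMP and the runtime
// grants num_devices threads, each iteration runs on its own thread;
// without OpenMP the pragma is ignored and the loop runs serially.
void scale_on_all_devices(std::vector<float>& data, int num_devices, float factor) {
    #pragma omp parallel for num_threads(num_devices) schedule(static, 1)
    for (int d = 0; d < num_devices; ++d) {
        // Assumption: a real multi-GPU version would select the card here
        // (e.g. cudaSetDevice(d)), copy the chunk to the device, launch a
        // kernel, and copy the result back. The chunks are disjoint, so
        // the threads never write to the same element.
        scale_chunk(data, chunk_for_device(data.size(), num_devices, d), factor);
    }
}
```

Calling `scale_on_all_devices(v, 3, 2.0f)` on a ten-element vector assigns the ranges [0, 4), [4, 7) and [7, 10) to the three devices. In the MPI variant, each rank would compute its own chunk from its rank number instead of a loop index.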