Parallelization of Padé Approximation of Matrix Exponential with CUDA-Aware MPI | IEEE Conference Publication | IEEE Xplore