Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster | IEEE Conference Publication | IEEE Xplore