D2MA: Accelerating coarse-grained data transfer for GPUs | IEEE Conference Publication | IEEE Xplore