ELF: maximizing memory-level parallelism for GPUs with coordinated warp and fetch scheduling | IEEE Conference Publication | IEEE Xplore