Reducing GPU offload latency via fine-grained CPU-GPU synchronization | IEEE Conference Publication | IEEE Xplore