Near-Linear Scaling Data Parallel Training with Overlapping-Aware Gradient Compression | IEEE Conference Publication | IEEE Xplore