Automatic Thread-Block Size Adjustment for Memory-Bound BLAS Kernels on GPUs | IEEE Conference Publication | IEEE Xplore