Investigating Register Cache Behavior: Implications for CUDA and Tensor Core Workloads on GPUs | IEEE Journals & Magazine | IEEE Xplore