Improved Programming of GPU Architectures through Automated Data Allocation and Loop Restructuring | VDE Conference Publication | IEEE Xplore