A framework for efficient and scalable execution of domain-specific templates on GPUs | IEEE Conference Publication