Design and implementation of GPU-based matrix chain multiplication using C++AMP | IEEE Conference Publication | IEEE Xplore