Performance Tuning of Matrix Multiplication in OpenCL on Different GPUs and CPUs | IEEE Conference Publication | IEEE Xplore