CUDA Memory Techniques for Matrix Multiplication on Quadro 4000 | IEEE Conference Publication | IEEE Xplore