Padding free bank conflict resolution for CUDA-based matrix transpose algorithm | IEEE Conference Publication | IEEE Xplore