GPUs offer a good price-performance ratio and are well suited to highly parallel applications despite their low cost. Support for integer and logical instructions in the latest generation of GPUs makes it easier to implement cipher algorithms with those instructions. However, decisions such as parallel processing granularity and memory allocation place a heavy burden on programmers. For this reason, this paper presents the results of several experiments that study the relation between the memory allocation style of AES parameters and the granularity of the parallelism exploited in the AES encoding process, using CUDA on an NVIDIA GeForce GTX 285. The experiments showed that a granularity of 16 bytes per thread gave the highest performance, achieving a throughput of approximately 35 Gbps. Moreover, an implementation that overlaps processing with data transfer achieved a throughput of 22.5 Gbps including data transfer time. The results also show that choosing the granularity and memory allocation carefully is important for effective AES encryption on a GPU.
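To make the notion of granularity concrete, the following minimal Python sketch shows how the number of threads and the thread-to-data mapping change with the number of bytes handled per thread. It is an illustration only, not the paper's implementation: the function names `threads_needed` and `thread_to_block` are hypothetical, and the sketch assumes that the bytes-per-thread value divides the 16-byte AES block size evenly.

```python
AES_BLOCK_BYTES = 16  # AES operates on 128-bit (16-byte) blocks


def threads_needed(data_bytes, bytes_per_thread):
    """Return how many threads are needed when each handles bytes_per_thread.

    Hypothetical helper: assumes bytes_per_thread divides 16 evenly.
    """
    if data_bytes % AES_BLOCK_BYTES != 0:
        raise ValueError("data must be a whole number of 16-byte AES blocks")
    if AES_BLOCK_BYTES % bytes_per_thread != 0:
        raise ValueError("bytes_per_thread must divide 16 evenly")
    return data_bytes // bytes_per_thread


def thread_to_block(thread_id, bytes_per_thread):
    """Map a thread index to (AES block index, byte offset within that block)."""
    threads_per_aes_block = AES_BLOCK_BYTES // bytes_per_thread
    aes_block = thread_id // threads_per_aes_block
    offset = (thread_id % threads_per_aes_block) * bytes_per_thread
    return aes_block, offset


if __name__ == "__main__":
    # With 16 bytes per thread, each thread owns one whole AES block.
    print(threads_needed(64, 16), thread_to_block(3, 16))
    # With 4 bytes per thread, four threads share one AES block.
    print(threads_needed(64, 4), thread_to_block(5, 4))
```

At 16 bytes per thread, each thread works on its own AES block independently, whereas finer granularities split one block across several threads that would then have to cooperate.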
Date of Conference: 17-19 Nov. 2010