This paper presents a high-performance algorithm for modular multiplication, implemented in assembler, on a graphics processing unit (GPU). The proposed algorithm performs finite field multiplication over the NIST prime fields of size 192, 224, 256, and 384 bits. Included are a detailed explanation of our algorithm, an instruction count analysis, and a comparison to recently published work; compared to the next fastest design, the proposed algorithm is 27 to 71 times faster.
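To make the operation concrete, the following is a minimal sketch of multiplication in the 192-bit NIST prime field, assuming the standard word-wise reduction for p = 2^192 - 2^64 - 1. It is an illustration only, not the paper's GPU assembler implementation; the names `P192`, `reduce_p192`, and `mulmod_p192` are hypothetical, and Python's arbitrary-precision integers stand in for the multi-word arithmetic a GPU kernel would carry out explicitly.

```python
# Illustrative sketch only: NOT the paper's GPU assembler code.
# Assumption: the standard NIST P-192 prime and its word-wise fast reduction.
P192 = 2**192 - 2**64 - 1
MASK64 = (1 << 64) - 1


def reduce_p192(c):
    """Reduce a product c (0 <= c < 2**384) modulo P192."""
    # Split c into six 64-bit words, least significant first.
    c0, c1, c2, c3, c4, c5 = [(c >> (64 * i)) & MASK64 for i in range(6)]

    # Because 2**192 == 2**64 + 1 (mod P192), the upper words fold
    # back onto the lower 192 bits as four partial sums.
    s1 = (c2 << 128) | (c1 << 64) | c0   # low 192 bits, unchanged
    s2 = (c3 << 64) | c3                 # c3 * 2**192
    s3 = (c4 << 128) | (c4 << 64)        # c4 * 2**256
    s4 = (c5 << 128) | (c5 << 64) | c5   # c5 * 2**320

    r = s1 + s2 + s3 + s4
    # The sum is below 4 * 2**192, so a few subtractions suffice.
    while r >= P192:
        r -= P192
    return r


def mulmod_p192(a, b):
    """Multiply two field elements (0 <= a, b < P192) modulo P192."""
    return reduce_p192(a * b)
```

For example, `mulmod_p192(P192 - 1, P192 - 2)` returns 2, since the operands are congruent to -1 and -2 modulo the prime. The special form of the NIST primes is what lets the reduction use only additions and subtractions rather than a general division.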
Date of Conference: 20-23 May 2012