Abstract:
SPHINCS+ was selected as one of the NIST Post-Quantum Cryptography Digital Signature Algorithms (PQC-DSA). However, SPHINCS+ is slower than the other PQC-DSAs. When integrating it into protocols (e.g., TLS and IPSec), optimization from the server perspective becomes crucial. Therefore, we present highly parallel and optimized implementations of SPHINCS+ on several NVIDIA GPU architectures (Pascal, Turing, and Ampere). We identified parts of the internal processes of SPHINCS+ that can be parallelized and optimized them (e.g., leaf node generation and node merging in MSS, subtree construction in FORS, signature generation in WOTS+, and hypertree layer construction), leveraging the characteristics of the GPU architecture (e.g., warp-based execution and efficient memory access). As far as we know, these are the first SPHINCS+ implementations on GPUs. Our implementations achieve 44,391 (resp. 24,997 and 11,401) signature generations, 725,118 (resp. 354,309 and 100,168) key generations, and 285,680 (resp. 155,800 and 106,280) verifications per second at security level 1 (resp. 3 and 5) on the RTX3090. Furthermore, on the GTX1070, our SPHINCS+ implementation improves throughput by 2.10× for signature generation, 1.03× for key generation, and 9.86× for verification at security level 1 over the study by Sun et al. (IEEE TPDS 2020), which used the GTX1080, a GPU with 640 more cores than the GTX1070.
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers (Volume: 71, Issue: 6, June 2024)