Skip to Main Content
Modular multiplication is a crucial operation in public key cryptosystems like RSA and elliptic curve cryptography (ECC). This paper presents a new word-based Montgomery modular multiplication algorithm which can be used to achieve a low-latency scalable architecture for efficient hardware implementations. We show how to relax the data dependency in conventional word-based algorithms so that a latency of exactly one cycle can be obtained regardless of the chosen word size w (w > 1). With the presented operand reduction scheme, the proposed scalable architecture can operate at high speeds and suitable data paths can be chosen for specific applications. Complexity analysis shows that the proposed architecture has the lowest latency and area complexity compared to related scalable architectures. Experimental results demonstrate that our design has area, speed, and flexibility advantages over related schemes.