A Fully Digital True Random Number Generator With Entropy Source Based in Frequency Collapse

All cryptographic systems include a True Random Number Generator (TRNG). During validation, these systems need to be prototyped on a Field Programmable Gate Array (FPGA). However, a TRNG uses an entropy source based on non-deterministic effects that are challenging to replicate in an FPGA. This work presents the problems and solutions involved in implementing an entropy source based on frequency collapse in multimodal Ring Oscillators (RO). The entropy source implemented in FPGA passes all SP800-90B tests from the National Institute of Standards and Technology (NIST) with good entropy compared to related works. The TRNG passes all NIST SP800-22 tests with and without the post-processing stage. In addition, the TRNG with the post-processing stage passes all tests of the Application Notes and Interpretation of the Scheme (AIS31). The TRNG implementation on a Xilinx Artix-7 XC7A100TCSG324 FPGA occupies less than 1% of the resources. This work achieves a sampling latency of 0.62 µs up to 9.92 µs and a bit rate of 1.1 Mbps up to 9.1 Mbps.


I. INTRODUCTION
Random Number Generators (RNG) form a crucial part of cryptographic systems. RNG circuits are used for in-core key generation, and data is then ciphered with these random keys. A TRNG adds an entropy source to the circuit, reducing its predictability under known-key attacks [1]. The NIST provides a set of constraints and tests for TRNG implementations within a crypto-core [2], as well as tests to evaluate the quality of entropy sources. The TRNG needs to be implemented inside the same system that contains the cryptography engine and must use secure channels to the crypto-core. However, the quality of the TRNG circuit depends mainly on the entropy source, which is often external to the digital system.
The associate editor coordinating the review of this manuscript and approving it for publication was Sedat Akleylek.
Entropy sources exploit the random characteristics of noise by sampling a physical phenomenon and then applying a processing stage to deliver random numbers. Different physical phenomena are used to capture the entropy of the noise. Peyroula et al. [3] show an entropy source based on Resistive Random-Access Memory (RRAM). The entropy is extracted in the bit set operations. However, V_set and R_t must be characterized to extract the best entropy from this source. Li et al. [4] present a TRNG based on metastability implemented in FPGA. The entropy source is based on metastability extraction from cross-coupled NAND gates. This design requires the construction of symmetrical NAND gates by using forced Look-Up Table (LUT) configurations. Although it can be constructed with special FPGA configurations, FPGA manufacturers usually adopt technologies that decrease the probability of metastable events. Sarkisla and Ergun [5] show two methods to optimize area in TRNGs based on metastability in Flip-Flops (FF). Using several ROs as clock sources, the jitter generates random delays for the different clocks, and the metastability is then sampled using an FF. However, this design requires an external clock and manual placement of the ROs. Mathew et al. [6] describe a micro random number generator that combines multiple entropy sources, fabricated in a 14 nm FinFET CMOS technology, using three independent self-calibrating all-digital entropy sources with cross-coupled inverter pairs. The sources are also coupled with an XOR feedback shift register. Lu et al. [7] present a TRNG based on an extremely small Vernier interval. The jitter is quantified by a Time-to-Digital Converter, and the TRNG is robust against Process, Voltage, and Temperature variations. Several works implement multimodal ROs as an entropy source [8]-[10]. The implementation of a TRNG depends heavily on the technology used.
Some TRNGs extract entropy from an external source, requiring analog design or components external to the SoC [3], [11], [12]. All-digital sources can be implemented in both ASIC and FPGA with some modifications to the entropy circuits, depending on the available resources. With internal digital sources, however, the TRNG may present security problems due to entropy manipulation attacks [13], [14]. In particular, multimodal ROs are resistant to power side-channel attacks [8].
In this work, we implement an entropy source based on frequency collapse in a multimodal RO. We describe the FPGA implementation, which uses regular LUT blocks to construct the multimodal and conventional ROs. A Phase and Frequency Detector (PFD) circuit compares the output of a multimodal RO with that of a regular RO. The TRNG counts the clock cycles of the multimodal RO during the comparison. Although the entropy source and the PFD can be implemented inside the same circuit, manual placement of the ROs is necessary. Moreover, there is a trade-off between the throughput of random numbers and the quality of randomness, which is determined by the length of the ROs. We check the randomness of the FPGA implementation using the NIST SP800-22 and AIS31 tests, with the latter reporting a Shannon entropy of 7.996012. The quality of the entropy source is tested using the NIST SP800-90B entropy test, which yields a normalized minimum entropy of 0.892911. In addition, a post-processing stage based on a linear corrector is implemented. The FPGA implementation occupies 58 LUTs and 21 FFs for the TRNG, and the post-processing stage uses 4 LUTs, which together represent less than 1% of the available resources. The TRNG presents a latency of 0.62 µs up to 9.92 µs and a bit rate of 1.1 Mbps up to 9.1 Mbps.
The remainder of this paper is organized as follows. Section II presents the analytical model of the entropy source. Section III shows the architecture of the TRNG. Section IV shows the latency and bit rate of the implementation and presents the test results of NIST SP800-90B, NIST SP800-22, and AIS31. Finally, Section V concludes the paper.

II. ENTROPY SOURCE
A. ANALYTICAL MODEL
The multimodal RO can be constructed in two ways for the entropy source. In the first, the number of edges is even. In this case, the odd and even pulses travel along different paths through the ring. The rising and falling delays in CMOS inverters are different and change with process variations. The frequency collapse occurs because of the difference between rising and falling delays and the jitter present in the two paths [15]. In the second, the number of edges is odd. Unlike the even case, the pulses change path in each oscillation, so the difference between rising and falling delays is inherently reduced. The time for any pulse to arrive at the output (T_pulse) in the RO with odd edges is given by (1). In this equation, k and n represent the number of edges and cycles, respectively. Also, δ(n,i) and δ(n,j) denote the jitter generated by thermal noise in the even and odd stages. Finally, t_edge represents the typical delay of the logic gates in all the stages. The three-edge model (2) shows that the variance increases linearly with the number of cycles when δ(n,j) = δ(n,i) ∼ N(0, σ²) [8].
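The linear growth of the variance with the number of cycles can be checked numerically. The Python sketch below is only an illustration under the assumption that each cycle contributes an independent N(0, σ²) delay error; it does not reproduce equations (1) and (2), and the function name and parameter values are hypothetical.

```python
import random
import statistics

def accumulated_jitter_variance(n_cycles, sigma, trials=4000, seed=1):
    """Empirical variance of the summed per-cycle jitter after n_cycles.

    Assumption: every cycle adds an independent N(0, sigma^2) delay error.
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_cycles))
        for _ in range(trials)
    ]
    return statistics.pvariance(totals)

if __name__ == "__main__":
    sigma = 1.0  # arbitrary jitter unit, chosen for illustration only
    for n in (10, 20, 40):
        v = accumulated_jitter_variance(n, sigma)
        print(f"n={n:3d}  empirical variance={v:6.2f}  n*sigma^2={n * sigma**2:6.2f}")
```

Under this assumption the empirical variance should track n·σ², which is the behavior the three-edge model attributes to the accumulated jitter.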
However, in an FPGA implementation, the inverter gate is reproduced with a LUT, which introduces undesirable delays in the RO. In addition, the distance between LUTs and the interference of external signals play a greater role in the frequency collapse. The FPGA implementation therefore introduces additional delays that increase the systematic mismatch compared to an ASIC implementation. The entropy source architecture uses three NAND gates to increase the oscillation frequency compared to a conventional RO. Each NAND gate is followed by a delay stage, which determines the frequency of the RO. The enable signal generates a pulse in each edge. The approximate entropy increases with the number of delay stages. However, if the number of delay stages is increased, the frequency collapse takes more time to happen [8]. Fig. 1.b) illustrates the frequency collapse caused by the accumulated jitter in the multimodal RO. Initially, the three pulses are generated separately with a spacing of 120 degrees. Before the collapse event, the accumulated jitter in the RO breaks the phase balance of the three pulses. At the moment of the collapse event, the collision of two pulses restores the balance, and the oscillation frequency of the RO decreases.
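The collapse mechanism described above can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the authors' analytical model: pulse positions are expressed as fractions of one ring period, each pulse receives independent Gaussian jitter per cycle, and `mismatch` is a hypothetical constant drift applied to one pulse to mimic a systematic delay difference.

```python
import random

def cycles_until_collapse(sigma, mismatch=0.0, rng=None, max_cycles=1_000_000):
    """Count cycles until two of three circulating pulses meet.

    Toy model: the pulses start 120 degrees apart (1/3 of a period) and
    each gains independent N(0, sigma^2) jitter per cycle. `mismatch` is
    a hypothetical per-cycle drift added to the first pulse only.
    """
    rng = rng or random.Random()
    pos = [0.0, 1.0 / 3.0, 2.0 / 3.0]
    for cycle in range(1, max_cycles + 1):
        pos = [p + rng.gauss(0.0, sigma) for p in pos]
        pos[0] += mismatch
        # Circular spacing between neighbouring pulses.
        gaps = (pos[1] - pos[0], pos[2] - pos[1], 1.0 + pos[0] - pos[2])
        if min(gaps) <= 0.0:
            return cycle
    return max_cycles

if __name__ == "__main__":
    rng = random.Random(42)
    noisy = [cycles_until_collapse(0.02, rng=rng) for _ in range(5)]
    drifting = [cycles_until_collapse(1e-6, mismatch=0.05, rng=rng) for _ in range(5)]
    print("jitter dominated :", noisy)
    print("drift dominated  :", drifting)
```

In this toy model, a jitter-dominated ring gives collapse times that vary from run to run, while a drift-dominated ring collapses after an almost constant number of cycles.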
Unlike metastability-based entropy sources [4], [5], the physical effect used to extract entropy in the multimodal RO does not depend on an external clock. The odd-edge analytical model stated in (2) shows that the frequency collapse depends only on the systematic mismatch and the random jitter. If the systematic mismatch is small, the noise predominates and yields a good entropy source. However, if the systematic mismatch is significant, the time to frequency collapse tends to be constant, reducing the quality of the entropy source. A multimodal RO with high systematic mismatch is not used for random number generation, but rather for Physical Unclonable Functions (PUF) [16]. To reduce the systematic mismatch, we configure the placement, routing, and LUT constraints according to the entropy source model. We do so following the analytical model of the three-edge multimodal RO proposed by Yang et al. [8], extended with the systematic mismatch introduced by the FPGA implementation. We apply a hardware mismatch fix that distributes the RO elements evenly in the FPGA. Fig. 2 depicts the techniques applied to reduce the systematic mismatch in FPGA. Fig. 2.a) shows the entropy sources with different numbers N of delay stages implemented in FPGA, along with a RISC-V processor for testing and debugging. The RO edges need to be arranged in the FPGA in a specific way to guarantee the entropy of the circuit. The RO edges occupy several FPGA cells for both positioning and blockages, which are represented in blue. Fig. 2.b) illustrates the cell placement used in the three-edge multimodal RO. In an FPGA, each slice has several LUTs available to implement a logic cell. Nevertheless, using the same LUT position is mandatory to reduce systematic mismatch variations. In this case, we use the B6LUT for the NAND (N) and delay (D) cells. In addition, the routing delay contributes to the systematic mismatch present in the FPGA implementation.
One way to reduce the routing delays is to manually place the LUTs of the RO as shown in the figure. When the routes are then fixed manually, the distances between consecutive LUTs need to be as similar as possible. However, the analytical model on which the entropy source is based does not consider interference from other signals. Thus, the adjacent slices are restricted. Fig. 2.c) shows the content of a LUT6 in a Xilinx FPGA [17]. The Xilinx FPGA provides different types of slices with different delay times, so the implementation of the entropy source only uses SLICEL slices. The multimodal RO is implemented with 6-input Look-Up Tables (LUT6) in a Xilinx Artix-7 FPGA. In the FPGA, each LUT is a RAM-based function generator configured with an initial value. Several initial values can reproduce the same gate, for example an inverter. However, the delay of the logic gate depends on the initial value. Consequently, the same initial value must be used for all the logic cells that implement the entropy source.
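To make the role of the LUT initial value concrete, the following sketch computes 64-bit initial values for a LUT6 from a Boolean function. It assumes the usual convention that the six inputs form the address I5..I0 into the truth table, with I0 as the least significant bit; the helper name `lut6_init` is hypothetical, and the text does not state which initial values the authors actually used.

```python
def lut6_init(func):
    """Return the 64-bit initial value whose bit `addr` equals func(i0..i5).

    Assumes the six inputs form the address I5..I0 into the truth table,
    with I0 as the least significant address bit.
    """
    init = 0
    for addr in range(64):
        bits = [(addr >> k) & 1 for k in range(6)]  # bits[0] is I0
        if func(*bits):
            init |= 1 << addr
    return init

# Inverter on I0 that ignores the other five inputs.
inv_full = lut6_init(lambda i0, i1, i2, i3, i4, i5: 1 - i0)
# Inverter on I0 gated by I1; it acts as the same inverter when I1 is tied high.
inv_gated = lut6_init(lambda i0, i1, i2, i3, i4, i5: (1 - i0) & i1)
# Two-input NAND on I0 and I1.
nand2 = lut6_init(lambda i0, i1, i2, i3, i4, i5: 1 - (i0 & i1))

for name, value in (("inv_full", inv_full), ("inv_gated", inv_gated), ("nand2", nand2)):
    print(f"{name:9s} 64'h{value:016X}")
```

The first two values differ even though both can act as an inverter on I0, which illustrates why a single initial value has to be chosen and reused across the cells of the ring.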
The quality of the entropy source is evaluated using the SP800-90B test suite provided by NIST [18]. Determining whether the data generated by the entropy source is Independent and Identically Distributed (IID) is fundamental to prevent the test suite from overestimating the entropy [1]. The physical phenomenon used to generate the frequency collapse depends on both the noise and the systematic mismatch. Therefore, according to the analytical model, the data generated by the entropy source is non-IID.

III. TRUE RANDOM NUMBER GENERATOR (TRNG)
A. TRNG CORE
Fig. 3 presents the TRNG architecture. The architecture needs a reference oscillation to capture the frequency collapse. This reference is generated by a conventional RO (RO REF). Fig. 3.a) illustrates the TRNG architecture. The PFD takes the signals generated by RO RNG and RO REF and indicates when a frequency collapse occurs. The random number is generated by a 12-bit counter in the capture stage, highlighted in blue. However, the four least significant bits (LSB) are truncated to mitigate the error introduced by the method used to digitize the entropy. This counter uses the clock generated by RO RNG. The enable of the counter is asserted when the TRNG is initialized. The enable is deasserted only when the frequency collapse has occurred and the fourth bit of the counter is high, which prevents false triggers of the PFD in the initial stages before the frequency collapse. After the frequency collapse, the Valid signal is asserted, indicating that a random number is available at the Out port. Fig. 3.b) depicts the implementation of the PFD used in the compare stage. The fully digital PFD is highlighted in green and consists of two FFs, one NAND gate, and two AND gates. The data inputs of the FFs are connected to VDD, and their clock inputs are the clocks generated by RO REF and RO RNG, respectively. When one clock goes high, the corresponding FF output goes high. The function of the gates is to prevent a false event when the two clocks are in phase. In the event of frequency collapse, the oscillation signal is deformed by the accumulated jitter in the RO. The deformed signals generate glitches, causing false events in the PFD. A glitch removal circuit, highlighted in red, prevents the false events caused by the deformation of the compared signals. A 2-bit shift register, highlighted in blue, avoids other false events caused by variations in the response time of the PFD.
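The capture behavior described above can be summarized in a small behavioral model. This is a sketch of one reading of the text, not the authors' RTL: it assumes that "the fourth bit" is counter bit 3, that the counter wraps at 12 bits, and that the PFD output has already passed the glitch filter. The function name and the event representation are hypothetical.

```python
def capture_sample(pfd_events, counter_bits=12, dropped_lsbs=4):
    """Behavioural sketch of the capture stage, one step per RO RNG clock.

    pfd_events: iterable of booleans, True when the filtered PFD flags a
    collapse in that clock cycle.
    Assumptions: "the fourth bit" is counter bit 3, and the counter wraps
    at 2**counter_bits.
    Returns (valid, out), where out holds the bits kept after truncation.
    """
    mask = (1 << counter_bits) - 1
    count = 0
    for event in pfd_events:
        count = (count + 1) & mask
        # Stop only when a collapse is flagged and bit 3 of the counter is high.
        if event and (count >> 3) & 1:
            return True, count >> dropped_lsbs
    return False, None

if __name__ == "__main__":
    # An early false trigger at count 5 is ignored; the event at count 200 is captured.
    events = [False] * 4 + [True] + [False] * 194 + [True]
    print(capture_sample(events))
```

With the default parameters, the eight bits that remain after dropping the four LSBs form the random number presented at the output.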
The fully digital PFD used in the TRNG architecture presents a dead zone in the frequency comparison range. The dead zone is a small phase difference between the inputs that the PFD does not detect, caused by the delay of the logic components and the feedback path of the FFs. Fig. 4 illustrates the phase error of the ideal PFD on the left and of the implemented digital PFD on the right. The dead zone causes errors when the oscillations of RO RNG and RO REF are in phase. The truncation of the four LSB in the 12-bit counter

B. POST-PROCESSING
The methodology for implementing the entropy source described in Section II reduces the systematic mismatch to improve the quality of the generated numbers. However, the post-processing stage is an integral part of modern TRNGs. Fig. 6 shows the linear corrector used in the implemented post-processing stage, which takes as input the 8 bits generated by the TRNG. With the methodology presented, the systematic mismatch has more influence on the LSB. The post-processing stage masks the LSB with the most significant bits (MSB) through an XOR operation to reduce the influence of the systematic mismatch on the LSB, generating a final output of 4 bits.
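A minimal sketch of such a linear corrector follows. It assumes that bit i+4 of the input is paired with bit i; the exact pairing of Fig. 6 is not given in the text, and the function name is hypothetical.

```python
def linear_corrector(sample):
    """Fold an 8-bit TRNG sample into 4 bits by XOR-ing its two nibbles.

    Assumption: MSB bit i+4 is paired with LSB bit i.
    """
    if not 0 <= sample <= 0xFF:
        raise ValueError("sample must be an 8-bit value")
    return ((sample >> 4) ^ sample) & 0xF

if __name__ == "__main__":
    for s in (0x12, 0xA5, 0xFF):
        print(f"{s:#04x} -> {linear_corrector(s):#03x}")
```

Because every 8-bit input yields 4 output bits, a corrector of this form halves the output bit rate in exchange for reducing the bias of the less reliable bits.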

IV. RESULTS
This section presents the results of several tests that evaluate the quality of the entropy source based on frequency collapse in a multimodal RO. First, the quality of the entropy source is evaluated using the estimators described in NIST SP800-90B [19]. Second, the latency and bit rate of the TRNG are presented. Next, the TRNG is tested with and without the post-processing stage using NIST SP800-22 [2]. Lastly, the AIS31 [20] tests are applied to demonstrate the quality of the output of this work. All data were obtained from the implementation on a Xilinx Artix-7 FPGA.

A. ENTROPY SOURCE
In Section II, the output data is classified as non-IID according to the analytical model presented. Table 1 shows the results of applying the non-IID track without a conditioning component. The Collision, Markov, and Compression estimates are only applied to binary inputs. The minimum of all estimates in the non-IID track determines the initial minimum entropy (H_I).
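As an illustration of how a single estimator in the non-IID track produces a min-entropy value, the sketch below implements the most common value estimate in the form given in NIST SP 800-90B. It is chosen only because it is short; it is not one of the three estimators named above and not the code used to produce Table 1.

```python
import math
from collections import Counter

def most_common_value_estimate(samples):
    """Min-entropy per sample from the most common value estimate.

    Uses an upper 99% confidence bound on the probability of the most
    frequent symbol, following the form in NIST SP 800-90B.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    p_hat = max(Counter(samples).values()) / n
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1.0 - p_hat) / (n - 1)))
    return -math.log2(p_upper)

if __name__ == "__main__":
    flat = list(range(256)) * 100  # perfectly flat histogram of 8-bit symbols
    print(f"{most_common_value_estimate(flat):.3f} bits per 8-bit sample")
```

The overall estimate of the track is then the minimum over all estimators, so a single pessimistic estimator determines H_I.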
In this case, the minimum entropy is given by the Compression estimate, H_I = 0.892911 × 8 = 7.143286. However, this estimate is calculated from a single long output sequence, and the entropy source could generate correlated sequences after restarts. Table 2 shows the results of the restart test. For this test, the entropy source is restarted 1000 times, and for each restart 1000 samples are collected directly from the entropy source. Two datasets are constructed by concatenating the samples by rows and by columns. The entropy source passes the sanity check with α = 0.01. The minimum entropy in rows (H_R) and in columns (H_C) is 7.353758. The result of NIST SP800-90B with non-IID data is determined as H_min = min(H_R, H_C, H_I). The implemented entropy source presents H_min = 7.143286. Finally, Table 3 compares the minimum entropy of sources based on different physical phenomena. The entropy source presents a normalized minimum entropy of 0.892911 without conditioning components. The data is collected under nominal conditions, and the implemented entropy source has eight delay stages per edge.
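The construction of the two restart datasets and the final combination of estimates can be sketched as follows. The function names are hypothetical, and the numeric inputs are the values reported in the text rather than recomputed results.

```python
def restart_datasets(matrix):
    """Build the row and column datasets from a restart matrix.

    matrix[i][j] is sample j collected after restart i.
    """
    rows = [x for row in matrix for x in row]
    cols = [matrix[i][j] for j in range(len(matrix[0])) for i in range(len(matrix))]
    return rows, cols

def final_min_entropy(h_rows, h_cols, h_initial):
    """H_min = min(H_R, H_C, H_I), as stated in the text."""
    return min(h_rows, h_cols, h_initial)

# Values reported in the text, in bits per 8-bit sample.
h_min = final_min_entropy(7.353758, 7.353758, 7.143286)
print(h_min, round(h_min / 8, 6))  # per 8-bit sample, then normalized per bit
```

Each dataset is passed to the estimators separately, which is how H_R and H_C are obtained before the minimum is taken.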

B. STATISTICAL TESTS
This section analyzes the frequency collapse time to obtain the latency and the bit rate of the implementation. In addition, it presents the resources occupied by the TRNG implementation. Finally, different statistical tests are applied to determine the quality of the implemented TRNG. Fig. 7 illustrates the distribution of the time required for each frequency collapse. The number of samples is 5 Mb. Because of the nature of the physical phenomenon behind the entropy source, the time required for each frequency collapse event is random. The minimum and maximum times for the event are 0.62 µs and 9.92 µs, respectively. Fig. 8 shows the bit rate of the implemented TRNG, which depends on the time of all the collapses in the generated sequence. Over 2500 samples, the bit rate has a mean of 5.102 Mbps and a standard deviation of 1.333 Mbps. The range of the bit rate using 3σ is 1.102 up to 9.109 Mbps. Fig. 9 depicts the histogram of the normalized data generated by the TRNG. The data present a mean µ = 0.498 and a standard deviation σ = 0.288. The ideal uniform distribution has µ = 0.5 and σ = 0.288. Table 4 shows the result of applying the NIST statistical test suite to a stream of 5 MB. The significance level applied in the tests is α = 0.01, indicating that one would expect one sequence in 100 to be rejected. A P-value ≥ 0.01 means that the sequence is considered to be random with a confidence of 99%. The TRNG passes all the tests with and without the post-processing stage. The test data are generated by the Xilinx Artix-7 FPGA under nominal conditions. The AIS31 evaluation is divided into two parts. The first part, denoted P1 (T0-T5), tests the output of the post-processing stage of the TRNG. The second part, P2 (T6-T8), tests the output of the noise source. The implemented TRNG passes all AIS31 tests, using a dataset of 5 MB.
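To show how a P-value is compared against the significance level, the sketch below implements the frequency (monobit) test, the first test of the NIST SP800-22 suite. It is an illustration only; the results in Table 4 come from the full suite, and the helper names are hypothetical.

```python
import math

def monobit_p_value(bits):
    """P-value of the frequency (monobit) test from NIST SP 800-22."""
    n = len(bits)
    if n == 0:
        raise ValueError("empty sequence")
    s = sum(1 if b else -1 for b in bits)  # map 1 -> +1 and 0 -> -1
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2.0))

def passes(p_value, alpha=0.01):
    """A sequence is accepted as random when its P-value is at least alpha."""
    return p_value >= alpha

if __name__ == "__main__":
    example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    p = monobit_p_value(example)
    print(f"P-value = {p:.6f}, pass = {passes(p)}")
```

A heavily biased sequence drives the P-value toward zero and is rejected, while a balanced one yields a P-value close to one.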
In addition, test T8 indicates the average information content of the random numbers, that is, the Shannon entropy. The Shannon entropy of the implementation is 7.996012. Table 6 compares the performance of several TRNGs reported in FPGA. The TRNG occupies 62 LUTs and 21 FFs with the post-processing stage, using fewer resources than other TRNGs based on other physical phenomena. The bit rate of the implemented TRNG is lower because of the physical phenomenon used by the entropy source. However, the implemented architecture does not depend on external clocks, which makes it robust against clock attacks.
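For intuition about the scale of this figure, the following sketch computes a plain histogram estimate of the Shannon entropy per 8-bit symbol, whose maximum is 8 bits. This is not the T8 procedure itself, which is defined in AIS31; the function name is hypothetical.

```python
import math
from collections import Counter

def shannon_entropy_per_byte(data):
    """Histogram estimate of Shannon entropy in bits per 8-bit symbol."""
    n = len(data)
    if n == 0:
        raise ValueError("empty input")
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

if __name__ == "__main__":
    uniform = bytes(range(256)) * 4   # every byte value equally frequent
    constant = bytes([7]) * 1024      # a single repeated value
    print(shannon_entropy_per_byte(uniform), shannon_entropy_per_byte(constant))
```

A value close to 8, such as the one reported for T8, indicates that the 8-bit outputs are nearly uniformly distributed.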

V. CONCLUSION
This work introduces a fully digital implementation of a TRNG in FPGA. The TRNG is based on the frequency collapse phenomenon using a multimodal RO and a regular RO. The implementation follows an analytical model of the frequency collapse in the multimodal RO and proposes strategies to extract as much entropy as possible. The TRNG was implemented in a Xilinx Artix-7 FPGA. The output of the entropy source is assumed to be non-IID when applying NIST SP800-90B. The entropy source passed all non-IID tests and the restart test. The normalized minimum entropy is 0.892911. The TRNG in FPGA passes all NIST SP800-22 tests with and without the post-processing stage using a dataset of 5 MB. In addition, the entropy source and the TRNG with the post-processing stage pass all AIS31 tests. The T8 sub-test of AIS31 shows a Shannon entropy of 7.996012. The FPGA implementation of the TRNG based on frequency collapse occupies 62 LUTs and 21 FFs, less than 1% of the total Xilinx Artix-7 FPGA resources. The TRNG presents a latency of 0.62 µs up to 9.92 µs and a bit rate of 1.1 Mbps up to 9.1 Mbps.
AKIRA TSUKAMOTO received the M.S. degree in computer science from Columbia University, New York. He is currently with the National Institute of Advanced Industrial Science and Technology (AIST). He has worked on products based on Cell/B.E. and ARM. His research interests include software engineering on a network, operating systems, and system security, and he is enthusiastic regarding any kind of technical development.
VOLUME 9, 2021