A Novel Clock Gating Approach for the Design of Low-Power Linear Feedback Shift Registers

This paper presents an efficient solution to reduce the power consumption of the popular linear feedback shift register by exploiting the gated clock approach. The power reduction with respect to other gated clock schemes is obtained by an efficient implementation of the logic gates and properly reducing the number of XOR gates in the feedback network. Transistor level simulations are performed by using standard cells in a 28-nm FD-SOI CMOS technology and a 300-MHz clock. Simulation results show a power reduction with respect to traditional implementations, which reaches values higher than 30%.

FIGURE 2. Traditional gated clock circuit for FFs without enable signal.

[...] length of the pseudo-random sequence and the other statistical properties of the bit generator.

By defining $P_{FF}$ and $P_{XOR}$ as the power consumption of the FFs and the XOR gates, respectively, the power consumption of the conventional LFSR in Fig. 1 can be modeled as

$$P_{Conv} = n P_{FF} + n_t \alpha P_{XOR} \tag{2}$$

where $n$ is the register length (i.e., the order of the generator), $n_t$ is the number of inner taps (i.e., the number of terms of the characteristic polynomial except $x^n$ and 1), and $\alpha$ is the switching activity at the inner nodes, which, in an LFSR with $n \ge 6$ and assuming maximum period, is approximately equal to 0.5 [22].
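As a quick numerical illustration of model (2), the following minimal sketch evaluates $P_{Conv}$ for given cell powers. The function name and the per-cell power values are hypothetical placeholders chosen only for illustration; they are not simulation results from this paper.

```python
def conventional_lfsr_power(n, n_t, p_ff, p_xor, alpha=0.5):
    """Evaluate Eq. (2): P_Conv = n * P_FF + n_t * alpha * P_XOR.

    n      register length (order of the generator)
    n_t    number of inner taps
    p_ff   power of one flip-flop (any consistent unit)
    p_xor  power of one XOR gate at full switching activity (same unit)
    alpha  switching activity at the inner nodes (about 0.5 per the text)
    """
    return n * p_ff + n_t * alpha * p_xor


# Hypothetical cell powers in arbitrary units, NOT measured values.
example = conventional_lfsr_power(n=16, n_t=3, p_ff=1.0, p_xor=0.5)
print(f"P_Conv = {example:.2f} (arbitrary units)")
```

With these placeholder values the flip-flop term dominates the total, which is the observation that motivates gating the clock.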

From (2), it appears that for the topologies in Fig. 1 the clock path toggles at every clock cycle, thus dissipating a significant amount of power, especially at high clock rates. Conversely, the power consumption of the FF D-path and of the XOR gates depends on the switching activity, and hence its value is reduced by 50% with respect to the maximum value.

Dynamic Power Management (DPM) is a commonly adopted strategy to reduce power consumption in a digital system. It consists in disabling the logic circuits that are not performing functional operations during a particular time frame. At circuit level, this strategy is known as the "gated clock approach" [24], [25] and, for flip-flops with no enable signal, it consists in activating them only when the input signal differs from the current output value, according to the scheme depicted in Fig. 2.

A modified LFSR that takes advantage of the gated clock strategy is shown in Fig. 3. The topology reduces the flip-flop power consumption, $P_{FF}$, at the price of additional power consumption due to the extra gates required to implement the gated clock approach.
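The gating rule described above (clock a flip-flop only when its input differs from its current output) can be explored with a small behavioral model. The sketch below is an assumption-laden illustration, not the circuit of Fig. 3: it models a shift-register LFSR in software, uses a hypothetical function name and tap convention (the feedback bit is the XOR of the listed stage outputs, stages numbered from 1), and simply counts how often each flip-flop would receive a clock pulse.

```python
def simulate_gated_lfsr(taps, n, cycles, seed=1):
    """Return, for each flip-flop, the fraction of cycles in which D != Q.

    taps   stage numbers (1..n) whose outputs are XORed into the feedback bit
    n      register length
    cycles number of clock cycles to simulate
    seed   nonzero initial state, bit i of seed loads flip-flop i
    """
    state = [(seed >> i) & 1 for i in range(n)]  # state[i] is Q of FF i
    enables = [0] * n
    for _ in range(cycles):
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        d = [feedback] + state[:-1]  # D inputs: feedback, then shifted Qs
        for i in range(n):
            if d[i] != state[i]:     # gated clock: pulse only on a change
                enables[i] += 1
        state = d
    return [count / cycles for count in enables]


# Degree-5 maximal-length example, simulated over one full period (31 cycles).
activity = simulate_gated_lfsr(taps=(5, 2), n=5, cycles=31)
print([round(a, 3) for a in activity])
```

For this maximal-length example each flip-flop is enabled in 16 of the 31 cycles of a period, which is consistent with a switching activity close to 0.5.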

Therefore, for the gated clock LFSR in Fig. 3, the power consumption in (2) turns into

$$P_{GC} \approx n \alpha P_{FF} + (n + n_t)\, \alpha P_{XOR} + n \alpha P_{NAND} \tag{3}$$

where the term $n \cdot \alpha \cdot P_{FF}$ represents the dissipation of the [...]

[...] outputs of the CPL gates. Moreover, in case of non-adjacent taps, we can exploit the property [...]

For example, the polynomial $x^5 + x^2 + 1$, which needs only one XOR in the traditional topology, can again be implemented with only one XOR, whose inputs are the binomials $(x^2 + x)$ and $(x + 1)$ available at the outputs of the CPL-XOR/XNOR.
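The $x^5 + x^2 + 1$ example can be checked exhaustively. The sketch below assumes that the feedback bit is the XOR of the signals labelled $x^2$ and 1, and that the two binomials are already produced by the clock gating logic; the function names are illustrative only.

```python
from itertools import product


def feedback_direct(x2, x1, x0):
    """Traditional feedback: one XOR of the signals labelled x^2 and 1."""
    return x2 ^ x0


def feedback_from_binomials(x2, x1, x0):
    """Feedback rebuilt from two binomials assumed to be already available."""
    b_hi = x2 ^ x1      # binomial (x^2 + x), assumed free from the gating logic
    b_lo = x1 ^ x0      # binomial (x + 1), assumed free from the gating logic
    return b_hi ^ b_lo  # the single XOR left in the feedback network


# Compare the two forms over all 8 input combinations.
mismatches = [bits for bits in product((0, 1), repeat=3)
              if feedback_direct(*bits) != feedback_from_binomials(*bits)]
print(mismatches)  # prints []
```

The intermediate term $x$ appears in both binomials and cancels, so the two forms agree for every input.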

To derive the number of XOR gates required in the feedback network by using the proposed strategy, let us consider the ordered $m$-element array, $a_i$, of the tap exponents (for example, for the polynomial $x^{10} + x^4 + x^3 + x + 1$ the array elements are $a_1 = 1$, $a_2 = 3$, $a_3 = 4$ and $a_4 = 10$). Then, the number of XOR gates required in the feedback network is given by [...]

Note that in (6) $a_1$ is the lowest exponent of the characteristic polynomial, and the terms in the sum are pairs of neighboring tap exponents, excluding the highest one.

By inspection of relationship (6), it is apparent that the minimum number of XOR gates is required when the characteristic polynomial contains the term $x$ and all the pairs of taps are also adjacent.

Table 1 summarizes the number of XOR gates necessary to implement the feedback circuit of some characteristic polynomials, both in the traditional topology, $n_t$ (i.e., the number of inner taps), and with the proposed strategy, as evaluated through relationship (6).

Looking at Table 1, it is apparent that the proposed strategy does not always need a lower number of XOR gates. Thus, to further reduce the number of XOR gates, we can use together the outputs of the CPL-XOR/XNOR sections (i.e., the terms $x^{i+1} \oplus x^i$) and the terms $x^i$ at the outputs of the FFs.
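The idea of mixing binomial terms with plain flip-flop outputs can be illustrated as follows. This is only a sketch under stated assumptions: the tap set, the greedy pairing rule and all function names are hypothetical, and the code does not reproduce relationship (6) or any gate count derived in this paper. It only checks that grouping adjacent taps into precomputed binomials leaves the feedback value unchanged.

```python
from itertools import product


def group_terms(taps):
    """Greedily pair adjacent tap indices; each tap is used in at most one pair."""
    ordered = sorted(taps, reverse=True)
    pairs, singles = [], []
    i = 0
    while i < len(ordered):
        if i + 1 < len(ordered) and ordered[i] - ordered[i + 1] == 1:
            pairs.append((ordered[i], ordered[i + 1]))
            i += 2
        else:
            singles.append(ordered[i])
            i += 1
    return pairs, singles


def feedback_direct(x, taps):
    """XOR of all tapped signals, as in the traditional feedback network."""
    fb = 0
    for t in taps:
        fb ^= x[t]
    return fb


def feedback_grouped(x, pairs, singles):
    """XOR of binomial terms (assumed precomputed) and plain FF outputs."""
    fb = 0
    for hi, lo in pairs:
        fb ^= x[hi] ^ x[lo]  # stands in for one binomial output
    for s in singles:
        fb ^= x[s]           # plain flip-flop output
    return fb


taps = [6, 5, 2]             # hypothetical tap indices, for illustration only
pairs, singles = group_terms(taps)
equivalent = all(feedback_grouped(x, pairs, singles) == feedback_direct(x, taps)
                 for x in product((0, 1), repeat=7))
print(pairs, singles, equivalent)
```

In this hypothetical case the feedback network combines two signals (one binomial and one flip-flop output) instead of three.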

In this way, a further reduction of the number of XOR gates in the feedback path is achieved, since it becomes equal to [...]

where $m_c$ is the number of pairs of adjacent taps, with each tap counted in only one pair. For example, in the [...]

We have compared the power consumption among the LFSRs [...] targeted for BCH and CRC encoders but, because their architecture is very different, a comparison between the LFSR presented in this paper and these parallel approaches would not be fair; therefore, we do not include parallel approaches in the comparison.

Specifically, using a commercial 28-nm FD-SOI CMOS technology process in the Cadence simulation environment, we have run several transistor-level simulations on the topologies having the characteristic polynomials in Table 1. For the digital blocks, we used the master-slave positive-edge-triggered D-type flip-flop depicted in Fig. 5 and the two-input speed-optimized XOR gate in Fig. 6, both included in a standard-threshold-voltage, low-power standard cell library. In addition, for the circuits reported in [26] and in Fig. 4, we used the thin-oxide N-type and P-type MOSFETs with low threshold voltage and minimum channel length of 28 nm, included in the same design kit. All circuits have been clocked at 300 MHz and powered at 1 V.

The simulation results of the LFSRs designed with the different approaches are summarized in Table 2. By comparing [...]

On the other hand, as expected, the proposed solution, as reported in Table 2 and plotted in Fig. 7, allows a significant power saving, which is typically higher than 20% [...]

For area and delay estimation purposes, we have coded in VHDL and synthesized with the Cadence Genus™ tool the 16-bit LFSRs reported in Table 2, considering both the conventional and the gated clock implementation in [22]. To estimate the area and delay of the LFSRs exploiting the approach proposed in this paper, we have also implemented the full-custom layout of the power-aware XORAND circuit in Fig. 4. The area of the LFSRs has then been estimated by summing the area of the standard cells and the area of the power-aware XORAND circuits used in the different 16-bit LFSR implementations.

Table 3 summarizes the area and critical path delays of the 16-bit LFSRs reported in Table 2, confirming that the proposed approach does not affect the critical path delay, which is, in all cases, set by the feedback path. The area estimations also suggest that the proposed approach results [...]