3-D CMOS Chip Stacking for Security ICs Featuring Backside Buried Metal Power Delivery Networks With Distributed Capacitance

3-D stacks of complementary metal–oxide–semiconductor (CMOS) integrated circuit (IC) chips for security applications monolithically embed backside buried metal (BBM) routing with low series impedance and high decoupling capability in a power delivery network (PDN), thanks to capacitance distributed over the full-chip backside area. The 3-D Si demonstrator integrating cryptographic engines was fabricated in a 0.13-µm CMOS technology with post-Si wafer-level BBM Cu processing with 10, 15, and 10 µm of thickness, linewidth, and space, respectively, along with through-Si vias (TSVs) with 10 and 40 µm of diameter and depth, respectively. The capacitance of 0.18 nF/mm² over the effective backside area of 71 mm² suppressed dynamic IR drops by 10% and 59% for the single-chip and four-chip stack samples, respectively, during the operation of a 3.9 M-gate crypto core at 30 MHz. On-chip power noise monitoring (OCM) was applied in these measurements. The 3-D BBM PDN also effectively reduces power side-channel information leakage, as evaluated by a 14× increase in the number of externally observed electromagnetic (EM) noise waveforms needed to attain a t-test value larger than 4.5.

geneous evolvements of very large scale integrated (VLSI) systems. Capacitors are essential to stable and capable power delivery for high-performance circuit operation and to preventing unexpected intercircuit interference due to power noise coupling. Those capacitors should ideally be embedded in a system with higher density and larger total capacitance, a more compact footprint and lower profile, and higher voltage tolerance and smaller electrical impedance. Their leverage on system performance prompts the advancement of Si wafer processing as well as post-Si packaging technology platforms.
On-die high-density capacitors have metal-insulator-metal (MIM) thin-layered structures within a metal stack or involve deeply etched trenches with dielectric and metal fillings. Package capacitors usually deploy multilayer ceramic capacitor (MLCC) surface mount discrete (SMD) components, which are located on the backside shadow of an integrated circuit (IC) chip or adjacent to it, called a land side capacitor (LSC) or a die side capacitor (DSC), respectively. On-die as well as package capacitors have recently started to be embedded within the packaging interposer of high-performance large IC chips in high-volume production, e.g., a microprocessor [1] and a field-programmable gate array (FPGA) [2]. This aims at a very large total capacitance and an efficiently decoupled use of semiconductor manufacturing sites with different technology nodes. Technology platforms adopted in the literature include thin-film multiple layers within a plastic interposer [3], MIM and deep trench (DT) capacitors processed on a Si interposer [4], [5], DT capacitors by wafer-to-wafer bonding [6], and so forth. The suppression of power supply noise (PSN) is evaluated as the general benefit, along with improvements in clock jitter and in the delay time of critical logic paths.
While those capacitor technologies are primarily for 2.5-D interposer assembly [7], further developments are expected for PSI in 3-D VLSI systems [8]-[10]. This article describes BBM capacitors distributed over the whole 3-D complementary metal-oxide-semiconductor (CMOS) chip stack. In addition to PSN suppression, we introduce another measure of 3-D embedded capacitance: its benefits from a hardware-security perspective.
The backside of a Si substrate is an open space in which passive elements can be integrated almost free of the design constraints posed by transistor technology nodes on its front side. The BBM technique, a post-CMOS wafer-level process, has been developed to form thick and wide Cu stripes buried from the backside of Si [11]. The back- and front-side metal stacks are electrically connected by TSVs at the periphery of a die or in the area of high-voltage analog circuits to avoid keep-out zones in a core logic area. This dual-side metallization has realized a passive Si interposer placed over a CMOS chip for in-package low-impedance power delivery [12] and a monolithic CMOS chip with backside attack protection circuits [13]. The Si backside technologies follow the trend toward near-core power supply functionality, as reported for monolithic PDNs in highly scaled technologies toward the 3-nm node [14], [15]. Another proposal has also been made for a functionally integrated in-package magnetic-core inductor [16].
In this article, we demonstrate the use of backside Si for integrated passives and routings for power delivery in a secure 3-D CMOS chip stack typically with cryptographic functions [17]. The next section describes PDN architecture using BBM. Section III details a demonstrator fabricated through wafer-level processing and assembled in a system. Measurement results will be described in Section IV. A brief summary will be given in Section V.

II. CMOS POWER DELIVERY WITH BBM
The BBM routing in the cross-sectional view of Fig. 1 is formed through wafer-level post-CMOS processing. In digital ICs consisting of conventional CMOS logic cells, the ground-side (V_SS) nodes are strongly tied to the p-type Si wafer through Si substrate contacts. For this reason, the whole BBM routing is dedicated to the power supply (V_DD) side of the PDN, aiming at side and bottom wall capacitances distributed over the full-chip backside, C_BBM, for chip-wide PSN suppression. The BBM PDN is connected to its front-side counterpart by TSVs processed only at the positions of I/O pads in the die periphery, so as not to interfere with the layout of the core transistors in a logic cell array. The BBM PDN has the dimensions given in Fig. 1 and features Cu that is more than ten times thicker and wider than typical front-side top-layer metal wiring. The BBM PDN reduces the on-chip PDN impedance, which is highly desirable for a large-area digital IC chip embodying, for example, a public-key cryptographic algorithm.
In the 3-D evolution of the BBM PDN in Fig. 2, the backside of an upper tier provides V_DD and V_SS BBM cross trunks that directly contact the surface µbumps on the front side of the adjacent lower tier. The µbumps are stacked on area Al pads and aligned in sequence for the V_DD and V_SS top metal cross trunks, respectively, in the middle part of the CMOS front side. The remaining BBM portions fully belong to the V_DD domain and densely form meshes for PDN capacitance against the Si substrate biased at V_SS. The V_DD and V_SS domains are each unified by the BBM cross trunks and the TSVs in the periphery for tier-to-tier vertical PDN interconnections, and, therefore, the BBM PDN capacitance is evenly distributed over the whole 3-D stack.
Flip-chip ball grid array packaging is adopted for assembly. The first (bottom) chip faces a fine-pitch plastic interposer, where the µbumps on the periphery pads directly contact the Al lands of the interposer. The subsequent chips are homogeneously stacked. Fig. 3 shows the structural view and cross-sectional photos of a four-tier stack test sample. The BBM on the top tier is visible if the stack is decapsulated. It can be isolated from the PDN and biased to an externally applied clean voltage (e.g., V_SHD, for EM shielding).

A. Power Delivery
A single-chip implementation of the BBM PDN is represented by the equivalent circuit of Fig. 4(a). A digital core is supplied by an on-chip micro voltage regulator module (µVRM), where the capacitance on the BBM meshes, C_BBM, is formed on the core V_DD against V_SS and serves as the immediate decoupling capacitor. Cryptographic engines are integrated in the digital core of the Si demonstrator in this article. To maximize the power delivery efficiency, the chip is flipped down and assembled on a plastic interposer where land-side capacitors, C_LS, can be additionally integrated on or within its laminates; however, the series impedance parasitic to routing and contacts (Z_IP) is unavoidable.
In a 3-D stack as given in Fig. 4(b), identical chips with BBM and TSVs are cascaded in a flipped 3-D stack, aiming for the energy efficiency of parallelized crypto transactions. Note that the explicit capacitance C_BBM for V_DD is inserted in every tier, while C_LS is accommodated only on the first one. C_BBM is not degraded by the series impedance between adjacent tiers thanks to the distributed vertical PDN connections.
While V_SS is unified over the whole 3-D stack, V_DD on each tier is regulated by the µVRM or even halted in a power-down mode for logic circuits. Here, the V_DD voltage is generated with respect to V_SS as the global reference voltage. The BBM then lowers the parasitic impedance primarily on the V_DD domain and consequently improves the capacity of power delivery in the whole 3-D chip stack.

B. Functionality
We have developed the prototype IC chip of  which are located on the left top and the right bottom corners of the chip, according to test scenarios. In addition, small-scale chips of 4 × 3 mm² with some test circuits are prepared for the comparison of BBM capacitance.
An on-chip power noise monitoring (OCM) function is built into each chip to measure voltage variations on the V_DD and V_SS nodes [9], [12]. The OCM channel includes a source follower (SF) to sense the voltage of interest at its input and a subsequent 11-bit successive approximation register (SAR) analog-to-digital converter (ADC) for on-chip voltage digitization. SFs with n- and p-channel MOSFETs are prepared for the core V_DD, nominally at 1.5 V, and for V_SS at 0.0 V, in order to provide negative and positive DC offset voltages at their outputs, respectively. This makes the power noise voltages match the input voltage range of the ADC.
Photographs of the front-side CMOS and the backside BBM on the same die are shown in Fig. 6, along with magnified views around the center of the die area showing cross trunks with µbump areas and BBM PDN meshes.

A. Passive Impedance Measurements
The capacitance of each chip in a standalone (not stacked) configuration, generally defined as the total capacitance between V_DD and V_SS, is measured with and without the BBM PDN, as shown in Fig. 7. The capacitance of Fig. 7 includes the parasitic capacitance to front-end circuitry and C_BBM. The capacitance per area of C_BBM is then extracted through linear regression and summarized in Fig. 8. The density of BBM meshes differs between the small- and large-size chips and provides capacitances of 0.25 and 0.18 nF/mm², respectively. The BBM capacitor, 12.8 nF in total (on average among the large-area chips), is distributed over the full-chip backside.
In comparison with typical in-circuit structures such as metal-on-metal (MOM) and MIM capacitors, the capacitance of BBM is roughly 4× to 8× smaller. However, it is important to note that the BBM capacitor does not sacrifice any front-side core area. In contrast, although the transistor gate electrode exhibits the largest capacitance, it imposes the highest area overheads associated with contacts and wiring, which reduces its area efficiency. Those comparisons are made with manual layouts and simulations according to the physical design kits and rules of the given technology.
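As a rough illustration of the extraction step described above, the following Python sketch fits a straight line to BBM capacitance versus backside mesh area and checks the quoted total. The data points are synthetic and noiseless (generated from the reported 0.18 nF/mm² density), and the helper name `fit_line` and the choice of regression variables are assumptions made for this example; the text does not specify them.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic (not measured) points: BBM mesh area in mm^2 versus the
# capacitance added by the BBM in nF, built from the reported density.
REPORTED_DENSITY_NF_PER_MM2 = 0.18
EFFECTIVE_AREA_MM2 = 71.0

areas_mm2 = [20.0, 40.0, EFFECTIVE_AREA_MM2]
added_cap_nf = [REPORTED_DENSITY_NF_PER_MM2 * a for a in areas_mm2]

# The slope of the fit is the capacitance per unit area.
density_nf_per_mm2, offset_nf = fit_line(areas_mm2, added_cap_nf)

# Total BBM capacitance expected over the effective backside area.
total_cap_nf = REPORTED_DENSITY_NF_PER_MM2 * EFFECTIVE_AREA_MM2

print(f"density = {density_nf_per_mm2:.2f} nF/mm^2, total = {total_cap_nf:.1f} nF")
```

The product of 0.18 nF/mm² and 71 mm² is about 12.8 nF, which agrees with the average total reported for the large-area chips.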

B. Active IR Drop Measurements
PSN is measured by the OCM during the operation of a cryptographic engine (public-key crypto algorithm #2) on the first tier. The waveforms are measured on the V_DD and V_SS nodes as given in Fig. 9, for demonstrators with and without BBM. The measurements cover a single chip as well as 3-D chip stacks with two and four tiers. The vertical axis shows the voltage in steps of the least significant bit (LSB) of the ADC, which is approximately 0.73 mV/LSB, and also includes the DC offset voltage of the SF. The horizontal time axis is resolved by the sampling interval of 1.0 ns, controlled by an external data timing generator.
The periodic voltage variations on V_DD, called dynamic IR drops, are synchronous to clocking and associated with the internal addition and multiplication of binary data with a width of 256 bits or even larger. A huge number of logic gates toggles continuously in each clock cycle during the arithmetic operations, which limits the maximum clock frequency to 30 MHz. The low-frequency droops, which continue for approximately 1.5 µs, follow the evolution of logic activity associated with the mathematical expressions in the crypto algorithm. Both the periodic and the low-frequency voltage variations are suppressed more strongly as the number of tiers in the 3-D stack increases. On the other hand, the voltage on the V_SS nodes is very stable around the nominal ground voltage, reflecting the strong unification of the V_SS networks in the stack. The full use of BBM stripes on the V_DD domains is, therefore, proven to be effective. There is no recognizable discrepancy among the tested 3-D IC chip samples in cryptographic operation or in total power consumption.
The peak-to-peak voltage variation is derived from the waveforms and summarized in Fig. 10 for comparison between conventional and BBM stacks. The reduction of dynamic IR drops on the V_DD nodes reaches 59% if the four-tier stack embeds BBM capacitors, which is 14% more efficient than in the conventional 3-D stack. The effect of BBM stripes on the reduction of series impedance is estimated to be 42%, as explained in Fig. 11. The resistance is derived from the slopes of on-chip power supply DC drops measured against the toggling frequency in clock buffer trees, where the crypto cores are intentionally halted.
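To make the axes of such waveforms concrete, the sketch below converts raw ADC codes to volts and derives two quantities implied by the figures quoted in the text. It assumes a purely additive, signed SF offset and an ADC span of 2^11 steps; the function name `codes_to_volts` and the example codes are hypothetical, and the real OCM calibration may differ.

```python
LSB_VOLTS = 0.73e-3          # approximate ADC step quoted in the text (V/LSB)
ADC_BITS = 11                # SAR ADC resolution
SAMPLE_INTERVAL_S = 1.0e-9   # OCM sampling interval
CLOCK_HZ = 30e6              # crypto core clock frequency


def codes_to_volts(codes, sf_offset_volts=0.0):
    """Convert raw ADC codes to volts.

    sf_offset_volts is the signed DC offset that the source follower adds
    to the sensed node voltage (assumed additive); it is subtracted here
    to recover the voltage at the sensed node.
    """
    return [code * LSB_VOLTS - sf_offset_volts for code in codes]


# Input span covered by the ADC if every code maps to one LSB step.
adc_span_volts = (2 ** ADC_BITS) * LSB_VOLTS

# Number of OCM samples captured within one clock cycle of the core.
samples_per_clock = (1.0 / CLOCK_HZ) / SAMPLE_INTERVAL_S

# Hypothetical codes, converted without any offset correction.
example_volts = codes_to_volts([0, 1000, 2000])

print(f"ADC span = {adc_span_volts:.3f} V, samples per clock = {samples_per_clock:.1f}")
```

Under these assumptions the ADC span is close to the nominal 1.5 V supply, and each 30-MHz clock cycle is resolved by roughly 33 samples.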
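The two figures of merit in this paragraph can be sketched as follows. The waveform codes and DC drop values are hypothetical, chosen only so that the ratios reproduce the reported 59% and 42%. The resistance comparison assumes that the supply current grows linearly with toggling frequency, so that the slope of DC drop versus frequency is proportional to the series resistance when the switched charge is the same in both cases.

```python
def peak_to_peak(samples):
    """Peak-to-peak variation of a sampled waveform."""
    return max(samples) - min(samples)


def reduction_percent(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in %."""
    return 100.0 * (1.0 - improved / baseline)


def ls_slope(xs, ys):
    """Least-squares slope of y versus x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical V_DD waveforms in ADC codes (LSB), not measured data.
vdd_reference = [2000, 1900, 2000, 1905]
vdd_with_bbm = [2000, 1959, 2000, 1960]
ir_drop_reduction = reduction_percent(
    peak_to_peak(vdd_reference), peak_to_peak(vdd_with_bbm)
)

# Hypothetical DC drops (mV) versus clock-tree toggling frequency (MHz).
freq_mhz = [10.0, 20.0, 30.0]
drop_reference_mv = [10.0, 20.0, 30.0]
drop_with_bbm_mv = [5.8, 11.6, 17.4]
resistance_reduction = reduction_percent(
    ls_slope(freq_mhz, drop_reference_mv), ls_slope(freq_mhz, drop_with_bbm_mv)
)

print(f"IR drop reduction = {ir_drop_reduction:.0f}%")
print(f"series resistance reduction = {resistance_reduction:.0f}%")
```

Working with slope ratios avoids having to know the absolute switching current, which is why only relative reductions are computed here.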

C. EM Radiation and SC Leakage Measurements
Dynamic IR drop suppression by the distributed BBM capacitance contributes to the mitigation of EM side channel (SC) leakage from cryptographic operation. This is straightforwardly expected since the local EM emissions that originate from nearby power supply current will be effectively suppressed by the distributed capacitances. The following experiments are provided to relate the SC leakage mitigation with the dynamic IR drop suppression. Fig. 12 shows the measurement setup where an EM probe senses local EM waves emitted from the demonstrator. The output from EM probe is amplified and then stored in an oscilloscope as EM waveform traces. The local EM radiation is governed by dynamic power current consumption of a crypto core and correlated with logic operation internally dealing with secret information. We have evaluated the hidden data dependence in EM radiation during crypto processing by applying the statistical t-test method on captured waveforms. The t-test value, t, suggests the presence of SC leakage when it exhibits the statistical significance with |t| > 4.5, according to the test vector leakage assessment (TVLA) methodology [18]. The t-test method has been widely adopted to quantify the effects of countermeasures against SC leakage in cryptographic engines at architecture and circuit levels [19], [20]. We apply this method to assess SC leakages among packaging structures.
The statistical significance is evaluated between two ensembles of EM traces measured during the crypto core operation of interest. One uses a fixed plaintext of 256 bits throughout the entire set. The other uses a 256-bit plaintext randomly generated for every input case. The EM measurement is performed alternately on these two crypto operations, and EM traces are collected for up to 20 K input cases. An oscilloscope stores the EM traces, where each trace is averaged over five iterations of the crypto operation with the same input case. A channel bandwidth of 125 MHz is chosen for waveform measurements. These measurement conditions are carefully designed to quantify and compare SC leakage suppression by the BBM capacitance among various 3-D stacked samples.
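The fixed-versus-random comparison can be sketched with Welch's t-test applied at every sample point of the traces, which is the statistic commonly used in TVLA. The code below is a minimal illustration on synthetic Gaussian traces, not the authors' analysis flow; the helper names, trace counts, trace length, and the injected mean shift are all assumptions.

```python
import math
import random
from statistics import mean, variance

T_THRESHOLD = 4.5  # |t| above this value indicates leakage


def average_iterations(iterations):
    """Average repeated captures of one input case, point by point."""
    return [mean(column) for column in zip(*iterations)]


def welch_t(a, b):
    """Welch's t statistic between two samples with unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def max_abs_t(fixed_traces, random_traces):
    """Largest |t| over all sample points of the two trace ensembles."""
    n_points = len(fixed_traces[0])
    return max(
        abs(welch_t([tr[i] for tr in fixed_traces],
                    [tr[i] for tr in random_traces]))
        for i in range(n_points)
    )


rng = random.Random(0)  # seeded for repeatability


def make_traces(n_traces, n_points, shift_at=None, shift=0.0):
    """Synthetic traces: unit Gaussian noise plus an optional mean shift."""
    traces = []
    for _ in range(n_traces):
        trace = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        if shift_at is not None:
            trace[shift_at] += shift
        traces.append(trace)
    return traces


random_set = make_traces(500, 4)
fixed_leaky = make_traces(500, 4, shift_at=2, shift=1.0)  # data-dependent point
fixed_clean = make_traces(500, 4)                         # no data dependence

t_leaky = max_abs_t(fixed_leaky, random_set)
t_clean = max_abs_t(fixed_clean, random_set)
print(f"leaky: {t_leaky > T_THRESHOLD}, clean: {t_clean > T_THRESHOLD}")
```

Averaging each input case over several iterations, as `average_iterations` does, lowers the measurement noise per trace and therefore sharpens the t statistic for a given number of input cases.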
The following measurement results firmly support and extend the experimental conclusions reported in [17].
First, we measure the local EM emission at a position close to the top of the stack, shown as measurement point #1 in Fig. 12. The highest SC leakage is observed for the single chip with BBM in Fig. 13, which shows the t-test values. This is naturally understood, since the power consumption current of the crypto operation flows through the BBM as a part of the V_DD wiring and strongly couples to the EM probe near its backside surface. The leakage is evidently attenuated as the number of chips in the 3-D stack increases, resulting from the power noise attenuation by the BBM capacitance distributed over the stack. The number of EM traces needed to reach significant leakage (|t| > 4.5) in the four-chip BBM stack becomes almost equivalent to that of the single chip without BBM. We have also experimentally confirmed that the suppression is almost negligible among multitier conventional 3-D stacks without BBM. It is important to note that the t-value no longer exceeds 4.5 even with 20 K test cases if we cover the single chip with outer metal shields biased at V_SHD, as measured at point #2 of Fig. 12.
Second, EM measurements were performed at the location of the power supply terminals on the printed circuit board (PCB), shown as point #3 in Fig. 12. This represents a leakage assessment by an adversary without knowledge of the internal device structures. The t-test values are compared in Fig. 13 between the public-key crypto demonstrator in conventional assembly and in 3-D BBM stacking. The leakage level remains sufficiently suppressed over 20 K test cases in the four-tier stack with BBM, in contrast to the conventional single-chip assembly even with EM shielding, achieving a 14× increase in the number of EM traces needed for |t| > 4.5. The increase is also measured as 8× for the two-tier BBM stacked demonstrator.
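The comparison metric used here, the number of traces needed before |t| exceeds 4.5, can be computed from a curve of t-values versus trace count, as in the sketch below. The curves are hypothetical and were chosen only so that the ratio matches the reported 14×; they are not the measured data of Fig. 13, and the function name `traces_to_leakage` is an assumption.

```python
T_THRESHOLD = 4.5


def traces_to_leakage(trace_counts, t_values, threshold=T_THRESHOLD):
    """Return the first trace count at which |t| exceeds the threshold.

    Returns None if the threshold is never exceeded within the data.
    """
    for count, t in zip(trace_counts, t_values):
        if abs(t) > threshold:
            return count
    return None


# Hypothetical t-value evolution, not measured data.
trace_counts = [1000, 2000, 5000, 10000, 14000, 20000]
t_reference = [4.8, 6.5, 9.9, 13.0, 15.2, 18.0]
t_with_bbm = [1.0, 1.5, 2.5, 3.6, 4.6, 5.3]

n_reference = traces_to_leakage(trace_counts, t_reference)
n_with_bbm = traces_to_leakage(trace_counts, t_with_bbm)

# A larger ratio means an attacker needs more traces to detect leakage.
improvement = n_with_bbm / n_reference
print(f"traces to leakage: {n_reference} vs {n_with_bbm} ({improvement:.0f}x)")
```

A return value of `None` corresponds to the shielded case described above, where the t-value stays below the threshold for all collected test cases.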
While the EM waves in the first measurements are vertically emitted from the backside plane of crypto chips, the second ones are caused on the board by power current horizontally flowing from on-die µVRMs having low-pass characteristics with limited bandwidth. EM radiation is more or less present in these measurements; however, the statistical significance among EM traces is effectively diminished with the distributed BBM capacitance.

V. CONCLUSION
The characteristics of BBM PDN in 3-D CMOS chip stacking are experimentally evaluated with Si demonstrators equipped with cryptographic engines and on-chip power noise monitors. Post-wafer BBM processing on 0.13-µm CMOS wafers attains 12.8 nF on the backside of a crypto chip. Up to four BBM chips are 3-D stacked and packaged in flip-chip assembly.
The BBM capacitance distributed over the 3-D stack efficiently attenuates dynamic IR drops on power nodes, by 59% in voltage variation. A 14× mitigation of SC leakage is achieved in comparison with conventional single-chip assembly, as measured with the t-value representing statistical significance from a hardware-security viewpoint.
The 3-D chip stacking with BBM PDN provides novel technology options toward high SC leakage resiliency among security ICs, in close collaboration with countermeasure design techniques of cryptographic circuits and algorithms.