Perspectives and Issues in 3D-IC from Designers' Point of View

Recent progress in through-silicon-via (TSV) processes has been so impressive that a real 3D-IC era can now be expected. The most valuable advantage of 3D-IC is the reduction of interconnects. Although this advantage has been analyzed in some specific case studies, a general theory for quantitative analysis has not been established. In some cases, the advantage of 3D-IC has been overestimated and differs greatly from what can be expected in real chip designs. This paper presents a qualitative analysis of general 3D-IC design, especially for the sub-65 nm CMOS generation, from the designers' point of view. It shows how important IC design is for 3D-IC and how to gain a large advantage from 3D-IC.

Recently, through-silicon-via (TSV) technology for 3D circuits has been developed intensively, so that a 3D-SoC (3D system-on-chip) is becoming more realistic. One of the most advantageous features expected of a 3D-SoC is a decrease in interconnect delay, since interconnect delay increases seriously with CMOS scaling, especially in sub-90 nm CMOS. This means that “interconnect-aware design” is very important for improving performance with 3D-SoC. System-level simulations of 3D-SoC [1], [2] suggest that circuit performance is improved mainly by shortening interconnects. In such system-level studies, however, the contribution of shortened interconnects to circuit performance is not clear enough. Hence, it has not been clarified how to decrease interconnect delay effectively. We first analyze in depth the contribution of interconnect delay at the circuit level from an industrial point of view and then, based on this analysis, present an effective layout method for 3D circuit layout design, the Folding Tape Method.

It is well known that interconnect delay is as large as the CMOS FO4 delay in sub-90 nm CMOS. In this paper, we introduce the “critical length” to discuss the effect of decreasing interconnect delay, where the critical length is defined as the length of an interconnect whose delay is comparable to the FO4 delay. Fig. 1 shows the critical length of Metal 1, intermediate and global interconnects with repeaters, where the delay is calculated using data from ITRS [3]. The critical length of Metal 1 is about 110 um, 70 um and 40 um for 65 nm, 45 nm and 32 nm CMOS, respectively. Note that the interconnects that 3D design should shorten to reduce delay are not the short ones but those longer than the critical length.

Fig. 1. Critical length of Metal 1, intermediate interconnect and global interconnect for each CMOS technology.
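
As a minimal sketch of how the critical length selects the interconnects targeted by 3D design, the following Python fragment uses only the Metal 1 values quoted above. The names METAL1_CRITICAL_UM and is_3d_target are hypothetical helpers introduced for illustration, and all lengths are in um.

```python
# Metal 1 critical lengths quoted in the text, in um, keyed by CMOS node in nm.
METAL1_CRITICAL_UM = {65: 110.0, 45: 70.0, 32: 40.0}


def is_3d_target(length_um, node_nm):
    """Return True if a Metal 1 interconnect is longer than the critical length.

    Hypothetical helper: only interconnects longer than the critical length
    are treated as worth shortening by 3D design.
    """
    return length_um > METAL1_CRITICAL_UM[node_nm]
```

Under these values, a 100 um Metal 1 interconnect would not be a target in 65 nm CMOS but would be one in 45 nm CMOS.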

Fig. 2 shows the frequency of interconnect length (the density function of interconnect length) for a ten-million-gate chip in 65 nm CMOS, together with the interconnect delay as a function of length. Arrows denote the critical lengths for each technology. This calculation uses conventional interconnect distribution theory [4]. The figure also shows the frequency of interconnects in the same chip using a normal 3D layout simply divided into four layers. As shown in Fig. 2, the number of interconnects at the critical length is almost the same for the 2D layout and the 3D layout. Although very long interconnects of about 1 mm can be shortened by the 3D layout, the frequency of such long interconnects is extremely small compared with that of interconnects at the critical length. To decrease the length of the target interconnects, the die dimension Z has to be reduced to near the critical length, and the number of 3D-stacked layers would thus have to be increased to more than several tens. This is unrealistic, however, considering heat dissipation and stacking process cost. Another design method is needed to decrease interconnect delay effectively.

Fig. 2. Interconnect delay and frequency of interconnect length for 2D-layout and 3D-layout with 4 layers by conventional 3D design. Vertical arrows denote the critical length.
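
A rough geometric estimate illustrates why so many layers would be needed with conventional stacking. The sketch below assumes (this model is not part of the analysis above) a square die of edge Z0 that is split into N equal-area square layers, so that each layer has edge Z0 / sqrt(N). The function name layers_needed and the die edge used in the note after the code are illustrative only.

```python
import math


def layers_needed(die_edge_um, critical_length_um):
    """Smallest N such that die_edge_um / sqrt(N) <= critical_length_um.

    Assumed model: a square 2D die divided into N equal-area square layers.
    """
    if die_edge_um <= 0 or critical_length_um <= 0:
        raise ValueError("lengths must be positive")
    ratio = die_edge_um / critical_length_um
    return max(1, math.ceil(ratio * ratio))
```

For an assumed 1000 um die edge and the 110 um Metal 1 critical length of 65 nm CMOS, layers_needed(1000, 110) gives 83 under this simple model, in line with the statement that several tens of layers or more would be required.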

We present a 3D layout method, the “Folding Tape (FT) method”, to decrease the length of the target interconnects with just several 3D layers. As depicted in Fig. 3(a), it starts from a first, tentative partitioning on a 2D layout, which is then changed to a second partitioning to form a 2D “pseudo-one-dimensional” layout, called the “tape” in this work. Next, the tape is “folded” like “Origami,” as shown in Fig. 3(b) to (d). Such a transformation from a 2D layout to a 3D layout is valuable, since conventional 2D-CAD tools can simply be used. The tape is folded in units of N layers, where N is the number of 3D-stacked layers, from the lowest layer to the highest layer of the 3D circuit and then from the highest layer back to the lowest, one by one, as shown in Fig. 3(d). An interconnect lying over two layers is then shortened by a short-cut through a TSV, as shown in Fig. 3(e)–(f), so that the final interconnect length is shorter than Lc, the folded length in the y-direction. At the same time, repeaters can be removed.

Fig. 3. Folding Tape (FT) method for 3D-circuit layout. (Triangles represent repeaters.)
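
The folding order described above can be sketched as a mapping from a position along the tape to a layer. The sketch assumes that the tape is cut into consecutive segments of length Lc indexed from 0, that each group of N segments forms one stack, and that the segment following a turnaround stays on the same layer in the neighbouring stack. This is one reading of the text, not a detail confirmed by Fig. 3, and fold_position is a hypothetical name.

```python
def fold_position(segment, n_layers):
    """Map a 0-based tape segment index to (stack, layer).

    Assumed folding order: layers 0..N-1 going up in even stacks,
    then N-1..0 going down in odd stacks.
    """
    if segment < 0 or n_layers < 1:
        raise ValueError("segment must be >= 0 and n_layers >= 1")
    stack, step = divmod(segment, n_layers)
    layer = step if stack % 2 == 0 else n_layers - 1 - step
    return stack, layer
```

With N = 3, the first seven segments map to layers 0, 1, 2, 2, 1, 0, 0, so neighbouring tape segments always end up on the same or an adjacent layer, which is what allows a short TSV to connect them.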

Although other 3D-folding methods that transform a 2D layout like “Origami” [5], [6] have been proposed, they differ greatly from our FT-method: in those methods the 3D folding itself does not contribute to shortening interconnects, and further complex computation for interconnect reduction and for placement and routing is needed to shorten them. Such complex computation generally takes much longer than our simple method. For a longer interconnect lying over more than 3 layers, as shown in Fig. 3(f), the final length is also less than Lc. For a still longer interconnect lying over more than N layers (originally longer than N × Lc), the final length is less than 2 × Lc, which means that very long interconnects cannot be shortened as effectively as interconnects shorter than N × Lc. However, since the number of such very long interconnects is very small, as described above, this does not affect the decrease in interconnect delay of the chip.

Using the notation defined in Fig. 4, the average interconnect length shortened by the FT-method is a function of the interconnect length, L, and the partial block size, Lc, and can be expressed as follows:

$$\int_{0}^{L_c} (L + y - L_c)\,\frac{dy}{L_c} = \frac{L^{2}}{2L_c} \eqno{(1)}$$

Fig. 4. Partial block of a part of the “tape”.
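
The right-hand side of equation (1) can be checked numerically. The sketch below assumes that an interconnect of length L <= Lc starts at a position y distributed uniformly over [0, Lc] within a partial block, and that only the part extending beyond the block edge, max(0, L + y - Lc), counts as shortened. Clipping the integrand at zero is an assumption about how equation (1) is meant to be read (it is equivalent to integrating only over Lc - L <= y <= Lc), and the function names are hypothetical.

```python
def mean_shortened_length(L, Lc, steps=10_000):
    """Midpoint-rule average of max(0, L + y - Lc) for y uniform on [0, Lc]."""
    dy = Lc / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * dy
        # Only the part of the interconnect beyond the block edge contributes.
        total += max(0.0, L + y - Lc) * dy
    return total / Lc


def closed_form(L, Lc):
    """Right-hand side of equation (1): L^2 / (2 Lc)."""
    return L * L / (2.0 * Lc)
```

Under this reading, the numerical average agrees with L^2 / (2 Lc) for any L not exceeding Lc; without the clipping, the integral from 0 to Lc would instead evaluate to L - Lc/2.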

Using equation (1) and the interconnect length distribution theory used above, the relative frequency of interconnect length in the 3D layout using the FT-method can be calculated. Fig. 5 shows the results, where Lc is set to the critical length for each CMOS technology and the 3D design uses five stacked layers (N = 5).

Fig. 5. Distribution of interconnects on the 2D design and 3D design using FT-method.

Fig. 5 indicates that the number of interconnects longer than the critical length is dramatically reduced, to less than several percent of that for 2D. To analyze the contribution of decreased interconnect length to chip performance, we introduce the interconnect delay factor, defined as the product of frequency and interconnect delay. The interconnect delay factor is thought to be strongly related to the total delay of the chip. Fig. 6 shows the interconnect delay factor for 65 nm CMOS technology. It indicates that the FT-method reduces the delay factor around the critical length by a factor of more than 30. Thus, the FT-method is confirmed to be highly effective in reducing interconnect delay, even though the number of 3D-stacked layers is small.

Fig. 6. Comparison of the interconnect delay factor between 2D and 3D with FT-method.
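
The interconnect delay factor defined above is a pointwise product, which the following sketch makes explicit. The function name delay_factor and the sampled inputs are hypothetical; the actual frequency and delay curves behind Fig. 6 are not reproduced here.

```python
def delay_factor(frequency, delay):
    """Interconnect delay factor: frequency times delay at each sampled length.

    `frequency` and `delay` are sequences sampled at the same interconnect
    lengths (hypothetical inputs; any consistent units may be used).
    """
    if len(frequency) != len(delay):
        raise ValueError("frequency and delay must have the same length")
    return [f * d for f, d in zip(frequency, delay)]
```

Evaluating this for the 2D and the FT-method distributions at the same lengths and taking the ratio at each length gives the kind of comparison plotted in Fig. 6.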

Fig. 7 depicts an ideal layout of a CMOS logic circuit with TSVs on a layer designed by the FT-method. It suggests that the area overhead of the TSVs is negligible if the TSV diameter is smaller than the distance between the VDD line and the GND line. For the 45 nm CMOS case, the 3D circuit has no area overhead when the TSV diameter is less than 0.5 um. The TSV density and pitch needed in this case are about 0.12 um^-2 and 8.1 um, respectively. Such small, high-density TSVs can be fabricated by adapting the fabrication technology for deep trench capacitors in e-DRAM (e.g., diameter: 0.2 um, depth: 8 um at 90 nm CMOS), and TSVs of similar dimensions have already been fabricated [7]. If some area overhead is acceptable, larger TSVs can be used.

Fig. 7. Ideal CMOS logic circuit layout of TSV for 3D-layout by FT-method.

Using the FT-method, about 95% of the repeaters can be removed in the 45 nm CMOS case, as calculated from the results in Fig. 6 and shown in Fig. 8. The repeaters thus removed make up about 10% of all logic gates in the chip, so an area and power reduction of about 10% can also be expected. To the best of our knowledge, this is the simplest and most effective method reported so far for reducing interconnect delay.

Fig. 8. Comparison of number of repeaters between 2D and 3D with FT-method.

Based on interconnect-delay-aware design, or critical-length-aware design, we can clarify which CMOS digital circuits benefit from 3D circuit design in terms of reduced interconnect delay. Fig. 9 shows the interconnect delay relative to the FO4 gate delay as a function of interconnect length. The graph also plots the area of typical digital CMOS circuits, which indicates the longest interconnect contained in each circuit. It shows that small circuits such as standard cells (NAND, NOR, XOR, flip-flop, etc.) are not targets for 3D circuit design. Even 16-bit counters and multipliers are not targets, as the reduction in interconnect delay from 3D circuit design is negligible. Large circuit blocks such as processor element (PE), FFT, general SoC, Processor and CMOS contain long interconnects whose delay is not negligible compared with the gate delay, as shown in Fig. 9.

Fig. 9. Relative interconnect delay compared to FO4 gate delay as a function of interconnect length. Circles for typical digital CMOS circuits indicate the longest interconnect contained in each digital circuit.

In conclusion, since the decrease in interconnect delay is the most valuable effect of 3D-SoC, “interconnect-delay-aware design” is needed. We have systematically analyzed interconnect delay in sub-90 nm CMOS and compared it with gate delay for each CMOS generation, which suggests the importance of critical-length-aware design. Based on this analysis, we have presented a new design method, the Folding Tape Method (FTM), to shorten interconnects around the critical length. The improvement in interconnect delay by FTM and the decrease in repeaters with small overhead have been shown. Thus, the performance improvement of 3D circuits strongly depends on the design.

Footnotes

Shinobu Fujita, Keiko Abe, Kumiko Nomura, Shin'ichi Yasuda and Tetsufumi Tanamoto are with the Advanced LSI Technology Laboratory, Toshiba Corporation, Komukai-Toshiba-cho 1, Saiwai-ku, Kawasaki, Japan, 212-8582; shinobu.fujita@toshiba.co.jp

References

1. G. L. Loi, et al., "A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy," DAC 2006, San Francisco, California, July 24-28, 2006.

2. K. Puttaswamy, et al., "Scalability of 3D-integrated arithmetic units in high-performance microprocessors," ACM DAC, 2007.

4. J. A. Davis, et al., "Interconnect limits on gigascale integration in the 21st century," Proceedings of the IEEE, vol. 89, no. 3, pp. 305–324.

5. T. Tanamoto, Japanese Patent, P2006-71021.

6. J. Cong, et al., "Thermal-aware 3D IC placement via transformation," Proceedings of the 12th ASP-DAC 2007, Yokohama, Japan, January 2007, pp. 780–785.

7. A. W. Topol, et al., "Enabling SOI based assembly technology for three-dimensional (3D) integrated circuits (ICs)," IEDM Tech. Digest, 2005, p. 363.


© Copyright 2011 IEEE – All Rights Reserved