In this paper, we study the retiming problem of sequential circuits with net topology optimization. Both interconnect and gate delay are considered in retiming. Most previous retiming algorithms have assumed ideal conditions for the nonlogical portions of data paths, an assumption that is not sufficiently accurate for today's high-performance circuits. In our modeling, we assume that the delay of a wire is directly proportional to its length. This assumption is reasonable since the quadratic component of a wire delay is significantly smaller than its linear component when the more accurate Elmore delay model is used. A simple experiment was conducted to illustrate the validity of this assumption. We present two approaches to solve the retiming problem, both of which have polynomial time complexity. The first one can compute the optimal clock period, while the second one improves on the first in terms of practical applicability. The second approach gives solutions that are very close to the optimal (0.06% more than the optimal on average) but in a much shorter runtime. The optimally retimed circuit is then realized physically by placing the registers and finding the net topologies. In contrast to many previous works [Proc. IEEE Int. Conf. Comput.-Aided Des., p. 136, 1998], [IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., 22(7) Jul. 2003] that performed simple calculations to determine the register positions, our approach can preserve the optimal clock period that is obtained by the retiming step and use as few registers as possible. Minimizing the number of registers saves both area and power in register and clock loading. Our topology optimization step is shown to be optimal for nets with four or fewer pins, and nets of this type constitute over 90% of the nets in a sequential circuit on average. Using the ISCAS89 benchmark, we tested our algorithm with a 0.35-μm complementary metal-oxide-semiconductor standard cell library.
Silicon Ensemble was used to lay out the design with a row utilization of 50%. Experimental results showed that our algorithm could find the best sharing of registers for a net in most cases, i.e., using the minimum number of registers while preserving the target clock period that is obtained by the retiming step, within one minute of runtime on an Intel Pentium IV 1.5 GHz PC with 512 MB RAM.
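
The linear wire-delay assumption stated above can be motivated by a short worked expansion of the Elmore delay. The sketch below uses the standard textbook form for a single uniform wire driven by a gate, not the paper's own derivation. All symbols are assumptions introduced here for illustration: $R_d$ is the driver resistance, $C_L$ the load capacitance, $r$ and $c$ the wire resistance and capacitance per unit length, and $\ell$ the wire length.

```latex
% Illustrative only: the symbols R_d, C_L, r, c, \ell are assumed here
% and do not come from the paper.
\begin{align*}
T_{\text{Elmore}}(\ell)
  &= R_d\,(c\,\ell + C_L) + r\,\ell\left(\tfrac{1}{2}\,c\,\ell + C_L\right) \\
  &= \underbrace{R_d\,C_L}_{\text{constant}}
   + \underbrace{(R_d\,c + r\,C_L)\,\ell}_{\text{linear in }\ell}
   + \underbrace{\tfrac{1}{2}\,r\,c\,\ell^{2}}_{\text{quadratic in }\ell}
\end{align*}
% The quadratic term is small relative to the linear term when
%   \ell \ll 2\left(\frac{R_d}{r} + \frac{C_L}{c}\right).
```

Under this simplified model, the quadratic term stays small relative to the linear term as long as the wire length is well below $2\,(R_d/r + C_L/c)$. This is consistent with treating wire delay as proportional to length, though the actual threshold depends on technology parameters that the abstract does not give.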