This paper describes design methodologies for the optimal inductive peaking structures used for the 40-Gb/s serializing transmitter circuits presented in. The implemented transmitter had more than 400 on-chip inductors and transformers in order to achieve the bandwidth required for the 38.4-Gb/s operation demonstrated in a 0.13-μm CMOS process. A bridged T-coil network with inverted mutual coupling was found more effective than the conventional T-coil with sizeable driver-side capacitance. An iterative refinement procedure that directly optimizes the circuit's large-signal transient response at the presence of the inductor parasitics and device nonlinearities via HSPICE-ASITIC joint-simulation is described. The procedure resulted in more than 3 ?? improvement in bandwidth for the CML buffer, multiplexer, and latch circuits. It is shown that the area and the achievable bandwidth of the optimal inductive peaking structures will scale favorably with the CMOS technology trends.