This paper describes PGR, an architectural technique that either reduces dynamic power via GlitchLess or improves performance via clock skew scheduling (CSS) and delay padding (DP). PGR is integrated into VPR 5.0 and is invoked after the routing stage. As a novel architecture modification, we use programmable delay elements (PDEs) to insert delay on flip-flop (FF) clock inputs; all optimization steps share the same PDEs, which avoids multiple architecture modifications. The central theme of this paper is the trade-off between power and performance, and finding an appropriate compromise in the presence of process variation and timing uncertainties. CSS alone achieves an average speedup of 15%, and up to 37% for individual circuits. Although delay padding benefits only several circuits, those circuits improve on average by an additional 10% of the original period, and up to 23% for individual circuits. In addition, we propose a new model to estimate glitching power that takes into account the analog behavior of glitch pulse width reduction as a glitch travels along FPGA routing tracks. We show that the original glitch estimation method can underestimate glitching power by up to 48% and overestimate it by up to 15%. GlitchLess is performed on both the original VPR and post-CSS solutions, eliminating on average 16% of glitching power, and up to 63% for individual circuits.
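To illustrate why delaying FF clock inputs can shorten the clock period, the following is a minimal sketch of the textbook setup and hold constraints behind clock skew scheduling. It is not PGR's algorithm: the function names, the three-FF loop, and all delay values are hypothetical and chosen only for illustration, and setup and hold times are assumed to be zero.

```python
def min_period(paths, skew):
    """Smallest clock period that meets every setup constraint.

    paths: list of (src_ff, dst_ff, max_delay, min_delay) tuples
    skew:  dict mapping FF name -> delay inserted on its clock input
    Setup constraint (setup time assumed zero):
        skew[src] + max_delay <= skew[dst] + period
    """
    return max(d_max + skew[src] - skew[dst] for src, dst, d_max, _ in paths)


def hold_ok(paths, skew):
    """Check the hold constraint (hold time assumed zero):
        skew[src] + min_delay >= skew[dst]
    """
    return all(skew[src] + d_min >= skew[dst] for src, dst, _, d_min in paths)


# Hypothetical loop of three FFs: A -> B -> C -> A (delays in arbitrary units).
paths = [
    ("A", "B", 10, 4),
    ("B", "C", 6, 3),
    ("C", "A", 8, 3),
]

zero_skew = {"A": 0, "B": 0, "C": 0}
# Delaying B's clock by 2 units lends time from the short B -> C path
# to the long A -> B path.
scheduled = {"A": 0, "B": 2, "C": 0}

baseline_period = min_period(paths, zero_skew)
css_period = min_period(paths, scheduled)
css_hold_ok = hold_ok(paths, scheduled)
```

In this toy loop the zero-skew period is set by the slowest path (10 units), while the scheduled skews balance the three paths at 8 units each, which is the average delay around the loop and therefore the best any skew assignment can do here. When a skew assignment violates the hold check, delay padding addresses it by lengthening the short path instead of giving up the skew.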