Skip to Main Content
This paper investigates how the latency and energy of register alias tables (RATs) vary as a function of the number of global checkpoints (GCs), processor issue width, and window size. It improves upon previous RAT checkpointing work that ignored the actual latency and energy tradeoffs and focused solely on evaluating performance in terms of instructions per cycle (IPC). This work utilizes measurements from the full-custom checkpointed RAT implementations developed in a commercial 130-nm fabrication technology. Using physical- and architectural-level evaluations together, this paper demonstrates the tradeoffs among the aggressiveness of the RAT checkpointing, performance, and energy. This paper also shows that, as expected, focusing on IPC alone incorrectly predicts performance. The results of this study justify checkpointing techniques that use very few GCs (e.g., four). Additionally, based on full-custom implementations for the checkpointed RATs, this paper presents analytical latency and energy models. These models can be useful in the early stages of architectural exploration where actual physical implementations are unavailable or are hard to develop. For a variety of RAT organizations, our model estimations are within 6.4% and 11.6% of circuit simulation results for latency and energy, respectively. This range of accuracy is acceptable for architectural-level studies.