Skip to Main Content
Powerful branch predictors along with a large branch target buffer (BTB) are employed in superscalar processors for instruction-level parallelism exploitation. However, the large BTB not only dominates the predictor energy consumption, but also becomes a major roadblock in achieving faster clock frequencies at deep sub-micron technologies. In this paper, we propose a filtering scheme to reduce the accesses to the BTB to achieve a significant dynamic energy reduction in the BTB while maintaining the performance. Our experimental evaluation using the SPEC2000 benchmark suite shows that our BTB Access Filtering (BAF) design achieves a 88.5% dynamic energy reduction over a default 2K-entry 2-way BTB at the cost of a negligible 0.1% performance loss, on the average across all benchmarks. We also studied the leakage behavior and its control in our BAF design. The results show that by applying a drowsy strategy, we can achieve a very effective leakage control in the BTB, a 83% leakage reduction at a marginal 0.3% performance overhead. For high performance design, our BAF can also improve BTB's performance scalability at new technologies. In deeply-pipelined designs, BAF design yields a 2.7% (and 8.1%) performance improvement over a conventional 2-cycle (and 3-cycle) BTB, with its energy efficiency fully exploited.