Skip to Main Content
This uses Wattch and the SPEC 2000 integer and floating-point benchmarks to explore the role of branch predictor organization in power/energy/performance trade offs for processor design. Even though the direction predictor by itself represents less than 1 percent of the processor's total power dissipation, prediction accuracy is nevertheless a powerful lever on processor behavior and program execution time. A thorough study of branch predictor organizations shows that, as a general rule, to reduce overall energy consumption in the processor, it is worthwhile to spend more power in the branch predictor if this results in more accurate predictions that improve running time. This not only improves performance, but can also improve the energy-delay product by up to 20 percent. Three techniques, however, can reduce power dissipation without harming accuracy. Banking reduces the portion of the branch predictor that is active at any one time. A new on-chip structure, the prediction probe detector (PPD), uses predecode bits to entirely eliminate unnecessary predictor and branch target buffer (BTB) accesses. Despite the extra power that must be spent accessing it, the PPD reduces local predictor power and energy dissipation by about 31 percent and overall processor power and energy dissipation by 3 percent. These savings can be further improved by using profiling to annotate branches, identifying those that are highly biased and do not require static prediction. Finally, we explore the effectiveness of a previously proposed technique, pipeline gating, and find that, even with adaptive control based on recent predictor accuracy, pipeline gating yields little or no energy savings.
Date of Publication: Feb 2004