For modern superscalar processors, branch prediction is essential, and there has been significant progress in this field in recent years. For the IBM System ESA/390™ environment, a set of traces exists that represents different kinds of commercial workloads, including operating-system interactions. We have used four of these traces to evaluate a large variety of branch-prediction algorithms in order to identify possible design tradeoffs. One property of the ESA/390 architecture is that, for most branches, target address calculation involves values stored in general-purpose registers. Therefore, not only branch directions but also target addresses must be predicted. When performing prefetch-time prediction, a branch target buffer (BTB) is used to predict the target address. In this paper, all evaluated prediction methods are combined with such a BTB. The resulting size for the BTB is significantly larger than for designs evaluated with SPECmark™ traces. Algorithms for determining branch direction are examined and compared. These algorithms include local branch history methods as well as global history and path history procedures. Finally, combinations of some of these methods, known as hybrid predictors, are evaluated. The path history algorithm we use is an adaptation of a known algorithm, but including it in the hybrid predictor is new. For all of these methods, design parameters are varied to find the tradeoff between the hardware needed and the prediction quality achieved. With the exception of the path predictor, the results are comparable to SPECmark results, although in most cases less history must be used. Another property of the ESA/390 architecture, the absence of specific subroutine call and return instructions, led to the investigation of hardware for self-detecting call/return pairs. A new approach has been developed, and its prediction quality is demonstrated. All of the methods described above use a BTB. 
A BTB performs well if branches have fixed targets. However, about 5% of the branches we consider have changing target addresses. Very recently, an algorithm was proposed for treating such branches using a modification to the BTB approach. We have implemented an enhancement to this method, and the prediction correctness achievable with the enhanced method is shown in the results presented in this paper. Finally, combining several of the investigated schemes increases branch-prediction correctness in commercial environments. However, it remains to be shown whether the tremendous increase in hardware required for their implementation can be justified.
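To illustrate why a BTB handles fixed-target branches well but struggles with changing targets, the following is a minimal sketch and not the design evaluated in this paper. It assumes a direct-mapped buffer in which each entry stores a tag, the last observed target, and a 2-bit saturating direction counter; the names `BranchTargetBuffer` and `run_trace`, the table size, and the trace format are all hypothetical choices made for the example.

```python
class BranchTargetBuffer:
    """Hypothetical direct-mapped BTB: entry = [tag, last target, 2-bit counter]."""

    def __init__(self, num_entries=1024):
        self.num_entries = num_entries
        self.entries = [None] * num_entries

    def _index_tag(self, addr):
        # Low-order part selects the entry, the rest is kept as the tag.
        return addr % self.num_entries, addr // self.num_entries

    def predict(self, addr):
        """Return the predicted target, or None for 'not taken / unknown'."""
        index, tag = self._index_tag(addr)
        entry = self.entries[index]
        if entry is None or entry[0] != tag:
            return None
        # Counter values 2 and 3 mean 'predict taken'.
        return entry[1] if entry[2] >= 2 else None

    def update(self, addr, taken, target=None):
        """Train the BTB with the resolved outcome of the branch."""
        index, tag = self._index_tag(addr)
        entry = self.entries[index]
        if entry is None or entry[0] != tag:
            if taken:
                # Allocate on a taken branch, starting at 'weakly taken'.
                self.entries[index] = [tag, target, 2]
            return
        if taken:
            entry[1] = target                 # remember the most recent target
            entry[2] = min(3, entry[2] + 1)
        else:
            entry[2] = max(0, entry[2] - 1)


def run_trace(btb, trace):
    """Count correct predictions over (addr, taken, target) records."""
    correct = 0
    for addr, taken, target in trace:
        predicted = btb.predict(addr)
        actual = target if taken else None
        if predicted == actual:
            correct += 1
        btb.update(addr, taken, target)
    return correct


# A branch with a fixed target is mispredicted only on its first occurrence.
fixed = [(0x100, True, 0x200)] * 10
fixed_correct = run_trace(BranchTargetBuffer(16), fixed)

# A branch whose target alternates defeats a last-target BTB entirely.
alternating = [(0x100, True, 0x200 if i % 2 == 0 else 0x300) for i in range(10)]
alternating_correct = run_trace(BranchTargetBuffer(16), alternating)
```

In this sketch the direction counter is correct for the alternating branch on every occurrence after allocation, yet each predicted target is stale. That is the case the modified BTB approach for changing targets is meant to address.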
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.