Increasing the lookahead of multilevel branch prediction

1 Author(s)
Veidenbaum, A.V. ; University of California Irvine

Many techniques have been proposed for tolerating memory latency in future systems, including prefetching and Decoupled-Access DRAM (DA-DRAM) architectures. In order for these techniques to be effective they need to have a sufficient lookahead, i.e., to be far enough ahead of processor execution in requesting data. Branch prediction has been utilized before to achieve this, but only small degrees of lookahead have been studied. This is not enough for latencies of tens to hundreds of clock cycles. In particular, DA-DRAM systems need a higher degree of decoupling because the access processor is placed in memory. This paper investigates the potential of multilevel prediction for increased lookahead, beyond the 2-3 level predictors that have been reported. A new multilevel predictor is defined and up to 8 levels of lookahead are studied. We find that the predictor has a high degree of accuracy, over 80%, even for integer benchmarks such as gcc and Oracle. Branch outcome as well as the address of the next instruction after the branch are predicted. A comparison of single- and multiple-lookup prediction is presented. In general, accuracy decreases with lookahead. However, given sufficient predictor size it can be made quite high. With the DA-DRAM predictor located in DRAM, a purely software predictor with storage in the DRAM can be implemented. A similar predictor can be used for instruction and L2 cache prefetching or trace scheduling.
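The abstract does not specify the predictor's structure, so the following is only a minimal sketch of the general idea of multiple-lookup lookahead, not the predictor defined in the paper. It assumes a table indexed by branch address whose entries hold a 2-bit saturating counter and the address of the next branch seen after each outcome; lookahead is obtained by chaining lookups, feeding each predicted next-branch address back into the table. The names `Entry`, `LookaheadPredictor`, `update`, and `predict` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    # Assumed entry layout (illustrative, not taken from the paper).
    counter: int = 1                      # 2-bit saturating counter; >= 2 predicts taken
    next_taken: Optional[int] = None      # next branch address seen after a taken outcome
    next_not_taken: Optional[int] = None  # next branch address seen after a not-taken outcome


class LookaheadPredictor:
    """Hypothetical multiple-lookup predictor: one table lookup per lookahead level."""

    def __init__(self) -> None:
        self.table = {}

    def update(self, pc: int, taken: bool, next_pc: int) -> None:
        """Train on a resolved branch at pc and the branch address that followed it."""
        entry = self.table.setdefault(pc, Entry())
        if taken:
            entry.counter = min(3, entry.counter + 1)
            entry.next_taken = next_pc
        else:
            entry.counter = max(0, entry.counter - 1)
            entry.next_not_taken = next_pc

    def predict(self, pc: int, levels: int) -> list:
        """Return up to `levels` tuples of (branch pc, predicted outcome, predicted next pc)."""
        path = []
        for _ in range(levels):
            entry = self.table.get(pc)
            if entry is None:
                break  # no history for this branch, so lookahead stops here
            taken = entry.counter >= 2
            next_pc = entry.next_taken if taken else entry.next_not_taken
            path.append((pc, taken, next_pc))
            if next_pc is None:
                break  # predicted direction has never been observed
            pc = next_pc
        return path
```

In this sketch each additional level depends on the previous level's prediction being correct, which is one plausible reason accuracy would fall as lookahead grows. A single-lookup variant would instead store several future outcomes in one entry; that design is not shown here.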

Published in:

Innovative Architecture for Future Generation High-Performance Processors and Systems, 1998

Date of Conference:

24-24 Oct. 1998