Scheduled System Maintenance:
On May 6th, system maintenance will take place from 8:00 AM - 12:00 PM ET (12:00 - 16:00 UTC). During this time, there may be intermittent impact on performance. We apologize for the inconvenience.
By Topic

Cyclone: a broadcast-free dynamic instruction scheduler with selective replay

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

The purchase and pricing options are temporarily unavailable. Please try again later.
3 Author(s)
Ernst, D. ; Adv. Comput. Archit. Lab, Michigan Univ., Ann Arbor, MI, USA ; Hamel, A. ; Austin, T.

To achieve high instruction throughput, instruction schedulers must be capable of producing high-quality schedules that maximize functional unit utilization while at the same time enabling fast instruction issue logic. Many solutions exist to the scheduling problem, ranging from compile-time to run-time approaches. Compile-time solutions feature fast and simple hardware, but at the expense of conservative schedules. Dynamic schedulers produce high-quality schedules that incorporate run-time information and dependence speculation, but implementing these schedulers requires complex circuits that can slow processor clock speeds. We present the Cyclone scheduler, a novel design that captures the benefits of both compile-and run-time scheduling. Our approach utilizes a list-based single-pass instruction scheduling algorithm, implemented by hardware at run-time in the front end of the processor pipeline. Once scheduled, instructions are injected into a timed queue that orchestrates their entry into execution. To accommodate branch and load/store dependence speculation, the Cyclone scheduler supports a simple selective replay mechanism. We implement this technique by overloading instruction register forwarding to also detect instructions dependent on incorrectly scheduled operations. Detailed simulation analyses suggest that with sufficient queue width, the Cyclone scheduler can rival the instruction throughput of similarly wide monolithic dynamic schedulers. Furthermore, the circuit complexity of the Cyclone scheduler is much more favorable than a broadcast-based scheduler, as our approach requires no global control signals.

Published in:

Computer Architecture, 2003. Proceedings. 30th Annual International Symposium on

Date of Conference:

9-11 June 2003