Modern processors rely heavily on broadcast networks to bypass instruction results to dependent instructions in the pipeline. However, as clock rates increase, architectures get wider, and pipelines get deeper, broadcasting becomes more complex, slower, and more difficult to implement. Shrinking feature sizes compound the problem, because communication over wires becomes slower relative to transistor switching. We examine the fundamental needs of bypass networks and propose a method for classifying these inter-ALU networks by how operands are routed from producers to consumers. We then propose a fine-grained, point-to-point routed inter-ALU network (RIAN) and evaluate it at both the circuit and architectural levels. RIAN delivers instruction throughput equal to or higher than that of a full bypass network, at higher speeds and with fewer wires.
Published in: Proceedings of the 21st International Conference on Computer Design, 2003
Date of Conference: 13-15 Oct. 2003