Because they are based on large, content-addressable memories, load-store queues (LSQs) present implementation challenges in superscalar processors. In this paper, we propose an alternative LSQ organization that separates the time-critical forwarding functionality from the process of checking that loads received their correct values. Two main techniques are exploited: first, the store-forwarding logic is accessed only by those loads and stores that are likely to be involved in forwarding, and second, the checking structure is banked by address. As a result, the LSQ can be implemented as a collection of small, low-bandwidth structures, yielding an estimated three- to five-fold reduction in LSQ dynamic power.
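The two techniques can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the organization described in the paper: the class `ToyLSQ`, the constants `NUM_BANKS` and `WORD_BYTES`, the set of predicted program counters, and the low-order-bit bank selection are all hypothetical choices made for illustration, and the model ignores timing, store ordering, and partial-word accesses.

```python
from dataclasses import dataclass, field

NUM_BANKS = 4    # hypothetical number of checking banks
WORD_BYTES = 8   # hypothetical access granularity


def bank_of(addr: int) -> int:
    """Pick a checking bank from the low-order word-address bits (an assumption)."""
    return (addr // WORD_BYTES) % NUM_BANKS


@dataclass
class ToyLSQ:
    # PCs of loads/stores predicted to take part in forwarding (hypothetical predictor state).
    likely_forwarders: set = field(default_factory=set)
    # Small forwarding structure: address -> most recent forwarded store value.
    forward_buf: dict = field(default_factory=dict)
    # Checking structure, split into independent banks selected by address.
    banks: list = field(default_factory=lambda: [dict() for _ in range(NUM_BANKS)])
    # Counts accesses to the forwarding structure, to show the filtering effect.
    forward_accesses: int = 0

    def store(self, pc: int, addr: int, value: int) -> None:
        # Every store is recorded in exactly one checking bank.
        self.banks[bank_of(addr)][addr] = value
        # Only stores predicted to forward touch the forwarding structure.
        if pc in self.likely_forwarders:
            self.forward_accesses += 1
            self.forward_buf[addr] = value

    def load(self, pc: int, addr: int, memory: dict) -> int:
        # Only loads predicted to forward search the forwarding structure.
        if pc in self.likely_forwarders:
            self.forward_accesses += 1
            if addr in self.forward_buf:
                return self.forward_buf[addr]
        # All other loads read memory directly and may see a stale value.
        return memory.get(addr, 0)

    def check(self, addr: int, loaded_value: int, memory: dict) -> bool:
        # Off the critical path: consult one bank to verify the loaded value.
        expected = self.banks[bank_of(addr)].get(addr, memory.get(addr, 0))
        return loaded_value == expected
```

In this model, a load whose PC is not in `likely_forwarders` skips the forwarding structure entirely. If it nevertheless depended on an in-flight store, `check` returns `False`, which stands in for the recovery a real pipeline would have to perform. Each check touches only one of the `NUM_BANKS` dictionaries, which is the sense in which banking by address keeps each structure small.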
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.