Store Buffer Design for Multibanked Data Caches

Authors: Torres, E. (Dept. de Inf. e Ing. de Sist., Univ. de Zaragoza, Zaragoza, Spain); Ibanez, P.; Vinals-Yufera, V.; Llaberia, J.M.

This paper focuses on how to design a store buffer (STB) well suited to first-level multibanked data caches. The goal is to forward data from in-flight stores to dependent loads within the latency of a cache bank. Taking into account the store lifetime in the processor pipeline and the data forwarding behavior, we propose a two-level STB design in which forwarding is done speculatively from a distributed first-level STB made of extremely small banks, whereas a centralized second-level STB enforces correct store-load ordering. In addition, the two-level STB admits two simplifications that leave performance almost unchanged. For the second-level STB, we suggest removing its data forwarding capability, while for the first-level STB it is possible to: 1) remove the instruction age checking and 2) compare only the least significant address bits. Experimentation covers both integer and floating-point codes executing on dynamically scheduled processors. Following our guidelines and running SPEC-2K on an 8-way processor, a two-level STB with four 8-entry banks in the first level performs similarly to an ideal, single-level STB with 128-entry banks working at the first-level cache latency. We also show that the proposed two-level design is suitable for a memory-latency-tolerant processor.
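The division of labor described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' implementation: the class `TwoLevelSTB`, the helper `bank_of`, the 8-bit partial compare width (`LOW_BITS`), the bank-selection bits, and the value-based misspeculation check are all hypothetical choices made for illustration. Only the four 8-entry banks, the absence of age checking and full address compare in the first level, and the absence of forwarding in the second level come from the abstract.

```python
from collections import deque
from dataclasses import dataclass

LOW_BITS = 8   # assumed width of the partial address compare in the first level
NUM_BANKS = 4  # four first-level banks, as in the evaluated configuration
BANK_SIZE = 8  # eight entries per first-level bank


@dataclass
class Store:
    age: int   # program-order sequence number
    addr: int
    data: int


def bank_of(addr):
    # Hypothetical bank mapping: a few address bits above the word offset.
    return (addr >> 3) % NUM_BANKS


class TwoLevelSTB:
    def __init__(self, cache):
        self.cache = cache  # dict standing in for the data cache
        # First level: tiny per-bank buffers; the oldest entry falls out when full.
        self.l1 = [deque(maxlen=BANK_SIZE) for _ in range(NUM_BANKS)]
        # Second level: centralized, keeps full addresses and ages, never forwards.
        self.l2 = []

    def store(self, age, addr, data):
        st = Store(age, addr, data)
        self.l1[bank_of(addr)].append(st)
        self.l2.append(st)

    def _speculative(self, addr):
        # No age check and only the low address bits are compared:
        # the most recently inserted partial match wins.
        mask = (1 << LOW_BITS) - 1
        for st in reversed(self.l1[bank_of(addr)]):
            if (st.addr & mask) == (addr & mask):
                return st.data
        return self.cache.get(addr, 0)

    def _correct(self, age, addr):
        # Ordering check: youngest store older than the load, full address match.
        older = [st for st in self.l2 if st.age < age and st.addr == addr]
        if older:
            return max(older, key=lambda st: st.age).data
        return self.cache.get(addr, 0)

    def load(self, age, addr):
        # Returns (architecturally correct value, whether speculation was wrong).
        # Comparing values is a simplification of this sketch.
        spec = self._speculative(addr)
        good = self._correct(age, addr)
        return good, spec != good
```

In this model, a load that hits a true older store in its bank is forwarded without misspeculation. A load to a different address with the same low bits, or a load that sees a younger store inserted out of order, is flagged by the second level and would need recovery in a real pipeline.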

Published in: IEEE Transactions on Computers (Volume 58, Issue 10)

Date of Publication: Oct. 2009
