By Topic

Dcache Warn: an I-fetch policy to increase SMT efficiency

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
F. J. Cazorla ; Dept. d'Arquitectura de Computadors, Univ. Politecnica de Catalunya, Barcelona, Spain ; A. Ramirez ; M. Valero ; E. Fernandez

Summary form only given. Simultaneous multithreading (SMT) processors increase performance by executing instructions from multiple threads simultaneously. These threads share the processor's resources, but also compete for them. In this environment, a thread missing in the L2 cache may allocate a large number of resources for a long time, causing other threads to run much slower than they could. To prevent this problem we should know in advance if a thread is going to miss in the L2 cache. L1 misses are a clear indicator of a possible L2 miss. However, to stall a thread on every L1 miss is too severe, because not all L1 misses lead to an L2 miss, and this would cause an unnecessary stall and resource under-use. Also, to wait until an L2 miss is declared and squash the thread to free up the allocated resources is too expensive in terms of complexity and reexecuted instructions. We propose a novel fetch policy, which we call DWarn. DWarn uses L1 misses as indicators of L2 misses, giving higher priority to threads with no outstanding L1 misses. DWarn acts on L1 misses, before L2 misses happen in a controlled manner to reduce resource under-use and to avoid harming a thread when L1 misses do not lead to L2 misses. Our results show that DWarn outperforms previously proposed policies, in both throughput and fairness, while requiring fewer resources and avoiding instruction reexecution.

Published in:

Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International

Date of Conference:

26-30 April 2004