By Topic

Improving last level cache locality by integrating loop and data transformations

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Wei Ding ; Pennsylvania State Univ., University Park, PA, USA ; Kandemir, M.

Motivated by the observation that most existing data locality optimizations do not specifically target shared last-level caches of emerging multicores and that even multicore-specific locality-oriented techniques employ either loop or data layout optimizations but not both, in this paper we present an integrated loop and data layout optimization strategy, with the goal of improving the last-level cache performance of multicores that execute multithreaded applications. We present a detailed mathematical formulation of our locality optimization strategy and present experimental data from our current implementation. Our results, collected using 14 application programs, clearly show that the proposed integrated approach is very successful in practice, and outperforms both pure loop optimization and pure data layout optimization based alternatives. Our results also indicate that the savings achieved increase with increased core count and larger data set sizes.

Published in:

Computer-Aided Design (ICCAD), 2012 IEEE/ACM International Conference on

Date of Conference:

5-8 Nov. 2012