Improving locality using loop and data transformations in an integrated framework

4 Author(s)
Kandemir, M.; Dept. of Electr. Eng. & Comput. Sci., Northwestern Univ., Evanston, IL, USA; Choudhary, A.; Ramanujam, J.; Banerjee, P.

This paper presents a new integrated compiler framework for improving the cache performance of scientific applications. In addition to applying loop transformations, the method includes data layout optimizations, i.e., those that change the memory layouts of data structures (arrays in this case). A key characteristic of this approach is that loop transformations are used to improve temporal locality while data layout optimizations are used to improve spatial locality. The framework was applied to sixteen loop nests from several benchmarks and math libraries; performance was measured with a cache simulator, and actual execution times were measured on a single node of an SGI Origin 2000 distributed-shared-memory machine. The results demonstrate that this approach is very effective in improving locality and outperforms current solutions that use either loop or data transformations alone. We expect that our solution will also enable better register usage due to increased temporal locality in the innermost loop, and that it will help in eliminating false sharing on multiprocessors due to exploiting spatial locality in the innermost loop.
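
The division of labor described in the abstract can be illustrated with a small hand-written sketch in C. It is not taken from the paper: the loop nest, the array names (`u`, `v`, `w`, `wt`), the size `N`, and the helper functions are all hypothetical, and the transformation is applied by hand rather than by a compiler. Loop interchange makes `u[j]` invariant in the innermost loop (temporal locality) and gives `v` unit-stride accesses in C's row-major layout; `w` would still be walked with stride `N`, so its layout is changed by storing it transposed (spatial locality).

```c
#include <stddef.h>

#define N 64  /* illustrative problem size (assumption) */

/* Original nest: j is innermost, so u[j] changes on every iteration
   and v[j][i] is accessed with stride N; only w[i][j] is unit-stride. */
void nest_original(double u[N], double v[N][N], double w[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            u[j] += v[j][i] + w[i][j];
}

/* Layout change for w: wt[j][i] holds w[i][j] (column-major storage of w). */
void to_column_major(double wt[N][N], double w[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            wt[j][i] = w[i][j];
}

/* After loop interchange (i innermost) and the layout change for w:
   u[j] is loop-invariant in the inner loop and can stay in a register,
   while v[j][i] and wt[j][i] are both unit-stride. */
void nest_transformed(double u[N], double v[N][N], double wt[N][N]) {
    for (size_t j = 0; j < N; j++) {
        double acc = u[j];
        for (size_t i = 0; i < N; i++)
            acc += v[j][i] + wt[j][i];
        u[j] = acc;
    }
}

/* Runs both versions on small integer-valued data (exact in double)
   and returns 1 if they produce identical results, 0 otherwise. */
int layouts_agree(void) {
    static double u1[N], u2[N], v[N][N], w[N][N], wt[N][N];
    for (size_t i = 0; i < N; i++) {
        u1[i] = u2[i] = (double)i;
        for (size_t j = 0; j < N; j++) {
            v[i][j] = (double)(i + 2 * j);
            w[i][j] = (double)(3 * i) - (double)j;
        }
    }
    to_column_major(wt, w);
    nest_original(u1, v, w);
    nest_transformed(u2, v, wt);
    for (size_t j = 0; j < N; j++)
        if (u1[j] != u2[j])
            return 0;
    return 1;
}
```

Calling `layouts_agree()` from a `main` function checks that the transformed nest computes the same values as the original. In this sketch neither transformation alone gives every reference good locality in the innermost loop, which is the situation the integrated approach targets.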

Published in:

Microarchitecture, 1998. MICRO-31. Proceedings. 31st Annual ACM/IEEE International Symposium on

Date of Conference:

30 Nov-2 Dec 1998