By Topic

Optimizing memory accesses for spatial computation

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Seth, M.B. ; Dept. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA ; Goldstein, S.C.

We present the internal representation and optimizations used by the CASH compiler for improving the memory parallelism of pointer-based programs. CASH uses an SSA-based representation for memory, which compactly summarizes both control-flow- and dependence information. In CASH, memory optimization is a four-step process: (1) first an initial, relatively coarse representation of memory dependences is built; (2) next, unnecessary memory dependences are removed using dependence tests; (3) third, redundant memory operations are removed (4) finally, parallelism is increased by pipelining memory accesses in loops. While the first three steps above are very general, the loop pipelining transformations are particularly applicable for spatial computation, which is the primary target of CASH. The redundant memory removal optimizations presented are: load/store hoisting (subsuming partial redundancy elimination and common-subexpression elimination), load-after-store removal, store-before-store removal (dead store removal) and loop-invariant load motion. One of our loop pipelining transformations is a new form of loop parallelization, called loop decoupling. This transformation separates independent memory accesses within a loop body into several independent loops, which are allowed dynamically to slip with respect to each other A new computational primitive, a token generator is used to dynamically control the amount of slip, allowing maximum freedom, while guaranteeing that no memory dependences are violated.

Published in:

Code Generation and Optimization, 2003. CGO 2003. International Symposium on

Date of Conference:

23-26 March 2003