On improving performance and energy profiles of sparse scientific applications

Malkowski, K.; Ingyu Lee; Raghavan, P.; Irwin, M.J.
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
This paper appears in: Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Publication Date: 25-29 April 2006
On page(s): 8 pp.
ISBN: 1-4244-0054-6
Digital Object Identifier: 10.1109/IPDPS.2006.1639589
Current Version Published: 2006-06-26

Abstract
In many scientific applications, the majority of the execution time is spent within a few basic sparse kernels such as sparse matrix-vector multiplication (SMV). Such sparse kernels can utilize only a fraction of the available processing speed because of their relatively large number of data accesses per floating-point operation and their limited data locality and data reuse. Algorithmic changes and tuning of codes through blocking and loop unrolling schemes can improve performance, but such tuned versions are typically not available in benchmark suites such as SPEC CFP 2000. In this paper, we consider SMV kernels with different levels of tuning that are representative of this application space. We emulate certain memory subsystem optimizations using SimpleScalar and Wattch to evaluate improvements in performance and energy metrics. We also characterize how such an evaluation can be affected by the interplay between code tuning and memory subsystem optimizations. Our results indicate that the optimizations reduce execution time by over 40% and energy by over 85% when used with power control modes of CPUs and caches. Furthermore, the relative impact of the same set of memory subsystem optimizations can vary significantly depending on the level of code tuning. Consequently, it may be appropriate to augment traditional benchmarks with tuned kernels typical of high-performance sparse scientific codes to enable comprehensive evaluations of future systems.
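
To illustrate why an SMV kernel has many data accesses per floating-point operation, the following is a minimal, untuned sketch in Python. The abstract does not show the paper's kernels or name a storage format; the compressed sparse row (CSR) layout, the function name `csr_matvec`, and the small test matrix are all assumptions made for illustration.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A held in CSR form.

    values  -- nonzero entries of A, stored row by row
    col_idx -- column index of each entry in `values`
    row_ptr -- row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    (CSR is an assumed format; the abstract does not specify one.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Each nonzero costs two flops (multiply, add) but three loads:
            # values[k], col_idx[k], and the indirect access x[col_idx[k]].
            # The indirect access is what limits locality and reuse of x.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = csr_matvec(values, col_idx, row_ptr, x)
```

Tuned variants of the kind the abstract mentions (blocking, loop unrolling) would restructure the inner loop to reuse entries of `x` and reduce index loads; this sketch deliberately leaves the loop untuned.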
