By Topic

Parallel performance prediction using lost cycles analysis

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Crovella, M.E. ; Dept. of Comput. Sci., Rochester Univ., NY, USA ; LeBlanc, T.J.

Most performance debugging and tuning of parallel programs is based on the “measure-modify” approach, which is heavily dependent on detailed measurements of programs during execution. This approach is extremely time consuming and does not lend itself to predicting performance under varying conditions. Analytic modeling and scalability analysis provide predictive power, but are not widely used in practice, due primarily to their emphasis on asymptotic behavior and the difficulty of developing accurate models that work for real world programs. We describe a set of tools for performance tuning of parallel programs that bridges this gap between measurement and modeling. The approach is based on lost cycles analysis, which involves measurement and modeling of all sources of overhead in a parallel program. We first describe a tool for measuring overheads in parallel programs that we have incorporated onto the runtime environment for Fortran programs on the Kendall Square KSR1. We then describe a tool that fits these overhead measurements to analytic forms. We illustrate the use of these tools by analyzing the performance tradeoffs among parallel implementations of 2D FFT. These examples show how our tools enable programmers to develop accurate performance models of parallel applications without requiring extensive performance modeling expertise

Published in:

Supercomputing '94., Proceedings

Date of Conference:

14-18 Nov 1994