An Analytic Framework for Detailed Resource Profiling in Large and Parallel Programs and Its Application for Memory Use

1 Author(s)
Finkler, U. ; IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA

Profiling is an essential and widely used technique to understand the resource use of applications. For example, the memory use of large applications is becoming an important cost factor. Very large systems are typically sized to accommodate designated tasks, and thus, the price, as well as cache and TLB efficiency, depends significantly on the memory footprint of the target applications. Importantly, the increasing use of multicore systems magnifies the problem since memory use grows with the number of parallel tasks. Additionally, the presence of multiple tasks or threads makes it harder to correlate resource use with the program structure. Thus, tools that correlate resource use with program structure and report quantitative error margins are essential for optimizing the resource use of complex software applications. While efficient tools for the profiling of execution time are available, the choices for detailed profiling of memory use or other hardware resources are very limited. We were unable to find tools that provided sufficiently accurate insight into, e.g., memory use without adding unacceptable overhead in memory use and execution time for the performance analysis of very large applications. In this paper, we present a highly efficient probabilistic method for profiling that provides detailed resource usage information R_λ(t) indexed by the full location descriptor λ (e.g., process id, thread id, and call chain) and time t. Importantly, we provide an analytical framework that provides error estimates and makes it possible to analyze and quantitatively optimize a wide variety of profiling scenarios. We employed the probabilistic approach to implement a memory profiling tool that adds minimal overhead and does not require recompilation or relinking. The tool provides the memory use M_λ(t) for all location descriptors λ over the execution time for single and multithreaded programs.
Experimental results confirm that execution time and memory overhead are less than 10 percent of the unprofiled, optimized execution. Importantly, the technique is sufficiently general to be applicable to profiling of other hardware resources, such as cache or TLB misses, over time for all location descriptors with similarly low overhead and across multiple processes, threads, and processors.
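
The abstract does not spell out the sampling scheme, so the following is only a minimal sketch of one generic way a probabilistic profiler can attribute memory to location descriptors; it is not the paper's algorithm. It assumes each allocation event is recorded independently with a fixed probability and that recorded sizes are scaled by the inverse of that probability, which gives an unbiased estimate of the bytes allocated per descriptor. The function name `estimate_memory_use`, the `(descriptor, size_bytes)` event format, and the `sample_prob` parameter are all hypothetical.

```python
import random
from collections import defaultdict


def estimate_memory_use(events, sample_prob, seed=0):
    """Estimate bytes allocated per location descriptor from sampled events.

    events      -- iterable of (descriptor, size_bytes) pairs; a descriptor is
                   any hashable value, e.g. (process id, thread id, call chain).
                   This event format is an assumption made for illustration.
    sample_prob -- probability in (0, 1] that a single event is recorded.
    seed        -- seed for the pseudo-random generator, for reproducibility.
    """
    if not 0.0 < sample_prob <= 1.0:
        raise ValueError("sample_prob must be in (0, 1]")

    rng = random.Random(seed)
    estimate = defaultdict(float)

    for descriptor, size_bytes in events:
        # Record the event with probability sample_prob; skipped events
        # cost almost nothing, which is where the low overhead comes from.
        if rng.random() < sample_prob:
            # Scale by 1 / sample_prob so the expected value of the
            # estimate equals the true total for this descriptor.
            estimate[descriptor] += size_bytes / sample_prob

    return dict(estimate)
```

With `sample_prob=1.0` every event is recorded and the result is exact; smaller values trade accuracy for lower overhead, which is the kind of trade-off an analytical error framework is meant to quantify.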

Published in:

Computers, IEEE Transactions on (Volume: 59, Issue: 3)