Extending the Performance Analysis Tool Box: Multi-stage CPI Stacks and FLOPS Stacks | IEEE Conference Publication | IEEE Xplore

Abstract:

CPI stacks are an intuitive way to visualize processor core performance bottlenecks. However, they often do not provide a full view of all bottlenecks, because stall events can occur concurrently (e.g., an instruction cache miss and a data cache miss). To avoid double-counting penalties, typically only one of the events is selected, which means that information about the non-chosen stall events is lost. Furthermore, we show that there is no single correct CPI stack: stall penalties can be hidden, can overlap, or can cause second-order effects, making total CPI more complex than just a sum of components. Instead of showing a single CPI stack, we propose to measure multiple CPI stacks during program execution: a CPI stack at each stage of the processor pipeline. This representation reveals all performance bottlenecks and provides a more complete view of the performance of an application. Additionally, we propose FLOPS stacks, targeted at HPC performance analysis. FLOPS stacks are a variant of CPI stacks at the issue stage, but instead of considering all instructions, they focus specifically on floating-point performance, which is the common definition of useful work in the HPC domain. Multi-stage CPI stacks and FLOPS stacks are easy to collect. We show that they can be included in a simulator with negligible slowdown, and we provide recommendations on how to include them in a hardware core.
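
As a rough illustration of the idea of one CPI stack per pipeline stage, the following Python sketch builds stacks from a toy per-cycle trace. It is not the paper's accounting mechanism: the trace format, the function name cpi_stacks, the stage names, the stall labels, and the rule of charging each unused slot of a stage to a single stall reason are all assumptions made for this example.

```python
from collections import defaultdict

def cpi_stacks(trace, width, n_instructions):
    """Build one CPI stack per pipeline stage from a toy per-cycle trace.

    Assumed (hypothetical) trace format: one dict per cycle, mapping a
    stage name to (instructions_processed, stall_reason_or_None).
    Each cycle contributes the used fraction of the stage's width to
    "base" and the unused fraction to the stall reason blamed at that stage.
    """
    cycles = defaultdict(lambda: defaultdict(float))
    for cycle in trace:
        for stage, (n_done, reason) in cycle.items():
            used = n_done / width
            cycles[stage]["base"] += used
            if n_done < width:
                # Unused slots are charged to the stage's own stall reason.
                cycles[stage][reason or "other"] += 1.0 - used
    # Normalize attributed cycles by instruction count to get CPI components.
    return {
        stage: {name: c / n_instructions for name, c in comps.items()}
        for stage, comps in cycles.items()
    }

# Toy 2-wide core, 4 cycles, 4 instructions: the front end and the back end
# stall in different cycles and for different reasons.
trace = [
    {"dispatch": (2, None), "commit": (0, "dcache_miss")},
    {"dispatch": (0, "icache_miss"), "commit": (2, None)},
    {"dispatch": (2, None), "commit": (0, "dcache_miss")},
    {"dispatch": (0, "icache_miss"), "commit": (2, None)},
]
stacks = cpi_stacks(trace, width=2, n_instructions=4)
for stage, comps in stacks.items():
    print(stage, comps)
```

In this toy trace both stacks sum to the same total CPI of 1.0, but the dispatch stack attributes the stall cycles to the instruction cache while the commit stack attributes them to the data cache. A single stack taken at only one stage would show only one of the two bottlenecks.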
Date of Conference: 02-04 April 2018
Date Added to IEEE Xplore: 28 May 2018
Conference Location: Belfast, UK
