Skip to Main Content
Fast, accurate, and effective performance analysis is essential for the design of modern processor architectures and improving application performance. Recent trends toward highly concurrent processors make this goal increasingly difficult. Conventional techniques, based on simulators and performance monitors, are ill-equipped to analyze how a plethora of concurrent events interact and how they affect performance. Prior research has shown the utility of critical path analysis in solving this problem. This analysis abstracts the execution of a program with a dependence graph. With simple manipulations on the graph, designers can gain insights into the bottlenecks of a design. This paper extends critical path analysis to understand the performance of a next-generation, high-ILP architecture. The TRIPS architecture introduces new features not present in conventional superscalar architectures. We show how dependence constraints introduced by these features, specifically the execution model and operand communication links, can be modeled with a dependence graph. We describe a new algorithm that tracks critical path information at a fine-grained level and yet can deliver an order of magnitude (30x) improvement in performance over previously proposed techniques. Finally, we provide a breakdown of the critical path for a select set of benchmarks and show an example where we use this information to improve the performance of a heavily-hand-optimized program by as much as 11%.