Automatic Microprocessor Performance Bug Detection | IEEE Conference Publication | IEEE Xplore

Abstract:

Processor design validation and debug is a difficult and complex task, which consumes the lion’s share of the design process. Design bugs that affect processor performance rather than its functionality are especially difficult to catch, particularly in new microarchitectures. This is because, unlike functional correctness, the correct performance of a new microarchitecture on complex, long-running benchmarks is typically not deterministically known. Thus, when benchmarking new microarchitectures, performance teams may assume that the design is correct whenever its performance exceeds that of the previous generation, even though significant performance regressions exist in the design. In this work we present a two-stage, machine learning-based methodology that detects the existence of performance bugs in microprocessors. Our results show that our best technique detects 91.5% of microprocessor core performance bugs whose average IPC impact across the studied applications is greater than 1% relative to a bug-free design, with zero false positives. When evaluated on memory system bugs, our technique achieves 100% detection with zero false positives. Moreover, the detection is automatic, requiring very little performance engineer time.
Date of Conference: 27 February 2021 - 03 March 2021
Date Added to IEEE Xplore: 22 April 2021
Conference Location: Seoul, Korea (South)

I. Introduction

Verification and validation are typically the largest component of the design effort of a new processor. The effort can be broadly divided into two distinct disciplines: functional verification and performance verification. The former is concerned with design correctness and has rightfully received significant attention in the literature. Although challenging, it benefits from the availability of known-correct output to compare against. In contrast, performance verification, which is typically concerned with generation-over-generation workload performance improvement, suffers from the lack of a known-correct output to check against. Given the complexity of computer systems, it is often extremely difficult to accurately predict the expected performance of a given design on a given workload. Performance bugs are nevertheless a critical concern as the cadence of process technology scaling slows, because they rob processor designs of the primary performance and efficiency gains to be had through improved microarchitecture.
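The pitfall of generation-over-generation checking can be illustrated with a minimal sketch. All IPC values and application names below are hypothetical, invented only for illustration, and the function names are assumptions rather than part of the methodology. The sketch also assumes access to a bug-free reference design, which is exactly what performance teams lack in practice.

```python
from statistics import mean

# Hypothetical per-application IPC values, invented purely for illustration.
prev_gen = {"app_a": 1.00, "app_b": 0.80, "app_c": 1.50}   # previous generation
bug_free = {"app_a": 1.20, "app_b": 1.00, "app_c": 1.80}   # new design, no bug
buggy    = {"app_a": 1.14, "app_b": 0.98, "app_c": 1.71}   # new design, with bug


def beats_previous_generation(prev, new):
    """Naive check: the new design is faster than the old one on every app."""
    return all(new[app] > prev[app] for app in prev)


def avg_ipc_impact(reference, observed):
    """Mean relative IPC loss of `observed` versus `reference` across apps."""
    return mean((reference[app] - observed[app]) / reference[app]
                for app in reference)


naive_ok = beats_previous_generation(prev_gen, buggy)
impact = avg_ipc_impact(bug_free, buggy)

print(f"passes generation-over-generation check: {naive_ok}")
print(f"average IPC impact vs. bug-free design: {impact:.1%}")
```

With these made-up numbers the buggy design passes the naive check on every application, yet its average IPC impact is well above the 1% level used in the abstract to characterize the bugs of interest.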
