Skip to Main Content
Large-scale software systems handle increasingly larger workloads by implementing highly concurrent and distributed design patterns. The thread pool pattern uses pools of pre-existing and reusable threads to limit thread lifecycle over-head (thread creation and destruction) and resource thrashing (thread proliferation). However, these advantages are weighed against performance issues caused by concurrency risks, like synchronization errors or deadlock, and thread pool-specific risks, like poorly tuned pool size or thread leakage. Detecting these performance issues during load testing requires a thorough understanding of how thread pools behave, yet most performance analysts have limited knowledge of the system and are flooded with terabytes of data from load tests. We propose a methodology to identify threads with performance deviations in thread pools. Our methodology ranks threads based on the dissimilarity of their resource usage metrics. A case study on a large-scale industrial software system shows that our methodology can identify threads with performance deviations with an average precision of 100% and an average recall of 76.61%. Our methodology performs very well when ranking long-lived deviations, such as memory leaks, but more work is needed to rank short-lived deviations, such as CPU spikes.