Parallelism improves performance, but it also stresses the computer resources that threads share. In this paper, we propose a low-overhead profiling method based on hardware counters that accurately identifies time-relevant contention locations in a program. These contentions are then mitigated, so that multithreaded tasks run faster once unnecessary contention cycles are removed. In our preliminary experiment with the NAS Parallel Benchmarks (NPB), the contention-searching algorithm finds a severe memory contention loop in the FT code. After contention mitigation, more than 10% of the total cycles are eliminated and the execution time of FT is reduced by 3%.