Abstract:
The performance of a task running on a many-core with a distributed shared last-level cache (LLC) strongly depends on two parameters: the power budget needed to guarantee thermally safe operation and the LLC latency. The task's thread-to-core mapping determines both parameters and must trade one off against the other because the two cannot be optimal simultaneously. The arrival and departure of tasks on a many-core deployed in an open system can significantly change its state in terms of available cores and power budgets. Task migrations can therefore be used to keep the many-core operating at peak performance. Furthermore, the relative impact of power budget and LLC latency on a task's performance may change across its execution phases, which calls for migrating it on the fly. We propose PCMig, the first run-time algorithm that increases the performance of a many-core with distributed shared LLC by migrating tasks based on their phases and the many-core's state. PCMig relies on a model that predicts the performance impact of migrations. We propose a performance prediction model based on a lightweight neural network (NN). To serve as a reference, we also propose an analytical model of the many-core that operates on CPI stacks. We demonstrate that the NN-based model achieves higher prediction accuracy at lower overhead than the analytical model. PCMig uses the NN prediction model and increases performance under a thermal constraint by up to 7.3 percent for mixed workloads compared to the architecture-aware state of the art (up to 20 percent for individual applications). This is achieved with a run-time overhead of less than 0.5 percent.
Published in: IEEE Transactions on Computers (Volume: 70, Issue: 10, 01 October 2021)
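
The abstract describes migration decisions driven by a model that predicts the performance impact of each candidate migration. The Python sketch below illustrates only that general idea and is not the PCMig algorithm itself. Every name in it (`pick_migration`, `toy_predict`, `min_gain`), the lookup-table predictor standing in for the NN model, and all numeric values are assumptions made for illustration; none of them come from the paper.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical predictor interface: (task, candidate mapping) -> predicted
# relative performance gain (0.05 means +5 percent). In the paper this role
# is played by a lightweight NN; here it is an arbitrary callable.
Predictor = Callable[[str, str], float]


def pick_migration(
    tasks: Iterable[str],
    candidate_mappings: Iterable[str],
    predict: Predictor,
    min_gain: float = 0.01,  # illustrative threshold, not taken from the paper
) -> Optional[Tuple[str, str]]:
    """Return the (task, mapping) pair with the highest predicted gain,
    or None if no candidate's gain exceeds min_gain."""
    mappings = list(candidate_mappings)  # allow re-iteration per task
    best: Optional[Tuple[str, str]] = None
    best_gain = min_gain
    for task in tasks:
        for mapping in mappings:
            gain = predict(task, mapping)
            if gain > best_gain:
                best, best_gain = (task, mapping), gain
    return best


# Toy stand-in for a trained model: made-up gains for two tasks and two
# candidate core regions (central cores: lower LLC latency; edge cores:
# assumed here to allow a higher thermally safe power budget).
TOY_GAINS = {
    ("A", "center"): 0.02,
    ("A", "edge"): -0.01,
    ("B", "center"): 0.00,
    ("B", "edge"): 0.05,
}


def toy_predict(task: str, mapping: str) -> float:
    return TOY_GAINS[(task, mapping)]


if __name__ == "__main__":
    print(pick_migration(["A", "B"], ["center", "edge"], toy_predict))
```

A greedy single-migration choice keeps the sketch short. A real run-time manager would also have to account for the task's current phase, the many-core's thermal state, and the cost of the migration itself.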