Helper-thread prefetching is a promising technique for bridging the memory wall on contemporary chip multiprocessor (CMP) platforms. However, synchronization between the application and the helper thread is critical to the performance improvement. Previous research has focused mainly on loop-count-based synchronization, which is suitable only when the main thread has enough computation workload. For the case where the main thread has a small computation workload, this paper presents a multi-parameter helper-thread prefetching model. Using memory-intensive workloads, this paper gives a detailed performance evaluation of data-push (helper) threads on a commercial CMP platform. We also evaluate the applicability of data-push thread prefetching in a multi-process environment. We describe a methodology covering workload selection, measurement metrics, and the effect of hardware prefetcher throttling. The evaluation results using data-push threads on em3d, mcf, and mst show gains of 12%, 24%, and 42%, respectively, when the hardware prefetcher is adjusted properly.