The memory wall is one of the most important problems in modern computer systems: it limits system performance even when the processor is powerful. The emergence of multi-core processors has further exacerbated the problem, and the increasing use of linked data structures in applications makes memory access latency worse. This paper discusses two helper thread prefetching techniques for chip multiprocessors (CMP), which prefetch the demanded data into the shared cache in advance to hide the long memory access latency, and evaluates the two algorithms on three fundamental aspects of data prefetching: coverage, timeliness, and accuracy. The performance evaluation shows that, with acceptable coverage, increasing timeliness and accuracy can improve the performance of threaded prefetching. However, improving accuracy by issuing prefetches earlier introduces more uncertainty into the prefetching and hence reduces coverage. Performance results for four benchmark applications under the two thread prefetching techniques are also provided.
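The abstract names coverage, timeliness, and accuracy but does not define them. The sketch below uses definitions that are common in the prefetching literature; they are an assumption here, not necessarily the ones the paper adopts. The counter names (`baseline_misses`, `prefetches_issued`, `useful_prefetches`, `timely_prefetches`) are hypothetical, and the numbers in the usage note are illustrative only, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class PrefetchCounts:
    """Hypothetical counters collected from a simulation or profiling run."""
    baseline_misses: int    # demand misses of the main thread with no prefetching
    prefetches_issued: int  # all prefetches issued by the helper thread
    useful_prefetches: int  # prefetched lines later referenced by the main thread
    timely_prefetches: int  # useful prefetches that arrived before the demand access


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def coverage(c: PrefetchCounts) -> float:
    """Assumed definition: fraction of baseline misses removed by prefetching."""
    return _ratio(c.useful_prefetches, c.baseline_misses)


def accuracy(c: PrefetchCounts) -> float:
    """Assumed definition: fraction of issued prefetches that were actually used."""
    return _ratio(c.useful_prefetches, c.prefetches_issued)


def timeliness(c: PrefetchCounts) -> float:
    """Assumed definition: fraction of useful prefetches that arrived in time."""
    return _ratio(c.timely_prefetches, c.useful_prefetches)
```

For instance, with made-up counts `PrefetchCounts(1000, 800, 600, 450)`, 600 of the 1000 baseline misses are covered, 600 of the 800 issued prefetches are useful, and 450 of those 600 arrive before the main thread needs them.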