Current trends in chip multiprocessors (CMPs) indicate that the core count will increase in the near future. One of the main performance limiters of these forthcoming microarchitectures is the latency of, and high demand on, the on-chip network and off-chip memory communication. A key trade-off when searching for an optimal cache hierarchy is the sharing degree of the cache space and its on-die distribution. Several techniques have appeared recently that optimize these parameters to improve performance. This work provides insight into the most promising configurations for tiled microarchitectures and shows the advantages and limitations of each in terms of performance and energy efficiency. This paper extends previous work by providing a complete study that evaluates different network topologies, single-threaded and multithreaded benchmarks, and single-program and multiprogrammed execution. In all these studies, Distributed Cooperative Caching proves to be a promising alternative to traditional configurations for chip multiprocessors, providing a scalable and energy-efficient solution.