Implementing shared memory consistency models on top of hardware caches gives rise to the well-known cache coherence problem. The standard solution involves implementing coherence protocols in hardware, an approach with some design complexity, hardware costs, and restrictions on interconnect behavior. However, for some memory consistency models, an alternative is to enforce coherence in the software implementation of synchronization primitives, using software-controlled invalidations and forced writebacks. This requires minimal hardware support but gives less selective enforcement, which affects performance. This paper proposes a novel hybrid software-hardware coherence mechanism. In this scheme, software is responsible for triggering the coherence actions (self-invalidations and writebacks) at appropriate times, while hardware uses Bloom filters to perform more selective self-invalidations. We evaluate the proposed scheme on applications from two different domains: the SPLASH-2 scientific and ALP multimedia benchmarks. Experimental results show that while the software-only coherence scheme shows less performance degradation than expected, it still unacceptably degrades performance for some of the benchmarks. Filtering out unnecessary invalidations improves the worst-case performance by as much as 93 percent, and brings the performance of the hybrid scheme within five percent of full hardware coherence for 10 out of 13 benchmarks, on a 32-core CMP with a shared L2 cache.
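
To illustrate the filtering idea in general terms, the following minimal Python sketch models a self-invalidation that consults a Bloom filter of written line addresses. It is not the hardware design evaluated in the paper: the names `BloomFilter` and `self_invalidate`, the filter size, the hash construction, and the choice to track written cache-line addresses are all assumptions made for illustration. The property it relies on is that a Bloom filter has no false negatives, so a line that was written is always invalidated, while a false positive only costs an unnecessary invalidation.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over cache-line addresses (sizes are illustrative)."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector packed into a Python int

    def _positions(self, line_addr):
        # Derive num_hashes bit positions from salted SHA-256 digests
        # (a software stand-in for hardware hash functions; an assumption).
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{line_addr}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.num_bits

    def add(self, line_addr):
        for pos in self._positions(line_addr):
            self.bits |= 1 << pos

    def may_contain(self, line_addr):
        # True for every added address; may also be True for others (false positive).
        return all((self.bits >> pos) & 1 for pos in self._positions(line_addr))


def self_invalidate(cached_lines, written_filter=None):
    """Return the set of cached line addresses that survive a self-invalidation.

    Without a filter, every line is dropped (the unselective, software-only case).
    With a filter, only lines that may have been written elsewhere are dropped.
    """
    if written_filter is None:
        return set()
    return {addr for addr in cached_lines if not written_filter.may_contain(addr)}


# Hypothetical scenario: two lines were written by another core.
written = BloomFilter()
for addr in (0x1000, 0x1040):
    written.add(addr)

cache = {0x1000, 0x1040, 0x2000, 0x2040, 0x3000}
kept = self_invalidate(cache, written)
```

In this toy model, `kept` never contains the two written addresses, and the remaining lines stay cached unless they collide in the filter. Calling `self_invalidate(cache)` without a filter discards everything, which corresponds to the less selective enforcement described above.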