The functional correctness of shared-memory applications executing on multicore and multiprocessor systems is supported by cache coherence protocols. The correct operation of these applications thus depends on the correctness of the cache coherence transactions. However, verifying the correctness of these transactions is not trivial, since even simple coherence protocols have multiple states. Transitions among the states can fail due to aging of devices or single event upsets. In this paper we present a centralized mechanism for online verification of cache coherence transactions in snoopy-bus multicore systems. We make use of an architecture that we previously proposed for opportunistic Dual Modular Redundancy (DMR). This architecture includes, in addition to the general-purpose cores, a diminutive core called the Sentry Core (SC) that is small and simple and thus can be assumed to be fault-free. Like the other cores, the SC has access to the shared bus and is aware of the cache coherence protocol. It monitors all bus transactions, and by observing the current state of the cache line being addressed and the type of operation (e.g., read or write), it knows the expected next state for that cache line. A deviation from the expected behavior indicates a possible error. Our preliminary experiments show that a significant fraction of the coherence transactions can be verified by our scheme.
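The checking idea described above (current line state plus observed bus operation gives an expected next state) can be illustrated with a minimal sketch. It assumes a textbook three-state MSI snoopy protocol, which the abstract does not specify; the names `State`, `BusOp`, `SNOOP_NEXT`, and `check_transition` are hypothetical and are not taken from the paper.

```python
from enum import Enum


class State(Enum):
    """Cache line states of an assumed textbook MSI protocol."""
    M = "Modified"
    S = "Shared"
    I = "Invalid"


class BusOp(Enum):
    """Bus transactions a snooping cache can observe (assumed set)."""
    BUS_RD = "BusRd"    # another core reads the line
    BUS_RDX = "BusRdX"  # another core requests exclusive (write) access


# Expected next state of a line held by a snooping cache,
# keyed by (current state, observed bus operation).
SNOOP_NEXT = {
    (State.M, BusOp.BUS_RD): State.S,   # supply data, downgrade to Shared
    (State.M, BusOp.BUS_RDX): State.I,  # supply data, invalidate
    (State.S, BusOp.BUS_RD): State.S,   # stay Shared
    (State.S, BusOp.BUS_RDX): State.I,  # invalidate
    (State.I, BusOp.BUS_RD): State.I,   # not cached, nothing changes
    (State.I, BusOp.BUS_RDX): State.I,  # not cached, nothing changes
}


def check_transition(current: State, op: BusOp, observed: State) -> bool:
    """Return True if the observed next state matches the expected one."""
    return SNOOP_NEXT[(current, op)] == observed


if __name__ == "__main__":
    # A Shared line that stays Shared after a BusRdX is flagged as a possible error.
    ok = check_transition(State.S, BusOp.BUS_RDX, State.S)
    print("transition OK" if ok else "possible coherence error")
```

A table lookup keeps the checker simple, in the spirit of a small monitoring core; the actual Sentry Core design and the protocol it checks may differ from this sketch.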
Date of Conference: 3-5 Oct. 2012