We present a framework for a distributed, low-cost implementation of synchronization mechanisms for embedded shared-memory multiprocessors. The proposed architecture implements queued-lock semantics in a completely decentralized manner through low-cost, distributed synchronization controllers that execute distributed synchronization management protocols. The proposed approach achieves three major benefits. First, it completely eliminates the overwhelming bus contention traffic that arises when multiple cores compete for a synchronization variable. Second, it exhibits extremely low best-case lock acquisition latency (with zero bus transactions). Third, it opens multiple avenues for high energy efficiency, because the local synchronization controllers can determine, without any bus transactions or local cache spinning, exactly when a lock becomes available to, or a barrier is enabled at, the local processor. This makes it possible for the system software or the thread library to employ various low-power policies.
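As a rough illustration of the queued-lock semantics mentioned above, the following minimal sketch models per-core controllers that each keep a local replica of a FIFO lock queue, so that a core can learn it has been granted the lock by consulting only its own controller. All names here (`SyncController`, `System`, `observe_acquire`, `observe_release`, `demo`) are hypothetical, and the in-process broadcast loop is an assumption of this sketch. It does not reproduce the protocol, hardware interface, or bus-traffic properties of the proposed architecture.

```python
from collections import deque


class SyncController:
    """Hypothetical per-core controller with a local replica of the lock queue."""

    def __init__(self, core_id):
        self.core_id = core_id
        # FIFO of core IDs; the core at the head currently holds the lock.
        self.queue = deque()

    def observe_acquire(self, requester):
        # Every controller records the request in arrival order.
        self.queue.append(requester)

    def observe_release(self):
        # The head of the queue gives up the lock; the next core becomes owner.
        self.queue.popleft()

    def holds_lock(self):
        # Purely local check: no shared variable is polled.
        return bool(self.queue) and self.queue[0] == self.core_id


class System:
    """Assumed stand-in for the interconnect: delivers events to all controllers."""

    def __init__(self, n_cores):
        self.controllers = [SyncController(i) for i in range(n_cores)]

    def acquire(self, core_id):
        for controller in self.controllers:
            controller.observe_acquire(core_id)
        # True if the lock was free and is granted immediately.
        return self.controllers[core_id].holds_lock()

    def release(self, core_id):
        if not self.controllers[core_id].holds_lock():
            raise RuntimeError("only the current owner may release the lock")
        for controller in self.controllers:
            controller.observe_release()

    def owner(self):
        queue = self.controllers[0].queue
        return queue[0] if queue else None


def demo():
    """Request the lock from cores 1, 0, 2 and record the order of ownership."""
    system = System(3)
    for core in (1, 0, 2):
        system.acquire(core)
    order = []
    while system.owner() is not None:
        current = system.owner()
        order.append(current)
        system.release(current)
    return order
```

In this model, waiting cores are granted the lock strictly in request order, and a waiting core could in principle be idled until `holds_lock()` on its own controller becomes true, which is the kind of hook a low-power policy might use.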