ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment

Abstract:

Recently-proposed architectures that continuously operate on atomic blocks of instructions (also called chunks) can boost the programmability and performance of shared-memory multiprocessing. However, they must support chunk operations very efficiently. In particular, in lazy conflict-detection environments, it is key that they provide scalable chunk commits. Unfortunately, current proposals typically fail to enable maximum overlap of conflict-free chunk commits. This paper presents a novel directory-based protocol that enables highly-overlapped, scalable chunk commits. The protocol, called Scalable Bulk, builds on the previously-proposed BulkSC protocol. It introduces three general hardware primitives for scalable commit: preventing access to a set of directory entries, grouping directory modules, and initiating the commit optimistically. Our results with SPLASH-2 and PARSEC codes with up to 64 processors show that Scalable Bulk enables highly-overlapped chunk commits and delivers scalable performance. Unlike previously proposed schemes, it removes practically all commit stalls.

Published in: 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture

Date of Conference: 04-08 December 2010

Date Added to IEEE Xplore: 20 January 2011

Print ISBN: 978-1-4244-9071-4

DOI: 10.1109/MICRO.2010.29

Conference Location: Atlanta, GA, USA

1. Introduction

There are several recent proposals for shared-memory architectures that efficiently support continuous atomic-block operation [2], [5], [6], [8], [9], [14], [18], [19]. In these architectures, a processor repeatedly executes blocks of consecutive instructions from a thread (also called chunks) in an atomic manner. These systems include TCC [6], [9], BulkSC [5], Implicit Transactions (IT) [18], ASO [19], InvisiFence [2], DMP [8], and SRC [14], among others. This mode of execution has performance and programmability advantages. For example, it can support transactional memory [6], [9], [14]; high-performance execution, even for strict memory consistency models [2], [5], [19]; a variety of techniques for parallel program development and debugging such as determinism [8], program replay [12], and atomicity violation debugging [10]; and even provide a substrate for new high-performance compiler transformations [1], [13].
