High-performance I/O is a key requirement for many of today's critical computational science applications, and parallel file systems are being driven to progressively larger scales to keep pace with demand. One cost-effective way to meet this demand is to deploy commodity storage hardware in conjunction with file systems that provide software resiliency. This approach, however, requires a re-evaluation of the core components of parallel file system architecture. In addition to interacting with resilient protocols, parallel file systems must also accommodate HPC-specific workloads that include bursty, highly concurrent access to large shared files. Such workloads are traditionally a challenge for software replication algorithms, in part because the underlying storage does not provide convenient semantic building blocks. In this work we isolate a common component of many parallel file systems, the object storage abstraction layer, and propose the introduction of semantic properties that will enable it to better serve as the building block for resilient HPC storage architectures. The properties that we have identified are atomicity, explicit versioning, and commutativity. We outline how these properties can be used to simplify software replication protocols for highly concurrent workloads. We also demonstrate that these properties can be implemented portably while still maintaining high performance on both commodity and enterprise-class storage platforms.