Skip to Main Content
Nowadays, dependability is of paramount importance in modern distributed storage systems. A challenging issue to deploy a storage system with certain dependability requirements or improve existing systems' dependability is how to comprehensively and efficiently characterize the dependability of those systems. In this paper, we present a two-layer Hidden Markov Model (HMM) to characterize the dependability of a distributed storage system, focusing on the layer of parallel file system. By training the model with observable measurements under faulty scenarios, such as I/O performance, we quantify the system dependability via a tuple of state transition probability, service degradation, and fault latency under those scenarios. Our experimental results on a distributed storage system with PVFS (Parallel Virtual File System) demonstrate the effectiveness of our HMM-based approach, which efficiently captures the behavior patterns of the target system under disk faults and memory overusage.