In this paper we present a framework for analyzing the fault tolerance of deduplicated storage systems. We discuss methods for building models of deduplicated storage systems by analyzing empirical data on a per-file-category basis. We provide an algorithm for generating component-based models from this information and a specification of the storage system architecture. Because detailed models of deduplicated storage systems are complex, solving them with traditional discrete-event simulation or numerical solvers can be difficult. We introduce an algorithm that allows for a more efficient solution by exploiting the underlying structure of dependencies to decompose the model of the storage system. We present a case study of our framework for a real system: we analyze a production deduplicated storage system and propose extensions that improve fault tolerance while maintaining high storage efficiency.