Data deduplication has become commonplace in most secondary storage, and even in some primary storage, for capacity optimization. Beyond write performance, the read performance of deduplication storage has been gaining significance as deduplication is deployed more widely. In this paper, we emphasize the importance of read performance in reconstituting a data stream from its unique and shared chunks, which are physically dispersed over deduplication storage. We introduce a new read performance indicator called Chunk Fragmentation Level (CFL). Through a theoretical performance model and extensive experiments, we also validate that the CFL is a very effective indicator of the read performance of deduplication storage. Finally, we articulate further research issues.