This paper studies schemes for enhancing the reliability, availability, and security of a high-performance distributed storage system. We previously designed a distributed parallel storage system that uses the aggregate bandwidth of multiple data servers connected by a high-speed wide-area network to achieve scalability and high data throughput. The general approach employs erasure error-correcting codes to add data redundancy, which can be used to recover information lost to hardware, software, or human faults. The paper suggests techniques for reducing the communication and computation overhead incurred when retrieving missing data blocks from redundant information. These techniques include clustering, multidimensional coding, and the full two-dimensional parity scheme.
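To make the idea of recovering erased blocks from parity concrete, the following is a minimal sketch of a generic two-dimensional XOR parity code in Python. It is a textbook-style illustration, not necessarily the full two-dimensional parity scheme developed in the paper. The names `xor_blocks`, `encode`, and `recover` are hypothetical, and the sketch assumes that data blocks are equal-length byte strings arranged in a rectangular grid and that the parity blocks themselves are never lost.

```python
from functools import reduce
from typing import Iterable, List, Optional, Tuple

Grid = List[List[Optional[bytes]]]  # None marks an erased (missing) block


def xor_blocks(blocks: Iterable[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, position) for position in zip(*blocks))


def encode(grid: Grid) -> Tuple[List[bytes], List[bytes]]:
    """Compute one parity block per row and one per column of a complete grid."""
    row_parity = [xor_blocks(row) for row in grid]
    col_parity = [xor_blocks(col) for col in zip(*grid)]
    return row_parity, col_parity


def recover(grid: Grid, row_parity: List[bytes], col_parity: List[bytes]) -> bool:
    """Fill erased blocks in place; return True if every block was recovered.

    A block is recoverable when it is the only erasure in its row or in its
    column. Each repair may unlock further repairs, so the scan repeats until
    a full pass makes no progress.
    """
    progress = True
    while progress:
        progress = False
        for r, row in enumerate(grid):
            for c, block in enumerate(row):
                if block is not None:
                    continue
                col = [grid[i][c] for i in range(len(grid))]
                if row.count(None) == 1:
                    survivors = [b for b in row if b is not None]
                    grid[r][c] = xor_blocks(survivors + [row_parity[r]])
                    progress = True
                elif col.count(None) == 1:
                    survivors = [b for b in col if b is not None]
                    grid[r][c] = xor_blocks(survivors + [col_parity[c]])
                    progress = True
    return all(b is not None for row in grid for b in row)


# Usage: two erasures in the same row defeat row parity alone,
# but column parity repairs one of them and row parity then repairs the other.
original = [[b"abcd", b"efgh", b"ijkl"],
            [b"mnop", b"qrst", b"uvwx"]]
row_p, col_p = encode(original)
damaged = [[None, None, b"ijkl"],
           [b"mnop", b"qrst", b"uvwx"]]
recovered_ok = recover(damaged, row_p, col_p)
```

In this sketch, repairing a block reads only the surviving blocks of a single row or column rather than the whole data set, which hints at why the arrangement of parity affects the communication cost of recovery. Erasure patterns in which every missing block shares both its row and its column with another missing block (for example, the four corners of a rectangle) cannot be repaired by this simple code.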