Optimal Copyset in Distributed Object Storage


Abstract:

In distributed storage systems, replication is commonly used to ensure system reliability and data availability. Random replication is widely used in cloud storage systems to prevent data loss. Copyset Replication (CR), a replication strategy, makes a nearly optimal trade-off between the number of nodes across which data is scattered and the probability of data loss. Compared with random replication, CR greatly reduces the probability of data loss caused by node failures. However, because CR selects copysets at random, it cannot easily select the optimal copyset based on data characteristics such as computation and storage. To address this problem, this paper proposes Optimal Copyset Replication (OCR), which selects the optimal copyset according to the specified data characteristics and the corresponding node conditions. Combined with Cyberspace Mimicry Defense (CMD), we implemented OCR in a distributed object storage system and ran experiments on it. When the amount of computation-type data reaches 300,000, the experimental results show that, compared with CR's random copyset selection, OCR reduces data processing time by nearly 10% by selecting the optimal copyset. By setting the relevant parameters, OCR can also keep the data distribution across nodes relatively uniform and avoid data skew.
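
The contrast between random and characteristic-aware copyset selection can be illustrated with a minimal Python sketch. It is not the paper's implementation: the copysets are built with the permutation scheme commonly described for Copyset Replication (shuffle the nodes, then cut each permutation into groups of R), and the scoring step is only an assumption about what "optimal" might mean. The names `generate_copysets`, `pick_random_copyset`, `pick_best_copyset`, and the per-node `node_score` table are all hypothetical.

```python
import math
import random


def generate_copysets(nodes, r, scatter_width, seed=0):
    """Build copysets from random permutations of the node list.

    Assumption: ceil(scatter_width / (r - 1)) permutations are used, and each
    permutation is cut into consecutive groups of r nodes. Leftover nodes at
    the end of a permutation are dropped.
    """
    rng = random.Random(seed)
    num_permutations = math.ceil(scatter_width / (r - 1))
    copysets = []
    for _ in range(num_permutations):
        perm = list(nodes)
        rng.shuffle(perm)
        for i in range(0, len(perm) - r + 1, r):
            copysets.append(tuple(perm[i:i + r]))
    return copysets


def _candidates(copysets, primary):
    """Return the copysets that contain the primary node."""
    found = [cs for cs in copysets if primary in cs]
    if not found:
        raise ValueError(f"node {primary!r} is not in any copyset")
    return found


def pick_random_copyset(copysets, primary, rng):
    """CR-style choice: any copyset containing the primary, chosen at random."""
    return rng.choice(_candidates(copysets, primary))


def pick_best_copyset(copysets, primary, node_score):
    """Hypothetical OCR-style choice: the candidate copyset with the highest
    total score. node_score is an assumed per-node metric, for example free
    CPU for computation-type data or free disk for storage-type data.
    """
    return max(
        _candidates(copysets, primary),
        key=lambda cs: sum(node_score[n] for n in cs),
    )


if __name__ == "__main__":
    nodes = list(range(9))
    copysets = generate_copysets(nodes, r=3, scatter_width=4, seed=1)
    # Hypothetical node condition: a higher value means more spare capacity.
    score = {n: float(n) for n in nodes}
    print("random:", pick_random_copyset(copysets, 0, random.Random(2)))
    print("best:  ", pick_best_copyset(copysets, 0, score))
```

Both selectors draw from the same candidate set, so in this sketch the score-based choice does not change which copysets exist, only which one is picked for a given object. Keeping the data distribution uniform would need an additional term in the score (for instance, penalizing nodes that already hold many objects), which is left out here.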
Date of Conference: 15-18 December 2021
Date Added to IEEE Xplore: 13 January 2022
Conference Location: Orlando, FL, USA
