
CDCache: Space-Efficient Flash Caching via Compression-before-Deduplication


Abstract:

Large-scale storage systems boost I/O performance via flash caching, but the underlying storage medium of flash caching incurs significant financial costs and also exhibits low endurance. Previous studies adopt compression-after-deduplication to mitigate writing redundant contents into the flash cache, so as to address the cost and endurance issues. However, deduplication and compression have conflicting preferable cases, and compression-after-deduplication essentially compromises the space-saving benefits of either deduplication or compression. To simultaneously preserve the benefits of both approaches, we explore compression-before-deduplication, which applies compression to eliminate byte-level redundancies across data blocks, followed by deduplication to write only a single copy of duplicate compressed blocks into the flash cache. We present CDCache, a space-efficient flash caching system that realizes compression-before-deduplication. It proposes to dynamically adjust the compression range of data blocks, so as to preserve the effectiveness of deduplication on the compressed blocks. Also, it builds on various design techniques to approximately estimate duplicate data blocks and efficiently manage compressed blocks. Trace-driven experiments show that CDCache improves the read hit ratio and the write reduction ratio of a previous compression-after-deduplication approach by up to 1.3× and 1.6×, respectively, while incurring only a small memory overhead for index management.
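To make the ordering concrete, the core write path of compression-before-deduplication can be sketched as follows. This is a minimal illustration of the general idea only, not CDCache's actual design: it omits the dynamic compression-range adjustment and the approximate duplicate estimation described in the abstract, and the `cbd_write` function, the dictionary-based index, and the zlib/SHA-256 choices are all illustrative assumptions.

```python
import hashlib
import zlib

def cbd_write(cache: dict, block: bytes) -> bool:
    """Compression-before-deduplication write path (simplified sketch).

    Compress first to remove byte-level redundancy, then fingerprint
    the *compressed* block and write it only if no identical compressed
    copy is already cached. Returns True iff a new copy was written.
    """
    compressed = zlib.compress(block)                 # step 1: compression
    fp = hashlib.sha256(compressed).hexdigest()       # step 2: fingerprint compressed data
    if fp in cache:                                   # step 3: deduplication check
        return False                                  # duplicate: index-only update, no flash write
    cache[fp] = compressed                            # unique: write a single copy
    return True

cache = {}
block_a = b"hello flash cache " * 200
block_b = b"hello flash cache " * 200                 # duplicate content

wrote_a = cbd_write(cache, block_a)                   # unique, written
wrote_b = cbd_write(cache, block_b)                   # deduplicated, not written
```

Because zlib compression is deterministic for identical input, duplicate logical blocks yield identical compressed blocks and thus identical fingerprints, so deduplication still succeeds after compression; in contrast, the reverse order (deduplicate first, then compress each unique block in isolation) cannot exploit byte-level redundancy that spans multiple blocks.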
Date of Conference: 20-23 May 2024
Date Added to IEEE Xplore: 12 August 2024
Conference Location: Vancouver, BC, Canada


I. Introduction

Large-scale storage systems [1], [7], [23], [27], [33] apply flash caching to improve the performance of hard disk drives (HDDs). They use solid-state drives (SSDs) atop HDDs as buffers for frequently accessed contents, thereby mitigating the performance overhead of direct access to HDDs. However, SSDs have two fundamental limitations that impede their application in flash caching. First, compared to HDDs, SSDs incur a significantly higher cost-per-GiB, and this substantial cost disparity remains prevalent today. For example, the Crucial Pro T700 (a top-selling SSD device) [2] achieves an I/O performance of 12,400 MiB/s, about 70.8× that of the WD Blue WD40EZRZ (a top-selling HDD device), but its cost-per-GiB is also 7.5× that of the WD Blue WD40EZRZ [4] (based on the pricing plans available on the respective official websites [2], [4] in July 2023). In addition, SSDs exhibit limited endurance and are susceptible to wear-out issues [11]. Specifically, owing to the underlying NAND flash technology in SSDs, each memory cell endures only a finite number of write cycles before degrading. This degradation manifests in various forms, such as reduced performance, increased error rates, or even complete failure of the drive.

