Abstract:
The enormous volume of data generated by large-scale instruments and simulations poses significant challenges in archiving, transferring, sharing, and analyzing data for various scientific groups. Lossy reduction techniques are vital to reducing data size to acceptable levels. However, packing more information into each bit increases the severity of the loss if the data are perturbed by malicious users or hardware failures. In the worst case, the entire dataset is compromised. Malevolent alteration or destruction of datasets containing crucial discoveries can completely invalidate research outcomes in scientific studies. Therefore, it is critical to integrate compression and encryption to handle data securely and efficiently. The current state-of-the-art combination technique, Cmpr-Encr, handles compression and encryption as two distinct processes. This reduces the compression ratio and bandwidth, especially for hard-to-compress datasets. In this paper, we propose two data protection strategies that work in conjunction with the lossy compressor SZ, Encr-Quant and Encr-Huffman, and carefully evaluate the overhead they introduce on compression bandwidth and ratio. Based on the results of testing with real-world scientific datasets, we find that the cost of Encr-Quant varies with the dataset's properties, so it must be selected with caution. Encr-Huffman maintains more than 99% of the original compression ratio while saving 6.5% in compression time compared to SZ. Applying Cmpr-Encr reduces compression bandwidth, whereas Encr-Huffman increases bandwidth by 3.1% over SZ on average.
Date of Conference: 05-08 September 2022
Date Added to IEEE Xplore: 18 October 2022