
A Hausdorff Regression Paradigm for Interval Privacy


Abstract:

Data privacy has become a critical concern in today's data-driven world. Interval privacy emerges as a promising safeguard, representing private values as intervals. Traditional interval analysis methods, however, often rely on critical assumptions that are questionable in practice. To address this gap, we propose a novel paradigm for analyzing interval-valued data generated by the interval privacy mechanism. Our contributions are two-fold: First, we innovatively model intervals as random objects in a metric space and use the Hausdorff distance to quantify their dissimilarity without imposing restrictive assumptions. Second, as an application of our paradigm, we develop an interval-to-interval regression method named Hausdorff distance-based regression (HDBR), extending multivariate linear regression to metric spaces. The HDBR method estimates regression coefficients by minimizing the Hausdorff distance between the observed and estimated intervals. Simulation studies demonstrate the effectiveness and robustness of our proposed approach compared to mainstream competitors. We also provide a real data example to illustrate how to perform regression analysis within the interval privacy framework, and the results further validate the superiority of the HDBR method.
Published in: IEEE Signal Processing Letters ( Volume: 31)
Page(s): 146 - 150
Date of Publication: 18 December 2023



I. Introduction

Data privacy is a crucial aspect of data generation, storage, and processing, given the increasing emphasis on data security. In the past decade, numerous methods have been developed to protect privacy. Traditional approaches rely on anonymization techniques [1]. Examples include HybrEx [2], k-anonymity [3], t-closeness [4], and ℓ-diversity [5], which aim to render each released dataset indistinguishable with respect to (w.r.t.) a minimum number of individuals in the population. These techniques safeguard published datasets from identity disclosure. However, anonymized data remains problematic because it is hard to analyze and cannot preserve relationships within the data. Another category of privacy protection methods, which does allow for analysis, employs encryption techniques, including garbled circuits [6], homomorphic encryption [7], secret sharing [8], and others. More recent advancements involve noise-based algorithms such as differential privacy [9], [10]. These techniques require that the result of analyses conducted on a released dataset remain insensitive to the insertion or deletion of a single tuple in the dataset. However, these data privacy protection methods typically perturb the true value of the data, thus affecting its usability and accuracy.
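The differential-privacy property described above (insensitivity to one tuple's insertion or deletion) is commonly achieved with the Laplace mechanism. A minimal sketch for a counting query follows; the function names and the choice of epsilon are illustrative assumptions, not part of this paper's method.

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of a Laplace(0, scale) random variate.
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon=1.0):
    # A counting query has sensitivity 1: adding or removing one tuple
    # changes the count by at most 1, so Laplace noise with scale
    # 1 / epsilon yields epsilon-differential privacy for this query.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Note that the released count is perturbed away from its true value, which is exactly the usability/accuracy trade-off the paragraph above points out and which interval privacy seeks to avoid.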

