Privacy is becoming an increasingly important issue in data-mining applications that deal with sensitive data, especially in health care, security, finance, and behavioral analysis. Most existing techniques rely on a secure two-party computation model, in which two parties, each holding a private database, want to cooperatively conduct data-mining operations on the union of their data. The problem we address in Privacy Preserving Data Mining (PPDM) is how a data owner can release a version of its confidential data with guarantees that the original sensitive information cannot be re-identified while the analytic properties of the data are preserved. In this paper we investigate the feasibility of using sparse multiplicative random projection matrices to preserve privacy in datasets that are incremented asynchronously over time from various sources. This work proposes the use of random projections with a sparse matrix to maintain a sketch of a collection of high-dimensional data streams that are updated asynchronously. This sketch allows us to estimate L2 (Euclidean) distances and dot products with high accuracy. We also propose a conceptual architecture for implementing privacy preservation techniques, in particular the Sparse Random Projection Matrix technique, on incremental data to improve the level of privacy protection. Our tests indicate that the perturbed data still preserves certain statistical characteristics of the original, unperturbed data. Finally, we propose a generic projection-based sketch for incremental data streams that can be used not only for this application but also for any other application that supports incremental databases.
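As an illustration of the sketching idea described above, the following is a minimal Python sketch, not the paper's implementation. It assumes a sparse projection matrix whose entries are nonzero with probability 1/s (with s = 3 as a common choice), which the abstract does not specify; the names `make_sparse_projection` and `StreamSketch` are hypothetical. Because the projection is linear, each asynchronous update to a single coordinate of a stream can be folded into that stream's sketch independently and in any order.

```python
import math
import random


def make_sparse_projection(k, d, s=3, seed=0):
    """Build a k x d sparse random projection matrix (assumed distribution).

    Each entry is +sqrt(s/k) or -sqrt(s/k) with probability 1/(2s) each,
    and 0 otherwise, so squared norms are preserved in expectation.
    """
    rng = random.Random(seed)
    scale = math.sqrt(s / k)
    rows = []
    for _ in range(k):
        row = []
        for _ in range(d):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row.append(scale)
            elif u < 1.0 / s:
                row.append(-scale)
            else:
                row.append(0.0)
        rows.append(row)
    return rows


class StreamSketch:
    """Keeps one k-dimensional sketch per data stream (hypothetical helper)."""

    def __init__(self, projection):
        self.projection = projection
        self.k = len(projection)
        self.sketches = {}

    def update(self, stream_id, index, delta):
        """Apply the update 'coordinate `index` of `stream_id` += delta'."""
        sketch = self.sketches.setdefault(stream_id, [0.0] * self.k)
        for r in range(self.k):
            weight = self.projection[r][index]
            if weight != 0.0:
                sketch[r] += delta * weight

    def distance(self, a, b):
        """Estimate the L2 distance between the original streams a and b."""
        sa, sb = self.sketches[a], self.sketches[b]
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))

    def dot(self, a, b):
        """Estimate the dot product between the original streams a and b."""
        return sum(x * y for x, y in zip(self.sketches[a], self.sketches[b]))
```

In this sketch only the k-dimensional projected values are stored or released. The estimates are approximate: their accuracy improves as k grows, and the dot-product estimate is typically noisier than the distance estimate. Any privacy benefit depends on keeping the projection matrix (here, its seed) secret.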
We trace the origin of PPDM, the definition of privacy preservation in data mining, and the implications of benchmark privacy principles in information detection, and we advocate a few policies for PPDM based on these privacy principles. These are vital for the development and deployment of methodological solutions, and they will let vendors and developers build robust information reuse and integration (IRI) into PPDM. We seek to capitalize on the reuse of PPDM information by crafting simple, rich, and reusable knowledge representations, and accordingly we investigate tactics for integrating this knowledge into legacy systems and advancing the future of PPDM.
Date of Conference: 13-15 July 2008