Privacy-preserving data mining is expected to make it possible to acquire valuable knowledge from the combined information sources of several service providers. Research has therefore been conducted on distributed anonymization methods, which combine the personal information held by the providers and anonymize it to prevent specific user records from being identified. However, in that work, when the providers' user sets are not identical, a user's presence in either provider may be revealed. This paper therefore proposes a new indicator that represents the probability of a user's presence being revealed, and introduces a modified distributed anonymization method that satisfies the proposed indicator. We evaluate the method on U.S. census data and calculate the relative error of the anonymized data. The results show that the relative error is approximately 10-25% in specific cases.
Published in:
Advanced Applied Informatics (IIAIAAI), 2012 IIAI International Conference on
Date of Conference: 20-22 Sept. 2012