Governments are increasingly interested in making their data accessible through open data platforms to promote accountability and economic growth. Since the first data.gov initiative was launched by the U.S. government, more than 150 city agencies and authorities have made over one million datasets available through open data portals. Open data are generating new business worldwide, providing citizens with a wealth of information that they can combine and aggregate in unprecedented ways. An important characteristic of open data environments is that once the data are published, it is difficult to anticipate how they will be used. As a result, seemingly innocuous datasets, once linked together, may lead to serious privacy violations, and powerful analytic tools may reveal sensitive patterns that were unknown at the time the data were published. In this paper, we provide an introduction to data privacy and present some popular privacy models that have been proposed for privacy-preserving data publishing and knowledge hiding, focusing on their strengths and limitations. Subsequently, using QuerioCity (an open urban information management platform) as a use case, we explain the important challenges that open data platforms introduce with respect to data privacy.
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.