Healthcare is a major industry in the Smarter Planet initiative of IBM and a key area where analytics can have a substantial impact by improving disease prediction and treatment. To facilitate healthcare analytics, patient data usually need to be widely disseminated. Dissemination, however, risks the disclosure of private and sensitive patient information. In this paper, we illustrate the importance of preserving medical data privacy and show why several popular techniques are inapplicable to preserving the privacy of structured medical data. Subsequently, we review a privacy-preserving approach for the dissemination of patient records. This approach involves patient record de-identification, anonymization of the diagnosis codes contained in the records, and a method for balancing data utility with privacy. The approach is practical in that it allows healthcare data providers to specify fine-grained privacy and utility requirements, and it can construct anonymized data with a desired balance between utility and privacy. The effectiveness of the approach is demonstrated through a case study using electronic medical records. We conclude this paper with a roadmap for future trends in medical data privacy.
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.