Compliance with regulatory policies on data remains a key hurdle to cloud computing. Regulations and standards such as EU privacy rules, HIPAA, and PCI-DSS place requirements on data availability, integrity, migration, retention, and access, among others. This paper proposes a policy management service that offers scalable management of data retention policies attached to data objects stored in a cloud environment. The management service includes a highly available and secure encryption key store to manage the encryption keys of data objects. By deleting the encryption key at the retention time specified for a data object, we effectively delete the data object and all of its copies, whether stored online or offline, because the remaining ciphertext can no longer be decrypted. To achieve scalability, our service uses Hadoop MapReduce to perform management tasks in parallel, such as data encryption and decryption, key distribution, and retention policy enforcement. A prototype deployed on a 16-machine Linux cluster currently supports 56 MB/sec for encryption, 76 MB/sec for decryption, 31,000 retention policies/sec for reads, and 15,000 retention policies/sec for writes.
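The key-deletion idea can be illustrated with a minimal, single-process sketch. It is not the paper's implementation: `RetentionKeyStore`, `encrypt_object`, `decrypt_object`, and the toy HMAC-based keystream are hypothetical names and stand-ins chosen for illustration, and a real deployment would presumably use a vetted cipher such as AES and a replicated key store. The sketch only shows that once the key is removed at its retention time, every surviving copy of the ciphertext becomes unreadable.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (assumption): XOR data with HMAC-SHA256(key, nonce || counter).

    Stand-in for a vetted cipher; not suitable for production use.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class RetentionKeyStore:
    """Hypothetical in-memory key store: object ID -> (key, retention time)."""

    def __init__(self):
        self._keys = {}

    def create_key(self, object_id: str, expires_at: float) -> bytes:
        key = secrets.token_bytes(32)
        self._keys[object_id] = (key, expires_at)
        return key

    def enforce(self, now: float) -> int:
        """Delete every key whose retention time has passed; return how many."""
        expired = [oid for oid, (_, exp) in self._keys.items() if exp <= now]
        for oid in expired:
            del self._keys[oid]
        return len(expired)

    def get_key(self, object_id: str, now: float) -> bytes:
        self.enforce(now)  # never hand out a key past its retention time
        if object_id not in self._keys:
            raise KeyError(f"no key for {object_id!r}: expired or unknown")
        return self._keys[object_id][0]


def encrypt_object(store, object_id: str, plaintext: bytes, expires_at: float) -> bytes:
    key = store.create_key(object_id, expires_at)
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def decrypt_object(store, object_id: str, blob: bytes, now: float) -> bytes:
    key = store.get_key(object_id, now)
    nonce, ciphertext = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, ciphertext)
```

For example, an object encrypted with `expires_at=100.0` decrypts normally when `decrypt_object` is called with `now=50.0`, but the same call with `now=150.0` raises `KeyError`, even though the ciphertext blob itself was never touched.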