I. Introduction
According to an IDC survey [1], about 175 ZB of digital data will be generated worldwide during the next five years. Managing such a large amount of data efficiently is a huge challenge. However, many studies have found that storage systems contain a large volume of duplicate data [2], [3]. As a result, data deduplication, a lossless data compression technology that prevents duplicate data from being stored and transmitted, is one of the most important methods for tackling this challenge [4].