Skip to Main Content
We study the problem of enumerating concepts in a Sperner family concept class using subconcept queries, which is a general problem including maximal frequent itemset mining as its instance. Though even the theoretically best known algorithm needs quasi-polynomial time to solve this problem in the worst case, there exist practically fast algorithms for this problem. This is because many instances of this problem in real world have low complexity in some measures. In this paper, we characterize the complexity of Sperner family concept class by the VC dimension of its intersection closure and its characteristic dimension, and analyze the worst case time complexity on the enumeration problem of its concepts in terms of the VC dimension. We also showed that the VC dimension of real data used in data mining is actually small by calculating the VC dimension of some real datasets using a new algorithm closely related to the introduced two measures, which does not only solve the problem but also let us know the VC dimension of the intersection closure of the target concept class.
Date of Conference: 15-19 Dec. 2008