Skip to Main Content
Given a set of clusters, we consider an optimization problem which seeks a subset of clusters that maximizes the microaverage F-measure. This optimal value can be used as an evaluation measure of the goodness of clustering. For arbitrarily overlapping clusters, finding the optimal value is NP-hard. We claim that a greedy approximation algorithm yields the global optimal solution for clusters that overlap only by nesting. We present a mathematical proof of this claim by induction. For a family of n clusters containing a total of N objects, this algorithm has an O(n2) time complexity and O(N) space complexity.