This paper investigates a new hybrid algorithm for instance selection adapted to large databases. The key idea is to reduce computational cost by applying condensation algorithms only to small subsets and to useful patterns. The initial population is divided into “meta strata”, each formed by the union of randomly generated strata. Interesting patterns are extracted from a reference “meta stratum” and partitioned into clusters. For each “meta stratum” and each cluster, influencing patterns are selected by a 1-NN procedure. The sets of instances selected from all “meta strata” together form the final set. Experiments on various data sets show the effectiveness and adequacy of the proposed approach.
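
The pipeline outlined above can be illustrated with a minimal sketch. The abstract does not specify the condensation algorithm, the clustering method, or the exact 1-NN selection rule, so everything below is an assumption made for illustration: Hart's condensed nearest neighbour stands in for the condensation step, a plain k-means for the clustering, and "influencing patterns" are taken to be the nearest neighbours, within each meta stratum, of the interesting patterns of each cluster. All function names and parameters (`make_meta_strata`, `condense`, `select_instances`, `n_strata`, `strata_per_meta`, `k`) are hypothetical and not taken from the paper.

```python
import random
from math import dist


def make_meta_strata(n_items, n_strata, strata_per_meta, rng):
    """Split indices into random strata, then union groups of strata into meta strata."""
    idx = list(range(n_items))
    rng.shuffle(idx)
    strata = [idx[i::n_strata] for i in range(n_strata)]
    return [
        [j for s in strata[i:i + strata_per_meta] for j in s]
        for i in range(0, n_strata, strata_per_meta)
    ]


def condense(X, y, idx):
    """Assumed condensation step: Hart's CNN restricted to the indices in idx."""
    store = [idx[0]]
    changed = True
    while changed:
        changed = False
        for i in idx:
            if i in store:
                continue
            nearest = min(store, key=lambda s: dist(X[i], X[s]))
            if y[nearest] != y[i]:  # misclassified by the current store: keep it
                store.append(i)
                changed = True
    return store


def assign(X, idx, centers):
    """Group the indices in idx by their nearest center."""
    groups = [[] for _ in centers]
    for i in idx:
        c = min(range(len(centers)), key=lambda c: dist(X[i], centers[c]))
        groups[c].append(i)
    return groups


def kmeans(X, idx, k, rng, iters=10):
    """Assumed clustering step: plain Lloyd's k-means, returning the centers."""
    centers = [X[i] for i in rng.sample(idx, min(k, len(idx)))]
    for _ in range(iters):
        groups = assign(X, idx, centers)
        centers = [
            tuple(sum(col) / len(g) for col in zip(*(X[i] for i in g))) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def select_instances(X, y, n_strata=6, strata_per_meta=2, k=3, seed=0):
    rng = random.Random(seed)
    metas = make_meta_strata(len(X), n_strata, strata_per_meta, rng)
    # Interesting patterns come from the reference meta stratum (here: the first one).
    interesting = condense(X, y, metas[0])
    centers = kmeans(X, interesting, k, rng)
    clusters = assign(X, interesting, centers)
    selected = set()
    for meta in metas:
        # Assumption: 1-NN candidates are the meta-stratum members nearest to each cluster.
        candidates = assign(X, meta, centers)
        for cluster, cand in zip(clusters, candidates):
            if not cand:
                continue
            for p in cluster:
                selected.add(min(cand, key=lambda j: dist(X[p], X[j])))
    return sorted(selected)


# Toy data: two Gaussian blobs in 2D, one per class.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
X += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(60)]
y = [0] * 60 + [1] * 60
subset = select_instances(X, y)
print(len(subset), "of", len(X), "instances kept")
```

In this sketch the costly condensation step runs only on the reference meta stratum, and every other meta stratum contributes through nearest-neighbour lookups against a small set of interesting patterns. This is one possible reading of how the approach keeps computation low, not a reproduction of the authors' procedure.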