Rule learning algorithms, for example, Ripper, induce univariate rules; that is, each propositional condition in a rule uses only one feature. In this paper, we propose omnivariate rule induction, in which, at each condition, both a univariate and a multivariate condition are trained and the better one is chosen according to a novel statistical test. This paper has three main contributions. First, we propose a novel statistical test for comparing two classifiers, the combined $5\times 2$ cv $t$ test, which is a variant of the $5\times 2$ cv $t$ test, and we give its connections to other tests such as the $5\times 2$ cv $F$ test and the $k$-fold paired $t$ test. Second, we propose a multivariate version of Ripper in which a Support Vector Machine (SVM) with a linear kernel is used to find multivariate linear conditions. Third, we propose an omnivariate version of Ripper in which model selection is done via the combined $5\times 2$ cv $t$ test. Our results indicate that (1) the combined $5\times 2$ cv $t$ test has higher power (lower type II error), lower type I error, and higher replicability than the $5\times 2$ cv $t$ test, and (2) omnivariate rules are better in that they choose whichever condition is more accurate, selecting the right model automatically and separately for each condition in a rule.
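For orientation, the two existing tests that the abstract names as relatives of the proposed one can be sketched in a few lines. The sketch below computes the standard $5\times 2$ cv $t$ statistic (Dietterich) and the $5\times 2$ cv $F$ statistic (Alpaydin) from the ten fold-wise error differences of two classifiers. It does not implement the combined $5\times 2$ cv $t$ test, whose formula is not given in the abstract. The function name, the input layout, and the example numbers are illustrative assumptions, not taken from the paper.

```python
from math import sqrt

# Critical values at significance level 0.05 (standard table values).
T_CRIT_5DF = 2.571    # two-sided t test, 5 degrees of freedom
F_CRIT_10_5 = 4.74    # F test, (10, 5) degrees of freedom


def five_by_two_cv_stats(diffs):
    """Return (t, f) for the standard 5x2 cv t test and 5x2 cv F test.

    diffs: five (p1, p2) pairs, where p_j is the difference in error rate
    between the two classifiers on fold j of one 2-fold replication.
    (Hypothetical helper and input layout, for illustration only.)
    """
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications with 2 folds each")

    # Per-replication variance estimate s_i^2 around the replication mean.
    var_sum = 0.0
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        var_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2
    if var_sum == 0:
        raise ValueError("all variance estimates are zero; statistics undefined")

    # t test: only the first difference appears in the numerator.
    t_stat = diffs[0][0] / sqrt(var_sum / 5)
    # F test: all ten squared differences appear in the numerator.
    f_stat = sum(p * p for pair in diffs for p in pair) / (2 * var_sum)
    return t_stat, f_stat


# Made-up error differences, purely to exercise the function.
example = [(0.02, 0.04), (0.03, 0.01), (0.05, 0.03), (0.02, 0.02), (0.04, 0.02)]
t_stat, f_stat = five_by_two_cv_stats(example)
reject_t = abs(t_stat) > T_CRIT_5DF
reject_f = f_stat > F_CRIT_10_5
print(t_stat, f_stat, reject_t, reject_f)
```

Because the $t$ statistic depends on which of the ten differences is placed in the numerator, its outcome can change with the ordering of folds and replications, which is the kind of replicability issue a combined test is meant to address.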
IEEE Transactions on Knowledge and Data Engineering (Volume: PP, Issue: 99)