1 Introduction
Accuracy and interpretability are two desired goals in predictive modeling, including both classification and regression. Previous work can be characterized into two lines. One line has ordinary performance with strong interpretability on a set of simple features, but meets a serious bottleneck when modeling complex high-order interactions between features, such as linear regression, logistic regression [18], and support vector machine [34] . The other line consists of models that are more often studied for their high accuracy, for example, tree-based models including random forest [2] and gradient boosted trees [16] as well as the neural network models [20], which model nonlinear relationships with high-order combinations of different features. However, their lower interpretability and high complexity prevent practitioners from deploying in practice [18]. In the real-world scientific and medical applications which require both intuitive understanding of the features and high accuracies, the practitioners are not satisfied with neither line of models, and thus, it is important and challenging to develop an effective prediction framework with high interpretability when dealing with high-order interactions with features.