1. Introduction
Bag-of-Feature (BoF) model is one of the most powerful and popular frameworks for image classification which represents image as a histogram of visual words. Standard BoF-based framework is mainly composed of four steps: feature extraction, feature coding, spatial pooling and SVM classification. This pipeline is almost fixed in recent literatures except for the “feature coding” part. To this end, many elegant algorithms have been designed to improve the discriminative power of the learned codes [1]–[4] among which deep learning based approaches draw a lot of attention due to its representational power of deep transformation compared with other dictionary learning methods in a single step [5]–[7].