Image classification using RBM to encode local descriptors with group sparse learning


Abstract:

This paper proposes using a deep learning model to encode local descriptors for image classification. Previous works that use deep architectures to obtain higher-level representations often operate at the pixel level, and therefore generalize poorly to large, complex images because of the computational burden and the difficulty of capturing essential image structure. Our method removes this limitation by starting from local descriptors, which provide more semantic inputs. We use two layers of Restricted Boltzmann Machines (RBMs) to encode different local descriptors with a novel group sparse learning (GSL) scheme inspired by the recent success of sparse coding. In addition, unlike most existing purely unsupervised feature coding strategies, we use another RBM corresponding to semantic labels to perform supervised fine-tuning, which makes our model better suited to classification. Experimental results on the Caltech-256 and Indoor-67 datasets demonstrate the effectiveness of our method.
Date of Conference: 27-30 September 2015
Date Added to IEEE Xplore: 10 December 2015
Conference Location: Quebec City, QC, Canada

1. Introduction

The Bag-of-Features (BoF) model is one of the most powerful and popular frameworks for image classification; it represents an image as a histogram of visual words. The standard BoF framework consists of four steps: feature extraction, feature coding, spatial pooling, and SVM classification. This pipeline is almost fixed in the recent literature, except for the feature coding step. Many elegant algorithms have therefore been designed to improve the discriminative power of the learned codes [1]–[4]. Among them, deep learning based approaches have drawn a lot of attention, because deep transformations offer greater representational power than dictionary learning methods that encode in a single step [5]–[7].
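
For reference, the "histogram of visual words" representation can be sketched with the classical hard-assignment baseline, in which each local descriptor votes for its nearest codeword. This is only a minimal illustration and not the RBM-based coding proposed in this paper: the function names, the toy codebook, and the 2-D descriptors are all hypothetical, and the spatial pooling and SVM steps are omitted.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_hard(descriptor: Sequence[float],
                codebook: Sequence[Sequence[float]]) -> int:
    """Feature coding (baseline): index of the nearest visual word."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bof_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Count codeword votes over the whole image and L1-normalize them."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        counts[encode_hard(d, codebook)] += 1.0
    total = sum(counts)
    # An image with no descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total > 0 else counts


# Toy data (assumed for illustration): 3 visual words, 4 local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.0, 0.8], [0.2, 0.1]]
hist = bof_histogram(descriptors, codebook)
print(hist)
```

In a full pipeline, a vector such as `hist` would be computed per spatial region, concatenated, and passed to the classifier; the feature coding step (`encode_hard` here) is the part that learned encoders replace.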