Abstract:
This paper presents a novel two-stream approach for document image classification. The proposed approach leverages textual and visual modalities to classify document images into ten categories, including letter, memo, and news article. To reduce the dependency of the textual stream on the performance of the underlying OCR (as is the case with general content-based document image classifiers), we utilize a filter-based feature-ranking algorithm. This algorithm ranks the features of each class by their ability to discriminate between document images and retains the top K features for further processing. In parallel, the visual stream uses deep CNN models to extract structural features of document images. Finally, the textual and visual streams are combined using an average ensembling method. Experimental results show that the proposed approach outperforms the state-of-the-art system by a significant margin of 4.5% on the publicly available Tobacco-3482 dataset.
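The two steps named in the abstract, per-class top-K feature selection and average ensembling, can be illustrated with a minimal sketch. The abstract does not state which filter metric ranks the features or how the stream outputs are represented, so everything below is an assumption: the score (in-class document frequency minus out-of-class document frequency) is a stand-in rather than the paper's metric, the streams are assumed to emit per-class probability vectors, and the function names `top_k_features_per_class` and `average_ensemble` are hypothetical.

```python
from collections import Counter


def top_k_features_per_class(docs, labels, k):
    """Rank terms per class with a simple filter score and keep the top k.

    Assumed score (not the paper's metric): the fraction of in-class
    documents containing a term minus the fraction of out-of-class
    documents containing it.
    """
    selected = {}
    for c in sorted(set(labels)):
        in_docs = [set(d) for d, y in zip(docs, labels) if y == c]
        out_docs = [set(d) for d, y in zip(docs, labels) if y != c]
        in_df = Counter(t for d in in_docs for t in d)
        out_df = Counter(t for d in out_docs for t in d)
        scores = {}
        for term, count in in_df.items():
            in_rate = count / len(in_docs)
            out_rate = out_df[term] / len(out_docs) if out_docs else 0.0
            scores[term] = in_rate - out_rate
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[c] = ranked[:k]
    return selected


def average_ensemble(text_probs, visual_probs):
    """Average two per-class probability vectors; return (best_index, averaged)."""
    if len(text_probs) != len(visual_probs):
        raise ValueError("both streams must score the same set of classes")
    avg = [(t + v) / 2.0 for t, v in zip(text_probs, visual_probs)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return best, avg


# Toy OCR token lists (illustrative data, not from Tobacco-3482).
docs = [
    ["dear", "sir", "regards"],
    ["dear", "madam", "regards"],
    ["to", "from", "subject"],
    ["to", "from", "re"],
]
labels = ["letter", "letter", "memo", "memo"]
selected = top_k_features_per_class(docs, labels, k=2)

# The textual stream favours class 0, the visual stream favours class 1.
best, avg = average_ensemble([0.6, 0.4], [0.2, 0.8])
```

Restricting the vocabulary to a small set of highly discriminative terms means isolated OCR misreads of other words do not reach the textual classifier, which is the motivation the abstract gives for the ranking step. Averaging the two streams lets a confident visual prediction override a weak textual one, as in the toy call above.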
Date of Conference: 20-25 September 2019
Date Added to IEEE Xplore: 03 February 2020