Abstract:
This paper presents a method to detect table regions in document images by identifying the column and row line-separators and their properties. The method employs a run-length approach to identify the horizontal and vertical lines present in the input image. From each group of intersecting horizontal and vertical lines, a set of 26 low-level features is extracted, and an SVM classifier is used to decide whether the group belongs to a table. The performance of the method is evaluated on a heterogeneous corpus of French, English and Arabic documents that contain various types of table structures, and is compared with that of the Tesseract OCR system.
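As an illustration of the run-length idea only, the following minimal sketch finds long runs of foreground pixels along rows and columns of a binary image and checks whether a horizontal and a vertical run cross. It is not the authors' implementation: the function names, the `min_len` threshold, and the list-of-lists image representation are assumptions made for this example, and the 26 features and the SVM stage are not sketched because the abstract does not specify them.

```python
from typing import List, Tuple

# A run is (fixed_index, start, end) with inclusive bounds.
# Hypothetical representation chosen for this sketch.
Run = Tuple[int, int, int]


def horizontal_runs(img: List[List[int]], min_len: int) -> List[Run]:
    """Return (row, col_start, col_end) for every run of 1-pixels
    that is at least min_len pixels long."""
    runs: List[Run] = []
    for r, row in enumerate(img):
        start = None
        # A trailing 0 acts as a sentinel that closes a run touching the edge.
        for c, px in enumerate(list(row) + [0]):
            if px and start is None:
                start = c
            elif not px and start is not None:
                if c - start >= min_len:
                    runs.append((r, start, c - 1))
                start = None
    return runs


def vertical_runs(img: List[List[int]], min_len: int) -> List[Run]:
    """Return (col, row_start, row_end) by scanning the transposed image."""
    transposed = [list(col) for col in zip(*img)]
    return horizontal_runs(transposed, min_len)


def intersects(h: Run, v: Run) -> bool:
    """True if a horizontal run and a vertical run share a pixel."""
    row, c0, c1 = h
    col, r0, r1 = v
    return c0 <= col <= c1 and r0 <= row <= r1


# Toy 5x5 image: one horizontal and one vertical line forming a cross.
img = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
h_lines = horizontal_runs(img, min_len=4)
v_lines = vertical_runs(img, min_len=4)
crossing = [(h, v) for h in h_lines for v in v_lines if intersects(h, v)]
```

In a pipeline of the kind the abstract describes, groups of mutually intersecting runs would then be passed to feature extraction and classification; real scanned pages would also need binarization and some tolerance for broken or skewed lines, which this sketch omits.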
Date of Conference: 25-28 August 2013
Date Added to IEEE Xplore: 15 October 2013
Electronic ISBN: 978-0-7695-4999-6