Abstract:
Most existing table structure recognition methods fall into two major categories: methods that detect table borders and methods that detect table rows and columns. Border detection suffers from an imbalance between positive and negative samples, because table borders occupy very few pixels. Detecting rows and columns avoids this imbalance, but some studies simplify the prediction of rows and columns into column-by-column and row-by-row prediction, which creates a problem of large error tolerance. To solve this problem, we propose two modules, the Rows Aggregated (RA) module and the Columns Aggregated (CA) module. First, feature slicing and tiling is used to make an approximate prediction for the rows and columns, which addresses the problem of large error tolerance. Second, the row and column information is further extracted by computing channel attention maps. Finally, we use RA and CA to build a semantic segmentation network, called the Rows and Columns Aggregated Network (RCANet), to perform row segmentation and column segmentation. We generate row and column masks on the ICDAR2013 dataset and evaluate the model on them. Experiments show that the proposed model outperforms the segmentation model based on the method of detecting table rows and columns: its average precision, recall, and F1 value are higher by 2.08%, 3.21%, and 2.45%, respectively.
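As a rough illustration of the two steps named in the abstract (feature slicing and tiling, then channel attention), the following minimal sketch operates on a small C x H x W feature map stored as nested lists. It is not the RA or CA module itself, whose details the abstract does not give: the function names `row_aggregate` and `channel_attention`, the use of a row mean as the aggregated value, and the softmax over per-channel averages are all assumptions made for illustration.

```python
import math


def row_aggregate(features):
    """Slice each channel into rows, average every row, and tile the mean back.

    Assumption: mean pooling stands in for whatever aggregation the paper uses.
    `features` is a C x H x W nested list of floats.
    """
    aggregated = []
    for channel in features:
        out_channel = []
        for row in channel:
            mean = sum(row) / len(row)
            # Tile the single row value across the full width.
            out_channel.append([mean] * len(row))
        aggregated.append(out_channel)
    return aggregated


def channel_attention(features):
    """Rescale each channel by a softmax weight over its global average.

    Assumption: a plain softmax replaces any learned attention layers.
    Returns the rescaled feature map and the per-channel weights.
    """
    means = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    peak = max(means)  # subtract the maximum for numerical stability
    exps = [math.exp(m - peak) for m in means]
    total = sum(exps)
    weights = [e / total for e in exps]
    scaled = [
        [[w * v for v in row] for row in ch] for ch, w in zip(features, weights)
    ]
    return scaled, weights


# Toy feature map with 2 channels, 2 rows, and 2 columns.
features = [
    [[1.0, 3.0], [0.0, 0.0]],
    [[2.0, 2.0], [4.0, 0.0]],
]
rows = row_aggregate(features)
attended, weights = channel_attention(rows)
```

A column-wise counterpart would apply the same averaging and tiling along the height instead of the width, for example by transposing each channel before calling `row_aggregate`.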
Date of Conference: 06-08 May 2022
Date Added to IEEE Xplore: 23 May 2022