In this paper, we tackle the problem of localizing graphical symbols in complex technical document images using an original approach to the subgraph isomorphism problem. In the proposed system, document and symbol images are represented by vector-attributed region adjacency graphs (RAGs), which are extracted by a segmentation process and feature extractors. Vertices, which represent regions, are labeled with shape descriptors, whereas edges are labeled with feature vectors representing topological relations between regions. Then, to search for instances of a model graph describing a particular symbol in a large graph corresponding to a whole document, we model the subgraph isomorphism problem as an integer linear program (ILP), a formulation that tolerates errors on the vector labels. The problem is then solved using SYMPHONY, a free and efficient solver. The whole system is evaluated on a set of synthetic documents.
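To make the matching idea concrete, the following is a minimal sketch of error-tolerant subgraph matching on vector-labeled graphs. It is not the formulation used in the paper: the function name `match_symbol`, the Euclidean label distance, the undirected edges, and the toy data are all assumptions made for illustration. The sketch also does not call SYMPHONY or any ILP solver. It enumerates the injective vertex assignments that the binary variables of an ILP would encode, discards those that violate the edge-preservation constraint, and keeps the assignment with the lowest total label distance. This is only practical for very small graphs.

```python
from itertools import permutations
from math import dist, inf


def match_symbol(model_v, model_e, doc_v, doc_e):
    """Find the cheapest embedding of a model graph in a document graph.

    model_v, doc_v: dict mapping vertex -> feature vector (tuple of floats)
    model_e, doc_e: dict mapping (u, v) -> feature vector (undirected edges)
    Returns (cost, mapping); mapping is None if no embedding exists.
    """
    # Store document edges in both directions so lookups are symmetric.
    doc_edges = dict(doc_e)
    doc_edges.update({(v, u): w for (u, v), w in doc_e.items()})

    model_nodes = list(model_v)
    best_cost, best_map = inf, None

    # Each injective assignment corresponds to one feasible setting of the
    # binary variables x[i][k] = 1 iff model vertex i maps to document vertex k.
    for image in permutations(doc_v, len(model_nodes)):
        phi = dict(zip(model_nodes, image))

        # Constraint: every model edge must map onto a document edge.
        if any((phi[i], phi[j]) not in doc_edges for i, j in model_e):
            continue

        # Objective: total distance between matched vertex and edge labels.
        cost = sum(dist(model_v[i], doc_v[phi[i]]) for i in model_nodes)
        cost += sum(dist(w, doc_edges[phi[i], phi[j]])
                    for (i, j), w in model_e.items())

        if cost < best_cost:
            best_cost, best_map = cost, phi

    return best_cost, best_map


# Hypothetical toy data: a two-region symbol and a four-region document.
model_v = {"a": (1.0, 0.0), "b": (0.0, 1.0)}
model_e = {("a", "b"): (0.5,)}
doc_v = {0: (0.9, 0.1), 1: (0.1, 1.0), 2: (0.5, 0.5), 3: (1.0, 0.0)}
doc_e = {(0, 1): (0.5,), (1, 2): (0.9,), (2, 3): (0.5,)}

cost, mapping = match_symbol(model_v, model_e, doc_v, doc_e)
print(mapping, round(cost, 4))
```

Because the objective sums label distances rather than requiring exact label equality, the noisy regions 0 and 1 are still selected as the best match for the symbol. In an ILP setting the same objective and constraints would be expressed over binary variables and handed to a solver, which scales far better than enumeration.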