This paper studies a content-based image retrieval algorithm for document image databases. Given a query image, the system returns the images in the database that are most similar to it overall. For document images, we propose an algorithm based on a hierarchical matching tree. An image is first segmented into several regions using paragraph marking based on paragraph height estimation, and each region is then segmented into line blocks; retrieval then matches document images by regions and line blocks through the hierarchical matching tree. We also describe the matching model and the texture character strings used for indexing. The algorithm is evaluated experimentally, and the results indicate that it is accurate and effective. Image scaling substantially reduces the response time of retrieval. This retrieval efficiency is highly valuable for document image databases.
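The two-level segmentation described above can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify how paragraph height is estimated, so the sketch assumes a binarized page (rows of 0/1 pixels), finds line blocks as runs of rows containing ink, and starts a new region wherever the blank gap exceeds a multiple of the median line height. The function names, the `gap_factor` parameter, and the tree layout are all hypothetical, and the texture character strings used for indexing are not modeled.

```python
from statistics import median


def find_line_blocks(image):
    """Return (top, bottom) row ranges, bottom exclusive, of runs of rows containing ink."""
    blocks = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            blocks.append((start, y))      # the text line ended on the previous row
            start = None
    if start is not None:                  # the page ends inside a text line
        blocks.append((start, len(image)))
    return blocks


def group_into_regions(blocks, gap_factor=1.0):
    """Group line blocks into regions (assumed rule: a large blank gap starts a new region)."""
    if not blocks:
        return []
    line_height = median(bottom - top for top, bottom in blocks)
    regions = [[blocks[0]]]
    for prev, cur in zip(blocks, blocks[1:]):
        gap = cur[0] - prev[1]             # blank rows between consecutive lines
        if gap > gap_factor * line_height:
            regions.append([cur])
        else:
            regions[-1].append(cur)
    return regions


def build_tree(image, gap_factor=1.0):
    """Build a page -> regions -> line blocks hierarchy (hypothetical layout)."""
    regions = group_into_regions(find_line_blocks(image), gap_factor)
    return {
        "regions": [
            {"span": (lines[0][0], lines[-1][1]), "lines": lines}
            for lines in regions
        ]
    }


# Toy page: two text lines, a one-row gap, then a four-row gap and one more line.
ink, blank = [0, 1, 1, 0], [0, 0, 0, 0]
page = [ink, ink, blank, ink, ink, blank, blank, blank, blank, ink, ink]
tree = build_tree(page)
```

On the toy page, the first two lines fall into one region and the last line forms a second region, giving a two-level tree that a matcher could compare level by level.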