We present a novel mobile printed document retrieval system that utilizes both text and low bit-rate features. On the client phone, text is detected using an algorithm based on edge-enhanced Maximally Stable Extremal Regions. The title text image patch is rectified using a gradient-based algorithm and recognized using Optical Character Recognition. Low bit-rate image features are extracted from the query image. Both the text and the compressed features are sent to a server. On the server, the title text is used for on-line search and the features are used for image-based comparison. The proposed system is capable of web-scale document retrieval using title text, without the need to construct a document image database. Using features for image-based comparison, we can reliably match retrieved documents to the query document. Finally, by sending text and low bit-rate features instead of the query image itself, we can reduce the transmitted query size significantly.
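The two-stage server flow described above (shortlist candidates by title text, then confirm with image features) can be illustrated with a minimal sketch. This is not the system's implementation: the `Query` class, the `title_score` and `retrieve` functions, the thresholds, and the use of integer feature ids in place of compressed descriptors are all hypothetical. An in-memory dictionary stands in for the on-line search, and set overlap stands in for image-based comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical client payload: recognized title text plus compact features."""
    title_text: str
    feature_ids: frozenset  # assumed stand-in for low bit-rate descriptors


def title_score(query_title: str, candidate_title: str) -> float:
    """Jaccard overlap of lowercase title words (stand-in for on-line search)."""
    q = set(query_title.lower().split())
    c = set(candidate_title.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def retrieve(query, corpus, min_title_score=0.5, min_shared=3):
    """Shortlist candidates by title text, then verify by feature overlap.

    corpus maps a document title to its feature ids. Returns the title of
    the best verified candidate, or None if nothing passes both stages.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    shortlist = [
        title for title in corpus
        if title_score(query.title_text, title) >= min_title_score
    ]
    best, best_shared = None, 0
    for title in shortlist:
        shared = len(query.feature_ids & corpus[title])
        if shared >= min_shared and shared > best_shared:
            best, best_shared = title, shared
    return best


# Toy data: two candidates share most title words with the query,
# but only one shares enough features to be confirmed.
corpus = {
    "Mobile Document Retrieval with Text": frozenset({1, 2, 3, 4, 5}),
    "Mobile Document Retrieval with Images": frozenset({10, 11, 12, 13}),
    "Unrelated Paper": frozenset({1, 2, 3, 4}),
}
query = Query("mobile document retrieval with text", frozenset({1, 2, 3, 4, 9}))
match = retrieve(query, corpus)
```

In this toy example both similarly titled documents survive the text stage, and the feature stage selects the one that matches the query image. "Unrelated Paper" shares features with the query but is never compared, because its title does not pass the shortlist.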