Content-based retrieval of historical Ottoman documents stored as textual images

5 Author(s)
E. Saykol (Dept. of Comput. Eng., Bilkent Univ., Ankara, Turkey); A. K. Sinop; U. Gudukbay; O. Ulusoy
more authors

There is an accelerating demand for access to the visual content of documents stored in historical and cultural archives. The availability of electronic imaging tools and effective image processing techniques makes it feasible to process the multimedia data in large databases. A framework for content-based retrieval of historical documents in the Ottoman Empire archives is presented. The documents are stored as textual images, which are compressed by constructing a library (codebook) of the symbols occurring in a document; the symbols in the original image are then replaced with pointers into the codebook to obtain a compressed representation of the image. Features in the wavelet and spatial domains, based on the angular and distance spans of shapes, are used to extract the symbols. To perform content-based retrieval in the historical archives, a query is specified as a rectangular region in an input image, and the same symbol-extraction process is applied to the query region. Queries are processed on the codebooks of the documents, and the query images are located in the resulting documents using the pointers in the textual images. The query process does not require decompression of the images. The new content-based retrieval framework is also applicable to many other document archives using different scripts.
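The codebook idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the symbols have already been extracted as small binary bitmaps with known positions, and it uses exact bitmap equality as a stand-in for the wavelet and spatial-domain shape features described above. The names `compress`, `query`, `A`, `B`, and `page` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Symbol = Tuple[Tuple[int, ...], ...]   # a small binary bitmap, stored as rows of 0/1
Placement = Tuple[int, int, int]       # (codebook index, x, y)


def compress(symbols: List[Tuple[Symbol, int, int]]) -> Tuple[List[Symbol], List[Placement]]:
    """Build a codebook of distinct symbols and replace each occurrence with a pointer."""
    codebook: List[Symbol] = []
    index_of: Dict[Symbol, int] = {}
    pointers: List[Placement] = []
    for bitmap, x, y in symbols:
        # Assumption: exact equality replaces the paper's feature-based matching.
        if bitmap not in index_of:
            index_of[bitmap] = len(codebook)
            codebook.append(bitmap)
        pointers.append((index_of[bitmap], x, y))
    return codebook, pointers


def query(codebook: List[Symbol], pointers: List[Placement],
          query_symbol: Symbol) -> List[Tuple[int, int]]:
    """Locate a query symbol using only the codebook and pointers (no decompression)."""
    try:
        idx = codebook.index(query_symbol)
    except ValueError:
        return []  # the symbol does not occur in this document
    return [(x, y) for i, x, y in pointers if i == idx]


# Two toy symbols and a "page" on which the first symbol occurs twice.
A: Symbol = ((1, 0), (1, 1))
B: Symbol = ((0, 1), (1, 1))
page = [(A, 0, 0), (B, 5, 0), (A, 10, 0)]

codebook, pointers = compress(page)
hits = query(codebook, pointers, A)
```

In this sketch the page image is never rebuilt: the query is matched against the codebook once, and the pointers then give every position at which the matching symbol occurs.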

Published in:

IEEE Transactions on Image Processing (Volume: 13, Issue: 3)