Fast retrieval of electronic documents in digital libraries

2 Author(s)
Wang, J.T.L.; Dept. of Comput. & Inf. Sci., New Jersey Inst. of Technol., Newark, NJ, USA; Chia-Yo Chang

This paper presents an index structure for retrieving electronic documents in digital libraries. The documents considered may contain mistyped words or spelling errors. Given a query string (e.g., a search key), we want to find those documents that approximately contain the query, i.e., a certain number of insertions, deletions, and mismatches are allowed when matching the query with a word (or phrase) in the documents. Our approach is to store the documents sequentially in a database and hash their “fingerprints” into a number of “fingerprint files”. When a query is given, its fingerprints are also hashed into the files, and a histogram of votes is constructed over the documents. We derive a lower bound on the number of votes a qualifying document must receive, which allows a large number of nonqualifying documents (i.e., those whose votes fall below the lower bound) to be pruned during searching. The paper presents experimental results that demonstrate the effectiveness of the index structure and the lower bound.
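
The abstract does not define the fingerprints or state the lower bound, so the following is only a rough sketch of the vote-and-prune idea under stated assumptions. It treats overlapping q-grams as the fingerprints, replaces the on-disk fingerprint files with an in-memory dictionary, and uses the classical q-gram lemma as the pruning bound (a query of length m within k edit operations of a substring shares at least m - q + 1 - kq q-grams with it). The names `qgrams`, `build_index`, and `candidates` are hypothetical, and none of this should be read as the authors' actual construction.

```python
from collections import defaultdict


def qgrams(text, q):
    """Return all overlapping length-q substrings (the assumed 'fingerprints')."""
    return [text[i:i + q] for i in range(len(text) - q + 1)]


def build_index(docs, q):
    """Map each q-gram to the set of ids of documents that contain it.

    Stand-in for the paper's fingerprint files; an in-memory dict is an
    assumption made for this sketch.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for gram in set(qgrams(text, q)):
            index[gram].add(doc_id)
    return index


def candidates(index, query, q, k, num_docs):
    """Return ids of documents whose vote count reaches the lower bound.

    Each q-gram position of the query casts one vote for every document
    containing that q-gram. The threshold is the q-gram lemma bound, used
    here in place of the paper's (unstated) bound.
    """
    threshold = len(query) - q + 1 - k * q
    if threshold <= 0:
        # The bound prunes nothing; every document remains a candidate.
        return list(range(num_docs))
    votes = defaultdict(int)
    for gram in qgrams(query, q):
        for doc_id in index.get(gram, ()):
            votes[doc_id] += 1
    return sorted(d for d, v in votes.items() if v >= threshold)


docs = [
    "fast retrieval of electronic documents",
    "digital libraries and indexing",
    "approximate string matching",
]
index = build_index(docs, q=2)
# "retreival" is a misspelling of "retrieval" (two mismatches).
survivors = candidates(index, "retreival", q=2, k=2, num_docs=len(docs))
```

Documents that survive the filter are only candidates: a full approximate-matching pass (for example, an edit-distance computation) would still be needed to confirm that they contain the query within k errors.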

Published in:

Tools with Artificial Intelligence, 1995. Proceedings., Seventh International Conference on

Date of Conference:

5-8 Nov 1995