Search engine coverage of the OAI-PMH corpus
McCown, F.
Liu, X.
Nelson, M.L.
Zubair, M.
Dept. of Comput. Sci., Old Dominion Univ., Norfolk, VA, USA
This paper appears in: Internet Computing, IEEE
Publication Date: March-April 2006
Volume: 10
Issue: 2
On page(s): 66-73
ISSN: 1089-7801
INSPEC Accession Number: 8972223
Digital Object Identifier: 10.1109/MIC.2006.41
Current Version Published: 2006-03-20
Abstract
Having indexed much of the "surface" Web, search engines are now using various approaches to index the "deep" Web. At the same time, institutional repositories and digital libraries are adopting the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose their holdings. The authors harvested nearly 10 million records from OAI-PMH repositories. From these records, they extracted 3.3 million unique resource URLs and then searched for samples drawn from this collection to determine how much of the OAI-PMH corpus the three major search engines have indexed.
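The extraction and sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes records are exposed as unqualified Dublin Core (oai_dc), that resource URLs appear in dc:identifier elements, and that a simple uniform random sample is acceptable. The embedded response, the example.org URLs, and the function names extract_resource_urls and sample_urls are all hypothetical.

```python
import random
import xml.etree.ElementTree as ET

# Dublin Core element namespace used by the oai_dc metadata format.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, hand-written ListRecords response (not real harvested data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
          <dc:identifier>ISBN 0-00-000000-0</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>http://example.org/papers/2.html</dc:identifier>
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_resource_urls(xml_text):
    """Return the set of unique http(s) URLs found in dc:identifier elements."""
    root = ET.fromstring(xml_text)
    urls = set()
    # Only dc:identifier is inspected; the OAI header identifier lives in a
    # different namespace and is therefore skipped.
    for element in root.iter("{%s}identifier" % DC_NS):
        value = (element.text or "").strip()
        if value.startswith(("http://", "https://")):
            urls.add(value)
    return urls


def sample_urls(urls, k, seed=0):
    """Draw up to k URLs uniformly at random, reproducibly for a given seed."""
    rng = random.Random(seed)
    pool = sorted(urls)  # sort so the sample does not depend on set ordering
    return rng.sample(pool, min(k, len(pool)))


if __name__ == "__main__":
    unique_urls = extract_resource_urls(SAMPLE_RESPONSE)
    print(len(unique_urls), "unique resource URLs")
    for url in sample_urls(unique_urls, 1):
        print("would query search engines for:", url)
```

In a real harvest the XML would come from paged ListRecords requests against each repository, and each sampled URL would then be submitted to the search engines being compared; both steps are omitted here.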