This paper analyzes the computational complexity of finding relevant documents on the Web. Given a search query with n significant terms, the relevant documents retrieved by search engines contain at least k of those terms. The threshold k depends on the document collection and is determined experimentally when the collection is formed. Algorithms are then provided to compute a similarity ranking. The analysis is based on combinatorial theory, and theorems bounding the runtime complexity of the algorithms are proven.
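As a rough illustration of the retrieval criterion described above, the following minimal sketch keeps only documents that contain at least k of the query's significant terms and orders them by how many terms they match. The function name `rank_by_term_overlap`, the set-of-terms document representation, and the use of the raw match count as a similarity score are all assumptions made for this example; the abstract does not specify the paper's actual algorithms or similarity measure.

```python
from typing import Dict, List, Set, Tuple


def rank_by_term_overlap(
    query_terms: Set[str],
    documents: Dict[str, Set[str]],
    k: int,
) -> List[Tuple[str, int]]:
    """Return (doc_id, matched_count) pairs for documents containing at
    least k of the query terms, ordered by matched_count descending.

    Hypothetical helper: the overlap count stands in for a similarity
    score and is not taken from the paper.
    """
    scored: List[Tuple[str, int]] = []
    for doc_id, doc_terms in documents.items():
        # Number of significant query terms present in this document.
        matched = len(query_terms & doc_terms)
        if matched >= k:
            scored.append((doc_id, matched))
    # Highest overlap first; ties broken by doc_id for a deterministic order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    query = {"web", "search", "complexity"}
    docs = {
        "a": {"web", "search", "engine"},
        "b": {"complexity"},
        "c": {"web", "search", "complexity", "bounds"},
    }
    print(rank_by_term_overlap(query, docs, k=2))
```

In this sketch each document is scored with a single set intersection, so raising k only shrinks the result list; it does not change the per-document work.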