As the amount of digital information available keeps growing, so does the need for advanced information retrieval (IR) systems. This wealth of digital information presents a major data-analysis challenge: how to manipulate, analyze, and understand large quantities of complex data becomes extremely important. Over the past decades, significant progress has been made in IR. However, many challenges remain. First, most Web search engines take a short text query as input and output a ranked list of documents. The retrieval decision is based primarily on the current query and the document collection, and search requests are generally treated in isolation: the results for a given query are identical, regardless of the user or the context in which the request is made. However, it is unlikely that different users are so similar in their interests that one standardized way of retrieving information fits all needs. Different users may use the same query to search for different kinds of information. Moreover, even the same user may use identical queries to express different information needs. For example, a person may use "IRIX" to mean information retrieval in context at one time, but the IRIX operating system at another. Current Web search engines cannot distinguish these two cases because the user's search context is not considered. Second, IR is, in general, an interactive process. A user's information need is rarely satisfied with just one iteration of search. Under the current document-centered retrieval paradigm, interactive retrieval is treated as a sequence of independent, simple retrieval decisions. Information about the search history is ignored, which makes the retrieval performance of existing IR systems inherently non-optimal.
However, it has been noted that analysis of task-oriented user sessions provides useful insight into users' query behavior. Third, most present IR systems, including general search engines (e.g., Google and Yahoo) and scientific literature search engines (e.g., PubMed and the ACM Digital Library), use keywords to query and index documents. However, this traditional keyword-based IR model provides little semantic context for understanding user information needs. For example, a keyword usually has several senses, and its meaning is ambiguous without context. In addition, one meaning can be expressed by many keywords. Thus, integrating semantic context into IR systems, based on the user's information need and the user's understanding of the documents in the collection, can be expected to improve IR performance.