DOTS: Detection of Off-Topic Search via Result Clustering

2 Author(s)
Goharian, N.; Illinois Inst. of Technol., Chicago; Platt, A.

Document dissemination is often limited to a "need to know" basis so as to better maintain organizational trade secrets. Retrieving documents that are off-topic to a user's predefined area of information need (task) via a search engine is potentially a violation of access rights and is a concern to every private, commercial, and governmental organization. Such misuse, defined as "off-topic access to sensitive data by an authorized user", is the second most prevalent form of computer crime after viruses, according to a recent Computer Security Institute/Federal Bureau of Investigation study. We present a content-based off-topic detection approach that uses query result clustering to detect off-topic searches. This approach supports higher detection precision than the state of the art. Multiple methods for picking the "good" clusters are proposed, and their effect on the detection rate and precision is evaluated. High detection precision is critical, because a false access violation accusation unfairly and inappropriately subjects the user to scrutiny. Our empirical results show that clustering query results can significantly reduce such false positives.
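The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of detecting off-topic searches through result clustering, not the authors' method. It assumes bag-of-words term-frequency vectors, cosine similarity, a single-pass (leader) clustering of the query results, and "largest cluster" as a stand-in for one possible way of picking a "good" cluster. The function names, the textual task profile, and both thresholds are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (lowercased whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_results(vectors, join_threshold=0.3):
    """Single-pass clustering (an assumed technique, not from the paper).

    Each vector joins the most similar existing cluster if the similarity
    reaches join_threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, vec in enumerate(vectors):
        best, best_sim = None, join_threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            # Summed counts act as the centroid; cosine ignores the scale.
            best["centroid"].update(vec)
            best["members"].append(i)
    return clusters


def is_off_topic(result_texts, profile_text, off_topic_threshold=0.2):
    """Flag a search whose dominant result cluster is far from the task profile.

    Hypothetical selection rule: the largest cluster is treated as "good".
    """
    if not result_texts:
        return False
    clusters = cluster_results([vectorize(t) for t in result_texts])
    good = max(clusters, key=lambda c: len(c["members"]))
    return cosine(good["centroid"], vectorize(profile_text)) < off_topic_threshold


profile = "network intrusion detection firewall security"
on_topic_results = [
    "firewall rules for network security",
    "intrusion detection in network traffic",
    "network security firewall logs",
]
off_topic_results = [
    "quarterly salary payroll report",
    "employee salary payroll data",
    "payroll salary tax records",
]
print(is_off_topic(on_topic_results, profile))
print(is_off_topic(off_topic_results, profile))
```

Comparing the profile against a selected cluster, rather than against every returned document, is what lets a few stray results be ignored instead of triggering a flag, which is in line with the abstract's emphasis on reducing false positives.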

Published in:

Intelligence and Security Informatics, 2007 IEEE

Date of Conference:

23-24 May 2007