Relative N-gram signatures: Document visualization at the level of character N-grams

3 Author(s)

The Common N-Gram (CNG) classifier is a text classification algorithm based on the comparison of frequencies of character n-grams (strings of characters of length n) that are the most common in the considered documents and classes of documents. We present a text analytic visualization system that employs the CNG approach for text classification and uses the differences in frequency values of common n-grams in order to visually compare documents at the sub-word level. The visualization method provides both an insight into n-gram characteristics of documents or classes of documents and a visual interpretation of the workings of the CNG classifier.
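The comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives no formula, so the sketch assumes the relative-difference dissimilarity commonly associated with CNG: for each n-gram in either profile, the squared difference of the two normalized frequencies divided by their mean. The function names (`profile`, `relative_differences`, `dissimilarity`, `classify`) and the defaults `n=3` and `size=100` are hypothetical choices made for illustration.

```python
from collections import Counter


def profile(text, n=3, size=100):
    """Map the `size` most frequent character n-grams of `text`
    to their normalized frequencies (assumed profile definition)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    counts = Counter(grams)
    # If the text is shorter than n, counts is empty and so is the profile.
    return {g: c / total for g, c in counts.most_common(size)}


def relative_differences(p1, p2):
    """Signed relative frequency difference for every n-gram in either
    profile; an n-gram missing from a profile has frequency 0 there."""
    diffs = {}
    for g in set(p1) | set(p2):
        f1 = p1.get(g, 0.0)
        f2 = p2.get(g, 0.0)
        # f1 + f2 > 0 because g occurs in at least one profile.
        diffs[g] = (f1 - f2) / ((f1 + f2) / 2)
    return diffs


def dissimilarity(p1, p2):
    """Sum of squared relative differences (assumed CNG-style distance)."""
    return sum(d * d for d in relative_differences(p1, p2).values())


def classify(doc, class_texts, n=3, size=100):
    """Return the label whose class profile is least dissimilar to `doc`.
    `class_texts` maps each label to the concatenated text of that class."""
    doc_profile = profile(doc, n, size)
    return min(
        class_texts,
        key=lambda label: dissimilarity(
            doc_profile, profile(class_texts[label], n, size)
        ),
    )
```

Under these assumptions, the per-n-gram values returned by `relative_differences` are the kind of quantity a visualization of frequency differences could plot, with the sign showing which document uses an n-gram more often.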

Published in:

Visual Analytics Science and Technology (VAST), 2012 IEEE Conference on

Date of Conference:

14-19 Oct. 2012