SOM-based methodology for building large text archives

2 Author(s)
Azcarraga, A.P. ; Program for Res. into Intelligent Syst., Nat. Univ. of Singapore, Singapore ; Yap, T.N., Jr.

Self-organizing maps (SOMs), such as the WEBSOM, have been shown to scale up to very large datasets, and they also allow for a novel mode of navigating through a large collection of text documents. The entire text collection is presented to a user as a regular map, where each point in the map is associated with a group of documents that are likely to be composed of similar terms and phrases. In addition, the closer two points are in the map, the more similar their associated documents are. Thus, once an interesting document is found in the map, the user just has to click around the vicinity of that document to retrieve other similar documents. A major drawback of SOMs, however, is the long training time required, especially for document collections where both the volume and the dimensionality are huge. We demonstrate, using a widely studied Reuters collection, how the size of the initial text collection is progressively and drastically reduced from the raw document collection to the final SOM-based text archive.
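The map-based navigation described in the abstract can be illustrated with a small, generic SOM. This is a minimal sketch only: it is not the authors' methodology or the WEBSOM implementation. The function names (`train_som`, `best_matching_unit`, `build_archive`, `nearby_documents`), the 3x3 grid, the decay schedules, and the four toy term-count vectors are all assumptions made for illustration, and the toy data is not the Reuters collection.

```python
import math
import random


def best_matching_unit(weights, vec):
    """Return (row, col) of the map node whose weight vector is closest to vec."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - vi) ** 2 for wi, vi in zip(w, vec))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood shrinks over time
        for vec in rng.sample(vectors, len(vectors)):
            br, bc = best_matching_unit(weights, vec)
            # Every node is visited for every document in every epoch,
            # which is where the long training time comes from.
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (vec[i] - w[i])
    return weights


def build_archive(weights, docs):
    """Map each node to the ids of the documents whose closest node it is."""
    archive = {}
    for doc_id, vec in docs.items():
        archive.setdefault(best_matching_unit(weights, vec), []).append(doc_id)
    return archive


def nearby_documents(archive, node, radius=1):
    """Collect documents stored within `radius` grid steps of `node`."""
    r0, c0 = node
    found = []
    for (r, c), ids in archive.items():
        if max(abs(r - r0), abs(c - c0)) <= radius:
            found.extend(ids)
    return sorted(found)


# Hypothetical toy term-count vectors (four terms, two rough topics).
docs = {
    "grain-1": [3, 2, 0, 0],
    "grain-2": [2, 3, 0, 0],
    "oil-1": [0, 0, 3, 2],
    "oil-2": [0, 0, 2, 3],
}
weights = train_som(list(docs.values()), rows=3, cols=3)
archive = build_archive(weights, docs)
start = best_matching_unit(weights, docs["oil-1"])
print(start, nearby_documents(archive, start, radius=1))
```

Clicking "around the vicinity" of a document corresponds here to calling `nearby_documents` with a small radius around that document's node. In this sketch the cost of training grows with the product of epochs, documents, map nodes, and vector dimensions, which suggests why shrinking the collection before training matters for large, high-dimensional text archives.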

Published in:

Proceedings of the Seventh International Conference on Database Systems for Advanced Applications, 2001

Date of Conference:

21 April 2001