Entity networks have been proposed in many systems as a way to fuse unstructured text data and to support the analysis of entity relationships. With appropriate visualization, an entity network can illustrate the relations among the entities extracted from text. However, such systems typically generate too many entities and links from a single document, and many of them are trivial. A complicated entity network makes it difficult to grasp the key points of the text, and complicated networks are also harder to combine effectively across time. Using novel and highly efficient algorithms for shortest-path estimation and tracing, this paper proposes a simplification scheme for entity networks with two main functions. The first divides an entity network into segments, each representing a meaningful event. The second identifies the topic by extracting the primary nodes and the paths that link them. With the simplified network, one can observe the overall structure and identify the key concepts and their relationships. The examples in this paper demonstrate the purpose and effectiveness of the simplification scheme.
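As a rough illustration of the second function (extracting primary nodes and the paths that link them), the following minimal sketch may help. It is not the paper's algorithm: the abstract does not say how primary nodes are chosen or how shortest paths are estimated, so this sketch assumes an unweighted, undirected network, ranks nodes by degree, and uses plain breadth-first search. The names `shortest_path` and `simplify`, the parameter `k`, and the sample entities are all hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path (list of nodes) via breadth-first search, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def simplify(graph, k=2):
    """Keep the k highest-degree nodes and the shortest paths between them.

    Degree ranking is an assumption made for this sketch; ties are broken
    alphabetically so the result is deterministic.
    """
    primary = sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]
    edges = set()
    for a, b in combinations(primary, 2):
        path = shortest_path(graph, a, b)
        if path:
            # Store each edge as an unordered pair.
            edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return primary, edges


# Hypothetical entity network: adjacency sets keyed by entity name.
graph = {
    "Acme": {"Bob", "Carol", "Paris"},
    "Bob": {"Acme", "Dave"},
    "Carol": {"Acme"},
    "Paris": {"Acme"},
    "Dave": {"Bob", "Globex", "Eve"},
    "Globex": {"Dave", "Frank"},
    "Eve": {"Dave"},
    "Frank": {"Globex"},
}

primary, edges = simplify(graph, k=2)
```

In this toy network the two highest-degree entities are kept together with the single path joining them, and the peripheral entities drop out. Segmenting a network into events, the first function described above, is not covered by this sketch.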
Published in: Eighth International Conference on Hybrid Intelligent Systems (HIS '08), 2008
Date of Conference: 10-12 Sept. 2008