Abstract:
As the World Wide Web grows rapidly, the volume of text available online is increasing at an incredible rate. Managing a corpus of documents is critical for many areas of science, industry, and culture. For example, bioengineering researchers who study a new generation of advanced materials frequently need to identify and understand a comprehensive body of literature describing associations between material features of interest. However, there is no inspection technique to help such researchers, who need to make critical decisions based on their understanding of a corpus of documents. In this paper, we present a text visualization approach, TOPEXPLORER, which extracts and visualizes topic models of bioengineering document collections. TOPEXPLORER displays text data in a logical layout so that users can inspect and understand the relations among documents. It applies three probabilistic topic modeling algorithms as complementary methods to systematically discover hidden thematic structures in a collection of documents. In the evaluation, we assessed TOPEXPLORER by building topic models on 600 documents in the bioengineering domain. The interactive visualization of TOPEXPLORER allows users to explore the hidden structure that a topic model discovers. TOPEXPLORER helps users understand and explore the output of models by effectively organizing, summarizing, and visualizing a corpus and by supporting interaction with it.
Date of Conference: 31 July 2020 - 01 August 2020
Date Added to IEEE Xplore: 29 September 2020