I. Introduction
A text cube is a multidimensional data structure that holds text documents in its cells, where the dimensions correspond to different aspects (e.g., topic, time, location) of the corpus. Text cube analysis has proven to be a powerful text analytics tool for a wide range of applications in bioinformatics, healthcare, and business intelligence. For example, by organizing a news corpus into a three-dimensional topic-time-location cube, decision makers can easily browse the corpus and retrieve the desired articles with simple queries (e.g., (Sports, 2017, USA)). Any text mining primitive, such as sentiment analysis, can then be applied to the retrieved data to gain useful insights. As another example, one can organize a corpus of biomedical research papers into a cube structure based on different facets (e.g., disease, gene, protein). Such a text cube allows people to easily identify relevant biomedical research papers and acquire useful information for disease treatment.
Text cube construction on a news corpus with three dimensions: topic, location, and time. Each document must be assigned one label in each of the three dimensions.
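To make the topic-time-location example concrete, the following is a minimal sketch of a text cube as a mapping from label tuples to lists of documents. It assumes that each document already carries one label per dimension; the class names `Document` and `TextCube`, the `query` method, and the three sample documents are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Document(NamedTuple):
    """A document with one (assumed already known) label per dimension."""
    doc_id: int
    text: str
    topic: str
    time: int
    location: str


class TextCube:
    """Illustrative three-dimensional cube: (topic, time, location) -> documents."""

    def __init__(self):
        # Each cell is identified by a (topic, time, location) tuple.
        self._cells = defaultdict(list)

    def add(self, doc):
        self._cells[(doc.topic, doc.time, doc.location)].append(doc)

    def query(self, topic=None, time=None, location=None):
        """Return documents matching the given labels; None leaves a dimension open."""
        wanted = (topic, time, location)
        hits = []
        for key, docs in self._cells.items():
            if all(w is None or w == k for w, k in zip(wanted, key)):
                hits.extend(docs)
        return hits


# Made-up sample documents, for illustration only.
cube = TextCube()
cube.add(Document(1, "Local team wins the championship.", "Sports", 2017, "USA"))
cube.add(Document(2, "Parliament debates the new budget.", "Politics", 2017, "UK"))
cube.add(Document(3, "Marathon record broken.", "Sports", 2016, "USA"))

# The query (Sports, 2017, USA) from the text selects a single cell.
sports_2017_usa = cube.query("Sports", 2017, "USA")

# Leaving dimensions unspecified aggregates over them.
all_sports = cube.query(topic="Sports")
```

Storing only non-empty cells in a dictionary keeps the sketch small. Any text mining primitive could then be run on the list returned by `query`.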