In document classification methods that use the words appearing in a document as features, it is important to determine the keywords that characterize each document. However, conventional methods select keywords based on their frequency and/or a particular importance index such as tf-idf, and cut off the remaining words using a threshold value. This discards the remaining information, such as rare combinations of words and time-dependent differences in their usage. In this paper, we show the usefulness of features based on the temporal patterns of all words and phrases in documents published over time within one domain. The documents are thus characterized by the temporal patterns of one or more importance indices, which take into account temporal differences in overall term usage. In the experiment, we compare the document classification results on two sets of bibliographical documents with respect to time dependency, using the two types of feature sets. For exploratory class labels, we show that it is possible to obtain classification rules that describe the relationship between a class and the temporal patterns that are important for its prediction.
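As a rough illustration of the idea of characterizing documents by temporal patterns of an importance index, the following minimal sketch computes a per-period tf-idf score for every term, labels each term's trend, and counts how many terms of each trend a document contains. Everything specific here is an assumption made for illustration and is not taken from the paper: the tf-idf variant, the use of a least-squares slope with a threshold `eps` to label patterns, the three labels, and the function names `term_index_series`, `pattern_label`, and `document_features`. The abstract does not describe how the authors actually extract temporal patterns.

```python
import math
from collections import Counter, defaultdict


def term_index_series(docs_by_period):
    """Return (sorted periods, {term: [summed tf-idf per period]}).

    docs_by_period maps a period (e.g. a year) to a list of tokenized
    documents. The tf-idf variant used here is an illustrative assumption.
    """
    periods = sorted(docs_by_period)
    series = defaultdict(lambda: [0.0] * len(periods))
    for i, period in enumerate(periods):
        docs = docs_by_period[period]
        n_docs = len(docs)
        # Document frequency of each term within this period only.
        df = Counter(term for doc in docs for term in set(doc))
        for doc in docs:
            for term, count in Counter(doc).items():
                idf = math.log(n_docs / df[term]) + 1.0
                series[term][i] += (count / len(doc)) * idf
    return periods, dict(series)


def slope(values):
    """Least-squares slope of values against their positions 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den if den else 0.0


def pattern_label(values, eps=0.05):
    """Label a term's series as rising, falling, or flat (eps is hypothetical)."""
    s = slope(values)
    if s > eps:
        return "rising"
    if s < -eps:
        return "falling"
    return "flat"


def document_features(doc, term_patterns):
    """Count how many tokens of a document fall under each temporal pattern."""
    counts = Counter(term_patterns[t] for t in doc if t in term_patterns)
    return {label: counts.get(label, 0) for label in ("rising", "flat", "falling")}


# Toy corpus (invented for this example): three periods, two documents each.
toy = {
    2001: [["svm", "kernel"], ["svm", "tree"]],
    2002: [["svm", "deep"], ["deep", "tree"]],
    2003: [["deep", "deep"], ["deep", "tree"]],
}
periods, series = term_index_series(toy)
patterns = {term: pattern_label(values) for term, values in series.items()}
print(patterns)
print(document_features(["deep", "tree", "svm"], patterns))
```

No term is cut off by a threshold in this sketch: every term contributes through its pattern label, and the resulting per-document counts could then be passed to any rule learner or classifier.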