A Case Study: News Classification Based on Term Frequency
Kroha, P.; Baeza-Yates, R.
Database and Expert Systems Applications, 2005. Proceedings. Sixteenth International Workshop on
26 Aug. 2005, Page(s): 428-432
Digital Object Identifier 10.1109/DEXA.2005.6
Summary: In this paper, we investigate how similar good news and bad news are in the context of long-term market trends, and we discuss the relation between information retrieval and text mining. We analyzed about 400 thousand news stories from the years 1999 to 2002, and we argue that the classification methods of information retrieval are not strong enough to solve problems like this one, because the meaning of news is given not only by the words used and their frequency but also by the structure of sentences and their context. We present results of our experiments and examples of news that support this statement.