Abstract:
This paper describes applications of the optimized pattern discovery framework to text and Web mining. In particular, we introduce a class of simple combinatorial patterns over phrases, called proximity phrase association patterns, and consider the problem of finding, within this whole class, the patterns that optimize a given statistical measure over a large collection of unstructured texts. For this class of patterns, we develop fast and robust text mining algorithms based on techniques from computational geometry and string matching. Finally, we apply these algorithms in experiments on interactive document browsing in a large text database and on keyword discovery from Web bases.
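As a rough illustration of the problem stated above, the following Python sketch treats a pattern as an ordered list of phrases plus a gap bound k, and searches for the pattern with the lowest classification error on labeled documents. The matching rule (each phrase must start at most k words after the previous one ends), the choice of classification error as the statistical measure, and all function names are assumptions made for this example; the abstract does not specify them. The search is plain brute force, not the computational-geometry and string-matching algorithms the paper develops.

```python
from itertools import permutations

def occurrences(phrase, words):
    """Start positions at which `phrase` (a tuple of words) occurs in `words`."""
    n = len(phrase)
    return [i for i in range(len(words) - n + 1)
            if tuple(words[i:i + n]) == tuple(phrase)]

def matches(phrases, k, words):
    """Assumed matching rule: the phrases occur in the given order, and each
    phrase starts at most k words after the end of the previous one."""
    ends = None  # positions just past the end of every partial match so far
    for phrase in phrases:
        starts = occurrences(phrase, words)
        if ends is not None:
            starts = [s for s in starts if any(0 <= s - e <= k for e in ends)]
        if not starts:
            return False
        ends = [s + len(phrase) for s in starts]
    return True

def classification_error(phrases, k, pos_docs, neg_docs):
    """Fraction of documents misclassified when 'matches' predicts 'positive'."""
    wrong = sum(not matches(phrases, k, doc) for doc in pos_docs)
    wrong += sum(matches(phrases, k, doc) for doc in neg_docs)
    return wrong / (len(pos_docs) + len(neg_docs))

def best_pattern(vocab, d, k, pos_docs, neg_docs):
    """Brute-force search over ordered d-tuples of distinct phrases from `vocab`."""
    return min(permutations(vocab, d),
               key=lambda p: classification_error(p, k, pos_docs, neg_docs))

# Toy labeled collection (hypothetical data, for illustration only).
pos_docs = ["the data mining tool".split(), "data and text mining".split()]
neg_docs = ["mining of coal data".split(),
            "data stored far far far away from mining".split()]
vocab = [("data",), ("mining",), ("text",)]

best = best_pattern(vocab, d=2, k=2, pos_docs=pos_docs, neg_docs=neg_docs)
print(best, classification_error(best, 2, pos_docs, neg_docs))
```

The brute-force search enumerates every ordered choice of d phrases, so it is only practical for toy vocabularies; it is meant to make the optimization objective concrete rather than to show an efficient method.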
Published in: Proceedings 2000 Kyoto International Conference on Digital Libraries: Research and Practice
Date of Conference: 13-16 November 2000
Date Added to IEEE Xplore: 06 August 2002
Print ISBN: 0-7695-1022-1