Efficient incremental phrase-based document clustering


Abstract:

Document clustering has become essential for applications that aim to extract information from huge corpora. Such applications face two main challenges: the first is representing documents efficiently and pairing that representation with an efficient similarity measure, and the second is dealing with the dynamic nature of the corpus. In this paper, an efficient document clustering model is introduced for incrementally storing and updating the clusters of a dataset. A new phrase-based similarity method is developed along with the model to calculate the similarity between documents and clusters. Experimental results show that the new clustering model can achieve more accurate results than traditional algorithms.
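
The abstract does not give the details of the model or of the similarity method, so the following is only a minimal sketch of the general idea of incremental, phrase-based clustering, not the paper's algorithm. It assumes that a "phrase" is a word bigram, that document-to-cluster similarity is the Jaccard overlap of phrase sets, and that a fixed threshold decides whether a new cluster is opened. The names `phrases`, `jaccard`, `Cluster`, and `IncrementalClusterer` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


def phrases(text, n=2):
    """Return the set of word n-grams in text (assumed stand-in for 'phrases')."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two phrase sets (assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class Cluster:
    doc_ids: list = field(default_factory=list)
    phrase_set: set = field(default_factory=set)


class IncrementalClusterer:
    """Single-pass clustering: each document is assigned once, on arrival."""

    def __init__(self, threshold=0.2, n=2):
        self.threshold = threshold  # assumed cutoff for joining a cluster
        self.n = n
        self.clusters = []

    def add(self, doc_id, text):
        """Assign a document to its most similar cluster, or open a new one."""
        doc_phrases = phrases(text, self.n)
        best, best_sim = None, 0.0
        for idx, cluster in enumerate(self.clusters):
            sim = jaccard(doc_phrases, cluster.phrase_set)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < self.threshold:
            self.clusters.append(Cluster())
            best = len(self.clusters) - 1
        # Update only the chosen cluster; earlier documents are not revisited.
        self.clusters[best].doc_ids.append(doc_id)
        self.clusters[best].phrase_set |= doc_phrases
        return best
```

In this sketch, adding a document touches only the cluster it joins, which is what makes the update incremental. The bigram phrases, the Jaccard measure, and the threshold value are all placeholders that a real system would replace with its own choices.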
Date of Conference: 11-15 November 2012
Date Added to IEEE Xplore: 14 February 2013
Conference Location: Tsukuba, Japan