Clustering, as an intelligent technique for mining XML documents, has been used as an effective way of grouping documents by their content or structure. A main step in many distance-based XML clustering algorithms is calculating pair-wise distances between documents; a time-efficient technique requires these distances to be determined quickly. For dynamic XML documents, the amount of change between versions cannot be predicted. Therefore, when clustered dynamic XML documents change only slightly, or when the changes affect only some of the clustered documents, recalculating all pair-wise distances every time would be highly redundant. In this paper we propose a time-efficient technique to reassess pair-wise distances between clustered dynamic XML documents that change over time, without performing redundant calculations, by considering the previously known distances and the set of changes that might have affected the document versions.
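The redundancy argument can be illustrated with a minimal sketch. It is not the technique proposed in the paper, only a simple stand-in: it keeps a cache of known distances and recomputes only the pairs that involve a changed document. The names `jaccard_distance`, `all_pairwise` and `refresh_pairwise` are hypothetical, and representing each document as a set of element paths with a Jaccard distance is an assumption made purely for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

Doc = Set[str]                      # assumption: a document is a set of element paths
Distances = Dict[FrozenSet[str], float]


def jaccard_distance(a: Doc, b: Doc) -> float:
    """Toy stand-in distance: 1 - |A ∩ B| / |A ∪ B| (0.0 for two empty documents)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def all_pairwise(docs: Dict[str, Doc],
                 distance: Callable[[Doc, Doc], float]) -> Distances:
    """Baseline: compute the distance for every unordered pair of documents."""
    return {frozenset((x, y)): distance(docs[x], docs[y])
            for x, y in combinations(docs, 2)}


def refresh_pairwise(docs: Dict[str, Doc],
                     cached: Distances,
                     changed: Iterable[str],
                     distance: Callable[[Doc, Doc], float]) -> Tuple[Distances, int]:
    """Reuse cached distances; recompute only pairs touching a changed document.

    Returns the refreshed distances and the number of pairs recomputed.
    """
    changed_ids = set(changed)
    updated = dict(cached)          # leave the caller's cache untouched
    recomputed = 0
    for x, y in combinations(docs, 2):
        if x in changed_ids or y in changed_ids:
            updated[frozenset((x, y))] = distance(docs[x], docs[y])
            recomputed += 1
    return updated, recomputed
```

With n documents of which k changed, this sketch recomputes the pairs that include at least one changed document rather than all n(n-1)/2 pairs. The paper's approach goes further by also using the set of changes themselves, which this sketch does not attempt.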
Date of Conference: 25-28 March 2008