Today, a deluge of data is collected across many fields. These massive datasets, which are often geographically distributed and owned by different organisations, are increasingly being mined, and as a consequence a large amount of knowledge is being produced. This raises the problem of efficient knowledge management in distributed data mining (DDM). The main aim of DDM is to fully exploit the benefits of distributed data analysis while minimising communication overhead. Existing DDM techniques perform partial analysis of local data at individual sites and then generate global models by aggregating the local results. These two steps are not independent, since naive approaches to local analysis may produce incorrect and ambiguous global data models. To overcome this problem, we present a tool called the "knowledge map", which easily and efficiently represents the knowledge built from the mining process on a large-scale distributed platform such as the Grid. This will also facilitate the integration and coordination of local mining processes and existing knowledge to increase the accuracy of the final models. This approach is being tested on very large datasets.
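The two-step pattern described above (partial analysis at each site, followed by aggregation into a global model) can be illustrated with a minimal sketch. This is not the knowledge map itself; it is only an assumed toy example in which the "model" is a mean and variance, and the names `LocalSummary`, `summarise_site`, `merge` and `aggregate` are hypothetical. It shows why the two steps are coupled: each site must ship a summary that the aggregation step can combine correctly (here via the standard pairwise merge of counts, means and squared deviations), whereas naively averaging the local means gives a different, incorrect global result.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LocalSummary:
    """Compact result of the local analysis at one site (hypothetical type)."""
    count: int
    mean: float
    m2: float  # sum of squared deviations from the local mean


def summarise_site(values: Iterable[float]) -> LocalSummary:
    """Step 1: partial analysis of the local data at an individual site."""
    data = list(values)
    n = len(data)
    if n == 0:
        return LocalSummary(0, 0.0, 0.0)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data)
    return LocalSummary(n, mean, m2)


def merge(a: LocalSummary, b: LocalSummary) -> LocalSummary:
    """Combine two summaries without access to the raw data."""
    n = a.count + b.count
    if n == 0:
        return LocalSummary(0, 0.0, 0.0)
    delta = b.mean - a.mean
    mean = a.mean + delta * b.count / n
    m2 = a.m2 + b.m2 + delta * delta * a.count * b.count / n
    return LocalSummary(n, mean, m2)


def aggregate(summaries: Iterable[LocalSummary]) -> LocalSummary:
    """Step 2: build the global model by aggregating the local results."""
    total = LocalSummary(0, 0.0, 0.0)
    for s in summaries:
        total = merge(total, s)
    return total


# Three sites with made-up data of different sizes.
sites: List[List[float]] = [[1.0, 2.0, 3.0], [10.0, 20.0], [5.0]]
local_results = [summarise_site(s) for s in sites]

global_model = aggregate(local_results)
global_variance = global_model.m2 / global_model.count  # population variance

# Naive aggregation: averaging local means ignores the site sizes.
naive_mean = sum(r.mean for r in local_results) / len(local_results)
```

Only three numbers per site cross the network, which keeps communication overhead low, yet the merged result matches what a centralised analysis of all the data would produce. The naive mean does not match, because it ignores how many records each site holds.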