Many deep Web data sources are accessible only through their query interfaces. For any domain of interest, there may be many such sources with varied query capabilities and content coverage. To obtain the large volume of valuable information in the deep Web, we need to integrate these large, heterogeneous sources. Schema matching is a critical problem in this integration process. This paper proposes a new holistic schema matching method based on data mining, named correlated-clustering, which mines positively correlated attributes to form potential attribute groups and finds synonym attributes by clustering. We implement the proposed algorithms and evaluate them experimentally. Experimental results show that our solution is accurate and effective.
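The two stages named above (grouping positively correlated attributes, then clustering to find synonyms) can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose correlation measure and clustering procedure are not given here. As assumptions, the sketch uses the Jaccard coefficient over schema co-occurrence as a stand-in correlation measure, a hypothetical threshold of 0.8, and a simple heuristic that treats attribute groups which never appear together in one query interface as synonym candidates. The names `SCHEMAS`, `jaccard`, `attribute_groups`, and `synonym_clusters`, and the toy book-search schemas, are invented for illustration.

```python
from itertools import combinations

# Toy query-interface schemas for a book-search domain (invented data).
SCHEMAS = [
    {"author", "title", "isbn"},
    {"writer", "title", "isbn"},
    {"first name", "last name", "title"},
    {"author", "title"},
    {"first name", "last name", "isbn"},
    {"writer", "isbn"},
]

def occurrences(schemas, attr):
    """Indices of the schemas that contain the attribute."""
    return {i for i, s in enumerate(schemas) if attr in s}

def jaccard(schemas, a, b):
    """Assumed correlation measure: co-occurrence count over union count."""
    oa, ob = occurrences(schemas, a), occurrences(schemas, b)
    union = oa | ob
    return len(oa & ob) / len(union) if union else 0.0

def attribute_groups(schemas, threshold=0.8):
    """Stage 1: merge attributes that co-occur strongly into groups."""
    attrs = sorted(set().union(*schemas))
    parent = {a: a for a in attrs}  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(attrs, 2):
        if jaccard(schemas, a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in attrs:
        groups.setdefault(find(a), []).append(a)
    return sorted(tuple(g) for g in groups.values())

def co_occur(schemas, g1, g2):
    """True if some schema contains an attribute from each group."""
    return any(bool(set(g1) & s) and bool(set(g2) & s) for s in schemas)

def synonym_clusters(schemas, groups):
    """Stage 2 (assumed heuristic): greedily cluster groups that never co-occur."""
    clusters = []
    for g in groups:
        for c in clusters:
            if not any(co_occur(schemas, g, member) for member in c):
                c.append(g)
                break
        else:
            clusters.append([g])
    return clusters

groups = attribute_groups(SCHEMAS)
clusters = synonym_clusters(SCHEMAS, groups)
```

On this toy input, stage 1 groups `first name` with `last name` and leaves the other attributes as singletons. Stage 2 then places `author`, `writer`, and the `first name`/`last name` group in one synonym cluster, while `isbn` and `title` each remain alone.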