In distributed heterogeneous environments, querying data from diverse sources efficiently and concisely is a key challenge. In this paper, we analyze the characteristics of hydrological data and study data integration in a grid environment. We propose a query processing procedure based on domain topics and grid infrastructure. Domain topics are defined according to domain query patterns, and data sources are described for each domain topic. In the subsequent schema mapping procedure, we focus on only a small subset of schemas. During query processing, a topic query is decomposed into sub-queries against the global schema, which are then mapped to local queries against the local schemas. Performance metrics are retrieved from the grid infrastructure to generate an optimal query plan. Query answers are cached and described as new sources for subsequent queries. Empirical results show that our approach works well in the hydrological science grid environment.
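
The procedure summarized above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the `Source` record, the `plan_query` function, the `AnswerCache` class, the source names, the field mappings, and the use of latency as the only performance metric are all hypothetical and chosen for illustration. The sketch decomposes a topic query into one sub-query per global attribute, restricts candidates to the sources described for that topic, maps each attribute to a local field, and picks the source with the lowest reported latency.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str          # hypothetical source identifier
    topic: str         # domain topic the source is described under
    mapping: dict      # global attribute -> local field name
    latency_ms: float  # assumed performance metric reported by the grid


def plan_query(topic, attributes, sources):
    """Map each global attribute to a local query on the fastest matching source."""
    plan = []
    for attr in attributes:
        # Only sources described for this topic are candidates, so schema
        # mapping touches a small subset of all local schemas.
        candidates = [s for s in sources if s.topic == topic and attr in s.mapping]
        if not candidates:
            raise LookupError(f"no source under topic {topic!r} provides {attr!r}")
        best = min(candidates, key=lambda s: s.latency_ms)
        plan.append((attr, best.name, best.mapping[attr]))
    return plan


class AnswerCache:
    """Simplified stand-in for caching answers and reusing them for later queries."""

    def __init__(self):
        self._answers = {}

    def get(self, topic, attributes):
        return self._answers.get((topic, frozenset(attributes)))

    def put(self, topic, attributes, answer):
        self._answers[(topic, frozenset(attributes))] = answer


# Illustrative source descriptions; all names and numbers are made up.
SOURCES = [
    Source("station_db", "rainfall", {"station_id": "stn", "rain_mm": "precip"}, 120.0),
    Source("regional_archive", "rainfall", {"rain_mm": "rainfall_mm"}, 40.0),
    Source("gauge_db", "water_level", {"station_id": "sid", "level_m": "lvl"}, 60.0),
]
```

Calling `plan_query("rainfall", ["station_id", "rain_mm"], SOURCES)` yields one `(global attribute, source, local field)` triple per sub-query. The cache here is keyed only on the topic and the requested attributes; the paper describes cached answers as new data sources, which a fuller version would register alongside the original sources.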