This paper proposes a model for integrating massive heterogeneous data, built on Lucene and XQuery. The model hides the distribution and heterogeneity of the underlying resources and provides transparent access through materialized database views. Query efficiency improves because an efficient categorization algorithm segments the data, which is then indexed with the open-source tool Lucene. The model also exploits XQuery's ability to process both structured and unstructured data, which addresses the significant differences among data sources as well as the efficiency of access to massive data.
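
As a rough illustration of the idea of indexing heterogeneous sources behind one query interface, the following minimal sketch flattens a relational-style row, an XML fragment, and a free-text note into plain text and builds a single inverted index over them. It is not the paper's implementation: it uses neither Lucene nor XQuery, and the function names (`text_from_row`, `text_from_xml`, `build_index`, `search`), the document ids, and the sample data are all hypothetical, chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def text_from_row(row):
    """Structured source: flatten a relational-style row (a dict) to text."""
    return " ".join(str(v) for v in row.values())


def text_from_xml(xml_string):
    """Semi-structured source: join all text nodes of an XML fragment."""
    root = ET.fromstring(xml_string)
    return " ".join(root.itertext())


def build_index(sources):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in sources.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data from three kinds of source.
sources = {
    "db:1": text_from_row({"id": 1, "title": "Lucene index tuning"}),
    "xml:1": text_from_xml(
        "<paper><title>XQuery over XML</title><topic>index</topic></paper>"
    ),
    "txt:1": "Free text notes about data integration",
}
index = build_index(sources)

print(sorted(search(index, "index")))
print(sorted(search(index, "data integration")))
```

The caller queries one index and does not need to know which kind of source a hit came from, which is the transparency property the abstract describes. A real system in the spirit of the paper would presumably replace the dictionary-based index with Lucene and the XML flattening step with XQuery expressions.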