XML is a natural way to express the uncertainty of the real world, so data from uncertain environments can be stored in XML format. To improve the efficiency of keyword search in uncertain environments, we index XML elements with Dewey codes, a prefix-based encoding method. With big data, however, the Dewey codes of elements become quite long, which makes judging the relationships among elements inefficient and requires large storage space. Thus, large XML data and complicated XML schemas are the bottlenecks of keyword search. In this paper, we incorporate the MapReduce mechanism to manage uncertain data in partitions, and design a parallel method for information retrieval. The XML fragments are stored across a distributed network and can be processed in parallel to retrieve the Smallest Lowest Common Ancestors (SLCAs) and return the k results with the largest probability values. Our experimental results show that our approach can improve the efficiency of parallel keyword search.
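To make the role of Dewey codes concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a Dewey code is written as dot-separated integers such as "1.1.2", so that the lowest common ancestor of two elements is their longest common prefix. The helper names (`parse`, `lca`, `slcas`, `top_k`), the brute-force SLCA computation, and the probability values in the usage example are all illustrative assumptions.

```python
from heapq import nlargest
from itertools import product


def parse(code):
    """Turn a Dewey code string such as "1.1.2" into a tuple of ints."""
    return tuple(int(part) for part in code.split("."))


def is_ancestor_or_self(a, b):
    """a is an ancestor of b (or equal to b) exactly when a is a prefix of b."""
    return b[:len(a)] == a


def lca(a, b):
    """Lowest common ancestor of two elements: their longest common prefix."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def slcas(keyword_lists):
    """Brute-force SLCAs (illustrative only, exponential in the keyword count).

    keyword_lists holds one list of Dewey codes per keyword. Every
    combination that picks one element per keyword yields a candidate LCA;
    the SLCAs are the candidates with no other candidate below them.
    """
    candidates = set()
    for combo in product(*keyword_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        candidates.add(node)
    return sorted(
        c for c in candidates
        if not any(d != c and is_ancestor_or_self(c, d) for d in candidates)
    )


def top_k(scored, k):
    """Return the k (node, probability) pairs with the largest probability."""
    return nlargest(k, scored, key=lambda pair: pair[1])
```

As a usage example under the same assumptions, suppose one keyword occurs at elements 1.1.2 and 1.2.1 and a second keyword at 1.1.3 and 1.3. Calling `slcas([[parse("1.1.2"), parse("1.2.1")], [parse("1.1.3"), parse("1.3")]])` yields the single element `(1, 1)`, because the root `(1,)` also covers both keywords but has a descendant that does too. Given hypothetical probability values, `top_k([((1, 1), 0.6), ((1, 2), 0.9), ((1, 3), 0.3)], 2)` keeps the two most probable results. The cost of `lca` grows with code length, which is the inefficiency the abstract attributes to long Dewey codes.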