With the increasing availability of location-based services (LBS) and the mobile internet, the volume of spatial data is growing rapidly. This growth poses new requirements and challenges for cloud environments, such as how to index and query large-scale spatial data efficiently. A scalable, distributed spatial index is a natural choice for effective spatial data analysis and query processing. Several approaches implement distributed indices and query processing with MapReduce, such as R-tree and Voronoi-based indices. However, the R-tree is unsuitable for parallelization, and query processing on a Voronoi-based index needs extra computation for localization or local index reconstruction. Compared with these two approaches, the regularity of a grid partition makes it much easier to scale and parallelize. An inverted index uses a limited number of index entries to index an unbounded number of data points. In this paper, we propose a new distributed spatial data index, the Inverted Grid Index, which combines an inverted index with a grid partition. Our index structure is simpler and well suited to large-scale parallel spatial query applications. We present MapReduce-based approaches that both construct the Inverted Grid Index and process kNN queries over large spatial data sets. Extensive experiments evaluate the scalability and performance of kNN query processing on our index structure. The results demonstrate the efficiency and scalability of our kNN query algorithm based on the Inverted Grid Index.
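To make the idea of combining an inverted index with a grid partition concrete, the following is a minimal single-process sketch, not the algorithm from this paper. It assumes 2D points, a uniform square cell of side `CELL`, and a map step that emits `(cell_id, point)` pairs followed by a reduce step that groups them into posting lists. The kNN search uses a generic ring-expansion strategy around the query cell, which is a common technique for grid indices and is swapped in here for illustration. All names (`CELL`, `cell_of`, `map_phase`, `reduce_phase`, `knn`) are hypothetical.

```python
import heapq
import math
from collections import defaultdict

CELL = 1.0  # hypothetical cell side length (an assumption, not from the paper)


def cell_of(point, cell=CELL):
    """Return the integer grid cell (column, row) containing a 2D point."""
    x, y = point
    return (math.floor(x / cell), math.floor(y / cell))


def map_phase(points, cell=CELL):
    """Map step: emit one (cell_id, point) pair per input point."""
    for p in points:
        yield cell_of(p, cell), p


def reduce_phase(pairs):
    """Reduce step: group points by cell id into posting lists."""
    index = defaultdict(list)
    for key, p in pairs:
        index[key].append(p)
    return dict(index)


def knn(index, q, k, cell=CELL):
    """Return the k nearest points to q by scanning rings of cells around q."""
    k = min(k, sum(len(v) for v in index.values()))
    if k <= 0:
        return []
    cx, cy = cell_of(q, cell)
    # Farthest non-empty cell (Chebyshev distance in cells) bounds the search.
    max_ring = max(max(abs(ix - cx), abs(iy - cy)) for ix, iy in index)
    found = []  # (distance, point) for every point seen so far
    ring = 0
    while ring <= max_ring:
        for ix in range(cx - ring, cx + ring + 1):
            for iy in range(cy - ring, cy + ring + 1):
                if max(abs(ix - cx), abs(iy - cy)) != ring:
                    continue  # visit only cells on the current ring
                for p in index.get((ix, iy), ()):
                    found.append((math.dist(q, p), p))
        if len(found) >= k:
            kth = heapq.nsmallest(k, found)[-1][0]
            # Any unseen point lies at least ring * cell away from q,
            # so the current top k cannot be improved.
            if kth <= ring * cell:
                break
        ring += 1
    return [p for _, p in heapq.nsmallest(k, found)]
```

In a real MapReduce deployment the map and reduce steps would run on separate workers and the posting lists would live in distributed storage; here they are plain Python functions so that the structure of the index, one entry per non-empty cell regardless of how many points fall in it, is easy to see.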