With the ever-increasing size of data sets, traditional parallel relational database solutions can be prohibitively expensive and may suffer from limited scalability. To perform large-scale data processing in a cost-effective manner, several NoSQL data processing systems have been proposed. In this paper, we devise a new distributed B-tree column indexing scheme for HBase, which supports indexing on non-row-key columns as well as parallel B-tree search in large data tables. Our experimental results demonstrate both the performance and scalability advantages of our indexing scheme on point queries, range queries, and aggregation operations, compared with HBase.