The range-join of two sets R and S is the set that contains all tuples (r, s) satisfying e1⩽|r-s|⩽e2, r∈R and s∈S. For computing the range-join of R and S in a hypercube of p processors, this paper presents an improved selection-based parallel algorithm that reduces the local memory from the O(n) required by the previous algorithm to O(m+n/p), where |R|=m, |S|=n and p⩽max{m,n}. The new algorithm also reduces the best-case time complexity from O(m/p log2 p+n/p log m) in the previous result to O(m+n/p log2 p) when m⩾plog, while maintaining cost optimality in the worst case. Unlike the previous algorithm, our algorithm works by selecting the median of R∪S to evenly partition the whole data set for a divide-and-conquer join in the next phase. We present an upper bound on the time complexity of the algorithm in the general case and show that its best-case time complexity is better than that of permutation-based range-join when n⩾plogp+1.
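To make the range-join definition concrete, here is a minimal sequential sketch in Python. It is not the paper's hypercube algorithm: it only illustrates the predicate e1⩽|r-s|⩽e2 on one processor. The function names `range_join_naive` and `range_join_sorted` are hypothetical, and the sketch assumes numeric elements with 0⩽e1⩽e2.

```python
from bisect import bisect_left, bisect_right


def range_join_naive(R, S, e1, e2):
    """Direct transcription of the definition: test every pair (r, s)."""
    return [(r, s) for r in R for s in S if e1 <= abs(r - s) <= e2]


def range_join_sorted(R, S, e1, e2):
    """Same result via sorting S and binary search (assumes 0 <= e1 <= e2).

    For a fixed r, the matching s values lie in two intervals:
    [r - e2, r - e1] and [r + e1, r + e2].
    """
    s_sorted = sorted(S)
    out = []
    for r in R:
        # Matches on the low side of r.
        lo = bisect_left(s_sorted, r - e2)
        hi = bisect_right(s_sorted, r - e1)
        out.extend((r, s) for s in s_sorted[lo:hi])
        # Matches on the high side of r; start no earlier than `hi`
        # so that s == r is not reported twice when e1 == 0.
        lo2 = max(bisect_left(s_sorted, r + e1), hi)
        hi2 = bisect_right(s_sorted, r + e2)
        out.extend((r, s) for s in s_sorted[lo2:hi2])
    return out
```

For example, with R = [1, 5, 10], S = [2, 4, 7, 12], e1 = 1 and e2 = 3, both functions return the same seven pairs, such as (1, 2), (5, 7) and (10, 12). The sorted variant is included only to show that the predicate decomposes into two interval lookups per element of R.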
Published in:
EUROMICRO 94. System Architecture and Integration. Proceedings of the 20th EUROMICRO Conference.
Date of Conference: 5-8 Sep 1994