MultiMap: Preserving disk locality for multidimensional datasets
Shao, M.
Schlosser, S.W.
Papadomanolakis, S.
Schindler, J.
Ailamaki, A.
Ganger, G.R.
Carnegie Mellon Univ., Pittsburgh, PA
This paper appears in: Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Publication Date: 15-20 April 2007
On page(s): 926-935
Location: Istanbul
ISBN: 1-4244-0803-2
INSPEC Accession Number: 9551965
Digital Object Identifier: 10.1109/ICDE.2007.367938
Current Version Published: 2007-06-04
Abstract
MultiMap is an algorithm for mapping multidimensional datasets so as to preserve the data's spatial locality on disks. Without revealing disk-specific details to applications, MultiMap exploits modern disk characteristics to provide full streaming bandwidth for one (primary) dimension and maximally efficient non-sequential access (i.e., minimal seek and no rotational latency) for the other dimensions. This is in contrast to existing approaches, which either severely penalize non-primary dimensions or fail to provide full streaming bandwidth for any dimension. Experimental evaluation of a prototype implementation demonstrates MultiMap's superior performance for range and beam queries. On average, MultiMap reduces total I/O time by over 50% when compared to traditional linearized layouts and by over 30% when compared to space-filling curve approaches such as Z-ordering and Hilbert curves. For scans of the primary dimension, MultiMap and traditional linearized layouts provide almost two orders of magnitude higher throughput than space-filling curve approaches.
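The abstract does not describe MultiMap's mapping itself, so the sketch below illustrates only the two baseline layouts it names: a traditional linearized (row-major) layout and Z-ordering. It assumes a small 2-D grid of cells, and the helper names (`row_major_index`, `z_order_index`, `max_neighbor_gap`) are hypothetical, chosen for illustration. The gap it measures is a distance in linear index units, used here as a rough stand-in for distance on disk, not a model of seek or rotational cost.

```python
def row_major_index(x, y, width):
    """Linearized (row-major) layout: x is the primary dimension."""
    return y * width + x


def z_order_index(x, y, bits=16):
    """Z-order (Morton) index: interleave the bits of x and y."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        index |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return index


def max_neighbor_gap(index_of, width, height):
    """Largest index distance between adjacent cells, per dimension."""
    gap_x = max(abs(index_of(x + 1, y) - index_of(x, y))
                for y in range(height) for x in range(width - 1))
    gap_y = max(abs(index_of(x, y + 1) - index_of(x, y))
                for y in range(height - 1) for x in range(width))
    return gap_x, gap_y


WIDTH = HEIGHT = 8  # illustrative 8x8 grid

row_major_gaps = max_neighbor_gap(
    lambda x, y: row_major_index(x, y, WIDTH), WIDTH, HEIGHT)
z_order_gaps = max_neighbor_gap(z_order_index, WIDTH, HEIGHT)

print("row-major (gap_x, gap_y):", row_major_gaps)  # (1, 8)
print("z-order   (gap_x, gap_y):", z_order_gaps)    # (11, 22)
```

On this grid the row-major layout keeps neighbors along x adjacent but separates neighbors along y by a full row, which matches the abstract's remark that linearized layouts penalize non-primary dimensions. Z-ordering has non-unit worst-case gaps in both dimensions, consistent with the remark that space-filling curves do not provide full streaming bandwidth for any dimension.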