Non-blocking Disk-Tape Join Algorithm for Data on Tertiary Storage Systems

4 Author(s)
Baoliang Liu (Harbin Institute of Technology); Jianzhong Li; Lei Nie; Yanqiu Zhang

Massive data accumulated by business and scientific applications have grown so large that they can be stored only on tape. To make full use of these data, tools for data analysis and data mining must be developed. In such applications, disk-resident data need to be joined with tape-resident data. Many disk-tape join methods have been proposed, for example CDT-NB and CDT-GH, but all of these algorithms are blocking: the user must wait quite a while before the first result can be seen. Since a disk-tape join often takes a long time to finish, it is desirable to produce join results as early as possible without degrading join performance too much. The non-blocking disk-tape join (NDT) presented in this paper is the first disk-tape join algorithm designed with this goal in mind. It has three phases: the hashing phase, the merging phase, and the probing phase. Join results can be produced in each phase. In the hashing phase, tuples of the disk-resident relation and the tape-resident relation are read into memory simultaneously and joined. The merging phase joins the tuples that were flushed to disk during the hashing phase. After the first two phases, the disk-resident relation has been partitioned, and it is joined with the remaining part of the tape-resident relation in the probing phase. Experimental results show that NDT produces join results much earlier than the state-of-the-art CDT-GH, and that the performance of NDT is about the same as that of CDT-GH.
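
The idea behind the hashing phase, reading both relations together and emitting matches as soon as they are found, can be illustrated with a generic in-memory symmetric hash join. This is only a sketch of that general technique, not the NDT algorithm from the paper: it assumes both inputs fit in memory, so it has no flushing to disk and no merging or probing phase, and the names `symmetric_hash_join`, `disk_tuples`, `tape_tuples`, and `key` are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest

_END = object()  # sentinel marking an exhausted input


def symmetric_hash_join(disk_tuples, tape_tuples, key=lambda t: t[0]):
    """Yield (disk_tuple, tape_tuple) pairs as soon as a match is found.

    Assumption: both inputs fit in memory. A real disk-tape join would
    have to flush partitions to disk when memory fills up.
    """
    disk_table = defaultdict(list)  # join key -> disk tuples seen so far
    tape_table = defaultdict(list)  # join key -> tape tuples seen so far

    # Read one tuple from each input per round, to mimic simultaneous reads.
    for d, t in zip_longest(disk_tuples, tape_tuples, fillvalue=_END):
        if d is not _END:
            disk_table[key(d)].append(d)
            # Probe the tape tuples that arrived earlier.
            for match in tape_table.get(key(d), ()):
                yield (d, match)
        if t is not _END:
            tape_table[key(t)].append(t)
            # Probe the disk tuples that arrived earlier or in this round.
            for match in disk_table.get(key(t), ()):
                yield (match, t)


disk = [(1, "a"), (2, "b"), (3, "c")]
tape = [(2, "x"), (1, "y"), (1, "z"), (4, "w")]

results = symmetric_hash_join(disk, tape)
first = next(results)      # available after two read rounds, not at the end
remaining = list(results)
```

Because the function is a generator, the caller receives the first matching pair before either input has been fully read, which is the non-blocking behaviour the abstract describes. Each matching pair is emitted exactly once, since a tuple only probes the tuples of the other relation that were inserted before it.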

Published in:

The Fifth International Conference on Computer and Information Technology (CIT'05)

Date of Conference:

21-23 Sept. 2005