CoMRI: a compressed multiresolution index structure for sequence similarity queries

3 Author(s)
Sun, H. ; Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA ; Ozturk, O. ; Ferhatosmanoglu, H.

In this paper, we present CoMRI (compressed multiresolution index), our system for fast sequence similarity search in DNA sequence databases. We employ the virtual bounding rectangle (VBR) concept to build a compressed, grid-style index structure. An advantage of the grid format over trees is that subsequence location information is given by the position of the corresponding VBR in the VBR list. Thanks to VBRs, our index structure fits easily into a reasonable amount of memory. Together with a new, optimized multiresolution search algorithm, this improves query speed significantly. Extensive performance evaluations on human chromosome sequence data show that VBRs reduce index storage size by 80%-93% compared to MBRs (minimum bounding rectangles), and that the new search algorithm prunes almost all unnecessary VBRs, which guarantees efficient disk I/O and CPU cost. According to the results of our experiments, CoMRI is at least 100 times faster than MRS, another grid index structure introduced very recently.
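
The positional property mentioned in the abstract (a box's place in the list tells you which subsequences it covers) can be illustrated with a minimal sketch. This is not the CoMRI implementation and does not reproduce the VBR compression, which the abstract does not specify. It assumes, purely for illustration, that each subsequence window is summarized by its A/C/G/T counts and that consecutive windows are grouped under plain bounding boxes. The names freq_vector, build_box_list, window_starts, and the parameters w and capacity are all hypothetical.

```python
from typing import List, Tuple

ALPHABET = "ACGT"
Vector = Tuple[int, ...]
Box = Tuple[Vector, Vector]  # (low corner, high corner)


def freq_vector(window: str) -> Vector:
    """Count occurrences of each nucleotide in a window (assumed summary)."""
    return tuple(window.count(c) for c in ALPHABET)


def build_box_list(seq: str, w: int, capacity: int) -> List[Box]:
    """Group consecutive length-w windows into bounding boxes.

    Box k bounds the frequency vectors of the windows that start at
    positions k*capacity .. k*capacity + capacity - 1, so no explicit
    position needs to be stored with the box.
    """
    vectors = [freq_vector(seq[i:i + w]) for i in range(len(seq) - w + 1)]
    boxes: List[Box] = []
    for start in range(0, len(vectors), capacity):
        group = vectors[start:start + capacity]
        low = tuple(min(v[d] for v in group) for d in range(len(ALPHABET)))
        high = tuple(max(v[d] for v in group) for d in range(len(ALPHABET)))
        boxes.append((low, high))
    return boxes


def window_starts(box_index: int, capacity: int, n_windows: int) -> range:
    """Recover the covered window start positions from the list index alone."""
    first = box_index * capacity
    return range(first, min(first + capacity, n_windows))


if __name__ == "__main__":
    seq = "AACCGGTTAA"
    boxes = build_box_list(seq, w=4, capacity=3)
    for k, box in enumerate(boxes):
        print(k, box, list(window_starts(k, 3, len(seq) - 4 + 1)))
```

In a tree index, each leaf would typically carry explicit pointers or offsets; here the offset is implied by the list index, which is the kind of saving the abstract attributes to the grid format.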

Published in:

Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE

Date of Conference:

11-14 Aug. 2003