We introduce the sorted common prefix (SCP) and study some of its properties. Based on the internal representations used by a class of new compression schemes, we show how the SCP table can be constructed using O(u + |Σ| Kmax) comparisons on average and O(u |Σ|) comparisons in the worst case, where u is the length of the sequence, |Σ| is the number of symbols in the alphabet, and Kmax is the maximum SCP value. We describe one application of the SCP to the problem of identifying anchor points in multiple sequence alignment.
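The abstract does not define the SCP, so the following is only an illustrative sketch under an assumption: that an SCP-style table records, for each suffix in lexicographic order, the length of the prefix it shares with the suffix sorted immediately before it. The function names `common_prefix_len` and `naive_scp_table` are hypothetical, and this naive quadratic construction is not the method of the paper, whose comparison bounds are stated above.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def naive_scp_table(seq: str) -> list[int]:
    """Naive table of common-prefix lengths between adjacent sorted suffixes.

    Assumption (not from the source): entry r holds the length of the
    prefix shared by the suffixes of rank r-1 and rank r in sorted order;
    entry 0 is 0 because the first suffix has no predecessor.
    """
    # Sort suffix start positions by the suffix text (simple, not efficient).
    order = sorted(range(len(seq)), key=lambda i: seq[i:])
    table = [0] * len(seq)
    for rank in range(1, len(order)):
        prev_suffix = seq[order[rank - 1]:]
        this_suffix = seq[order[rank]:]
        table[rank] = common_prefix_len(prev_suffix, this_suffix)
    return table
```

Under this assumed definition, the quantity corresponding to Kmax would be `max(naive_scp_table(seq))` for a non-empty sequence, and long entries would mark repeated substrings of the kind that could serve as candidate anchors.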
Published in:
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Date of Conference: 11-14 Aug. 2003