Biological sequential patterns often carry significant function within a set of sequences. Mining such patterns offers key insight into transcription regulation mechanisms and has become a primitive task underlying much research and many applications. Recently, various methods have been developed to identify biological patterns. However, traditional sequential pattern mining approaches produce a huge result set, which makes it difficult for biologists to decide which patterns are interesting and meaningful. In this paper, we study a variant of biological sequential pattern mining that addresses this huge result set, termed top-k representative pattern mining based on a regularity measure. As a first attempt to tackle the problem, we define a new measure, `regularity', to evaluate the interestingness of each pattern, and we propose an efficient algorithm with a pruning strategy that returns the top-k representative patterns ranked by regularity. Experimental results on real datasets demonstrate that the proposed method is more efficient than state-of-the-art methods.