We develop an efficient local similarity search engine using data mining techniques. All frequent patterns in the database are retrieved and recorded in a one-time preprocessing step. A query sequence is then checked against the patterns found during preprocessing. Whenever a region of the query and a region of a database sequence both match the same pattern, the two regions form a candidate seed for local similarity. Finally, we extend and score each such seed pair to determine whether it yields a local similarity whose score is high enough to report. The proposed system builds on DELPHI, a local similarity search engine proposed by IBM. For computational efficiency, we propose a novel clustering approach and integrate it into the system. Extensive experiments demonstrate the performance of our system.
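The pipeline described above (mine patterns once, match the query against them, pair matching regions into seeds, then extend and score) can be illustrated with a minimal sketch. It is not the authors' implementation, and several simplifications are assumptions made here: fixed-length substrings stand in for the mined patterns, a pattern counts as frequent when it occurs in at least `min_support` database sequences, extension is ungapped with a simple drop-off rule, and all function names, scores, and thresholds are hypothetical. The clustering step is omitted because the abstract does not describe it.

```python
from collections import defaultdict


def mine_frequent_patterns(database, k=3, min_support=2):
    """One-time preprocessing: map each frequent k-mer to its (seq_id, offset) hits.

    Assumption: a pattern is 'frequent' if it occurs in >= min_support sequences.
    """
    occurrences = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for i in range(len(seq) - k + 1):
            occurrences[seq[i:i + k]].append((seq_id, i))
    return {
        pattern: hits
        for pattern, hits in occurrences.items()
        if len({seq_id for seq_id, _ in hits}) >= min_support
    }


def find_seeds(query, patterns, k=3):
    """Pair each query region with every database region matching the same pattern."""
    seeds = []
    for q_pos in range(len(query) - k + 1):
        for seq_id, d_pos in patterns.get(query[q_pos:q_pos + k], []):
            seeds.append((q_pos, seq_id, d_pos))
    return seeds


def _extend(query, target, q, d, step, match, mismatch, x_drop):
    """Ungapped extension in one direction; returns (best gain, length at best gain)."""
    gain = best_gain = 0
    length = best_len = 0
    while 0 <= q < len(query) and 0 <= d < len(target):
        gain += match if query[q] == target[d] else mismatch
        length += 1
        if gain > best_gain:
            best_gain, best_len = gain, length
        elif best_gain - gain >= x_drop:
            break  # score fell too far below the best seen so far
        q += step
        d += step
    return best_gain, best_len


def search(query, database, patterns, k=3, threshold=5,
           match=1, mismatch=-1, x_drop=2):
    """Extend and score every seed; report regions scoring at least `threshold`.

    Each hit is (seq_id, query_start, db_start, length, score).
    """
    hits = set()  # different seeds can extend to the same region
    for q_pos, seq_id, d_pos in find_seeds(query, patterns, k):
        target = database[seq_id]
        left_gain, left_len = _extend(query, target, q_pos - 1, d_pos - 1, -1,
                                      match, mismatch, x_drop)
        right_gain, right_len = _extend(query, target, q_pos + k, d_pos + k, 1,
                                        match, mismatch, x_drop)
        score = k * match + left_gain + right_gain
        if score >= threshold:
            hits.add((seq_id, q_pos - left_len, d_pos - left_len,
                      k + left_len + right_len, score))
    return sorted(hits)


# Toy data, invented for illustration only.
database = ["ACGTACGT", "TTACGTAA", "GGGGCCCC"]
patterns = mine_frequent_patterns(database)   # done once, reused for every query
hits = search("CACGTAG", database, patterns)
for seq_id, q_start, d_start, length, score in hits:
    print(seq_id, database[seq_id][d_start:d_start + length], score)
```

Because the pattern index is built once and reused, the per-query work is limited to lookups and seed extension, which is the efficiency argument the abstract makes for the preprocessing step.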