We describe DELPHI, a new computational tool for identifying sequence similarity between a query sequence and a database of proteins. The tool relies on a set of patterns derived from the underlying database in a one-time computation. These patterns are subsequently matched against every query sequence presented to the system. A pattern matched by a region of the query pinpoints a potential local similarity between that region and all of the database sequences that also match the pattern. In a final step, all such local similarities are examined more closely by aligning and scoring the corresponding query and database regions. With a prudently chosen set of patterns, the method can discover weak but biologically important similarities. We provide a number of examples, using both classified and unclassified proteins, that corroborate this claim.
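The three stages described above (one-time pattern indexing, pattern matching against the query, and closer scoring of the paired regions) can be illustrated with a minimal sketch. It is not DELPHI's implementation: the use of regular expressions as patterns, the toy ungapped match/mismatch score, and the names `build_index`, `score_regions`, and `search` are all assumptions made for illustration, and the sequences are invented.

```python
import re
from collections import defaultdict


def build_index(patterns, database):
    """One-time step: record where each pattern matches each database sequence.

    Patterns are modeled as regular expressions (an assumption of this sketch).
    """
    index = defaultdict(list)
    for pat in patterns:
        rx = re.compile(pat)
        for seq_id, seq in database.items():
            for m in rx.finditer(seq):
                index[pat].append((seq_id, m.start(), m.end()))
    return index


def score_regions(a, b, match=1, mismatch=-1):
    """Toy ungapped score over the aligned positions of two regions.

    A stand-in for a real alignment and scoring scheme (hypothetical).
    """
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def search(query, patterns, index, database, min_score=0):
    """Match patterns against the query, then score each implied local similarity."""
    hits = []
    for pat in patterns:
        for qm in re.finditer(pat, query):
            q_region = qm.group()
            # Every database region matching the same pattern is a candidate.
            for seq_id, start, end in index.get(pat, []):
                score = score_regions(q_region, database[seq_id][start:end])
                if score >= min_score:
                    hits.append((score, seq_id, qm.start(), start, pat))
    # Best score first; ties broken by sequence id and positions.
    hits.sort(key=lambda h: (-h[0], h[1], h[2], h[3]))
    return hits


# Invented example data, for illustration only.
database = {"P1": "MKTAYCAGHCLLV", "P2": "GGGCSSHCAAA", "P3": "MMMMMM"}
patterns = ["C..HC"]
index = build_index(patterns, database)
hits = search("AACAGHCTT", patterns, index, database)
```

In this sketch the index is built once and reused for every query, so the per-query cost depends on the number of pattern matches rather than on a scan of the full database. Each hit is a tuple of score, database sequence id, query offset, database offset, and the pattern that produced it.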
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.