Prosima: Protein similarity algorithm

4 Author(s)
Novosad, T. ; Dept. of Comput. Sci., VSB-Tech. Univ. of Ostrava, Ostrava, Czech Republic ; Snasel, V. ; Abraham, A. ; Yang, J.Y.

In this article, we present a novel algorithm for measuring protein similarity based on three-dimensional structure (protein tertiary structure). The PROSIMA algorithm uses suffix trees to discover common parts of the main chains of all proteins in the current RCSB Protein Data Bank (PDB). From these common parts we build a vector model and then apply classical information retrieval techniques to that model to measure the similarity between proteins, that is, all-to-all protein similarity. To calculate protein similarity we use the tf-idf term weighting scheme and the cosine similarity measure. The goal of this work is to use the whole current PDB database of known proteins (downloaded in June 2009), not just selected subsets of it, as studied in other works. We chose the SCOP database to verify the precision of our algorithm because it is maintained primarily by humans. A further result of this work is the ability to determine the SCOP categories of proteins not included in the latest version of the SCOP database (v. 1.75) with nearly 100% precision.
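
The vector-model step described in the abstract can be illustrated with a minimal sketch. It assumes each protein has already been reduced to a list of shared main-chain fragments (the suffix-tree step is not shown), and it uses the common tf × log(N/df) weighting, which may differ from the exact variant the authors used. The protein names, fragment identifiers, and function names below are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each protein name to a sparse {fragment: tf-idf weight} vector.

    docs: dict of protein name -> list of fragment identifiers.
    Assumed weighting: (count / length) * log(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))  # count each fragment once per protein
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            t: (c / len(terms)) * math.log(n / df[t]) for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: fragment ids stand in for common main-chain parts.
proteins = {
    "P1": ["f1", "f2", "f3"],
    "P2": ["f1", "f2", "f4"],
    "P3": ["f5", "f6"],
}
vectors = tfidf_vectors(proteins)

# All-to-all similarity over the toy set.
similarity = {
    (a, b): cosine(vectors[a], vectors[b]) for a in vectors for b in vectors
}
```

In this toy set, P1 and P2 share two fragments and so score above zero, while P3 shares none with either and scores zero against both. On a full PDB-scale collection, a sparse-matrix representation would likely be needed in place of plain dictionaries.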

Published in:

Nature & Biologically Inspired Computing, 2009. NaBIC 2009. World Congress on

Date of Conference:

9-11 Dec. 2009