A protein local substructure (descriptor) is a set of several short, nonoverlapping fragments of the polypeptide chain. Each substructure describes the local environment of a particular residue and includes only those segments of the main chain that lie in the proximity of that residue. Similar descriptors from a representative set of proteins were analyzed to reveal links between the substructures and the sequences of their segments. Using the detected sequence-based fingerprints, specific geometrical conformations are assigned to new sequences. The ability of the approach to recognize correct SCOP folds was tested on 273 sequences from the 49 most popular folds. Good predictions were obtained in 85% of cases. No drop in performance was observed as the sequence similarity between the target sequences and the sequences in the training set of proteins decreased.
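To make the notion of a descriptor concrete, the following is a minimal sketch of how one might be assembled from C-alpha coordinates. It is an illustration under stated assumptions, not the procedure used in this work: the function name `build_descriptor`, the distance cutoff, and the minimum segment length are all hypothetical choices, and proximity is approximated here by C-alpha to C-alpha distance alone.

```python
from math import dist


def build_descriptor(ca_coords, center, cutoff=10.0, min_len=3):
    """Return main-chain segments located near the residue `center`.

    ca_coords: list of (x, y, z) C-alpha coordinates, one per residue.
    center:    index of the residue whose local environment is described.
    cutoff:    assumed proximity threshold (same units as the coordinates).
    min_len:   assumed minimum number of residues for a segment to be kept.

    Each segment is an inclusive (start, end) pair of residue indices.
    Segments are maximal runs of consecutive nearby residues, so they
    cannot overlap.
    """
    center_xyz = ca_coords[center]

    # Residues whose C-alpha lies within the cutoff of the central residue.
    near = [i for i, xyz in enumerate(ca_coords)
            if dist(xyz, center_xyz) <= cutoff]

    segments = []
    start = prev = None
    for i in near:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i  # extend the current run
        else:
            # The run ended; keep it only if it is long enough.
            if prev - start + 1 >= min_len:
                segments.append((start, prev))
            start = prev = i
    if start is not None and prev - start + 1 >= min_len:
        segments.append((start, prev))
    return segments


# Toy hairpin: two antiparallel strands, 5 units apart, 3.8 units per residue.
strand_a = [(i * 3.8, 0.0, 0.0) for i in range(6)]
strand_b = [((11 - i) * 3.8, 5.0, 0.0) for i in range(6, 12)]
hairpin = strand_a + strand_b

print(build_descriptor(hairpin, center=1, cutoff=6.5))
```

In this toy hairpin, the descriptor of residue 1 contains one segment from its own strand and one from the opposite strand, which is the sense in which a descriptor captures segments that are close in space but not necessarily close in sequence.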