With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop an automated method by which crystallographers can rapidly and effectively identify the quaternary attribute of a given protein chain from its sequence information. Because most previous studies are limited to homo-oligomers, in this paper we aim to identify the quaternary attribute of hetero-oligomeric proteins. A hetero-oligomer is assigned to one of the following six categories: (1) heterodimer, (2) heterotrimer, (3) heterotetramer, (5) heteropentamer, (5) heterohexamer, (6) heterooctamer. Using a machine learning approach, the fuzzy K-nearest neighbor algorithm (FKNN), we developed a prediction system for protein quaternary structural type that incorporates functional domain composition (FunD) and pseudo-amino acid composition (PseAA). The overall accuracy achieved by this system is more than 80% in the jackknife test. Such a technique should improve the success rate of structural biology projects.
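To make the classification and evaluation procedure concrete, the following is a minimal sketch of a fuzzy K-nearest neighbor classifier scored with a jackknife (leave-one-out) test. It is not the implementation used in this work. It assumes the common distance-weighted membership formulation with crisp training labels and a fuzzifier `m`; the function names, the parameters `k` and `m`, and the two-dimensional toy vectors are all hypothetical. Real inputs would be the FunD and PseAA feature vectors, whose construction is not shown here.

```python
import math


def fknn_memberships(train_X, train_y, query, k=3, m=2.0, eps=1e-12):
    """Return {label: fuzzy membership} of `query` over its k nearest neighbors.

    Assumption: training labels are crisp, and each neighbor is weighted by
    1 / d^(2 / (m - 1)), where d is its Euclidean distance to the query.
    """
    neighbors = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # eps guards against division by zero when the query equals a training point
    weights = [(1.0 / (d ** exponent + eps), label) for d, label in neighbors]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, label in weights:
        memberships[label] = memberships.get(label, 0.0) + w / total
    return memberships


def fknn_predict(train_X, train_y, query, k=3, m=2.0):
    """Predict the label with the highest fuzzy membership."""
    memberships = fknn_memberships(train_X, train_y, query, k, m)
    return max(memberships, key=memberships.get)


def jackknife_accuracy(X, y, k=3, m=2.0):
    """Leave each sample out in turn, predict it from the rest, and score."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if fknn_predict(rest_X, rest_y, X[i], k, m) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: two well-separated clusters standing in for two classes.
toy_X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
toy_y = ["heterodimer"] * 3 + ["heterotrimer"] * 3

if __name__ == "__main__":
    print(fknn_memberships(toy_X, toy_y, [0.05, 0.05]))
    print(jackknife_accuracy(toy_X, toy_y))
```

In the jackknife test each sample is predicted by a model that has never seen it, so the reported accuracy is not inflated by the sample being its own nearest neighbor.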