The PDZ domain is one of the largest families of protein domains involved in targeting and routing specific proteins in signaling pathways. PDZ domains mediate protein-protein interactions by binding the C-terminal peptides of their target proteins. Using dipeptide feature encoding, we develop a support vector machine predictor of PDZ domain interactions that achieves an accuracy of 82.49%. Because most dipeptide compositions are redundant or irrelevant, we propose a new hybrid feature selection technique that selects only the subset of compositions useful for interaction prediction. Our experimental results show that only approximately 25% of the dipeptide features are needed and that our method increases the accuracy by 3%. We analyze the selected dipeptide features and show that they play important roles in the specificity patterns of PDZ domains.
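As an illustration of the dipeptide encoding mentioned above, the following minimal sketch computes a standard 400-dimensional dipeptide composition vector, that is, the relative frequency of each ordered pair of adjacent residues. The abstract does not specify the exact encoding, so the function name `dipeptide_composition`, the normalization by the number of valid pairs, and the skipping of non-standard residues are assumptions made for illustration only.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered dipeptides and their positions in the feature vector.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for `sequence`.

    Hypothetical helper: pairs containing a non-standard residue are
    skipped, and counts are normalized by the number of valid pairs.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    # Slide a window of width 2 over the sequence.
    for first, second in zip(seq, seq[1:]):
        idx = INDEX.get(first + second)
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [count / total for count in counts]


# Example: a short, made-up C-terminal peptide.
features = dipeptide_composition("ETSV")
```

Vectors of this kind could then be passed to a classifier such as a support vector machine. How the domain and peptide sequences are combined into a single input is not stated in the abstract and is left open here.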