Although computer-aided drug design has been studied extensively in recent years, determining proteins for drug candidates remains a notable research area. The first major difficulty in this kind of problem is selecting the features that best represent the protein structure; the second is computational complexity. We use three datasets of different sizes: the Cherkasov dataset, with 2684 examples and over 160 descriptors; the SDF-formatted DrugDataBank dataset, with 7440 examples and over 300 descriptors; and Pharmeks Company's real drug database, with over 250,000 samples. A statistical multiple Relief algorithm is developed to measure the quality of the attributes and to reduce the dimension of the dataset. We then apply a new approach that works on subspaces of the dataset, called the incremental-decremental kernel learning model. Our results show that the new approach achieves better accuracy and lower computational complexity than traditional supervised algorithms.
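The abstract does not describe the statistical multiple Relief variant in detail, so the following is only a minimal sketch of the basic two-class Relief weighting scheme on which such methods are typically built, not the authors' algorithm. The function names `relief_weights` and `select_top_features`, the range-normalized Manhattan distance, and the sampling scheme are all assumptions made for illustration.

```python
import random


def relief_weights(X, y, n_iterations=None, seed=0):
    """Basic Relief feature weights for a numeric, two-class dataset.

    Illustrative sketch only: the paper's "statistical multiple Relief"
    variant is not specified in the abstract.
    X is a list of equal-length numeric rows, y a list of class labels.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Scale per-feature differences to [0, 1] using each feature's range.
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(d))

    m = n_iterations or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)  # sample an instance (with replacement)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda k: distance(X[i], X[k]))
        near_miss = min(misses, key=lambda k: distance(X[i], X[k]))
        # Reward features that differ across classes and penalize
        # features that differ within the same class.
        for j in range(d):
            weights[j] += (
                diff(X[i], X[near_miss], j) - diff(X[i], X[near_hit], j)
            ) / m
    return weights


def select_top_features(weights, k):
    """Return the indices of the k highest-weighted features."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:k]
```

Under these assumptions, dimensionality reduction amounts to computing the weights once and keeping only the columns returned by `select_top_features`; a descriptor that separates the two classes receives a higher weight than one that varies mostly within a class.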