Can sequence analysis tell us about the function of a protein? A basic question in protein science is which kinds of proteins exhibit thermostability. Chaos game representation (CGR) can be used to investigate the patterns hidden in protein sequences, visually revealing previously unknown structure. In this paper, we convert every protein sequence into a 20-dimensional vector using the CGR algorithm, and based on these vectors we discriminate thermophiles from mesophiles using a support vector machine (SVM). The overall accuracy reaches 100% in the resubstitution test and 87.12% in the jackknife test. Moreover, the Matthews correlation coefficient (MCC) is 0.745.
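For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy and MCC can be computed from binary predictions. The function names, the label convention (1 for thermophile, 0 for mesophile), and the toy labels are assumptions made for illustration; they are not the paper's data or implementation.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary task (assumed: 1 = thermophile)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels for illustration only (not results from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it remains informative when the two classes differ in size, which is why it is often reported alongside accuracy.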