With the advancement of microarray technology, gene expression profiling has shown great potential for outcome prediction in different types of cancer. Microarray cancer data, organized as a samples-by-genes matrix, are used to classify tissue samples as benign or malignant, or into cancer subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which aids the diagnosis of a particular cancer type. Nevertheless, small sample size remains a bottleneck in designing suitable classifiers. Traditional supervised classifiers can work only with labeled data, so the large amount of microarray data that lacks adequate follow-up information is disregarded. A novel approach that combines feature (gene) selection with the transductive support vector machine (TSVM) is proposed. We demonstrate that 1) potential gene markers can be identified and 2) TSVMs improve prediction accuracy compared to standard inductive SVMs (ISVMs). A forward greedy search algorithm based on consistency and a statistic called the signal-to-noise ratio were employed to obtain the potential gene markers. The selected genes were then used to design the TSVM. Experimental results confirm the effectiveness of the proposed technique compared to the ISVM and the low-density separation method in semisupervised cancer classification as well as in gene-marker identification.
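As a rough illustration of ranking genes by signal-to-noise ratio, the sketch below scores each gene on a two-class labeled set. The abstract does not state the formula used; the definition assumed here is the commonly used one, (mean of class 0 minus mean of class 1) divided by the sum of the two class standard deviations. The function names `snr_scores` and `top_genes` and the toy data are hypothetical, and the sketch covers neither the consistency-based forward greedy search nor the TSVM itself.

```python
from statistics import mean, stdev


def snr_scores(samples, labels):
    """Return one signal-to-noise score per gene.

    samples: list of expression vectors, one per tissue sample.
    labels:  list of 0/1 class labels, one per sample.
    Assumed definition: (mean_0 - mean_1) / (std_0 + std_1).
    Each class needs at least two samples for stdev to be defined.
    """
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        class0 = [s[g] for s, y in zip(samples, labels) if y == 0]
        class1 = [s[g] for s, y in zip(samples, labels) if y == 1]
        denom = stdev(class0) + stdev(class1)
        # Guard against genes that are constant within both classes.
        scores.append((mean(class0) - mean(class1)) / denom if denom > 0 else 0.0)
    return scores


def top_genes(samples, labels, k):
    """Indices of the k genes with the largest absolute score."""
    scores = snr_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:k]


# Toy data (made up for illustration): 4 samples x 3 genes.
samples = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 8.0],
    [0.9, 4.9, 2.1],
    [1.1, 5.0, 7.9],
]
labels = [0, 1, 0, 1]
selected = top_genes(samples, labels, k=2)
```

In this toy set the third gene separates the two classes most clearly, so it ranks first. In a semisupervised setting, only the labeled samples would enter this ranking, and the unlabeled samples would be used later when training the transductive classifier.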