Cancer classification using gene expression data is of great importance in bioinformatics and is believed to hold the key to fundamental problems in cancer diagnosis and drug discovery. Error-correcting output coding (ECOC) is a method for designing multiple classifier systems (MCS) that reduces a multi-class problem to a set of binary sub-problems. A key issue in designing any ECOC ensemble is defining an optimal code matrix with maximum discrimination power and a minimum number of columns. This paper introduces a heuristic method for application-dependent design of an optimal ECOC matrix, based on the thinning algorithm used in ensemble design. The key idea of the proposed method, called Thinned ECOC, is to successively remove redundant and unnecessary columns from an initial code matrix based on a metric defined for each column. Experimental results on two real datasets show the robustness of Thinned ECOC in comparison with other existing code generation methods.
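The column-removal idea can be illustrated with a minimal sketch. The abstract does not state which per-column metric is used, so the code below substitutes a stand-in: it greedily drops the column whose removal leaves the largest minimum Hamming distance between class codewords. This metric, the function names (`thin`, `decode`, and the helpers), and the toy matrix are all assumptions made for illustration, not the method of the paper.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def min_row_distance(matrix):
    """Smallest Hamming distance between any two class codewords (rows)."""
    return min(hamming(r1, r2) for r1, r2 in combinations(matrix, 2))

def drop_column(matrix, j):
    """Return a copy of the code matrix without column j."""
    return [row[:j] + row[j + 1:] for row in matrix]

def thin(matrix, target_columns):
    """Greedily remove columns until target_columns remain.

    ASSUMPTION: the score of a column is the minimum row distance that
    remains after removing it. The abstract does not specify the metric.
    """
    current = [list(row) for row in matrix]
    while len(current[0]) > target_columns:
        scores = [min_row_distance(drop_column(current, j))
                  for j in range(len(current[0]))]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] == 0:
            break  # any removal would make two classes indistinguishable
        current = drop_column(current, best)
    return current

def decode(matrix, outputs):
    """Assign the class whose codeword is nearest to the binary outputs."""
    return min(range(len(matrix)), key=lambda i: hamming(matrix[i], outputs))

# Hypothetical initial matrix: one-vs-all codes for 4 classes, plus a
# duplicate of the first column and a constant (uninformative) column.
initial = [
    [+1, -1, -1, -1, +1, +1],
    [-1, +1, -1, -1, -1, +1],
    [-1, -1, +1, -1, -1, +1],
    [-1, -1, -1, +1, -1, +1],
]
thinned = thin(initial, target_columns=4)
print(len(thinned[0]), min_row_distance(initial), min_row_distance(thinned))
```

In this toy case the two redundant columns can be removed without reducing the minimum distance between codewords, which is the kind of saving a thinning procedure aims for. An application-dependent metric, such as one based on the accuracy of each column's binary classifier on validation data, would replace the scoring line inside `thin`.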