This paper presents a system that uses random forests to determine the subcellular localisation of proteins. A random forest is an ensemble learner that grows multiple classification trees; each tree produces a classification decision, and the decisions are combined into a single integrated output. The system classifies protein-localisation patterns in fluorescence microscope images. The data comprise 2D images of HeLa cells covering all major classes of subcellular structures, together with the associated feature set. The performance of the developed system is compared against that of support vector machine and decision tree approaches. Three experiments study the influence of training and test set size on the performance of the examined methods, and the resulting classification errors and execution times are presented and discussed. The developed system produced the lowest classification error (2.9%).
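The ensemble idea described above, in which each tree casts a classification decision and the decisions are integrated, can be illustrated with a minimal sketch. It is not the system evaluated in the paper: it assumes one-split trees (stumps), bootstrap resampling, one randomly chosen feature per tree, and majority voting as the integration rule, none of which the abstract specifies. All function names, the toy feature values, and the class labels are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-split tree on one randomly chosen feature (an assumption)."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        if not left or not right:
            continue  # a split must put samples on both sides
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # No usable split: always predict the majority class of the sample.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    """Return the decision of a single tree for one feature vector."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Grow n_trees stumps, each on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], rng)
        )
    return forest


def predict_forest(forest, row):
    """Integrate the per-tree decisions by majority vote."""
    votes = Counter(predict_stump(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-feature vectors standing in for image features.
    rows = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
            (0.90, 0.80), (0.80, 0.90), (0.85, 0.70)]
    labels = ["nucleus"] * 3 + ["golgi"] * 3
    forest = train_forest(rows, labels)
    print(predict_forest(forest, (0.05, 0.05)))
```

A practical forest would grow full trees and consider a random subset of features at every split; stumps are used here only to keep the voting step easy to follow.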