Statistical mixture-of-experts models are often used for data analysis tasks such as clustering, regression and classification. We consider two mixture-of-experts models: the shared mixture classifier and the hierarchical mixture-of-experts classifier. We discuss the initialisation and optimisation of the structure and parameters of each classifier. In particular, we initialise the hierarchical mixture-of-experts classifier with the public-domain OC1 decision tree software. We compare the performance of the two classifiers on four datasets, two artificial and two real, and find that the hierarchical mixture-of-experts classifier achieves superior classification performance on the test data.