Researchers have shown that fusing categorical labels from multiple experts, whether humans or machine classifiers, improves the accuracy and generalizability of the overall classification system. Simple plurality voting is a popular fusion technique, but it gives equal weight to labels from all experts, who may not be equally reliable or consistent across the dataset. Estimating expert reliability without access to the reference labels is, however, a challenging problem. Most previous works address this challenge by modeling expert reliability as constant over the entire data (feature) space. This paper presents a model motivated by the observation that, in real-world data, expert reliability varies over the complete feature space but is approximately constant over local clusters of homogeneous instances. The model jointly learns a classifier and expert reliability parameters, without assuming knowledge of the reference labels, using the Expectation-Maximization (EM) algorithm. Classification experiments on simulated data, data from the UCI Machine Learning Repository, and two emotional speech classification datasets show the benefits of the proposed model. Using a metric based on the Jensen-Shannon divergence, we empirically show that the proposed model gives greater benefit for datasets where expert reliability is highly variable over the feature space.
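To make the contrast between plurality voting and reliability-weighted fusion concrete, the following is a minimal sketch of the general idea, not the paper's model: a simplified Dawid-Skene-style EM on a hypothetical binary-label toy problem, where each expert is summarized by a single accuracy (so, unlike the proposed model, reliability is held constant over the feature space and no classifier is learned jointly). All names and the simulation setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 300 binary items labeled by 5 experts whose
# true accuracies differ; three are reliable, two are near chance.
n_items, n_experts = 300, 5
true_acc = np.array([0.95, 0.90, 0.85, 0.60, 0.55])
truth = rng.integers(0, 2, n_items)
correct = rng.random((n_items, n_experts)) < true_acc
labels = np.where(correct, truth[:, None], 1 - truth[:, None])

# Simple plurality: every expert gets equal weight.
plurality = (labels.mean(axis=1) > 0.5).astype(int)

# EM without reference labels: initialize label posteriors from the
# plurality vote, then alternate an M-step (re-estimate each expert's
# accuracy against the current posteriors) and an E-step (re-weight
# each item's label posterior by the estimated accuracies).
post = labels.mean(axis=1)  # P(true label = 1) for each item
for _ in range(50):
    # M-step: expected fraction of items each expert labeled correctly
    acc = (post[:, None] * labels
           + (1 - post[:, None]) * (1 - labels)).mean(axis=0)
    acc = np.clip(acc, 1e-3, 1 - 1e-3)
    # E-step: log-odds of label 1 given all votes, uniform class prior
    log_odds = (labels * np.log(acc / (1 - acc))
                + (1 - labels) * np.log((1 - acc) / acc)).sum(axis=1)
    post = 1 / (1 + np.exp(-log_odds))

em_labels = (post > 0.5).astype(int)
print("plurality accuracy:", (plurality == truth).mean())
print("EM accuracy:       ", (em_labels == truth).mean())
print("estimated expert accuracies:", acc.round(2))
```

The EM estimate down-weights the near-chance experts, which plurality voting cannot do; the paper's model goes further by letting these reliability parameters vary across local clusters of the feature space rather than fixing one value per expert.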