Skip to Main Content
Building a model using machine learning that can classify the sentiment of natural language text often requires an extensive set of labeled training data from the same domain as the target text. Gathering and labeling new datasets whenever a model is needed for a new domain is time-consuming and difficult, especially if a dataset with numeric ratings is not available. In this paper we consider the problem of building models that have a high sentiment classification accuracy without the aid of a labeled dataset from the target domain. We show that ensembles of existing domain models can be used to achieve a classification accuracy that approaches that of models trained on data from the target domain.