Skip to Main Content
In chemometrics, data from infrared or near-infrared (NIR) spectroscopy are often used to identify a compound or to analyze the composition of a material. This involves the calibration of models that predict the concentration of material constituents from the measured NIR spectrum. An interesting aspect of multivariate calibration is to achieve a particular accuracy level with a minimum number of training samples, as this reduces the number of laboratory tests and thus the cost of model building. In these chemometric models, the input refers to a proper representation of the spectra and the output to the concentrations of the sample constituents. The search for a most informative new calibration sample thus has to be performed in the output space of the model, rather than in the input space as in conventional modeling problems. In this paper, we propose to solve the corresponding inversion problem by utilizing the disagreements of an ensemble of neural networks to represent the prediction error in the unexplored component space. The next calibration sample is then chosen at a composition where the individual models of the ensemble disagree most. The results obtained for a realistic chemometric calibration example show that the proposed active learning can achieve a given calibration accuracy with less training samples than random sampling.