Objective measures to automatically predict the perceptual quality of images or videos can reduce the time and cost requirements of end-to-end quality monitoring. For reliable quality predictions, these objective quality measures need to respond consistently with the behavior of the human visual system (HVS). In practice, many important HVS mechanisms are too complex to be modeled directly. Instead, they can be mimicked by machine learning systems, trained on subjective quality assessment databases, and applied on predefined objective quality measures for specific content or distortion classes. On the downside, machine learning systems are often difficult to interpret and may even contradict the input objective quality measures, leading to unreliable quality predictions. To address this problem, we developed an interpretable machine learning system for objective quality assessment, namely the locally adaptive fusion (LAF). This paper describes the LAF system and compares its performance with traditional machine learning. As it turns out, the LAF system is more consistent with the input measures and can better handle heteroscedastic training data.