Print quality (PQ) is a composite attribute defined by human perception. As such, the ultimate way to determine and quantify PQ is by human survey. However, surveys are time-consuming and become a burden on processes that require repeated evaluations. A desirable alternative is an automatic quality rating tool. Once such a quality evaluation measure is proposed, it must be qualified; that is, it must be shown to reflect human assessment. If two human opinions conflict, the tool cannot possibly agree with both. Conflicts between human opinions are common, which complicates the evaluation of a tool's success in reflecting human judgment. There are many possible ways to measure the agreement between human assessment and tool evaluation, but different methods may yield conflicting results. It is therefore important to establish in advance an appropriate method for evaluating quality evaluation tools, one that takes the disagreement among survey participants into account. In this paper, we model human quality preference and derive the most appropriate method for qualifying quality evaluation tools. We demonstrate the resulting qualification method in a real-life scenario: the qualification of the mechanical band meter.