Segmentation-based scores play an important role in the evaluation of computational tools in medical image analysis. These scores assess the quality of tasks such as image registration and segmentation by measuring the similarity between two binary label maps. Commonly, these measurements blend two aspects of similarity: pose misalignments and shape discrepancies. Because they cannot distinguish between these two aspects, the scores often yield similar values for widely differing segmentation pairs. Consequently, comparisons and analyses based on these scores become questionable. In this paper, we address this problem by exploring a new segmentation-based score, called normalized Weighted Spectral Distance (nWSD), that measures only shape discrepancies using the spectrum of the Laplace operator. Through experiments on synthetic and real data, we demonstrate that nWSD provides additional information for evaluating differences between segmentations that is not captured by other commonly used scores. Our results show that, when used jointly with other scores such as Dice's similarity coefficient, nWSD allows richer, more discriminative evaluations. For the task of registration, we show that this addition lets us distinguish different types of registration errors. This allows us to identify the source of errors and to discriminate between registration results that previous evaluation studies had to treat as being of similar quality.
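The ambiguity of overlap-based scores can be made concrete with a small toy case. The sketch below is not taken from the paper: the helper names `dice` and `box`, the 64x64 grid, and the rectangle coordinates are all illustrative assumptions. It compares a reference square with (a) the same square shifted sideways, a pure pose error, and (b) a narrower rectangle with the same centre, a pure shape error, using Dice's similarity coefficient.

```python
import numpy as np

def dice(a, b):
    """Dice's similarity coefficient 2|A∩B| / (|A| + |B|) for two binary label maps."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty maps agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

def box(rows, cols, shape=(64, 64)):
    """Hypothetical helper: binary map with a filled axis-aligned rectangle."""
    mask = np.zeros(shape, dtype=bool)
    mask[rows[0]:rows[1], cols[0]:cols[1]] = True
    return mask

# Reference: a 20x20 square.
reference = box((22, 42), (22, 42))
# Pose error only: the same square translated by 5 pixels.
shifted = box((22, 42), (27, 47))
# Shape error only: a 20x12 rectangle sharing the reference centre.
thinner = box((22, 42), (26, 38))

print(dice(reference, shifted))
print(dice(reference, thinner))
```

In this toy setting both comparisons give a Dice value of 0.75, even though one pair differs only in pose and the other only in shape. This is the kind of case in which a score that responds to shape alone would carry information that Dice does not.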