Characterizing the performance of automated contouring algorithms has been a persistent challenge. The accuracy of automated contouring of medical images is difficult to quantify because clinical data lack a ground-truth segmentation. A common strategy is to compare the automated contour with the segmentation produced by an expert or by a group of experts. The known problem is that inter-observer and intra-observer variability is very high, especially when the images to be segmented have low contrast. We present a Monte Carlo method that accounts for this variability and uses it to generate a set of random contours. We then compare the automatic contour with each sampled contour, obtaining an estimate of the probability that the automatic method is accurate. Our method appears to provide more information than the classical evaluation of similarity between automatic and expert contours, and it avoids the need to construct a ground truth.
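
The sampling-and-comparison idea can be illustrated with a minimal sketch. It is not the implementation used in this work, and every modelling choice in it is an assumption made for illustration: contours are represented as radii at a fixed set of angles around a common centre (star-shaped regions), observer variability is modelled as an independent Gaussian per angle fitted to the expert radii, similarity is measured with the Dice coefficient, and "accurate" means a Dice value at or above a hypothetical threshold. The function names `dice_polar`, `sample_contour`, and `probability_accurate` are likewise hypothetical.

```python
import random
import statistics


def dice_polar(r_a, r_b):
    """Dice overlap of two star-shaped regions given as radii at the same angles."""
    # Sector areas are proportional to r^2; the common angular step cancels out.
    inter = sum(min(a, b) ** 2 for a, b in zip(r_a, r_b))
    total = sum(a ** 2 for a in r_a) + sum(b ** 2 for b in r_b)
    return 2.0 * inter / total


def sample_contour(expert_contours, rng):
    """Draw one random contour from a per-angle Gaussian fitted to expert radii.

    Assumption: angles are treated as independent, which ignores the
    smoothness of real contours.
    """
    sampled = []
    for radii_at_angle in zip(*expert_contours):
        mu = statistics.mean(radii_at_angle)
        sigma = statistics.stdev(radii_at_angle)  # needs at least two experts
        sampled.append(max(rng.gauss(mu, sigma), 1e-6))  # keep radius positive
    return sampled


def probability_accurate(auto_contour, expert_contours,
                         n_samples=2000, threshold=0.9, seed=0):
    """Fraction of sampled contours whose Dice overlap with the automatic
    contour reaches the (assumed) accuracy threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        candidate = sample_contour(expert_contours, rng)
        if dice_polar(auto_contour, candidate) >= threshold:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    # Three hypothetical expert contours, sampled at 8 angles.
    experts = [[10.0] * 8, [10.5] * 8, [9.5] * 8]
    print(probability_accurate([10.0] * 8, experts))  # contour close to the experts
    print(probability_accurate([5.0] * 8, experts))   # contour far too small
```

Under these assumptions, the returned fraction plays the role of the estimated probability that the automatic contour is accurate. A more realistic variability model would correlate neighbouring angles so that sampled contours stay smooth.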