Image segmentation is the partitioning of an image into a set of nonoverlapping regions whose union is the entire image. The image is decomposed into meaningful parts that are uniform with respect to certain characteristics, such as gray level or texture. In this paper, we propose a methodology for evaluating medical image segmentation algorithms when the only reference information available is a set of boundaries outlined by multiple expert observers. In this case, the results of the segmentation algorithm can be evaluated against the multiple observers' outlines. We derive statistics to determine whether the computer-generated boundaries agree with the observers' hand-outlined boundaries as much as the different observers agree with each other. We illustrate the methodology by evaluating image segmentation algorithms in two ultrasound imaging applications. In the first application, we attempt to find the epicardial and endocardial boundaries in cardiac ultrasound images; in the second, our goal is to find the fetal skull and abdomen boundaries in prenatal ultrasound images.
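
The comparison described above can be made concrete with a minimal Python sketch. It rests on assumptions that the abstract does not state: boundaries are represented as lists of (x, y) points, disagreement between two boundaries is measured with the symmetric Hausdorff distance, and the summary statistic is a simple ratio of mean inter-observer distance to mean computer-to-observer distance. The function names `hausdorff` and `agreement_ratio` are hypothetical and chosen for illustration only; this is not necessarily the statistic derived in the paper.

```python
import math
from itertools import combinations


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two boundaries (lists of (x, y))."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def agreement_ratio(computer, observers):
    """Mean inter-observer distance divided by mean computer-observer distance.

    Illustrative assumption: a ratio near or above 1 suggests the computer
    boundary disagrees with the observers no more than they disagree with
    each other. Requires at least two observers.
    """
    inter = [hausdorff(a, b) for a, b in combinations(observers, 2)]
    comp = [hausdorff(computer, o) for o in observers]
    mean_inter = sum(inter) / len(inter)
    mean_comp = sum(comp) / len(comp)
    if mean_comp == 0:
        # The computer boundary coincides with every observer outline.
        return math.inf
    return mean_inter / mean_comp


# Toy data: three observers outline the same edge at slightly different heights.
observers = [
    [(0, 0), (1, 0)],
    [(0, 1), (1, 1)],
    [(0, 2), (1, 2)],
]
computer = [(0, 1), (1, 1)]  # coincides with the middle observer

ratio = agreement_ratio(computer, observers)
```

In this toy case the computer boundary sits between the outer observers, so its mean distance to the observers is smaller than the mean distance among the observers themselves and the ratio exceeds 1. A real evaluation would also need a confidence interval around such a ratio before drawing conclusions.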