A prototype content-based image retrieval system has been built using a client/server architecture that gives access to supercomputing power from the physician's desktop. The system retrieves images and their associated annotations from a networked microscopic pathology image database based on content similarity to user-supplied query images. Similarity is evaluated using four image feature types (color histogram, image texture, Fourier coefficients, and wavelet coefficients), with the vector dot product as the distance measure. Current retrieval accuracy varies across pathological categories, depending on the number of available training samples and the effectiveness of the feature set. The distance measure of the search algorithm was validated by agglomerative cluster analysis, interpreted in light of medical domain knowledge. Results show a correlation between pathological significance and the image document distance value generated by the computer algorithm, and this correlation agrees with observed visual similarity. This validation method has an advantage over traditional statistical evaluation methods when the sample size is small and domain knowledge is important. A multidimensional scaling analysis shows that the embedded space for the current test set has low dimensionality.
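The similarity evaluation described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the function names, the toy feature values, the per-type unit-length normalization, and the equal weighting of the four feature types are all assumptions made for illustration. Because a dot product of normalized vectors grows as two images become more alike, the sketch ranks database images in descending order of score.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combined_descriptor(features):
    """Concatenate the per-type feature vectors into one unit-length vector.

    `features` maps a feature-type name to a list of floats. Normalizing each
    type first, so that all types carry equal weight, is an assumption of
    this sketch and is not stated in the abstract.
    """
    combined = []
    for name in sorted(features):  # fixed order keeps vectors aligned
        combined.extend(l2_normalize(features[name]))
    return l2_normalize(combined)


def dot(a, b):
    """Plain vector dot product."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query, database):
    """Return (image_id, score) pairs, most similar first."""
    q = combined_descriptor(query)
    scores = [
        (image_id, dot(q, combined_descriptor(feats)))
        for image_id, feats in database.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy feature values, far shorter than real descriptors.
query = {
    "color": [0.7, 0.2, 0.1],
    "texture": [0.5, 0.5],
    "fourier": [1.0, 0.0],
    "wavelet": [0.3, 0.7],
}
database = {
    "slide_a": {
        "color": [0.7, 0.2, 0.1],
        "texture": [0.5, 0.5],
        "fourier": [1.0, 0.0],
        "wavelet": [0.3, 0.7],
    },
    "slide_b": {
        "color": [0.1, 0.2, 0.7],
        "texture": [0.9, 0.1],
        "fourier": [0.0, 1.0],
        "wavelet": [0.8, 0.2],
    },
}

ranking = rank_by_similarity(query, database)
for image_id, score in ranking:
    print(image_id, round(score, 3))
```

In this toy case `slide_a` has the same features as the query, so it ranks first. A distance-like value could be derived from the score (for example, one minus the normalized dot product), but that conversion is likewise an assumption and not something the abstract specifies.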