We present a method for sampling feature vectors in large (e.g., 2000 × 5000 × 16 bit) images that finds subsets of pixel locations which represent c "regions" in the image. Samples are accepted by the chi-square (χ²) or divergence hypothesis test. A framework that captures the idea of efficient extension of image processing algorithms from the samples to the rest of the population is given. Computationally expensive (in time and/or space) image operators (e.g., neural networks (NNs) or clustering models) are trained on the sample, and then extended noniteratively to the rest of the population. We illustrate the general method using fuzzy c-means (FCM) clustering to segment Indian satellite images. On average, the new method can achieve about 99% accuracy (relative to running the literal algorithm) using roughly 24% of the image for training. This amounts to an average savings of 76% in CPU time. We also compare our method to its closest relative in the group of schemes used to accelerate FCM: our method averages a speedup of about 4.2, whereas the multistage random sampling approach achieves an average acceleration of 1.63.
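The train-on-a-sample, extend-noniteratively idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain uniform random sampling stands in for the χ²/divergence-accepted sampling, the fuzzifier m = 2 and the tiny synthetic two-region image are assumptions, and the helper names `fcm` and `memberships` are hypothetical. Only the 24% sample fraction is taken from the text. FCM iterates on the sample alone; the remaining pixels then receive memberships in a single pass from the final cluster centers.

```python
import numpy as np

def memberships(X, V, m=2.0):
    """FCM membership of each row of X in each center of V (one pass, no iteration)."""
    # Squared Euclidean distances, shape (n_points, n_centers).
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)          # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))      # equals d ** (-2 / (m - 1))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard alternating FCM on the rows of X; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)   # each row of U sums to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted means as centers
        U_new = memberships(X, V, m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

# Assumed toy data: a 40 x 50 image, 3 features per pixel, two well-separated regions.
rng = np.random.default_rng(0)
image = np.empty((40, 50, 3))
image[:, :25] = rng.normal(0.0, 0.1, (40, 25, 3))
image[:, 25:] = rng.normal(1.0, 0.1, (40, 25, 3))
pixels = image.reshape(-1, 3)

# Train on roughly 24% of the pixels (uniform sampling is a stand-in here).
n_sample = int(0.24 * pixels.shape[0])
sample_idx = rng.choice(pixels.shape[0], size=n_sample, replace=False)
centers, _ = fcm(pixels[sample_idx], c=2)

# Noniterative extension: one membership computation for every pixel.
U_full = memberships(pixels, centers)
labels = U_full.argmax(axis=1).reshape(40, 50)
```

For a full-size image the distance computation in `memberships` would need to be done in chunks to bound memory, since the broadcast builds an array with one entry per pixel, center, and feature.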