Cluster analysis is a valuable tool in exploratory pattern analysis, especially when very little prior information about the data is available. Clustering techniques play an important role in unsupervised pattern recognition and image segmentation. Among them, the squared-error clustering technique is the most popular. Because squared-error clustering is iterative, it demands substantial CPU time even for a modest number of patterns. Recent advances in VLSI microelectronic technology have prompted the idea of implementing squared-error clustering directly in hardware. This paper proposes a two-level pipelined systolic pattern clustering array. The memory storage and access schemes are designed to enable a rhythmic data flow between processing units, and each processing unit is itself pipelined to further enhance system performance. The total processing time for each pass of pattern labeling and cluster center updating is essentially dominated by the time required to fetch the pattern matrix once. The detailed architectural configuration, a system performance evaluation, and simulation experiments are presented. The modularity and regularity of the system architecture make it well suited to VLSI implementation.
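
For readers unfamiliar with the procedure, one pass of pattern labeling and cluster center updating can be sketched in software as follows. This is only a minimal sequential illustration of the standard squared-error (k-means style) iteration, not the systolic architecture proposed in the paper. The function names `squared_error_pass` and `cluster`, the `max_passes` parameter, and the choice to keep an empty cluster's center unchanged are assumptions made for the example. Note that each pattern is read exactly once per pass, which is the property the abstract ties the per-pass processing time to.

```python
def squared_error_pass(patterns, centers):
    """One pass: label every pattern with its nearest center, then update centers.

    `patterns` and `centers` are lists of equal-length numeric lists
    (hypothetical layout chosen for this sketch).
    """
    k = len(centers)
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]  # per-cluster running sums
    counts = [0] * k                        # per-cluster pattern counts
    labels = []
    error = 0.0                             # squared error w.r.t. the input centers

    for x in patterns:                      # each pattern is fetched once per pass
        # Squared Euclidean distance from x to every cluster center.
        dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        j = min(range(k), key=dists.__getitem__)  # nearest center (first on ties)
        labels.append(j)
        error += dists[j]
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += x[d]

    # New center = mean of assigned patterns; an empty cluster keeps its old
    # center (an assumption of this sketch).
    new_centers = [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
        for j in range(k)
    ]
    return labels, new_centers, error


def cluster(patterns, centers, max_passes=100):
    """Repeat passes until the centers stop changing or max_passes is reached."""
    for _ in range(max_passes):
        labels, new_centers, error = squared_error_pass(patterns, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers, error
```

For example, calling `cluster([[0, 0], [0, 2], [10, 0], [10, 2]], [[0, 0], [10, 0]])` splits the four patterns into a left pair and a right pair. The repeated passes over the full pattern set are what make the software version costly and motivate a hardware implementation.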