This paper presents a dynamic programming approach to the design of optimal pattern recognition systems when the cost of measuring the features that describe the pattern samples is significant. A multistage, or sequential, pattern classifier is defined and constructed by the method of recursive optimization; on the average, it requires substantially fewer feature measurements than an equally reliable nonsequential classifier. Two methods of reducing the dimensionality of the computation are presented, for the cases where the observed feature measurements are 1) statistically independent and 2) Markov dependent. Both models, in general, provide a ready solution to the optimal sequential classification problem. A generalization to the design of optimal classifiers capable of selecting a best sequence of feature measurements is also discussed. Computer-simulated experiments in character recognition illustrate the feasibility of this approach.
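
As a minimal sketch of recursive optimization in the statistically independent case, the following Python fragment computes the minimum expected risk of a sequential classifier by backward recursion. It is an illustration under assumptions that are not taken from the paper: two classes, binary features measured in a fixed order, 0-1 misclassification loss, and a constant cost per measurement. The function and parameter names (`sequential_risk`, `prior1`, `p_x1_given_c1`, `p_x1_given_c2`, `cost`) are hypothetical. Because the features are assumed conditionally independent given the class, the stage index and the current posterior probability suffice as the state, which is what keeps the computation low-dimensional.

```python
from functools import lru_cache


def sequential_risk(prior1, p_x1_given_c1, p_x1_given_c2, cost):
    """Minimum expected risk (0-1 loss plus measurement cost), by backward recursion.

    Assumed setup (illustrative only):
      prior1            prior probability of class 1 in a two-class problem
      p_x1_given_c1[k]  P(x_k = 1 | class 1) for binary feature k
      p_x1_given_c2[k]  P(x_k = 1 | class 2) for binary feature k
      cost              cost charged for each feature measurement
    """
    n = len(p_x1_given_c1)

    @lru_cache(maxsize=None)
    def value(k, post1):
        # Risk of stopping now and deciding for the more probable class.
        stop = min(post1, 1.0 - post1)
        if k == n:
            return stop  # no features left: the classifier must decide

        # Expected risk of measuring feature k, then continuing optimally.
        cont = cost
        for x in (0, 1):
            l1 = p_x1_given_c1[k] if x else 1.0 - p_x1_given_c1[k]
            l2 = p_x1_given_c2[k] if x else 1.0 - p_x1_given_c2[k]
            px = post1 * l1 + (1.0 - post1) * l2  # predictive probability of x
            if px > 0.0:
                # Bayes update of the class-1 posterior after observing x.
                cont += px * value(k + 1, post1 * l1 / px)

        # Stop or continue, whichever has the smaller expected risk.
        return min(stop, cont)

    return value(0, prior1)
```

For example, with a single feature, equal priors, P(x = 1 | class 1) = 0.8, P(x = 1 | class 2) = 0.2, and a measurement cost of 0.05, the recursion prefers taking the measurement (risk of about 0.25) over deciding immediately (risk 0.5). When the cost is raised high enough, the same recursion stops at once and returns the prior risk.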