In various pattern-recognition problems such as classification of photographic data, preprocessing operations result in a two-dimensional array of binary random variables. An optimal recipe for classifying such patterns is described. It combines the use of an orthonormal expansion for the logarithm of probability functions with the generation of a joint probability distribution from lower-order marginals. The unwieldy nature of the optimal recipe leads to the consideration of dependence represented by a Markov chain and a two-dimensional analog called a Markov mesh. The Markov chain has a "reflecting" property in that the r-th-order nonstationary chain implies that a point depends on its r nearest neighbors on each side. By scanning along a grid-filling curve so that the 2r neighbors of a point are geometrically close to it, certain spatial dependencies are obtained. The grid-filling curves are sequences of functions which have Hilbert "space-filling" curves as their limit. This simple approach does not provide for many of the spatial dependencies that should be considered. A novel extension of Markov chain methods into two dimensions leads to the Markov mesh, which economically takes care of a much larger class of spatial dependencies. Likelihood ratio classification based on Markov chain and Markov mesh assumptions requires the estimation of a much smaller number of parameters than the general case. The development presented in this paper is new and need not be restricted to binary variables.
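As an illustrative sketch, not taken from the paper itself: once a scan (e.g., along a grid-filling curve) has flattened the binary array into a sequence, likelihood-ratio classification under two assumed Markov chain models reduces to comparing log-likelihoods. The example below uses first-order (r = 1) chains with hand-picked initial and transition probabilities; the model parameters and function names are hypothetical.

```python
import numpy as np

def markov_log_likelihood(seq, p_init, p_trans):
    """Log-likelihood of a binary sequence under a first-order Markov chain.

    p_init[s]  : P(x_0 = s)
    p_trans[a, b]: P(x_t = b | x_{t-1} = a)
    """
    ll = np.log(p_init[seq[0]])
    for prev, cur in zip(seq[:-1], seq[1:]):
        ll += np.log(p_trans[prev, cur])
    return ll

def classify(seq, model0, model1):
    """Return 1 if the log-likelihood ratio favors model1, else 0."""
    llr = (markov_log_likelihood(seq, *model1)
           - markov_log_likelihood(seq, *model0))
    return int(llr > 0)

# Hypothetical class-conditional models:
# class 1 favors runs (persistent states), class 0 favors alternation.
init = np.array([0.5, 0.5])
model1 = (init, np.array([[0.9, 0.1], [0.1, 0.9]]))
model0 = (init, np.array([[0.1, 0.9], [0.9, 0.1]]))

print(classify([0, 0, 0, 1, 1, 1], model0, model1))  # run-heavy -> 1
print(classify([0, 1, 0, 1, 0, 1], model0, model1))  # alternating -> 0
```

An r-th-order chain would condition each point on its r predecessors (2r neighbors along the scan), multiplying the number of transition parameters by 2^(r-1) but leaving the likelihood-ratio comparison unchanged; this is still far fewer parameters than the general joint distribution over the whole array.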