A significant recent advance in hyperspectral image (HSI) classification relies on the observation that the spectral signature of a pixel can be represented by a sparse linear combination of training spectra from an overcomplete dictionary. A spatiospectral notion of sparsity is further captured by developing a joint sparsity model, wherein spectral signatures of pixels in a local spatial neighborhood (of the pixel of interest) are constrained to be represented by a common collection of training spectra, albeit with different weights. A challenging open problem is to effectively capture the class conditional correlations between these multiple sparse representations corresponding to different pixels in the spatial neighborhood. We propose a probabilistic graphical model framework to explicitly mine the conditional dependences between these distinct sparse features. Our graphical models are synthesized using simple tree structures which can be discriminatively learnt (even with limited training samples) for classification. Experiments on benchmark HSI data sets reveal significant improvements over existing approaches in classification rates as well as robustness to choice of training.