This paper proposes a hierarchical learning approach to human action recognition. The approach has two components: feature extraction based on hierarchical nonlinear dimensionality reduction, and action modeling based on a cascade of discriminative models. Human actions are inferred from the motions of body joints, and the human body is decomposed into several physiological body parts according to its inherent hierarchy (e.g., the right arm, left arm, and head all belong to the upper body). We explore the underlying hierarchical structure of the high-dimensional human pose space using a hierarchical Gaussian process latent variable model (HGPLVM) and learn a representative set of motion patterns for each body part. In the resulting hierarchical manifold space, a bottom-up cascade of conditional random fields (CRFs) predicts the corresponding motion pattern in each manifold subspace, and a discriminative classifier then estimates the final action label for each observation from the current set of motion patterns.
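The bottom-up flow described above can be illustrated with a minimal sketch. It is not the method of the paper: the HGPLVM and the CRFs are replaced by simple stand-ins (a majority vote over child labels and a lookup table), and every name in it (`HIERARCHY`, `predict_patterns`, `classify_action`, the pattern and action labels) is a hypothetical chosen for illustration. It only shows how motion-pattern labels could propagate from leaf body parts up the hierarchy before a final action label is assigned.

```python
from collections import Counter

# Hypothetical body-part hierarchy (parent -> children); illustrative only.
HIERARCHY = {
    "whole_body": ["upper_body", "lower_body"],
    "upper_body": ["head", "left_arm", "right_arm"],
    "lower_body": ["left_leg", "right_leg"],
    "head": [], "left_arm": [], "right_arm": [],
    "left_leg": [], "right_leg": [],
}


def bottom_up_order(root, hierarchy):
    """Return nodes in post-order, so every child precedes its parent."""
    order = []

    def visit(node):
        for child in hierarchy[node]:
            visit(child)
        order.append(node)

    visit(root)
    return order


def predict_patterns(leaf_patterns, hierarchy, root="whole_body"):
    """Propagate motion-pattern labels from the leaves to the root.

    Stand-in for the cascade of CRFs: each internal node simply takes
    the majority label of its children.
    """
    patterns = {}
    for node in bottom_up_order(root, hierarchy):
        children = hierarchy[node]
        if not children:
            # Leaf: use the pattern observed for this body part.
            patterns[node] = leaf_patterns[node]
        else:
            votes = Counter(patterns[c] for c in children)
            patterns[node] = votes.most_common(1)[0][0]
    return patterns


def classify_action(patterns, rules, default="unknown"):
    """Stand-in for the final discriminative classifier: a table lookup
    on the (upper_body, lower_body) pattern pair."""
    key = (patterns["upper_body"], patterns["lower_body"])
    return rules.get(key, default)


# Usage with made-up leaf patterns and a made-up rule table.
leaves = {
    "head": "still", "left_arm": "swing", "right_arm": "swing",
    "left_leg": "stride", "right_leg": "stride",
}
rules = {("swing", "stride"): "walking"}
patterns = predict_patterns(leaves, HIERARCHY)
action = classify_action(patterns, rules)
```

In this toy setup the upper body inherits the majority arm pattern and the lower body the leg pattern, and the pair is mapped to an action. In a learned system the vote and the lookup would each be replaced by a trained model operating on latent-space features.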