Existing learning-based face super-resolution (hallucination) techniques generate high-resolution images of a single facial modality (i.e., at a fixed expression, pose and illumination) given one or set of low-resolution face images as probe. Here, we present a generalized approach based on a hierarchical tensor (multilinear) space representation for hallucinating high-resolution face images across multiple modalities, achieving generalization to variations in expression and pose. In particular, we formulate a unified tensor which can be reduced to two parts: a global image-based tensor for modeling the mappings among different facial modalities, and a local patch-based multiresolution tensor for incorporating high-resolution image details. For realistic hallucination of unregistered low-resolution faces contained in raw images, we develop an automatic face alignment algorithm capable of pixel-wise alignment by iteratively warping the probing face to its projection in the space of training face images. Our experiments show not only performance superiority over existing benchmark face super-resolution techniques on single modal face hallucination, but also novelty of our approach in coping with multimodal hallucination and its robustness in automatic alignment under practical imaging conditions.