An efficient approach to face image compression is introduced. Restricting the image family to frontal facial mug shots enables us to first geometrically deform a given face into a canonical form in which the same facial features are mapped to the same spatial locations. Next, we break the image into tiles and model each tile in a compact manner. The tile content is modeled by clustering the tiles at the same location across many training images. A tree of vector-quantization dictionaries is constructed per location, and lossy compression is achieved by allocating bits according to the significance of each tile. By repeating this modeling/coding scheme over several scales, the resulting multiscale algorithm is shown to compress facial images at very low bit rates while maintaining high visual quality, significantly outperforming JPEG-2000.
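The per-location modeling step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the faces have already been warped to the canonical form and are given as equally sized grayscale arrays. A flat k-means codebook per tile location stands in for the tree of vector-quantization dictionaries, a single scale is used, and the per-tile bit budget `bits` is supplied by hand rather than derived from tile significance. All function names are hypothetical.

```python
import numpy as np

def split_tiles(img, tile):
    """Split an (H, W) image into non-overlapping tile x tile blocks, row-major."""
    h, w = img.shape
    rows, cols = h // tile, w // tile
    blocks = img[:rows * tile, :cols * tile].reshape(rows, tile, cols, tile)
    return blocks.transpose(0, 2, 1, 3).reshape(rows * cols, tile * tile)

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd k-means (a stand-in for tree-structured VQ); returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    codebook = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dist = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                codebook[j] = members.mean(axis=0)
    return codebook

def train(images, tile, bits):
    """Train one codebook per tile location; location t gets 2**bits[t] entries."""
    tiles = np.stack([split_tiles(im, tile) for im in images])  # (N, T, d)
    return [kmeans(tiles[:, t, :], 2 ** bits[t]) for t in range(tiles.shape[1])]

def encode(img, tile, codebooks):
    """Replace each tile by the index of its nearest codeword at that location."""
    tiles = split_tiles(img, tile)
    return [int(((cb - v) ** 2).sum(axis=1).argmin())
            for v, cb in zip(tiles, codebooks)]

def decode(indices, codebooks, shape, tile):
    """Rebuild the image by looking up each index in its location's codebook."""
    rows, cols = shape[0] // tile, shape[1] // tile
    tiles = np.stack([cb[i] for i, cb in zip(indices, codebooks)])
    tiles = tiles.reshape(rows, cols, tile, tile).transpose(0, 2, 1, 3)
    return tiles.reshape(rows * tile, cols * tile)
```

Under these assumptions an encoded face costs `sum(bits)` bits for the indices, so giving more bits to tiles that matter more (for instance those covering the eyes or mouth) and fewer to the rest is one simple way to mimic significance-based bit allocation. Each location needs at least `2**bits[t]` training tiles for the codebook initialization to succeed.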