Transmitting compressed data can reduce inter-processor communication traffic and create new opportunities for dynamic voltage scaling (DVS) in distributed embedded systems. However, data compression alone may not be effective unless it is coordinated with functional partitioning. This paper presents a dynamic programming technique that combines compression and functional partitioning to minimize energy consumption on multiple voltage-scalable processors running pipelined, data-regular applications under performance constraints. Our algorithm computes the optimal functional partitioning together with the CPU speed and compression ratio for each node. We validate the algorithm's effectiveness on a real distributed embedded system running an image processing algorithm.
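To make the idea concrete, the following is a minimal sketch of how such a dynamic program might be organized. It is not the paper's algorithm: the cost model, the function names (`stage_cost`, `partition`), and all parameters are illustrative assumptions. It assumes a linear pipeline of tasks split into contiguous stages, one stage per node, where each node picks one discrete speed level and one compression option for its output, compression costs extra cycles on the sender only, and every stage must finish computing and transmitting within a common per-frame deadline.

```python
from itertools import product

def stage_cost(cycles, out_bytes, speeds, codecs, bandwidth, e_byte, deadline):
    """Cheapest (energy, freq, ratio) for one stage, or None if no choice meets the deadline.

    speeds: list of (freq, energy_per_cycle); codecs: list of (ratio, cycles_per_byte).
    All of these are hypothetical model parameters in abstract units.
    """
    best = None
    for (freq, e_cycle), (ratio, comp_cpb) in product(speeds, codecs):
        total_cycles = cycles + comp_cpb * out_bytes   # compute + compression work
        sent = out_bytes * ratio                       # bytes actually transmitted
        if total_cycles / freq + sent / bandwidth > deadline:
            continue                                   # violates the performance constraint
        energy = total_cycles * e_cycle + sent * e_byte
        if best is None or energy < best[0]:
            best = (energy, freq, ratio)
    return best

def partition(task_cycles, task_out_bytes, max_nodes, speeds, codecs,
              bandwidth, e_byte, deadline):
    """Return (energy, plan) minimizing total energy, or None if infeasible.

    plan is a list of ((first_task, end_task), freq, ratio), one entry per node.
    """
    n = len(task_cycles)
    inf = float("inf")
    # best[j][k]: minimum energy to run tasks 0..j-1 on exactly k nodes
    best = [[inf] * (max_nodes + 1) for _ in range(n + 1)]
    back = [[None] * (max_nodes + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for j in range(1, n + 1):
        for k in range(1, max_nodes + 1):
            for i in range(k - 1, j):                  # last stage covers tasks i..j-1
                if best[i][k - 1] == inf:
                    continue
                choice = stage_cost(sum(task_cycles[i:j]), task_out_bytes[j - 1],
                                    speeds, codecs, bandwidth, e_byte, deadline)
                if choice is None:
                    continue
                total = best[i][k - 1] + choice[0]
                if total < best[j][k]:
                    best[j][k] = total
                    back[j][k] = (i, choice[1], choice[2])
    k = min(range(1, max_nodes + 1), key=lambda m: best[n][m])
    if best[n][k] == inf:
        return None
    energy, plan, j = best[n][k], [], n
    while j > 0:                                       # walk back through the chosen cuts
        i, freq, ratio = back[j][k]
        plan.append(((i, j), freq, ratio))
        j, k = i, k - 1
    plan.reverse()
    return energy, plan

# Toy instance (made-up numbers): two tasks, two speed levels, optional 2:1 compression.
speeds = [(100.0, 1.0), (200.0, 4.0)]
with_codec = [(1.0, 0.0), (0.5, 1.0)]
no_codec = [(1.0, 0.0)]
args = ([100, 100], [10, 0], 2, speeds)
print(partition(*args, with_codec, 10.0, 3.0, 1.7))
print(partition(*args, no_codec, 10.0, 3.0, 1.7))
```

In this toy instance, allowing compression lets the first node stay at the lower speed level and still meet the deadline, so the total energy under this model drops from 530.0 to 225.0. That illustrates, under the stated assumptions only, why compression and partitioning are worth choosing jointly rather than separately.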