In this paper, we study how to reduce energy consumption in large-scale sensor networks that systematically sample a spatio-temporal field. We begin by formulating a distributed compression problem subject to aggregation (energy) costs to a single sink. We show that the optimal solution is greedy: it orders sensors according to their aggregation costs (typically related to proximity) and, perhaps surprisingly, is independent of the distribution of data sources. Next, we consider a simplified hierarchical model for a sensor network that includes multiple sinks, compressors/aggregation nodes, and sensors. Using a reasonable metric for energy cost, we show that the optimal organization of devices is associated with a Johnson-Mehl tessellation induced by their locations. Drawing on techniques from stochastic geometry, we analyze the energy savings that optimal hierarchies provide relative to previously proposed organizations based on proximity, i.e., the associated Voronoi tessellations. Our analysis and simulations show that an optimal organization of aggregation/compression can yield energy savings of 8% to 28%, depending on the compression ratio.
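
To make the contrast between the two organizations concrete, the following minimal sketch compares a Voronoi (nearest-site) assignment with a Johnson-Mehl-style assignment, in which each site carries an additive offset and a point joins the site whose growth front reaches it first. The function names, the `offsets` values, and the reading of an offset as an aggregator's onward cost to its sink are illustrative assumptions, not the paper's model or cost metric.

```python
import math

def nearest_site(point, sites):
    """Voronoi rule: index of the site closest to `point`."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def johnson_mehl_site(point, sites, offsets, speed=1.0):
    """Johnson-Mehl rule: index of the site whose growth front, started at
    time offsets[i] and expanding at `speed`, reaches `point` first."""
    return min(
        range(len(sites)),
        key=lambda i: offsets[i] + math.dist(point, sites[i]) / speed,
    )

# Hypothetical layout: two aggregation nodes and one sensor on a line.
aggregators = [(0.0, 0.0), (4.0, 0.0)]
# Assumed offsets, e.g. a stand-in for each aggregator's onward cost.
offsets = [3.0, 0.0]
sensor = (1.5, 0.0)

voronoi_choice = nearest_site(sensor, aggregators)
jm_choice = johnson_mehl_site(sensor, aggregators, offsets)
print(voronoi_choice, jm_choice)
```

In this toy layout the sensor is geometrically closer to the first aggregator, but the offset makes the second one the earlier arrival, so the two rules disagree. With all offsets equal, the Johnson-Mehl rule reduces to the Voronoi rule.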