This paper presents a technique for exploiting input statistics for energy and performance optimization of embedded software. The proposed technique is based on the fact that the computational complexities of programs or subprograms are often highly dependent on the values assumed by input and intermediate program variables during execution. This observation is exploited in the proposed software synthesis technique by augmenting the program with optimized versions of one or more subprograms that are specialized to, and executed under, specific input subspaces. We propose a methodology for input space-adaptive software synthesis that consists of the following steps: 1) control and value profiling of the input program; 2) application of compiler transformations in a preprocessing step; 3) identification of subprograms and corresponding input subspaces that hold the highest potential for optimization; and 4) an iterative application of known compiler transformations to realize performance and energy savings. We propose novel metrics based on the entropies of program variables to characterize subprograms and input subspaces that hold significant potential for optimization. The chosen subprograms are optimized by translating the input subspaces into value constraints on their variables, and iteratively applying known compiler transformations (that were not applicable in the context of the original program). We have evaluated input space-adaptive software synthesis by compiling the resulting optimized programs to two commercial embedded systems: an embedded system based on the Fujitsu SPARClite processor, and the Compaq iPAQ personal digital assistant (PDA) [64 MB memory, 206 MHz Intel StrongARM central processing unit (CPU)]. The energy and execution-time savings were calculated using energy-aware instruction-level simulators, as well as through direct-current measurement on the iPAQ. 
Our results demonstrate that the proposed technique can reduce energy by up to 54.5% (average of 30.6% and 25.6% for the SPARClite-based system and the iPAQ, respectively) while simultaneously improving performance by up to 59.6% (average of 31.3% and 31.5% for the SPARClite-based system and the iPAQ, respectively). In effect, improvements in the energy-delay product of up to 81.1% (average of 51.0% and 47.7% for the SPARClite-based system and the iPAQ, respectively) were observed. The energy savings resulting from our technique are fairly processor independent, and complementary to conventional compiler optimizations.
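As a minimal illustration of the idea of augmenting a program with a subprogram specialized to an input subspace, consider the following C sketch. It is not taken from the evaluated benchmarks: the function names (`ipow_generic`, `ipow_exp2`, `ipow`) and the assumed subspace (`exp == 2`, imagined here to dominate a value profile) are hypothetical and chosen only for exposition. Under the value constraint `exp == 2`, constant propagation and full loop unrolling reduce the loop to a single multiplication, and a guard selects the specialized version only when the input lies in that subspace.

```c
#include <assert.h>

/* Original subprogram: its cost grows with the value of exp. */
unsigned ipow_generic(unsigned base, unsigned exp) {
    unsigned result = 1u;
    while (exp-- > 0u) {
        result *= base;
    }
    return result;
}

/* Version specialized to the hypothetical input subspace exp == 2.
 * With exp known, the loop unrolls fully and the loop counter and
 * initial multiplication by 1 disappear. */
unsigned ipow_exp2(unsigned base) {
    return base * base;
}

/* Augmented program: a guard tests membership in the subspace and
 * dispatches to the specialized version; all other inputs fall back
 * to the original code, so behavior is unchanged. */
unsigned ipow(unsigned base, unsigned exp) {
    if (exp == 2u) {
        return ipow_exp2(base);
    }
    return ipow_generic(base, exp);
}
```

Callers use `ipow` exactly as they would the original routine. The guard adds a small cost to every call, so a specialization of this kind only pays off when the subspace is hit often enough and the specialized code is sufficiently cheaper, which is the trade-off that profiling and the selection metrics are meant to assess.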