Code density is of increasing concern in embedded system design, since denser code reduces the demand for scarce memory and implicitly improves other important design parameters such as power consumption and performance. In this paper we introduce a novel, hardware-supported code compression approach. Besides the code itself, it also compresses the lookup tables (LUTs), which can become significant in size if the application is large and/or high compression is desired. Our scheme optimizes the number and size of the generated LUTs to improve the compression ratio. To show the efficiency of our approach, we apply it to two compression schemes: "dictionary-based" and "statistical". We achieve an average compression ratio of 48% (already including the overhead of the LUTs). Our scheme is orthogonal to approaches that take particularities of a certain instruction set architecture into account. We have evaluated it using a representative set of applications on three major embedded processor architectures, namely ARM, MIPS, and PowerPC.
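To illustrate why LUT overhead matters for the reported ratio, the following is a minimal sketch of a generic dictionary-based scheme, not the method of this paper. It assumes that the compression ratio is compressed size (including the LUT) divided by original size, that instructions are fixed-width words, and that each compressed word carries one flag bit; the function names `dictionary_compress` and `compression_ratio` and the parameters `index_bits` and `word_bits` are hypothetical.

```python
from collections import Counter


def dictionary_compress(instructions, index_bits, word_bits=32):
    """Return (compressed_bits, lut_bits) for a simple dictionary scheme.

    Assumption: the 2**index_bits most frequent words go into the LUT;
    every emitted word carries a 1-bit flag (LUT index vs. raw word).
    """
    lut_entries = 2 ** index_bits
    counts = Counter(instructions)
    lut = [word for word, _ in counts.most_common(lut_entries)]
    in_lut = set(lut)

    compressed_bits = 0
    for word in instructions:
        if word in in_lut:
            compressed_bits += 1 + index_bits  # flag + LUT index
        else:
            compressed_bits += 1 + word_bits   # flag + uncompressed word

    lut_bits = len(lut) * word_bits            # storage cost of the LUT itself
    return compressed_bits, lut_bits


def compression_ratio(instructions, index_bits, word_bits=32):
    """Compressed size including LUT overhead, divided by original size."""
    original_bits = len(instructions) * word_bits
    compressed_bits, lut_bits = dictionary_compress(
        instructions, index_bits, word_bits
    )
    return (compressed_bits + lut_bits) / original_bits
```

Calling `compression_ratio` on the same instruction stream with different `index_bits` values shows the trade-off the abstract alludes to: a larger LUT shortens more instructions but adds its own storage cost to the total.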