In this paper, we present a novel and fast constructive technique that relocates the instruction code in such a manner into the main memory that the cache is utilized more efficiently. The technique is applied as a preprocessing step, i.e., before the code is executed. Our technique is applicable in embedded systems where the number and characteristics of tasks running on the system is known a priori. The technique does not impose any computational overhead to the system. As a result of applying our technique to a variety of real-world applications we observed through simulation a significant drop of cache misses. Furthermore, the energy consumption of the whole system (CPU, caches, buses, main memory) is reduced by up to 65%. These benefits could be achieved by a slightly increased main memory size of about 13% on average.