1. Introduction
With the fast growth of consumer electronic products, the demand on digital signal processings related to the multimedia, such as image and video processing, is also increased significantly. To meet the performance challenges, high-end embedded processor and DSP processors are moving towards exploiting intensively instruction level parallelism (ILP). In addition, the tight resources in embedded systems also have DSP processors to adopt architecture features such as distributed register files to reduce the amount of read and write ports in registers to reduce power consumptions [1], special data path to exploit the characteristics of DSP applications, variable length encoding schemes for instructions to reduce code size, the sub-word instructions for video applications, etc. The appearance of these new features create challenges to deploy software toolkits for new embedded DSP processors.