With shrinking feature sizes and the emergence of multi-core, multi-threaded processors, performance requirements are largely met by hardware scaling. Despite the ubiquity of multi-cores, delivering high single-thread performance remains as important as ever. Multithreaded processors, by simultaneously exploiting both the thread-level parallelism and the instruction-level parallelism of applications, achieve higher instructions-per-cycle rates than single-threaded processors. In recent multi-core, multi-threaded systems, performance and power consumption depend strongly on the average memory access time and the power consumed by the memory hierarchy. This makes the cache a major component in the design of multi-threaded, multi-core embedded processor architectures. In this paper we perform a comprehensive design space exploration to find cache sizes that yield the best trade-offs between the performance, power, and area of the processor. Finally, we run multiple threads on the proposed optimum architecture to determine the maximum exploitable thread-level parallelism, building on the performance-per-power and area-efficient single-thread architecture.
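The trade-off search described above can be pictured as a multi-objective design space exploration. The following sketch is purely illustrative and is not the paper's methodology: the cache sizes, the analytical model in `evaluate`, and all coefficients are invented for demonstration. It shows the general idea of enumerating cache configurations and keeping only the Pareto-optimal points with respect to performance, power, and area.

```python
# Illustrative sketch only: brute-force exploration over hypothetical cache
# configurations, keeping the Pareto-optimal performance/power/area points.
# The model and all constants are assumptions, not taken from the paper.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Config:
    l1_kb: int  # L1 cache size (KB)
    l2_kb: int  # L2 cache size (KB)

@dataclass
class Metrics:
    ipc: float       # performance (instructions per cycle), higher is better
    power_w: float   # power (watts), lower is better
    area_mm2: float  # die area (mm^2), lower is better

def evaluate(cfg: Config) -> Metrics:
    # Placeholder analytical model: larger caches raise IPC with diminishing
    # returns, while power and area grow roughly linearly with cache size.
    ipc = 1.0 + 0.1 * cfg.l1_kb ** 0.5 + 0.02 * cfg.l2_kb ** 0.5
    power = 0.5 + 0.01 * cfg.l1_kb + 0.002 * cfg.l2_kb
    area = 1.0 + 0.05 * cfg.l1_kb + 0.01 * cfg.l2_kb
    return Metrics(ipc, power, area)

def dominates(a: Metrics, b: Metrics) -> bool:
    # a dominates b if it is no worse in every objective and better in one.
    no_worse = (a.ipc >= b.ipc and a.power_w <= b.power_w
                and a.area_mm2 <= b.area_mm2)
    better = (a.ipc > b.ipc or a.power_w < b.power_w
              or a.area_mm2 < b.area_mm2)
    return no_worse and better

def pareto_front(configs):
    evaluated = {c: evaluate(c) for c in configs}
    return [c for c, m in evaluated.items()
            if not any(dominates(m2, m)
                       for c2, m2 in evaluated.items() if c2 != c)]

# Hypothetical design space: a grid of L1 and L2 sizes.
space = [Config(l1, l2)
         for l1, l2 in product([8, 16, 32, 64], [128, 256, 512])]
front = pareto_front(space)
```

In a real exploration, `evaluate` would be replaced by cycle-accurate simulation of each configuration under the benchmark workloads, and the final design would be chosen from the Pareto front according to the desired performance-per-power weighting.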