Skip to Main Content
In this paper, we will study the on-chip network and memory hierarchy design of the Godson-T - a homogeneous many-core processor. Godson-T has 64 cores (with private L1 cache), and 16 global L2 cache banks. All these on-chip units are connected by a 2D 8 Ã 8 mesh network. Our study reveals that:(a) Global on-chip L2 cache can effectively alleviate the memory pressure caused by the data-thirsty on-chip computing engines. However, its potential is still limited by both the off-chip and the in-chip bandwidth, especially when increasing the number of active threads.(b) On-chip traffic congestion is largely caused by the intensive memory access requests issued from the on-chipcores. Therefore, the design of the on-chip network must consider the available performance of the datapath that connects the processor to the main memory. (c) In theory, different applications have different communication patterns (Berkeley's view). However, the application's runtime communication pattern is only determined by the design of the underlying memory hierarchy and on-chip interconnection. These conclusions are generally applicable to a wide variety of many-core processors with similar design.