Multidestination communication is a necessary capability for many coherence protocols that aim to minimize on-chip hit latency. Although chip multiprocessors (CMPs) share this need, few suitable proposals have been developed so far. The cause is the scarcity of on-chip resources combined with the common belief that multicast support requires a substantial amount of extra resources. In this work, we propose a new approach for on-chip networks that handles multidestination traffic efficiently in hardware with negligible added complexity. We introduce a novel multicast routing mechanism that circumvents many of the limitations of conventional multicast schemes. Adaptive-tree multicasting maintains correctness for multiflit multicast messages without routing restrictions, and it couples correctness and performance in a natural way: replication restrictions not only guarantee that enough resources are available to avoid deadlock, but also dynamically adapt the tree shape to network conditions, routing multicast messages through noncongested paths. Performance results obtained with a state-of-the-art full-system simulation framework show that the proposal improves the average full-system performance of a CMP by 20 percent and network ED2P by 15 percent compared to a state-of-the-art router with conventional multicast support and similar implementation cost.