Enabling High-Performance Crossbars through a Floorplan-Aware Design

5 Author(s)
Roca, A. ; Grupo de Arquitecturas Paralelas (GAP), Univ. Politec. de Valencia, Valencia, Spain ; Hernandez, C. ; Flich, J. ; Silla, F.

Networks-on-Chip (NoCs) with low-radix switches forming a simple, planar topology are typically accepted as the right interconnection infrastructure for current chip multiprocessors and high-end multiprocessor systems-on-chip. This is mainly due to the simplicity of their physical mapping onto the chip. However, as the network diameter increases, latency and power consumption increase because of the rapidly growing queuing delay in each switch, and they do not scale with system size. In this context, topologies with high-radix switches have recently been proposed for NoCs to keep message latency low when interconnecting a large number of devices. However, the use of high-radix switches presents several well-known drawbacks, the most important being poor scalability in area and frequency when implemented. In addition, average and maximum wire lengths increase. In this paper we present a distributed crossbar NoC architecture that reduces network latency and significantly increases network throughput. For the distributed crossbar implementation, trees of 2-to-1 multiplexers with arbitration and buffering capabilities are spread over the chip, avoiding the negative impact of a high switch radix on NoC operating frequency and minimizing the impact of long wires. Results show that in a 64-node NoC our most aggressive distributed crossbar configuration reduces flit latency by 42% and increases throughput by 544% with respect to the low-latency flattened butterfly architecture, while area increases by 110%. A more conservative distributed crossbar configuration increases throughput by 276.1% and decreases latency by 29.7%, while also decreasing area by 7%.

Published in:

Parallel Processing (ICPP), 2012 41st International Conference on

Date of Conference:

10-13 Sept. 2012