Design and Analysis of On-Chip Networks for Large-Scale Cache Systems

Authors: Yuho Jin (Univ. of Southern California, Los Angeles, CA, USA); Eun Jung Kim; Ki Hwan Yum

Switched networks have been adopted in on-chip communication for their scalability and efficient resource sharing. However, using a general-purpose network for a specific domain may result in unnecessarily high cost and low performance when the interconnects are not optimized for that domain. Designing an optimal network for a specific domain is challenging because it requires in-depth knowledge of both the interconnects and the application domain. Recently proposed Nonuniform Cache Architectures (NUCAs) use wormhole-routed 2D mesh networks in L2 caches. We observe that in NUCAs, network resources are underutilized despite a considerable area cost (41 percent of cache), and the network delay is significant (63 percent of cache access time). Motivated by these observations, we investigate both router architecture and network topology suited to the communication behavior of large-scale cache systems. We present Fast-LRU replacement, where cache replacement overlaps with data request delivery. Next, we propose a deadlock-free XYX routing algorithm in a mesh network and present a new halo network topology to reduce the required links. Finally, we introduce a single-cycle multicast router that requires only small modifications to the unicast router design. Simulation results show that our design improves the average IPC by 38 percent over the mesh design with Multicast Promotion replacement and uses 12 percent of the interconnection area of the mesh network.

Published in: IEEE Transactions on Computers (Volume: 59, Issue: 3)