In this paper we investigate several common bus architectures and measure the effective bandwidth between High Performance Computing cores and off-chip memory. Contributions of this paper include (i) a characterization of the behavior of four common organizations using off-the-shelf IP cores, (ii) an investigation of the effect of multiple computational cores sharing the bus structures, and (iii) the development of a testing methodology which simulates different access patterns and accurately measures bandwidth. The results show that while some bus architectures are clearly better than others, none approach the theoretical bandwidth of the memory interface. Furthermore, negotiating the bus protocol is a significant source of overhead, so much so that it effectively hides any performance one might gain from trying to access the off-chip DRAMs using an "intelligent" access pattern.