Offloading Collective Operations to Programmable Logic on a Zynq Cluster | IEEE Conference Publication | IEEE Xplore