Performance optimization of a data transfer controller for parallel matrix multiplication in FPGAs | IEEE Conference Publication | IEEE Xplore