Versatile Direct and Transpose Matrix Multiplication with Chained Operations: An Optimized Architecture Using Circulant Matrices | IEEE Journals & Magazine | IEEE Xplore