In array processors, data often must be reordered so that computations are performed in the correct order. Matrix transpose is one such reordering operation, used, e.g., in block-based video coding implementations. In this paper, a parameterized decomposition of the permutation matrix performing the 2^k × 2^k matrix transpose is derived. Based on this decomposition, a systematic approach to designing register-based reordering units is proposed, in which the number of ports, 2^q, can be varied, q ≤ k.
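As a rough illustration of what "the permutation performing matrix transpose" means, the Python sketch below builds that permutation for a matrix stored in row-major order. It assumes the matrix side is a power of two, N = 2^k, in which case transposition amounts to swapping the upper and lower k bits of each element's linear index. The function names are hypothetical, and the code shows only the overall permutation, not the paper's parameterized decomposition or its register-based hardware units.

```python
def transpose_permutation(k):
    """Return perm where perm[src] is the destination index of element src
    when a row-major 2^k x 2^k matrix is transposed (assumed layout)."""
    n = 1 << k            # matrix side length, N = 2^k
    mask = n - 1          # selects the low k bits (the column index)
    perm = []
    for idx in range(n * n):
        row = idx >> k    # high k bits of the linear index
        col = idx & mask  # low k bits of the linear index
        # Transposing swaps row and column, i.e. swaps the two k-bit fields.
        perm.append((col << k) | row)
    return perm


def apply_permutation(data, perm):
    """Move data[src] to position perm[src]."""
    out = [None] * len(data)
    for src, dst in enumerate(perm):
        out[dst] = data[src]
    return out


def transpose_flat(data, k):
    """Transpose a flattened row-major 2^k x 2^k matrix."""
    return apply_permutation(data, transpose_permutation(k))
```

For example, `transpose_flat([0, 1, 2, 3], 1)` reorders the flattened 2 × 2 matrix to `[0, 2, 1, 3]`, and applying the function twice restores the original order.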