Skip to Main Content
Single or multibit subword permutations are useful in many multimedia and cryptographic applications. Several specialized instructions have been proposed to handle the required data rearrangements. In this paper, we examine the hardware implementation of the powerful permutation instruction group (GRP). The design of the proposed permutation unit is based on the functionality of sorting networks. Two variants of the sorter-based GRP unit are introduced and analyzed and their energy-delay behavior is investigated using static CMOS implementations in a 130-nm CMOS technology.