Lightweight Cryptographic Hash Functions: Design Trends, Comparative Study, and Future Directions

The emergence of the Internet of Things (IoT) has enabled billions of devices that collect large amounts of data to be connected. Therefore, IoT security has fundamental requirements. One critical aspect of IoT security is data integrity. Cryptographic hash functions are cryptographic primitives that provide data integrity services. However, due to the limitations of IoT devices, existing cryptographic hash functions are not suitable for all IoT environments. As a result, researchers have proposed various lightweight cryptographic hash function algorithms. In this paper, we discuss advanced lightweight cryptographic hash functions for highly constrained devices, categorize design trends, analyze cryptographic aspects and cryptanalytic attacks, and present a comparative analysis of different hardware and software implementations. In the final section of this paper, we highlight present research challenges and suggest future research topics related to the design of lightweight cryptographic hash functions.


I. INTRODUCTION
The Internet of Things (IoT) is an essential component of computer science and information technology research. An enormous amount of research on the IoT has been conducted owing to IoT applications in various fields, including automotive systems, sensor networks, healthcare, distributed control systems, cyber-physical systems, smart grids, agriculture, smart cities, smart homes, transport and logistics, and smart factories. Moreover, IoT Analytics [1] has predicted that the number of IoT connections will reach 30.90 billion by 2025. The increase in the number of IoT devices has led to more IoT connections than non-IoT connections.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiangxue Li.
These connected devices pose the same dilemma as connectivity between people: convenience and security. Among these connected devices are devices with the same or similar resources as standard computers; however, many devices have limitations. Devices with similar resources to standard computers can use standard cryptography primitives; however, other devices require unique designs due to various limitations. Researchers in [2]-[4] defined four design limitations associated with IoT cryptography primitives, especially in hardware implementations: memory consumption, implementation size, speed or throughput, and power or energy (Fig. 1).
One of the most widely used IoT cryptographic primitives is the cryptographic hash function [5]-[8]. The cryptographic hash function is a cryptographic primitive that plays an essential role in various cyber and information security applications. The cryptographic hash function maps an arbitrary-length input to a fixed-length output. This output is called the hash value, message digest, digest, or fingerprint. Cryptographic hash functions have been implemented in different cryptographic mechanisms, including data integrity [7]-[10], entity authentication [7], [8], digital signatures [5], [6], [11], [12], pseudorandom number generators [7], cryptographic key derivation [7], [12], key generation [12], password security, and blockchains [13]-[17]. The use of the hash function is crucial in digital signature applications. The hash value of the message is signed using the sender's private key. The security of a digital signature is highly dependent on the security of the cryptographic hash function. If an attacker finds two messages with the same hash value and convinces the other party to sign one of the messages, the attacker can obtain a valid digital signature for the other message. Similarly, in password security applications, if an attacker can construct a password that matches a stored hash value, the security of the system protected by the password may be at risk. Therefore, government, industry, and academia have attempted to design and analyze cryptographic hash functions. When designing a hash function, the designer must consider both security and performance factors. Some previous works have described the characteristics of a good cryptographic hash function by considering these two factors [18]-[21].
In 2012, the Keccak hash function [22] was selected as the secure hash standard (SHA-3) and was published in the Federal Information Processing Standard (FIPS) 202 [23] and NIST SP 800-185 [24]. However, the hash functions designed in the SHA-3 competition are intended for devices with standard specifications. The primitives are not designed for small computing devices with limited resources, such as embedded devices, RFID devices, and sensor networks. Lightweight cryptographic algorithms for devices with limited resources have been widely discussed in the literature. Some lightweight cryptographic algorithms include the lightweight block cipher, lightweight stream cipher, lightweight public key cryptosystem, lightweight cryptographic hash function (LWCHF), and lightweight message authentication code (MAC). This article focuses on lightweight cryptographic hash functions because of the vital role these algorithms play in devices with limited resources.
There has been considerable research on the design of LWCHF algorithms since the first one was developed in 2008. In addition, many attacks on LWCHF algorithms have been carried out. We aim to present the state of the art in LWCHF algorithms, including the design trends, cryptographic properties, and hardware and software implementation performance. Here, design trends refer to constructs that have been proposed in the literature, cryptographic properties refer to cryptanalytic attacks that have been carried out on each LWCHF algorithm, and the implementation performance summarizes data related to implementing hash function algorithms on hardware and software, along with the accompanying metrics. We obtain the implementation performance data from algorithm designers or implementations by other researchers.
The main contributions of this study can be summarized as follows:
• We surveyed state-of-the-art lightweight cryptographic hash functions up to early 2022. To the best of our knowledge, there have been no surveys on lightweight cryptographic hash functions developed until the final round of the NIST Lightweight Cryptography Project.
• We classify the design trends for lightweight cryptographic hash functions.
• We analyze and compare lightweight hash functions based on cryptographic properties (Table 4) and implementation aspects (Table 5).
• We analyze the challenges associated with designing and developing a lightweight cryptographic hash function.
• We identify potential gaps in future research, highlighting essential and practical considerations for developing lightweight cryptographic hash functions that require more attention.
VOLUME 10, 2022
This review should support academic and industry researchers in designing, analyzing, and implementing lightweight cryptographic hash functions. We hope our study's results can inspire researchers in future work aimed at designing and implementing LWCHFs.
The remainder of this paper is organized as follows. In Section II, we describe the methodology we used in this review. Section III highlights several surveys related to lightweight cryptographic hash functions. Section IV discusses the theoretical basis of cryptographic hash functions and their relation to lightweight cryptographic hash functions (LWCHFs). The lightweight cryptography performance metrics associated with hardware and software implementations are detailed in Section V. Section VI discusses the design trends of lightweight cryptographic hash functions. A comprehensive study of a state-of-the-art LWCHF is presented in Section VII. The results, discussion, research challenges and future directions are presented in Section VIII. Finally, we conclude our research in Section IX.

II. SURVEY METHODOLOGY
The approach we used to collect manuscripts for this survey is shown in Fig. 2. The scientific databases we searched for articles include IEEE Xplore, ACM Digital Library, Springer, ScienceDirect, and Google Scholar. The search focused on papers published between 2008 and the present day (2022). The search terms used to collect the manuscripts included several variations of "lightweight cryptographic hash function". Based on the search terms, we initially identified more than 500 papers. These papers were then filtered to fit the topic coverage based on their title, abstract, content, and conclusion.
This survey followed a semisystematic methodology [25] to narrow the literature into several stages. In Stage 1, an extensive search was used to analyze the literature on all proposed lightweight cryptographic hash functions to the best of our knowledge. In Stage 2 of our study, we conducted an in-depth examination to select literature based on the LWCHF design. The Stage 2 results were formalized as design trends and hardware and software performance comparisons in Stage 3. In Stage 4, we concluded our evaluation of the literature and discussed some challenges of this study and potential future work.

III. RELATED WORKS
Many surveys on research progress in IoT security have been published in recent years [18], [26]- [28]. Researchers have mainly focused on IoT security solutions. Security issues are presented as components of each survey and are treated as general concepts, and security and privacy are often considered together as one concept. Unfortunately, no previous survey has detailed deep-seated IoT security issues related to lightweight cryptographic hash functions (see Table 1).
Biryukov and Perrin [18] investigated lightweight cryptographic algorithms that had been developed prior to 2017, including block ciphers, stream ciphers, and hash functions, which were designed for use in academia, government, and industry. The authors discussed in detail the design of each algorithm. However, the authors do not provide a detailed explanation of the most recent lightweight cryptographic hash function.
Shah and Engineer [26] did not discuss the hash function algorithm, although it was mentioned in the introduction of their work. In addition, the authors did not discuss state-of-the-art algorithms. Dhanda et al. [27] discussed 54 lightweight cryptography (LWC) algorithms, including 21 lightweight block ciphers, 19 lightweight stream ciphers, 9 lightweight hash functions, and 5 elliptic curve cryptography (ECC) ciphers that had been developed prior to 2019. When discussing the Keccak algorithm, the authors mistakenly identified the algorithm's designers as [30], whereas the algorithm was actually designed by [31]. The authors also did not specify the state-of-the-art hash function algorithms identified in the previous survey.
Thakor et al. [29] classified the critical characteristics of LWC algorithms and compared 41 LWC encryption algorithms using seven performance metrics. The seven metrics are the block/key size, memory, gate area, latency, throughput, power & energy, and hardware & software efficiency. In a recent study, Rana et al. [28] discussed state-of-the-art lightweight cryptographic protocols for IoT networks and provided a comparative analysis of popular ciphers. The authors discussed three lightweight cryptography primitives: the block cipher, stream cipher, and elliptic curve cipher.

IV. OVERVIEW OF CRYPTOGRAPHIC HASH FUNCTIONS
Cryptographic hash functions are workhorses in cryptography, and these primitives are used in almost all cryptographic applications [32]. A cryptographic hash function is defined as follows (Definition 1):
Definition 1 [19]: Suppose x is the message input, and n is a positive integer. The hash function H is a function with at least the following properties:
1) Compression: H maps any input x of finite length to an output H(x) with length n, i.e., $H : \{0, 1\}^{*} \rightarrow \{0, 1\}^{n}$.
2) Easy computation: when the hash function H and input x are known, the hash value H(x) is easy to calculate.
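The compression property can be illustrated with a minimal sketch. It assumes Python's standard `hashlib` module and uses SHA3-256 purely as a readily available stand-in (it is not a lightweight design); the helper name `digest_bits` and the sample inputs are hypothetical.

```python
import hashlib

def digest_bits(message: bytes) -> int:
    """Return the output length, in bits, of SHA3-256 for the given message."""
    return len(hashlib.sha3_256(message).digest()) * 8

# Inputs of very different lengths (illustrative values only).
inputs = [b"", b"a", b"lightweight" * 1000]

# Every input, whatever its length, maps to an n = 256 bit output.
lengths = [digest_bits(m) for m in inputs]
```

Here `lengths` holds one entry per input, and each entry equals n regardless of how long the input is.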
Cryptographic hash functions can generally be classified into two categories [19]:

1) Modification detection codes (MDCs)
This category is also known as message integrity codes (MICs). MDCs calculate the hash value of an input message and determine its integrity by comparing that value with the hash value computed from the received message. The MDC is an unkeyed hash function with the properties specified in Definition 2. There are two subclasses of MDCs:
• One-way hash functions (OWHFs): it is computationally difficult to identify the message input according to the given hash value.
• Collision-resistant hash functions (CRHFs): it is difficult to identify any two inputs with the same hash value.
In this study, we focus on unkeyed hash functions.
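The integrity check performed with an MDC can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 from Python's `hashlib` stands in for the unkeyed hash, the trusted digest is assumed to reach the receiver over an authentic channel, and the function names and sample messages are hypothetical.

```python
import hashlib
import hmac

def mdc(message: bytes) -> bytes:
    """Unkeyed hash (MDC) of a message; SHA-256 is a stand-in here."""
    return hashlib.sha256(message).digest()

def verify_integrity(received: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest of the received message and compare it with the trusted one."""
    return hmac.compare_digest(mdc(received), expected_digest)

original = b"sensor reading: 21.5 C"
trusted = mdc(original)  # assumed to be delivered authentically

unchanged_ok = verify_integrity(original, trusted)
tampered_ok = verify_integrity(b"sensor reading: 91.5 C", trusted)
```

Because the function is unkeyed, the check only detects modification when the reference digest itself cannot be replaced by the attacker.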

2) Message authentication codes (MACs)
This category is also known as keyed hash functions. The MAC is a hash function with an additional parameter: a cryptographic key. The MAC algorithm aims to assure the integrity of the source and message without using other mechanisms. The secret key parameter allows this assurance.
Definition 2 [19]: An unkeyed hash function H with message inputs $x, x'$ and hash values $y, y'$ also has the following properties:
1) Preimage resistance (one-way): given the hash value y, it is computationally difficult to determine the input x such that $H(x) = y$.
2) Second-preimage resistance: given the input x, it is computationally difficult to determine another input $x' \neq x$ such that $H(x') = H(x)$. This property is also known as weak collision resistance.
3) Collision resistance: it is computationally difficult to find any two inputs $x \neq x'$ such that $H(x') = H(x)$. Another name for this property is strong collision resistance.
We denote the preimage, second preimage, and collision resistance as Pre, 2nd Pre and Coll. Illustrations of these three properties are shown in Fig. 3.
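To make the Coll property concrete, the following sketch runs a generic birthday search against a deliberately short hash. It is an assumption-laden toy: the 24-bit hash is obtained by truncating SHA-256, and the names `truncated_hash` and `find_collision` are hypothetical. For an n-bit output, such a search is expected to succeed after roughly $2^{n/2}$ evaluations, which is why short digests offer little collision resistance.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, n_bits: int = 24) -> int:
    """Toy n-bit hash: the top n_bits of SHA-256 (illustration only)."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)

def find_collision(n_bits: int = 24):
    """Birthday search: remember digests until two distinct inputs collide."""
    seen = {}
    for i in count():
        msg = i.to_bytes(8, "big")
        h = truncated_hash(msg, n_bits)
        if h in seen:
            # Two distinct inputs with the same digest, and the number of hashes used.
            return seen[h], msg, i + 1
        seen[h] = msg

x, x_prime, tries = find_collision(24)
```

The same loop applied to a full-length digest would not terminate in practice, which is the behaviour Definition 2 asks for.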

V. LIGHTWEIGHT CRYPTOGRAPHY PERFORMANCE METRICS
Researchers in several studies have defined performance metrics for software and hardware implementations. The designer must specify which metrics are suitable for a particular application. The choice of metric is crucial because it determines the design of the lightweight cryptographic algorithm. Fig. 1 depicts the IoT device implementation metrics used in the comparison in Subsection VIII-B.

A. SOFTWARE IMPLEMENTATION
The software implementation metrics are defined as follows:
1) Read-only memory (ROM) or code size [33], [34]: this metric relates to the fixed amount of data required to evaluate a function independently of its input. According to [34], this metric is the size of the cryptographic primitive/algorithm/mechanism code in bytes.
2) Random access memory (RAM) consumption [33], [34]: this metric corresponds to the amount of data written to memory during each function evaluation.
3) Energy [27], [35]-[38]: this metric corresponds to the power consumption during a certain period [34] and is measured in microjoules (µJ). Lower values are better for this metric. The mathematical equation for energy consumption is formulated as follows:
$$E_{\text{per bit}} = \frac{Lat \times P}{B}$$
where $E_{\text{per bit}}$ is the energy per bit, $Lat$ is the latency, $P$ is the power used by the hardware or software, and $B$ is the block size.
4) Throughput: this metric measures the average amount of data processed during each clock cycle.
5) Latency: this metric corresponds to the number of clock cycles needed to calculate a plaintext/ciphertext block.
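The energy metric can be evaluated with a small helper. This sketch assumes the relation energy per bit = latency × power / block size implied by the symbols defined above, with the latency expressed as a time in seconds so that the units work out; the function name and the sample figures are hypothetical and do not come from any measured implementation.

```python
import math

def energy_per_bit_uj(latency_s: float, power_uw: float, block_bits: int) -> float:
    """Energy per bit in microjoules: latency [s] * power [uW] / block size [bits]."""
    if block_bits <= 0:
        raise ValueError("block size must be positive")
    return latency_s * power_uw / block_bits

# Hypothetical figures: 1 ms per 64-bit block at an average power of 2 uW.
example = energy_per_bit_uj(latency_s=1e-3, power_uw=2.0, block_bits=64)
```

For these made-up figures the result is about 3.1e-5 µJ per bit; halving the latency or doubling the block size halves the energy per bit.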

B. HARDWARE IMPLEMENTATION
The following metrics are used to evaluate the hardware implementation efficiency:
1) Gate equivalent (GE) [2], [4], [18], [27], [34], [35]: this metric measures the memory consumption and implementation size. The GE is defined as the area occupied by the semiconductor [34]. Lower values are better for this metric. This metric measures how much physical area is required for a circuit that implements a primitive. Gong [4] noted that the physical area allocation in an LWC implementation should be less than 2000 GE. The metric can be defined with the following equation:
$$P_{area} = \frac{L_{area}}{A_{n}}$$
where $P_{area}$ is the physical area allocation, $L_{area}$ is the application layout area and $A_{n}$ is the area of the NAND2 gate.
2) Latency: this metric corresponds to the time a circuit takes to produce its output after the input is given [2], [18], [27], [34], [39]. The latency is measured in cycles/block or cycles/byte. Lower values are better for this metric. The latency can be defined as:
$$Lat = k \times t_{cycle}$$
where $Lat$ is the latency, $k$ is the number of clock cycles used to compute the output and $t_{cycle}$ is the time of one cycle.
3) Throughput [2], [27], [36], [40]: this metric is measured in bits or bytes per second and corresponds to the number of plaintexts processed per unit of time. Higher values are better for this metric. The throughput can be defined as:
$$T = \frac{B \times F}{N}$$
where $T$ is the throughput, $B$ is the block size, $F$ is the frequency and $N$ is the number of cycles per block.
4) Energy consumption: this metric is the same as the corresponding software metric.
5) Power consumption [18], [27], [28], [35]: this metric is measured in watts (W) or µW and quantifies the amount of power required to use the circuit. Lower values are preferred for this metric. The power can be calculated as:
$$P = \frac{E_{\text{per bit}} \times B}{Lat}$$
where $P$ is the power used by the hardware or software, $B$ is the block size, $Lat$ is the latency, and $E_{\text{per bit}}$ is the energy per bit.
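The hardware metrics can be computed with a few one-line helpers. This is a sketch under assumptions: the relations used are those implied by the symbol definitions above (area ratio for GE, cycles times cycle time for latency, block size times frequency over cycles for throughput, and energy per bit times block size over latency for power), and all function names and numeric values are hypothetical.

```python
import math

def gate_equivalents(layout_area_um2: float, nand2_area_um2: float) -> float:
    """Area in GE: layout area divided by the area of one NAND2 gate."""
    return layout_area_um2 / nand2_area_um2

def latency_s(cycles: int, t_cycle_s: float) -> float:
    """Latency as a time: number of clock cycles times the duration of one cycle."""
    return cycles * t_cycle_s

def throughput_bps(block_bits: int, freq_hz: float, cycles_per_block: int) -> float:
    """Throughput in bits per second: block size times frequency over cycles per block."""
    return block_bits * freq_hz / cycles_per_block

def power_uw(energy_per_bit_uj: float, block_bits: int, lat_s: float) -> float:
    """Average power in microwatts recovered from the energy per bit."""
    return energy_per_bit_uj * block_bits / lat_s

# Hypothetical design point: 64-bit block, 100 cycles per block, 100 kHz clock.
ge = gate_equivalents(layout_area_um2=1000.0, nand2_area_um2=0.5)
lat = latency_s(cycles=100, t_cycle_s=1e-5)
tput = throughput_bps(block_bits=64, freq_hz=100_000, cycles_per_block=100)
pwr = power_uw(energy_per_bit_uj=3.125e-5, block_bits=64, lat_s=lat)
```

With these made-up inputs the design point sits exactly at the 2000 GE figure mentioned by Gong [4], processes 64 kbit/s, and draws about 2 µW.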

VI. TRENDS IN LIGHTWEIGHT CRYPTOGRAPHIC HASH FUNCTION DESIGN
This section discusses LWCHF design trends for three popular constructions: Merkle-Damgård construction, sponge construction, and block cipher-based construction. Some algorithms [41]- [43] use a particular construction, such as Merkle-Damgård or sponge, as the main construction and other constructions (e.g., block cipher-based) as building blocks to develop compression functions or permutations. In addition, we identify the round functions used in the LWCHF scheme: the substitution permutation network (SPN), Feistel network, and addition-rotation-exclusive Or (XOR) (ARX) structure. Table 2 lists the LWCHF design trends.

A. MERKLE-DAMGÅRD CONSTRUCTION
As mentioned in the introduction, research on the cryptographic hash function began with two crucial papers that underlie the development of this theory: Ralph Merkle's paper [83] and Ivan Bjerre Damgård's paper [84]. Merkle and Damgård proposed a cryptographic hash function that utilizes a compression function, which is assumed to be collision resistant. This type of compression function can be extended to a hash function that is also collision resistant.

In this case, $h_i$ is the intermediate hash value and $H(m)$ is the hash value.
The MD construction is vulnerable to length extension attacks [85], [86]. To prevent these attacks, the message input length is appended to the end of the message input, together with the padding required to make the total length a multiple of k. This construction is known as Merkle-Damgård strengthening [19].
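The iteration and the strengthening step can be sketched in a few lines. The sketch assumes the usual textbook formulation, in which the chaining value starts from a fixed initial value and is updated block by block as $h_i = f(h_{i-1}, m_i)$. The compression function here is a stand-in built from truncated SHA-256, and the block length, state length, initial value, and function names are all hypothetical choices made only for illustration.

```python
import hashlib

BLOCK = 16          # block length k in bytes (hypothetical choice)
STATE = 16          # chaining value length in bytes (hypothetical choice)
IV = bytes(STATE)   # all-zero initial value (hypothetical choice)

def compress(h: bytes, block: bytes) -> bytes:
    """Stand-in compression function f(h, m_i): truncated SHA-256 of h || m_i."""
    return hashlib.sha256(h + block).digest()[:STATE]

def md_pad(message: bytes) -> bytes:
    """MD strengthening: a 0x80 marker, zero bytes, then the 64-bit message bit length."""
    bit_len = (len(message) * 8).to_bytes(8, "big")
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK)
    return padded + bit_len

def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the padded message blocks."""
    h = IV
    padded = md_pad(message)
    for i in range(0, len(padded), BLOCK):
        h = compress(h, padded[i:i + BLOCK])  # h_i = f(h_{i-1}, m_i)
    return h

digest = md_hash(b"lightweight hashing")
```

The final chaining value is returned as the digest; appending the length means that two messages of different lengths never share the same padded form.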

B. LWCHFs BASED ON BLOCK CIPHERS
The use of block ciphers as building blocks in hash function design [87] is almost as old as the Data Encryption Standard (DES) algorithm [88]. Suppose that E is a block cipher with an r-bit key k that maps an n-bit plaintext to an n-bit ciphertext. To the best of our knowledge, most researchers have used the Davies-Meyer (DM) construction [89] to design LWCHFs based on block ciphers. Fig. 5 depicts three well-known hash function constructions based on block ciphers: Davies-Meyer, Matyas-Meyer-Oseas, and Miyaguchi-Preneel [19], [45].
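A minimal sketch of the Davies-Meyer idea follows. It assumes the standard textbook form of the construction, in which each message block is used as the cipher key and the previous chaining value is fed forward, i.e. $H_i = E_{x_i}(H_{i-1}) \oplus H_{i-1}$. The 32-bit Feistel cipher, its round function, the padding, and the initial value are hypothetical toys with no security value; they only make the example self-contained.

```python
MASK16 = 0xFFFF

def _round(half: int, subkey: int) -> int:
    """Toy round function (not secure): add the subkey, rotate left by 5, XOR a constant."""
    x = (half + subkey) & MASK16
    x = ((x << 5) | (x >> 11)) & MASK16
    return x ^ 0x9E37

def toy_encrypt(key: int, block: int, rounds: int = 8) -> int:
    """Hypothetical 32-bit Feistel cipher E_key(block), for illustration only."""
    left, right = block >> 16, block & MASK16
    for r in range(rounds):
        subkey = (key >> (16 * (r % 2))) & MASK16  # alternate the two key halves
        left, right = right, left ^ _round(right, subkey ^ r)
    return (left << 16) | right

def davies_meyer(message: bytes, iv: int = 0x01234567) -> int:
    """H_i = E_{x_i}(H_{i-1}) XOR H_{i-1}, with each 4-byte block x_i used as the key."""
    padded = message + b"\x80"
    padded += bytes(-len(padded) % 4)  # zero-fill the last block
    h = iv
    for i in range(0, len(padded), 4):
        x_i = int.from_bytes(padded[i:i + 4], "big")
        h = toy_encrypt(x_i, h) ^ h    # feed-forward of the previous chaining value
    return h

digest = davies_meyer(b"abc")
```

The feed-forward XOR is what makes the step hard to invert even though the block cipher itself is invertible for a known key.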
The steps of the Davies-Meyer algorithm are as follows:
Input: bit string x.
Output: n-bit hash-code.
1) Input x is divided into k-bit blocks, where k is the key length, and padded, if necessary, to complete the last block. The padded message then consists of t k-bit blocks.
The sponge construction method has two stages: the absorbing and squeezing phases. Fig. 6 illustrates the sponge construction method. In this construction, the designer changes the function f by adding a new permutation or combining existing permutations.
Bertoni et al. [90] proposed the sponge construction method. This construction was further developed in 2011 [91]. Sponge construction is a method for constructing a hash function from a permutation without a publicly known key, which is referred to as P-sponge construction, or a random function, which is referred to as T-sponge construction [90]. In general, the steps in the sponge construction process can be described as follows: Pad the message M if necessary. Then, divide the padded message into blocks of length r bits. Initialize the internal state of b = r + c bits to all zeros, where r is the (bit) rate and c is the capacity. Obtain the hash value by absorbing the padded message and squeezing the internal state.
The absorbing phase includes the following steps:
1) Replace the first r bits of the internal state by XORing the previous r-bit values with the r-bit padded message block.
2) Replace the internal state with the output of the f function.
The above steps are repeated until all message blocks are processed.
The squeezing phase includes $|Z|/r$ steps, where $Z$ is the hash value and $|Z|$ is its length in bits. The steps are as follows:
1) Store the initial r bits of the internal state as an output block.
2) Replace the internal state with the output of the f function.
The hash value Z is generated by concatenating the r-bit blocks.
The padding algorithm is relatively simple. For example, the Keccak, or SHA-3, algorithm [23] uses multirate padding: a single bit 1 is appended to the message, followed by as few 0 bits as necessary and a final bit 1, so that the padded length is a multiple of r.
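The absorbing and squeezing phases can be put together in a short sketch. Several choices here are assumptions made only for illustration: the state is handled in bytes rather than bits, the rate and capacity values are arbitrary, SHA-256 serves as a stand-in for the function f (closer in spirit to a T-sponge than to a permutation-based design), and the padding is a byte-level analogue of a scheme that appends a 1 marker, zeros, and a final 1 marker.

```python
import hashlib

RATE = 8                  # r in bytes (hypothetical choice)
CAPACITY = 24             # c in bytes (hypothetical choice)
WIDTH = RATE + CAPACITY   # b = r + c

def f(state: bytes) -> bytes:
    """Stand-in for f: SHA-256 used as a fixed transformation on the 32-byte state."""
    return hashlib.sha256(state).digest()

def pad(message: bytes) -> bytes:
    """Byte-level padding: a leading 1 marker, zero fill, and a final 1 marker."""
    padded = bytearray(message) + b"\x01"
    padded += bytes(-len(padded) % RATE)
    padded[-1] |= 0x80
    return bytes(padded)

def sponge_hash(message: bytes, out_len: int = 16) -> bytes:
    state = bytes(WIDTH)  # all-zero initial state
    padded = pad(message)
    # Absorbing phase: XOR each block into the first r bytes, then apply f.
    for i in range(0, len(padded), RATE):
        block = padded[i:i + RATE]
        outer = bytes(s ^ m for s, m in zip(state[:RATE], block))
        state = f(outer + state[RATE:])
    # Squeezing phase: emit the first r bytes, apply f, and repeat as needed.
    out = b""
    while True:
        out += state[:RATE]
        if len(out) >= out_len:
            return out[:out_len]
        state = f(state)

digest = sponge_hash(b"abc")
```

Because the output is produced r bytes at a time, asking for a longer digest simply extends the shorter one, which is how sponge-based designs support several output lengths with one function f.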
Bertoni et al. [92] proved the security claim of sponge construction, which is known as the flat sponge claim. This claim states that an attacker can "distinguish" the sponge construction output from a random oracle with a probability of $N / 2^{c/2}$, where c is the capacity and N is the number of times the f function is called. A sponge structure with capacity c, rate r, and hash value of n bits can absorb messages of length $m < 2^{c/2}$. The resistance of sponge constructions to the attacks defined in Definition 2 is summarized in Table 3.

VII. LIGHTWEIGHT CRYPTOGRAPHIC HASH FUNCTIONS IN THE WILD
We identify 34 LWCHFs that have been used in academia and industry. As discussed in Section VI, the design focuses on the cost, performance, and security tradeoffs. Badel et al. [44] proposed ARMADILLO and ARMADILLO2 as general-purpose cryptographic function designs. ARMADILLO and ARMADILLO2 can be used with fixed-input length MACs for challenge-response protocols, hashing & digital signatures, and PRNG & PRF. The proposed hash function includes five variants according to the length of the hash value: 80 bits, 128 bits, 160 bits, 192 bits, and 256 bits.
Al-Odat et al. [52] proposed a family of lightweight cryptographic hash functions based on the Merkle-Damgård construction. The algorithm has five hash value variants: 160, 224, 256, 384, and 512 bits. Unfortunately, this algorithm uses a substitution box that is not explained in the article. Moreover, the authors do not provide data on all LWCHF performance metrics; data on the power consumption, number of clock cycles, speed, and memory consumption were provided, while other performance metrics were ignored. In addition, the designers do not provide cryptanalytic results such as differential and linear cryptanalysis.
El Hanouti et al. recently proposed a lightweight hash function based on the Merkle-Damgård construction with a Feistel-like structure and a chaotic one-dimensional map known as the skew-tent map [73]. To the best of our knowledge, this proposal is the first chaotic map-based LWCHF algorithm. Other chaotic map-based hash functions [93]-[97] are not recommended for highly constrained devices. The authors claim that the proposed hash function exhibits excellent performance (rapid implementation) and sufficient security properties. However, similar to the proposal of Al-Odat et al., not all performance metrics were considered in their study. Furthermore, the authors do not provide supporting results concerning the cryptographic properties.
The designers claim that Lesamnta-LW [41] is a secure, lightweight hash function with a hash length of 256 bits. The main design goal is to achieve small hardware/software implementations. The designers chose the MD construction and an AES-based design for the building blocks. A 4-branch generalized Feistel network (GFN) and AES components (SubBytes and MixColumn) are utilized in the hash function. The MixColumn operation uses the AES maximum distance separable (MDS) matrix multiplication defined over $GF(2^8)$.
TWISH [57] was designed based on the TWINE-128 [99] block cipher algorithm and uses the DM construction. TWISH is a single-block length hash function that accepts a 128-bit message input and returns a 64-bit hash value. The message input in the DM scheme acts as a key. The designer tested the security of the TWISH function by using the cryptographic randomness test proposed by [100].

C. SPONGE CONSTRUCTION
The first lightweight sponge construction-based hash function was QUARK [47]. This hash function was first proposed at CHES 2010. The version discussed in this section was updated in 2012. QUARK has been proposed as a lightweight hash function. The algorithm was inspired by the stream cipher Grain [101] and the block cipher KATAN [102]. Two nonlinear feedback shift registers (NFSRs) and a linear feedback shift register (LFSR) are used for the permutation.
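The difference between linear and nonlinear feedback shift registers can be shown with a generic sketch. It does not reproduce QUARK's registers: the 16-bit length, the tap positions (a commonly used maximal-length choice, taps 16, 14, 13, 11), the seed, and the extra AND term in the nonlinear variant are all assumptions picked for illustration.

```python
def lfsr16_step(state: int) -> int:
    """One step of a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def nfsr16_step(state: int) -> int:
    """Same register with one AND term added to the feedback (hypothetical), making it nonlinear."""
    linear = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    nonlinear = ((state >> 7) & (state >> 9)) & 1
    return (state >> 1) | ((linear ^ nonlinear) << 15)

def lfsr_period(seed: int = 0xACE1) -> int:
    """Count the steps until the LFSR returns to its seed."""
    state, steps = lfsr16_step(seed), 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps

period = lfsr_period()
next_nfsr_state = nfsr16_step(0xACE1)
```

The linear register cycles through every nonzero 16-bit state before repeating, while the nonlinear variant trades that easy-to-analyze behaviour for feedback that cannot be described by linear algebra alone.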
PHOTON was proposed by [50] and uses both sponge and AES-like constructions. PHOTON is a compact hash function that uses 1120 gate equivalents (GE) to achieve 64-bit security. When compared with similar algorithms, the speed of this algorithm is claimed to be competitive.
SPONGENT [53], [54] is a family of hash functions designed by Bogdanov et al. and presented at CHES 2011. SPONGENT was designed as a family of hash functions with hash values of 88 bits (intended only for preimage resistance), 128 bits, 160 bits, 224 bits, and 256 bits. The authors claim that the algorithm is resistant to attacks aimed at the hash function.
Another algorithm is SPN-Hash [51]. This algorithm uses another type of sponge construction: the JH construction [105]. The hash function was designed by Choy et al. The main purpose of the design is to provide provable security against differential collision attacks. The S-box used in the algorithm is that of the Advanced Encryption Standard (AES) [106].
The SipHash [49] algorithm has an ARX (addition, rotation & XOR) structure. This algorithm is intended for use in network traffic authentication applications and protected hash table lookups. SipHash was inspired by the BLAKE [107] and Skein [108] hash functions, which were both finalists in the SHA-3 competition.
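An ARX round mixes words using only modular addition, rotation, and XOR, and each of these steps is individually invertible. The sketch below follows the layout of the SipRound as it is commonly published; the rotation amounts are quoted from memory and should be checked against the specification in [49] before any real use. The function names and the inverse routine are hypothetical additions for illustration.

```python
MASK64 = (1 << 64) - 1

def rotl(x: int, r: int) -> int:
    """Rotate a 64-bit word left by r bits."""
    return ((x << r) | (x >> (64 - r))) & MASK64

def rotr(x: int, r: int) -> int:
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64

def arx_round(v0: int, v1: int, v2: int, v3: int):
    """One ARX mixing round on four 64-bit words (SipRound-style layout)."""
    v0 = (v0 + v1) & MASK64
    v1 = rotl(v1, 13) ^ v0
    v0 = rotl(v0, 32)
    v2 = (v2 + v3) & MASK64
    v3 = rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK64
    v3 = rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK64
    v1 = rotl(v1, 17) ^ v2
    v2 = rotl(v2, 32)
    return v0, v1, v2, v3

def arx_round_inverse(v0: int, v1: int, v2: int, v3: int):
    """Undo arx_round by reversing each step in the opposite order."""
    v2 = rotr(v2, 32)
    v1 = rotr(v1 ^ v2, 17)
    v2 = (v2 - v1) & MASK64
    v3 = rotr(v3 ^ v0, 21)
    v0 = (v0 - v3) & MASK64
    v3 = rotr(v3 ^ v2, 16)
    v2 = (v2 - v3) & MASK64
    v0 = rotr(v0, 32)
    v1 = rotr(v1 ^ v0, 13)
    v0 = (v0 - v1) & MASK64
    return v0, v1, v2, v3

state = (1, 2, 3, 4)
mixed = arx_round(*state)
```

Since the round is a permutation of the state, its strength comes from iterating it several times and from how the message and key words are injected between rounds, not from any table lookups.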
LHash [74], [75] is an LWCHF that was proposed by Wu et al. and supports three different message digest sizes: 80, 96, and 128 bits. The LWCHF provides preimage security, second preimage security between 64 and 120 bits, and collision security between 40 and 60 bits. LHash requires approximately 817 and 1028 GEs with serial implementations and 989 and 1200 GEs with 54 and 72 cycles per block in a faster implementation based on the T function. In addition, its energy consumption, evaluated according to the energy per bit, is remarkable. The LHash design uses the Feistel-PG structure in the internal permutation, which takes advantage of the permutation layer on the nibbles to increase the diffusion speed. The low-area implementation results from the hardware-friendly S-box and linear diffusion layer. The designers evaluated LHash's resistance to known attacks and confirmed that this LWCHF provides a good security margin.
Neeva-hash [76] is a sponge construction-based LWCHF with a message digest length of 224 bits. This algorithm uses 32 rounds to generate a hash value. The only nonlinear function in the Neeva-hash LWCHF utilizes a 4 × 4-bit PRESENT S-Box. State b has 256 bits, the rate is 32 bits, and the capacity is 224 bits. The round function uses the ARX structure.
Mukundan et al. proposed Hash-One [78], aiming at both simplicity and security. Hash-One uses a sponge construction with two nonlinear feedback shift registers (NFSRs) of 80 and 81 bits and supports a message digest size of 160 bits. The level of security expected by the designers is 160 bits for preimage resistance and 80 bits for collision resistance.
Gimli-Hash [56] is a derivative of the Gimli permutation function that was proposed by Bernstein et al. [55]. The authors claim that this permutation function can be used in various platforms, such as 64-bit Intel/AMD server CPUs, 64-bit and 32-bit ARM smartphone CPUs, 32-bit ARM microcontrollers, 8-bit AVR microcontrollers, FPGAs, ASICs with side-channel protection, and ASICs without side-channel protection.
sLiSCP-hash [43] was designed by AlTawy et al. from the University of Waterloo, Canada, in 2017. Simeck-based permutations for lightweight sponge cryptographic primitives (sLiSCP) are designed for integrated duplex sponge construction and provide minimal overhead for cryptographic functions in single-hardware designs. The sLiSCP design follows the four-subblock Type-2 Generalized Feistel-like Structure (GFS). The algorithm uses the unkeyed Simeck algorithm [109], [110] with round reduction as the round function. The algorithm can be used for two applications: hashing and authenticated encryption.
In the publication [42], AlTawy et al. reviewed the sLiSCP design and developed an sLiSCP-light permutation. This permutation is the building block of sLiSCP-light-hash. The GFS design was changed to a partial substitution-permutation network (P-SPN) construction, and the resulting sLiSCP permutation hardware area was approximately 16% smaller than the previous hardware area. This change also improved the permutation function's bit diffusion and algebraic properties.
This improvement reduced the number of steps and achieved better throughput in the hashing and authentication modes.
Zhang et al. presented LNHash [80], a lightweight hash function that uses linear and nonlinear cellular automata as internal permutations. The goal of this hash function is to achieve high diffusion and confusion. Six types of hash functions with different levels and capacities have been proposed.
The ACE-H-256 [60] is a hash function developed based on the ACE permutation that has 320 input and output bits. This hash function uses the 5-block generalized version of sLiSCP-light [42]. The ACE permutation uses the SIMECKbox (SB-64) as a nonlinear layer.
ASCON-HASH [61] is a member of the ASCON family of cryptographic algorithms proposed in the NIST Lightweight Cryptography competition. Previously, ASCON was a winner of the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) [111], which set out to identify a portfolio of authenticated encryption (AE) algorithms.
KNOT-Hash [62], [63] belongs to the hash function family proposed in the second round of the NIST LWC competition. The hash function defines three operations used in each round: $\text{AddRoundConstant}_b$, $\text{SubColumn}_b$, and $\text{ShiftRow}_b$. These operations are performed in different states and are defined according to the width parameter b in the sponge construction, i.e., 256 bits, 384 bits, and 512 bits. The KNOT permutation is similar to the 64-bit RECTANGLE block cipher [112], [113].
DryGascon-Hash [64] is a family of hash functions designed based on the DrySponge construction and the ASCON [114] algorithm. The DrySponge construction was developed based on the duplex sponge construction [115]. The designers of DryGascon claim that the security of Gascon permutations is similar to that of ASCON permutations [64].
The ORANGISH algorithm is a member of the ORANGE cryptographic primitive family proposed by Mridul Nandi and Bishwajit Chakraborty [65]. The permutation used in this algorithm is PHOTON 256 [50]. The designers used this permutation mainly because it is the lightest 256-bit permutation in the literature. The hash function is similar to that of JH [105]. JH was one of the five finalists in the SHA-3 competition organized by NIST [116].
The hash function ESCH [68] has two variants: ESCH256 and ESCH384. ESCH256 and ESCH384 accept inputs with arbitrary bit lengths and return hash values of 256 bits and 384 bits, respectively. The designer chose ESCH256 as the main proposal for the hash function. This algorithm was developed based on the SPARKLE permutation [67] family, with a rate of r and a capacity of c.
Subterranean [118] is a cryptographic primitive that was originally proposed in 1992 and has been used in hash functions and stream ciphers. A modification of the Subterranean round function is used in the Subterranean2.0-XOF (extendable output function) algorithm [69], [70]. This algorithm uses the Subterranean2.0 round function with an input of arbitrary bit length and an output of 256 bits. The designers claim that Subterranean2.0-XOF has 224-bit security.
XOODYAK [71] is a cryptographic primitive intended for use in hash functions, pseudorandom bit generators (PRBGs), authentication, encryption, and authenticated encryption (AE). Its hash mode is built on a permutation: XOODYAK uses the 384-bit permutation XOODOO [119], [120]. XOODOO is a family of permutations inspired by KECCAK-p [23], [91]. Similar to KECCAK-p, the XOODOO round function operates on a state organized into horizontal planes. The state has three planes, and each plane consists of four 32-bit lanes.
HVH [72] is an LWCHF designed by Huang et al. that was presented at the Security, Privacy, and Anonymity in Computation, Communication, and Storage (SpaCCS) 2020 International Workshops in Nanjing, China, 18-20 December 2020. HVH uses a sponge construction based on the lightweight block cipher VH [121], which was proposed by Dai et al. in 2015. VH has a block size of 64 bits and a key length of 80 bits. The HVH designers defined five different output lengths, 88 bits, 128 bits, 160 bits, 224 bits, and 256 bits, for use in different application scenarios. HVH follows the structure of the substitution-permutation network (SPN). The designers claim that the HVH hash function family strikes a delicate balance between hardware and software implementations and satisfies hardware usage requirements in extreme, resource-limited environments.
LNMNT Hash is a sponge-based hash function that was proposed by Nabeel et al. at the 2021 8th International Conference on Computer and Communication Engineering (ICCCE) [82]. LNMNT Hash is based on the new Mersenne number transform (NMNT). The designers provided a security analysis in [81] that covers randomness, confusion, diffusion, hash value distribution, and differential attacks. There are four classes of LNMNT hash functions: LNMNTHash80, LNMNTHash128, LNMNTHash160, and LNMNTHash224.
D. CELLULAR AUTOMATA
L-CAHASH [77] is a cellular-automaton-based LWCHF with two variants: 128-bit and 256-bit. The designers claim that linear cellular automata have good chaotic properties and support this claim with security analyses, statistical analyses, and software performance metrics of the hash function. The security analyses include the complexity, preimage and collision resistances, and avalanche criterion. For the statistical analysis, the authors used the Diehard test [122]. The software performance analysis compares L-CAHASH with GLUON, U-QUARK, D-QUARK, S-QUARK, and PHOTON.
LCAHASH1.1 [79] is an extension of L-CAHASH [77] that uses a hybrid cellular automaton with the rule set {30, 90}. Based on the cycles per byte (CPB) metric, the software performance of LCAHASH1.1 is better than that of L-CAHASH.
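To make the notion of a hybrid cellular automaton with rules 30 and 90 more concrete, the following minimal Python sketch evolves a one-dimensional binary automaton in which each cell follows its own elementary (Wolfram) rule. The alternating rule assignment, the periodic boundary, the 16-cell width, and all function names are illustrative assumptions; this is not the state layout or update schedule of L-CAHASH or LCAHASH1.1.

```python
# Sketch of a hybrid elementary cellular automaton (rules 30 and 90).
# Assumptions (not taken from L-CAHASH/LCAHASH1.1): periodic boundary,
# alternating rule assignment, 16-cell state.

def apply_rule(rule: int, left: int, center: int, right: int) -> int:
    """Return the next cell value by indexing the 8-bit Wolfram rule number."""
    index = (left << 2) | (center << 1) | right
    return (rule >> index) & 1


def hybrid_step(cells: list[int], rules: list[int]) -> list[int]:
    """Update every cell once; cell i uses rules[i] and wraps at the edges."""
    n = len(cells)
    return [
        apply_rule(rules[i], cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
        for i in range(n)
    ]


def evolve(cells: list[int], rules: list[int], steps: int) -> list[int]:
    """Apply hybrid_step the given number of times."""
    for _ in range(steps):
        cells = hybrid_step(cells, rules)
    return cells


if __name__ == "__main__":
    width = 16
    # Hypothetical assignment: even cells use rule 30, odd cells use rule 90.
    rules = [30 if i % 2 == 0 else 90 for i in range(width)]
    state = [0] * width
    state[width // 2] = 1  # single seed bit
    for _ in range(8):
        print("".join("#" if c else "." for c in state))
        state = hybrid_step(state, rules)
```

Rule 90 is linear (each new cell is the XOR of its two neighbors), whereas rule 30 is nonlinear, which is one reason such rule pairs are mixed in hybrid designs.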

A. LWCHF CRYPTOGRAPHIC PROPERTIES
When a designer proposes an LWCHF, the security properties are as critical as the implementation performance. Security is usually assessed through cryptanalysis, which some literature refers to as cryptanalytic attacks [33], [123]-[127]. Cryptanalytic attacks aim to find the weak points of cryptographic primitives. Attacks on cryptographic hash functions are similar to attacks on other cryptographic primitives. In particular, if the LWCHF building blocks use existing cryptographic primitives, such as block ciphers or stream ciphers, generic attacks on those primitives may also apply to LWCHFs.
Cryptanalysis has two important branches: mathematical cryptanalysis and implementation attacks. Mathematical cryptanalysis involves attacks on the mathematical structure of cryptographic primitives. Implementation attacks exploit side-channel information, such as the execution time, RAM/ROM usage, power, or energy consumption, to analyze cryptographic primitives. These types of attacks are also called side-channel attacks. One example of an implementation attack is differential fault analysis (DFA), which is commonly applied to cryptographic hash functions [128]-[131]. The principle of the attack is to inject errors or faults into the cryptographic implementation through unexpected environmental conditions to reveal its internal state. We identified several mathematical cryptanalysis techniques, including differential cryptanalysis, linear cryptanalysis, integral cryptanalysis, algebraic cryptanalysis, rebound attacks, zero-sum distinguishers, slide attacks, rotational distinguishers, cube attacks, meet-in-the-middle and miss-in-the-middle distinguishers, invariant subspace distinguishers, boomerang attacks, yoyo games, truncated differentials, and impossible differentials.
Differential cryptanalysis and its derivatives are the most commonly considered types of cryptanalysis. Biham and Shamir [125] first proposed this attack at CRYPTO 1990 to attack the Data Encryption Standard (DES). The technique was also described in detail in a book published by the same researchers [123]. Differential cryptanalysis is a common technique for analyzing symmetric cryptographic primitives, particularly block ciphers and hash functions.
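A basic tool of differential cryptanalysis is the difference distribution table (DDT) of an S-box, which counts how often each input difference leads to each output difference. The sketch below computes the DDT for a 4-bit S-box. The PRESENT S-box is used purely as a well-known example and is our choice, not one made in the works surveyed here; the function names are likewise illustrative.

```python
# Sketch: difference distribution table (DDT) of a 4-bit S-box.
# The S-box below is the PRESENT S-box, chosen only as a familiar example.

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def difference_distribution_table(sbox: list[int]) -> list[list[int]]:
    """ddt[dx][dy] = number of inputs x with S(x) XOR S(x XOR dx) == dy."""
    size = len(sbox)
    ddt = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            ddt[dx][dy] += 1
    return ddt


def differential_uniformity(ddt: list[list[int]]) -> int:
    """Largest DDT entry over all nonzero input differences."""
    return max(max(row) for row in ddt[1:])


if __name__ == "__main__":
    table = difference_distribution_table(PRESENT_SBOX)
    print("differential uniformity:", differential_uniformity(table))
    # The best single-S-box differential has probability uniformity / 16.
```

A smaller differential uniformity means that no single input difference propagates to a particular output difference with high probability, which is what designers aim for when selecting S-boxes.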
At EUROCRYPT 1993, Matsui [132] introduced linear cryptanalysis as a theoretical attack on DES and later carried out a practical attack on the same algorithm [133]. The basic idea of this attack is to approximate the algorithm's operation with a linear expression.
Integral cryptanalysis involves multiset attacks. Multiset attacks are a generic attack class that includes several attacks that appear in the literature under three different names: square attacks [134], saturation attacks [135], and integral cryptanalysis [136]. This type of attack was first discovered by Daemen, Knudsen, and Rijmen while analyzing the Square block cipher [134]. A similar attack known as a saturation attack was used by Lucks [135] against the block cipher Rijndael. Biryukov and Shamir showed attacks of the same type on three arbitrary SPN rounds. The integral attack by Knudsen and Wagner [136] on five rounds of MISTY [137] is in the same category. Since many hash function constructions use SPNs, this attack deserves careful consideration.
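The idea of approximating a nonlinear operation with a linear expression, as in linear cryptanalysis, can be illustrated at the S-box level with a linear approximation table (LAT). The sketch below counts, for every input mask a and output mask b, how often the parity of the masked input equals the parity of the masked output, and stores the deviation from the value expected for a random function. The PRESENT S-box and all names are illustrative assumptions, not taken from the surveyed designs.

```python
# Sketch: linear approximation table (LAT) of a 4-bit S-box.
# The S-box below is the PRESENT S-box, chosen only as a familiar example.

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def parity(value: int) -> int:
    """XOR of all bits of value (0 or 1)."""
    return bin(value).count("1") & 1


def linear_approximation_table(sbox: list[int]) -> list[list[int]]:
    """lat[a][b] = #{x : a.x == b.S(x)} - len(sbox)/2 (dot products over GF(2))."""
    size = len(sbox)
    lat = [[0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            agree = sum(
                1 for x in range(size)
                if parity(a & x) == parity(b & sbox[x])
            )
            lat[a][b] = agree - size // 2
    return lat


def max_abs_bias_count(lat: list[list[int]]) -> int:
    """Largest |lat[a][b]| excluding the trivial entry a = b = 0."""
    return max(
        abs(lat[a][b])
        for a in range(len(lat))
        for b in range(len(lat))
        if (a, b) != (0, 0)
    )


if __name__ == "__main__":
    table = linear_approximation_table(PRESENT_SBOX)
    print("max |LAT entry|:", max_abs_bias_count(table))
```

An entry far from zero indicates a linear expression that holds (or fails) unusually often, which an attacker can chain across rounds; designers therefore prefer S-boxes whose largest nontrivial entry is small.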
The main idea of algebraic cryptanalysis is to express a cryptographic hash function as a system of nonlinear equations involving the message input and hash value output. The nonlinear equations take the form of polynomial equations. One advantage of algebraic cryptanalysis is its wide applicability, as a set of polynomial equations can be used to describe any cryptographic primitive.
Table 4 provides a detailed comparison of the LWCHF algorithms from the perspective of various cryptographic properties. We define the rate as the size of the message block processed during each round and denote the preimage, second preimage, and collision resistance as Pre, 2nd Pre, and Coll. Table 4 shows that almost all the identified LWCHF algorithms have been evaluated by cryptanalysis; the table summarizes third-party cryptanalysis. Although this cryptanalysis cannot be used as a benchmark, the algorithms that were affected most by the attacks can be classified as weak and need special attention when implemented. Furthermore, it is necessary to determine whether the attacks apply to the full or reduced-round versions.
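As a small illustration of the algebraic view described in this subsection, the following sketch derives the polynomial (algebraic normal form, ANF) of each output bit of a 4-bit S-box over GF(2) using the binary Moebius transform. The PRESENT S-box, the helper names, and the variable naming x0..x3 (x0 is the least significant input bit) are assumptions made for the example only.

```python
# Sketch: algebraic normal form (ANF) of S-box output bits via the
# binary Moebius transform. The S-box is the PRESENT S-box, used only
# as a familiar example.

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def coordinate_truth_table(sbox: list[int], bit: int) -> list[int]:
    """Truth table of one output bit of the S-box."""
    return [(y >> bit) & 1 for y in sbox]


def moebius_transform(truth_table: list[int]) -> list[int]:
    """coeffs[m] = 1 iff the monomial with variable set m appears in the ANF."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def evaluate_anf(coeffs: list[int], x: int) -> int:
    """Evaluate the polynomial at x: XOR of all monomials contained in x."""
    value = 0
    for m, c in enumerate(coeffs):
        if c and (m & x) == m:
            value ^= 1
    return value


def algebraic_degree(coeffs: list[int]) -> int:
    """Number of variables in the largest monomial present."""
    return max((bin(m).count("1") for m, c in enumerate(coeffs) if c), default=0)


def format_anf(coeffs: list[int]) -> str:
    """Human-readable polynomial, e.g. '1 + x0 + x1*x2'."""
    terms = []
    for m, c in enumerate(coeffs):
        if c:
            names = [f"x{i}" for i in range(len(coeffs).bit_length() - 1) if (m >> i) & 1]
            terms.append("*".join(names) if names else "1")
    return " + ".join(terms) if terms else "0"


if __name__ == "__main__":
    for bit in range(4):
        anf = moebius_transform(coordinate_truth_table(PRESENT_SBOX, bit))
        print(f"y{bit} = {format_anf(anf)}  (degree {algebraic_degree(anf)})")
```

Writing every S-box output bit this way, and repeating the process across rounds, yields the kind of polynomial system that algebraic attacks attempt to solve; a higher algebraic degree generally makes that system harder to handle.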
A lightweight hash function for a particular application must be chosen with its cryptographic properties in mind. For example, NIST [39] requires that the hash value length for current usage be 256 bits and that a cryptanalytic attack require at least $2^{112}$ computations. Therefore, users should not use hash functions with hash values of less than 256 bits for applications requiring high security levels. Such hash functions include ARMADILLO and ARMADILLO2 (80, 128, 160, and 192), C-PRESENT, TWISH, Quark (136, 176), SPN-Hash, and Hash-One. In terms of standardization, PHOTON, SPONGENT, and Lesamnta-LW were selected as lightweight hash function standards in ISO/IEC 29192-5 [138]. PHOTON and SPONGENT represent algorithms optimized for hardware, while Lesamnta-LW represents an algorithm optimized for software.
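The relationship between the hash length, the sponge capacity, and a target such as $2^{112}$ computations can be checked with a short calculation. The sketch below uses the commonly quoted generic bounds: a birthday bound of n/2 bits for collisions and n bits for (second) preimages of an n-bit digest, each additionally capped at c/2 bits when the function is a sponge with capacity c. These are generic estimates under idealized assumptions; individual designs may claim different levels (for example, higher preimage resistance), and the function names and example parameters are hypothetical.

```python
# Sketch: generic security levels (in bits) for an n-bit hash, optionally
# built as a sponge with capacity c. Idealized bounds only; the claims of
# specific designs may differ.
from typing import Optional


def generic_security_bits(digest_bits: int,
                          capacity_bits: Optional[int] = None) -> dict:
    """Return generic Coll / Pre / 2nd Pre levels as log2 of the attack cost."""
    levels = {
        "Coll": digest_bits / 2,      # birthday bound
        "Pre": float(digest_bits),
        "2nd Pre": float(digest_bits),
    }
    if capacity_bits is not None:
        cap = capacity_bits / 2       # generic sponge bound of c/2 bits
        levels = {name: min(bits, cap) for name, bits in levels.items()}
    return levels


def meets_threshold(levels: dict, threshold_bits: float = 112) -> bool:
    """True if every listed property costs at least 2**threshold_bits."""
    return min(levels.values()) >= threshold_bits


if __name__ == "__main__":
    # Illustrative parameter sets (digest bits, capacity bits).
    for n, c in [(256, 256), (160, 160), (128, 256)]:
        levels = generic_security_bits(n, c)
        print(n, c, levels, "meets 2^112:", meets_threshold(levels))
```

Under these bounds, a 160-bit digest offers at most 80 bits of collision resistance regardless of the capacity, which is why digests shorter than 256 bits are discouraged for applications requiring high security levels.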

B. PERFORMANCE COMPARISON
In addition to studies carried out by the designers, several studies have attempted to compare the performance of LWCHF algorithms [139]-[142]. These efforts have provided essential information about the performance of LWCHFs, but their shortcomings are important to note. The results may not provide a complete picture of an algorithm's potential for a given metric. In addition, the implementation assumptions or goals of various LWCHFs differ, and some proposals have more varied implementations than others. Thus, the results do not indicate a ranking; rather, they serve as a general recommendation. Due to the differences in the metrics that designers use for various hardware and software implementations, as well as differences in the devices themselves, fair comparisons are almost impossible. Table 5 shows that the performance metrics for many software implementations are not available, because designers report different metrics.

1) HARDWARE
Figs. 8, 9, 10, and 11 illustrate the hardware implementation performance according to each metric. We summarize the hardware implementations for each type of technology in terms of the hardware area (GE), throughput, power, and latency. The performance of 40 nm technology is marked [...] is 79 GE, which is less than the number of GEs in the LHash-80 implementation in serialized and long message modes.
The lowest latency was achieved by Neeva-Hash, ARMADILLO2-A-80, GLUON-160, and GLUON-224, while the highest latency was observed for SPN-Hash-256, SPN-Hash-128, ARMADILLO2-E-256, and ARMADILLO2-D-160. Fig. 12 summarizes the best hardware performance based on the technology used. Two hash function algorithms occupy the first and second positions, namely, the sLiSCP-light-hash-160 and sLiSCP-hash-160 algorithms. Both algorithms obtain good ratings for all metrics and are included in the charts showing the best metrics. Serial implementations of several algorithms, such as PHOTON, LHash, Hash-One, and SPONGENT, were shown to use small hardware areas between 800 GE and 1200 GE. In addition to serial implementations, designers used the bit-slice technique to reduce the hardware area and design complexity. Some algorithms that use this technique are ACE-H-256, ASCON-HASH, KNOT, PHOTON-Beetle-Hash, SipHash, and sLiSCP-light-hash.

2) SOFTWARE
Because the software implementations of LWCHF algorithms are more varied than the hardware implementations, many algorithms have empty metric values. This situation reflects the fact that algorithm designers use different hardware, software, and metrics.

C. RESEARCH CHALLENGES AND FUTURE DIRECTIONS
Designing lightweight cryptography primitives is a challenging task. The designer must balance the security, performance, and cost when implementing the algorithms in either hardware or software. We identified several issues and challenges that should be considered in future research.

1) LWCHF DESIGN AND IMPLEMENTATION
The cryptographic implementations investigated in this study demonstrate the overall performance of various LWCHF designs. However, the results are distorted by their dependence on tools and technology, which leads to significant deviations between studies. Therefore, it is crucial to develop other solutions, such as proposing a novel hash function and comparing it with existing hash functions. This new paradigm may increase the quality and quantity of research on lightweight cryptographic hash functions. In particular, lightweight permutation designs with reasonable diffusion rates and resistance to differential cryptanalysis, linear cryptanalysis, and other attacks should be researched. This research opportunity arises from the various permutations that have been designed for use in multiple cryptographic primitives, such as AEAD, hash functions, PRNG, and KDF. Examples of such permutations include ACE [60], sLiSCP [43], sLiSCP-light [42], XOODOO [119], Sparkle [67], [68], Alzette [143], and Subterranean2.0 [69], [70].

2) SUBSTITUTION BOX DESIGN
An alternative s-box should be developed that has a smaller hardware implementation area than existing s-boxes while retaining similar cryptographic properties. One such existing s-box is the Simeck s-box used in the permutations of the sLiSCP, sLiSCP-light, ACE-H-256, sLiSCP-hash, and sLiSCP-light-hash algorithms.

3) OPTIMAL ROUND FUNCTION DESIGN
An optimal round function based on a substitution-permutation network (SPN), a Feistel network, an addition, rotation, and XOR (ARX) structure, or another approach should be designed.

4) SECURITY METRICS STANDARDIZATION
The metrics for evaluating the security performance and the hardware and software implementations vary widely. As mentioned in the previous discussion, this variation makes fair comparisons of different algorithm implementations almost impossible. Therefore, standard hardware and software security and performance metrics should be developed to analyze LWCHF security and implementations on devices with limited resources. Several attempts to develop such metrics have been made, including by NIST (USA), CRYPTREC (Japan), and ECRYPT (Europe).

5) NOVEL CRYPTANALYTIC ATTACKS
New cryptanalytic approaches for analyzing the proposed permutations or hash function algorithms should be researched, particularly differential cryptanalysis, linear cryptanalysis, and attacks on secure hash function properties, including the preimage, second preimage, and collision resistance.

IX. CONCLUSION
Lightweight cryptographic hash functions have played a crucial role in the development of the IoT. This paper presents recent developments and state-of-the-art implementations of lightweight cryptographic hash functions. The hardware and software implementations of LWCHFs were examined based on nine metrics. In addition, the security, cost, and performance properties of different proposals were considered. Furthermore, a comparative analysis was presented, with the information summarized in the corresponding tables. A large number of studies have been conducted as the field has developed, with new algorithms and cryptanalytic attacks proposed in published works. We hope that the review presented in this study provides a clear picture of LWCHFs so that other researchers can use it as a starting point and reference in designing robust and secure LWCHFs.

SURYADI SURYADI received the B.S. degree in mathematics from the Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Indonesia, in 1990, the master's degree in informatics engineering from the Institute Technology Bandung, Indonesia, in 1998, and the Ph.D. degree from the Department of Electrical and Computer Engineering, Universitas Indonesia, in 2013. He has been a Lecturer (an Associate Professor) with the Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, and the Department of Electrical Engineering, Universitas Indonesia. He has authored or coauthored over 30 papers in leading international journals and conferences, and has written two books and contributed to one book chapter. His research interests include information security, cryptography, and computational mathematics.

KALAMULLAH RAMLI (Member, IEEE) received the master's degree in telecommunication engineering from the University of Wollongong, Wollongong, NSW, Australia, in 1997, and the Ph.D. degree in computer networks from the Universitaet Duisburg-Essen (UDE), North Rhine-Westphalia, Germany, in 2003. He has been a Lecturer at Universitas Indonesia (UI), since 1994, and a Professor of computer engineering, since 2009. He currently teaches advanced communication networks, embedded systems, object-oriented programming, and engineering and entrepreneurship. He is a prolific author, with more than 125 journal/conference papers and eight books/book chapters published. His research interests include embedded systems, information and data security, computers and communication, and biomedical engineering.

BERNARDI PRANGGONO (Senior Member, IEEE) received the B.Eng. degree in electronics and telecommunication engineering from Waseda University, Japan, the M.DigComms. degree in digital communications from Monash University, Australia, and the Ph.D. degree in electronics and electrical engineering from the University of Leeds, U.K. He has previously held academic and research positions at Glasgow Caledonian University, Queen's University Belfast, and the University of Leeds. He has held industrial positions at Oracle, PricewaterhouseCoopers, Accenture, and Telstra. He is currently a Senior Lecturer with the Department of Engineering and Mathematics, Sheffield Hallam University. His current research interests include cybersecurity, the Internet of Things, cloud computing, and green ICT. He is a fellow of the Higher Education Academy (HEA). He is an Associate Editor of