Secure and Compact: A New Variant of McEliece Cryptosystem

This paper introduces a variant of the McEliece cryptosystem and employs the <inline-formula> <tex-math notation="LaTeX">$(C_{1}, C_{1} + C_{2})$ </tex-math></inline-formula>-construction to generate a new code from two arbitrary linear codes. We propose an efficient hard-decision decoding algorithm for linear codes derived from the <inline-formula> <tex-math notation="LaTeX">$(C_{1}, C_{1} + C_{2})$ </tex-math></inline-formula>-construction and integrate them into the McEliece framework. The security of the cryptosystem varies based on the specific codes used in the <inline-formula> <tex-math notation="LaTeX">$(C_{1}, C_{1} + C_{2})$ </tex-math></inline-formula>-construction. Our proposed variant achieves a good level of security with approximately the same key size compared to one of the classic McEliece candidates of the National Institute of Standards and Technology (NIST) standardization process. Specifically, we demonstrate a 25% key size reduction for our proposed parameters compared to one of the 256-bit secured classic McEliece parameters.


I. INTRODUCTION
Code-based cryptography is expected to become a highly efficient method of encryption once quantum computers become more prevalent. In 1976, Diffie and Hellman introduced public-key cryptography [2], and code-based cryptography is one of its forms. The Rivest, Shamir, and Adleman (RSA), Elliptic Curve (ECC), and Hyperelliptic Curve (HCC) cryptosystems are the most commonly used public-key encryption systems for secure data transmission. RSA is based on the integer factorization problem, while ECC and HCC are based on the discrete logarithm problem.
Quantum algorithms for integer factoring and discrete logarithms were pioneered by Shor in 1994 [3]. This seminal discovery raised a red flag for the security of numerous cryptosystems, such as RSA [4], ECC [5], and HCC [6]: a sufficiently large quantum computer will break all schemes based on these problems, such as [7] and [8]. Consequently, there is a heightened focus on crafting efficient code-based cryptosystems that can withstand quantum attacks. Recognizing the urgency of the situation, NIST is actively engaged in the standardization of quantum-resistant public-key cryptographic algorithms, with code-based cryptosystems emerging as one of the most promising alternatives.
The associate editor coordinating the review of this manuscript and approving it for publication was Mahdi Zareei .
NIST launched the PQC initiative in 2016 and has been working to evaluate and standardize candidate algorithms through a public competition. The goal is to provide a set of standardized post-quantum cryptographic algorithms to secure sensitive information in the post-quantum era. This initiative has generated significant interest from researchers and practitioners in the field of cryptography, as it represents a significant effort to prepare for future security challenges. Out of 69 proposals, 19 are based on coding theory. The fourth round of NIST's standardization procedure [9] comprises three code-based cryptographic systems: the classic McEliece [1], which relies on binary Goppa codes; BIKE [10], which is grounded in quasi-cyclic MDPC codes; and HQC [11], which utilizes the quasi-cyclic approach and does not rely on the algebraic structure of the error-correcting codes.
Several code-based cryptographic submissions have been assessed in the context of NIST's post-quantum cryptography evaluations. Among these, BIG QUAKE [12] and Edon-K [13] rely on Goppa codes but exhibit relatively large key sizes and faced security issues. Moreover, DAGS [14], initially considered secure with dyadic generalized Srivastava codes, has since been compromised. LAKE [15] used a combination of techniques like Ideal codes (IC), Double circulant (DC), and Low-Rank Parity-Check (LRPC) codes but has been merged into the ROLLO [16] project for further development. LEDAkem [17] and LEDApkc [18], both using QC-LDPC codes, have also been merged under LEDAcrypt [18]. Lepton [19], using BCH codes, faced cryptanalysis. LOCKER [20], akin to LAKE in its techniques, was also merged into the ROLLO project. McNie [21] employed QC-LRPC codes but was found to be insecure. Ouroboros-R [22], employing QC and LRPC codes, has also been merged into the ROLLO project. QC-MDPC KEM [23] used QC-MDPC codes but is not considered secure. Ramstake [24] utilized Reed-Solomon codes but has been broken, and RLCE-KEM [25], using GRS codes, also faced security issues. These submissions collectively highlight the ongoing efforts and challenges in developing secure post-quantum cryptographic algorithms. While some have faced vulnerabilities and security breaches, they contribute to our understanding of the strengths and weaknesses of code-based cryptography, driving progress toward improved security solutions.
McEliece presented the first code-based public key cryptosystem [26] in 1978, which has withstood all known attacks over the last four decades. The McEliece cryptosystem relies on two key elements: the hardness of decoding random linear codes, and a public key matrix $G'$ that is indistinguishable from a random $k \times n$ matrix, which ensures the security of the public key generation process. However, the large key size that results from McEliece's recommendation of binary Goppa codes for encryption has led to exploring different families of linear codes to reduce the key size. Moreover, Niederreiter [27] introduced the dual McEliece cryptosystem over Generalized Reed-Solomon codes. Linear codes such as Reed-Solomon [27], Reed-Muller [28], Algebraic Geometry codes [29], LDPC codes [30], [31], Polar codes [32], QC-LDPC [33], and MDPC [34] codes have been used in place of binary Goppa codes in the McEliece cryptosystem to minimize the key size and increase the security. Most of them have been broken [35], [36], [37], [38], [39], [40], [41], [42] or partially broken [43], except the original McEliece cryptosystem.

A. CRYPTOSYSTEMS BASED ON THE $(C_1, C_1 + C_2)$ CONSTRUCTION
There have been several cryptosystems that are based on the $(C_1, C_1 + C_2)$ construction of codes. Márquez-Corbella and Tillich [44] proposed a cryptosystem that used Reed-Solomon codes for $C_1$ and $C_2$. They provided a soft-decision decoding algorithm for the $(C_1, C_1 + C_2)$ code. The decoding algorithm first employs a soft-decision decoding algorithm for $(0, C_2)$, followed by a soft-decision decoding algorithm for the $(C_1, C_1)$-structured code. The authors demonstrated that this cryptosystem resists the Sidelnikov and Shestakov attack [35].
In 2017-18, two signature schemes were proposed that utilized the $(C_1, C_1 + C_2)$ type of code. One scheme, called SURF [45], is based on the binary field, while the other, known as WAVE [46], is based on the ternary field. The SURF signature scheme employs non-error-correcting source distortion codes and utilizes a Prange decoding algorithm for document signing, posing a challenge to code-based cryptography. However, this scheme only works when the dimension $k_1$ of the first code $C_1$ is greater than or equal to the dimension $k_2$ of the second code $C_2$, rendering it vulnerable to forgery attacks. The authors of the SURF signature scheme demonstrated that the dimension of the hull of a random linear code is distinct from the dimension of the hull of a $(C_1, C_1 + C_2)$ code when $k_1 \ge k_2$. They have also shown that this distinction is ineffective when $k_1 < k_2$, but their scheme fails in this regime because it is specifically designed for $k_1 \ge k_2$. The authors discussed the NP-completeness of the distinguishing problem, revealing that determining the permutation matrix from the permuted $(C_1, C_1 + C_2)$ type of linear code is an NP-complete problem. On the other hand, the WAVE scheme is secure as it employs a ternary field and eliminates all the conditions that render SURF insecure.
Another cryptosystem based on the $(C_1, C_1 + C_2)$ construction operates on interleaved low-rank parity-check (ILRPC) codes and $\lambda$-Gabidulin codes [47]. This cryptosystem shares the same concept as Márquez-Corbella and Tillich's work, with its security relying entirely on the ILRPC rank metric codes. Our literature review also reveals several proposed schemes that have utilized the $(C_1, C_1 + C_2)$ code construction, differing primarily in the decoding algorithm and security analysis. Our cryptosystem likewise adapts the McEliece framework to the $(C_1, C_1 + C_2)$ code. Consequently, in this study, we discuss the hard-decision decoding algorithm for the $(C_1, C_1 + C_2)$ code and all possible attacks on the proposed work. We take the decoding algorithm of [48] and adapt it for the case when the minimum Hamming distances of the codes satisfy $2d_1 \le d_2$. Additionally, we extend this decoding algorithm to handle the scenario where $d_2 < 2d_1$. As a result, the error-correcting capability of the $(C_1, C_1 + C_2)$ code is greater than or equal to that of the codes $C_1$ and $C_2$. When $2d_1 \le d_2$, the error-correcting capability of the $(C_1, C_1 + C_2)$ code is greater than that of the code $C_1$. In the second case, the error-correcting capability of $C$ matches the error-correcting capability of the code $C_2$. Since we use the complexity of the ISD algorithm for finding the error vector as our security measure, the parameter $t$ is crucial in determining the security level of the McEliece cryptosystem.

B. OUR CONTRIBUTION
The encryption algorithm remains the same as in McEliece's cryptosystem. However, our proposed cryptosystem introduces a change in the decryption algorithm as required. During decryption, we post-multiply the ciphertext by the inverse of one of the secret keys, specifically a permutation matrix, and use the proposed decoding algorithm of Section III.
In our construction, we utilized two specific examples: a.) For the first example, we employed GRS codes for both $C_1$ and $C_2$. b.) In the second example, we used $C_1$ as a binary Goppa code and $C_2$ as a GRS code. We introduced a decoding algorithm for the $(C_1, C_1 + C_2)$ code, and we have applied this algorithm to decipher the plaintext. Specific factors influenced our choice of examples: a.) We opted for the binary Goppa code due to its enhanced security profile compared to other linear codes in code-based cryptography. b.) The GRS code, being a Maximum Distance Separable (MDS) code, offers excellent error-correcting capabilities. However, it is acknowledged that the GRS code is susceptible to attacks in the McEliece variant. Therefore, we strategically incorporated it in the $(C_1, C_1 + C_2)$ construction to obscure its structure and mitigate potential vulnerabilities, as discussed in the security section.
This characteristic helps in enhancing the security level of the cryptosystem, which is computed against the Stern [49] and generalized Stern ISD algorithms [50]. Our proposed variant of the McEliece cryptosystem offers significant advantages over one parameter set of the classic McEliece cryptosystem described in the fourth-round NIST submission. A key feature of our cryptosystem is a 25% reduction in key size compared to classic McEliece [1], ensuring a more cost-effective alternative for implementation while maintaining the same level of security.
We formally prove that adding a randomly generated bit-string to the plaintext ensures semantic security against chosen plaintext attacks (IND-CPA) in our proposed $(C_1, C_1 + C_2)$-based McEliece cryptosystem framework. This proof relies on standard assumptions about the difficulty of general decoding and the indistinguishability problem associated with public keys. The primary objective of code-based cryptography is to ensure security even in the presence of reliable quantum computers. Our proposed system addresses this concern by relying on a coding theory problem, specifically the decoding of random linear codes, which remains unbroken on quantum computers. Since ISD based on Grover's quantum search algorithm [51] only achieves a square-root speedup over the ISD algorithms that run on classical computers, breaking the proposed system remains computationally infeasible on quantum computers as well.
Paper organization. Section II provides an overview of the preliminary concepts. In Section III, we propose a decoding algorithm based on the $(C_1, C_1 + C_2)$-construction. We provide a variant of the McEliece cryptosystem in Section IV. Section V addresses the attacks (structural and non-structural), key size, and examples. Computational complexity is discussed in Section VI. IND-CPA security is discussed in Section VII. Finally, we discuss the future prospects of the paper in Section VIII.

II. PRELIMINARIES
Consider a finite field $\mathbb{F}_q$, where $q$ represents the size of the field and is a prime power. Let $n$ be a positive integer; $\mathbb{F}_q^n$ is the vector space of $n$-tuples, of dimension $n$ over $\mathbb{F}_q$. A vector in $\mathbb{F}_q^n$ is represented as a row vector. Row vectors are written in bold, $\mathbf{x}$, and scalars in plain type, $x$.
Definition 1: A subspace $C$ of the vector space $\mathbb{F}_q^n$ over a finite field $\mathbb{F}_q$ is called a linear code. The notation $[n, k]$ represents the length $n$ and dimension $k$ of the linear code $C$.
Definition 2 (Inner product): For any $u$ and $v$ in $\mathbb{F}_q^n$, the inner product on the vector space $\mathbb{F}_q^n$ is $\langle u, v \rangle = \sum_{i=1}^{n} u_i v_i$.
Definition 3: The dual of a linear code $C$ is defined as $C^{\perp} = \{ x \in \mathbb{F}_q^n : \langle x, c \rangle = 0 \text{ for all } c \in C \}$.
Definition 4 ($(C_1, C_1 + C_2)$-construction): Let $C_1$ and $C_2$ be $[n, k_1, d_1]$ and $[n, k_2, d_2]$ linear codes, respectively, defined over the same finite field. The code $C = \{ (c_1, c_1 + c_2) : c_1 \in C_1, c_2 \in C_2 \}$ has generator matrix
$$G = \begin{pmatrix} G_1 & G_1 \\ [0] & G_2 \end{pmatrix},$$
where $G_1$ and $G_2$ are generator matrices of $C_1$ and $C_2$, respectively, and $[0]$ represents the zero matrix of order $k_2 \times n$.
Essential to the security of code-based cryptosystems are several NP-complete problems in coding theory. The two most discussed and most studied problems are the following.
Definition 5 (General decoding problem): Given a $k \times n$ generator matrix $G$ over $\mathbb{F}_q$, $t \in \mathbb{N}$, and a vector $r \in \mathbb{F}_q^n$, the problem is to determine vectors $m \in \mathbb{F}_q^k$ and $e \in \mathbb{F}_q^n$ with $\mathrm{wt}(e) = t$ such that $e = r - mG$.
Definition 6 (Syndrome decoding problem): Given an element $s \in \mathbb{F}_q^{n-k}$, a positive integer $t$, and a parity-check matrix $H$ of size $(n-k) \times n$, the problem is to determine a vector $e \in \mathbb{F}_q^n$ with $\mathrm{wt}(e) = t$ that satisfies the equation $H e^T = s^T$.
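As an illustration of the $(C_1, C_1 + C_2)$-construction, the following minimal Python sketch assembles the block generator matrix from two component generator matrices and encodes a message. The function names `uuv_generator` and `encode`, the restriction to a prime field size `q`, and the toy matrices are assumptions made for this example only; they are not taken from the paper.

```python
def uuv_generator(G1, G2, q=2):
    """Stack rows (g, g) for g in G1 on top of rows (0, g) for g in G2.

    G1 and G2 are lists of rows over F_q (q prime) with the same length n.
    """
    n = len(G1[0])
    if any(len(row) != n for row in G1 + G2):
        raise ValueError("C1 and C2 must have the same length")
    top = [[x % q for x in row] * 2 for row in G1]            # rows (g, g)
    bottom = [[0] * n + [x % q for x in row] for row in G2]   # rows (0, g)
    return top + bottom


def encode(m, G, q=2):
    """Return the codeword mG over F_q (q prime)."""
    return [sum(mi * g for mi, g in zip(m, col)) % q for col in zip(*G)]


# Toy binary example: C1 is the [3, 2] even-weight code, C2 the [3, 1] repetition code.
G1 = [[1, 0, 1], [0, 1, 1]]
G2 = [[1, 1, 1]]
G = uuv_generator(G1, G2)
codeword = encode([1, 0, 1], G)   # message part (1, 0) for C1, (1) for C2
```

In this toy case the first half of `codeword` is a codeword $c_1$ of $C_1$ and the second half is $c_1 + c_2$, matching the shape $(c_1, c_1 + c_2)$.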
The McEliece cryptosystem leverages coding theory problems to establish one-way functions within the encryption algorithm, concealing the plaintext as a random-looking vector. For decryption, the receiver must possess knowledge of the code's structure.
The general framework of this cryptosystem is as follows: Choose positive integers $q$, $n$, and $k$ with $k \le n$. Then, construct a linear code $C$ over the finite field $\mathbb{F}_q$, with parameters $[n, k, d]$, selected from a family of linear codes.

Key generation:
The receiver creates a key using the linear code's generator matrix $G$ and uses this key to produce the public key. To do this, select a $k \times k$ invertible matrix $S$ and an $n \times n$ permutation matrix $P$ over $\mathbb{F}_q$, and compute $G' = SGP$. Then $(G', t)$ is the public key, and $(S, G, P)$ is the private key, where $t$ is the error-correcting capacity of the code.
Encryption: The sender uses the receiver's public key to encrypt the message $m \in \mathbb{F}_q^k$ as a ciphertext $c = mG' + e$, where $e$ is an error vector of weight $t$ chosen uniformly at random from $\mathbb{F}_q^n$.
Decryption: The receiver obtains the ciphertext $c$ from the sender and decrypts it as follows:
1) Compute $cP^{-1} = mSG + eP^{-1}$, noting that $eP^{-1}$ has weight $t$ because $P$ is a permutation matrix.
2) Utilize an efficient decoding algorithm of the linear code (depending on the specific code) to find $mS$.
3) Retrieve the original message as $m = (mS)S^{-1}$.
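The key generation, encryption, and decryption steps above can be sketched on a toy instance. The example below is a minimal illustration only: it assumes the binary $[7, 4, 3]$ Hamming code (so $t = 1$) in place of a cryptographically meaningful code, and all helper names (`matmul`, `inverse`, `keygen`, `encrypt`, `decrypt`) are hypothetical. It offers no security.

```python
import random

# Systematic [7, 4, 3] Hamming code: G = [I | A], H = [A^T | I].
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(4)] + A[i] for i in range(4)]
H = [[A[j][i] for j in range(4)] + [int(i == j) for j in range(3)] for i in range(3)]


def matmul(X, Y):
    """Matrix product over F_2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*Y)] for row in X]


def inverse(M):
    """Gauss-Jordan inverse over F_2; returns None if M is singular."""
    k = len(M)
    aug = [row[:] + [int(i == j) for j in range(k)] for i, row in enumerate(M)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def keygen(rng):
    """Return the public matrix G' = SGP and the private data (S^{-1}, permutation)."""
    while True:
        S = [[rng.randrange(2) for _ in range(4)] for _ in range(4)]
        S_inv = inverse(S)
        if S_inv is not None:
            break
    perm = list(range(7))
    rng.shuffle(perm)
    SG = matmul(S, G)
    G_pub = [[0] * 7 for _ in range(4)]
    for r in range(4):
        for i in range(7):
            G_pub[r][perm[i]] = SG[r][i]   # column i of SG moves to position perm[i]
    return G_pub, (S_inv, perm)


def encrypt(m, G_pub, rng):
    """c = mG' + e with a random error e of weight t = 1."""
    c = matmul([m], G_pub)[0]
    c[rng.randrange(7)] ^= 1
    return c


def decrypt(c, private):
    S_inv, perm = private
    y = [c[perm[i]] for i in range(7)]                     # y = c P^{-1}
    s = [sum(h * v for h, v in zip(row, y)) % 2 for row in H]
    if any(s):                                             # single error: syndrome equals a column of H
        y[[list(col) for col in zip(*H)].index(s)] ^= 1
    return matmul([y[:4]], S_inv)[0]                       # (mS) S^{-1}, using that G is systematic
```

A round trip would look like `pub, priv = keygen(random.Random(0))` followed by `decrypt(encrypt(m, pub, rng), priv)`, which should return `m` for any 4-bit message.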

IND-CPA
The IND-CPA (indistinguishability under chosen-plaintext attack) security game involves interactions between a challenger and an adversary aiming to distinguish between encrypted plaintexts. In this scenario, the adversary's objective is to correctly identify the plaintext corresponding to a given ciphertext sent by the challenger. If the adversary achieves this with probability non-negligibly better than random guessing, the adversary wins, indicating a breach in the security of the public key encryption; otherwise, the encryption is considered to be IND-CPA secure. The game proceeds as follows:
1. The challenger generates a public key and a private key by using a designated key generation algorithm and specified parameters. The public key and parameters are then transmitted to the adversary.
2. The adversary selects two plaintexts, denoted as $m_1$ and $m_2$, and transmits them to the challenger for encryption.
3. The challenger chooses one of the two plaintexts uniformly at random, encrypts it, and sends the resulting ciphertext to the adversary.
4. The adversary outputs a guess of which plaintext was encrypted.
The public key encryption scheme is deemed IND-CPA secure if no polynomial-time adversary can identify the actual plaintext corresponding to the given ciphertext with probability non-negligibly greater than 1/2.

III. DECODING ALGORITHM FOR THE $(C_1, C_1 + C_2)$ CODE
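The primary objective is to construct a decoding method for the code $C$ by combining the decoding procedures of the individual codes $C_1$ and $C_2$, which are linear codes of length $n$ over $\mathbb{F}_q$. Let $D_{C_i}$ denote an efficient decoding algorithm of the code $C_i$, capable of correcting up to $t_i = \lfloor (d_i - 1)/2 \rfloor$ errors, for $i = 1, 2$. Let $t = \lfloor (d - 1)/2 \rfloor$ be the error-correcting capability and $d$ be the minimum distance of $C$. Consider the transmission of a codeword $c$ through a communication channel, which introduces a randomly generated error $e$ of weight $t$. The received vector can be expressed as $r = (r_1, r_2) = c + e = (c_1, c_1 + c_2) + (e_1, e_2)$, where $r_1 = c_1 + e_1$ and $r_2 = c_1 + c_2 + e_2$. It is important to note that the minimum distance of the code $C$ is $d = \min\{2d_1, d_2\}$, so it is either $2d_1$ or $d_2$, where $d_1$ and $d_2$ are the minimum distances of $C_1$ and $C_2$. A decoding method for the first case, $d = 2d_1$, has been discussed in [48]; our study also investigates the second case, $d = d_2$. Therefore, we discuss both cases.
Case 1: $2d_1 \le d_2$. Here $d = 2d_1$, so $t = \lfloor (2d_1 - 1)/2 \rfloor = d_1 - 1$ and $t \le t_2$. Since $r_2 - r_1 = c_2 + (e_2 - e_1)$ with $\mathrm{wt}(e_2 - e_1) \le \mathrm{wt}(e_1) + \mathrm{wt}(e_2) = \mathrm{wt}(e) \le t \le t_2$, the decoder $D_{C_2}$ recovers $c_2$ from $r_2 - r_1$. Now, we claim that either $\mathrm{wt}(e_1) \le t_1$ or $\mathrm{wt}(e_2) \le t_1$. On the contrary, assume that $\mathrm{wt}(e_i) > t_1$ for $i = 1, 2$. Then $\mathrm{wt}(e) = \mathrm{wt}(e_1, e_2) = \mathrm{wt}(e_1) + \mathrm{wt}(e_2) \ge 2t_1 + 2 \ge d_1 > t$. This leads to a contradiction; thus, there must exist at least one $i$ such that $\mathrm{wt}(e_i) \le t_1$. Consequently, we can decode the codeword $c_1$ using $D_{C_1}$, applied to $r_1 = c_1 + e_1$ or to $r_2 - c_2 = c_1 + e_2$. However, we do not know in advance which index $i$ to decode. Nevertheless, we can detect it by checking that we have not corrected more than $t$ errors in total, that is, by accepting the candidate $\hat{c}_1$ for which $d(r, (\hat{c}_1, \hat{c}_1 + c_2)) \le t$.
Case 2: $d_2 < 2d_1$. Here $d = d_2$ and $t = t_2$, so $D_{C_2}$ again recovers $c_2$ from $r_2 - r_1$, because $\mathrm{wt}(e_2 - e_1) \le \mathrm{wt}(e) \le t_2$. Moreover, $d_2 \le 2d_1 - 1$ gives $t = t_2 \le d_1 - 1$, so the argument of Case 1 shows that there exists at least one $i$ such that $\mathrm{wt}(e_i) \le t_1$, and we can decode the codeword $c_1$ using $D_{C_1}$. Again, we do not know which index $i$ to decode, but the check $d(r, (\hat{c}_1, \hat{c}_1 + c_2)) \le t$ identifies it: if two distinct codewords $c, c' \in C$ satisfied $d(r, c) \le t$ and $d(r, c') \le t$, these conditions would give $d(c, c') \le d(r, c) + d(r, c') \le 2t < d$, which contradicts the minimum distance of $C$. Hence, we can find the codeword $c_1$ using the decoding algorithm $D_{C_1}$ of the code $C_1$.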

Algorithm 1 Structure of the decoding algorithm
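Input: the received vector $r = (r_1, r_2) = c + e$, where $c \in C$ and the error $e = (e_1, e_2)$ has Hamming weight $t$.
Output: the codeword $c = (c_1, c_1 + c_2)$.
The proposed decoding algorithm's complexity is related to the decoding complexity of the linear codes under consideration. The algorithm performs decoding in two steps. First, decoding is performed on $r_2 - r_1$ with $D_{C_2}$. In the second step, $D_{C_1}$ works on $r_2 - c_2$ and $r_1$. Thus, the algorithm calls the decoder of $C_1$ twice and the decoder of $C_2$ once. Specifically, its complexity is $2\mathrm{Dec}(C_1) + \mathrm{Dec}(C_2)$, where $\mathrm{Dec}(C_1)$ (respectively $\mathrm{Dec}(C_2)$) represents the decoding complexity of $C_1$ (respectively $C_2$). Thus, the decoding complexity of the $(C_1, C_1 + C_2)$ code remains polynomial when $C_1$ and $C_2$ have polynomial-time decoding complexities.
Key generation: Let $G$ be the $k \times n$ generator matrix of the $(C_1, C_1 + C_2)$ code $C$, where $k = k_1 + k_2$ and $n$ now denotes the length of $C$, so that $C_1$ and $C_2$ have length $n/2$. Select a $k \times k$ invertible matrix $S$ and an $n \times n$ permutation matrix $P$, and compute $G' = SGP$ and $t$, the error-correcting capability of the code $C$. We have two cases: $t = \lfloor (2d_1 - 1)/2 \rfloor$ when $2d_1 \le d_2$, and $t = \lfloor (d_2 - 1)/2 \rfloor$ when $d_2 < 2d_1$. Now the public key is $(G', t)$ and the private key is $(S, G, P, C_1, C_2)$.
Encryption: Let $u \in \mathbb{F}_q^k$ be the plaintext. We encode it as $c = uG' + e$, where $e \in \mathbb{F}_q^n$ is an error vector of weight $t$ chosen uniformly at random.
Decryption: Post-multiply the ciphertext $c$ by the permutation matrix $P^{-1}$:
$$cP^{-1} = uSG + eP^{-1}. \quad (6)$$
Since $P$ is a permutation matrix, it does not affect the weight of the error $e$. Write $S$ as $[S_1 | S_2]$, where $S_1$ consists of the first $k_1$ columns of $S$ and $S_2$ of the remaining $k_2$ columns. Equation (6) then becomes
$$cP^{-1} = (uS_1 G_1, \; uS_1 G_1 + uS_2 G_2) + eP^{-1}.$$
Since $eP^{-1}$ has length $n$, split it into two halves and take $e'_1$ and $e'_2$ as the first and last $n/2$ coordinates of $eP^{-1}$. The first $n/2$ coordinates of $cP^{-1}$ are denoted as
$$x_1 = uS_1 G_1 + e'_1, \quad (7)$$
and the last $n/2$ coordinates are
$$x_2 = uS_1 G_1 + uS_2 G_2 + e'_2. \quad (8)$$
Subtracting eq. (7) from (8), we find $x_2 - x_1 = uS_2 G_2 + (e'_2 - e'_1)$.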

VOLUME 12, 2024
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
Employing the decoding algorithm detailed in Section III, decode $x_2 - x_1$ to obtain $uS_2$, which is possible since $\mathrm{wt}(e'_2 - e'_1) \le t_2$. Now, decode $x_2 - uS_2 G_2$ and $x_1$ with the decoding algorithm of $C_1$ to obtain $uS_1$. By applying $S^{-1}$ to $[uS_1 | uS_2]$, we get the plaintext $u$.
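The two-step decoder of Section III, on which this decryption relies, can be sketched for a toy binary instance. The sketch below assumes $C_1$ is the $[7, 4, 3]$ Hamming code ($t_1 = 1$) and $C_2$ is the $[7, 1, 7]$ repetition code ($t_2 = 3$), so that $2d_1 \le d_2$, $d = 6$, and $t = 2$. These component codes and all function names are assumptions chosen for illustration and are not the parameters proposed in the paper.

```python
from itertools import combinations

A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
H1 = [[A[j][i] for j in range(4)] + [int(i == j) for j in range(3)] for i in range(3)]
H1_COLS = [list(col) for col in zip(*H1)]


def hamming_encode(m):
    """Systematic [7, 4, 3] Hamming encoder: m -> (m, mA)."""
    return list(m) + [sum(m[j] * A[j][i] for j in range(4)) % 2 for i in range(3)]


def hamming_decode(y):
    """Syndrome decoder for C1; corrects one error."""
    y = list(y)
    s = [sum(h * v for h, v in zip(row, y)) % 2 for row in H1]
    if any(s):
        y[H1_COLS.index(s)] ^= 1
    return y


def repetition_decode(y):
    """Majority decoder for C2 (odd length)."""
    return [int(sum(y) > len(y) // 2)] * len(y)


def decode_uuv(r, dec1, dec2, t):
    """Decode r = (c1 + e1, c1 + c2 + e2) over F_2 when wt(e1, e2) <= t."""
    n = len(r) // 2
    r1, r2 = r[:n], r[n:]
    c2 = dec2([a ^ b for a, b in zip(r1, r2)])        # r2 - r1 = c2 + (e2 - e1)
    for word in (r1, [a ^ b for a, b in zip(r2, c2)]):  # candidates c1 + e1 and c1 + e2
        c1 = dec1(word)
        candidate = c1 + [a ^ b for a, b in zip(c1, c2)]
        # Accept only if no more than t errors were corrected in total.
        if sum(a != b for a, b in zip(candidate, r)) <= t:
            return candidate
    return None


def flip(word, positions):
    """Return a copy of word with the given bit positions flipped."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


c1 = hamming_encode([1, 0, 1, 1])
c = c1 + [a ^ 1 for a in c1]                          # (c1, c1 + c2) with c2 = all-ones
all_ok = all(
    decode_uuv(flip(c, pos), hamming_decode, repetition_decode, 2) == c
    for w in range(3)
    for pos in combinations(range(14), w)
)
```

The final check runs the decoder on every error pattern of weight at most 2 applied to one codeword. The acceptance test inside `decode_uuv` is what rejects a wrong candidate when one half carries more than $t_1$ errors.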

V. SECURITY
A cryptosystem uses cryptographic methods to ensure the security of information. The strength of a cryptosystem depends on the information available to the public and the mathematical challenges involved in its construction. Adversaries can employ different attacks, such as distinguisher-based and side-channel attacks, to steal data. The McEliece cryptosystem is one of the post-quantum secure systems that relies on the effectiveness of one-way functions associated with the public key and ciphertext. An ideal cryptosystem should have a small key size and high attack complexity to enhance security. The primary threats to the McEliece cryptosystem come in the form of structural and non-structural attacks. As a result, we examine all possible attacks that can be launched against the variant of the McEliece cryptosystem that we have proposed.

A. NON-STRUCTURAL ATTACK

EXHAUSTIVE SEARCH ATTACK
This attack involves a random search for the private key. The permutation matrix $P$ can be found among $n!$ possibilities. To find the non-singular matrix $S$ of size $k \times k$, the attacker needs to search through $\prod_{i=0}^{k-1} (q^k - q^i)$ possibilities. The chances of successfully determining the accurate values of $S$ and $P$ are $1 / \prod_{i=0}^{k-1} (q^k - q^i)$ and $1/n!$, respectively. However, due to the exponential complexity, this attack is not practical for typical system parameters.
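To get a feeling for the size of these search spaces, the two counts can be evaluated on a logarithmic scale. The following is a small sketch with hypothetical function names; the parameters in the usage note are arbitrary illustrative inputs rather than the paper's proposed parameters.

```python
import math


def log2_num_permutations(n):
    """log2(n!), the number of n x n permutation matrices, via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)


def log2_num_invertible(k, q):
    """log2 of prod_{i=0}^{k-1} (q^k - q^i), the number of invertible k x k matrices over F_q."""
    return sum(math.log2(q**k - q**i) for i in range(k))
```

For instance, `log2_num_permutations(1024)` and `log2_num_invertible(512, 2)` both evaluate to several thousand bits, which illustrates why the exhaustive search is impractical.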

INFORMATION-SET DECODING ATTACK
ISD algorithms are the preferred choice for decoding general linear codes. Prange gave the first ISD algorithm [52] in 1962 for the binary field. After that, several modifications of the ISD algorithm were proposed to improve speed and lower complexity. Lee and Brickell's [53] and Stern's algorithms [54] represent enhancements over Prange's approach, achieving a reduction in the required number of iterations while raising the per-iteration cost. These algorithms were initially put forth for binary fields and later extended to finite fields $\mathbb{F}_q$ by Peters in [50]. The Stern algorithm is regarded as one of the fastest ISD algorithms on classical computers and is frequently utilized: although its cost per iteration is higher than in Lee-Brickell's and Prange's algorithms, the reduced average number of iterations decreases the overall cost.
In our proposed cryptosystem, we employ the Stern algorithm as an ISD attack for conducting cryptanalysis. The algorithm operates as follows: first, we identify an information set and organize the parity-check matrix $H$ (obtained through $GH^T = 0$) based on this information set. We then partition the information set into two subsets, denoted as $I_1$ and $I_2$, where $I_1$ contains vectors from the first partition and $I_2$ contains vectors from the second partition. By comparing two fixed vectors, we can determine if they yield the desired error vector. Kindly refer to [55, Chapter 4] for details. Complexity estimations for binary and non-binary cases can be found in [55, Chapter 4], and we have employed these estimations to determine the system's security level.
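The Stern work factor used for the security levels is involved, so the sketch below uses the simpler Prange estimate instead, which only counts the expected number of information sets that must be tried. It is a swapped-in, coarser technique that ignores the polynomial cost of each iteration, and the function name is hypothetical; it is meant to show how $n$, $k$, and $t$ enter an ISD estimate, not to reproduce the paper's figures.

```python
import math


def prange_log2_iterations(n, k, t):
    """log2 of the expected number of Prange iterations.

    One iteration succeeds when all t error positions fall outside the chosen
    information set, which happens with probability C(n - k, t) / C(n, t).
    """
    if not 0 <= t <= n - k:
        raise ValueError("need 0 <= t <= n - k")
    return math.log2(math.comb(n, t)) - math.log2(math.comb(n - k, t))
```

Calling `prange_log2_iterations(n, k, t)` with a candidate parameter set gives a rough exponent; more refined algorithms such as Stern's lower this exponent, so the value should not be read as a security level.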

B. STRUCTURAL ATTACK
A structural attack in cryptography involves uncovering the private key structure using information from the public key. The adversary must find the private key to execute such an attack successfully.
One category of structural attacks is the distinguisher-based attack. This type of attack aims to distinguish a specific code from a randomly generated one. Within the realm of the McEliece cryptosystem, several variants have fallen prey to distinguisher-based attacks. For example, the McEliece variants that rely on sums of GRS codes or on GRS codes expanded with added random elements are vulnerable to such attacks.
The SURF signature scheme [45] uses a generalized $(C_1, C_1 + C_2)$ code structure and the Niederreiter cryptosystem framework to build a hash-and-sign algorithm, as discussed in the introduction. The security of the signature scheme is based on two problems: 1) the syndrome decoding problem, and 2) determining if a linear code is a permuted $(C_1, C_1 + C_2)$ code. Our proposed cryptosystem focuses on the second problem, which involves revealing the $(C_1, C_1 + C_2)$ code structure. We use the same idea as the SURF scheme to demonstrate that an attack is not feasible. This section discusses the difficulty of discovering the permutation of the $(C_1, C_1 + C_2)$ code, the problem of finding the structure of the subspaces $(C_1, C_1)$ and $(0, C_2)$, and presents the optimal strategy to ensure the proposed cryptosystem's security.
Problem of Finding the Structure of a $(C_1, C_1 + C_2)$ Code: In the SURF scheme, the authors addressed the challenge of determining if a linear code conforms to the structure of a permuted $(C_1, C_1 + C_2)$ code. To this end, they determined the weight distribution of a codeword drawn from a random linear code and from a $(C_1, C_1 + C_2)$ code in which both $C_1$ and $C_2$ are random linear codes. The difference between the weight distribution of a random linear code and that of a $(C_1, C_1 + C_2)$ code is tiny for the same length and dimension. As the weight distribution remains invariant under permutations, an attacker can instead look for low-weight codewords among the permuted $(c_1, c_1)$ codewords (where $c_1$ belongs to $C_1$) or the permuted $(0, c_2)$ codewords (where $c_2$ is an element of $C_2$). After finding such a codeword, they apply an existing algorithm [56, Subs. 4.4] and find a valid signature. This signature attack is different from our encryption scheme, but as our construction is based on the $(C_1, C_1 + C_2)$ code, the particular subspaces $(C_1, C_1)$ and $(0, C_2)$ must exist. These special subspaces may reveal the lowest-weight codewords of the code. An algorithm is proposed in the SURF scheme to find the basis elements of the permuted versions of the subspaces $(C_1, C_1)$ and $(0, C_2)$. We use this algorithm in our scheme and discuss it in Algorithm 2. Now, let us define the support of a code. The support of a code $C$ is defined as the union of the supports of its codewords, where the support of a codeword $c$ is the set of positions of its nonzero coordinates. Our approach's fundamental security mainly depends on the difficulty of determining whether a linear code corresponds to a permuted $(C_1, C_1 + C_2)$ code.
The General Idea of Computing $C_1$ and $C_2$ up to some Permutation: This idea is formulated by the algorithm that is mentioned in the distinguisher attack on $(C_1, C_1 + C_2)$ codes. The given public key $G'$ generates a linear code $C_{pub}$ such that $C_{pub} = (C_1, C_1 + C_2)P$, which contains the permuted subcodes $C'_1 = (C_1, C_1)P$ and $C'_2 = (0, C_2)P$. Our aim is to find the basis elements of $C'_1$ and $C'_2$, respectively, up to some permutation of positions. The following functions are used in the algorithm:
• Codewords($\mathrm{Punc}_I(C_{pub})$, $p$) finds all the codewords of weight $p$ in the punctured public code $\mathrm{Punc}_I(C_{pub})$.
• PERFECT($x$, $I$, $C_{pub}$) identifies all codewords $c$ within $C_{pub}$ that satisfy the condition: the elements of $c$ outside the set $I$ are equal to $x$.
The algorithm proceeds as follows:
1) Choose a random subset $I$ of the coordinate positions.
2) Collect in $U$ all codewords of weight $p$ of the code $C_{pub}$ punctured at the positions in $I$.
3) For every element $u$ in $U$, search for a codeword $z$ of $C_{pub}$ whose restriction outside $I$ is equal to $u$.
4) Check whether $z$ belongs to $C'_2$ (or, respectively, $C'_1$). If it does, keep it on a list.
5) If the list is empty, start again from step 1; otherwise, find the remaining elements by applying steps 3 and 4.
To determine the number of iterations $N$ required to recover at least one element, we focus on the success probability $P_{succ}$ of obtaining a codeword of $C'_1$ or $C'_2$ in a single iteration. We can then choose $N$ to be at least $1/P_{succ}$ to ensure a non-empty list.
Proposition 1 ([45], Proposition 9): The success probability of obtaining an element in the list $C$ using the algorithm DetermineC$'_1$ during one iteration executed by the for loop is given in [45, Proposition 9], in terms of the function $g$ defined as $g(y) = \max\{y(1 - y/2), 1 - 1/y\}$.
Proposition 2 ([45], Proposition 8): The success probability of obtaining an element in the list $C$ using the algorithm DetermineC$'_2$ during one iteration executed by the for loop is given in [45, Proposition 8], where the function $g$ remains the same as defined in the preceding proposition.
Complexity of Finding the Permuted Version of $C_1$ and $C_2$: The complexity of recovering $C_1$ and $C_2$ up to permutation has been computed in [45], with $g$ defined as above. The dual of a $(C_1, C_1 + C_2)$ code has a similar structure [57, Proposition 2.5]. Therefore, we apply the same approach to the dual code. By utilizing this algorithm in our scheme, we found that its complexity is exponential for our proposed parameters in the binary case. This observation is presented in Table 2 in subsection V-C.
Further, the attack that breaks the SURF scheme depends on distinguishing between the dimension of the hull of a random linear code and that of a $(C_1, C_1 + C_2)$ code, where $C_1$ and $C_2$ are random linear codes. One can obtain partial information on the permutation matrix by finding the hull of the permuted $(C_1, C_1 + C_2)$ code when $k_1 \ge k_2$. However, this attack only works when $k_1 \ge k_2$, not when $k_1 < k_2$.
The security of the McEliece cryptosystem relies on the idea that the publicly known code appears indistinguishable from a randomly generated code. In order to enhance the resilience of our proposed cryptosystem against the attack on SURF, we have imposed the restriction $k_1 < k_2$.

DISTINGUISHER ATTACK (SQUARE-CODE ATTACK)
An alternative method exists for distinguishing a particular linear code structure from randomly generated ones. Before describing it, we need some definitions related to this attack.
Definition 7 (Schur product): For given $u, v \in \mathbb{F}_q^n$, the Schur product of $u$ and $v$ is defined as $u * v = (u_1 v_1, u_2 v_2, \ldots, u_n v_n)$, where $*$ represents the componentwise product.
Definition 8 (Schur product codes): The Schur product of two codes $C_1$ and $C_2$ is defined as $C_1 * C_2 = \mathrm{span}_{\mathbb{F}_q}\{ c_1 * c_2 : c_1 \in C_1, c_2 \in C_2 \}$. The Schur product code becomes a square code when $C_1 = C_2$, and its dimension, as discussed in [58], satisfies $\dim(C * C) \le \min\{ n, k(k+1)/2 \}$ for an $[n, k]$ code $C$.
Let $E = (E_1, E_1 + E_2)$ and $F = (F_1, F_1 + F_2)$, and let $G_{E_1}$, $G_{E_2}$, $G_{F_1}$, and $G_{F_2}$ be the generator matrices of the codes $E_1$, $E_2$, $F_1$, and $F_2$, respectively. Therefore, the generator matrices of the codes $E$ and $F$ are given by
$$G_E = \begin{pmatrix} G_{E_1} & G_{E_1} \\ 0 & G_{E_2} \end{pmatrix} \quad \text{and} \quad G_F = \begin{pmatrix} G_{F_1} & G_{F_1} \\ 0 & G_{F_2} \end{pmatrix}.$$
To obtain a generator matrix for $E * F$, one can compute the componentwise products between each row of the matrix $G_E$ and each row of the matrix $G_F$ and subsequently eliminate any rows that are linearly dependent. The matrix $G_E * G_F$ contains all the componentwise products of rows in $G_E$ with rows in $G_F$.
The rows of $G_E * G_F$ have the form $(a, a)$, with $a$ a product of a row of $G_{E_1}$ and a row of $G_{F_1}$, or the form $(0, b)$, with $b$ a product of rows taken from $G_{E_1}$ and $G_{F_2}$, from $G_{E_2}$ and $G_{F_1}$, or from $G_{E_2}$ and $G_{F_2}$. This collection of rows generates $E * F$. By eliminating rows that are linearly dependent, we obtain a generator matrix of the type
$$\begin{pmatrix} G'_1 & G'_1 \\ 0 & G'_2 \end{pmatrix},$$
where $G'_1$ and $G'_2$ are generator matrices for $E_1 * F_1$ and $E_1 * F_2 + E_2 * F_1 + E_2 * F_2$, respectively. □
The generator matrix of the square code $E * E$ is obtained from the matrix above by taking $F = E$.
Based on this, the dimension of the square code $E * E$ is as follows:
$$\dim(E * E) = \dim(E_1 * E_1) + \dim(E_1 * E_2 + E_2 * E_2).$$
The code $E_1 * E_2$ is generated by $\{ g_i * g_j \}$, where $g_i$ and $g_j$ are rows of the generator matrices $G_{E_1}$ and $G_{E_2}$, respectively, for $1 \le i \le k_1$ and $1 \le j \le k_2$, so its dimension is at most $\min\{n, k_1 k_2\}$, where $n$ here denotes the length of $E_1$ and $E_2$. Consequently, $\dim(E * E) \le \min\{ n, k_1(k_1+1)/2 \} + \min\{ n, k_1 k_2 + k_2(k_2+1)/2 \}$. We only discuss the general structure of the square code of $(C_1, C_1 + C_2)$ and provide only an upper bound on the dimension of the square code. The square code attack, as discussed in [59], [60], and [61], only works when the number of randomly added columns in the public key is less than $n - k$ in the case of generalized Reed-Solomon codes. Therefore, the square code attack does not apply to our proposed parameters for generalized Reed-Solomon codes, which makes the proposed work resistant to the square code attack. The tightness of this bound and lower bounds for our proposed cases remain open for exploration.
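The dimension of a square code can be checked numerically for small binary codes. The sketch below is an illustration restricted to $\mathbb{F}_2$, where the componentwise product of two rows is a bitwise AND; the function names and the toy generator matrices are assumptions for this example, and the approach does not cover the non-binary GRS case discussed above.

```python
def rank_gf2(vectors):
    """Rank over F_2 of vectors given as integer bitmasks."""
    pivots = {}                       # leading-bit position -> basis vector
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]         # cancel the leading bit and keep reducing
    return len(pivots)


def schur_square_dimension(G):
    """Dimension of the square code C * C for the binary code generated by the rows of G."""
    rows = [int("".join(map(str, row)), 2) for row in G]
    products = [rows[i] & rows[j] for i in range(len(rows)) for j in range(i, len(rows))]
    return rank_gf2(products)


# Compare against the general bound min(n, k(k+1)/2) for a small example.
G_example = [[1, 1, 0, 1, 1, 0], [0, 1, 1, 0, 1, 1], [0, 0, 0, 1, 1, 1]]
k, n = len(G_example), len(G_example[0])
dim_square = schur_square_dimension(G_example)
bound = min(n, k * (k + 1) // 2)
```

Here `G_example` has the block shape of a $(C_1, C_1 + C_2)$ generator matrix, and `dim_square` can be compared with `bound`. A distinguisher looks for public codes whose square dimension falls noticeably below what a random code of the same parameters would give.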

C. KEY SIZE
The robustness and effectiveness of the McEliece cryptosystem hinge on its key size, which makes the key size a pivotal factor in securing the scheme against potential threats. The key size in our proposed cryptosystem is $kn \log_2 q$ bits, where $n$ is the length and $k$ is the dimension of the proposed code.

TABLE 2. Key size analysis of the proposed cryptosystem using $C_1$ as a Goppa code and $C_2$ as a GRS code at different security levels against the Stern ISD algorithm [49].

Table 1 presents the key sizes for achieving security levels of 201 and 245 bits against the generalized Stern ISD algorithm [50]. The cryptosystem employs generalized Reed-Solomon codes $C_1$ and $C_2$ with dimensions $k_1$ and $k_2$, respectively, to construct $C$. In Table 2, we examine the complexity of distinguishing the permuted codewords of the $(C_1, C_1)$ and $(0, C_2)$ codes from equations 9 and 10. Our analysis indicates that these codes offer a good level of security. In Table 2, we also explore the security of the proposed cryptosystem when binary Goppa and GRS codes are used. The first two rows present codes generated from binary Goppa and GRS codes, achieving security levels of 133 and 192 bits, respectively, against the Stern ISD algorithm [49]. In the third row, we employ a binary Goppa code $C_1$ with parameters [1980, 539, 263]

VI. COMPLEXITY ANALYSIS
Let the cost of one addition in $\mathbb{F}_q$ be $\delta = \log_2 q$ binary operations, and the cost of one multiplication be $(\log_2 q)^2 = \delta^2$ binary operations. The complexity of the encryption algorithm is then $nk(\delta + \delta^2)$ binary operations. The decryption complexity depends on the choice of the linear codes $C_1$ and $C_2$. The overall decryption cost consists of: 1. $n$ operations to compute $cP^{-1}$; 2. the decoding cost $2\,\mathrm{Dec}(C_1) + \mathrm{Dec}(C_2)$; and 3. $k^2(\delta + \delta^2)$ binary operations to compute $uSS^{-1}$. Hence, the total complexity is $n + 2\,\mathrm{Dec}(C_1) + \mathrm{Dec}(C_2) + k^2(\delta + \delta^2)$. The computational complexity of decoding a binary Goppa code is $O(n \log_2 n)$ [62]. For a GRS code, an efficient decoder is the Berlekamp-Massey decoding algorithm [63], with a computational complexity of $O(n^2)$. Therefore, for the example in Table 1, where both codes are GRS codes, the computational complexity of the decryption algorithm is $n + 3\,O(n^2) + k^2(\delta + \delta^2) \approx O(n^2)$. Similarly, for the example in Table 2, where $C_1$ is a binary Goppa code and $C_2$ is a GRS code, the computational complexity of the decryption algorithm is $n + 2\,O(n \log_2 n) + O(n^2) + k^2(\delta + \delta^2) \approx O(n^2)$.
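The operation counts above can be evaluated numerically with a short Python sketch. This is only an illustrative calculator under stated assumptions: the function names are hypothetical, and because the text gives the decoder costs $\mathrm{Dec}(C_1)$ and $\mathrm{Dec}(C_2)$ only asymptotically, they are passed in as caller-supplied estimates rather than computed.

```python
import math

def encryption_ops(n: int, k: int, q: int) -> float:
    """Encryption cost n*k*(delta + delta^2), with delta = log2(q)."""
    delta = math.log2(q)
    return n * k * (delta + delta ** 2)

def decryption_ops(n: int, k: int, q: int, dec_c1: float, dec_c2: float) -> float:
    """Decryption cost n + 2*Dec(C1) + Dec(C2) + k^2*(delta + delta^2).

    dec_c1 and dec_c2 are assumed operation counts for one run of the
    decoders of C1 and C2; they are inputs, not derived here.
    """
    delta = math.log2(q)
    return n + 2 * dec_c1 + dec_c2 + k ** 2 * (delta + delta ** 2)

# Toy numbers (assumptions): a binary [8, 4] code with made-up decoder costs.
enc = encryption_ops(8, 4, 2)                         # 8 * 4 * (1 + 1) = 64
dec = decryption_ops(8, 4, 2, dec_c1=10, dec_c2=20)   # 8 + 20 + 20 + 32 = 80
```

For large parameters the $k^2(\delta + \delta^2)$ and $O(n^2)$ decoder terms dominate, which matches the $O(n^2)$ estimate stated in the text.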

VII. IND-CPA SECURE VERSION
The

VIII. CONCLUSION
In conclusion, this study focused on enhancing the security of the McEliece framework through the novel $(C_1, C_1 + C_2)$ construction of codes. Our cryptographic system, utilizing binary Goppa and GRS codes within this construction, demonstrated a reduction in key size while maintaining security against known attacks. Future research may explore alternative algebraic codes or parameter adjustments to further improve our proposed variant of the McEliece cryptosystem. Notably, our work is positioned as a resilient post-quantum candidate that surpasses the classic McEliece cryptosystem in certain aspects.
This study opens avenues for refining cryptographic protocols and advancing secure communication across diverse applications, including the military, government, IoT, key exchange, and digital signatures. All calculations and assessments were conducted using SageMath. Despite the reduction achieved, the overall key size remains substantial, and addressing this limitation is left to future research. In summary, our work contributes to the evolving landscape of post-quantum cryptography and lays the groundwork for more secure and efficient cryptographic systems.
CONFLICTS OF INTEREST: Not applicable.
in which the rows form a basis of $C$ is termed a generator matrix $G$, and a matrix in which the rows form a basis of the dual code of $C$ is said to be a parity check matrix (denoted by $H$) of $C$. The matrices $G$ and $H$ are not unique, as a code can have multiple generator and parity check matrices. Nonetheless, $G$ and $H$ satisfy the relation $GH^{\top} = 0$, where $H^{\top}$ denotes the transpose of $H$. The Hamming distance between two codewords is the number of coordinate positions at which they differ, and the minimum of this distance over all pairs of distinct codewords of $C$ is called the minimum (Hamming) distance $d$ of the code $C$. The Hamming weight $\mathrm{wt}(u)$ of a codeword $u$ is the number of non-zero coordinates in $u$. For a linear code, the minimum distance equals the minimum Hamming weight of a non-zero codeword.
Definition 4 ($(C_1, C_1 + C_2)$ construction): Let $C_1$ and $C_2$ be two linear codes with parameters

3. The challenger randomly chooses $b$ from {1, 2} and encrypts the selected plaintext $m_b$, resulting in a ciphertext $c_b$. This ciphertext is sent back to the adversary.
4. The adversary's challenge is to determine which plaintext, $m_1$ or $m_2$, was encrypted to produce the given ciphertext $c_b$.

• the efficient decoding algorithms of $C_1$, $C_2$ with error-correcting capacities $t_1$ and $t_2$,
• a received word $r = (r_1, r_2)$ with $r_1 = c_1 + e_1$, $r_2 = c_1 + c_2 + e_2$, where $c_1 \in C_1$ and $c_2 \in C_2$ and $e = (e_1, e_2$

IV. PROPOSED CRYPTOSYSTEM
This section introduces a variant of the McEliece cryptosystem, which capitalizes on the NP-hard problem associated with decoding random linear codes and distinguishing them from efficiently decodable codes. The methodology entails the selection of publicly available parameters, denoted as $[n, k]$, which are then partitioned into two sets of linear code parameters, one for $C_1$ and one for $C_2$. These codes operate over the finite field $\mathbb{F}_q$, with $k = k_1 + k_2$, and are specified by the parameters $[n/2, k_1, d_1]$ and $[n/2, k_2, d_2]$, respectively. They also have efficient decoding algorithms. A new code, denoted as $C$, is obtained through the $(C_1, C_1 + C_2)$ construction, resulting in the parameters $[n, k, \min\{2d_1, d_2\}]$.
Key Generation: Let $G$ be a generator matrix of the code $C$, let $S$ be a non-singular matrix of size $k \times k$, and let $P$ be a permutation matrix of size $n \times n$ over $\mathbb{F}_q$. Compute $G'$

$(C_1, C_1 + C_2)$-code.

NP-Completeness Problem of Distinguishing a $(C_1, C_1 + C_2)$-Code.
Problem ($(C_1, C_1 + C_2)$-distinguishing) [45, Theorem 3]: Given a binary linear code $C$ and an integer $k_1$, determine whether there exists a permutation $P$ of length $n$ such that $P(C)$ forms a $(C_1, C_1 + C_2)$ code, with $\dim(C_1) = k_1$ and $|\mathrm{Supp}(C_2)| = n/2$.

Algorithm 2 DETERMINEC$'_2$: Calculates a Set of Independent Elements in $C'_2$
Input:
1. $C_{pub}$: the public code
2. $N$: the number of iterations
3. $p$, $l$: small integers
Output: a collection of independent elements of $C'_2$.
function DETERMINEC$_2$($C_{pub}$, $N$)
for $i = 1$ to $N$ do
$C \leftarrow \emptyset$
Select uniformly at random a set $I \subset \{1, 2, \ldots, n\}$ of cardinality $n - k - l$
$U \leftarrow$ CODEWORDS(Punc$_I$($C_{pub}$, $p$))

Security parameters: Same as the proposed cryptosystem.
Key Generation: Same as the proposed cryptosystem.
Encryption: Choose uniformly at random a string $s$ of $k'$ bits ($k' < k$) and encrypt $m|s$ using the encryption algorithm of the proposed variant of the McEliece cryptosystem; the ciphertext is $c = (m|s)G' + e$.
Decryption: Apply the decryption algorithm of the $(C_1, C_1 + C_2)$-based McEliece cryptosystem to obtain $m|s$, and output only the first $k - k'$ bits, $m$.

The lack of IND-CPA security of the proposed cryptosystem is evident and outlined in Section V. As a result, in this study, we convert the proposed cryptosystem into an IND-CPA-secure system. It is apparent that the original McEliece cryptosystem lacks IND-CPA security: if an adversary acquires a ciphertext $c$ and knows that $c$ corresponds to either $m_0$ or $m_1$, they can easily determine the corresponding plaintext by calculating the weight of $m_0 G \oplus c$ and checking whether it equals $t$. In this section, we introduce an IND-CPA-secure variant of the McEliece cryptosystem based on the $(C_1, C_1 + C_2)$ construction. The parameter selection and key generation of the randomized algorithm are the same as those of the proposed cryptosystem in Section IV. During randomized encryption, the algorithm encrypts $m|s$ instead of $m$, and during decryption, it outputs only the first $k - k'$ bits of the message. To prove the security of the randomized $(C_1, C_1 + C_2)$-based McEliece cryptosystem, we assume that the matrix $G'$ is indistinguishable from random matrices of the same size for any probabilistic polynomial-time algorithm.
Theorem 2: The randomized $(C_1, C_1 + C_2)$-based McEliece cryptosystem is IND-CPA secure if the general decoding problem is hard and $G''$ is indistinguishable. The proof of the theorem follows the approach outlined in [64, section 4].
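The weight-based distinguisher mentioned above, and the padding that counters it, can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: it uses the binary [7, 4, 3] Hamming code with $t = 1$ directly as the public matrix (no scrambling by $S$ and $P$, and not the $(C_1, C_1 + C_2)$ parameters of this paper), and every function and variable name is hypothetical.

```python
import random

# Toy public matrix (assumption): systematic [7, 4, 3] Hamming code, t = 1.
G_PUB = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
T = 1

def encode(m, G):
    """Compute mG over GF(2)."""
    word = [0] * len(G[0])
    for bit, row in zip(m, G):
        if bit:
            word = [a ^ b for a, b in zip(word, row)]
    return word

def random_error(n, t, rng):
    """Uniformly random error vector of Hamming weight exactly t."""
    e = [0] * n
    for pos in rng.sample(range(n), t):
        e[pos] = 1
    return e

def encrypt(m, G, t, rng):
    """Deterministic-message McEliece-style encryption c = mG + e."""
    e = random_error(len(G[0]), t, rng)
    return [a ^ b for a, b in zip(encode(m, G), e)]

def distinguish(c, m0, G, t):
    """Guess 0 if wt(m0 G xor c) == t, otherwise guess 1."""
    diff = [a ^ b for a, b in zip(encode(m0, G), c)]
    return 0 if sum(diff) == t else 1

def randomized_encrypt(m, G, t, k_pad, rng):
    """Randomized variant: encrypt m|s with s a fresh random k_pad-bit string."""
    s = [rng.randrange(2) for _ in range(k_pad)]
    return encrypt(m + s, G, t, rng)

rng = random.Random(2024)
m0, m1 = [1, 0, 1, 1], [0, 1, 1, 0]
guess_for_m0 = distinguish(encrypt(m0, G_PUB, T, rng), m0, G_PUB, T)
guess_for_m1 = distinguish(encrypt(m1, G_PUB, T, rng), m0, G_PUB, T)
c_rand = randomized_encrypt([1, 0], G_PUB, T, 2, rng)
```

In this toy setting the distinguisher never errs, because two distinct codewords differ in at least $d = 3 > 2t$ positions. With `randomized_encrypt`, the adversary no longer knows the full encoded input $m|s$, so the same weight check cannot be applied directly to a candidate message.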
This study presents a variant of the McEliece cryptosystem, employing the $(C_1, C_1 + C_2)$ construction of codes. Suppose we have two linear codes, denoted as $C_1$ and $C_2$, with parameters $[n/2, k_1, d_1]$ and $[n/2, k_2, d_2]$, respectively, where $k_1 < k_2$. We utilize these codes within the $(C_1, C_1 + C_2)$ construction, resulting in a novel linear code characterized by the parameters $[n, k_1 + k_2, \min\{2d_1, d_2\}]$. We present a general approach for the McEliece cryptosystem based on the $(C_1, C_1 + C_2)$ code and propose a decoding algorithm for the $(C_1, C_1 + C_2)$ code. We utilize the decoding algorithm proposed in , $C_1 + C_2$) construction, where $E$ is constructed from the linear codes $E_1$, $E_2$ and $F$ is constructed from the linear codes $F_1$, $F_2$ such as $E$

TABLE 1. Parameters of the proposed cryptosystem having $C_1$ and $C_2$ as GRS codes, providing security against the generalized Stern ISD algorithm [50].
, and a GRS code $C_2$ with parameters [1980, 1500, 481]. This configuration results in a $(C_1, C_1 + C_2)$ code characterized by the parameters [3960, 2039, 481]. The key size of classic McEliece is $k(n - k)/8$ bytes, and that of our proposed cryptosystem is $kn/8$ bytes, because $\log_2 q$ becomes one in the binary case. Thus, classic McEliece has a key size of $6528(8192 - 6528)/8 = 1357824$ bytes for the [8192, 6528, 128] parameter set, whereas ours has a key size of $2039 \cdot 3960/8 = 1009305$ bytes, which makes our key approximately 25% smaller than the classic McEliece key [1]. Importantly, this reduction in key size is achieved while maintaining a 256-bit security level.
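The key size comparison can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function names are hypothetical, and the formulas are the ones stated in the text ($k(n-k)/8$ bytes for classic McEliece and $kn \log_2 q / 8$ bytes for the proposed scheme).

```python
import math

def classic_mceliece_key_bytes(n: int, k: int) -> int:
    """Classic McEliece public key size, k(n - k)/8 bytes."""
    return k * (n - k) // 8

def proposed_key_bytes(n: int, k: int, q: int = 2) -> float:
    """Proposed scheme: k * n * log2(q) bits, returned in bytes."""
    return k * n * math.log2(q) / 8

classic = classic_mceliece_key_bytes(8192, 6528)   # 1357824 bytes
proposed = proposed_key_bytes(3960, 2039)          # 1009305.0 bytes
reduction = 1 - proposed / classic                 # about 0.257
```

The computed reduction is roughly 25.7%, consistent with the approximately 25% figure quoted in the text.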