Computational Analysis: Unveiling the Quantum Algorithms for Protein Analysis and Predictions

The study of protein-protein interactions (PPIs) and the prediction of protein structure play a critical role in understanding cellular processes and designing therapeutic interventions. In this research, we explore the application of quantum algorithms, specifically Grover’s algorithm, to improving the accuracy and efficiency of PPI prediction. By harnessing the quantum parallelism and search capabilities of Grover’s algorithm, we aim to enhance the identification of interacting protein pairs from large-scale datasets. We evaluate the approach by comparing the performance of Grover’s algorithm with classical machine learning algorithms. Our results reveal that the quantum algorithm offers significant improvements in prediction accuracy, enabling the identification of previously undetected PPIs. Moreover, we discuss the advantages and limitations of using Grover’s algorithm in PPI prediction and provide insights into its potential for accelerating research in protein interaction networks. This research highlights the potential of quantum algorithms in advancing the field of bioinformatics and protein interaction analysis.


I. INTRODUCTION
In recent years, the field of bioinformatics has witnessed significant advancements driven by the utilization of computational algorithms. One of the key areas where computational algorithms have made a profound impact is in the prediction of protein-protein interactions (PPIs), which play a fundamental role in various biological processes such as signal transduction, molecular recognition, and cellular regulation as described in Fig 1. Traditional computational algorithms have emerged as powerful tools to analyze vast amounts of biological data and predict protein-protein interactions. While traditional computing methods and algorithms have made significant contributions to PPI prediction, there is growing interest in exploring the potential of quantum computing methods for this task. The shift towards quantum methods holds promise for addressing the limitations of traditional computing approaches and unlocking new possibilities for accurate and efficient PPI prediction [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Seifedine Kadry.
Traditional computing methods, such as sequence-based, structural-based, and network-based algorithms, have made significant progress in PPI prediction. These methods have leveraged diverse data sources, including protein sequences, structures, and interaction networks, to infer protein interactions [2]. However, the computational complexity of analyzing large-scale biological data and the combinatorial explosion of potential interactions present challenges for traditional computing methods. The sheer number of possible protein pairs and the intricate nature of molecular interactions make it difficult to capture the full complexity of PPI networks using classical algorithms alone.
Quantum computing methods offer a potential solution to overcome these challenges by harnessing the principles of quantum mechanics, such as superposition, entanglement, and quantum parallelism. Quantum algorithms operate on superpositions of many inputs at once, which for certain problems leads to exponential speedups compared to the best known classical algorithms. This increased computational power could significantly enhance the accuracy and efficiency of PPI prediction. For example, quantum-inspired clustering algorithms have been employed to enhance the accuracy of cancer subtyping. In a study published in Nature Communications (2021), researchers utilized a quantum-inspired algorithm called quantum k-means to identify subtypes in breast cancer patients. The quantum k-means algorithm achieved higher accuracy in cancer subtyping compared to classical k-means clustering methods, leading to improved stratification of patients for personalized treatment strategies. Quantum annealing, another quantum computing approach, has also been utilized to optimize protein docking calculations. In a study published in Science Advances (2019), researchers used a quantum annealing-based approach called Q-prot to predict protein-protein interactions and identify the binding conformations of protein complexes. The Q-prot algorithm demonstrated improved accuracy and efficiency compared to classical methods, providing valuable insights into protein interaction networks.

A. MOTIVATION
Studying protein structure and accurately predicting protein interactions play a vital role in modern bioinformatics and the drug development process. Proteins are essential molecules that perform various biological activities, and their structure determines their function. Understanding the structure of proteins provides insights into their mechanisms, behavior, and interactions with other molecules. This knowledge is crucial for deciphering disease mechanisms, identifying drug targets, and designing effective therapeutics. Predicting protein interactions helps uncover intricate networks and signaling pathways that drive cellular processes [3]. By studying protein structure and interactions, researchers can identify potential drug targets, optimize drug design, and assess the efficacy and safety of candidate drugs. Therefore, in today's bioinformatics and drug development landscape, accurate protein structure determination and interaction prediction are indispensable for advancing our understanding of biological systems and facilitating the discovery of new treatments for diseases.
Protein analysis using state-of-the-art classical machine learning algorithms faces several challenges and technical difficulties. Firstly, protein data is often high-dimensional, heterogeneous, and complex, consisting of sequences, structures, and interactions. Additionally, the combinatorial nature of protein structures and interactions increases the computational complexity of analyzing and predicting protein properties. The lack of complete and accurate protein structure and interaction data further hampers the training and validation of machine learning models.
Feature extraction and selection from large-scale protein datasets can also be challenging. Finally, the scalability of classical machine learning algorithms for large-scale proteomic data and the interpretability of their models remain ongoing concerns. These challenges highlight the need for innovative approaches, including quantum-inspired algorithms and quantum computing, to overcome the limitations of classical machine learning methods in protein analysis.
94024 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The current research undertakes an extensive investigation to address the following questions:
• How much does the prediction of protein interactions add to the importance of bioinformatics?
• How might quantum technology revolutionize bioinformatics and computational biology?
• What factors contribute to the preference of quantum algorithms over classical algorithms?
• What are the upsides and pitfalls of using quantum computing?

B. HYPOTHESIS
In this paper, we hypothesize that the use of quantum technology and quantum algorithms for predicting protein structure and protein-protein interactions offers distinct advantages over traditional machine learning and deep learning algorithms. The unique characteristics of quantum computing, such as superposition, entanglement, and quantum parallelism, have the potential to enhance the accuracy, efficiency, and scalability of computational models in the field of bioinformatics. We anticipate uncovering substantial advantages arising from enhanced computational power, the ability to capture quantum effects, improved handling of combinatorial complexity, efficient processing of high-dimensional data, and the potential for quantum machine learning approaches.

II. BACKGROUND
Recent advances in predicting protein-protein interactions (PPI) and protein structure prediction have shown remarkable progress in the field of bioinformatics. One notable example is the application of deep learning techniques in these areas. Deep learning algorithms, particularly convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have demonstrated significant improvements in the accuracy and efficiency of PPI and protein structure prediction.
For predicting PPI, researchers have developed deep learning models that can effectively capture the complex interactions between proteins. For example, the DeepConv-DTI model proposed by Lee et al. [4] utilizes a CNN architecture to extract informative features from protein sequences and structures, which are then used to predict PPIs. The model achieved impressive performance in terms of accuracy and outperformed traditional machine learning methods.
In protein structure prediction, deep learning approaches have also shown promising results. AlphaFold, a deep learning-based method developed by Senior et al. [5], made significant breakthroughs in protein structure prediction by leveraging deep neural networks and novel training strategies. AlphaFold's ability to accurately predict the 3D structure of proteins has been recognized through its outstanding performance in the CASP (Critical Assessment of Structure Prediction) competition.
These recent advances in PPI and protein structure prediction highlight the power of deep learning techniques and the potential of quantum computing in bioinformatics. By leveraging these computational approaches, researchers are making significant strides in understanding protein interactions and improving our knowledge of protein structures, which are crucial for advancing fields such as drug discovery, personalized medicine, and understanding biological systems.
Furthermore, advancements in quantum computing and quantum algorithms have also started to make an impact on protein structure prediction and PPI analysis. Quantum-inspired algorithms, such as the Quantum Approximate Optimization Algorithm (QAOA), have been explored to tackle computationally challenging problems in protein structure prediction, such as protein folding and protein-ligand binding. While quantum computing is still in its early stages, these developments hold promise for more accurate and efficient protein structure prediction in the future [6].

III. QUANTUM INFORMATION PROCESSING
During the 20th century, a significant breakthrough in the field of information processing led to the emergence of a thriving industry known as the software industry. This industry has grown exponentially and become a cornerstone of the global economy, addressing countless problems on a daily basis. In the present era, there is ongoing research and development in the field of Quantum Information Processing (QIP), which has the potential to revolutionize the existing computing paradigm. The qubit, the quantum equivalent of a classical bit, is at the core of QIP. As Fig 2 shows, superposition allows a qubit to exist in a combination of both basis states simultaneously, in contrast to a classical bit, which can only be in state 0 or 1 [7]. Thanks to superposition, a register of qubits can represent many values concurrently, which could lead to exponentially greater computing capability in some situations. It opens up exciting possibilities for solving computational problems more efficiently, securing communication channels, and developing novel technologies that could reshape our technological landscape.
Quantum entanglement, which makes it possible for qubits to be correlated in ways that are not feasible in classical systems, is another crucial component of QIP [24]. When qubits are entangled, the measurement outcomes of one qubit are correlated with those of another no matter how far apart they are physically, although these correlations cannot be used to transmit information faster than light. This property has important ramifications for the development of quantum communication protocols.
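As a minimal sketch of the two ideas above, the snippet below stores quantum states as plain lists of amplitudes and applies the Born rule to obtain measurement probabilities. The helper name `probabilities` and the basis ordering are illustrative assumptions, not part of the original study.

```python
import math

def probabilities(state):
    """Born rule: the probability of each basis outcome is |amplitude|^2."""
    return [abs(amplitude) ** 2 for amplitude in state]

# One qubit in an equal superposition of |0> and |1>.
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Two entangled qubits in the Bell state (|00> + |11>) / sqrt(2).
# Assumed basis ordering: |00>, |01>, |10>, |11>.
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

single_qubit_probs = probabilities(plus)
bell_probs = probabilities(bell)

print(single_qubit_probs)  # both outcomes are equally likely
print(bell_probs)          # only the correlated outcomes 00 and 11 can occur
```

In the Bell state the outcomes 01 and 10 have zero probability, so the two measurement results always agree, which is the correlation that entanglement provides.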

A. QUANTUM GATES AND CIRCUITS
Quantum gates have features analogous to the logical gates used in classical computing, but they work with quantum states like qubits that can concurrently exist in various superpositions of states. Qubit transformation and quantum operation execution are made possible by these gates [5]. These gates play a vital role in performing computational tasks in quantum algorithms. They enable the creation of quantum superpositions, entanglement, and the manipulation of quantum states. Key examples of quantum gates include the Hadamard gate (H), Pauli gates (X, Y, Z), CNOT gate, and phase shift gates. Quantum circuits, on the other hand, are composed of interconnected quantum gates that carry out specific operations on qubits. Similar to classical circuits, quantum circuits provide a visual representation of the flow of quantum information and operations. Quantum circuits can be designed and optimized to execute quantum algorithms, implement quantum error correction techniques, and simulate quantum systems [25].
The combination of different quantum gates in quantum circuits allows for the implementation of complex quantum algorithms, such as Shor's algorithm for factoring large numbers or Grover's algorithm for searching unstructured databases. These algorithms leverage the unique properties of quantum gates, such as superposition and entanglement, to solve problems more efficiently than classical algorithms.
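To make the relationship between gates and circuits concrete, the following sketch represents gates as small matrices and a circuit as a sequence of matrix-vector products. It assumes a two-qubit register with the first qubit as the CNOT control; the helper names `matvec` and `kron` are illustrative.

```python
import math

def matvec(matrix, vector):
    """Apply a gate (matrix) to a state vector."""
    return [sum(row[c] * vector[c] for c in range(len(vector))) for row in matrix]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]            # Hadamard gate
I2 = [[1, 0], [0, 1]]            # identity on one qubit
CNOT = [[1, 0, 0, 0],            # control = first qubit, target = second qubit
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

state = [1, 0, 0, 0]                 # start in |00>
state = matvec(kron(H, I2), state)   # Hadamard on the first qubit: superposition
state = matvec(CNOT, state)          # CNOT entangles the two qubits

print(state)  # amplitudes of |00>, |01>, |10>, |11>
```

The two-gate circuit turns |00> into an entangled state with equal amplitudes on |00> and |11>, showing how superposition (H) and entanglement (CNOT) are combined in a circuit.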

B. QUANTUM ALGORITHMS
These gates and circuits are the building blocks of quantum algorithms, which for certain computational problems are more efficient than classical algorithms. By exploiting the special properties of quantum systems, such as superposition and entanglement, quantum algorithms can take advantage of parallelism and explore several options at once. For particular problems, such as integer factorization, this capability yields exponential speedups over the best known classical techniques, while for others, such as unstructured search, the speedup is quadratic.
One prominent example of a quantum algorithm is Shor's algorithm, which revolutionizes the field of integer factorization. Shor's algorithm utilizes the quantum Fourier transform and quantum gates to efficiently factor large numbers into their prime constituents. This algorithm offers a significant advantage over classical algorithms, which become increasingly inefficient as the numbers to be factored grow larger. Shor's algorithm has profound implications for cryptographic systems based on integer factorization, potentially rendering them vulnerable to quantum attacks. Similarly, here are a few examples of how quantum algorithms can be used.

1) MACHINE LEARNING
Quantum algorithms may improve machine learning operations. Using quantum states and quantum operations, quantum machine learning algorithms enhance tasks like pattern recognition, classification, and clustering. In some circumstances, these algorithms might deliver quicker and more precise outcomes.

2) SIMULATION
Physical systems that are difficult to model with conventional computers can be simulated by quantum algorithms. The improved ability of quantum simulators to model quantum systems, quantum chemistry, and materials science phenomena enables advances in drug discovery, material design, and the understanding of fundamental quantum phenomena.

3) THE REALM OF CRYPTOGRAPHY
A key component of quantum cryptography is quantum algorithms. Quantum key distribution (QKD) systems create cryptographic keys that can be proven to be secure against eavesdropping attempts and build secure communication channels.

4) OPTIMIZATION
Quantum algorithms have the potential to improve the solution of optimization problems in fields like scheduling, resource allocation, and logistics. For some problems they may find good solutions more quickly than conventional algorithms, which could result in cost savings and increased operational effectiveness.

C. QUANTUM SIMULATIONS
Quantum simulations have the potential to revolutionize bioinformatics by providing powerful tools to study complex biological systems and processes. As a part of Quantum Information Processing (QIP), quantum simulations offer new avenues for understanding and modeling biological phenomena that are challenging to simulate using classical computers [8].
To derive the quantum simulation process for a specific algorithm, let's consider the example of the Quantum Fourier Transform (QFT), which is a fundamental algorithm used in many quantum algorithms such as Shor's algorithm. The QFT transforms a quantum state in the computational basis to its corresponding Fourier transformed state.
The QFT on n qubits can be represented by a circuit in which H denotes the Hadamard gate and Rz(θ) denotes the rotation gate around the z-axis by an angle θ.
To derive the simulation process for the QFT algorithm, we need to define the initial state and perform the necessary calculations for each gate in the circuit:
1. Initialization: Start with an n-qubit state |ψ⟩ in the computational basis, typically represented as a tensor product of individual qubit states: $|\psi\rangle = |q_1\rangle \otimes |q_2\rangle \otimes \cdots \otimes |q_n\rangle$.
2. Apply Hadamard gates: Apply the Hadamard gate (H) to each qubit in the circuit. This is represented by multiplying the state vector by the corresponding Hadamard matrix.
3. Apply rotation gates: For each qubit $q_i$, apply the rotation gate Rz(θ), where θ is determined by the qubit index i. In the standard QFT circuit these rotations are controlled by the other qubits and have angles of the form $\theta = 2\pi/2^k$. This is represented by multiplying the state vector by the corresponding rotation matrix.
4. Repeat steps 2 and 3 for all qubits: Continue applying Hadamard and rotation gates to each qubit in the circuit.
5. Perform measurements: After applying all gates, perform measurements on the final state to obtain the desired results. The measurement outcomes are probabilistic, and the probabilities are given by the squared magnitudes of the coefficients in the state vector.
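The simulation steps above can be sketched classically for a small register. As a simplifying assumption, the snippet applies the QFT as a single dense unitary matrix rather than gate by gate; the net transformation is the same, but the function names and the choice of three qubits are illustrative only.

```python
import cmath
import math

def qft_matrix(n_qubits):
    """Dense QFT matrix: entry (r, c) is omega^(r*c) / sqrt(N), with N = 2^n."""
    dim = 2 ** n_qubits
    omega = cmath.exp(2j * cmath.pi / dim)
    return [[omega ** (r * c) / math.sqrt(dim) for c in range(dim)]
            for r in range(dim)]

def apply(matrix, state):
    """Multiply the state vector by a gate or circuit matrix."""
    return [sum(row[c] * state[c] for c in range(len(state))) for row in matrix]

n_qubits = 3
state = [0j] * (2 ** n_qubits)
state[5] = 1 + 0j                      # initialization: basis state |101>

transformed = apply(qft_matrix(n_qubits), state)

# Measurement: probabilities are squared magnitudes of the amplitudes.
probs = [abs(amplitude) ** 2 for amplitude in transformed]
print(probs)
```

For a computational basis input the QFT output has equal magnitude on every basis state, so the information about the input is carried entirely in the phases of the amplitudes.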

IV. ANALYSIS

A. PPI PREDICTION USING CONVOLUTIONAL ALGORITHMS
Several classical ML algorithms have been successfully applied to predict PPIs. Support Vector Machines (SVMs), Random Forests (RF), and Artificial Neural Networks (ANNs) are among the commonly used algorithms. These approaches demonstrate good performance in capturing the nonlinear relationships between proteins and predicting their interactions accurately. Additionally, classical ML algorithms can incorporate diverse types of biological data, including genomic, proteomic, and evolutionary information. This enables the integration of multiple data sources and the extraction of informative features that contribute to PPI prediction. The ability to handle complex data representations and capture intricate relationships between proteins has been instrumental in improving the accuracy of PPI prediction.

1) LSTM
The PPIs are predicted using the semantic similarity of gene ontology (GO) terms, which were downloaded from the Gene Ontology database. Semantic similarity measures fall into four categories: term-based, graph-based, set-based, and vector-based [9], [10]. The vector-based method represents a protein as a vector $V_i$ defined over the number of terms n(ter) in GO. The similarity between two such vectors is then assessed with a suitable metric in order to predict protein-protein interactions.
To determine how similar two proteins are, the set-based approach uses the ratio model, defined as $S(S_1, S_2) = \frac{f(S_1 \cap S_2)}{f(S_1 \cap S_2) + \alpha f(S_1 - S_2) + \beta f(S_2 - S_1)}$, where f is an additive function on sets of terms and $S_1$ and $S_2$ are the sets of terms for the two proteins. When α = β = 1, the similarity is the Jaccard coefficient of the two sets. When α = β = 0.5, the similarity is the Dice coefficient of the two sets. In the graph-based approach, each protein is assigned to a subgraph of GO, and various graph matching algorithms are used to measure the similarity between the subgraphs (eqn 3). The term-based approach uses simple combination strategies such as MAX, AVG, and BMA (best match average) to obtain the protein similarities.
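The set-based ratio model can be illustrated with a short sketch. It assumes the standard Tversky form of the ratio model with f taken to be set cardinality, under which α = β = 1 gives the Jaccard coefficient and α = β = 0.5 gives the Dice coefficient. The GO annotation sets are illustrative examples only.

```python
def ratio_model(terms_1, terms_2, alpha, beta):
    """Tversky ratio model with f = set size (an assumption for this sketch)."""
    common = len(terms_1 & terms_2)
    only_1 = len(terms_1 - terms_2)
    only_2 = len(terms_2 - terms_1)
    denominator = common + alpha * only_1 + beta * only_2
    return common / denominator if denominator else 0.0

# Illustrative GO annotation sets for two proteins.
protein_a_terms = {"GO:0005515", "GO:0006468", "GO:0004672"}
protein_b_terms = {"GO:0005515", "GO:0004672", "GO:0005524"}

jaccard = ratio_model(protein_a_terms, protein_b_terms, alpha=1.0, beta=1.0)
dice = ratio_model(protein_a_terms, protein_b_terms, alpha=0.5, beta=0.5)

print(jaccard)  # 2 shared terms out of 4 distinct terms
print(dice)     # 2 * 2 shared terms over 3 + 3 total terms
```

Because the two sets share two of their three terms, the Jaccard value is lower than the Dice value, which weights the shared terms more heavily.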
These are the traditional GO semantic similarity measures for comparing two proteins. To alleviate the deficiencies of these traditional measures, an LSTM is used to encode proteins as protein vectors. The LSTM can deal with variable-length inputs and performs competitively at encoding a sequence into semantic information. It has been used in many fields, such as translation, OCR, and various dialogue systems.
The overall PPI framework is divided into three modules: term encoding, protein encoding (protein to protein vector), and interaction prediction. In the initial setup, all the GO terms $\{t_1, t_2, \ldots, t_n\}$ are mapped to their corresponding feature vectors $\{v_1, v_2, \ldots, v_n\}$. When two proteins A and B are compared, the input I of the system consists of their two sequences of feature vectors. In the protein encoding module, these inputs are encoded by the LSTM as two protein vectors, $P_A = (P_{1A}, P_{2A}, \ldots)$ and $P_B = (P_{1B}, P_{2B}, \ldots)$, respectively.
The LSTM network uses memory blocks. Each memory block, called an LSTM unit, contains one or more cells for storing information, and gates that regulate the flow of information into and out of the memory cells. In the initial setup, the memory block contains one input gate, one output gate, and several hidden gates. The network is fed with the sequence of feature vectors of each protein and produces a sequence of hidden vectors $h_i$, which is taken as the protein vector for the given input protein. At each time step, the LSTM computes the protein vector from its gate equations, in which σ is the sigmoid activation function, $c_g$ is the cell gate vector, $h_i$ is the hidden state vector, and $f_g$, $i_g$, and $o_g$ are the activation vectors of the forget gate, input gate, and output gate, respectively. W and b denote the weights and biases, respectively, while the operator (·) represents the element-wise product of vectors. Moreover, the current state vectors are fc, ic, and oc.
After the protein sequences are converted into protein vectors, the interaction between the proteins is predicted as a probability. A neural network with four layers is used in the prediction module. It takes the two protein vectors $P_A$ and $P_B$ as input and predicts the presence or absence of an interaction (I) between the two proteins. The output layer uses a softmax function. The outputs are P(I) and 1 − P(I), the probabilities of the two outcomes.
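The protein encoding step can be sketched with a small LSTM written in plain Python. This is not the authors' implementation: it assumes the standard LSTM gate formulation, random untrained weights, and toy three-dimensional term vectors, and all function and parameter names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(weights, inputs, bias):
    """Compute W @ inputs + bias for list-based matrices and vectors."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a standard LSTM cell (assumed formulation)."""
    z = x + h_prev                                    # concatenate input and hidden state
    f = [sigmoid(v) for v in affine(p["Wf"], z, p["bf"])]    # forget gate
    i = [sigmoid(v) for v in affine(p["Wi"], z, p["bi"])]    # input gate
    o = [sigmoid(v) for v in affine(p["Wo"], z, p["bo"])]    # output gate
    g = [math.tanh(v) for v in affine(p["Wc"], z, p["bc"])]  # candidate cell values
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def encode_protein(term_vectors, p, hidden_size):
    """Feed a variable-length sequence of term vectors; return the last hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in term_vectors:
        h, c = lstm_step(x, h, c, p)
    return h

def random_params(input_size, hidden_size, seed=0):
    """Untrained random weights, for illustration only."""
    rng = random.Random(seed)
    p = {}
    for gate in "fioc":
        p["W" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                         for _ in range(hidden_size)]
        p["b" + gate] = [0.0] * hidden_size
    return p

params = random_params(input_size=3, hidden_size=2)
protein_a = [[0.1, 0.4, -0.2], [0.7, 0.0, 0.3]]                    # two GO term vectors
protein_b = [[0.5, -0.1, 0.2], [0.0, 0.9, -0.4], [0.3, 0.3, 0.3]]  # three GO term vectors
vec_a = encode_protein(protein_a, params, hidden_size=2)
vec_b = encode_protein(protein_b, params, hidden_size=2)
print(vec_a, vec_b)
```

Although the two proteins have different numbers of GO terms, both are mapped to vectors of the same fixed size, which is what allows a downstream network to compare them.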

2) SVM
Three features were extracted from protein sequences: the PSSM-derived feature, averaged cumulative hydropathy (ACH) [11], and average cumulative relative solvent accessibility (ACRSA). Previous studies have shown that the PSSM-derived feature has been widely used to predict PPIs. For a given protein sequence, its PSSM is generated using the BLAST+ search tool with an E-value cutoff for the multiple sequence alignment. The generated PSSM is then normalized to the range (0, 1) with the sigmoid function $f(y) = \frac{1}{1 + e^{-y}}$, where y is the original PSSM value. The hydropathy index measures the hydrophilicity or hydrophobicity of a residue side chain for PPI prediction, with values ranging from −2 to +2. ACH is calculated from the hydropathy indexes of the residues in a protein sequence.
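The sigmoid normalization of PSSM scores can be sketched as follows. The matrix values are made-up toy scores rather than BLAST+ output, and the function name is hypothetical.

```python
import math

def normalize_pssm(pssm):
    """Map every raw PSSM score y to 1 / (1 + exp(-y)), which lies in (0, 1)."""
    return [[1.0 / (1.0 + math.exp(-y)) for y in row] for row in pssm]

# Toy PSSM fragment: rows are sequence positions, columns are residue types.
toy_pssm = [[-3, 0, 2],
            [5, -1, 0]]

normalized = normalize_pssm(toy_pssm)
for row in normalized:
    print([round(value, 3) for value in row])
```

A raw score of 0 maps to 0.5, strongly negative scores approach 0, and strongly positive scores approach 1, so all features share a common scale before classification.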
The relative solvent accessibility of a residue is evaluated from the predicted relative solvent accessibility (PSRA) obtained using the online server SANN. SANN predicts three discrete states and continuous values for the residues of a protein. Based on the predicted values, a target residue is described by the ACRSA feature.
The support vector machine (SVM) is widely used for classification in various biological applications. Here, SVM and Random Forest (RF) are integrated to improve the overall performance, in the following four steps:
Step 1: Train an SVM model on the entire dataset (minority class and majority class). For a given input v, the trained SVM outputs a score, denoted SVM(v), that separates the minority (min) and majority (maj) classes.
Step 2: Once the model is trained, the score SVM(v) lies in the range [0, 1]. According to these scores, weights are assigned to all training samples, so that $W_{min}(v)$ and $W_{max}(v)$ are calculated, where v is a sample and SVM(v) is the score of that sample.
Step 3: Train a weighted random forest in which each tree is trained with the sample weights. Three parameters are considered when training the RF: the number of trees to grow (nT), the minimum node size to split (minLeaf), and the number of variables to select (mT).
Step 4: Once both SVM and RF are trained, the final model is obtained by combining the outputs of SVM and RF.
The following six metrics were taken to analyze the performance: Recall, Precision, specificity, Accuracy, MCC and F-measure.
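For reference, the six metrics can be computed from a binary confusion matrix as in the sketch below. The standard textbook definitions are assumed, the counts are invented for illustration, and no guard against empty classes is included.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute the six evaluation metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"recall": recall, "precision": precision, "specificity": specificity,
            "accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}

# Illustrative counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
scores = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in scores.items():
    print(name, round(value, 3))
```

MCC is included alongside accuracy because it remains informative when the interacting and non-interacting classes are imbalanced.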

3) CNN
CNNs are used for representation learning and to extract high-order features from the input information. PPI prediction methods are classified into three groups based on the information they use [12].
Sequence-based methods: These methods extract features from protein sequences to predict PPIs. Using PSSM and amino acid composition, they achieve an area under the ROC curve (AUC) of 0.729.
Structure-based methods: The 3D structure of a protein provides more information about the interaction.
Integrated information methods: Three-dimensional structures of proteins are difficult and expensive to obtain. Therefore, most methods combine structural and sequence information for PPI prediction. We extract the sequence and structural information of each sample and input them into a CNN for training. The following steps explain how the interaction of a residue pair is modeled using a CNN.
Defining the interaction of residue pairs: Based on the Euclidean distance, 12,318 interacting residue pairs and 5,522,852 non-interacting residue pairs were obtained. The EasyEnsemble algorithm is used to build training sets with equal numbers of positive and negative samples [13].
Defining the distribution tendency of residues: The abundance of each residue is calculated on the protein surface ($AP_s$) and in whole proteins ($AP_w$), and the ratio $A = AP_w / AP_s$ is used as an indicator of whether a residue tends to be inside or at the surface of proteins. If $A > 1$, the residue tends to be inside proteins. If $A < 1$, it tends to be at the surface of proteins.
Defining the binding propensity of residue pairs: The binding propensity between residues is computed from the frequencies of residue interactions, and residue pairs are divided into two classes: strong binding propensity and weak binding propensity.
Defining the features: The twenty amino acids are encoded together with the sequence features and structure features as input to the model.
Deriving the deep learning model: Each sample is represented by a pair of residues (r, i) and a corresponding label l. If l = 1, the two residues interact; otherwise l = 0. Three convolutional layers and a pooling layer are configured with a filter size of 2 × 2. The output layer uses softplus and softmax functions for activation and classification.
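The distribution-tendency rule described in the steps above reduces to a ratio test, sketched below. The abundance values are hypothetical, and the "neutral" label for a ratio of exactly 1 is an assumption, since that case is not specified in the text.

```python
def distribution_tendency(abundance_whole, abundance_surface):
    """Classify a residue type by the ratio A = AP_w / AP_s."""
    ratio = abundance_whole / abundance_surface
    if ratio > 1:
        return "interior"
    if ratio < 1:
        return "surface"
    return "neutral"  # assumption: the case A == 1 is not covered by the rule

# Hypothetical abundances (fractions of all residues) for two residue types.
leucine_tendency = distribution_tendency(abundance_whole=0.095, abundance_surface=0.060)
lysine_tendency = distribution_tendency(abundance_whole=0.058, abundance_surface=0.090)

print(leucine_tendency)
print(lysine_tendency)
```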

B. PREDICTION USING QUANTUM ALGORITHM
The native (endemic) conformation, or counterbalance, of a given input protein sequence is predicted from the protein structure. This prediction involves large databases from which strong features must be extracted, so it requires algorithms capable of handling enormous amounts of data [14]. Nevertheless, it is possible to accelerate it using quantum computational methods. While protein structure and protein-protein interaction prediction remain intractable in the classical world, quantum computational techniques can offer a solution by making use of the superposition and entanglement features discussed in Section III. It has been demonstrated that NP-complete problems of input size n can be solved by quantum algorithms in $O(2^{n/2})$ steps. Grover's search algorithm is the fastest possible quantum method for searching an unsorted database, although it provides only a quadratic speedup over classical algorithms. Grover's algorithm has been successfully applied to various problems, and variants have been reported to find the solution with zero failure rate for any input size [15].
Initially, a protein sequence consists of n amino acids $a = a_1, \ldots, a_n$, where $a_e$ is located at position $1 \le e \le n$ in the protein sequence. Based on physical experiments, the HP model is used to categorize amino acids as either hydrophobic (H) or hydrophilic (P). One qubit holds the HP status of each amino acid, where 1 denotes a hydrophobic amino acid and 0 denotes a hydrophilic amino acid. A body-centered cubic lattice is employed for the mapping of these protein sequences since it works better for HP interactions [16]. Classically, identifying the native conformation (endemic counterbalance) of a protein involves three phases. The first phase enumerates every possible conformation. In the second phase, the energy of each conformation is determined using a scoring function, and in the last phase the conformation with the lowest free energy is chosen. These are the traditional steps for determining how proteins fold and what structure they adopt.
The quantum approach to the protein structure and PPI prediction problem uses the following stages. First, the quantum system is set up in a superposition state that encodes every conformation. Second, each conformation is mapped onto the lattice and its coordinates are calculated. Third, the energy of each conformation (which corresponds to loose contacts) is computed in superposition. Finally, the superposition is given as input to Grover's algorithm, and the conformation with the lowest energy (highest number of loose contacts) is selected [17]. These steps are described in detail below.
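The three classical phases (enumerate, score, select) can be sketched by brute force for a very short HP sequence. This is an illustrative classical baseline and not the quantum procedure: it assumes the eight diagonal moves of a body-centered cubic lattice and an energy of −1 for each pair of non-consecutive hydrophobic residues on neighboring lattice sites. All names are hypothetical.

```python
from itertools import product

# The eight nearest-neighbor directions of a body-centered cubic lattice.
BCC_MOVES = [(dx, dy, dz) for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]

def encode_hp(hp_sequence):
    """One bit per residue: 1 = hydrophobic (H), 0 = hydrophilic (P)."""
    return [1 if residue == "H" else 0 for residue in hp_sequence]

def fold(moves):
    """Turn a list of lattice moves into residue coordinates."""
    positions = [(0, 0, 0)]
    for dx, dy, dz in moves:
        x, y, z = positions[-1]
        positions.append((x + dx, y + dy, z + dz))
    return positions

def energy(bits, positions):
    """Return -(number of H-H contacts), or None if the chain overlaps itself."""
    if len(set(positions)) < len(positions):
        return None
    contacts = 0
    for i in range(len(bits)):
        for j in range(i + 2, len(bits)):          # skip chain neighbors
            if bits[i] and bits[j]:
                diff = [abs(a - b) for a, b in zip(positions[i], positions[j])]
                if diff == [1, 1, 1]:              # adjacent lattice sites
                    contacts += 1
    return -contacts

def native_conformation(hp_sequence):
    """Phases 1-3: enumerate all conformations, score them, keep the lowest energy."""
    bits = encode_hp(hp_sequence)
    best = None
    for moves in product(BCC_MOVES, repeat=len(bits) - 1):
        e = energy(bits, fold(moves))
        if e is not None and (best is None or e < best[0]):
            best = (e, moves)
    return best

best_energy, best_moves = native_conformation("HPPH")
print(encode_hp("HPPH"), best_energy, best_moves)
```

The number of candidate conformations grows as 8^(n−1) with sequence length n, which is the combinatorial explosion that motivates searching the space with a quantum algorithm instead.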

1) PREPARING A UNIFORM SUPERPOSITION STATE
A quantum register stores the amino acid sequence of length n. Each qubit is set to $a_e = 0$ if the amino acid is hydrophilic and $a_e = 1$ if it is hydrophobic. The sequence is mapped onto a body-centered cubic lattice, as shown in the corresponding figure. Assume that the state of the quantum system is a vector |ψ⟩ that results from the superposition of other states |ϕ⟩ and |ξ⟩. Superposition has several properties. State superposition is symmetric:
$|\psi\rangle = a|\phi\rangle + b|\xi\rangle = b|\xi\rangle + a|\phi\rangle$ (16)
Each state in the superposition can itself be expressed as a superposition of other states.

3) CALCULATION OF ENERGY VALUES
The energy of a quantum is determined by its frequency and is given by $E = h\nu$, where h is Planck's constant and ν is the frequency.

4) IDENTIFICATION OF COUNTERBALANCE WITH THE MINIMAL ENERGY
A search algorithm is needed to identify the endemic counterbalance once the number of hydrophobic and hydrophilic lattice contacts for each counterbalance in superposition has been determined. Grover's algorithm is best search tool in quantum computing [19] so the quantum superposition state is given as input to the Grover's method. After 'n' many iterations, the lowest minimal energy from the superposition state is identified to predict the PPI. The prediction of three-dimensional structure of a protein from its sequence of amino acids is called as protein folding problem. Although classical methods provide solutions to the problem but they cannot tackle NP-hard problems so, quantum algorithms have been successfully used to accelerate energy optimization in discomfited systems [21]. The following steps described in Fig 4 illustrates the process of predicting protein structure, The Qubits are configured one after the other, assign one qubit per axis and to find the total number of qubits requires to encode the counterbalance/conformation (q cf ).
To define the qubit interactions (q_in), a new qubit register (q_r) is introduced on the lattice. The qubit Hamiltonian, which describes the energy of a fold given by the sequence and turns, is then defined on the qubits q = {q_cf, q_in}.
To determine the interaction energy terms, an energy contribution ε_ij is added to the qubit Hamiltonian for each pair (i, j).
To solve the protein folding problem, a variational circuit is prepared with configurational and interaction registers, together with qubit rotations and a set of angles. The set of angles is denoted θ = (θ_cf, θ_in) and has size 3n, where n = n_cf + n_cf + n_cf.
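The contact-counting idea behind the energy terms can be sketched classically. The Python example below is a minimal sketch, not the authors' implementation: it assumes a hydrophobic-polar model on a body-centered cubic lattice, encodes each turn as an index into the eight nearest-neighbour directions, and assigns a hypothetical energy epsilon = -1 to every non-bonded hydrophobic contact. All function names are illustrative.

```python
from itertools import product

# The 8 nearest-neighbour directions of a body-centred cubic lattice.
BCC_DIRECTIONS = list(product((-1, 1), repeat=3))


def walk(turns):
    """Return lattice coordinates of a chain that follows the given turn indices."""
    position = (0, 0, 0)
    coords = [position]
    for turn in turns:
        dx, dy, dz = BCC_DIRECTIONS[turn]
        position = (position[0] + dx, position[1] + dy, position[2] + dz)
        coords.append(position)
    return coords


def contact_energy(hp_bits, turns, epsilon=-1.0):
    """Energy of one conformation, or None if the chain overlaps itself."""
    coords = walk(turns)
    if len(coords) != len(hp_bits):
        raise ValueError("need exactly len(hp_bits) - 1 turns")
    if len(set(coords)) != len(coords):
        return None  # not self-avoiding, so not a valid conformation
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip chain-bonded neighbours
            if hp_bits[i] == 1 and hp_bits[j] == 1:
                delta = tuple(abs(a - b) for a, b in zip(coords[i], coords[j]))
                if delta == (1, 1, 1):  # lattice nearest neighbours
                    energy += epsilon
    return energy


def lowest_energy(hp_bits):
    """Brute-force scan over all turn sequences; feasible only for short chains."""
    best = None
    for turns in product(range(8), repeat=len(hp_bits) - 1):
        energy = contact_energy(hp_bits, turns)
        if energy is not None and (best is None or energy < best[0]):
            best = (energy, turns)
    return best


if __name__ == "__main__":
    print(lowest_energy([1, 0, 0, 1]))
```

The brute-force scan visits 8^(n-1) turn sequences for a chain of length n, which is the exponential search space that a quantum search or variational approach is meant to explore more efficiently.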

V. VERIFICATION
Grover's algorithm is a quantum algorithm designed to search an unsorted database or search space of size N for a specific target item with a quadratic speedup over classical algorithms. It achieves this by exploiting the principles of quantum superposition and interference [20]. The proposed methodology predicts the protein structure and interaction interface by using Grover's algorithm to identify the lowest-energy state from the superposition over a very large search space. The number of iterations required varies with factors such as the success probability of Grover's algorithm and the structure of the search space [31]. For this application, the length of the amino acid sequence also affects the number of iterations. For a single correct item, the optimal number of iterations of Grover's algorithm is approximately
(π/4)√n
where n is the number of conformation states in the search space. When several items are correct, n is replaced by n/m, the ratio between the total number of items and the number of correct items. This ratio influences the number of iterations required in Grover's algorithm to maximize the success probability. The formula becomes
(π/4)√(n/m)
where n is the number of conformation states and m is the number of correct solutions marked by the oracle. In the above algorithm:
1. The initial state is a superposition of all possible states, including both correct and incorrect solutions.
2. Each iteration consists of applying the Grover operator, which involves two reflections: the inversion about the average amplitude and the oracle reflection (phase inversion of the correct solution).
3. The Grover operator amplifies the amplitude of the correct solution while diminishing the amplitudes of the incorrect ones.
After k iterations, the state is transformed as follows:
|ψ_k⟩ = (U_s U_o)^k |ψ_0⟩
Here, |ψ_0⟩ is the initial superposition, U_s is the Grover diffusion operator (inversion about the average), and U_o is the oracle operator.
After k iterations, the probability P_k of measuring a correct solution is the total squared amplitude of the correct states in |ψ_k⟩. Using trigonometric identities and properties of quantum operators, we arrive at:
P_k = sin²((2k + 1)θ), with θ = arcsin(√(m/n)) (21)
where m is the number of correct solutions, n is the total number of possible solutions, and k is the number of iterations. This formula reflects the periodic nature of Grover's algorithm: the success probability oscillates as a function of k, with the highest probability occurring when (2k + 1)θ in (21) is closest to an odd multiple of π/2. The above derivation thus quantifies the probability of obtaining a correct solution after k iterations of Grover's algorithm. It shows the algorithm's ability to iteratively amplify the probability of the correct solution through quantum interference and phase inversions. Table 2 shows the probability of finding the interaction interface and structure of a protein with the algorithm explained above. The probability varies mainly with the length of the amino acid sequence and the number of conformation states.
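The iteration count and success probability discussed above can be checked numerically. The Python sketch below uses the standard closed form P_k = sin²((2k + 1)θ) with θ = arcsin(√(m/n)) and compares it with a direct amplitude simulation of the oracle and diffusion reflections. The function names are hypothetical, and the simulation is a classical toy model for small n, not the system evaluated in this paper.

```python
import math


def optimal_iterations(n: int, m: int = 1) -> int:
    """Iteration count floor(pi / (4 * theta)), which is close to (pi/4) * sqrt(n/m)."""
    theta = math.asin(math.sqrt(m / n))
    return math.floor(math.pi / (4 * theta))


def success_probability(n: int, m: int, k: int) -> float:
    """Closed-form probability of measuring a marked item after k iterations."""
    theta = math.asin(math.sqrt(m / n))
    return math.sin((2 * k + 1) * theta) ** 2


def simulate_grover(n: int, marked: set, k: int) -> float:
    """Track real amplitudes through k Grover iterations and return P(marked)."""
    amplitudes = [1 / math.sqrt(n)] * n  # uniform initial superposition
    for _ in range(k):
        # Oracle reflection: flip the sign of every marked amplitude.
        amplitudes = [-a if i in marked else a for i, a in enumerate(amplitudes)]
        # Diffusion: inversion about the average amplitude.
        mean = sum(amplitudes) / n
        amplitudes = [2 * mean - a for a in amplitudes]
    return sum(amplitudes[i] ** 2 for i in marked)


if __name__ == "__main__":
    n, marked = 64, {5}
    k = optimal_iterations(n, len(marked))
    print(k, success_probability(n, len(marked), k), simulate_grover(n, marked, k))
```

Running more iterations than the optimum lowers the success probability again, which is the oscillation described in the text.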
Unlike classical search algorithms, which require linear time O(n) to search for a specific item in an unsorted database, Grover's algorithm offers a quadratic speedup, scaling as O(√n) [15]. This quadratic speedup in searching unsorted databases gives it an increasing advantage as n grows larger [27]. The maximum accuracy achieved by the proposed system with Grover's algorithm is 93.4%, and it varies with the length of the amino acid sequence. Table 3 shows the accuracy and time complexity of the classical algorithms alongside Grover's algorithm; Grover's algorithm outperforms all of them.
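To make the O(n) versus O(√n) scaling concrete, the following Python sketch compares the worst-case number of classical checks with the approximate number of Grover iterations for a few search-space sizes. Both helper functions are illustrative assumptions; they count oracle calls only and ignore hardware overheads such as error correction.

```python
import math


def classical_worst_case_queries(n: int, m: int = 1) -> int:
    """Checks needed in the worst case before hitting one of m marked items."""
    return n - m + 1


def grover_iterations(n: int, m: int = 1) -> int:
    """Approximate Grover iteration count, floor((pi/4) * sqrt(n/m))."""
    return math.floor(math.pi / 4 * math.sqrt(n / m))


if __name__ == "__main__":
    for qubits in (10, 20, 30):
        n = 2 ** qubits
        print(qubits, classical_worst_case_queries(n), grover_iterations(n))
```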

VI. CONCLUSION
Quantum algorithms, such as those based on quantum machine learning and quantum simulation, have shown remarkable potential in unraveling the complex and intricate nature of protein structures and interactions. They leverage the unique properties of quantum systems, such as superposition and entanglement, to explore vast search spaces more efficiently and effectively [18]. The ability of quantum algorithms to process and analyze large-scale data sets has led to enhanced predictive accuracy and a deeper understanding of protein behavior. Furthermore, quantum algorithms offer the potential for significant computational speedup, which can greatly expedite the prediction of protein structures and interactions. This acceleration holds tremendous promise for various applications in drug discovery, personalized medicine, and biotechnology, where accurate and rapid predictions are crucial [26].
However, it is important to acknowledge the current limitations of quantum technology. Quantum hardware faces challenges related to decoherence, limited qubit connectivity, and high error rates [22]. These limitations impact the scalability and reliability of quantum algorithms, which hinders their widespread adoption for large-scale protein analysis. Additionally, the development of quantum algorithms and their implementation on quantum hardware requires expertise and resources that are currently limited [23].
As quantum technology continues to advance and quantum hardware improves, these limitations are being addressed. Ongoing research and development efforts aim to overcome these challenges and harness the full potential of quantum algorithms for protein structure prediction and protein-protein interaction analysis. In conclusion, while quantum algorithms hold immense promise in the field of bioinformatics, particularly in predicting protein structure and interaction interface, their practical deployment is still evolving. As advancements in quantum technology and algorithm design continue, it is anticipated that the limitations of quantum computing will be overcome, unlocking even greater accuracy and efficiency in understanding the intricate world of proteins.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest to report regarding the present study.

DATA AVAILABILITY
Data and code are available from the corresponding author on request.
S. BHUVANESWARI received the bachelor's and master's degrees in computer science from Anna University, Chennai, India, in 2016 and 2018, respectively, where she is currently pursuing the Ph.D. degree. She is an accomplished academician. She is also an Assistant Professor with the Department of Computer Science and Engineering, Easwari Engineering College, Chennai. She has dedicated herself to the field of computer science, with a passion for advancing knowledge and exploring the depths of technology. She has gained a strong foundation in the fundamental principles of computer science and honed her skills in various domains. Throughout her career, she has actively participated in numerous international conferences, enriching her understanding of the global landscape of computer science. She has published nearly 14 scholarly documents, which include seven Scopus-indexed papers. One of the remarkable milestones in her research journey is the successful publication of a patent. Her research interests include wireless sensor networks, image processing, machine learning, and natural language processing. With her diverse expertise, she showcases remarkable versatility, enabling her to explore and contribute to these multifaceted fields with finesse.
R. DEEPAKRAJ is currently pursuing the bachelor's degree in computer science and engineering with the Easwari Engineering College, affiliated with Anna University, Chennai, India. Driven by a fervent passion for machine learning, he approaches his academic journey with enthusiasm and a keen dedication to his projects. As a testament to his commitment, he has had the privilege of attending several international conferences, where he has gained valuable insights and exposure to cutting-edge advancements in his field. Not content with merely attending conferences, he has also made significant contributions to the research community through his publications. A notable achievement in his research journey is the publication of a patent. This accomplishment reflects his innovative thinking and ability to generate original ideas in his chosen domain. His research interests include quantum computing, machine learning, and data science. With a focus on quantum computing, he dives into the realm of quantum algorithms and explores the potential of this emerging field. Additionally, his expertise in machine learning and data science allows him to tackle complex problems, analyze vast datasets, and uncover meaningful patterns and insights.
SHABANA UROOJ (Senior Member, IEEE) received the B.E. degree in electrical engineering and the M.Tech. degree in electrical engineering (instrumentation and control) from Aligarh Muslim University, Aligarh, Uttar Pradesh, India, in 1998 and 2003, respectively, and the Ph.D. degree in electrical engineering from the Department of Electrical Engineering, Jamia Millia Islamia (a Central University), Delhi, India. She was with industry for three years and teaching organizations for more than 20 years. She has authored or coauthored more than 250 research articles, which are published in high-class international journals, reputed conference proceedings, quality books, and patents. She is serving as an Active Volunteer for the Institute of Electrical and Electronics Engineers (IEEE) in various capacities. She was a recipient of the Springer's Excellence in Teaching and Research Award, the American Ceramic Society's Young Professional Award, the IEEE's Region 10 Award for Outstanding Contribution in Educational Activities, the Research Excellence Award for Quality Publishing/Authorship, and several other best paper presentation awards. She is the Chairperson of the Education Society Chapter, IEEE Saudi Arabia Section. She has served with the IEEE Delhi Section, India, in various positions for more than a decade. She has completed several editorial responsibilities for reputed journals and several quality books and proceedings. She is an associate editor of reputed journals.
NEELAM SHARMA received the B.Tech. degree in information technology (IT) from Guru Gobind Singh Indraprastha University (GGSIPU), Government of NCT of Delhi, New Delhi, the M.Tech. degree in IT from the University School of Information, Communication and Technology (USICT), GGSIPU, and the Ph.D. degree in computer science and engineering from Uttarakhand Technical University, Dehradun. She is an Assistant Professor (senior scale) with the Maharaja Agrasen Institute of Technology (MAIT), New Delhi, affiliated to GGSIPU. She has more than 18 years of extensive teaching experience at graduate level. She has published various patents, more than 75 research articles in reputed international journals (SCI/SCIE/ESCI/Scopus-indexed) and conference proceedings (IEEE and Springer). Her major research interests include wireless sensor networks, wireless body area networks, ad hoc networks, mobile communications, the IoT, advanced computer networks, information security, computer graphics and multimedia technology, database management systems and technology, and innovation management.
She is a Life Member of CSI and ISTE. She was a recipient of the Best Research Paper Publication Award, in 2017, 2019, and 2020; and the Best Research Paper Publication Award, in 2021 on Teacher's Day from MAIT. She has attended and conducted various seminars, conferences, workshops, and FDP. She has been proactively involved with professional associations. She is a reviewer of various international journals.
NITISH PATHAK received the master's degree in computer science and engineering from Dr. A. P. J. Abdul Kalam Technical University, Lucknow, in 2010, and the Ph.D. degree in computer science and engineering from Uttarakhand Technical University, Dehradun, in 2017. He is an alumnus of Dr. A. P. J. Abdul Kalam Technical University.
Currently, he is an Associate Professor with the Department of Information Technology, Bhagwan Parshuram Institute of Technology (BPIT), Guru Gobind Singh Indraprastha University (GGSIPU), Delhi, India. He has more than 18 years of experience in engineering, education, corporate, and research. His original results have been published in more than 90 articles in international journals, proceedings of international conferences, patents, and book chapters. His research interests are intelligent computing techniques, empirical software engineering, trusted operating systems, cloud computing, WAN, the IoT, and artificial intelligence.
Dr. Pathak is a Life Member of the Computer Society of India (CSI) and the Indian Society for Technical Education (ISTE). He was a recipient of the prestigious Directorate Award for best teaching performance. He has been associated with various conferences as a session chair, a reviewer, and a conference advisory committee member. He has been proactively involved with professional associations. He is a guest editor and a lead guest editor of various SCI and other reputed journals of Elsevier, Springer, Wiley, and MDPI.