Many-Body Effects-Based Invertible Logic With a Simple Energy Landscape and High Accuracy

Inspired by many-body effects, we propose a novel design for Boltzmann machine (BM)-based invertible logic (IL) using probabilistic bits (p-bits). A CMOS-based XNOR gate is derived to serve as the hardware implementation of many-body interactions, and an IL family is built on this design. Compared with the conventional two-body-based design framework, the many-body-based design enables a compact configuration and provides the simplest binarized energy landscape for fundamental IL gates; furthermore, we demonstrate the composability of the many-body-based IL circuit by merging modular building blocks into large-scale integer factorizers (IFs). To optimize the energy landscape of large-scale combinatorial IL circuits, we introduce degeneracy in the energy levels, which raises the probability of the lowest-energy states. Circuit simulations of our IFs reveal a significant boost in factorization accuracy. For example, a 2-× 2-bit IF shows an increase in factorization accuracy from 64.99% to 91.44%, with the number of energy levels reduced from 32 to 9. Similarly, our 6-× 6-bit IF increases the accuracy from 4.430% to 88.85% with the many-body design. Overall, the many-body-based design scheme provides promising results for future IL circuit designs.


I. INTRODUCTION
In recent years, there has been a noticeable upswing in the exploration of invertible logic (IL) [1], [2], [3], [4] as an efficient computational model that can operate bidirectionally. A diverse range of hard computational problems, including integer factorization [5], [6], [7], which serves as the cornerstone of modern encryption algorithms [8], Boolean satisfiability [3], [9], and the training of neural networks [10], [11], [12], are naturally suited to being solved with the reverse operation mode of a well-designed IL; moreover, a single IL circuit can integrate multiple logic operations. For example, an invertible multiplier/adder circuit can separately function as a multiplier/adder, a divider/subtractor, and an integer factorizer (IF)/sum factorizer by operating in the forward, partially forward, and reverse modes, respectively. This feature has the potential to greatly reduce the hardware cost of certain arithmetic tasks.
IL is an energy-based computational model that facilitates bidirectional computing by embedding all solutions that match the truth table into the system's ground state. The configuration of IL is designed on the bidirectional connectivity of the Boltzmann machine (BM) model [13], with each node implemented by a probabilistic-bit (p-bit) device [1]. Currently, most of the proposed designs for IL rely on two-body interactions [1], [2], [14], which correspond to pairwise interactions between nodes of the BM. Even though the two-body-based design is intuitive and has the potential to provide a simple structure for BM-based IL, it still presents some critical issues that need to be addressed. It is widely recognized that using only pairwise interactions to describe a system results in an incomplete characterization of the system's energy function. For example, some fundamental IL gates, such as the invertible XOR gate (IXOR), cannot be achieved with only two-body interactions. Even though a well-designed model based on two-body effects, including invertible AND gates (IANDs), invertible half adders (IHAs), and invertible full adders (IFAs), can map correct solutions to the ground state, it fails to treat the wrong solutions uniformly, resulting in multiple discrete energy levels above the ground state [15], [16]. This leads to unreasonable "half wrong" or "more wrong" states in addition to "right" and "wrong" solutions; furthermore, combinatorial IL circuits composed of various fundamental IL gates, such as IFs, have a much more complicated energy landscape, which compromises factorization performance. These extra energy levels further narrow the energy difference between the ground state and the first excited energy level in the IFs, severely degrading the factorization accuracy. Modern designs use various annealing techniques, such as simulated annealing [17], [18] and parallel annealing [9], to drive the system toward the ground state, but these approaches come with additional algorithmic costs. Moreover, finding the optimal annealing schedule and selecting appropriate parameters for these algorithms can be challenging and time-consuming.
In this work, we expand the dimension of interactions from pairwise to many-body to address the above problems. We present a novel design for IL circuits based on many-body interactions with p-bit implementation. Our theoretical calculations of typical fundamental IL gates demonstrate the superiority of many-body effects in expressing the energy function of the IL system; furthermore, the many-body-based system allows the energy levels to be binarized into a highly degenerate energy landscape. The logic synthesis method is used to merge fundamental IL gates into larger combinatorial IL circuits, such as IFs. The proposed many-body-based design has great potential for 1) simplifying the system's energy landscape by introducing energy level degeneracy [19] and 2) enhancing the factorization accuracy by enlarging the energy difference between the ground state and other local energy minima.
The remainder of this article is organized as follows. Section II briefly reviews related work on the use of many-body effects in logic design and on hardware implementations of many-body interactions. Section III introduces the fundamentals of BM-based IL, including two-body-based and many-body-based designs. A comprehensive configuration library of many-body-based fundamental IL gates and small combinatorial IL circuits is also developed in this section. Section IV presents the hardware implementation of the many-body-based IL, including the p-bit device and the derivation of the electronic elements that implement many-body interactions. Section V presents the circuit simulation results, ranging from the simplest IAND to larger logically synthesized IFs. The underlying reasons for the improvement in factorization accuracy under the many-body-based design are also analyzed in this section. Finally, Section VI concludes the article.

II. RELATED WORK
To address the limitations of pairwise interactions, many-body interactions have been proposed as a promising solution. Ground spin logic models [19] have shown that many-body interactions can induce energy degeneracy for both valid and invalid states through the design of the energy function, providing a theoretical foundation for implementing the scheme in IL circuits. In hardware, it has been demonstrated that only two inductive couplers and N ancilla qubits are needed to implement effective N-body interactions in quantum systems [20], alleviating the challenge of encoding optimization problems on physical quantum annealing devices. A probabilistic computer [5] was the first to leverage many-body effects among p-bits, with the interactions carried out by peripheral microcontrollers; however, this IF circuit is customized for a factorization problem of a specific size and cannot be logically synthesized from fundamental IL gates. Other CMOS-based probabilistic IL circuits [21] have explored the effect of three-body interactions in a simplified energy landscape, which accelerates the convergence of invertible adders. Nevertheless, the hardware overhead of this implementation is high because it requires linear feedback shift registers [22] or xorshift random number generators [2] to generate the stochastic bitstreams.

III. MANY-BODY-BASED IL
A. BM-BASED MODEL
The physical mechanism underlying IL is rooted in the Boltzmann Law, where the configuration of the BM determines the corresponding IL system. An example configuration of an $N$-node IL based on many-body interactions is shown in Fig. 1(a), characterized by four interaction terms: 1) a local bias term $h$ on each node; 2) pairwise interactions $J$ between pairs of nodes; 3) three-body interactions $K$ among nodes $s_i$, $s_j$, and $s_k$; and 4) four-body interactions $L$ among nodes $s_i$, $s_j$, $s_k$, and $s_l$. The energy of the general many-body-based system is defined as
$$E(\{s\}) = -\Big(\sum_{i} h_i s_i + \sum_{i<j} J_{ij} s_i s_j + \sum_{i<j<k} K_{ijk} s_i s_j s_k + \sum_{i<j<k<l} L_{ijkl} s_i s_j s_k s_l\Big) \tag{1}$$
where $s$ denotes the bipolar values, i.e., $s \in \{+1, -1\}$.
On the other hand, the two-body-based system depicted in Fig. 1(b) is limited by the dimension of interactions, and only the first two terms are used to define the system energy
$$E(\{s\}) = -\Big(\sum_{i} h_i s_i + \sum_{i<j} J_{ij} s_i s_j\Big). \tag{2}$$
Once the system's configuration, i.e., the interconnection relationship, is established, the energy of the system depends solely on the state of the nodes $\{s\} = [s_1, \ldots, s_i, s_j, s_k, \ldots, s_N]$, and the steady-state probability of each state configuration is described by the Boltzmann Law
$$P(\{s\}) = \frac{\exp\big(-E(\{s\})/T\big)}{\sum_{\{s'\}} \exp\big(-E(\{s'\})/T\big)} \tag{3}$$
where $T$ represents a pseudo-temperature parameter indicating the stochasticity of the system in the context of BM-based IL. Embedding the solution into the ground state is a necessary step in solving integer factorization, as the states with the lowest energy are emphasized during temporal evolution. As a result, an appropriate design of the interactions among p-bits is crucial.
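The energy function and the Boltzmann Law above can be illustrated with a minimal Python sketch. It assumes the sign convention $E = -(\ldots)$ and stores every interaction as a mapping from a tuple of node indices to a coefficient; the function names and the three-node example (a single three-body term with strength −1) are hypothetical and are not taken from the figures of this article.

```python
from itertools import product
from math import exp, prod

def energy(state, terms):
    """Many-body energy: E = -(sum over terms of coeff * product of spins)."""
    return -sum(coeff * prod(state[i] for i in nodes)
                for nodes, coeff in terms.items())

def boltzmann(terms, n_nodes, temperature=1.0):
    """Return {state: probability} over all 2**n_nodes bipolar states."""
    states = list(product((-1, +1), repeat=n_nodes))
    weights = [exp(-energy(s, terms) / temperature) for s in states]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(states, weights)}

# Hypothetical example: one three-body term K_012 = -1, so E = +s0*s1*s2.
terms = {(0, 1, 2): -1.0}
probs = boltzmann(terms, n_nodes=3)
ground = [s for s in probs if energy(s, terms) == -1.0]
```

In this toy system the four ground states are exactly those with $s_2 = -s_0 s_1$, which in binary form is the XOR truth table; a single three-body term is therefore enough to split the eight states into two degenerate levels.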

B. FUNDAMENTAL IL GATES
Fundamental IL gates, such as IANDs, invertible OR gates (IORs), IXORs, IHAs, and IFAs, are used as the building blocks for larger combinatorial IL circuits. Linear programming (LP) is employed to determine the interaction configurations of these gates, as it can provide a compact architecture design [14], [23]. The LP method can eliminate the need for auxiliary nodes and minimize the number of nodes, reducing the hardware overhead. For instance, the eight energy states of the three-node IOR gate under the many-body-based and two-body-based designs are determined by the configuration parameter sets $\{h_A, h_B, h_C, J_{AB}, J_{AC}, J_{BC}, K_{ABC}\}$ and $\{h_A, h_B, h_C, J_{AB}, J_{AC}, J_{BC}\}$, respectively. For either design, the energies of the four correct states $\{E_{000}, E_{011}, E_{101}, E_{111}\}$ should be mapped to the ground state $E_{\min}$, whereas the energies of the other four undesirable states $\{E_{001}, E_{010}, E_{100}, E_{110}\}$ should all be greater than $E_{\min}$. The second lowest energy of the system is defined as $E'_{\min}$, and the energy gap between $E_{\min}$ and $E'_{\min}$ is denoted as $E_g$. By maximizing $E_g$ with LP tools, such as the PuLP toolkit for Python [24], the MATLAB LP solver [25] used in our work, or other commercial LP solvers [26], the configuration parameters of the IOR gate under many-body and two-body interactions can be solved. Fig. 2(a) depicts the graphic model and the energy landscape of the many-body-based IOR. The energy levels of the states are binarized, with correct states mapping to $E_{\min}$ at −2 and undesirable states mapping to the single higher energy level at +2. The number of energy levels ($N_{EL}$) reflects the complexity of the energy landscape. Here, $N_{EL}$ is 2 for the many-body-based IOR with the simplest binarized landscape, whereas the two-body-based design shown in Fig. 2(b) has three discrete energy levels at +9, +1, and −3, i.e., $N_{EL} = 3$. Other many-body-based three-node gates, such as the IAND and IXOR illustrated in Fig. 2(c) and (d), can likewise be solved using LP, but a two-body-based design cannot configure the IXOR without an auxiliary node [2]. This conveys the versatility and inclusivity of the many-body-based design in describing the energy function of the system. The IHA and IFA are representative examples of four-node and five-node fundamental IL gates, respectively. A possible design for the IHA involving two-, three-, and four-body interactions is shown in Fig. 3(a). The correct and undesirable states are mapped to −6 and +2, respectively, in this binarized energy landscape. An alternative IHA design incorporating only two- and three-body interactions, shown in Fig. 3(b), has one additional energy level compared with the first design. This disadvantage in the complexity of its energy landscape could be compensated by its simpler circuit implementation, as it has only one branch of three-body interaction. An appropriate design should be picked on demand when creating a combinatorial IL circuit. The formation of the IFA based on two- and four-body effects is further illustrated in Fig. 3(c), which exhibits energy binarization with two distinct energy levels at −6 and +2. To objectively compare the performance of two-body-based and many-body-based designs, we set the minimum absolute interaction strength for all fundamental IL gates to 1. The key energy metrics summarized in Table 1 indicate that many-body-based designs can enlarge the energy gap (as demonstrated by the IHA and IFA) and introduce degenerate energy levels.
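The binarization described above can be checked by brute-force enumeration. The Python sketch below uses coefficient sets that are not copied from Fig. 2; they are derived here from the bipolar identities $s_{OR} = (1 + s_A + s_B - s_A s_B)/2$ and $s_{AND} = (-1 + s_A + s_B + s_A s_B)/2$ under the assumed convention $E = -(\ldots)$, and they happen to reproduce the −2/+2 levels quoted for the IOR. Treat them as one possible solution rather than the LP output reported in this article.

```python
from itertools import product
from math import prod

def energy(state, terms):
    """E = -(sum of coeff * product of the bipolar spins in each term)."""
    return -sum(c * prod(state[n] for n in nodes) for nodes, c in terms.items())

# Assumed parameter sets (bias on C, two pairwise terms, one three-body term).
IOR = {("C",): 1, ("A", "C"): 1, ("B", "C"): 1, ("A", "B", "C"): -1}
IAND = {("C",): -1, ("A", "C"): 1, ("B", "C"): 1, ("A", "B", "C"): 1}

def landscape(terms, truth):
    """Return the sets of energies taken by valid and by invalid states."""
    valid, invalid = set(), set()
    for a, b, c in product((0, 1), repeat=3):
        s = {"A": 2 * a - 1, "B": 2 * b - 1, "C": 2 * c - 1}  # s = 2v - 1
        (valid if truth(a, b) == c else invalid).add(energy(s, terms))
    return valid, invalid

ior_valid, ior_invalid = landscape(IOR, lambda a, b: a | b)
iand_valid, iand_invalid = landscape(IAND, lambda a, b: a & b)

# Energy gap and number of energy levels for the IOR under these parameters.
ior_gap = min(ior_invalid) - min(ior_valid)
ior_n_levels = len(ior_valid | ior_invalid)
```

With these parameters every valid state sits at −2 and every invalid state at +2, so $E_g = 4$ and $N_{EL} = 2$ for both gates, and the smallest absolute interaction strength is 1.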

C. COMBINATORIAL IL
IL circuits share the same composability feature as VLSI, enabling large-scale circuit design through the logic synthesis of fundamental IL gates [6], [27]. Fig. 4(a) shows the logic schematic of a many-body-based combinatorial IL merged from fundamental IAND and IOR gates with a total consumption of 5 p-bits, in which the two gates share a common node $m_3$, as shown in Fig. 4(b). Fig. 4(c) further uses a matrix representation to illustrate the merging process. Two [3 × 1] $h$ matrices and two [3 × 3] $J$ matrices are merged into a [5 × 1] $h$ matrix and a [5 × 5] $J$ matrix, respectively. For the higher-order interactions, there are two branches of three-body interactions, among nodes $\{m_1, m_2, m_3\}$ with strength +1 and $\{m_4, m_5, m_6\}$ with strength −1. Fig. 4(d) shows alternative designs of the IHA and IFA created through logic synthesis.
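The merging step can be sketched in Python as a relabel-and-sum operation on the interaction terms. The node numbering (IAND on nodes 1, 2, 3 and IOR on nodes 3, 4, 5, with node 3 shared), the helper names `place` and `merge`, and the gate coefficients are assumptions made for illustration; the coefficients are derived from bipolar AND/OR identities rather than read from Fig. 4.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Assumed gate parameters under E = -(...); not copied from the figures.
IAND = {("C",): -1, ("A", "C"): 1, ("B", "C"): 1, ("A", "B", "C"): 1}
IOR = {("C",): 1, ("A", "C"): 1, ("B", "C"): 1, ("A", "B", "C"): -1}

def place(gate, mapping):
    """Relabel a gate's local nodes (A, B, C) onto global node indices."""
    return {tuple(mapping[n] for n in nodes): c for nodes, c in gate.items()}

def merge(*gates):
    """Sum the coefficients of identical interaction terms across gates."""
    merged = defaultdict(int)
    for gate in gates:
        for nodes, c in gate.items():
            merged[tuple(sorted(nodes))] += c
    return dict(merged)

def energy(state, terms):
    return -sum(c * prod(state[n] for n in nodes) for nodes, c in terms.items())

# AND output (node 3) feeds the OR gate, so the two gates share node 3.
circuit = merge(place(IAND, {"A": 1, "B": 2, "C": 3}),
                place(IOR, {"A": 3, "B": 4, "C": 5}))

energies = {}
for bits in product((0, 1), repeat=5):
    state = {i + 1: 2 * b - 1 for i, b in enumerate(bits)}
    energies[bits] = energy(state, circuit)

e_min = min(energies.values())
ground = [bits for bits, e in energies.items() if e == e_min]
levels = sorted(set(energies.values()))
```

Under these assumptions the merged five-node circuit has eight ground states, one per input combination, and each of them satisfies both truth tables simultaneously.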
To validate the functionality of larger logically synthesized IL circuits, we develop a 2-× 2-bit invertible multiplier/IF, whose logic diagram is shown in Fig. 4(e). The circuit uses four two- and three-body-based IANDs together with two two-, three-, and four-body-based IHAs, consuming 12 nodes, as illustrated in Fig. 4(f). Given the IF's modest size, we enumerate the energies corresponding to all $2^{12}$ states of its 12 nodes. The key energy metrics are summarized in Table 2. Compared with the two-body-based design, the many-body-based design increases $E_g$ from 2 to 4. In addition, the system's energy landscape is greatly simplified, with $N_{EL}$ reduced from 32 to 9, as the energy levels become degenerate. Note that the application of many-body effects can increase the connectivity complexity as the circuit scales. The sparsity technique [3] is a potential solution that can help handle high-order interactions efficiently while maintaining an appropriate connection complexity, but it is beyond the scope of this article. Here, the objective of logic synthesis is to use small-scale many-body-based gates so that the connection complexity remains within acceptable bounds.
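The enumeration over all $2^{12}$ states can be reproduced at the gate level with a short Python sketch. It rests on two assumptions that are not spelled out in the text: each IAND contributes −2 when its truth table is satisfied and +2 otherwise, each IHA contributes −6 or +2, and the gates are wired as a standard 2-× 2-bit array multiplier (the node names below are hypothetical).

```python
from collections import Counter
from itertools import product

def e_and(a, b, out):
    """Binarized IAND energy (assumed levels: -2 valid, +2 invalid)."""
    return -2 if out == (a & b) else 2

def e_ha(a, b, s, c):
    """Binarized IHA energy (levels quoted in the text: -6 valid, +2 invalid)."""
    return -6 if (s == (a ^ b) and c == (a & b)) else 2

def total_energy(a0, a1, b0, b1, y0, p01, p10, p11, y1, c1, y2, y3):
    # Four partial products; y0 = a0 AND b0 is already an output bit.
    e = e_and(a0, b0, y0) + e_and(a0, b1, p01)
    e += e_and(a1, b0, p10) + e_and(a1, b1, p11)
    # Two half adders produce y1, y2 and the final carry y3.
    e += e_ha(p01, p10, y1, c1) + e_ha(p11, c1, y2, y3)
    return e

# Histogram of energies over all 2**12 node states.
hist = Counter(total_energy(*bits) for bits in product((0, 1), repeat=12))
levels = sorted(hist)
e_gap = levels[1] - levels[0]
n_levels = len(levels)
```

Under these assumptions the enumeration gives nine levels from −20 to +12 in steps of 4, which matches the many-body figures quoted above ($E_g = 4$, $N_{EL} = 9$), and the 16 ground states correspond to the 16 input combinations of the multiplier.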

IV. HARDWARE IMPLEMENTATION
A. P-BIT DEVICE
In this work, we adopt a FeFET-based design due to its low hardware cost and compatibility with CMOS technology [35], where the stochasticity of the p-bit device arises from thermal noise. The FeFET p-bit comprises a FeFET, a resistor, and two serially connected inverters, as shown in Fig. 5(a). During operation, the resistor first converts the analog drain current signal into an analog voltage signal; the signal is then digitized by the inverters to produce a binary voltage signal, namely 0 and 1, represented by the low voltage level 0 and the high voltage level $V_{DD}$, respectively. As shown in Fig. 5(b), a more positive (negative) gate voltage gives a higher probability of obtaining 1 (0), and the probability of obtaining 1 is modulated by the gate voltage in a sigmoidal manner. Fig. 5(c) presents a flowchart for designing and simulating a combinatorial IL circuit based on many-body interactions. The stochasticity of the p-bit is first extracted using a fitted sigmoidal curve or a lookup table. These behavioral characteristics of the p-bit are then modeled and packaged into p-bit cells using Verilog-A in Cadence Virtuoso. Finally, electronic elements are used to implement all the target combinatorial ILs, which are simulated at the circuit level.
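The sigmoidal behavioral model of the p-bit can be sketched in Python as follows. This is only an illustration of the idea of a fitted sigmoid; the logistic form, the parameter names `v0` and `slope`, and their values are assumptions and do not represent the fitted FeFET characteristics or the Verilog-A cell used in this work.

```python
import math
import random

def p_one(v_in, v0=0.0, slope=1.0):
    """Probability of reading 1, modeled as a logistic function of the input.

    v0 (midpoint) and slope (width) are hypothetical fitting parameters.
    """
    return 1.0 / (1.0 + math.exp(-(v_in - v0) / slope))

def sample_pbit(v_in, rng, **fit):
    """Draw one binary output (0 or 1) for a given input voltage."""
    return 1 if rng.random() < p_one(v_in, **fit) else 0

# Estimate the output statistics at one bias point with a seeded generator.
rng = random.Random(0)
n_samples = 20000
freq = sum(sample_pbit(0.5, rng, slope=0.25) for _ in range(n_samples)) / n_samples
```

A lookup table extracted from device simulation could replace `p_one` without changing the sampling step.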

B. MANY-BODY INTERACTIONS
In circuit implementation, two-body interactions between p-bit pairs can be achieved with a passive resistor network [1], but for IL circuits based on many-body interactions, appropriate electronic components for implementing the many-body interconnections are crucial. In this article, the IFs are synthesized from building blocks, namely, the IAND incorporating two- and three-body interactions, the IHA incorporating two-, three-, and four-body interactions, and the IFA incorporating two- and four-body interactions. We take this many-body-interacting system, which involves interactions of different dimensions, to demonstrate the derivation of the electronic component that realizes many-body interactions.
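As the system operates, the nodes are updated sequentially, and the update rule for node $s_i$ is as follows:
$$I_i = h_i + \sum_{j} J_{ij} s_j + \sum_{j<k} K_{ijk} s_j s_k + \sum_{j<k<l} L_{ijkl} s_j s_k s_l \tag{4}$$
$$s_i = \mathrm{sgn}\big[\tanh(I_i/T) + \mathrm{rand}(-1, +1)\big]. \tag{5}$$
Only considering the three-body and four-body interactions, their respective contributions are as follows:
$$I_i^{(K)} = \sum_{j<k} K_{ijk}\, s_j s_k \tag{6a}$$
$$I_i^{(L)} = \sum_{j<k<l} L_{ijkl}\, s_j s_k s_l. \tag{6b}$$
In the circuit, the bipolar state $s$ of the p-bit is represented by its digitized voltage output $v$, which takes the values 0 and 1. The conversion relationship between the binary and bipolar formats is $s = 2v - 1$.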
Fig. 6(a) shows all the possible values of nodes $s_j$, $s_k$, and the product $s_j \cdot s_k$ in the theoretical bipolar format of (6a) for the three-body interactions. The function $f(v_j, v_k)$ gives the output that the circuit implementation should produce after the conversion from bipolar to binary format, which exactly matches the function of the XNOR gate. The detailed connectivity of the XNOR-implemented three-body interaction is depicted in Fig. 6(b), in which the output terminals of p-bits $v_j$ and $v_k$ are connected to the input terminals of the XNOR gate. The output voltage of the XNOR gate is the final signal after the three-body interaction. Fig. 6(c) similarly shows all the value relations of nodes $s_j$, $s_k$, $s_l$, and $s_j \cdot s_k \cdot s_l$ in (6b) for the four-body interactions. In this scenario, the function $f(v_j, v_k, v_l)$ can be implemented by two cascaded conventional XNOR gates. As shown in Fig. 6(d), the output signals of p-bits $v_j$ and $v_k$ are first processed by the first XNOR gate and then fed to the second XNOR gate together with the output signal of p-bit $v_l$. The output of the second XNOR gate is the final signal following the four-body interaction.
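The correspondence between bipolar products and XNOR gates can be verified exhaustively. The Python sketch below is a self-contained check of the identity under the conversion $s = 2v - 1$; the helper names are illustrative only.

```python
from itertools import product

def xnor(a, b):
    """Boolean XNOR on binary values 0/1."""
    return 1 - (a ^ b)

def to_bipolar(v):
    """Binary-to-bipolar conversion, s = 2v - 1."""
    return 2 * v - 1

# Three-body term: one XNOR reproduces the product s_j * s_k.
three_body_ok = all(
    to_bipolar(xnor(vj, vk)) == to_bipolar(vj) * to_bipolar(vk)
    for vj, vk in product((0, 1), repeat=2)
)

# Four-body term: two cascaded XNORs reproduce s_j * s_k * s_l.
four_body_ok = all(
    to_bipolar(xnor(xnor(vj, vk), vl))
    == to_bipolar(vj) * to_bipolar(vk) * to_bipolar(vl)
    for vj, vk, vl in product((0, 1), repeat=3)
)
```

Because XNOR in binary form is multiplication in bipolar form, the cascade extends to higher-order products by adding one gate per additional node.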

V. SIMULATION RESULTS
A. FUNDAMENTAL MANY-BODY-BASED AND GATE
The proposed circuit diagram for an example three-node IAND is illustrated in Fig. 7(a), where the graphical information is translated into electronic components. Specifically, nodes are implemented with FeFET-based p-bits, local biases are substituted with voltage sources, and interactions among p-bits, including two-body and three-body interactions, are translated into resistor networks in conjunction with XNOR gates. On the other hand, since we adopt voltage-controlled p-bits in the circuit implementation, an ideal current-controlled voltage source (CCVS) with a gain of $10^6$ Ω is added at each p-bit end. The ideal CCVSs are modeled in Verilog-A to convert the current signal into a voltage signal, and the input voltage of each CCVS is pinned at $V_P = V_{DD}/2$. The detailed parameters used in the circuit simulations are summarized in Fig. 7(b). With this configuration, the node current equations for nodes A, B, and C are given in (7), where $I_A$, $I_B$, and $I_C$ are the feedback current signals fed into the input terminals of the CCVSs for p-bits A, B, and C, respectively; $V_A$, $V_B$, and $V_C$ are the output voltages of these p-bits, which can only take the binary values 0 or 1 ($V_{DD}$); $R_{AC} = R_{BC} = R_{ABC} = R_0 = 1$ MΩ; and XNOR represents the Boolean XNOR operation. The real-time output of (ABC) in the free mode and the reverse mode, updated according to (7), is shown in Fig. 8(a). In the free mode, all p-bits are floating, and the states matching the truth table are visited with high probability, whereas in the reverse mode with p-bit C clamped to 0, the states (ABC) = (000), (010), and (100) are emphasized with time evolution. The time-averaged probability distributions of all states are shown in Fig. 8(b), in which the binarization of probabilities reflects that there are only two distinct energy levels under the many-body-based design. The average probabilities of the four correct solutions and the four undesirable solutions obtained from circuit simulation are 24.12% and 0.88%, respectively, in excellent agreement with the theoretical values of 24.12% and 0.88% calculated from (3); furthermore, when C is clamped to 0, the IAND operates in the reverse mode, with the states (ABC) = (000), (010), and (100) each visited with a probability of ≈ 33%, as shown in Fig. 8(c). Similarly, the free operation modes of the IHA and IFA shown in Fig. 8(d) exhibit probability binarization. Although the number of wrong solutions increases to 12 and 24, respectively, the many-body-based design still provides the simplest binarized energy landscape for these fundamental IL gates comprising more p-bits.
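The free-mode and reverse-mode statistics of the IAND can be mimicked with a behavioral Python sketch of sequential p-bit updates. This is not the circuit simulation of this work: it assumes IAND coefficients derived from the bipolar AND identity (energy levels −2/+2), a standard stochastic sign update, and an inverse temperature `BETA` back-calculated from the reported 24.12%/0.88% probabilities.

```python
import math
import random
from collections import Counter

# 1/T implied by the reported probabilities, assuming an energy gap of 4.
BETA = math.log(0.2412 / 0.0088) / 4.0

def local_field(s, i):
    """Coefficient of s_i in -E for the assumed IAND parameters
    (h_C = -1, J_AC = J_BC = +1, K_ABC = +1)."""
    a, b, c = s
    if i == 0:
        return c + b * c
    if i == 1:
        return c + a * c
    return -1 + a + b + a * b

def run(sweeps, seed=0, clamp_c=None):
    """Sequentially update the p-bits; return the visit frequency of each state."""
    rng = random.Random(seed)
    s = [1, 1, 1]
    if clamp_c is not None:
        s[2] = clamp_c  # reverse mode: output node held fixed
    counts = Counter()
    for _ in range(sweeps):
        for i in range(3):
            if i == 2 and clamp_c is not None:
                continue
            noise = rng.uniform(-1.0, 1.0)
            s[i] = 1 if math.tanh(BETA * local_field(s, i)) + noise > 0 else -1
        counts[tuple((x + 1) // 2 for x in s)] += 1  # record as binary (A, B, C)
    return {state: n / sweeps for state, n in counts.items()}

free = run(100000)
reverse = run(100000, clamp_c=-1)
valid_mass = sum(free.get(k, 0.0) for k in [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)])
```

Under these assumptions the free mode spends most of its time in the four valid states, and clamping C to 0 leaves (000), (010), and (100) roughly equally likely, which is the qualitative behavior described above.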

B. 2-× 2-BIT MULTIPLIER/IF
In Section III-C, a theoretical comparison of the 2-× 2-bit IF based on two-body interactions and many-body interactions is performed. The key energy metrics summarized in Table 2 show that the many-body-based design offers a larger $E_g$ of 4 and a smaller $N_{EL}$ of 9, which simplifies the energy landscape and boosts the factorization accuracy. To demonstrate the superiority of the many-body-based design in solving the factorization problem, we first implement the 2-× 2-bit IFs under the two design schemes. As shown in Fig. 9(a), the many-body-based system offers a simpler and more regular energy landscape than the two-body-based one, characterized by an energy range narrowed from [−20, +44] to [−20, +12], an $E_g$ enlarged from 2 to 4, and an $N_{EL}$ reduced from 32 to 9. Fig. 9(b) shows that the system spends most of the time at the correct solutions, i.e., (A, B) = (2, 3) and (3, 2), when its output is clamped to 6.
The time-averaged statistics of all candidate solutions are depicted in Fig. 9(c). Owing to the simplicity of the energy landscape and the enlarged $E_g$ brought by the many-body-based design, a significant improvement in factorization accuracy from 64.99% to 91.44% is obtained, as shown in Fig. 9(d). The simulation results align well with the theoretical values calculated from the energy levels in Table 2 using (2) and (3). Note that the parameter $1/T$ in (3) measures the stochasticity of the system and must be set appropriately, based on the degree of stochasticity, when calculating the analytical probability distributions.

C. 6-× 6-BIT AND LARGER-SIZE MULTIPLIER/IF
To investigate the effects of many-body interactions among p-bits in large-scale combinatorial IL, we evaluate the performance of a 6-× 6-bit IF when its output is clamped to 3233. This IF is logically synthesized from 36 IANDs, 6 IHAs, and 24 IFAs using a total of 108 p-bits. As shown in Fig. 10(a) and (b), although both the two-body-based and many-body-based designs produce the correct factors (A, B) = (53, 61) or (61, 53), a boost in accuracy from 4.430% to 88.85% is achieved with the many-body design, as shown in Fig. 10(c). This significant improvement arises because the many-body-based combinatorial IL has degenerate energy levels, as evidenced by the significantly reduced $N_{EL}$ in the 2-× 2-bit IF case.
Energy degeneracy can also induce a larger $E_g$, which could enhance the factorization performance; furthermore, the performance of IFs of various sizes has been studied. Fig. 10(d) shows a downward trend in the IFs' factorization accuracy as the problem size gradually increases from 2-× 2-bit to 10-× 10-bit IFs. The accuracy decreases because expanding the problem scale increases the number of wrong solutions, which, in turn, unavoidably dilutes the probability of the correct solutions. Even so, for the 10-× 10-bit IF with our proposed many-body-based design, the factorization accuracy for an integer larger than 1 million is still acceptable, at approximately 28.39% for the solutions (A, B) = (1019, 1021) and (1021, 1019).
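The gate and p-bit counts quoted for the IFs can be cross-checked with a few lines of Python. The closed-form counts below are an inference, assuming an array-multiplier structure in which every adder contributes two new nodes (sum and carry); they are consistent with the two sizes stated in the text (12 p-bits for 2-× 2-bit, 108 p-bits for 6-× 6-bit) but are not given as formulas in this article.

```python
def gate_counts(n):
    """Assumed gate counts of an n-x-n-bit array multiplier/IF."""
    return {"IAND": n * n, "IHA": n, "IFA": n * (n - 2)}

def pbit_count(n):
    """Inputs + partial products + two output nodes (sum, carry) per adder."""
    g = gate_counts(n)
    return 2 * n + g["IAND"] + 2 * (g["IHA"] + g["IFA"])

# Products used as clamped outputs in the text.
targets = {
    "2x2": 2 * 3,
    "6x6": 53 * 61,
    "10x10": 1019 * 1021,
}
```

For example, `gate_counts(6)` returns 36 IANDs, 6 IHAs, and 24 IFAs, and the products reproduce the clamped integers 6 and 3233 as well as a 10-× 10-bit target above one million.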

VI. CONCLUSION
In this work, we introduce a novel design scheme for BM-based IL with p-bit implementation. The proposed design scheme expands the dimension of interactions of the BM configuration from two-body to many-body, providing more degrees of freedom for describing the system's energy function. Using the Boltzmann Law and circuit simulation, we demonstrate that the many-body-based design provides the simplest binarized energy landscape with minimal consumption of p-bits for fundamental IL gates. A comprehensive IL family based on many-body interactions is developed in this article; moreover, large-scale combinatorial IL circuits, such as IFs, logically synthesized from modules of this IL family have degenerate energy levels and an enlarged $E_g$, which is shown to improve the IFs' performance. For instance, the factorization accuracy of the 2-× 2-bit and 6-× 6-bit IFs is significantly enhanced from 64.99% to 91.44% and from 4.430% to 88.85%, respectively, compared with the conventional two-body-based design. In the future, our design could serve as a more efficient computational model for IL-related probabilistic applications.

FIGURE 1. (a) General graphic model of the many-body-based design. (b) General graphic model of the two-body-based design.

FIGURE 2. Three-node IL gates. (a) Graphic model and energy landscape of the many-body-based IOR. (b) Graphic model and energy landscape of the two-body-based IOR. (c) IAND. (d) IXOR.

FIGURE 3. Four-node and five-node IL gates based on up to four-body interactions. (a) IHA. (b) Alternative design of IHA. (c) IFA.

FIGURE 4. (a) Logic diagram of a serially connected AND gate and OR gate. (b) Many-body-based IAND and IOR are merged along their common node. (c) Mathematical representations of the merging process under the many-body-based design. (d) Alternative designs of many-body-based IHA and IFA created by logic synthesis. (e) Logic schematic of a 2-× 2-bit multiplier/IF. (f) Graphic model of the logically synthesized IF based on many-body interactions.

FIGURE 5. (a) FeFET-based p-bit used in this work, in which the switching is assisted by thermal noise. (b) Sigmoidal response of the adopted p-bit with respect to the applied gate voltage. (c) Design and simulation flowchart of the many-body-based IL obtained by logic synthesis.

FIGURE 6. (a) Difference between mathematical and circuit representations of the three-body effect. (b) One XNOR gate is used to implement the three-body interaction. (c) Difference between mathematical and circuit representations of the four-body effect. (d) Two serially connected XNOR gates are used to implement the four-body interaction.

FIGURE 7. (a) Schematic of the proposed many-body-based IAND. (b) Parameters used in circuit simulation.

FIGURE 8. (a) Real-time waveform clip of the IAND operating in the free and reverse modes. (b) Statistical probabilities of the IAND operating in the free mode. The normalized time is calculated as the circuit operation time divided by the product of the p-bit fluctuation time (1 ns) and the number of p-bits. (c) Statistical probabilities of the IAND in the reverse mode with C clamped to 0. (d) Five-node IFA operating in the free mode. The statistical probability distributions for both modes of the IAND and the free mode of the IFA are obtained by averaging $10^6$ sampling points in the time domain. Additionally, the reverse operation is enabled by applying a strongly positive or negative bias voltage to p-bit C.

FIGURE 9. (a) Energy landscapes of the 2-× 2-bit IF under the two-body-based design and the many-body-based design. (b) Real-time waveform clips of factors A and B when the 2-× 2-bit IF is clamped to 6. The clamping of integer 6 is realized by pinning the states of nodes $(Y_0, Y_1, Y_2, Y_3)$ to (0, 1, 1, 0). (c) Three-dimensional histogram of the statistical probability distributions for all possible solutions. The data are collected using $10^7$ sampling sets of the 12 p-bits. (d) Accuracy evaluations of the two-body-based and many-body-based designs.