Methodology for Automated Design of Quantum-Dot Cellular Automata Circuits

Quantum-dot Cellular Automata (QCA) provide very high scale integration potential, very high switching frequency, and have extremely low power demands, which make the QCA technology quite attractive for the design and implementation of large-scale, high-performance nanoelectronic circuits. However, state-of-the-art QCA circuit designs were not derived by following a set of universal design rules, as is the case of CMOS circuits, and, as a result, it is either impossible or very difficult to combine QCA circuit blocks in effective large-scale circuits. In this paper, we introduce a novel automated design methodology, which builds upon a QCA specific universal design rules set. The proposed methodology assumes the availability of a generic QCA crossbar architecture and provides the means to customize it in order to implement any given logic function. The programming principles and the flow of the proposed automated design tool for crossbar QCA circuits are described analytically and we apply the proposed automated design method for the design of both combinatorial and sequential circuits. The obtained designs demonstrate that the proposed method is functional, easy to use, and provides the desired QCA circuit design unification.


I. INTRODUCTION
Quantum-dot Cellular Automata (QCA) were proposed in 1993 by Lent et al. [1], and QCA technology potentially provides an avenue beyond Moore's Law and von Neumann architecture electronics. In QCA technology, the logic states are not represented by voltage levels, as in VLSI/CMOS technology, but are defined by the quantum dots that are occupied by the individual electrons within a cell. Due to this great potential, many QCA circuits have been proposed [2], [3], [4], [5], [6] and novel fabrication techniques have been developed [7], [8], [9], [10], [11], which, even though they need to be developed further, provide a realistic roadmap for future QCA-based nanoelectronics. Moreover, a programmable QCA crossbar architecture was introduced in [12], which provides designers with the means to obtain robust and efficient QCA circuit designs. In this architecture, programmable logic gates are formed at the crossbar cross-points, and their function is determined via the programming lines located at the crossbar top and bottom. Given that any gate of the universal Boolean set {OR, AND, NOT} can be instantiated at each cross-point, the architecture supports the implementation of any digital circuit. We note that the crossbar architecture is considered one of the most promising solutions for nanoelectronic circuits [13], because of its fabrication simplicity and its inherent redundancy, which supports defect tolerance [14], [15], [16], [17], [18]. However, state-of-the-art QCA circuit designs were not derived by following a set of universal design rules, as is the case for CMOS circuits, and, as a result, it is either impossible or very difficult to combine QCA circuit blocks into effective large-scale circuits.
In this paper, we address this problem and propose an automated methodology for the design of combinatorial and sequential circuits that makes use of the programmable QCA crossbar architecture [12]. The proposed methodology aims to tackle one of the major QCA circuit design issues and to provide design automation that enables compatibility between different QCA circuits. We note that even if combining state-of-the-art QCA circuits is feasible in some specific cases, the interconnection circuit overhead is usually overwhelming, because it can be even larger than the circuits themselves. These compatibility issues result from the lack of universal design rules. The proposed methodology utilizes the fundamental design rules of the programmable QCA crossbar architecture. In addition, it introduces universal QCA structural blocks that can be used to design any combinatorial logic circuit. Moreover, the presented methodology handles the clock zone partitioning so as to resolve the corresponding signal timing and robustness issues.
In order to design sequential logic QCA circuits, the proposed methodology enhances the aforementioned set of QCA structural blocks with a memory element block. As a result, the design of a memory cell on the programmable QCA crossbar architecture is a prerequisite for the further development of the introduced design methodology [19]. This memory cell provides the means for creating $2^n$-bit memories and, at the same time, provides effective programmability. This means that the same QCA circuit in the programmable crossbar architecture can be used either as a memory cell or as a processing unit. Such a design perspective is possible by exploiting the features of the programmable QCA crossbar architecture, which are analyzed in the next sections. Thus, in the proposed methodology, the memory element block and the combinatorial logic QCA blocks are employed together to design any sequential logic circuit, with both memory element blocks and combinatorial logic blocks implemented in the same crossbar.
The proposed methodology, as already stated, focuses on automated QCA design and not on clocking schemes. As a result, it should not be compared with the clocking schemes that have been introduced in the literature [20], [21], [22], bearing in mind also that the use of a fixed-distribution clocking scheme has several advantages for the design and fabrication of QCA circuits. As a general comment, every clocking scheme has its drawbacks and, as a result, the majority of the circuits proposed in the literature do not follow any specific scheme. Furthermore, even though these schemes try to tackle the clock signal distribution problem, they cannot be used for all design problems. Open issues that merit further investigation include the manual placement of quantum-dot cells, the random location of I/O quantum-dot cells in the circuits, and the overhead of combining circuits. These issues can be overwhelming. Nevertheless, for the sake of clarity, it should be mentioned that in the proposed design methodology the clock signal distribution is not random, as explicitly stated in Section II; the clock zones of each block are properly defined, and the clock zone sequence is cascadable.
Apart from the methodology, we also present an automated QCA circuit design software tool that automatically generates the QCA circuit layout corresponding to a given user-specified logic function. To the best of our knowledge, no tool with similar abilities exists. To further demonstrate the capabilities of the proposed methodology and the corresponding tool, the design of various combinatorial and sequential QCA circuits is presented together with the corresponding simulation results, obtained with QCADesigner [23] for the default cell size of 18 nm × 18 nm.
The structure of the paper is as follows: Section II introduces the proposed methodology for automated combinatorial QCA circuit design and Section III presents its utilization for the design of two QCA circuit examples. Section IV describes the software tool for automated QCA circuit design and Section V extends the proposed methodology to the automated design of QCA sequential circuits. In Section VI, two sequential circuits are presented, and conclusions are drawn in Section VII.

II. AUTOMATED QCA COMBINATORIAL CIRCUIT DESIGN
In this section, we introduce a novel methodology for automated combinatorial QCA circuit design. As mentioned in the introduction, the lack of a universal QCA circuit design methodology results in compatibility issues between reported QCA circuits, and this is the very problem our methodology aims to overcome. Namely, we propose a universal design methodology that, given a combinatorial function F and the generic programmable crossbar of quantum-dot cells proposed in [12], can create a QCA circuit instance able to evaluate F. The stability of the programmable QCA crossbar architecture has been verified in [12], both theoretically and with the most widespread and reliable simulation tools found in the literature. In this section, only the basic design rules of the architecture are presented. The programmable QCA crossbar architecture consists of an array of quantum-dot cells (see Fig. 1) and a set of rules that can be employed to map any digital circuit onto the crossbar. These rules define how to handle circuit inputs and outputs, how to form logic gates at the crossbar cross-points, and how to (re)configure the logic gate operation, even during circuit operation. In particular, the cross-shape majority gate [24] is one of the very first and most used logic gates in QCA digital design. Using the majority gate, the OR and AND logic gates can be implemented. More specifically, if one of the three inputs is fixed at polarization −1 (i.e., logic 0) the majority gate operates as an AND gate, and if one of the three inputs is fixed at polarization +1 (i.e., logic 1) the majority gate operates as an OR gate. This fixed-polarization input is the programming cell. Namely, by polarizing this input at either +1 or −1, the same cell topology operates as either an OR or an AND gate, respectively. In addition, a cross-shape inverter has been proposed in [12]. These cross-shape logic gates, which can be formed at the cross-points of the programmable crossbar, are shown in Fig. 1. As Fig. 1 depicts, the programming cells that define the operation of the majority gates are located at the top and at the bottom of the circuit, the inputs are located on the left, and the outputs are located on the right.
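The programming rule described above can be checked with a short behavioural sketch. It models only the Boolean behaviour of a three-input majority gate, not the cell-level physics; the function names are illustrative and do not come from [12].

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote over logic values 0/1."""
    return 1 if a + b + c >= 2 else 0


def polarization_to_logic(polarization: int) -> int:
    """Map a fixed cell polarization (-1 or +1) to logic 0 or 1."""
    return 1 if polarization == +1 else 0


def programmed_gate(a: int, b: int, programming_polarization: int) -> int:
    """Majority gate whose third input is the programming cell."""
    return majority(a, b, polarization_to_logic(programming_polarization))


# Programming cell at -1 gives AND, programming cell at +1 gives OR.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, programmed_gate(a, b, -1), programmed_gate(a, b, +1))
```

The same two data inputs therefore yield AND or OR depending only on the polarization applied to the programming cell, which is what makes the cross-point reconfigurable.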
The straightforward information flow, the well-defined I/O interface, the fixed position of the quantum-dot cells, and the programmability feature make the programmable QCA crossbar architecture the best candidate architecture for an automated QCA design methodology, enabling both scalability and productivity in QCA circuits design.
Even though the programmable QCA crossbar architecture defines a universal design rule set, it does not provide a generic design methodology that can automatically generate the QCA circuit for any design case. Namely, there are still many design problems that need to be solved manually, e.g., circuit clocking, logic gate positioning, and programming line distribution. Our design methodology aims to provide an efficient and generic solution to these problems.
The first step towards the development of a universal QCA circuit design methodology is the definition of the circuit information flow. In the proposed design methodology, the information propagates from the left side towards the right side of the circuit, which is achieved by appropriate handling of the clock zone partitioning. Note that clock zone partitioning is one of the most important QCA circuit design phases, and adiabatic switching [25], [26] is currently considered to be the best clocking technique able to provide stability and information flow control within QCA circuits. In adiabatic switching, the electrons of every cell are pushed either to the neutral state or to one of the two possible logic states. In the latter case, the prevailing logic state depends on the polarization of the neighboring cells. This adjustment of electron motion is achieved through applied electric fields controlled by four-phase clock signals (Switch, Hold, Release, and Relax), which have a relative phase difference of 90°. The quantum-dot cells in a clock zone are all controlled by the same clock and, as such, the information propagates from the cells of one clock zone to their neighboring cells in the next clock zone.
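To make the four-phase clocking concrete, the following minimal sketch lists which phase each clock zone is in at a given quarter-period. The zone indexing (zone 0 is the leftmost zone and each zone lags its left neighbour by one phase, i.e., 90°) is an assumption made for illustration only.

```python
PHASES = ("Switch", "Hold", "Release", "Relax")


def zone_phase(zone: int, step: int) -> str:
    """Phase of clock zone `zone` at quarter-period `step`.

    Assumption: zone k lags zone k-1 by one phase (90 degrees).
    """
    return PHASES[(step - zone) % 4]


# While zone 1 switches, zone 0 holds its value and zone 2 is relaxed,
# so information can only move from left to right.
for step in range(4):
    print(step, [zone_phase(zone, step) for zone in range(4)])
```

In this model a zone is always in Switch while its left neighbour is in Hold, which is what drives the left-to-right information flow.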
To evaluate a given combinatorial function F by means of QCA technology, we rely on fundamental QCA blocks that are able to perform basic Boolean algebra operations, i.e., NOT, AND, OR. Thus, the first design step consists of rewriting F in terms of Boolean algebra operations, such that its QCA implementation can be done by means of fundamental QCA blocks only. Subsequently, the selected QCA blocks are instantiated within the crossbar space.
Firstly, for blocks placement, we have to take into consideration the logic operations hierarchy.For example, F = (A • B) + (C • D) is implemented in two logic levels, as indicated in Fig. 2, with two AND gates in level 1 and one OR gate in level 2 with level 1 outputs being level 2 inputs.
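The level assignment in this example can be sketched as follows. The nested-tuple representation of F and the helper names are hypothetical and serve only to illustrate the idea that a block sits one level above its deepest operand.

```python
from collections import defaultdict


def logic_level(expr) -> int:
    """Primary inputs are level 0; a gate is one level above its deepest operand."""
    if isinstance(expr, str):
        return 0
    return 1 + max(logic_level(arg) for arg in expr[1:])


def blocks_by_level(expr, table=None) -> dict:
    """Group the gates of a nested-tuple expression by logic level."""
    if table is None:
        table = defaultdict(list)
    if not isinstance(expr, str):
        for arg in expr[1:]:
            blocks_by_level(arg, table)
        table[logic_level(expr)].append(expr[0])
    return dict(table)


# F = (A . B) + (C . D)
F = ("OR", ("AND", "A", "B"), ("AND", "C", "D"))
print(blocks_by_level(F))  # {1: ['AND', 'AND'], 2: ['OR']}
```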
The programming lines within the crossbar architecture [12] are located at the array top and bottom, such that the upper (lower) block makes use of the top (bottom) programming lines. Consequently, each circuit level can accommodate at most two blocks, because extra blocks would not have any available programming lines to utilize. In order to overcome this problem, intermediate levels (sub-levels) need to be added. For example, in order to implement a function that requires three AND gates and one OR gate, the AND gates are placed on the two level 1 sub-levels and the OR gate in level 2, as depicted in Fig. 3. The number of sub-levels of level i ($NSL_i$) is defined as

$$NSL_i = \left\lceil \frac{NB_i}{2} \right\rceil \qquad (1)$$

where $NB_i$ is the number of blocks at level i. In this particular example, $NB_1 = 3$ and $NB_2 = 1$, so that $NSL_1 = 2$ and $NSL_2 = 1$.
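The sub-level count can be sketched as below, assuming the rule $NSL_i = \lceil NB_i / 2 \rceil$, which follows from the limit of two blocks (one using the top and one using the bottom programming lines) per sub-level. The block labels and the pairing order are illustrative assumptions.

```python
def num_sublevels(num_blocks: int) -> int:
    """Sub-levels needed when each sub-level holds at most two blocks."""
    return (num_blocks + 1) // 2


def split_into_sublevels(blocks: list) -> list:
    """Pair blocks two per sub-level: first uses the top lines, second the bottom."""
    return [blocks[i:i + 2] for i in range(0, len(blocks), 2)]


level_1 = ["AND_1", "AND_2", "AND_3"]  # NB_1 = 3
level_2 = ["OR_1"]                     # NB_2 = 1
print(num_sublevels(len(level_1)), split_into_sublevels(level_1))
print(num_sublevels(len(level_2)), split_into_sublevels(level_2))
```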
The proposed methodology can also be utilized to design multi-output QCA circuits, in which case each function F_i has to be implemented separately, as suggested in Fig. 4. However, since one function is placed below the other, each function can utilize only the programming lines on its own side. Namely, in the example of Fig. 4, F_1 utilizes the top programming lines and F_2 utilizes the bottom ones.
QCA blocks can be classified into two classes: (i) blocks that implement basic Boolean algebra operations and (ii) interconnect facilitators. Type (i) blocks can be further divided into subcategories depending on their input cardinality. Figs. 5-7 present examples of the type (i) blocks of the proposed methodology, i.e., Fig. 5 depicts a 2-input block, while a 3-input block and a 4-input block are presented in Figs. 6 and 7, respectively. We note that all circuits are designed with the QCADesigner design tool [23].
The second category contains blocks that can be utilized for signal crossing and branching.The four crossing cases and the two branching cases are presented in Figs. 8 and 9, respectively.
We note that blocks belonging to the same subcategory are designed such that they exhibit exactly the same delay, namely, all 4-input blocks introduce a delay of 7 clock zones, all branching blocks a delay of 3 clock zones, and so on. This uniform block delay policy eases the handling of synchronization constraints at the circuit level. However, even though the blocks described earlier exhibit fixed and known delays, in QCA circuits the signal propagation between adjacent blocks is performed by binary wires, which induce a delay overhead that depends on the wire length. Taking this into consideration, QCA block placement and binary wire routing are crucial parts of any design technique that follows QCA operation principles and seeks the realization of stable and functional QCA circuits. The fact that blocks placed in the upper/lower half of the circuit utilize the top/bottom programming lines allows for the realization of different information flows in the upper and the lower parts of the circuit, which both converge to the right center of the circuit, where the circuit output is located. Moreover, the partition of the circuit into levels and sub-levels enables wire length minimization, such that the wire-induced delay becomes manageable.
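One way to picture the synchronization constraint is to compute how many extra clock zones each signal needs so that all inputs of a block arrive together. The sketch below assumes that the slack is absorbed by the connecting binary wires; this padding strategy and the signal names are illustrative assumptions, and only the 3-zone branching delay is taken from the text.

```python
def padding_zones(arrival_zone: dict) -> dict:
    """Extra clock zones each signal needs to align with the latest arrival."""
    latest = max(arrival_zone.values())
    return {name: latest - zone for name, zone in arrival_zone.items()}


# A signal that passes through a branching block arrives 3 clock zones late,
# while a primary input routed directly arrives at zone 0.
arrivals = {"S_after_branch": 3, "A_direct": 0}
print(padding_zones(arrivals))  # {'S_after_branch': 0, 'A_direct': 3}
```

Because every block of a given subcategory has the same delay, such arrival values depend only on the block types along a path and on the wire lengths.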
The systematic block delay policy combined with interconnection wire length minimization allows the methodology to properly address robustness issues as well. Namely, since interconnection wires are as short as possible, it is easier to avoid kinks, i.e., occasions where a quantum-dot cell has a different polarization than the expected one. The maximum length of a QCA binary wire [27] in a specific clock zone is given by

$$N \le e^{E_k / (k_b T)} \qquad (2)$$

where N is the number of cells in the wire, $E_k$ is the kink energy, $k_b$ is Boltzmann's constant, and T is the temperature. The kink energy between two quantum dots is calculated as

$$E^{kink}_{i,j} = \frac{1}{4 \pi \varepsilon_0 \varepsilon_r} \frac{q_i q_j}{|r_i - r_j|} \qquad (3)$$

where $q_i$ and $q_j$ are the charges of dots i and j, $|r_i - r_j|$ is the distance between them, and $\varepsilon_0$ and $\varepsilon_r$ are the vacuum and relative permittivity, respectively. Thus, to achieve stability, any QCA circuit has to be divided into as many clock zones as required to fulfill (2). On the other hand, a compromise has to be made, since the more clock zones are utilized, the larger the circuit delay. The proposed methodology handles all the above issues, and the obtained circuits have the smallest possible delay. To summarize, the proposed methodology encompasses the following steps:
1) Transform the expression of the logic function to be implemented such that it can be implemented by the predefined basic QCA blocks.
2) Partition the circuit into levels and sub-levels.
3) Place the blocks onto the crossbar grid, while utilizing the top and the bottom programming lines, as described earlier.
4) Connect the placed blocks, while taking into consideration clocking and synchronization constraints.
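The trade-off between wire length and the number of clock zones can be illustrated numerically. The sketch assumes the commonly cited thermal bound $N \le e^{E_k / (k_b T)}$ for the number of cells of a binary wire within one clock zone; the kink energy used in the example is a hypothetical value chosen for readability, not a value from the paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def max_wire_cells(kink_energy_j: float, temperature_k: float) -> float:
    """Bound on the cells of one binary wire in one clock zone,
    assuming N <= exp(E_k / (k_b * T))."""
    return math.exp(kink_energy_j / (K_B * temperature_k))


def min_clock_zones(wire_cells: int, kink_energy_j: float, temperature_k: float) -> int:
    """Smallest number of clock zones such that no zone exceeds the bound."""
    per_zone = max(1, math.floor(max_wire_cells(kink_energy_j, temperature_k)))
    return math.ceil(wire_cells / per_zone)


# Hypothetical kink energy of 3 * k_b * T at T = 300 K: about 20 cells per zone.
e_k = 3 * K_B * 300
print(max_wire_cells(e_k, 300))
print(min_clock_zones(45, e_k, 300))
```

A higher temperature or a lower kink energy shrinks the bound, so the same wire has to be split into more clock zones, which increases the circuit delay.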

III. DESIGN METHODOLOGY APPLICATION
In this section, we present two example designs, a 2 : 1 and a 4 : 1 multiplexer, derived by means of the methodology introduced in the previous section.
The logic function that describes the output of a 2 : 1 multiplexer is $A \cdot \bar{S} + B \cdot S$. Fig. 10 depicts the QCA circuit obtained by following the proposed QCA circuit design methodology. The basic QCA blocks are located inside the red boxes and, as one can observe in the figure, the circuit is divided into three levels. Initially, the S signal is branched in order to be utilized as input for both 2-input AND gates in the second circuit level. The other input of the upper gate is A, while B is the second input of the lower gate. These two QCA blocks are horizontally mirrored, because the programming lines of the first one are located at the top and the programming lines of the second one at the bottom. The outputs of these two AND gates are the inputs of the 2-input OR gate located in the third and final circuit level. The implementation makes use of 136 quantum-dot cells that occupy 0.16 μm² and has a delay of 7 clock zones. In Fig. 11, the simulation results that prove the circuit functionality are presented. For the simulation of the 2 : 1 multiplexer, as well as for all the other circuits presented in the following sections, we made use of QCADesigner [23]. All the simulations were performed with the QCADesigner coherence vector simulation engine, with default parameters and the default cell size, namely 18 nm × 18 nm.
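The three-level structure of the multiplexer can be checked at the Boolean level with programmed majority gates. The sketch assumes the conventional multiplexer form $A \cdot \bar{S} + B \cdot S$ (which data input is paired with the complemented select signal is an assumption) and ignores timing and layout.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    return int(a + b + c >= 2)


def and_gate(a: int, b: int) -> int:
    return majority(a, b, 0)  # programming cell at logic 0 (polarization -1)


def or_gate(a: int, b: int) -> int:
    return majority(a, b, 1)  # programming cell at logic 1 (polarization +1)


def inverter(a: int) -> int:
    return 1 - a


def mux2(a: int, b: int, s: int) -> int:
    """Level 1: branch S; level 2: two AND gates; level 3: one OR gate."""
    upper = and_gate(a, inverter(s))
    lower = and_gate(b, s)
    return or_gate(upper, lower)


for a, b, s in product((0, 1), repeat=3):
    print(a, b, s, mux2(a, b, s))
```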
Likewise, Figs. 12 and 13 present the 4 : 1 multiplexer design and simulation results, respectively. The 4 : 1 multiplexer output is a sum of four product terms, each combining one data input with the two select signals or their complements, so its implementation requires four 3-input AND gate blocks and one 4-input OR gate block. The implementation requires 2 levels, and the first level, which includes the 4 AND gates, is implemented in 2 sub-levels. The resulting circuit consists of 1,080 quantum-dot cells that occupy 1.06 μm² and exhibits a delay of 19 clock zones.

IV. AUTOMATIC QCA LAYOUT GENERATION
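Based on the circuit design methodology introduced in Section II, we developed a QCA circuit design automation tool in C++. The user provides as input the logic function that she/he wants to implement, and the tool automatically generates the layout of its QCA implementation in a QCADesigner-compatible format, in a file with the .qca extension. The circuit synthesis operation comprises the following five steps:
1) QCA blocks selection: The logic function F is analyzed and the QCA blocks necessary for its circuit-level implementation are chosen.
2) QCA blocks position definition: The QCA circuit is divided into levels and each level is further divided into sub-levels, if needed, based on the methodology discussed in Section II. Then, every block is placed at the corresponding level, following the feedforward logical structure of F.
3) Wire placing and routing: Appropriate binary wires are instantiated in order to connect the outputs of prior blocks to the inputs of following blocks, from F's primary inputs towards its primary output.
4) QCA circuit clocking: The circuit is divided into clock zones from left to right, taking into consideration stability and the other previously discussed constraints.
5) Quantum-dot cells placing: The quantum-dot cells are placed at their appropriate crossbar positions and clock zones, according to the positions determined during the previous steps.
Even though the tool relies on a Command Prompt User Interface (UI), the simplicity of the requested actions makes the tool user friendly. As mentioned before, the only required user action is to specify the logic function to be implemented.
Fig. 14 depicts the QCA circuit layout for the evaluation of the a • b • (c + d) logic function, produced by the tool without human intervention. The correctness of the circuit is verified by means of QCADesigner simulations, the results of which are presented in Fig. 15.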
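The block selection step can be pictured with the sketch below, which counts the block types needed for a given expression. It is not the tool's C++ implementation; the nested-tuple input format, the function name, and the choice to map a • b • (c + d) to one 3-input AND block and one 2-input OR block are assumptions made for illustration.

```python
from collections import Counter


def select_blocks(expr) -> Counter:
    """Count the (operation, input cardinality) blocks an expression needs."""
    blocks = Counter()
    if isinstance(expr, str):  # primary input: no block required
        return blocks
    op, *args = expr
    blocks[(op, len(args))] += 1
    for arg in args:
        blocks += select_blocks(arg)
    return blocks


# a . b . (c + d) written as a hypothetical nested tuple
F = ("AND", "a", "b", ("OR", "c", "d"))
print(select_blocks(F))
```

The resulting counts, together with the level of each block, are the information that the position definition step needs.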

V. SEQUENTIAL QCA CIRCUIT DESIGN METHODOLOGY
The methodology of Section II can be utilized to derive the QCA layout of the circuit that implements any given combinatorial logic function. In this section, we introduce the modifications required to enable its utilization for the QCA implementation of sequential logic circuits.
Even though many memory designs have been proposed [28], [29], [30], [31], [32], [33], [34], the majority of them are not compatible with our targeted crossbar architecture. Thus, for the extension of the circuit design methodology to sequential logic, we make use of the QCA memory cell presented in [19], which combines the basic advantages of QCA technology with the capabilities of the programmable crossbar architecture. The memory cell circuit that is utilized by the proposed methodology is presented in Fig. 16.
In this implementation, the memory cell operations, i.e., read and write, are controlled by the programming lines located at the top and the bottom of the QCA crossbar. More specifically, in the read operation the polarization of the programming quantum-dot cells should alternate, i.e., if the polarization of the first top programming cell is −1, the polarization of the second should be +1, and vice versa. The same pattern applies to the fourth and fifth top programming cells and to the first two bottom programming cells. On the other hand, in write mode all the top programming cells have identical polarization, and so do all the bottom ones, but the polarization of the top cells is different from the polarization of the bottom cells.
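The two programming patterns can be expressed as simple predicates over the polarizations of the programming cells. The sketch assumes at least five top and two bottom programming cells, indexed from the first cell, as implied by the description; the exact number of cells and the function names are assumptions.

```python
def is_write_mode(top: list, bottom: list) -> bool:
    """All top cells share one polarization, all bottom cells the opposite one."""
    return len(set(top)) == 1 and len(set(bottom)) == 1 and top[0] != bottom[0]


def is_read_mode(top: list, bottom: list) -> bool:
    """First/second top, fourth/fifth top, and first/second bottom cells alternate."""
    return (top[0] == -top[1]
            and top[3] == -top[4]
            and bottom[0] == -bottom[1])


print(is_write_mode([+1] * 5, [-1] * 2))                 # write pattern
print(is_read_mode([-1, +1, -1, -1, +1], [+1, -1]))      # read pattern
```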
The obvious advantage of this implementation is programmability, as the same circuit layout can be utilized in many applications by just changing the polarization of the programming quantum-dot cells. This makes this RAM implementation promising, since the same circuit can be utilized for different applications with different storage and performance requirements. The flexibility and adaptability of this RAM structure make it quite attractive for QCA implementations. Last but not least, this approach provides the possibility to implement a memory of a given size on a prefabricated QCA crossbar, while other QCA RAM implementations require fabrication from scratch, which is a great advantage in view of the challenging nature of the QCA circuit fabrication process.
To extend the methodology from combinatorial to sequential circuits, we extend the basic block set with the QCA memory block presented in Fig. 16. The memory cell is always in writing mode and the select signal determines the data value to be stored. Fig. 17 presents the generic structure of a QCA circuit with memory, whose combinatorial part can be generated by the approach introduced in Section II and whose storage part is implemented by means of the previously discussed RAM block.

VI. AUTOMATED DESIGN METHODOLOGY FOR SEQUENTIAL LOGIC APPLICATIONS
To further clarify the implementation of digital circuits with memory elements on the generic QCA cell crossbar, let us assume that the logic function that is computed and stored in Fig. 17 is a + b + (c + d). Fig. 18 presents the QCA circuit obtained after the application of the proposed methodology. In the figure, the red boxes delimit the memory block and the two logic blocks, and the blue boxes the two crossing blocks. Fig. 19 depicts the corresponding simulation results, which demonstrate proper circuit functionality.
We also considered the design of a classic sequential electronic device, i.e., a 4-bit shift register, which has two inputs: a serial data input and a second one that triggers data shifting. The QCA 4-bit right shift register circuit created by the proposed methodology is presented in Fig. 20, where the four memory blocks that have been used in the design are located inside the red frames. Fig. 21 presents the simulation results and, by comparing the four output waveforms produced by QCADesigner with the data of Table 1, one can conclude that the QCA circuit behaves as expected.
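The expected behaviour of the 4-bit right shift register can be stated as a small reference model against which simulated waveforms could be compared. The bit ordering (the serial input enters the first stage and existing bits move one stage to the right) is an assumption, and the model says nothing about the QCA layout or its clock-zone timing.

```python
def shift_right(state: tuple, serial_in: int, shift: int) -> tuple:
    """One step of a right shift register; the state is held when shift is 0."""
    if not shift:
        return tuple(state)
    return (serial_in,) + tuple(state[:-1])


state = (0, 0, 0, 0)
for bit in (1, 0, 1, 1):
    state = shift_right(state, bit, shift=1)
    print(state)
```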

VII. CONCLUSION
This paper addressed one of the major issues of QCA technology, the lack of universal design methodologies and architectures, and, by implication, the unavailability of software design tools that can facilitate the design of large QCA circuits (with thousands of quantum-dot cells). We introduced an automated design methodology that makes use of a generic programmable QCA cell crossbar architecture to derive the implementation of any Boolean logic function. We utilized our proposal for the design of several Boolean logic circuits, whose correct behavior was verified by means of the QCADesigner design and simulation tool. Moreover, we extended the methodology to the design of sequential circuits and utilized it for the QCA design of a 4-bit shift register. Furthermore, a software design tool based on the proposed automated methodology was presented, which automatically generates the QCA circuit layout corresponding to a given user-specified logic function.
Even though the proposed methodology constitutes the best solution to deal with the well-known QCA design challenges, future research could aim at the continuous improvement of the proposed methodology towards even more fabrication-friendly solutions. In particular, the proposed methodology should be considered as the first step towards the resolution of an important drawback of QCA technology, now that the fundamentals of automated design are established. The combination of the methodology with other clocking schemes could be the second step in that direction, since this would make the clock signal distribution even more straightforward.

FIGURE 4. Diagram of the implementation of more than one function.

FIGURE 9.The blocks that implement the two different branching cases.

FIGURE 16.Crossbar mapped QCA RAM cell in writing mode.
