Introduction
In distributed systems, different events may happen at the same time, but participants may perceive them in different orders. In contrast, distributed ledger technologies (DLTs) such as Bitcoin [1] typically use a totally ordered data structure, a blockchain, to record the transactions that define the state of the ledger. This design creates a bottleneck, e.g. a miner or validator, through which each transaction must pass. The creation of blocks can also happen concurrently at different parts of the network, leading to bifurcations of the chain that must be resolved. This is typically done by the longest–chain rule [1] or some variant of the heaviest sub-tree [2]. To guarantee the security of the system, the throughput of the system is artificially suppressed so that each block propagates fully before the next block is created, and very few “orphan blocks” spontaneously split the chain. Another effect that limits scalability is that the transactions are handled in batches. The miners create these batches or blocks of transactions and the blockchain can be seen as a three-step process. In the first step, a client sends a transaction to the block producers, then some block producer proposes the block containing a batch of transactions, and in the last step, validators validate the block.
A more novel approach that addresses the asynchronous setting of the distributed system has been taken by IOTA [3]. This approach eliminates the need for clustered transactions and uses a directed acyclic graph (DAG) (as the underlying data structure) to express simultaneous events. In this model, individual transactions are added to the ledger, and each transaction refers to at least two previous transactions. This property reduces the update of the ledger to two steps: One node proposes a transaction to the ledger and waits for the other nodes to validate it. The removal of the intermediary of miners or validators promises to solve (or at least mitigate) several problems associated with them, e.g. mining races [4], centralisation [5], miner extractable value [6], and negative externalities [7] and allows for a fee-less architecture. However, the parallelism involved in adding new transactions to the ledger means that consensus must be found on a “wider” subgraph than just the longest chain or the heaviest sub-tree.
A. Results
Two main problems of Nakamoto’s “longest-chain rule” are the severely limited scalability and the lack of parallelisability. The lack of parallelisability results in the underlying communication network requiring strong assumptions about synchronicity. We propose a consensus protocol that works efficiently and fast in an asynchronous model and allows a high degree of parallelisation. This is achieved by replacing the “longest-chain rule” with the “heaviest-DAG rule”. As the resulting consensus is not based on a total ordering of the transactions, it enables the transactions to be stream processed, an optimization that is becoming increasingly relevant in the validation of smart contract updates and optional sharding solutions.
Another disadvantage in blockchains, which is perhaps not so well known, is the need for intermediaries in the form of miners or validators. By enabling leaderless writing access to the ledger we remove this dependency and reduce the system to a dichotomy of fund owners and nodes, where nodes take additional roles akin to validators. Nodes propose new blocks, which contain transactions from fund owners, and append them to the Tangle. Nodes utilise the append process to validate and vote on previous blocks in a highly efficient implicit voting scheme.
We propose a generalisation of the voting power of nodes in form of a generalised weight function. This generalisation allows for a high level of configurability of our protocol, making it adaptable to the needs and security requirements of the system in which it should be implemented, such as permissionless or permissioned.
We introduce an asynchronous leaderless protocol that employs a weight-based voting scheme on the Tangle. In this scheme, the supporters of transactions, which are the nodes, are tracked through implicit votes. The confirmation status of transactions can be determined using threshold criteria. We provide the algorithms for the various core components. More specifically, we describe how the supporter lists are updated through the implicit voting scheme and how nodes should attach their blocks to the Tangle. We provide theorems for the convergence, as well as the liveness and safety of the system. First, given a random, unpredictable influx of blocks, Theorem 1 guarantees that the system will eventually converge on a consensus state if an adversary has less than 50% of the weight; however, no safety guarantees are given in this case. Second, we give safety and liveness guarantees by extending the protocol and incorporating the capability to synchronise the nodes at certain intervals with the help of a common coin. The security guarantees for this extended protocol are given in Theorem 2. Finally, we provide an overview of simulation results that display the performance of the protocol.
B. Structure of the Paper
The document is structured as follows. In Section I-C we give an overview of essential aspects relevant to the design of a DLT solution. In Section I-D we provide an overview of other recent DAG-based protocols and highlight the differences to our proposal. Section I-E provides an overview of the symbols and acronyms used and points to the glossary. Section II gives an overview of some of the graph-theoretical preliminaries used in this paper. In Section III we provide a basic network setting within which the proposed Sybil protection mechanism operates. Section IV describes the functionality of the Tangle data structure and how it is utilised to confirm blocks. Section V introduces an overview of the Reality-based UTXO Ledger, which forms a central component in our approach that helps with tracking the opinions of honest nodes about conflicting transactions. In Section VI, we describe the voting protocol and confirmation of transactions. In Section VII we define the communication and adversary models and address the liveness and security of the system in Sections VIII and IX. In particular, we show that certain attacks that attempt to create a “metastable” situation could become problematic under specific circumstances and strong assumptions about the adversary. In Section X we provide a solution to this by introducing a synchronization of nodes at larger time intervals. In Section XI, to showcase the performance of the protocol, we provide results from simulation studies. Finally, we conclude the paper with Section XII, where we describe future research directions.
C. Background
Consensus protocols in general, and DLTs in particular, form such a large research area that we refer to review articles for a more detailed introduction, e.g. [8], [9], [10]. Although a consensus protocol depends on many different aspects, we focus, in the remaining part of the introduction, on those that are most relevant for the design choices of our proposed protocol.
1) Ledger Model
Distributed ledgers (DLs) generally arrive in two flavours of balance keeping: an account-based model, where funds are directly associated with the account of a user, such as is the case with Ethereum [11]; and an unspent transaction output (UTXO) model, where tokens are linked to a so-called output, and users own the keys to the output, as is the case with Bitcoin [1] and many of its derivatives, as well as Cardano [12], Avalanche [13], and IOTA [14]. As an important observation in the latter case, the UTXOs form a DAG themselves. A total ordering of the transactions is unnecessary for many use cases and situations, as most of them are parallelisable. However, the append-only nature of the UTXO ledger hinders this advantage of parallelisation in the presence of conflicting transactions. In [15] we propose an augmented UTXO ledger model that optimistically updates the ledger and tracks the dependencies of the possible conflicts. We construct a consensus protocol that utilises this ledger model to enable fast and parallelisable conflict resolution.
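The two observations above, that UTXO transactions form a DAG and that conflicts are transactions competing for the same output, can be illustrated with a minimal sketch. The `Transaction` class, the output-identifier format and the helper names are hypothetical and chosen only for this illustration; they are not the data structures of [15].

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical UTXO transaction: consumes some outputs, creates new ones."""
    tx_id: str
    inputs: tuple   # ids of the outputs this transaction spends
    outputs: tuple  # ids of the outputs this transaction creates


def utxo_dag_edges(transactions):
    """Edges (spender, creator): each transaction points to the transactions
    whose outputs it consumes. Inputs with no known creator are ignored."""
    creator = {out: tx.tx_id for tx in transactions for out in tx.outputs}
    return {
        (tx.tx_id, creator[i])
        for tx in transactions
        for i in tx.inputs
        if i in creator
    }


def find_conflicts(transactions):
    """Map every output that is spent more than once to its competing spenders."""
    spenders = defaultdict(set)
    for tx in transactions:
        for i in tx.inputs:
            spenders[i].add(tx.tx_id)
    return {out: txs for out, txs in spenders.items() if len(txs) > 1}


txs = [
    Transaction("tx1", ("genesis:0",), ("tx1:0", "tx1:1")),
    Transaction("tx2", ("tx1:0",), ("tx2:0",)),
    Transaction("tx3", ("tx1:0",), ("tx3:0",)),  # spends the same output as tx2
    Transaction("tx4", ("tx1:1",), ("tx4:0",)),  # independent of tx2 and tx3
]

edges = utxo_dag_edges(txs)
conflicts = find_conflicts(txs)
```

In this toy ledger, `tx4` does not depend on the outcome of the conflict between `tx2` and `tx3`, so it can be processed in parallel; only the two competing spenders need conflict resolution.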
2) The Tangle and Partial Order
The Tangle is the DAG that stores all transactions of the distributed ledger (DL). Every DAG induces a partial order on the set of vertices, the collection of transactions in our setting. This property contrasts with a blockchain where a total order of transactions is established. As in systems with crash failures, atomic broadcast and consensus are equivalent problems, see [16], the partial order of the DAG induces additional “difficulties” in the consensus protocol. More precisely, there have been serious limitations concerning the security of a DAG-based DLT. In the original proposal of the Tangle, [3], the longest chain rule was replaced by the “heaviest sub-graph”, i.e. the sub-DAG containing the most transactions. However, it turned out that this design is vulnerable to various types of attacks and would rely too much on the Proof-of-Work necessary to issue a transaction, e.g. [17]. Another critical element of the design that is common to many other DAG-based proposals is that it suffers from a liveness problem. Honest transactions that refer to transactions that later turn out to be malicious cannot be added to the ledger state. The protocol we propose in this paper solves the security problems by relying on a weight function for nodes and by using the Reality-based Ledger. It also treats the problems of liveness by separating transactions from their containers, which are blocks, and by applying a new block referencing scheme. In particular, this batch-less architecture enables a stream process-oriented design of the DLT.
3) Sybil Protection
Sybil protection plays a crucial role in a “permissionless environment” where everyone can participate. By leveraging Proof-of-Work (PoW), Bitcoin’s Nakamoto consensus was the first to achieve consensus in such an open environment. As PoW leads to enormous energy waste and many negative externalities, a lot of effort has been put into proposing more sustainable alternatives. The most prominent of them is called Proof-of-Stake (PoS), where the validators’ voting power is proportional to their stake (i.e. in terms of the underlying cryptocurrency) in the system.
The Sybil protection used in this paper is based on node identities. We describe it generically as a function of a scarce resource or an abstract reputation function. This function, called weight, assigns every node identity a positive number. For example, this weight can correspond to an amount of staked tokens, delegated tokens, or the “mana” described in [14]. We want to note that the weight does not have to be connected to the underlying token but can be replaced by any other “weight” serving as a good Sybil protection. In particular, our framework can also be used in a permissioned setting, where only the pre-defined validators would have a positive weight, and it can also be applied to settings with dynamic committee selection.
4) Nakamoto Consensus
Distributed consensus allows participants to agree on a constantly growing log of transactions. It has been an important research topic in recent decades, and its importance in computer science has never been disputed. There are many ways to categorize consensus protocols. For instance, there are the classical landmark results on PAXOS and BFTs, and the newer Nakamoto type consensus mechanisms.
We understand as Nakamoto consensus the rule to select the longest sub-chain, e.g. see [10], and as a variant also the heaviest weighted sub-chain. We extend this concept to the heaviest sub-DAG. More precisely, we consider a Nakamoto blockchain consensus to follow the propose-vote paradigm, which can be described as follows. The time is divided into epochs, and for each epoch, there is an “elected” leader. This leader batches transactions into a new block and proposes this block. Then the other participants vote on the proposed block, e.g. by extending the chain to which the proposed block is attached. Once the number of votes reaches a certain threshold, the proposed block is considered part of the ledger. The specific definition of the various elements mentioned above may vary and lead to different variants of the Nakamoto Consensus. To some extent, the above paradigm reduces to the necessity to agree on a unique leader in each epoch. Once the participants have a consensus on the leader, the linearity of the blockchain implies consensus on the ledger state. However, the fact that only a leader can advance the ledger state creates an obvious bottleneck with well-known performance limitations. In our proposal, we remove the role of the “leader” entirely and allow the participants to propose their blocks and the contained transactions concurrently. Once a block is proposed, all participants can vote and participate in the consensus finding. The weight of the vote is proportional to the weight of the node, introduced above, such that the protocol adapts to different weight distributions.
The protocol is also classified as a non-binary consensus protocol since it can decide on several transactions simultaneously and is an ever-ongoing voting procedure forming a progressively growing history. It also relates to a probabilistic consensus in the sense that the more supporting nodes a transaction has accumulated, the more likely it is that this transaction is eventually confirmed and added to the ledger.
5) Voting
In our non-linear architecture, each new block references at least two existing blocks. This results in a DAG structure as mentioned above. As with a blockchain, a new block not only votes on its direct references but also on its past cone. Although this is an efficient voting scheme, there is the problem of orphanage or liveness. If a block contains an invalid block in its past cone, it can no longer be voted for and, thus, the contained transaction cannot be included in the ledger. We solve this problem by introducing two different references. The first reference is to the Tangle structure and the second is to the DAG structure originating from the UTXO ledger. The last reference allows voting for transactions that were originally orphaned and also to change previously issued votes. Eventually, both types of votes accumulate in a voting weight, which we call the Approval Weight (AW). The higher this AW the higher the probability that the transaction is eventually included in the ledger. We refer to Figure 1 for an example of the voting mechanism.
Figure 1: The Tangle is utilised as a voting layer for nodes to reach a consensus about the outcome of a conflict. Nodes agree on the winner between conflicting transactions.
Generally, the voting mechanism can be applied to any DAG-based data structure with an append process that allows for referencing previous blocks. It requires three main ingredients. The first essential ingredient is a reference scheme that efficiently casts and propagates votes. The second necessary ingredient is the construction of a generalised invariant data structure that allows conflicts to coexist (see Section V). This feature allows transactions to be treated “optimistically”; every new incoming transaction is considered “honest” unless it conflicts with another transaction. Consequently, nodes may start to build on top of every new transaction, even though this transaction may turn out to be conflicting. The third ingredient is a voting mechanism, dubbed On Tangle Voting (OTV), that efficiently votes on a possibly unbounded number of transactions simultaneously. The efficiency is achieved by maintaining a low block overhead since votes of other nodes can be piggy-backed through the implicit voting mechanism. Also, in contrast to classical Byzantine fault tolerance, nodes do not have to be monitored for activity since the issuance of transactions (casting of votes) is a clear sign of being functional.
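The idea of implicit voting through the past cone can be sketched in a few lines. This is a simplified illustration only, under the following assumptions: a block supports every transaction contained in its past cone (including its own), the Approval Weight of a transaction is the total weight of the distinct nodes supporting it, and a transaction counts as confirmed once this weight exceeds a threshold. Conflicts, the second reference type and changes of votes are deliberately left out, and all names (`approval_weights`, `THRESHOLD`, the toy blocks) are hypothetical.

```python
from fractions import Fraction


def past_cone(block, parents):
    """All blocks reachable from `block` via parent references, including itself."""
    seen, stack = set(), [block]
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(parents.get(b, ()))
    return seen


def approval_weights(parents, issuer, tx_of, weight):
    """Sum the weights of the distinct nodes whose blocks reference each transaction."""
    supporters = {}
    for block, node in issuer.items():
        for b in past_cone(block, parents):
            tx = tx_of.get(b)
            if tx is not None:
                supporters.setdefault(tx, set()).add(node)
    return {tx: sum(weight[n] for n in nodes) for tx, nodes in supporters.items()}


# Toy Tangle: block -> referenced (parent) blocks; "g" is the genesis.
parents = {"g": [], "b1": ["g"], "b2": ["g"], "b3": ["b1", "b2"], "b4": ["b3", "b2"]}
issuer = {"b1": "A", "b2": "B", "b3": "C", "b4": "A"}      # block -> issuing node
tx_of = {"b1": "t1", "b2": "t2", "b3": "t3", "b4": "t4"}   # block -> contained transaction
weight = {"A": Fraction(1, 2), "B": Fraction(3, 10), "C": Fraction(1, 5)}

THRESHOLD = Fraction(2, 3)  # assumed confirmation threshold, for illustration only
aw = approval_weights(parents, issuer, tx_of, weight)
confirmed = {tx for tx, w in aw.items() if w > THRESHOLD}
```

Each node is counted at most once per transaction, so issuing many blocks does not increase a node's influence; only its weight matters.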
6) Security
Since the beginning of research on consensus protocols, the concept of security has been at the centre of attention. Any consensus protocol aims to reach consensus on some data. Some of the participants may be faulty or may even actively try to prevent a consensus, and one is interested in the conditions under which consensus can be achieved.
The security of a propose-vote consensus protocol is usually divided into two properties: liveness and safety. Liveness means that any correct transaction is eventually accepted by all honest participants, and safety means that all participants eventually agree on the same set of transactions. The question of whether a given consensus protocol fulfils these properties depends largely on the model assumptions. Roughly, these can be divided into the communication model and the attacker model.
In the most restrictive communication model, the synchronous model, many different solutions are known since the landmark result [18]. However, this is not the case under the most general communication model, the asynchronous model, which does not assume any bounds on the transmission delay of block; commonly denoted by
Besides the communication model, the adversary model plays an important role, especially in the security analysis of Nakamoto protocols. The protocol’s security is commonly expressed in the amount of scarce resources, e.g. energy or computing power, that is necessary to attack the protocol and revert already confirmed transactions. Nakamoto [1] analyzed this property by considering a specific attack, the so-called private double-spend attack. Note that here the classic communication model is the partially synchronous one. Over the last decade, a pertinent research question was the search for worst-case attacker strategies and the identification of the security threshold in terms of the percentage of the scarce resource controlled by an adversary. Tight consistency bounds were recently given in [22] and [23] for several classes of longest-chain type protocols. While these security thresholds do hold in the partially synchronous situation, they fail in the asynchronous setting, e.g. [24].
There is also a line of research that studies how an attacker can compensate for its lower weight with more influence on the communication level. The most prominent such attack is the balance attack [25], which consists of delaying network communications between multiple subgroups of nodes with balanced mining power.
This discussion is of particular interest to us because we propose a framework for modelling the communication level and adversarial level jointly. Unsurprisingly, we obtain impossibility results in the asynchronous communication model. Still, under further synchronicity assumptions, we prove that the protocol guarantees liveness and safety (with a very high probability) if the adversarial weight does not exceed certain thresholds. The obtained security bounds are established for any possible attack strategy and are configurable by the protocol.
The situations that lead to the impossibility results in the asynchronous model are frequently considered irrelevant for practical purposes, e.g. [26], [27]. The argument for this is that in real-world applications, the randomness in the block delays is so great that the particular situation cannot occur. While we partly agree with this reasoning concerning our OTV, we added a second synchronicity level to our core voting protocol to obtain a rigorous security threshold. For this reason, we see our consensus protocol as a two-layer solution. The first layer works in an asynchronous setting and allows fast and secure confirmation under normal network conditions. The second layer is based on an optional synchronization of the nodes that allows consensus finding under worst-case scenarios. The synchronized level relies on a decentralised random beacon or common coin that makes the protocol robust against attacks similar to the balance attack described above. Randomization of consensus protocols to circumvent the impossibility results has been known since [28], which introduces local randomness. A common coin was introduced in [29] and is used in several approaches to increase the security in the asynchronous setting.
7) Performance
Defining a measure for the efficiency of a consensus protocol is not an easy task since it relies on many different aspects. Natural choices are the number of blocks sent between the participants and, in synchronous models, the number of communication steps. In DLTs, common measures are the number of transactions per second and the time to confirmation. As our protocol uses implicit voting and no direct blocks are exchanged between the nodes, it is optimal in block complexity (if votes are cast through blocks that would have been sent anyway). We present estimates for the time to confirmation and show their dependence on the distribution of the weights. We do not evaluate quantitative performance measures such as throughput and energy consumption in this work. This type of study will be addressed in follow-up research.
A common misunderstanding is that asynchronous consensus protocols are not appropriate for time-critical applications [26]. The fallacy is that synchronous protocols rely on strong synchronicity assumptions, and their security is harmed once these assumptions are not satisfied. We argue that the converse is true and that asynchronous protocols might be better suited for time-critical applications. Under good communication conditions, transactions are approved much faster than in synchronous models, which are based on network delay estimations with a substantial security margin.
One main drawback of the leader-based architecture of blockchains is its lack of scalability capability. To make this more precise, let \begin{equation*} q < \frac {1-q}{1+(1-q)\lambda \Delta }.\tag{1}\end{equation*}
In our case, there is no theoretical upper limit on the throughput of the protocol; however, the limits of its scalability still need to be investigated in future work.
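To make the trade-off expressed by inequality (1) concrete, the following sketch evaluates it numerically. The definitions of the symbols are not reproduced in the text above, so the sketch assumes the reading used for the tight consistency bounds of longest-chain protocols: q is the adversarial fraction of the scarce resource, λ is the block production rate, and Δ is the bound on the network delay. The function names are hypothetical. Solving (1) with equality gives a quadratic in q whose smaller root is the largest tolerable adversarial fraction.

```python
import math


def is_secure(q, lam, delta):
    """Check inequality (1): q < (1 - q) / (1 + (1 - q) * lam * delta)."""
    return q < (1 - q) / (1 + (1 - q) * lam * delta)


def max_adversarial_fraction(lam, delta):
    """Supremum of q satisfying (1).

    With c = lam * delta, equality in (1) reads c*q^2 - (2 + c)*q + 1 = 0;
    the smaller root is written in a form that stays well defined for c = 0.
    """
    c = lam * delta
    return 2.0 / (2.0 + c + math.sqrt(c * c + 4.0))


# Assumed example values: one block every 600 s on average, 10 s delay bound.
slow_chain = max_adversarial_fraction(1 / 600, 10)
# Raising the block rate 100-fold lowers the tolerable adversarial fraction.
fast_chain = max_adversarial_fraction(100 / 600, 10)
```

Under this reading, the threshold equals 1/2 when λΔ = 0 and decreases as the product λΔ grows, which is one way to see why the block rate of a longest-chain protocol cannot be increased freely without sacrificing security.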
D. Related Work on DAG-Based Protocols
We already mentioned various related works in the general introduction. This section focuses on the general architecture and mentions previous proposals that use DAGs in the underlying data structures. Blockchain-based protocols rely on a chain or “linearisation” of blocks that create a total order of transactions. These blocks have three purposes: leader election, data transmission and voting for ancestor blocks through the chain structure, see [34]. Each of these aspects can be, individually or combined, addressed through a DAG-based component. The various proposals differ in how and which of these components are replaced by a DAG.
The most common approach is to use a DAG structure for the data transmission. This is the most natural approach since if blocks are created at a high rate compared to their propagation time, many competing or even conflicting blocks are created, leading to frequent bifurcation points of the chain. As this results in a performance loss, a natural proposal is to include not only the “main chain” but also bifurcations using additional references, e.g. [2], [35], [36], [37], [38].
Protocols can also achieve a higher degree of parallelisation of the data transmission or writing access if all participants can write and propose blocks. This concurrent writing access considerably reduces the performance limitations of traditional blockchains. In blockchains where only a tiny proportion of participants can write to the ledger, and these participants are randomly chosen, e.g. by PoW or PoS, participants need to communicate the set of pending transactions to all their peers. This memory pool is a considerable performance limitation as nodes must broadcast transactions twice. Several interesting proposals allow participants to add concurrent blocks to the ledger and to construct a distributed memory pool in the form of a DAG. In the following, we give two approaches that differ in how consensus is achieved and in the underlying Sybil protection. More specifically, the first utilises a permissioned setting, while the second employs a permissionless setting.
In the permissioned setting there is the following interesting line of research. The aim is to construct an atomic broadcast protocol based on a combined encoding of the data transmission history and voting on “leader blocks”. Such protocols allow the network participants to reach a consensus on a total ordering of the received transactions, and this linearised output forms the ledger. The most robust protocols achieve Byzantine fault tolerance in asynchronous settings and reach optimal communication complexity, see Honeybadger [39] and [40]. Improvements are proposed, for example, in Hashgraph [41] and Aleph [42] and more recently in Narwhal [43] based on the encoding of the “communication history” in the form of a DAG. These protocols remove the bottleneck of data dissemination of the classical Nakamoto consensus by decoupling the data dissemination from the consensus finding. Promising improvements for the consensus finding on top of the DAG-based memory pool were recently made in DAG Rider [44] and Bullshark [45]. We also want to mention [46] that analyses and discusses this kind of protocol from a more abstract and general point of view.
There is a common point with our approach to mention here. A DAG structure serves as a “testimony” of the communication among the nodes, and new blocks are used for (implicit) voting on previous blocks. In other words, the DAG is used for the two purposes of data transmission and voting. However, voting is done only over so-called “anchor blocks”, leading to an a posteriori leader election and total ordering of the transactions. Furthermore, and as mentioned above, these DAG-based broadcast protocols are designed for permissioned networks, which leads to similar safety-liveness properties to standard BFT protocols. A difference is, thus, that our protocol is designed for an asynchronous network environment and, unlike the proposals above, is not round-based.
In the permissionless setting, another route is taken by Prism [34]. This approach explicitly decomposes the three purposes of blocks into three types: proposer blocks, transaction blocks and voter blocks. Having separate transaction blocks allows participants to issue transactions and removes the need for a memory pool. The three types of blocks form a structured DAG that allows a very efficient way to vote on “leader blocks” that eventually give consensus via total ordering. Our approach is orthogonal in that we do not distinguish between different kinds of blocks but that the underlying DAG delivers consensus without an additional tool. In an implementation [47] of Prism, another DAG was used to increase the performance of the execution of the transaction. More precisely, [47] used a scoreboarding technique to execute the (totally) ordered UTXO transactions in parallel. In our approach, we actively construct a DAG, called the Ledger DAG, that encodes the dependencies of the transactions. This DAG is created before reaching consensus and allows tracking dependencies between pending or conflicting transactions. It was demonstrated in [48] that Prism can also support smart contract platforms and that in their implementation, the bottleneck is no longer the consensus but the execution of the smart contracts.
The main difference of our proposal to all the aforementioned protocols is that consensus is found on the heaviest DAG without the need for a “linearisation” using any leader selection. This reduces the purposes of blocks to data transmission and voting.
We want to mention another class of DAG-based and leaderless consensus protocols. However, it is conceptually different from the proposals above and our proposal. In this kind of protocol, e.g. [13], [49], the voting is performed via direct queries between the peers and hence necessitates an additional communication layer. A DAG structure is used in Avalanche [13] to “transitively” vote on several blocks at once. We note, however, that the authors of [13] fail to analyze their proposed protocol properly, and the question of whether it has the desired properties remains unclear, e.g. [50, Sec. 2.3].
Finally, let us note that the above is only a selection of previous work on DAG-based DLTs and refer the reader to [10] for a more detailed summary.
E. List of Acronyms and Symbols
For the reader’s convenience, in this section, we summarize important notations and acronyms that are used throughout the paper. Furthermore, in Appendix C we provide a glossary of the terms in use in this paper.
Abbreviation | Expansion |
Acronyms: | |
AW | Approval Weight |
dRNG | Distributed Random Number Generator |
DAG | Directed Acyclic Graph |
DLT | Distributed Ledger Technology |
OTV | On Tangle Voting |
P2P | Peer-to-Peer |
PoW | Proof-of-Work |
PoVP | Proof-of-Voting-Power |
TSA | Tip Selection Algorithm |
TTC | Time to Confirmation |
UTXO | Unspent Transaction Output |
WW | Witness Weight |
Symbols: | |
Set Symbols | |
set of branches | |
set of conflicts | |
set of nodes in network | |
ledger or set of transactions | |
set of blocks | |
DAGs | |
Ledger DAG | |
Tangle DAG | |
Voting DAG | |
set of children of vertex | |
future cone of vertex | |
past cone of vertex | |
directed acyclic graph (DAG) with vertex set | |
genesis or vertex with out-degree zero | |
set of maximal elements in set | |
set of minimal elements in set | |
set of neighbours of a vertex | |
partial order on set | |
set of parents of vertex | |
supporters of | |
Time Symbols | |
time to confirmation defined on | |
confluence time defined on | |
solidification time defined on | |
Weight Functions | |
weight function defined on | |
Approval Weight defined on | |
Witness Weight defined on |
Graph Structures:
We employ several graph structures as a base for the consensus protocol. Table 1 gives an overview of the utilised graphs.
Graph Theoretical Preliminaries
In this section, we summarize basic graph theoretical notations that are used in the remaining part of the paper.
The set of integers between 1 and
Definition 1 (DAG):
A directed acyclic graph (DAG) is a directed graph with no directed cycles, i.e. by following the directions of edges, we never form a closed loop.
A vertex
Definition 2 (Neighbours in a Graph):
Let
Definition 3 (Parents, Children and Leaves in a DAG):
Let
Definition 4 (Partial Order Induced by a DAG):
Let
Note that there could be different DAGs producing the same partial order. The DAG with the fewest number of edges that gives the partial order
Definition 5 (Minimal subDAG Induced by a Set of Vertices):
Let
Definition 6 (Maximal and Minimal Elements):
Let
Definition 7 (Future and Past Cones):
Let
Definition 8 (Past-Closed Sets):
Let
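The notions named in the definitions of this section (acyclicity, parents and children, past and future cones, past-closed sets) can be illustrated with a small sketch. It assumes the standard graph-theoretic meaning of these terms and two conventions chosen only for the example: edges point from a vertex to its parents, so that the genesis has out-degree zero, and a cone contains its apex. All function names are hypothetical, and every vertex is assumed to appear as a key of the `parents` dictionary.

```python
def reachable(adjacency, v):
    """Set of vertices reachable from v along `adjacency`, including v itself."""
    seen, stack = {v}, [v]
    while stack:
        for u in adjacency.get(stack.pop(), ()):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


def children_of(parents):
    """Invert the parent relation: vertex -> set of vertices referencing it."""
    children = {v: set() for v in parents}
    for v, ps in parents.items():
        for p in ps:
            children[p].add(v)
    return children


def is_dag(parents):
    """Kahn's algorithm: the graph is acyclic iff every vertex can be removed
    after all of its parents have been removed."""
    children = children_of(parents)
    remaining = {v: len(set(ps)) for v, ps in parents.items()}
    ready = [v for v, n in remaining.items() if n == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for c in children[v]:
            remaining[c] -= 1
            if remaining[c] == 0:
                ready.append(c)
    return removed == len(parents)


def is_past_closed(parents, subset):
    """A set is past-closed if it contains the parents of each of its elements
    (and hence, by induction, their entire past cones)."""
    return all(set(parents[v]) <= subset for v in subset)


parents = {"g": [], "a": ["g"], "b": ["g"], "c": ["a", "b"], "d": ["c", "b"]}
children = children_of(parents)

past_c = reachable(parents, "c")      # past cone of c
future_a = reachable(children, "a")   # future cone of a
tips = {v for v, cs in children.items() if not cs}  # vertices without children
```

Computing the future cone is the same traversal as the past cone, applied to the inverted relation, which is why only one reachability routine is needed.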
Nodes and Participation
At a high level, DLTs can be divided into permissioned and permissionless networks. In a permissioned setting, only selected parties can participate, while in the permissionless setting, anyone can join the network at any time. In a permissioned network, participants have either reading access or writing (validation) rights. A “fully” permissioned (or private) DLT selects the participants in advance and restricts any activity in the network to these only. This is in contrast to a permissionless network where anybody can participate in the network and validate the ledger. Our protocol can work in both settings using a generic weight function on the participating nodes. In the permissionless setting, this weight function serves as a Sybil protection, and in the permissioned setting, this weight function regulates the participant’s influence.
In Section III-A, we introduce the network participants called nodes. In Section III-B we describe a Sybil protection mechanism based on assigning specific weights to nodes. Finally, in Section III-C we discuss how the writing ability of nodes is controlled by their weight.
A. Network
The network participants in the DLT are called nodes, and we denote the set of all nodes by
In contrast to other DLTs, where nodes can be divided into separate functional classes, we assume all nodes behave in the same way. Specifically, all nodes have two main roles. First, they propagate specific blocks through the network by receiving and sending these from and to their neighbours. Second, by creating new blocks and appending them to the data structure, nodes implicitly vote on the state of the previous blocks and their contained transactions; this procedure is called On Tangle Voting (OTV), see Section VI. For the voting part, we assume a scarce resource, see Section III-B. This resource endows every node with a certain weight that is used for the implicit voting procedure.
B. Sybil Protection
A common problem in permissionless distributed systems is that it is easy to spawn a large number of nodes, which is known as the Sybil attack. Thus, any critical component must ensure that the actions of nodes are limited; otherwise, it would be trivial for an attacker to gain a disproportionately large influence and corrupt the protocol.
To limit or prevent Sybil attacks, we assume that each node can be associated with a particular reputation or weight attributing them an equivalent proportion of voting power in the applied voting mechanism.
Definition 9 (Weight):
For a given node \begin{equation*} \sum _{i\in \mathcal {N} } \mathbf {w}(i)=1.\end{equation*}
The above weight function plays a crucial role in the validation process, see Sections IV-D–VI-D.
Remark 1:
We make use of the same weights as a control for the writing access in Section III-C. Note, however, that the weight for writing and validation could be different.
A common way to implement such a weight is the so-called resource testing, where each identity has to prove the ownership of specific difficult-to-obtain resources. Since in the cryptocurrency world, users own a certain amount of a scarce resource, i.e. tokens, a practical Sybil protection mechanism can be based on proving the ownership of tokens and, thus, a certain amount of collateral.
Another way of implementing the weights is through delegation methods. The owners of source tokens, from which the weights are derived, can then delegate these weights to any node of their choosing. This brings several key advantages. For example, fund owners can delegate weight to nodes that provide good service or revoke it when the node does not behave as expected, thus enabling the implementation of a “reputation” system. In the extreme case, this even allows decoupling the weights from the token distribution and incorporating real-world trust models.
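A minimal sketch of how token balances and delegations could be turned into a normalised weight function is given below. The data layout (a map from owners to balances and a map from owners to nodes) and the function name `delegated_weights` are assumptions made for illustration; the protocol itself only requires that the weights sum to one.

```python
def delegated_weights(balances, delegation):
    """Map token balances to node weights that sum to 1.

    balances:   owner -> token amount (assumed non-negative)
    delegation: owner -> node that receives the owner's weight
    """
    raw = {}
    for owner, tokens in balances.items():
        node = delegation.get(owner)
        if node is None:
            # Assumption of this sketch: undelegated tokens carry no weight.
            continue
        raw[node] = raw.get(node, 0) + tokens
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("no delegated tokens")
    return {node: tokens / total for node, tokens in raw.items()}

balances = {"alice": 60, "bob": 30, "carol": 10}
delegation = {"alice": "node1", "bob": "node2", "carol": "node1"}
weights = delegated_weights(balances, delegation)
```

Revoking a delegation in this sketch amounts to changing the entry in `delegation` and recomputing the weights.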
Generally, the weight distribution in our system may change over time due to changes in the weights or inevitable churns (nodes join and leave). Due to the asynchronous nature of the protocol, the perception of the weights may then differ from node to node. The protocol design considers this effect and allows a certain divergence in the weight vector. This tolerance to different perceptions provides for some additional features of the protocol. However, a more detailed discussion of a divergence in the nodes’ view on the weight vector is out of the scope of this paper. Thus, for simplicity, we make the following assumption.
Assumption 1 (Agreement on Stability of Weights):
All nodes in the network perceive the weight of node
C. Writing Access
The distributed nature of the protocol and the Byzantine environment within which it operates put several constraints on the writing access. These constraints are even more critical for our protocol since it is not leader-based and does not rely on the intermediary of miners and block creators. Similar to [51] we require the following conditions:
Consistency: if a block that is issued by an honest node is written to the (distributed) database by one honest node, it should eventually be written by all honest nodes.
Fairness: given a weight function and a maximum bandwidth, nodes can issue blocks at a rate proportional to their weight.
Security: the above constraints are guaranteed in a Byzantine environment.
Consequently, the protocol should ensure that in congested scenarios only a limited number of blocks is propagated, i.e. the block rate is capped by a certain throughput. Furthermore, this should happen fairly. These requirements prevent nodes from becoming overloaded and prevent inconsistencies in the ledger. In principle, this could be achieved through fees and PoW, or through more novel alternatives such as the access control algorithm presented in [51].
For the safe operation of the consensus mechanism, we assume the availability of such a mechanism. The required tool should provide guarantees on the constraints mentioned above. We make the following assumption.
Assumption 2 (Writing Access):
The writing access is controlled such that consistency, security, and fairness in writing access are guaranteed for a given weight function
Block Structure and Witness Weight
In this section, we introduce our protocol’s data structure concepts. To replicate a certain content over the distributed network, a node must wrap this content in a block. However, when the content consists of transactions, we require a block to contain only one transaction in its payload. This assumption is made for the sake of a clearer presentation and can be relaxed so that blocks contain more than one transaction. Moreover, each block has to refer to at least two blocks issued in the past. The latter requirement is motivated by the leaderless architecture of our protocol, in which each node can issue blocks independently of others. In addition, we discuss a particular metric on blocks, called the Witness Weight, that allows nodes to reliably understand when a significant fraction of the network has seen a given block.
In Section IV-A, we formally define a block. Section IV-B discusses the Tangle, a DAG formed by blocks and their references. The local version of the Tangle seen by a specific node is introduced in Section IV-C. Using the weight function for nodes introduced in Section III-B, we formally define the Witness Weight of a given block in the local Tangle in Section IV-D and show how to use this metric as a confirmation rule for blocks in Section IV-E. The analysis of the growth of the Witness Weight is provided in Section IV-F.
A. Blocks
The protocol’s goal is to replicate certain content between the nodes in the network reliably. For example, this content could be the atomic updates of balances of fund owners.
This content is wrapped into an object that we call block. A node that would like to initiate the addition of certain content to the Tangle across the network assembles such a block, which includes the content,
Simplified block layout with a transaction as content. The fund owner provides the node with the transaction. The node wraps the transaction into a block and signs the block.
Definition 10 (Block):
A reference \begin{equation*} x=(\{ \mathrm {ref}_{1}(x),\ldots, \mathrm {ref}_{k}(x)\}, \hat {x}, \mathrm {nodeID(x)}),\end{equation*}
Remark 2:
A collision-resistant hash function is used to map data of arbitrary size to a fixed-size binary sequence, i.e.
Remark 3:
The label
The issuing node obtains the content through a service-client relationship with the issuer of the content, which can be facilitated through an application programming interface (API) call. Alternatively, the node itself may also be the issuer of the content. An essential application for the content is the transfer of funds, i.e. the consumption and creation of outputs. We call this type of content a transaction. In this paper, for the sake of presentation, we will assume that each block contains exactly one transaction in its payload. However, in general, blocks are not limited to this use case.
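The block layout of Definition 10 can be sketched as a small immutable record. The sketch below is an illustration under several assumptions: references are reduced to plain block identifiers, the signature is omitted, and the identifier is taken to be a SHA-256 hash of the block fields. The class name `Block` and the method `block_id` are hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    refs: frozenset   # identifiers of the referenced blocks
    content: bytes    # payload, e.g. a serialised transaction
    node_id: str      # identifier of the issuing node

    def block_id(self) -> str:
        """Hash the fields in a canonical order (assumed encoding)."""
        canonical = repr((sorted(self.refs), self.content, self.node_id))
        return hashlib.sha256(canonical.encode()).hexdigest()

genesis = Block(frozenset(), b"genesis", "node0")
first = Block(frozenset({genesis.block_id()}), b"tx-1", "node1")
second = Block(frozenset({genesis.block_id(), first.block_id()}), b"tx-2", "node2")
```

Because a block identifier depends on the identifiers of the referenced blocks, changing any block in the past cone changes the identifiers of all blocks that reference it.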
As blocks will also be used to propagate votes, keeping track of the issuing nodes is crucial.
Definition 11 (Issuer of a Block):
For a block
B. The Tangle
The Tangle is a data structure built in accordance with the following rule as stated in the original paper [3] of the Tangle: “In order to issue a [block], a node chooses two other [blocks] to approve”.
More generally, we modify this by allowing a block to reference up to
Let us define this data structure more formally. We denote the set of blocks by
Definition 12 (The Tangle):
The Tangle
Using the notation from Section II, we write
Example 1:
We refer to Figure 3 for an illustration of the Tangle and the Tangle future and past cones of block
C. Local Tangles
Due to the distributed nature of the network, nodes can receive blocks at differing times or even out of order. The time at which a node first receives a block is called arrival time.
Blocks can also be lost during their broadcast. While, generally, this could be problematic, the Tangle DAG allows for an elegant solution to remedy the loss by a process called solidification. If a node receives a block for which the parents are unknown, it requests the missing block from its peers. Upon receipt of the missing parent block, the past cone is now complete (unless their parents are missing - in which case the node has to repeat this procedure recursively). Once a block’s past cone is completed, the node flags the block as solid. The time of solidification of a block
As a consequence of the above, we can argue that there is no such thing as one Tangle in the network, as every node may have a different perception of it. Hence, at time
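The solidification procedure described in this section can be sketched as follows. This is a minimal illustration, assuming that blocks are identified by strings and that requesting a block from peers is handled elsewhere; the class `LocalTangle` and its methods are hypothetical names.

```python
class LocalTangle:
    """Local view of one node: stores blocks and tracks which ones are solid."""

    def __init__(self, genesis_id):
        self.parents = {genesis_id: frozenset()}  # known blocks and their parents
        self.solid = {genesis_id}                 # blocks with a complete past cone

    def receive(self, block_id, parent_ids):
        """Store a block and return the parents that must be requested from peers."""
        self.parents[block_id] = frozenset(parent_ids)
        missing = {p for p in parent_ids if p not in self.parents}
        self._update_solid()
        return missing

    def _update_solid(self):
        # A block becomes solid once all of its parents are solid; repeat
        # until no further block changes status.
        changed = True
        while changed:
            changed = False
            for block_id, parent_ids in self.parents.items():
                if block_id not in self.solid and parent_ids <= self.solid:
                    self.solid.add(block_id)
                    changed = True

tangle = LocalTangle("g")
to_request = tangle.receive("c", {"a", "b"})  # parents unknown, c is not solid yet
tangle.receive("a", {"g"})
tangle.receive("b", {"g"})                    # now a, b and c are all solid
```

Blocks that arrive out of order simply stay non-solid until the missing part of their past cone has been received.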
D. Witness Weight and Weighted Local Tangles
In the original Tangle whitepaper [3] the cumulative weight of a block plays a crucial role in the consensus finding. This cumulative weight is the number of blocks referencing a given block. In case of a conflict, nodes follow the part of the Tangle that contains the largest cumulative weight.
We adapt this fundamental idea to the setting where each node carries some weight. In this way, the nodes’ weight replaces the PoW in the block creation as a Sybil protection mechanism. The node’s signature in each block links the issuing node to the block (see Section IV-A). Thus, a node can be associated with the set of blocks on the Tangle issued by that node, and the node’s weight can be mapped to the blocks.
Definition 13 (Block Supporter and Witness Weight):
Let \begin{equation*} \mathrm {sprt}_{ \mathcal {T}_{i,t}}(x)=\left \{{j\in \mathcal {N}: \exists y \in \mathrm {cone}_{ \mathcal {T}_{i,t}}^{(f)}\left ({x}\right), j= \mathbf {issue}(y)}\right \}.\end{equation*}
\begin{equation*} \mathbf {WW}_{i,t} (x):= \sum _{j \in \mathrm {sprt} _{ \mathcal {T}_{i,t}}(x)} \mathbf {w}(j).\tag{2}\end{equation*}
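The supporter set and the Witness Weight in (2) can be computed directly from a local Tangle. The sketch below assumes that the local Tangle is given as a map from each block to its children, that the future cone contains the block itself, and that `issuer` and `weight` are plain dictionaries; all function names are hypothetical.

```python
def future_cone(x, children):
    """Blocks that directly or indirectly reference x, including x itself."""
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for c in children.get(v, ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def supporters(x, children, issuer):
    """Nodes that issued at least one block in the future cone of x."""
    return {issuer[y] for y in future_cone(x, children)}

def witness_weight(x, children, issuer, weight):
    """Sum of the weights of all supporters of x, cf. (2)."""
    return sum(weight[j] for j in supporters(x, children, issuer))

# Assumed toy data: y and z reference x, and u references y.
children = {"x": {"y", "z"}, "y": {"u"}, "z": set(), "u": set()}
issuer = {"x": "A", "y": "B", "z": "A", "u": "C"}
weight = {"A": 0.5, "B": 0.25, "C": 0.25}

ww_x = witness_weight("x", children, issuer, weight)  # supporters A, B, C
ww_y = witness_weight("y", children, issuer, weight)  # supporters B, C
```

Each node is counted at most once, no matter how many blocks it issued in the future cone, so issuing more blocks does not increase the Witness Weight contributed by a single node.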
Example 2:
In Figure 4, we give an example of the set of nodes approving given blocks
Tangle DAG, where the issuing node of a block can be identified with a unique colour shown at the bottom of the block. The colours of the supporters of blocks
We proceed with two trivial statements saying that the WWs of blocks are monotonically increasing toward the genesis and the WW of a block can only grow over time.
Lemma 1 (Monotonicity of the WW):
For any two blocks
Lemma 2 (Growth of the WW):
For any block
A more delicate analysis of the growth of the WW under certain assumptions is provided in Section IV-F.
E. Confirmation Rule for Blocks
The block stream is controlled by the writing access control, see Section III-C. A priori, this control alone may not be sufficient to guarantee that all nodes see all blocks in the network. However, to guarantee the safety of the system, nodes must have consensus on which blocks should permanently be accepted in the data set
Tools that provide information about the confirmation status of blocks, with specific safety and liveness considerations, are generally referred to as confirmation rules. We design such a tool based on the concept of WWs of the blocks. The WW allows the nodes and users to create their subjective confirmation criterion. The larger the WW of a block, the higher the probability that the block will be in the ledger forever. This idea is similar to the “depth” of a transaction in a blockchain. Therefore, the actual confirmation criterion may depend on the protocol environment and the underlying use case.
Definition 14 (Confirmed Block):
Let
Once a block is confirmed for a node, it remains confirmed forever. This irreversibility of the confirmation status places some strong requirements on the convergence of this status. More specifically, once a single node reaches the threshold for a given block, all nodes should reach this threshold eventually with a very high probability.
In an honest scenario, this assumption can be easily satisfied since a high WW also indicates that a large proportion of nodes have “seen” a given block and issued a block approving it. If the default tip selection algorithm is suitably chosen and followed by sufficiently many nodes, all nodes will eventually attach blocks to the future cone of that block with a very high probability (for more details, see Section IV-F). In Section VIII we discuss the liveness and safety of the protocol in detail.
F. Growth of Witness Weight
In this section, we model the block issuance and discuss the growth of the WW and its dependencies on the protocol environment.
We consider the following assumption.
Assumption 3 (Issuing Rate):
Each node \begin{equation*} \lambda =\sum _{i\in \mathcal {N} }\lambda _{i}.\end{equation*}
Remark 4:
Under Assumption 3 the times between two successive blocks from a node
To develop a heuristic for the WW we use the following approach. We assume that there is an “omniscient observer” that is instantly aware of all blocks issued by all nodes. The observer’s perception of the state may differ from the perception of a given node; however, these differences have no substantial influence on the heuristic result. We refer to [52], [53] where this method has already been shown to lead to good heuristics. This view is reflected in the notation by omitting the index
Let
For \begin{equation*} \mathbf {WW}_{t}(x) = \sum _{i=1}^{N} \mathbf {w}(i) \mathbf {1}\{E_{i}(\delta,x)\}.\tag{3}\end{equation*}
\begin{equation*} \mathbb {P}(E_{i}(\delta,x)) \leq 1- \exp (- \delta \lambda \mathbf {w}(i)).\tag{4}\end{equation*}
\begin{equation*} \mathbb {E}[\mathbf {WW}_{t}(x)] \leq \sum _{i=1}^{N} \mathbf {w}(i) \left ({1- \exp (- \delta \lambda \mathbf {w}(i))}\right).\tag{5}\end{equation*}
The formula given in (3) holds in a very general setting. For the analysis of the protocol, it is, however, important to consider a specific weight distribution. Probably the most appropriate models of weight distributions rely on universality phenomena. The most famous example of such a universality phenomenon is the central limit theorem. While the central limit theorem is suited to describe statistics where values are of the same order of magnitude, it is not appropriate to model more heterogeneous situations where the values might differ by several orders of magnitude. These heterogeneous situations are frequently described by a Zipf law and appear in many fields, e.g. city populations, internet traffic data, the formation of P2P communities, company sizes, and science citations. We refer to [54] for a brief introduction and more references, and to [55], [56], and [57] for the appearance of Zipf’s law on the internet, computer networks, and DLTs.
We consider a situation with \begin{equation*} \mathbf {w}(r) = \frac {r^{-s}}{ \sum _{j = 1}^{N} j^{-s}},\tag{6}\end{equation*}
Example 3:
We refer to Figure 5. The growth of the WW depends on several factors, notably the issuing rate
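To make the dependence on the parameters concrete, the following sketch evaluates the right-hand side of (5) for the Zipf weights in (6). The parameter values in the example (100 nodes, Zipf parameter 0.9, 100 blocks per unit time) are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math

def zipf_weights(n, s):
    """Weights w(r) = r^(-s) / sum_j j^(-s) for ranks r = 1..n, cf. (6)."""
    norm = sum(j ** -s for j in range(1, n + 1))
    return [r ** -s / norm for r in range(1, n + 1)]

def expected_ww_bound(weights, lam, delta):
    """Right-hand side of (5): sum_i w(i) * (1 - exp(-delta * lam * w(i)))."""
    return sum(w * (1.0 - math.exp(-delta * lam * w)) for w in weights)

weights = zipf_weights(100, 0.9)
for delta in (0.1, 1.0, 10.0):
    bound = expected_ww_bound(weights, lam=100.0, delta=delta)
    print(f"delta={delta}: bound on E[WW] = {bound:.3f}")
```

The bound starts at zero for a vanishing elapsed time and approaches one as the elapsed time grows; heavy nodes contribute quickly, while the many light nodes in the tail of the Zipf distribution contribute slowly.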
G. Estimates on Time to Confirmation
As discussed in Section IV-E, a confirmation rule is essential for many use cases, and time to confirmation (TTC) is undoubtedly a vital performance measure of every consensus protocol. As a thorough analysis of the TTC is out of the scope of this paper, we give a first “heuristic” upper bound in this section.
Definition 15 (Time to Confirmation):
We define the time to confirmation of a block \begin{equation*} \tau _{f,i}=\tau _{f,i}(x):= \inf \{ t>0: \mathbf {WW}_{i,t}(x) \ge \theta \} - \tau _{s,i}(x),\tag{7}\end{equation*}
In the remainder of this section, we omit index \begin{equation*} \tau _{f} \leq \tau _{c} + \tau _{iss}.\tag{8}\end{equation*}
Example 4:
We demonstrate the confluence time and the issuance time with the help of Figure 6. Blocks with a solid frame are in the future cone of block
Illustration of the Tangle to display the confluence time and issuance time. The colours at the bottom of the blocks represent the issuing nodes with significant weight. We demonstrate the colours of the “heavy” supporters of block
With some additional assumptions, we can obtain estimates for the confluence time
Assumption 4 (Constant Network Delay):
We assume that the time between the creation of a block and its receipt by any other node equals some constant
Definition 16 (Number of Tips):
Let
As mentioned in Section IV-C, there is no “objective Tangle,” and every node has its own perception. Nevertheless, previous work [52] showed that the approximation made in this section leads to reasonable approximations for some quantitative properties of the Tangle, such as the number of tips and confluence times. For this reason, we omit the subscript “
Assumption 5 (Constant Tangle Width):
We assume that the number of tips
Using Assumptions 3, 4, and 5 we follow the heuristics described in [3, Sec. 3]. A first observation is that at any given time \begin{equation*} L_{0} = L_{0}^{(k)} = \frac { k \lambda h}{k-1}.\tag{9}\end{equation*}
A first consequence of (9) is that, if \begin{equation*} h+L_{0}/(k\lambda)=h+ \frac {h}{(k-1)}.\tag{10}\end{equation*}
Remark 5:
For any given
We can proceed similarly to [3] to obtain that \begin{equation*} \tau _{c} \approx \frac {h}{W\left ({\frac {(k-1)^{2}}{k} }\right)} \left ({\log L_{0} + \log \varepsilon }\right),\tag{11}\end{equation*}
\begin{equation*} W\left ({\frac {(k-1)^{2}}{k} }\right)\approx 2 \log (k-1) - \log k \approx \log k\end{equation*}
\begin{equation*} \tau _{c} \approx \frac {h}{\log k} \log (L_{0}) \approx \frac {1}{\log k} h \log (\lambda h).\tag{12}\end{equation*}
Example 5:
The behaviour of the issuing time \begin{equation*} X_{(i)} \sim \frac {1}{\gamma } \sum _{j=1}^{i} \frac {Z_{j}}{N-j+1},\tag{13}\end{equation*}
\begin{equation*} \mathbb {E}[X_{(i)}] = \frac {1}{\gamma } \sum _{j=1}^{i} \frac {1}{N-j+1},\end{equation*}
\begin{equation*} \mathbb {E}[X_{(i)}] \approx \frac {N}{\lambda } \left ({\log (N) - \log (N-i) }\right).\end{equation*}
\begin{equation*} \tau _{iss} \approx \mathbb {E} [X_{(i)}] \approx \frac {N}{\lambda } \left ({- \log (1-\theta) }\right).\end{equation*}
\begin{equation*} \tau _{f} \lesssim \frac {1}{\log k} h \log (\lambda h) + \frac {N}{\lambda } \left ({- \log (1-\theta) }\right).\end{equation*}
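The heuristic estimates above are straightforward to evaluate numerically. The sketch below implements (9), (12), the approximation of the issuance time, and the resulting bound on the time to confirmation. It assumes that the number of references satisfies k ≥ 2, that λh > 1, and that 0 < θ < 1; the issuance-time formula is used as stated in the text, which appears to correspond to nodes of equal weight. The parameter values and function names are illustrative assumptions.

```python
import math

def tips_estimate(k, lam, h):
    """L_0 = k * lam * h / (k - 1), cf. (9)."""
    return k * lam * h / (k - 1)

def confluence_time(k, lam, h):
    """tau_c ~ h * log(lam * h) / log(k), cf. (12)."""
    return h * math.log(lam * h) / math.log(k)

def issuance_time(n, lam, theta):
    """tau_iss ~ (N / lam) * (-log(1 - theta))."""
    return n / lam * -math.log(1.0 - theta)

def ttc_bound(k, lam, h, n, theta):
    """Heuristic upper bound tau_f <~ tau_c + tau_iss, cf. (8)."""
    return confluence_time(k, lam, h) + issuance_time(n, lam, theta)

# Assumed values: 2 references, 100 blocks/s, 1 s delay, 100 nodes, threshold 0.67.
k, lam, h, n, theta = 2, 100.0, 1.0, 100, 0.67
print("tips L_0:", tips_estimate(k, lam, h))
print("confluence time:", confluence_time(k, lam, h))
print("issuance time:", issuance_time(n, lam, theta))
print("TTC bound:", ttc_bound(k, lam, h, n, theta))
```

Increasing the number of references k shortens the confluence time only logarithmically, whereas the issuance time is governed by the per-node issuing rate λ/N and the threshold θ.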
The Ledger
This section introduces several novel concepts to represent transactions and their interrelationships. Recall that in the standard UTXO conflict-free model, transactions specify the outputs of previous transactions as inputs and create new outputs by spending (or consuming) the inputs. No two transactions consume the same input. Such a conflict-free data structure can be implemented in a network where a consensus mechanism filters transactions. The latter is typically done by choosing a “leader” among the participants, and the leader adds a block of transactions to the conflict-free ledger. To bypass this “centralised” bottleneck, we propose the concept of the Reality-based UTXO Ledger, an augmented version of the standard conflict-free UTXO Ledger that allows an output to be spent more than once. We refer the reader to the parallel work [15], where we discuss all concepts in detail.
In Section V-A, we recall the definition of a transaction in the UTXO model and the ledger, which is a set of all transactions. In Section V-B, we introduce definitions of conflicting transactions, conflicts and branches, which represent proper subsets of “non-conflicting conflicts”. A reality is a maximal possible branch, and restricting a ledger to a reality results in the conflict-free UTXO Ledger. Finally, in Section V-C we discuss how nodes could choose a reality given an abstract weight function defined on the set of conflicts. The selected reality allows a node to express its opinion when issuing new blocks and validating transactions.
A. UTXO Model and Transactions
In the Unspent Transaction Output (UTXO) model transactions specify the outputs of previous transactions as inputs and spend them by creating new outputs.
Thus, a transaction consists of a list of inputs and a list of outputs, see Figure 2. Note that outputs must be unique. The uniqueness is typically achieved by creating the output ID with the involvement of a hash function. For example, the output ID could be the concatenation of the index of an output and the hash of a transaction’s content. Every output represents a specific amount of the underlying cryptocurrency. The value of all inputs, i.e. spent outputs, must equal the value of all outputs of a transaction. With each output comes a declaration by whom and under which conditions it can be spent. Under unlock conditions, e.g. a signature proving ownership of a given input’s address, the transaction issuer is allowed to spend the inputs. We refer to Figure 2 for a general transaction layout.
As said in Section IV-A, blocks contain transactions in their payload. Hereafter, we write
Let us define the transactions and ledger model more formally. We follow the approach of [59].
Definition 17 (Output and Input):
An output is a pair of a value
Definition 18 (Transaction):
A transaction
$\mathrm {in}(\hat {x})=(i_{1},\ldots, i_{n})$ is a list of inputs, i.e. references to unconsumed outputs. We say that those outputs are spent or consumed by transaction $\hat {x}$;
$\mathrm {out}(\hat {x})=(o_{1},\ldots, o_{m})$ is a list of new outputs produced by transaction $\hat {x}$;
$\mathrm {unlock}(i)$ is a proof that verifies the unlock conditions of each input $i$ of transaction $\hat {x}$. This is usually done by a cryptographic proof of authorization that ensures that the issuer of the transaction satisfies the condition cond of the consumed outputs.
Definition 19 (Ledger):
The ledger is a set of transactions and denoted as
The UTXO ledger starts at the so-called genesis which contains outputs and no inputs. We emphasize that we use the same term for the ultimate predecessor of all blocks and all transactions. Recall that the genesis-block is written as
Typically, every output can be consumed by at most one transaction and, hence, the value of all unspent outputs is conserved overall. Specifically, in the standard conflict-free UTXO model, the ledger cannot contain a so-called double spend, i.e. two transactions that consume the same output of a transaction.
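A double spend can be detected by recording, for every output, which transactions consume it. The following sketch assumes a simplified transaction record in which inputs and outputs are plain output identifiers; values and unlock proofs are omitted. The names `Transaction` and `double_spends` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Transaction:
    tx_id: str
    inputs: tuple    # identifiers of the consumed outputs
    outputs: tuple   # identifiers of the newly created outputs

def double_spends(ledger):
    """Return a map: output id -> set of transactions consuming it (only if > 1)."""
    spenders = defaultdict(set)
    for tx in ledger:
        for output_id in tx.inputs:
            spenders[output_id].add(tx.tx_id)
    return {o: txs for o, txs in spenders.items() if len(txs) > 1}

ledger = [
    Transaction("g", (), ("g:0", "g:1")),    # genesis: outputs only
    Transaction("t1", ("g:0",), ("t1:0",)),
    Transaction("t2", ("g:0",), ("t2:0",)),  # consumes the same output as t1
    Transaction("t3", ("g:1",), ("t3:0",)),
]
conflicting = double_spends(ledger)
```

In the standard conflict-free model the returned map must be empty; in this example it reports that output `g:0` is consumed by both `t1` and `t2`.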
In the following section, we alleviate this conflict-free restriction and allow the Ledger to contain conflicting transactions.
B. Reality-Based Ledger
In this section, we propose an augmented version of the standard conflict-free UTXO ledger model that may contain double spends. We suggest different structures that can be used for tracking conflicting transactions without the need for consensus.
First, we explain how the transactions and their in- and outputs result in a DAG structure. The information contained in the Ledger DAG is split into the Conflict Graph, which keeps track of the conflicting transactions only. Then we introduce the concept of branches. A branch forms a possible non-conflicting state of the ledger. We will then derive a concept, called a reality, which allows us to reduce
Definition 20 (Ledger DAG):
We define the Ledger DAG
We refer to Appendix B, where we demonstrate this graph together with many other core concepts. Using the notation from Section II, we write
Typically, only transactions that create no conflict with any previously recorded transaction may be added to this type of data structure, i.e. the Ledger DAG is conflict-free. However, this requires a consensus mechanism that pre-selects transactions.
Now we introduce a new design for a ledger, where this constraint is replaced by a relaxed one – namely, a new transaction
Definition 21 (Conflicts):
Two distinct transactions
Definition 22 (Conflicting Transactions):
Two distinct transactions
The interrelations between conflicts can be encoded with the help of the Conflict DAG and the Conflict Graph.
Definition 23 (Conflict DAG and Conflict Graph):
The set of all conflicts is denoted by
We can group transactions based on whether they conflict with each other or not.
Definition 24 (Conflict-Free Set and Conflicting Sets):
A subset of transactions
We further specialise conflict-free sets and introduce the notion of branches.
Definition 25 (Branch and Set of Branches):
A set of conflicts
${B}$ is conflict-free (cf. Definition 24);
${B}$ is $D_{ \mathcal {C}}$-past-closed (cf. Definition 8).
We now introduce the concept of a reality which can be defined as a maximal possible branch or, equivalently, a maximal independent set in the Conflict Graph. In other words, a reality aggregates the maximal number of conflicts while preserving non-conflicting nature.
Definition 26 (Maximal Branch and Reality):
A branch
Next, we describe the notion of the maximal contained branch of a given transaction which consists of the set of conflicting transactions in the past cone of the given transaction.
Definition 27 (Maximal Contained Branch):
Let
We note that there cannot be two maximal branches in the ledger past cone of a transaction. Indeed, the past cone of any transaction is conflict-free and, thus, if there were two maximal branches, we could consider the union of the two branches, which would also have to be a branch.
Definition 28 (Ledger of a Reality):
Let
Recall that a maximal contained branch of a transaction from the
Remark 6 (Local Ledger):
As discussed in Section IV-C, there could be subjective versions of the Tangle DAG. Similarly, every node has its own perception of the Ledger. Thereby, we will use subscripts
C. Reality Selection Algorithm
To issue new blocks and validate transactions, each node in the network has to choose a conflict-free part of the ledger that it prefers. For this purpose, it suffices for a node to choose a preferred reality. Once a reality
Definition 29 (Preferred Reality):
Node
There could be different ways to choose the preferred reality. We provide a natural reality selection algorithm that takes as an input the Conflict Graph and an abstract weight function
monotonicity: for any two conflicts $x, y\in \mathcal {C} $ such that $x \le _{ \mathcal {C}} y$, it holds that \begin{equation*} \mathbf {w} (x) \le \mathbf {w}(y);\end{equation*}
consistency: let $x_{1},\ldots, x_{s}$ be pairwise conflicting conflicts. Then it holds that \begin{equation*} \sum _{i=1}^{s}\mathbf {w}(x_{i}) \le 1.\end{equation*}
Remark 7:
In Section VI-D, we introduce the Approval Weight function defined on the set of all transactions, i.e.
In Algorithm 1 we describe the proposed procedure. In this algorithm, we initialize
We refer to Appendix B, where we apply the algorithm as part of an illustrated example.
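Since a reality is a maximal independent set in the Conflict Graph, one natural way to select it is a greedy procedure that repeatedly picks the heaviest remaining conflict and discards everything that conflicts with it. The sketch below illustrates this idea only; it is not a transcription of Algorithm 1. It assumes that the Conflict Graph is a symmetric adjacency map, that ties are broken by the conflict identifier, and it does not separately check past-closedness in the Conflict DAG. The function name `select_reality` is hypothetical.

```python
def select_reality(conflict_graph, weight):
    """Greedy weighted maximal independent set in the Conflict Graph.

    conflict_graph: conflict -> set of conflicts it conflicts with (symmetric)
    weight:         conflict -> weight in [0, 1]
    """
    remaining = set(conflict_graph)
    reality = set()
    while remaining:
        # Heaviest remaining conflict; ties are broken by identifier (assumption).
        best = max(remaining, key=lambda c: (weight[c], c))
        reality.add(best)
        # Remove the chosen conflict and everything that conflicts with it.
        remaining -= conflict_graph[best] | {best}
    return reality

conflict_graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
}
weight = {"a": 0.6, "b": 0.3, "c": 0.5, "d": 0.2, "e": 0.7}
reality = select_reality(conflict_graph, weight)  # picks e, then a, then c
```

Every conflict is either selected or removed because it conflicts with a selected one, so the result is independent and cannot be extended, which is exactly the maximality required of a reality.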
On Tangle Voting
In this section, we present a voting mechanism based on the Tangle and the Ledger DAG. This mechanism allows for selecting realities in the Reality-based Ledger.
In Section VI-B we give an overview of two suitable DAG structures, which can be utilised to enable voting on the realities. Section VI-C combines these two structures into a Voting DAG and introduces basic concepts that follow from it. We also address how voting on two DAGs increases the liveness of the protocol. Section VI-D defines a metric called Approval Weight which is utilised in Section VI-E to identify a preferred reality and vote for it using a suitable tip selection algorithm.
A. Extension of Witness Weight and Liveness Problems
In Section IV we introduced the Witness Weight, which is a metric used for the confirmation of blocks. In this section, we seek a similar tool for the confirmation of transactions.
The Witness Weight has the property that it is monotonically increasing since it expresses the percentage of the weight that has witnessed a block’s existence. The situation is different for transactions where we want to leverage the node’s weight to decide between conflicting transactions. To ensure liveness, nodes must have the possibility to change their votes and withdraw their weights from the approval weight of a given transaction. However, changing the opinions might imply that blocks that reference (and vote for) blocks with rejected transactions might never be confirmed.
This situation creates a negative incentive to reference new tips. More precisely, nodes may be incentivized to either reference only blocks from trusted entities, tips of a certain age, or in the worst case, ancient and already confirmed blocks. The last behaviour may eventually lead to no new blocks being confirmed anymore.
The problems above were until now a significant concern of DAG-based consensus protocols, e.g. [3]. We propose to solve these by using the Reality-based Ledger and extending the reference scheme.
B. Immutable DAGs
Blocks are the primary information carriers of the network, i.e. they contain transactions and express the opinion of the issuing nodes. The references in the blocks, together with the signature of the nodes and the unlock proofs for the inputs, form two immutable data structures, similar to a blockchain.
First, the Tangle
Second, the Ledger DAG
For nodes to objectively agree on a partial order of events, we require the following assumption.
Assumption 6 (Past Cone Completeness):
For a transaction
In other words, we make the natural assumption that the spending of an output should happen in the future cones of the blocks “creating” this output.
Lemma 3:
Under Assumption 6, the partial order
Proof:
The statement can be shown trivially by induction on the length of the shortest path between
C. Voting and Voting DAG
As a consequence of Lemma 3, both the Tangle and the Ledger DAG are suitable for nodes to express their opinions about which transactions they prefer among any conflicting transactions. More specifically, by creating and attaching new blocks, nodes have an implicit way of voting for the “preferred” branches and conflicts. Let us define this more precisely.
We utilise the references contained in a block, which constitute the edges of the Tangle, see Section IV, to express a node’s opinion. As by Definition 10 a reference contains two fields:
Definition 30 (Block Reference):
We say a reference
To overcome the liveness issues described in Section VI-A we additionally add a reference that bypasses the block and directly addresses the contained transaction.
Definition 31 (Transaction Reference):
We say a reference
Remark 8:
Naturally, a block references the transaction that is the content of the block. As such, an honest node would not issue a block with a transaction that is not in its preferred reality (see Section VI-E).
Example 6:
Consider Figure 7. Blocks
Inheritance of branches: we consider two potential blocks
Remark 9:
The distinction into the sub-categories (transaction reference and block reference) is only relevant for the purpose of voting; the definition of the Witness Weight, see Section IV-D, remains unaffected.
We define a data structure that combines the two immutable data structures in Section VI-B into one single DAG used for propagating the votes.
Definition 32 (Voting DAG):
The Voting DAG
$u,v\in \mathcal {T} $ and $u$ contains a block reference to $v$;
$u\in \mathcal {T}, v\in \mathcal {L} $ and $u$ contains a transaction reference to transaction $v$;
$u\in \mathcal {T} $ and $v= \hat {u}\in \mathcal {L} $, i.e. $v$ is a transaction in block $u$;
$u,v\in \mathcal {L} $ and transaction $u$ spends the output from transaction $v$, i.e. $v\in \mathrm {par}_{ \mathcal {L}}\left ({u}\right)$.
So far we described how references between blocks are given additional meaning to construct the voting DAG. This DAG allows nodes to express their opinions, recursively. Following Definition 7 we define
Definition 33 (Voting):
A node
Example 7:
We illustrate the concept of a Voting DAG in Figure 8. The Voting DAG assembles information from the Tangle and the Ledger DAG. We assume a situation where the node that issues block
Illustration of how the Voting DAG is assembled from the Tangle and the Ledger DAG. By creating a transaction reference to block
We can also describe the voting past cone in terms of a recursive equation.
Proposition 1:
Suppose a given block \begin{equation*} \mathrm {cone}_{ \mathcal {V}}^{(p)}\left ({x}\right)= x \cup \mathrm {cone}_{ \mathcal {L}}^{(p)}\left ({\hat {x}}\right)\cup C_{ \mathcal {L}}(x) \cup C_{ \mathcal {V}}(x),\end{equation*}
\begin{align*} C_{ \mathcal {L}}(x):=&\mathrm {cone}_{ \mathcal {L}}^{(p)}\left ({\hat {z}_{1}}\right)\cup \ldots \cup \mathrm {cone}_{ \mathcal {L}}^{(p)}\left ({\hat {z}_{r}}\right),\\ C_{ \mathcal {V}}(x):=&\mathrm {cone}_{ \mathcal {V}}^{(p)}\left ({y_{1}}\right)\cup \ldots \cup \mathrm {cone}_{ \mathcal {V}}^{(p)}\left ({y_{s}}\right).\\{}\end{align*}
The Reality-based Ledger introduces the concept of branches, see Section V. The consumption of more than one output from different branches creates a new branch, which is the union of the branches of the consumed outputs. Now we extend this concept to blocks, which can combine branches by voting for previous blocks or transactions. More precisely we can relate a given reference in a block with a branch. The branch of the block is then defined as follows.
Definition 34 (Voting Branch):
Given a block \begin{equation*} \mathrm {branch}^{(p)}_{ \mathcal {V}}(x):= \mathrm {cone}_{ \mathcal {V}}^{(p)}\left ({x}\right) \cap \mathcal {C},\end{equation*}
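The recursion of Proposition 1 and the branch of Definition 34 can be illustrated by the following Python sketch. It is a simplified reading rather than a reference implementation: we assume that $z_{1},\ldots,z_{r}$ are the blocks addressed by a transaction reference of $x$ and $y_{1},\ldots,y_{s}$ the blocks addressed by a block reference, and the dictionary layout (`"tx"`, `"tx_refs"`, `"block_refs"`) is hypothetical.

```python
def ledger_cone(tx, ledger_parents):
    """Past cone of a transaction in the Ledger DAG (the transaction included)."""
    seen, stack = set(), [tx]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(ledger_parents.get(t, ()))
    return seen


def voting_cone(x, blocks, ledger_parents, memo=None):
    """Voting past cone of block x, following the recursion of Proposition 1.

    blocks maps a block id to a dict with keys "tx", "tx_refs", "block_refs".
    Vertices are tagged ("T", id) for blocks and ("L", id) for transactions.
    """
    memo = {} if memo is None else memo
    if x in memo:
        return memo[x]
    b = blocks[x]
    cone = {("T", x)}
    # Ledger past cone of the transaction contained in x.
    cone |= {("L", t) for t in ledger_cone(b["tx"], ledger_parents)}
    # C_L(x): ledger cones of transactions reached by transaction references.
    for z in b["tx_refs"]:
        cone |= {("L", t) for t in ledger_cone(blocks[z]["tx"], ledger_parents)}
    # C_V(x): voting cones of blocks reached by block references.
    for y in b["block_refs"]:
        cone |= voting_cone(y, blocks, ledger_parents, memo)
    memo[x] = cone
    return cone


def voting_branch(x, blocks, ledger_parents, conflicts):
    """Voting branch of x: the conflicts contained in its voting past cone."""
    cone = voting_cone(x, blocks, ledger_parents)
    return {t for tag, t in cone if tag == "L" and t in conflicts}
```

Note that a transaction reference adds the ledger cone of the referenced transaction but not the referenced block itself, which reflects the idea that such a reference bypasses the block.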
Remark 10:
We highlight that for the correctness of the protocol, a node has to create references for a new block
Recall Definition 27 that introduces the maximal contained branch of a transaction
Proposition 2 (Inheritance of Branches):
Suppose a given block \begin{equation*} \mathrm {branch}^{(p)}_{ \mathcal {V}}(x) = \mathrm {branch}^{(p)}_{ \mathcal {L}}(x) \cup B_{ \mathcal {L}}(x) \cup B_{ \mathcal {V}}(x),\end{equation*}
\begin{align*} B_{ \mathcal {L}}(x):=&\mathrm {branch}^{(p)}_{ \mathcal {L}}(\hat {z}_{1})\cup \ldots \cup \mathrm {branch} ^{(p)}_{ \mathcal {L}}(\hat {z}_{r}),\\ B_{ \mathcal {V}}(x):=&\mathrm {branch}^{(p)}_{ \mathcal {V}}(y_{1})\cup \ldots \cup \mathrm {branch} ^{(p)}_{ \mathcal {V}}(y_{s}),\\{}\end{align*}
Example 8:
We follow the same example as shown in Figure 7. We assume the maximal contained branch of the transaction in block
We can associate a given block
Definition 35 (Change of Vote and Current Vote):
Let
Remark 11:
The notion of “time” and its implications on the meaning of “after” in Definition 35 are crucial. Natural choices are the timestamp inside a transaction or the solidification time of a block that contains a given transaction.
Example 9:
The principle of Definition 35 is demonstrated in Figure 1. Specifically, transactions
D. Approval Weight and Confirmation Rule for Transactions
Nodes must be able to track the progress of the acceptance of a transaction. We extend the concepts of Witness Weight, introduced in Section IV-D, to the Approval Weight (AW) of transactions. The objective is then to define a parameterisable confirmation condition for transactions similar to the one discussed for blocks in Section IV-E.
Definition 36 (Transaction Supporters and Approval Weight):
Let \begin{equation*} \mathbf {AW}(\hat {x}):= \sum _{j \in \mathrm {sprt} _{ \mathcal {L}}(\hat {x})} \mathbf {w}(j)\tag{14}\end{equation*}
Clearly, the AW describes the percentage of the network approving a given transaction.
Remark 12:
The WW of a block
The supporters of transactions can be updated by propagating the supporter information through the Voting DAG. More precisely, on arrival of a block
Similar to Definition 14 we define the confirmation of a transaction. We will use subscripts
Definition 37 (Confirmed Transaction):
Let
We also define the AW of a branch, which will form the base for the algorithm in the next section. The supporters of a branch are equal to the intersection of the supporters of the conflicts in the branch. More formally we have the following.
Definition 38 (Branch Supporters and Approval Weight):
Let \begin{equation*} \mathbf {AW}(B) := \sum _{j \in \mathrm {sprt}^{ \mathcal {L}} (B)} \mathbf {w}(j).\tag{15}\end{equation*}
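Equations (14) and (15) amount to summing weights over a supporter set, where the supporters of a branch are the intersection of the supporters of its conflicts. The following Python sketch illustrates this; the function names and the dictionary layout are assumptions made for the example, and weights are given in integer units rather than as normalised fractions.

```python
def approval_weight(supporters, weights):
    """Sum of the weights w(j) over a set of supporting nodes j."""
    return sum(weights[j] for j in supporters)


def branch_supporters(branch, tx_supporters):
    """Supporters of a branch: nodes that support every conflict in it."""
    conflicts = list(branch)
    if not conflicts:
        raise ValueError("branch must contain at least one conflict")
    result = set(tx_supporters[conflicts[0]])
    for c in conflicts[1:]:
        result &= tx_supporters[c]
    return result


# Hypothetical example with three nodes and two conflicts.
weights = {"n1": 5, "n2": 3, "n3": 2}
tx_supporters = {"c1": {"n1", "n2"}, "c2": {"n2", "n3"}}
aw_c1 = approval_weight(tx_supporters["c1"], weights)
aw_branch = approval_weight(branch_supporters({"c1", "c2"}, tx_supporters), weights)
```

In this example only `n2` supports both conflicts, so the AW of the branch is smaller than the AW of either conflict on its own.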
E. Tip Selection Algorithm
The consensus protocol relies substantially on an implicit voting mechanism. Nodes express their opinions and votes by choosing the references in their newly issued blocks. The process that determines the references is called the Tip Selection Algorithm (TSA) and is discussed in this section.
With every block, a node can vote on which parts of the Tangle and the Ledger DAG it prefers by using block or transaction references. The preferred parts of the Tangle and the Ledger DAG are defined by the preferred reality. Following the algorithm described in Section V-C, a node
We now describe a tip selection mechanism that considers both block and transaction votes. Note that due to Lemma 3 the Ledger DAG induces a partial order consistent with the one induced by the Tangle and, thus, voting on the Ledger DAG allows expressing a more selective, albeit less efficient vote than on the Tangle.
Let us define some reality-dependent tip sets on the Tangle DAG and the Ledger DAG.
Denote by
Denote by
Definition 39 (Uniform Random Tip Selection on a Reality):
To issue a new block, node
if the selected block is in the set $\mathbf {T}_{ \mathcal {T}}(R)$, a block reference is created;
otherwise, if the selected tip contains a transaction that is in the set $\mathbf {T}_{ \mathcal {L}}(R)$, a transaction reference is created;
if neither of the above apply, the block is discarded instead.
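The three cases of Definition 39 can be sketched as follows. This is a hedged illustration only: the number of references `k`, the sampling without replacement, and all function and parameter names are assumptions that are not fixed by the definition.

```python
import random


def classify_tip(tip, tip_tx, block_tips, tx_tips):
    """Decide which kind of reference a selected tip yields, if any.

    tip_tx maps a block id to the id of its transaction; block_tips and
    tx_tips play the role of the reality-dependent tip sets on the Tangle
    and on the Ledger DAG.
    """
    if tip in block_tips:
        return ("block", tip)
    if tip_tx[tip] in tx_tips:
        return ("transaction", tip_tx[tip])
    return None  # neither case applies: the selected block is discarded


def select_references(tips, tip_tx, block_tips, tx_tips, k=2, rng=None):
    """Draw tips uniformly at random until k references have been created."""
    rng = rng or random.Random()
    candidates = list(tips)
    rng.shuffle(candidates)
    refs = []
    for tip in candidates:
        if len(refs) == k:
            break
        ref = classify_tip(tip, tip_tx, block_tips, tx_tips)
        if ref is not None:
            refs.append(ref)
    return refs
```

A tip whose block is not in the preferred reality but whose transaction is can therefore still receive a vote through a transaction reference.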
We refer to Appendix B, where we demonstrate this algorithm as part of an illustrated example.
A node may have voted previously for a branch that is no longer its preferred branch. It therefore has to change its vote. With the above tip selection, nodes are allowed to vote for branches they previously did not “prefer” (by voting for a conflicting transaction) and vote “against” branches they previously voted for. Every node must therefore keep the supporters for each branch and their AW up to date. An important consequence is that the AW of certain branches may increase over time while for others it may decrease over time.
The addition of the transaction vote demonstrates that solutions for the Tip Selection Algorithm can be found that mitigate or reduce liveness issues and that transactions eventually will be considered for tip selection. Thus, in the following, we work under the following assumption.
Assumption 7 (Block Inclusion):
Let
We also refer to Section XII for a more detailed discussion.
Communication and Adversary Models
Before stating the security requirements of the protocol, we have to make assumptions about the underlying communication model. It is common to describe the uncertainty related to the communication by an attacker that controls the delays of the blocks. The communication model defines the limits within which the adversary can delay the communication between the nodes. As a model, it is only a simplification, but it allows a systematic study of the most critical components.
For simplicity, we also analyse the voting mechanism without details such as the TSA. We want to emphasise that our modelling can also be applied to other consensus protocols, thus, providing a framework for comparing different DLTs.
A. Communication Model
The participating nodes communicate over a peer-to-peer (P2P) protocol or network. In this P2P protocol, nodes send their signed blocks to their neighbouring peers. Neighbours forward blocks from other nodes in the overlay network only if they have verified their validity; if a transaction is invalid, the propagation stops. The transmission of a block between two nodes is done by sending a package containing the block.
There are three basic (or classic) models for the P2P communication between the nodes: the synchronous model, the asynchronous model, and the partial synchronous model, e.g. see [16] and [20].
In the synchronous model, there exists some known finite time bound
A partially synchronous system can be seen as initially asynchronous that becomes eventually synchronous. The time at which the system becomes synchronous is called the Global Stabilisation Time (GST).
We also consider a probabilistic synchronous model, see [49]. In this model we assume that for every
The specific implementations for a consensus mechanism depend heavily on the underlying synchronicity assumption. It also seems appropriate to distinguish between consensus protocols that find consensus on one data set and consensus protocols that find consensus on a growing number of decisions. The latter allows one to “strengthen the synchronicity” between the nodes if the data are related by references.
B. The Tangle, Solidification, and Synchronicity
The references that form the Tangle are essential for the consistency of information every node has. Consider a package that propagates to only part of the network, e.g. because it is lost during some of the propagation processes on the communication layer. Nodes that have received the block nevertheless start building on it and gossip their blocks to the network. These new blocks contain references to the partially missing block. Since nodes must know the past cone of any block to have a complete Tangle history from that block’s point of view, we use a mechanism called the solidification process. In this mechanism, nodes that receive a given block only process it if its past cone is complete or, otherwise, ask their peers for the missing referenced block (for more details, see Section IV-C). In other words, the solidification process is a mechanism to recover lost blocks and, hence, strengthens the “synchronicity” of the communication model. We think that this, to some extent, supports the assumption that all blocks are delivered within a bounded time
C. Adversary Model
We distinguish between three types of nodes: honest, faulty, and malicious. Honest nodes follow the protocol, faulty nodes are not working properly (e.g. not sending any transactions), and malicious nodes actively try to disturb the protocol by not following the rules. In most scenarios, we assume that the malicious nodes are controlled by an abstract entity that we call the attacker. We assume that the attacker is computationally limited and cannot break the signature schemes or the cryptographic hash functions involved. However, we assume that the attacker is omniscient and “knows immediately” about all state changes of the honest nodes.
In classic consensus protocols, the communication model already covers the adversary behaviours, as delaying blocks is essentially the only way an attacker can influence the system. This is no longer true for our consensus protocol. Here, adversarial strategies can be divided into two main categories: attacks on the protocol level and attacks on the voting layer.
D. Configuration Graph and Schedule
How events in distributed systems are triggered depends on some external causes that are often referred to as the environment. We follow [60] and model this environment using the abstraction of a scheduler.
To this end, we consider a communication network on which all communications between the nodes are carried out. These networks are often referred to as P2P networks. We model them using a directed graph whose vertex set is the set of participating nodes. There is a directed edge from
We assume this graph to be connected. Along the directed edges of this graph packages are exchanged by the nodes. In our case these packages contain blocks.
Definition 40 (Packages and Communication Graph):
For each block \begin{equation*} e(x,i,j):=(x, i,j, t(x), \delta _{i,j}(x)).\tag{16}\end{equation*}
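A package as in (16) can be represented by a small record. In the following Python sketch we read $t(x)$ as the time at which the package is sent and $\delta _{i,j}(x)$ as its delay on the edge from $i$ to $j$, so that it arrives at time $t(x)+\delta _{i,j}(x)$; this reading, as well as the class and field names, is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical record for a package e(x, i, j) = (x, i, j, t, delta)."""
    block: str      # identifier of the block x carried by the package
    sender: int     # node i
    receiver: int   # node j
    sent_at: float  # time t(x), read here as the sending time
    delay: float    # delay delta_{i,j}(x) on the edge (i, j)

    @property
    def arrival(self):
        """Arrival time at the receiver under the reading above."""
        return self.sent_at + self.delay


def delivery_order(packages, receiver):
    """Blocks in the order in which they reach the given node."""
    inbox = [p for p in packages if p.receiver == receiver]
    return [p.block for p in sorted(inbox, key=lambda p: p.arrival)]
```

An adversary that controls the delays can change the output of `delivery_order` for a node without creating any package itself, which is the kind of influence the communication model is meant to capture.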
Essentially a node
A node may also create blocks. Once it creates a block
Example 10:
We illustrate the concept of networks and packages in Figure 9. In this figure, the network consists of six nodes. Directed edges exist between some of them and show the communication channels. We point out that this communication does not necessarily have to be symmetric. Packages containing blocks can be sent along the edges.
Illustration of a network of 6 nodes. Packages
Every node keeps a local version of the Tangle
Remark 13:
The simplified version described above allows a more accessible analysis of the voting on conflicting transactions. This comes with the cost of not describing the confirmation of non-conflicting transactions. We give more details on the “liveness” of these transactions in Section VIII-A.
We interpret the packages a node \begin{equation*} I(e, \omega _{i}): \mathcal {M}\times \mathcal {Q}\mapsto \mathcal {Q},\end{equation*}
\begin{equation*} O(\omega _{i}): \mathcal {Q}\mapsto \mathcal {M}^{| \mathcal {N}_{i}|-1},\end{equation*}
The creation of blocks uses randomness (by design) through the TSA. Moreover, issuing times of blocks may depend on the interactions of the node with the environment of our system. For this reason, we model the time between two successive blocks of one given node by random variables. Moreover, the latency between packages of two given nodes is described by random variables. This randomness turns our protocol into a random protocol, and the randomness is described by the probability measure
Definition 41 (Configuration Graph):
Let \begin{equation*} \mathbb {P}(I(e, \omega _{i}) = \omega '_{i})>0\end{equation*}
Definition 42 (Valid Packages):
A package (or edge) \begin{equation*} \mathbb {P}(O(\omega _{i}) \ni e)>0.\end{equation*}
In the following, we assume that honest nodes only issue valid packages.
Definition 43 (Communication of Configurations):
We say that the (global) configuration
The relation
Definition 44 (Communication Classes):
The equivalence classes of the equivalence relation
The closed communication classes play a vital role as they describe the outcome of the protocol. Let
Definition 45 (Consensus State):
A state \begin{equation*} \mathrm {sprt}_{ \mathcal {L}_{i}}(R)= \mathcal {N}, \quad \forall i\in \mathcal {N},\tag{17}\end{equation*}
Remark 14:
Let us stress that the definition of “consensus state” is only about agreeing on the preferred reality. It does not take into account the meaning of confirmation; see Definition 15. Liveness and safety with respect to confirmation are discussed in the following sections.
We make a crucial assumption about the communication layer.
Assumption 8 (Random Block Issuance and Package Delay):
Block issuances and package delays are random and satisfy:
Nodes issue new blocks independently and distributed according to some probability distribution $\mu _{\mathrm {iss}}$.
The delays of packages between two nodes are independent and distributed according to some probability distribution $\mu _{\mathrm {pack}}$.
Block issuances and package delays are independent.
With a positive probability packages are delivered faster than new blocks are issued. More precisely, if $X\sim \mu _{\mathrm {iss}}$ and $Y\sim \mu _{\mathrm {pack}}$, then $\mathbb {P}(Y< X)>0$.
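As an illustration of the last condition, consider the hypothetical case of exponentially distributed times, i.e. $X\sim \mathrm {Exp}(\lambda _{\mathrm {iss}})$ and $Y\sim \mathrm {Exp}(\lambda _{\mathrm {pack}})$ independent with rates $\lambda _{\mathrm {iss}}, \lambda _{\mathrm {pack}}>0$. These distributions are an assumption made only for this example; Assumption 8 does not prescribe them. Conditioning on $Y$ gives
\begin{align*} \mathbb {P}(Y< X) =&\int _{0}^{\infty } \mathbb {P}(X>y)\, \lambda _{\mathrm {pack}} e^{-\lambda _{\mathrm {pack}} y}\, dy \\ =&\int _{0}^{\infty } \lambda _{\mathrm {pack}} e^{-(\lambda _{\mathrm {iss}}+\lambda _{\mathrm {pack}}) y}\, dy = \frac {\lambda _{\mathrm {pack}}}{\lambda _{\mathrm {iss}}+\lambda _{\mathrm {pack}}} >0.\end{align*}
In this example the condition holds for every choice of the rates, and the probability approaches one when packages are delivered much faster than blocks are issued.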
Lemma 4:
Under Assumption 8, for every given configuration
Proof:
Let
There are two immediate consequences of Lemma 4.
Corollary 1:
Under Assumption 8, a communication class is closed if and only if it consists of one consensus state.
Corollary 2:
Under Assumption 8 (and in absence of an adversary), the protocol converges (
Definition 46 (Schedule):
A schedule on the communication graph
The above definitions can naturally extend to models that distinguish between honest and adversary nodes. We assume that adversary nodes do not have to follow the algorithm
Theorem 1 (Eventual Consistency - Random Blocks):
Assume Assumption 8 to hold for the blocks and packages of honest and malicious nodes and let
Proof:
Since \begin{equation*} \mathrm {sprt}_{ \mathcal {L}_{i}}(R) \supset \mathcal {N} _{h}, \quad \forall i\in \mathcal {N} _{h},\tag{18}\end{equation*}
Liveness and Safety
In the previous section, we were interested in the eventual convergence and proved an optimal result in Theorem 1 under the assumption of random block issuance and random package delay. This section adds the confirmation status of transactions into our considerations. We divide security into liveness and safety to allow a more detailed and quantitative analysis.
From a general point of view, liveness means that eventually, good things will happen, and safety means that nothing wrong will ever happen. In our situation, this translates into the following. The safety condition is that any two honest nodes should always reach an agreement and that this decision satisfies the specified validity conditions. Furthermore, no two nodes should ever confirm conflicting transactions. The liveness property is that each honest node should eventually make a decision on the confirmation status of a transaction, i.e. in our case all nodes reach the confirmation threshold
Remark 15:
In general, one requires in addition that the consensus protocol satisfies integrity. Integrity requires that the eventual outcome of the consensus protocol was initially proposed by at least one node. Since in OTV honest nodes always pick a maximal branch, the integrity property is satisfied once the protocol terminates.
A. Non-Conflicting Transactions
Liveness of a non-conflicting transaction is the property that it will eventually be included in the ledger state. In the strongest form, it means that every non-conflicting transaction will be confirmed, see Definitions 14 and 37. Therefore, the security threshold for liveness is at most a proportion
Liveness is inherently linked with the TSA and the orphanage problem. We make the following assumption on the TSA$^{11}$ and refer to Section VI-A for a discussion.
Proposition 3 (Liveness and Safety of Non-Conflicting Transactions):
We assume in the asynchronous model that the tip pool size is stationary, and that Assumption 7 is satisfied. The weight of the malicious nodes is
Proof:
Let
Some discussion on the validity of the stationary tip pool size assumption is appropriate. This kind of assumption was also made throughout Section IV-G as Assumption 5. Let us review this assumption in the light of the communication and the adversary model.
An attacker can delay blocks with honest transactions such that the network delay
On the Tangle layer, the “worst-case scenario” seems to be the following. The adversary issues blocks, referencing already referenced blocks, not removing any tips from the tip pool. Under the assumption that nodes can issue blocks proportionally to their weight, we obtain that
B. Conflicting Transactions
Theoretical results on the liveness and safety of conflicting transactions rely heavily on the assumptions of the underlying communication and adversary model. Moreover, the analysis of the OTV protocol is complex: it requires modeling of the networking part, modeling of the weight distribution, and various (even an infinite number of) adversarial strategies. The following section shows that an adversary can hinder consensus finding in specific situations or edge cases. However, we want to emphasize that this interference only influences the liveness of conflicting transactions and that an appropriate TSA guarantees liveness of non-conflicting transactions; see Proposition 3. In Section X, we add a feature to the protocol that allows us to obtain theoretical results on the liveness of conflicting transactions.
Impossibility Results and Metastability
Impossibility results play an essential role in the theory of consensus protocols, as they emphasize the limitations and critical edge cases. The most famous impossibility result is the FLP-result, [19], which states that achieving consensus in the asynchronous communication model is in general impossible for deterministic protocols. From a general point of view, this impossibility is due to the possible delay of packages in the P2P communication and the resulting “symmetric” situation that hinders consensus finding.
We will consider the situation of two or more directly conflicting transactions. It is the role of the consensus mechanism to reach an agreement on which transaction should eventually be accepted. One may consider that keeping conflicting transactions in an undecided state, i.e. violating the liveness, is acceptable. However, this is problematic for several reasons. For example, if nodes keep transactions indefinitely undecided, this could drastically inflate the communication required on the voting layer and prevent the pruning capability of the ledger. Transactions that are undecided for a long time can also harm safety. There is always a chance that some node confirms an “undecided” transaction. While the probability of this event might be small, it is still positive, and hence this unlikely event will happen at some point in time. We also note that simply rejecting malicious transactions does not provide a solution since this would allow delayed cancellation of transactions, thus, violating the system’s safety.
In this section, we give examples where the liveness and safety of conflicting transactions are not satisfied; more complicated examples can be constructed following the same principles. They constitute an impossibility result in the sense that the proposed protocol does not guarantee liveness or safety under the asynchronous communication model. These situations rely on strong assumptions about the attackers. We distinguish between attacks on the communication level and those on the voting level. By combining both levels, we give a theoretical result on when safety cannot be guaranteed; see Lemma 5.
A. Communication Level
We start with an example where an attacker does not take part directly in the voting but only controls the schedule of the honest nodes’ blocks. Let us point out that the attacker does not need to control any weight in this scenario.
The first adversary attack is dubbed a metastability attack since it tries to keep the honest nodes in an undecided situation. We refer to [50] for more details and analysis of these kinds of attacks. On a conceptual level, these kinds of attacks exploit a situation where the system is kept in a roughly symmetric condition between two incompatible options. Once the symmetric scenario is broken, nodes likely converge quickly on one of the options.
Example 11 (Metastability Attack I):
We consider \begin{align*}&(x_{i}, i,3, t_{0}, \delta), (x_{i}, i,4, t_{0}, \delta), \quad i\in \{1,2\}, \\&(x_{j}, j,1, t_{0}, \delta), (x_{j}, j,2, t_{0}, \delta), \quad j\in \{3,4\},\end{align*}
\begin{align*} & (x_{1}, 1,2, t_{0}, \gamma), \quad (x_{2}, 2,1, t_{0}, \gamma), \\ &(x_{3}, 3,4, t_{0}, \gamma), \quad (x_{4}, 4,3, t_{0}, \gamma).\end{align*}
Illustration of Example 11. Nodes are voting for transaction
Remark 16:
The situation described above is undoubtedly a special case and mainly of theoretical interest. However, it raises the question under which conditions such schedules exist and how realistic they appear in real applications.
B. Voting Level
In this section, we describe situations where an attacker can successfully interfere in the consensus finding by using the voting layer. We do not need conditions to control communication between honest nodes but relatively strong assumptions about the adversary’s ability to issue new blocks and reliably forward them to the honest nodes.
Example 12 (Metastability Attack II):
We again consider the situation of one double spend, i.e. a set of conflicts
We consider an even number
Illustration of Example 12. Nodes are voting for transaction
Remark 17:
We want to note that the attack in Example 12 relies heavily on the capability of the adversary to adapt its opinion immediately, before more than two honest nodes have changed their vote to the majority.
The next example, the Bait-and-Switch Attack, depends less on the adversary’s issuance rate but requires a higher amount of weight.
Example 13 (Bait-and-Switch Attack):
We consider a situation where the adversary possesses the node with the highest weight. The strategy is to frequently switch opinions such that the honest nodes are constantly “chasing the ever-changing heaviest branch”. For example, consider \begin{equation*} n_{cr} \cdot \frac {w_{h}}{N_{h}} < w_{a}.\end{equation*}
C. Communication and Voting Level
In the previous sections, we presented examples of how an adversary can harm the liveness of conflicting transactions. The attacker strategies required either substantial control of the communication layer or a high issuance rate combined with considerable weight. In this section, we prove an impossibility result for safety that involves an attack strategy that uses both levels.
Definition 47 (Broken Safety):
We say that safety is broken if and only if there exist two honest nodes \begin{equation*} \mathbf {AW}_{i,t}(\hat {x}) > \theta { ~\text {and }} \mathbf {AW}_{j,s}(\hat {y}) > \theta.\end{equation*}
We have the following “negative” result.
Lemma 5:
Let
Proof:
Let us choose a number of honest nodes \begin{equation*} \frac {\theta -q}{1-q} < \frac {N_{h}^{\ast}}{N_{h}} < \frac {0.5}{1-q}.\end{equation*}
The attacker sends to
After this, the attacker sends blocks to
Next, the attacker lets
The above proof indicates that the attacker needs very strong control over the communication layer to conduct such an attack. Nevertheless, it gives a reasonable theoretical security threshold for the protocol’s safety. All the more since we can prove safety under the assumption
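The choice of $N_{h}^{\ast }$ in the proof of Lemma 5 is only possible if the interval between the two bounds is non-empty, which for $q<1$ is equivalent to $\theta -q<0.5$, i.e. $q>\theta -\tfrac {1}{2}$. The following Python sketch evaluates the bounds exactly; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def lemma5_bounds(theta, q):
    """Lower and upper bound on the fraction N_h^* / N_h used in the proof."""
    lower = (theta - q) / (1 - q)
    upper = Fraction(1, 2) / (1 - q)
    return lower, upper


def feasible_sizes(theta, q, n_honest):
    """All sizes N_h^* that lie strictly between the two bounds."""
    lower, upper = lemma5_bounds(theta, q)
    return [n for n in range(n_honest + 1)
            if lower < Fraction(n, n_honest) < upper]


# Hypothetical parameters: confirmation threshold 2/3 and 20 honest nodes.
theta = Fraction(2, 3)
with_large_q = feasible_sizes(theta, Fraction(1, 5), 20)   # q > theta - 1/2
with_small_q = feasible_sizes(theta, Fraction(1, 10), 20)  # q < theta - 1/2
```

For $\theta =2/3$ and $q=1/5$ the bounds are $7/12$ and $5/8$, so a group of 12 out of 20 honest nodes qualifies, whereas for $q=1/10$ no admissible group size exists.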
D. Realistic Conditions
The above examples illustrate that the two dimensions, namely the communication and voting level, may interact either in favor of the attacker or in favor of the robustness of the protocol. In all cases, it seems that the attacker needs excellent control of the communication layer of the protocol. Randomness or uncertainty on the communication layer may interfere with the adversary strategy and finally lead to convergence of the honest nodes’ opinions.
We conjecture that these strong assumptions are not met in most reasonable real-world scenarios and that the attacks that rely solely on the communication level are hard to perform in practice.
With a completely random schedule of packages, the system will eventually converge to a consensus state in situations where an attacker controls not more than half of the total weight, see Theorem 1. However, this convergence time can be impractically long for real-world applications and it is possible that safety (for the confirmation) can be broken as shown by Lemma 5. The theoretical treatment of the inherent randomness of real-world implementations is at best at an early stage, and a quantification or even its control seems currently out of reach. We refer to [60] for a theoretical approach to describe the entropy related to the scheduling of the transactions.
The following section proposes a more sophisticated variation that allows a more straightforward theoretical treatment and provides the “optimal” safety thresholds.
Synchronized Random Reality Selection
In the previous section, we demonstrated that under several conditions, the protocol presented so far might lead to situations where nodes cannot come to an agreement between several valid options. This section offers a mechanism to overcome this scenario by utilising external randomness. As shown in [62], [63], and [50] common randomness can successfully navigate a system away from such an undesired situation.
Pre-consensus classes are those classes from which the network reaches a consensus eventually. The aim of the design of the consensus protocol is, therefore, to construct the protocol so that its global state reaches such a pre-consensus state fast and that from there, the actual consensus state is inevitable.
The OTV is an asynchronous protocol and comes with advantages and disadvantages. One disadvantage is the lack of synchronization possibilities between nodes that could be used against adversarial attacks on the communication level. The arguments and examples in the previous section showed that it is theoretically possible for an attacker to keep the honest nodes in an undecided situation for a long time. To exclude these cases and obtain theoretical results, we use a distributed random number generation (dRNG) process to synchronize the nodes and interfere with a possible adversary.
We choose a parameter
We consider a system of
We start with stating our model assumptions.
Assumption 9:
We make the following assumptions:
Every block from an honest node is received by another honest node during time $\mathbf {d}= \mathbf {d}(\varepsilon)$ with probability of at least $1-\varepsilon $. The constant $\varepsilon >0$ can be chosen arbitrarily small. The events for each block are independent of each other.
The adversary controls a proportion $q$ of the weight. The adversary might have an influence on the schedule of the blocks to the extent of 9.1.
The set of conflicts $\mathcal {C}$ is fixed and does not vary in time. All nodes perceive the same $\mathcal {C}$.
There exists a dRNG that publishes a random variable every $\mathcal {D}$ units of time. The random variable is uniformly distributed on the interval $[0.5, \theta]$, where $\theta $ is the confirmation threshold; see Section IV-E. This value is received (independently) by every given node before time $\mathbf {d}$ (in every epoch) with a probability of at least $1-\varepsilon $.
Honest nodes of cumulative weight of at least $\theta $ issue blocks expressing support for their preferred reality$^{12}$ at least every $\mathcal {D}/2$ time units with a probability of at least $1-\varepsilon $.
Let us comment on the validity of the above assumptions. Assumption 9.1 is essentially a probabilistic synchronicity assumption. The fact that the probability
In the beginning, before time
After the arrival of the first dRNG randomness
In Algorithm 4, we describe an iterative procedure, inspired by [70], for choosing a preferred reality by a node. First, it initialises set
Proposition 4:
The resulting set
Denote by \begin{equation*} \mathbf {AW}_{i,t}^{(h)} (\hat {x}):= \sum _{j \in \mathrm {sprt} _{ \mathcal {L}_{i,t}}^{(h)}(\hat {x})} \mathbf {w}(j)\end{equation*}
Due to Assumption 9.5 and since the honest nodes change their vote at most once, every other honest node sees this vote with a very high probability. In other words, every honest node has the same perception of the votes of all other honest nodes (with high probability). In this case, we can speak of the honest AW seen by the honest nodes of a transaction \begin{equation*} \mathbf {AW}_{t}^{(h)}(\hat {x}):= \mathbf {AW}_{1,t}^{(h)}(\hat {x})\tag{19}\end{equation*}
Adversarial nodes may change their opinions. In particular, they can do this close to the threshold time \begin{equation*} I_{t}(c) = [\mathbf {AW}_{t}^{(h)}(c), \mathbf {AW}_{t}^{(h)}(c)+ q];\tag{20}\end{equation*}
Region of adversarial control. (a) control on large thresholds, (b1) control on small thresholds, (b2) no control on thresholds.
We summarize the above considerations in the following statement.
Lemma 6:
Assume that the honest nodes have the same perceptions on the honest AWs. Then, for all \begin{equation*} \mathbf {AW}_{i,t}(c) \in I_{t}(c).\tag{21}\end{equation*}
The above holds for every adversary strategy that satisfies Assumption 9.2. The idea is now to choose the support of the dRNG in such a way that independent of the honest AWs and the adversarial strategy all honest nodes will decide on the same reality with a positive probability. Every
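To illustrate why a random threshold helps, the following Python sketch computes the probability that a threshold drawn uniformly from $[0.5, \theta]$ falls outside the interval $I_{t}(c)$ of (20). Outside this interval the adversary cannot move the perceived AW across the threshold. The sketch rests on our reading of Lemma 6 and is not the procedure of the protocol itself; the function names are hypothetical.

```python
from fractions import Fraction


def prob_threshold_outside(aw_honest, q, theta):
    """Probability that a uniform threshold on [1/2, theta] misses
    the interval [aw_honest, aw_honest + q]."""
    lo, hi = Fraction(1, 2), theta
    a, b = aw_honest, aw_honest + q
    overlap = max(0, min(hi, b) - max(lo, a))
    return 1 - overlap / (hi - lo)


def worst_case_probability(q, theta):
    """Lower bound over all honest AWs; positive when q < theta - 1/2."""
    return 1 - q / (theta - Fraction(1, 2))


# Hypothetical parameters: theta = 2/3 and an adversary holding 10% of the weight.
theta, q = Fraction(2, 3), Fraction(1, 10)
p_inside_range = prob_threshold_outside(Fraction(1, 2), q, theta)
p_below_range = prob_threshold_outside(Fraction(1, 5), q, theta)
```

The worst-case value is positive exactly when the adversarial interval is shorter than the support of the random threshold, i.e. when $q<\theta -\tfrac {1}{2}$.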
Definition 48 (Convergence to a Consensus State):
We say that the protocol converges to a consensus state if and only if there exist some reality \begin{equation*} \mathbf {AW}_{i,t} (R) > \theta,\quad \forall i\in \{1,\ldots, N_{h}\}, \forall t>T.\tag{22}\end{equation*}
Remark 18:
Definition 48 is similar to the definition of a consensus state; see Definition 45. While it describes the asymptotic behaviour of the protocol, it does not deliver a practicable criterion for confirmation. A “confirmation rule”, as in Definition 15, however, is always susceptible to possible “re-orgs” of the ledger state; see also Lemma 5. Quantifying the probability that such re-orgs happen depends on the precise communication and adversarial models and is beyond the scope of this paper.
This discussion can be turned into a formal protocol description, given in Algorithm 5, and we obtain the following theorem.
Theorem 2 (Liveness and Safety - Synchronisation):
Let \begin{equation*} q< \min \left \{{1-\theta, \theta - \tfrac {1}2}\right \}\end{equation*}
Proof:
We start the protocol at time \begin{align*} \mathbb {P}(A_{1})=&\mathbb {P}(C_{1} | B_{1}) \mathbb {P}(B_{1}) \\ \geq & (1-\varepsilon)^{| \mathcal {C}| N_{h}} (1-\varepsilon)^{N_{h}} \\ =&(1-\varepsilon)^{N_{h}(| \mathcal {C}|+1)}.\end{align*}
\begin{equation*} p(\varepsilon):= (1-\varepsilon)^{N_{h}(| \mathcal {C}|+1)}\end{equation*}
We start a recursive argument on the Conflict Graph by initialising
Case A:
Case B:
We now remove the conflicts
Altogether, with a positive probability of at least
Remark 19:
The above proof offers a possibility to estimate the “consensus time”
Remark 20:
The assumption that the set of conflicts is fixed reduces to the assumption that the set of conflicts is bounded during the run-time of the protocol. The results, therefore, also apply to sets of conflicts that may evolve over time. However, the quantitative bounds in the proof get worse for larger sets of conflicts.
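To get a feeling for the quantities in Theorem 2 and its proof, the bound on q and the probability p(ε) can be evaluated numerically. The sketch below is illustrative only. The function names are hypothetical, and the last helper rests on an additional assumption that is not taken from the proof: successive attempts are treated as independent, each succeeding with probability at least p(ε), so that the number of attempts is dominated by a geometric random variable.

```python
def max_adversarial_weight(theta):
    """Upper bound min{1 - theta, theta - 1/2} on q from Theorem 2."""
    return min(1 - theta, theta - 0.5)


def success_probability(eps, n_honest, n_conflicts):
    """Lower bound p(eps) = (1 - eps)^(N_h (|C| + 1)) from the proof."""
    return (1 - eps) ** (n_honest * (n_conflicts + 1))


def expected_attempts(p):
    """Mean of a geometric distribution with success probability p.

    Assumption (not from the source): attempts are independent.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return 1 / p


# Hypothetical parameters: 100 honest nodes, 3 conflicts, eps = 0.001.
p = success_probability(0.001, n_honest=100, n_conflicts=3)
attempts = expected_attempts(p)
```

The exponent grows linearly in both the number of honest nodes and the number of conflicts, which is one way to see why the quantitative bounds deteriorate for larger conflict sets.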
Performance Studies
We summarize some of the performance analyses obtained in [71] via agent-based simulations to validate the performance of the presented concepts. The simulator used [72] is written in Go and is open source. It implements the necessary components of the consensus protocol, although some of them are simplified. In the following, we give a short description and refer to [71] for more details and further simulation results.
The simulated environment reflects a situation in which network participants are connected in a peer-to-peer network, where each node has the same number of neighbors. Nodes can gossip, receive blocks, request missing blocks, and state their opinions whenever conflicts occur. The underlying network topology is modeled by a Watts-Strogatz network. In order to mimic real-world behaviour, the simulator allows the network delay and packet loss to be specified for each node’s connection.
Nodes are modeled as independent agents that concurrently issue new blocks. This means that different nodes can have different perceptions of the Tangle and of the Approval Weights at any given moment in time. The number of nodes does not change during the simulation period, and all honest actors actively participate in the consensus mechanism. While the simulator allows different weight distributions to be modeled, we focus here on the case of a Zipf distribution with
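For readers unfamiliar with the Watts-Strogatz model, the construction can be sketched as follows. This is a generic textbook-style generator in Python, not the code of the Go simulator [72]; the function name and parameters (n nodes, k ring neighbours, rewiring probability beta) are assumptions made for the illustration. Note that with beta = 0 every node has exactly k neighbours, whereas rewiring with beta > 0 preserves the total number of edges but not the individual degrees.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Return the adjacency sets of a Watts-Strogatz graph.

    The n nodes are placed on a ring and each node is linked to its k nearest
    neighbours (k even). Each ring edge is then rewired with probability beta
    to a uniformly chosen node that is not yet a neighbour.
    """
    if k % 2 != 0 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # Regular ring lattice.
    for v in range(n):
        for d in range(1, k // 2 + 1):
            u = (v + d) % n
            adj[v].add(u)
            adj[u].add(v)

    # Rewiring step: visit each original edge (v, v + d) once.
    for d in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + d) % n
            if u in adj[v] and rng.random() < beta:
                candidates = [w for w in range(n) if w != v and w not in adj[v]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj


ring = watts_strogatz(20, 4, 0.0, seed=1)      # regular: every node has 4 neighbours
rewired = watts_strogatz(20, 4, 0.3, seed=1)   # small-world: some shortcuts added
```

Per-link delay and packet loss, as mentioned above, would be attached to the edges of such a graph; they are omitted here to keep the sketch short.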
Here, we focus on the robustness of the consensus protocol against the Bait-and-Switch attack, 13, and illustrate the influence of the Synchronized Random Reality Selection (SRRS) introduced in Section V-C.
We present simulation studies with the following specific setup. We consider
Access to the Tangles of all nodes in the simulator makes it possible to “objectively” measure the confirmation time, as proposed in [71], for each node. These measurements can be combined to extract the consensus time, which is defined as the time between the creation of a conflict and the time when all honest nodes confirm the same spending or branch. As such, for any given conflict, it is at least as large as the confirmation time at any node. By measuring the consensus time, the safety and liveness of the protocol can be analyzed.
Figure 14 shows the consensus time for the Bait-and-Switch strategy as a function of the adversarial weight if SRRS is disabled. It is interesting to note that there is some “inherent randomness” in the protocol, as blocks are issued randomly. This seems sufficient to guarantee security against an attacker with at most 20% of the total weight. In Figure 15 we see the effectiveness of the SRRS, which makes the protocol robust against the Bait-and-Switch attack up to the theoretical limit of
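The definition of the consensus time translates directly into a small computation over per-node measurements. The sketch below is a hedged illustration; the data layout (a mapping from node to confirmed branch and timestamp) and the function name are hypothetical and are not taken from the simulator [72].

```python
def consensus_time(conflict_created_at, confirmations):
    """Time from conflict creation until all honest nodes confirm the same branch.

    confirmations maps each honest node to a pair (branch, timestamp), where
    branch is None if the node has not confirmed anything yet.
    Returns None if consensus has not (yet) been reached.
    """
    if not confirmations:
        return None
    branches = {branch for branch, _ in confirmations.values()}
    if None in branches or len(branches) != 1:
        return None  # some node is undecided, or nodes confirmed different branches
    latest = max(timestamp for _, timestamp in confirmations.values())
    return latest - conflict_created_at


# Hypothetical measurements (in seconds) for a conflict created at t = 10.0.
agreed = {"n1": ("A", 12.5), "n2": ("A", 14.0), "n3": ("A", 13.0)}
split = {"n1": ("A", 12.5), "n2": ("B", 14.0)}

t_agreed = consensus_time(10.0, agreed)
t_split = consensus_time(10.0, split)
```

In this reading, a missing value signals a liveness problem (some node never confirms) or a safety problem (nodes confirm different branches), which is how the consensus time can serve both analyses.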
Consensus time distributions under Bait-and-Switch attack, without SRRS (
Consensus time distributions under Bait-and-Switch attack, with SRRS (
We conclude this section with a brief analysis of how the performance depends on the degree of decentralization and the size of the network. This also supports the values for the growth of the Witness Weight in Figure 5. Figure 16 shows the confirmation time distributions for several Zipf parameters
Confirmation time distributions of blocks with the Zipf parameter
Outlook - Future Research
The proposed consensus mechanism, in combination with the Reality-based Ledger, supports the parallelisation of many processes, such as processing, booking and voting. This can lead to a significant performance boost since it can enable multi-threaded concurrency. The potential for multi-threading, the capability to work in an asynchronous setting and the leaderless approach can together offer a highly performant consensus and ledger solution. Detailed and sound performance analysis will be necessary to validate the theoretically predicted properties.
Since the ledger can progress without global knowledge of new transactions added to it, nodes may be able to reach consensus with our mechanism even without learning about all blocks. As a consequence, the approach may enable certain sharding solutions directly on the Tangle layer, in which nodes only observe a proportion of the total ledger. However, this approach may lower performance and potentially lower security and/or liveness. To assess the viability of our solution in a sharded scenario, key questions such as the necessary assumptions must be answered, and a full security analysis is required.
The weight system from which the Approval Weight is derived can be constructed from multiple sources and in various settings. For example, the weight may be derived from the token value and the system can be operated permissioned or permissionless. A different approach is to obtain the weights through reputation systems, which has so far received little attention.
By introducing the transaction reference in addition to the block reference in Section VI, the orphanage of transactions can be reduced through Algorithm 3. However, this does not solve the problem entirely. For instance, an honest transaction can be referenced (directly) only by eventually rejected transactions and may never reach sufficient AW to be considered confirmed. This can be improved in several ways. First, nodes may keep their “own” transactions as tips until they are confirmed. This resembles an automated way of reattaching blocks. Second, nodes may also retain in the tip pool transactions that are in their preferred reality but for which they have not yet voted. These transactions may then be supported via a transaction reference. Third, one could allow block and transaction references to be conflicting for a given block. The transaction reference can then be prioritised over the block references. This enables an efficient way to remove parts of branches from the referenced aggregated branch. Another possible solution for more accurate voting is to introduce more reference types, which would eventually allow nodes to remove certain branches more explicitly from the supported branches of referenced blocks. The above examples demonstrate that solutions for the Tip Selection Algorithm can be found that mitigate or reduce orphanage. However, they require thorough analysis to cover edge cases.
Conclusion
We have introduced a new leaderless consensus protocol that can be seen as a generalisation of the Nakamoto consensus. Our protocol is based on the Tangle, which not only forms a partially ordered communication record between participants in a peer-to-peer network, but also serves as an efficient way to implicitly vote on the history of the underlying ledger. The participating nodes are associated with reputation-based weights, which are used to reach consensus on the acceptance of transactions to the ledger. The leaderless nature of the protocol allows asynchronous and concurrent write access to the ledger. It also eliminates the need for shared “memory pools” for pending transactions and the special roles of miners or validators.
We provided formal definitions and proofs for the functionalities of the protocol, as well as pseudo-code for the various core algorithms. Furthermore, we analysed the liveness and security of the protocol and discussed several attack scenarios in detail. We proved an impossibility result for safety in the asynchronous communication model. However, by introducing a synchronisation mechanism that utilises a common random coin, we proved theoretical results on the safety of the protocol. Finally, we presented initial simulation studies that confirm the performance of the protocol, with confirmation times on the order of seconds and robustness up to a theoretical upper bound of 1/3 on the adversary weight.
ACKNOWLEDGMENT
The authors would like to thank the developer team of the GoShimmer software, for supporting this study with the prototype implementation of the IOTA 2.0 protocol. They also thank precious staff members of the IOTA Foundation and members of the IOTA community for their feedback and criticism.
Appendix A
Estimates on Confluence Time
This section gives an upper bound on the confluence time
In the case where the network is in a low load regime, we can assume that the tip pool size is small. Then after several approvals, all new transactions will indirectly reference this transaction. In the high load regime, the tip pool size \begin{equation*} 1 - \left ({1 - \frac {K(t-h)}{L_{0}}}\right)^{k}.\tag{23}\end{equation*}
\begin{equation*} \frac {\lambda h}{L_{0}} = \frac {k-1}k.\tag{24}\end{equation*}
\begin{equation*} p_{A} = \frac {K(t-h)}{k L_{0}} \text {, resp. }p_{B}=\frac { (k-1) K(t-h)}{k L_{0}}\tag{25}\end{equation*}
The probability of the first event can be described by a binomial distribution. In fact, \begin{equation*} p_{1} = \sum _{i=1}^{k} {\binom{ k }{ i}} p_{B}^{i} (1-p_{A}-p_{B})^{k-i}.\tag{26}\end{equation*}
\begin{equation*} p_{1} \approx k p_{B} + \frac {1}2 k(k-1) p_{B}^{2}.\tag{27}\end{equation*}
The random variable \begin{equation*} \mathbb {P}[Y_{1} \ge 2] = \sum _{i=2}^{k} {\binom{k }{ i }} p_{A}^{i} (1-p_{A})^{k-i}.\tag{28}\end{equation*}
\begin{equation*} p_{2}= \frac {1}2 k (k-1) p_{A}^{2},\tag{29}\end{equation*}
\begin{equation*} \frac {d K(t)}{dt} = (p_{1} -p_{2}) \lambda \approx \lambda \frac { (k-1) K(t-h)}{ L_{0}}\tag{30}\end{equation*}
\begin{equation*} \frac {d K(t)}{dt} \approx \frac { (k-1)^{2} K(t-h)}{k h },\tag{31}\end{equation*}
\begin{equation*} K(t) = \exp \left ({W\left ({\frac {(k-1)^{2}}{k} }\right) \frac {t}{h} }\right),\tag{32}\end{equation*}
\begin{equation*} \tau _{c} \approx \frac {h}{W\left ({\frac {(k-1)^{2}}{k} }\right)} \left ({\log L_{0} + \log \varepsilon }\right).\tag{33}\end{equation*}
\begin{equation*} \tau _{c} \approx \frac {h}{\log k} \log (L_{0}) \approx \frac {1}{\log k} h \log (\lambda h).\tag{34}\end{equation*}
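The estimates (24), (32), (33) and (34) are easy to evaluate numerically. The following sketch is only an illustration under stated assumptions: W is taken to be the principal branch of the Lambert W function (computed here by a plain Newton iteration rather than a library call), logarithms are natural, ε is used exactly as it appears in (33), and all function names and parameter values are hypothetical.

```python
import math


def lambert_w(x, tol=1e-12, max_iter=100):
    """Principal branch W(x) for x >= 0, via Newton iteration on w * exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting point at or to the right of the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1))
        w -= step
        if abs(step) < tol:
            break
    return w


def tip_pool_size(lam, h, k):
    """L_0 obtained by solving Eq. (24), lam * h / L_0 = (k - 1) / k."""
    return lam * h * k / (k - 1)


def growth_rate(h, k):
    """Exponent a in K(t) = exp(a t) from Eq. (32), a = W((k-1)^2 / k) / h."""
    return lambert_w((k - 1) ** 2 / k) / h


def confluence_time(lam, h, k, eps):
    """Estimate of tau_c from Eq. (33)."""
    l0 = tip_pool_size(lam, h, k)
    return (math.log(l0) + math.log(eps)) / growth_rate(h, k)


def confluence_time_simplified(lam, h, k):
    """Right-hand side of Eq. (34), h * log(lam * h) / log(k)."""
    return h * math.log(lam * h) / math.log(k)


# Hypothetical parameters: 100 blocks per second, delay h = 1 s, k = 2 references.
tau = confluence_time(lam=100.0, h=1.0, k=2, eps=1.0)
tau_simple = confluence_time_simplified(lam=100.0, h=1.0, k=8)
```

As a consistency check, inserting K(t) = exp(a t) into (31) gives a = ((k-1)^2 / (k h)) exp(-a h), which is exactly the relation satisfied by a = W((k-1)^2 / k) / h.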
Appendix B
Illustrative Example
In this section, we demonstrate in Figure 18 the most important concepts introduced in the paper using a toy example. In this example, blocks have two references which are identical in some cases.
Tangle, the Ledger DAG, the Conflict DAG and the Conflict Graph are shown. The Tangle starts with the genesis
The Tangle starts with the genesis
To demonstrate the steps of our protocol we discuss the actions from the point of view of the “green” node for issuing block
We observe that the Approval Weight of transactions is often equal to the Witness Weight of the corresponding blocks. However, this is not always the case. For instance, the Approval Weight of transaction
To find the preferred reality, a node must follow Algorithm 1. Specifically, the reality
We also highlight that if at the next moment the “brown” node, which is supposed to be honest, decides to issue a new block and attach it to block
Appendix C
Glossary
Approval Weight: A function that computes the “relative” part of the network that approves a given transaction.
Conflict: A transaction that consumes the same output as a distinct transaction.
Conflicting transactions: Two transactions that contain two transactions in their past cones which consume the same output of some transaction.
Cone: A set of vertices in a DAG that are reachable from a given vertex by following the directions (past cone) and the opposite directions (future cone) of edges in the DAG.
Branch: A set of conflicts which does not contain conflicting transactions and is past-closed.
Branch DAG: A DAG that represents the relations between branches.
Ledger DAG: A data structure that stores all transactions in the form of a DAG.
Tangle DAG: A data structure that stores all blocks in the form of a DAG.
Voting DAG: An augmented DAG that represents a combination of the Tangle DAG and the Ledger DAG and is used for determining voting cones.
Genesis: The transaction that is the ultimate predecessor of any transaction of the UTXO ledger.
Block: An element of the Tangle DAG, constituted of identified data that refer to at least two blocks.
Node: A machine that is a part of the network. Its role is to issue new blocks and validate pre-existing ones.
Reality: A maximal branch.
Solidification: The process of retrieving missing blocks in the past cone of a given block which can be requested by a node.
Witness Weight: A function that computes the “relative” part of the network that approves a given block.
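The notion of a cone from the glossary can be made concrete with a short graph traversal. The sketch below is illustrative only: the toy Tangle, the block names and the helper functions are hypothetical, and it adopts the convention (an assumption, since the glossary leaves it open) that a vertex is not a member of its own cone. Edges point from a block to the blocks it references, and the two references of a block may coincide, as in the toy example of Appendix B.

```python
from collections import deque


def reachable(start, edges):
    """All vertices reachable from start by following edges (start excluded)."""
    seen = set()
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in edges.get(v, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen


def reverse_edges(edges):
    """Flip every edge so that referenced blocks point to their approvers."""
    reversed_edges = {}
    for v, targets in edges.items():
        for u in targets:
            reversed_edges.setdefault(u, set()).add(v)
    return reversed_edges


def past_cone(block, references):
    """Blocks reachable by following references (towards the genesis)."""
    return reachable(block, references)


def future_cone(block, references):
    """Blocks that directly or indirectly reference the given block."""
    return reachable(block, reverse_edges(references))


# Hypothetical toy Tangle with genesis G; each block lists its two references.
references = {
    "A": ["G", "G"],
    "B": ["G", "A"],
    "C": ["A", "B"],
    "D": ["B", "B"],
}

past_of_c = past_cone("C", references)
future_of_a = future_cone("A", references)
```

Quantities such as the Witness Weight of a block are then functions of the issuers found in its future cone; the precise definition is given in the main text and is not reproduced here.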