Introducing the consensus problem
The distributed consensus problem has been studied extensively in distributed systems research since the late 1970s. Distributed systems are classified into two main categories, namely message passing and shared memory. In the context of blockchain, we are concerned with the message passing type of distributed systems, where participants on the network communicate with each other via passing messages to each other.
Blockchain is a distributed system that relies upon a consensus mechanism, which ensures the safety and liveness of the blockchain network.
In the past decade, the rapid evolution of blockchain technology has been observed. Also, with this tremendous growth, research regarding distributed consensus has grown significantly. Researchers from industry and academia are especially interested in researching novel methods of consensus. A common research area is to convert traditional (classical) distributed consensus mechanisms into their blockchain variants that are suitable for blockchain networks. Another area of interest is to analyze existing and new consensus protocols.
As we saw in Chapter 1, Blockchain 101, there are different types of blockchain networks. In particular, two types, permissioned and public (non-permissioned) were discussed. The consensus is also classified based on these two paradigms. For example, Bitcoin is a public blockchain. It runs PoW, sometimes called Nakamoto consensus.
In contrast, many permissioned blockchains tend to run variants of traditional or classical distributed consensus. A prime example is IBFT, which is a blockchained version of PBFT. Other examples include Tendermint, Casper FFG, and many variants of PBFT. We will discuss more on that later in this chapter.
First, we will look at traditional consensus mechanisms, which are also known as fault-tolerant distributed consensus, classical distributed consensus, or pre-Bitcoin distributed consensus. Distributed consensus is a highly researched problem and many fundamental ideas to describe and elaborate on the problem have been developed. One of them, and arguably the most famous, is the Byzantine generals problem, which we'll describe next. In addition to the Byzantine generals problem, we will look at relevant and important impossibility results, which will help to build a general understanding of the consensus and relevant limitations. An understanding of these concepts is vital to understand what problem exactly is being solved and how.
The Byzantine generals problem
The problem of reaching agreement in the presence of faults or Byzantine consensus was first formulated by M. Pease, R. Shostak, and L. Lamport. In distributed systems, a common goal is to achieve consensus (agreement) among nodes on the network even in the presence of faults. In order to explain the problem, Lamport came up with an allegorical representation of the problem and named it the Byzantine generals problem.
The Byzantine generals problem metaphorically depicts a situation where a Byzantine army, divided into different units, is spread around a city. A general commands each unit, and they can only communicate with each other using a messenger. To be successful, the generals must coordinate their plan and decide whether to attack or retreat. The problem, however, is that any generals could potentially be disloyal and act maliciously to obstruct agreement upon a united plan. The requirement now becomes that every honest general must somehow agree on the same decision even in the presence of treacherous generals.
In order to address this issue, honest (loyal) generals must reach a majority agreement on their plan.
Leslie Lamport introduced numerous fundamental ideas and techniques for distributed consensus.
The famous Byzantine generals problem was formulated by Lamport et al. in their paper: Lamport, L., Shostak, R. and Pease, M., 1982. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems, 4(3), pp.382-401.
The paper is available here: https://www.microsoft.com/en-us/research/publication/byzantine-generals-problem/.
In the digital world, generals are represented by computers (nodes) and communication links are messengers carrying messages. Disloyal generals are faulty nodes. Later in this chapter, we'll see how agreement can be achieved using consensus algorithms even in the presence of faults.
Fault tolerance
A fundamental requirement in a consensus mechanism is that it must be fault-tolerant. In other words, it must be able to tolerate a number of failures in a network and should continue to work even in the presence of faults. This naturally means that there has to be some limit to the number of faults a network can handle, since no network can operate correctly if a large majority of its nodes are failing. Based on the requirement of fault tolerance, consensus algorithms are also called fault-tolerant algorithms, and there are two types of fault-tolerant algorithms.
Types of fault-tolerant consensus
Fault-tolerant algorithms can be divided into two types of fault-tolerance. The first is Crash fault-tolerance (CFT) and the other is Byzantine fault-tolerance (BFT). CFT covers only crash faults or, in other words, benign faults. In contrast, BFT deals with the type of faults that are arbitrary and can even be malicious.
Replication is a standard approach to make a system fault-tolerant. Replication results in a synchronized copy of data across all nodes in a network. This technique improves the fault tolerance and availability of the network. This means that even if some of the nodes become faulty, the overall system/network remains available due to the data being available on multiple nodes.
There are two main types of replication techniques:
- Active replication, which is a type where each replica becomes a copy of the original state machine replica.
- Passive replication, which is a type where there is only a single copy of the state machine in the system kept by the primary node, and the rest of the nodes/replicas only maintain the state.
In this section, we've briefly looked at what replication is and its types. In the context of fault-tolerant consensus mechanisms, replication plays a vital role by introducing resiliency into the system. We'll now introduce another relevant concept, known as state machine replication, which is a standard technique used to achieve fault tolerance in distributed systems.
State machine replication
State machine replication (SMR) is a de facto technique that is used to provide deterministic replication services in order to achieve fault tolerance in a distributed system. State machine replication was first proposed by Lamport in 1978 in his paper:
Lamport, L., 1978. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7), pp.558-565.
The paper is available here:
Later, in 1990, Schneider formalized the state machine replication approach and published the results in a paper titled:
Schneider, F.B., 1990. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys (CSUR), 22(4), pp.299-319.
The paper is available here:
Now, just before we discuss SMR further, let's understand what a state machine is. At an abstract level, it is a mathematical model that is used to describe a machine that can be in different states. It is important to understand that a state machine can only have one state at a time. A state machine stores a state of the system and transitions it to the next state as a result of input received. As a result of state transition, an output is produced along with an updated state.
The fundamental idea behind SMR can be summarized as follows:
- All servers always start with the same initial state.
- All servers receive requests in a totally ordered fashion (sequenced as generated from clients).
- All servers produce the same deterministic output for the same input.
State machine replication is implemented under a primary/backup paradigm, where a primary node is responsible for receiving and broadcasting client requests. This broadcast mechanism is called total order broadcast or atomic broadcast, which ensures that backup or replica nodes receive and execute the same requests in the same sequence as the primary.
Consequently, this means that all replicas will eventually have the same state as the primary, thus resulting in achieving consensus. In other words, this means that total order broadcast and distributed consensus are equivalent problems; if you solve one, the other is solved too.
Now that we understand the basics of replication and fault tolerance, it is important to understand that fault tolerance works up to a certain threshold. For example, if a network has a vast majority of constantly failing nodes and communication links, it is not hard to understand that this type of network may not be as fault-tolerant as we might like it to be. In other words, even in the presence of fault-tolerant measures, if there is a lack of resources on a network, the network may still not be able to provide the required level of fault tolerance. In some scenarios, it might be impossible to provide the required services due to a lack of resources in a system. In distributed computing, such impossible scenarios are researched and reported as impossibility results.
In distributed computing, impossibility results provide an understanding of whether a problem is solvable and the minimum resources required to do so. If the problem is unsolvable, then these results give a clear understanding that a specific task cannot be accomplished and no further research is necessary. From another angle, we can say that impossibility results (sometimes called unsolvability results) show that certain problems are not computable under insufficient resources. Impossibility results unfold deep aspects of distributed computing and enable us to understand why certain problems are difficult to solve and under what conditions a previously unsolved problem might be solved.
The requirement of minimum available resources is known as lower bound results. The problems that are not solvable under any conditions are known as unsolvability results. For example, it has been proven that asynchronous deterministic consensus is impossible. This result is known as the FLP impossibility result, which we'll introduce next.
FLP impossibility
FLP impossibility is a fundamental unsolvability result in distributed computing theory that states that in an asynchronous environment, the deterministic consensus is impossible, even if only one process is faulty.
FLP is named after the authors' names, Fischer, Lynch, and Patterson, who in 1985 introduced this result. This result was presented in their paper:
Fischer, M.J., Lynch, N.A. and Paterson, M.S., 1982. Impossibility of distributed consensus with one faulty process (No. MIT/LCS/TR-282). Massachusetts Inst of Tech Cambridge lab for Computer Science.
The paper is available at:
To circumvent FLP impossibility, several techniques have been introduced in the literature. These techniques include:
- Failure detectors, which can be seen as oracles associated with processors to detect failures.
- Randomized algorithms have been introduced to provide a probabilistic termination guarantee. The core idea behind the randomized protocols is that the processors in such protocols can make a random choice of decision value if the processor does not receive the required quorum of trusted messages.
- Synchrony assumptions, where additional synchrony and timing assumptions are made to ensure that the consensus algorithm terminates and makes progress.
Now that we understand a fundamental impossibility result, let's look at another relevant result that highlights the unsolvability of consensus due to a lack of resources: that is, a lower bound result. We can think of lower bound as a minimum amount of resources, for example, the number of processors or communication links required to solve a problem. In other words, if a minimum required number of resources is not available in a system, then the problem cannot be solved. In the context of a consensus problem, a fundamental proven result is lower bounds on the number of processors, which we describe next.
Lower bounds on the number of processors to solve consensus
As we described previously, there are proven results in distributed computing that state several lower bounds, for example, the minimum number of processors required for consensus or the minimum number of rounds required to achieve consensus. The most common and fundamental of these results is the minimum number of processors required for consensus. These results are listed below:
- In the case of CFT, at least 2F + 1 number of nodes is required to achieve consensus.
- In the case of BFT, at least 3F + 1 number of nodes is required to achieve consensus.
F represents the number of failures.
These lower bounds are discussed in several papers with relevant proofs. The most fundamental one is by Lamport et. al:
Pease, M., Shostak, R. and Lamport, L., 1980. Reaching agreement in the presence of faults. Journal of the ACM (JACM), 27(2), pp.228-234.
The paper is available here:
https://lamport.azurewebsites.net/pubs/reaching.pdf
Another paper that provides a number of impossibility proofs is:
Fischer, M.J., Lynch, N.A. and Merritt, M., 1986. Easy impossibility proofs for distributed consensus problems. Distributed Computing, 1(1), pp.26-39.
The paper is available here:
We have now covered the fundamentals of distributed consensus theory. Now, we'll delve a little bit deeper into the analysis and design of consensus algorithms.