CAP theorem and blockchain
The CAP theorem, also known as Brewer's theorem, was introduced by Eric Brewer in 1998 as a conjecture. In 2002, it was proven as a theorem by Seth Gilbert and Nancy Lynch. The theorem states that any distributed system cannot have consistency, availability, and partition tolerance simultaneously:
- Consistency is a property that ensures that all nodes in a distributed system have a single, current, and identical copy of the data.
Consistency is achieved using consensus algorithms in order to ensure that all nodes have the same copy of the data. This is also called state machine replication. The blockchain is a means for achieving state machine replication.
- Availability means that the nodes in the system are up, accessible for use, and are accepting incoming requests and responding with data without any failures as and when required. In other words, data is available at each node and the nodes are responding to requests.
- Partition tolerance ensures that if a group of nodes is unable to communicate with other nodes due to network failures, the distributed system continues to operate correctly. This can occur due to network and node failures.
A Venn diagram is commonly used to visualize the CAP theorem:
Figure 1.11: CAP theorem
The preceding diagram shows that only two properties at a time can be achieved. Either AP, CA, or CP.
In summary:
- If we opt for CP (consistency and partition tolerance), we sacrifice availability.
- If we opt for AP (availability and partition tolerance), we sacrifice consistency.
- If we opt for AC (availability and consistency), we sacrifice partition tolerance.
Usually, a network partition cannot be ignored; therefore, the choice mostly becomes either consistency or availability in the case of a network partition.
As shown previously, a distributed system cannot have consistency, availability, and partition tolerance simultaneously. This can be explained with the following example.
Let's imagine that there is a distributed system with two nodes. Now, let's apply the three theorem properties on this smallest of possible distributed systems only with two nodes:
- Consistency is achieved if both nodes have the same shared state; that is, they have the same up-to-date copy of the data.
- Availability is achieved if both nodes are up and running and responding with the latest copy of data.
- Partition tolerance is achieved if, despite communication failure or delay between nodes, the network (distributed system) continues to operate.
Now think of a scenario where a partition occurs, and nodes can no longer communicate with each other. If new updated data comes in now, it can only be updated on one node only. In that case, if the node accepts the update, then only that one node in the network is updated and therefore consistency is lost. Now, if the update is rejected by the node, that would result in loss of availability. In that case, due to partition tolerance, both availability and consistency are unachievable.
This is strange because somehow blockchain manages to achieve all of these properties—or does it?
It seems that the CAP theorem is violated by blockchain, especially in its most successful implementation, Bitcoin. However, this is not the case. In blockchains, consistency is sacrificed in favor of availability and partition tolerance. In this scenario, Consistency (C) on the blockchain is not achieved simultaneously with Partition tolerance (P) and Availability (A), but it is achieved over time. This is called eventual consistency, where consistency is achieved as a result of validation from multiple nodes over time. It means that there can be a temporary disagreement between nodes on the final state, but it is eventually agreed upon. For example, in Bitcoin, multiple transaction confirmations are required to achieve a good level of confidence that transactions may not be rolled back in the future and eventually a consistent view of transaction history is available to all nodes. Multiple confirmations of a transaction over time provide eventual consistency in Bitcoin. For this purpose, the process of mining was introduced in Bitcoin. Mining is a process that facilitates the achievement of consensus by using the PoW algorithm. At a higher level, mining can be defined as a process that is used to add more blocks to the blockchain. We will cover more on this later in Chapter 6, Introducing Bitcoin.