Rewrite synchronization chapter (#2156)

* Rewrite synchronization chapter
* Add synchronization terminology
Greg Fitzgerald
2018-12-14 11:06:53 -07:00
committed by GitHub
parent f6e3464ab9
commit 483f6702a6
2 changed files with 95 additions and 85 deletions

@ -1,84 +1,56 @@
# Synchronization # Synchronization
It's possible for a centralized database to process 710,000 transactions per Fast, reliable synchronization is the biggest reason Solana is able to achieve
second on a standard gigabit network if the transactions are, on average, no such high throughput. Traditional blockchains synchronize on large chunks of
more than 176 bytes. A centralized database can also replicate itself and transactions called blocks. By synchronizing on blocks, a transaction cannot be
maintain high availability without significantly compromising that transaction processed until a duration called "block time" has passed. In Proof of Work
rate using the distributed system technique known as Optimistic Concurrency consensus, these block times need to be very large (~10 minutes) to minimize
Control [\[H.T.Kung, J.T.Robinson the odds of multiple fullnodes producing a new valid block at the same time.
(1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At There's no such constraint in Proof of Stake consensus, but without reliable
Solana, we're demonstrating that these same theoretical limits apply just as timestamps, a fullnode cannot determine the order of incoming blocks. The
well to blockchain on an adversarial network. The key ingredient? Finding a way popular workaround is to tag each block with a [wallclock
to share time when nodes can't trust one-another. Once nodes can trust time, timestamp](https://en.bitcoin.it/wiki/Block_timestamp). Because of clock drift
suddenly ~40 years of distributed systems research becomes applicable to and variance in network latencies, the timestamp is only accurate within an
blockchain! hour or two. To work around the workaround, these systems lengthen block times
to provide reasonable certainty that the median timestamp on each block is
always increasing.
> Perhaps the most striking difference between algorithms obtained by our Solana takes a very different approach, which it calls *Proof of History* or
> method and ones based upon timeout is that using timeout produces a *PoH*. Leader nodes "timestamp" blocks with cryptographic proofs that some
> traditional distributed algorithm in which the processes operate duration of time has passed since the last proof. All data hashed into the
> asynchronously, while our method produces a globally synchronous one in which proof most certainly existed before the proof was generated. The node
> every process does the same thing at (approximately) the same time. Our then shares the new block with validator nodes, which are able to verify those
> method seems to contradict the whole purpose of distributed processing, which proofs. The blocks can arrive at validators in any order or could even be
> is to permit different processes to operate independently and perform replayed years later. With such reliable synchronization guarantees, Solana is
> different functions. However, if a distributed system is really a single able to break blocks into smaller batches of transactions called *entries*.
> system, then the processes must be synchronized in some way. Conceptually, Entries are streamed to validators in realtime, before any notion of block
> the easiest way to synchronize processes is to get them all to do the same consensus.
> thing at the same time. Therefore, our method is used to implement a kernel
> that performs the necessary synchronization--for example, making sure that
> two different processes do not try to modify a file at the same time.
> Processes might spend only a small fraction of their time executing the
> synchronizing kernel; the rest of the time, they can operate
> independently--e.g., accessing different files. This is an approach we have
> advocated even when fault-tolerance is not required. The method's basic
> simplicity makes it easier to understand the precise properties of a system,
> which is crucial if one is to know just how fault-tolerant the system is.
> [\[L.Lamport
> (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
## Verifiable Delay Functions Solana technically never sends a *block*, but uses the term to describe the
sequence of entries that fullnodes vote on to achieve *confirmation*. In that
way, Solana's confirmation times can be compared apples to apples to
block-based systems. The current implementation sets block time to 800ms.
A Verifiable Delay Function is conceptually a water clock where its water marks What's happening under the hood is that entries are streamed to validators as
can be recorded and later verified that the water most certainly passed quickly as a leader node can batch a set of valid transactions into an entry.
through. Anatoly describes the water clock analogy in detail here: Validators process those entries long before it is time to vote on their
validity. By processing the transactions optimistically, there is effectively
[water clock no delay between the time the last entry is received and the time when the node
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8) can vote. In the event consensus is **not** achieved, a node simply rolls back
its state. This optimistic processing technique was introduced in 1981 and
The same technique has been used in Bitcoin since day one. The Bitcoin feature called [Optimistic Concurrency
is called nLocktime and it can be used to postdate transactions using block Control](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). It
height instead of a timestamp. As a Bitcoin client, you'd use block height can be applied to blockchain architecture where a cluster votes on a hash that
instead of a timestamp if you don't trust the network. Block height turns out represents the full ledger up to some *block height*. In Solana, it is
to be an instance of what's being called a Verifiable Delay Function in implemented trivially using the last entry's PoH hash.
cryptography circles. It's a cryptographically secure way to say time has
passed. In Solana, we use a far more granular verifiable delay function, a SHA
256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we
implement Optimistic Concurrency Control and are now well en route towards that
theoretical limit of 710,000 transactions per second.
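
The hash-chain idea described above can be sketched in a few lines. This is a minimal illustration, not Solana's implementation: the `Entry` layout, the `next_entry` and `verify` helpers, and the way data is mixed into the chain are all assumptions made for the example. It shows only the two properties the text relies on: each entry commits to a count of sequential SHA-256 hashes, and any data hashed into an entry must have existed before the hashes that follow it.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    num_hashes: int         # sequential hashes performed since the previous entry
    data: Optional[bytes]   # observed data mixed into the chain, if any
    end_hash: bytes         # chain state after this entry


def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


def next_entry(prev_hash: bytes, num_hashes: int,
               data: Optional[bytes] = None) -> Entry:
    """Advance the chain num_hashes steps; the final step mixes in data if given."""
    if num_hashes < 1:
        raise ValueError("num_hashes must be at least 1")
    h = prev_hash
    for _ in range(num_hashes - 1):
        h = sha256(h)
    # Hypothetical mixing rule: append the digest of the data to the chain state.
    h = sha256(h) if data is None else sha256(h + sha256(data))
    return Entry(num_hashes, data, h)


def verify(start_hash: bytes, entries: List[Entry]) -> bool:
    """Replay every entry from start_hash and compare the recorded hashes."""
    h = start_hash
    for e in entries:
        if next_entry(h, e.num_hashes, e.data).end_hash != e.end_hash:
            return False
        h = e.end_hash
    return True
```

A leader in this sketch would call `next_entry` repeatedly, passing transaction bytes as `data` whenever it has some, and a validator would call `verify` on the received entries. Reordering entries or altering their data makes `verify` return `False`, because each hash depends on everything before it.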
## Proof of History
[Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)
### Relationship to Consensus Mechanisms
Most confusingly, a Proof of History (PoH) is more similar to a Verifiable
Delay Function (VDF) than a Proof of Work or Proof of Stake consensus
mechanism. The name unfortunately requires some historical context to
understand. Proof of History was developed by Anatoly Yakovenko in November of
2017, roughly 6 months before we saw a [paper using the term
VDF](https://eprint.iacr.org/2018/601.pdf). At that time, it was commonplace to
publish new proofs of some desirable property used to build most any blockchain
component. Some time shortly after, the crypto community began charting out all
the different consensus mechanisms and because most of them started with "Proof
of", the prefix became synonymous with a "consensus" suffix. Proof of History
is not a consensus mechanism, but it is used to improve the performance of
Solana's Proof of Stake consensus. It is also used to improve the performance
of the replication and storage protocols. To minimize confusion, Solana may
rebrand PoH to some flavor of the term VDF.
### Relationship to VDFs ### Relationship to VDFs
The Proof of History technique was first described for use in blockchain by
Solana in November of 2017. In June of the following year, a similar technique
was described at Stanford and called a [verifiable delay
function](https://eprint.iacr.org/2018/601.pdf) or *VDF*.
A desirable property of a VDF is that verification time is very fast. Solana's A desirable property of a VDF is that verification time is very fast. Solana's
approach to verifying its delay function is proportional to the time it took to approach to verifying its delay function is proportional to the time it took to
create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's
@ -90,13 +62,26 @@ just the subset with certain performance characteristics. Until that's
resolved, Solana will likely continue using the term PoH for its resolved, Solana will likely continue using the term PoH for its
application-specific VDF. application-specific VDF.
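
Verification of a hash chain can be split across cores because each entry records its own ending hash, so checking entry *i* needs only the ending hash of entry *i - 1*. The sketch below illustrates that structure under stated assumptions: the `(num_hashes, end_hash)` tuple layout and the function names are hypothetical, and a thread pool stands in for the GPU mentioned above, so it shows the shape of the computation rather than its real performance.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical entry layout: (number of sequential hashes, resulting hash).
Tick = Tuple[int, bytes]


def extend(start: bytes, num_hashes: int) -> bytes:
    """Apply SHA-256 num_hashes times, starting from start."""
    h = start
    for _ in range(num_hashes):
        h = hashlib.sha256(h).digest()
    return h


def generate(seed: bytes, num_entries: int, hashes_per_entry: int) -> List[Tick]:
    """Produce the chain sequentially; this part cannot be parallelized."""
    entries = []
    h = seed
    for _ in range(num_entries):
        h = extend(h, hashes_per_entry)
        entries.append((hashes_per_entry, h))
    return entries


def verify_parallel(seed: bytes, entries: List[Tick], workers: int = 4) -> bool:
    """Check every entry independently, each from its predecessor's hash."""
    starts = [seed] + [end for _, end in entries[:-1]]

    def check(pair) -> bool:
        start, (num_hashes, end) = pair
        return extend(start, num_hashes) == end

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return all(pool.map(check, zip(starts, entries)))
```

The total work in `verify_parallel` equals the work in `generate`, which is why verification is proportional to creation time. Only the wall-clock time shrinks, roughly by the number of workers that truly run in parallel.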
Another difference between PoH and VDFs used only for tracking duration, is Another difference between PoH and VDFs is that a VDF is used only for tracking
that PoH's hash chain includes hashes of any data the application observed. duration. PoH's hash chain, on the other hand, includes hashes of any data the
That data is a double-edged sword. On one side, the data "proves history" - application observed. That data is a double-edged sword. On one side, the data
that the data most certainly existed before hashes after it. On the other side, it "proves history" - that the data most certainly existed before hashes after it.
means the application can manipulate the hash chain by changing *when* the data On the other side, it means the application can manipulate the hash chain by changing
is hashed. The PoH chain therefore does not serve as a good source of *when* the data is hashed. The PoH chain therefore does not serve as a good
randomness whereas a VDF without that data could. Solana's [leader rotation source of randomness whereas a VDF without that data could. Solana's [leader
algorithm](#leader-rotation), for example, is derived only from the VDF rotation algorithm](#leader-rotation), for example, is derived only from the
*height* and not its hash at that height. VDF *height* and not its hash at that height.
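
To see why the hash is a poor source of randomness while the height is not, consider the sketch below. It is illustrative only: `run_chain` and its mixing rule are hypothetical, and `leader_for_height` is a simple round-robin stand-in, not Solana's leader rotation algorithm. The point is that whoever controls *when* data is hashed controls the resulting hash, whereas a value derived from height alone is unaffected.

```python
import hashlib
from typing import List


def run_chain(seed: bytes, height: int, data: bytes, mix_at: int) -> bytes:
    """Hash `height` times, mixing `data` into step number `mix_at` (1-based)."""
    h = seed
    for step in range(1, height + 1):
        if step == mix_at:
            h = hashlib.sha256(h + data).digest()
        else:
            h = hashlib.sha256(h).digest()
    return h


def leader_for_height(height: int, leaders: List[str],
                      heights_per_leader: int) -> str:
    """Hypothetical round-robin schedule that depends only on the height."""
    return leaders[(height // heights_per_leader) % len(leaders)]


# Same seed, same data, same height: only the moment of mixing differs.
early = run_chain(b"seed", 10, b"tx", mix_at=3)
late = run_chain(b"seed", 10, b"tx", mix_at=7)
```

Here `early` and `late` differ even though both chains have height 10, so an application could try several mixing times to steer the hash. The result of `leader_for_height(10, ...)` is the same in both cases because it never looks at the hash.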
### Relationship to Consensus Mechanisms
Proof of History is not a consensus mechanism, but it is used to improve the
performance of Solana's Proof of Stake consensus. It is also used to improve
the performance of the data plane and replication protocols.
### More on Proof of History
* [water clock
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)
* [Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)

@ -18,9 +18,14 @@ A fraction of a [block](#block); the smallest unit sent between
#### block #### block
A contiguous set of [entries](#entry) on the ledger covered by a [vote](#ledger-vote). A contiguous set of [entries](#entry) on the ledger covered by a
The duration of a block is some number of [ticks](#tick), configured via the [vote](#ledger-vote). The duration of a block is some cluster-configured
[control plane](#control-plane). Also called [voting period](#voting-period). number of [ticks](#tick). Also called [voting period](#voting-period).
#### block height
The number of [blocks](#block) beneath the current block plus one. The [genesis
block](#genesis-block), for example, has block height 1.
#### bootstrap leader #### bootstrap leader
@ -153,6 +158,10 @@ A computer participating in a [cluster](#cluster).
The number of [fullnodes](#fullnode) participating in a [cluster](#cluster). The number of [fullnodes](#fullnode) participating in a [cluster](#cluster).
#### PoH
See [Proof of History](#proof-of-history).
#### program #### program
The code that interprets [instructions](#instruction). The code that interprets [instructions](#instruction).
@ -161,6 +170,13 @@ The code that interprets [instructions](#instruction).
The public key of the [account](#account) containing a [program](#program). The public key of the [account](#account) containing a [program](#program).
#### Proof of History
A stack of proofs, each of which proves that some data existed before the proof
was created and that a precise duration of time passed since the previous
proof. Like a [VDF](#verifiable-delay-function), a Proof of History can be
verified in less time than it took to produce.
#### public key #### public key
The public key of a [keypair](#keypair). The public key of a [keypair](#keypair).
@ -224,6 +240,15 @@ A set of [transactions](#transaction) that may be executed in parallel.
The role of a [fullnode](#fullnode) when it is validating the The role of a [fullnode](#fullnode) when it is validating the
[leader's](#leader) latest [entries](#entry). [leader's](#leader) latest [entries](#entry).
#### VDF
See [verifiable delay function](#verifiable-delay-function).
#### verifiable delay function
A function that takes a fixed amount of time to execute and produces a proof
that it ran, which can then be verified in less time than it took to produce.
#### vote #### vote
See [ledger vote](#ledger-vote). See [ledger vote](#ledger-vote).