From 483f6702a696202c4b865c5e0710e6100d4c4d97 Mon Sep 17 00:00:00 2001 From: Greg Fitzgerald Date: Fri, 14 Dec 2018 11:06:53 -0700 Subject: [PATCH] Rewrite synchronization chapter (#2156) * Rewrite synchronization chapter * Add synchronization terminology --- book/src/synchronization.md | 149 ++++++++++++++++-------------------- book/src/terminology.md | 31 +++++++- 2 files changed, 95 insertions(+), 85 deletions(-) diff --git a/book/src/synchronization.md b/book/src/synchronization.md index 4018d58498..08e045809c 100644 --- a/book/src/synchronization.md +++ b/book/src/synchronization.md @@ -1,84 +1,56 @@ # Synchronization -It's possible for a centralized database to process 710,000 transactions per -second on a standard gigabit network if the transactions are, on average, no -more than 176 bytes. A centralized database can also replicate itself and -maintain high availability without significantly compromising that transaction -rate using the distributed system technique known as Optimistic Concurrency -Control [\[H.T.Kung, J.T.Robinson -(1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At -Solana, we're demonstrating that these same theoretical limits apply just as -well to blockchain on an adversarial network. The key ingredient? Finding a way -to share time when nodes can't trust one-another. Once nodes can trust time, -suddenly ~40 years of distributed systems research becomes applicable to -blockchain! +Fast, reliable synchronization is the biggest reason Solana is able to achieve +such high throughput. Traditional blockchains synchronize on large chunks of +transactions called blocks. By synchronizing on blocks, a transaction cannot be +processed until a duration called "block time" has passed. In Proof of Work +consensus, these block times need to be very large (~10 minutes) to minimize +the odds of multiple fullnodes producing a new valid block at the same time. 
+There's no such constraint in Proof of Stake consensus, but without reliable +timestamps, a fullnode cannot determine the order of incoming blocks. The +popular workaround is to tag each block with a [wallclock +timestamp](https://en.bitcoin.it/wiki/Block_timestamp). Because of clock drift +and variance in network latencies, the timestamp is only accurate within an +hour or two. To work around the workaround, these systems lengthen block times +to provide reasonable certainty that the median timestamp on each block is +always increasing. -> Perhaps the most striking difference between algorithms obtained by our -> method and ones based upon timeout is that using timeout produces a -> traditional distributed algorithm in which the processes operate -> asynchronously, while our method produces a globally synchronous one in which -> every process does the same thing at (approximately) the same time. Our -> method seems to contradict the whole purpose of distributed processing, which -> is to permit different processes to operate independently and perform -> different functions. However, if a distributed system is really a single -> system, then the processes must be synchronized in some way. Conceptually, -> the easiest way to synchronize processes is to get them all to do the same -> thing at the same time. Therefore, our method is used to implement a kernel -> that performs the necessary synchronization--for example, making sure that -> two different processes do not try to modify a file at the same time. -> Processes might spend only a small fraction of their time executing the -> synchronizing kernel; the rest of the time, they can operate -> independently--e.g., accessing different files. This is an approach we have -> advocated even when fault-tolerance is not required. The method's basic -> simplicity makes it easier to understand the precise properties of a system, -> which is crucial if one is to know just how fault-tolerant the system is. 
-> [\[L.Lamport -> (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078) +Solana takes a very different approach, which it calls *Proof of History* or +*PoH*. Leader nodes "timestamp" blocks with cryptographic proofs that some +duration of time has passed since the last proof. All data hashed into the +proof most certainly have occurred before the proof was generated. The node +then shares the new block with validator nodes, which are able to verify those +proofs. The blocks can arrive at validators in any order or even could be +replayed years later. With such reliable synchronization guarantees, Solana is +able to break blocks into smaller batches of transactions called *entries*. +Entries are streamed to validators in realtime, before any notion of block +consensus. -## Verifiable Delay Functions +Solana technically never sends a *block*, but uses the term to describe the +sequence of entries that fullnodes vote on to achieve *confirmation*. In that +way, Solana's confirmation times can be compared apples to apples to +block-based systems. The current implementation sets block time to 800ms. -A Verifiable Delay Function is conceptually a water clock where its water marks -can be recorded and later verified that the water most certainly passed -through. Anatoly describes the water clock analogy in detail here: - -[water clock -analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8) - -The same technique has been used in Bitcoin since day one. The Bitcoin feature -is called nLocktime and it can be used to postdate transactions using block -height instead of a timestamp. As a Bitcoin client, you'd use block height -instead of a timestamp if you don't trust the network. Block height turns out -to be an instance of what's being called a Verifiable Delay Function in -cryptography circles. It's a cryptographically secure way to say time has -passed. 
In Solana, we use a far more granular verifiable delay function, a SHA -256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we -implement Optimistic Concurrency Control and are now well en route towards that -theoretical limit of 710,000 transactions per second. - -## Proof of History - -[Proof of History -overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274) - -### Relationship to Consensus Mechanisms - -Most confusingly, a Proof of History (PoH) is more similar to a Verifiable -Delay Function (VDF) than a Proof of Work or Proof of Stake consensus -mechanism. The name unfortunately requires some historical context to -understand. Proof of History was developed by Anatoly Yakovenko in November of -2017, roughly 6 months before we saw a [paper using the term -VDF](https://eprint.iacr.org/2018/601.pdf). At that time, it was commonplace to -publish new proofs of some desirable property used to build most any blockchain -component. Some time shortly after, the crypto community began charting out all -the different consensus mechanisms and because most of them started with "Proof -of", the prefix became synonymous with a "consensus" suffix. Proof of History -is not a consensus mechanism, but it is used to improve the performance of -Solana's Proof of Stake consensus. It is also used to improve the performance -of the replication and storage protocols. To minimize confusion, Solana may -rebrand PoH to some flavor of the term VDF. +What's happening under the hood is that entries are streamed to validators as +quickly as a leader node can batch a set of valid transactions into an entry. +Validators process those entries long before it is time to vote on their +validity. By processing the transactions optimistically, there is effectively +no delay between the time the last entry is received and the time when the node +can vote. In the event consensus is **not** achieved, a node simply rolls back +its state. 
This optimistic processing technique was introduced in 1981 and +called [Optimistic Concurrency +Control](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). It +can be applied to blockchain architecture where a cluster votes on a hash that +represents the full ledger up to some *block height*. In Solana, it is +implemented trivially using the last entry's PoH hash. ### Relationship to VDFs +The Proof of History technique was first described for use in blockchain by +Solana in November of 2017. In June of the following year, a similar technique +was described at Stanford and called a [verifiable delay +function](https://eprint.iacr.org/2018/601.pdf) or *VDF*. + A desirable property of a VDF is that verification time is very fast. Solana's approach to verifying its delay function is proportional to the time it took to create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's @@ -90,13 +62,26 @@ just the subset with certain performance characteristics. Until that's resolved, Solana will likely continue using the term PoH for its application-specific VDF. -Another difference between PoH and VDFs used only for tracking duration, is -that PoH's hash chain includes hashes of any data the application observed. -That data is a double-edged sword. On one side, the data "proves history" - -that the data most certainly existed before hashes after it. On the side, it -means the application can manipulate the hash chain by changing *when* the data -is hashed. The PoH chain therefore does not serve as a good source of -randomness whereas a VDF without that data could. Solana's [leader rotation -algorithm](#leader-rotation), for example, is derived only from the VDF -*height* and not its hash at that height. +Another difference between PoH and VDFs is that a VDF is used only for tracking +duration. PoH's hash chain, on the other hand, includes hashes of any data the +application observed. That data is a double-edged sword. 
On one side, the data +"proves history" - that the data most certainly existed before hashes after it. +On the other side, it means the application can manipulate the hash chain by changing +*when* the data is hashed. The PoH chain therefore does not serve as a good +source of randomness whereas a VDF without that data could. Solana's [leader +rotation algorithm](#leader-rotation), for example, is derived only from the +VDF *height* and not its hash at that height. +### Relationship to Consensus Mechanisms + +Proof of History is not a consensus mechanism, but it is used to improve the +performance of Solana's Proof of Stake consensus. It is also used to improve +the performance of the data plane and replication protocols. + +### More on Proof of History + +* [water clock + analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8) + +* [Proof of History + overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274) diff --git a/book/src/terminology.md b/book/src/terminology.md index 0506443d34..5ce4e5abb2 100644 --- a/book/src/terminology.md +++ b/book/src/terminology.md @@ -18,9 +18,14 @@ A fraction of a [block](#block); the smallest unit sent between #### block -A contiguous set of [entries](#entry) on the ledger covered by a [vote](#ledger-vote). -The duration of a block is some number of [ticks](#tick), configured via the -[control plane](#control-plane). Also called [voting period](#voting-period). +A contiguous set of [entries](#entry) on the ledger covered by a +[vote](#ledger-vote). The duration of a block is some cluster-configured +number of [ticks](#tick). Also called [voting period](#voting-period). + +#### block height + +The number of [blocks](#block) beneath the current block plus one. The [genesis +block](#genesis-block), for example, has block height 1. #### bootstrap leader @@ -153,6 +158,10 @@ A computer particpating in a [cluster](#cluster). 
The number of [fullnodes](#fullnode) participating in a [cluster](#cluster). +#### PoH + +See [Proof of History](#proof-of-history). + #### program The code that interprets [instructions](#instruction). @@ -161,6 +170,13 @@ The code that interprets [instructions](#instruction). The public key of the [account](#account) containing a [program](#program). +#### Proof of History + +A stack of proofs, each of which proves that some data existed before the proof +was created and that a precise duration of time passed since the previous +proof. Like a [VDF](#verifiable-delay-function), a Proof of History can be +verified in less time than it took to produce. + #### public key The public key of a [keypair](#keypair). @@ -224,6 +240,15 @@ A set of [transactions](#transaction) that may be executed in parallel. The role of a [fullnode](#fullnode) when it is validating the [leader's](#leader) latest [entries](#entry). +#### VDF + +See [verifiable delay function](#verifiable-delay-function). + +#### verifiable delay function + +A function that takes a fixed amount of time to execute and produces a proof +that it ran, which can then be verified in less time than it took to produce. + #### vote See [ledger vote](#ledger-vote).
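
The Proof of History and verifiable delay function entries above describe a hash chain that both counts elapsed work and fixes the position of observed data. The following is a minimal Python sketch of that idea, not Solana's implementation. The entry layout `(num_hashes, data, hash)`, the function names `next_entry`, `verify`, and `build_demo_chain`, and the way data is mixed in (hashing the previous hash concatenated with the data's hash) are all assumptions made for illustration.

```python
import hashlib
from typing import List, Optional, Tuple

# Hypothetical entry layout, for illustration only:
# (number of sequential hashes, optional observed data, resulting hash).
Entry = Tuple[int, Optional[bytes], bytes]


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_entry(prev_hash: bytes, num_hashes: int, data: Optional[bytes]) -> Entry:
    """Extend the chain by num_hashes sequential hashes.

    If data is given, its hash is mixed into the final hash (an assumed
    scheme), so the data must have existed before any later hash.
    """
    assert num_hashes >= 1
    h = prev_hash
    for _ in range(num_hashes - 1):
        h = sha256(h)
    h = sha256(h) if data is None else sha256(h + sha256(data))
    return (num_hashes, data, h)


def verify(start_hash: bytes, entries: List[Entry]) -> bool:
    """Replay every entry from start_hash and compare the resulting hashes."""
    h = start_hash
    for num_hashes, data, expected in entries:
        if num_hashes < 1:
            return False
        if next_entry(h, num_hashes, data)[2] != expected:
            return False
        h = expected
    return True


def build_demo_chain() -> Tuple[bytes, List[Entry]]:
    """Build a short chain with two data-carrying entries (made-up payloads)."""
    start = sha256(b"hypothetical genesis seed")
    entries: List[Entry] = []
    h = start
    steps = [(3, None), (2, b"transaction A"), (4, None), (1, b"transaction B")]
    for num_hashes, data in steps:
        entry = next_entry(h, num_hashes, data)
        entries.append(entry)
        h = entry[2]
    return start, entries


if __name__ == "__main__":
    start, entries = build_demo_chain()
    print("chain verifies:", verify(start, entries))
    print("height in hashes:", sum(n for n, _, _ in entries))
```

In this sketch, verification repeats the same number of hashes the generator performed, which matches the chapter's remark that verification cost is proportional to creation cost. Each entry depends only on its predecessor's recorded hash, so entries could be checked in parallel, and the running count of hashes (the height) is independent of what data was mixed in or when.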