From 483f6702a696202c4b865c5e0710e6100d4c4d97 Mon Sep 17 00:00:00 2001
From: Greg Fitzgerald <greg@solana.com>
Date: Fri, 14 Dec 2018 11:06:53 -0700
Subject: [PATCH] Rewrite synchronization chapter (#2156)

* Rewrite synchronization chapter
* Add synchronization terminology
---
 book/src/synchronization.md | 149 ++++++++++++++++--------------------
 book/src/terminology.md     |  31 +++++++-
 2 files changed, 95 insertions(+), 85 deletions(-)

diff --git a/book/src/synchronization.md b/book/src/synchronization.md
index 4018d58498..08e045809c 100644
--- a/book/src/synchronization.md
+++ b/book/src/synchronization.md
@@ -1,84 +1,56 @@
 # Synchronization
 
-It's possible for a centralized database to process 710,000 transactions per
-second on a standard gigabit network if the transactions are, on average, no
-more than 176 bytes. A centralized database can also replicate itself and
-maintain high availability without significantly compromising that transaction
-rate using the distributed system technique known as Optimistic Concurrency
-Control [\[H.T.Kung, J.T.Robinson
-(1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At
-Solana, we're demonstrating that these same theoretical limits apply just as
-well to blockchain on an adversarial network. The key ingredient? Finding a way
-to share time when nodes can't trust one-another. Once nodes can trust time,
-suddenly ~40 years of distributed systems research becomes applicable to
-blockchain!
+Fast, reliable synchronization is the biggest reason Solana is able to achieve
+such high throughput. Traditional blockchains synchronize on large chunks of
+transactions called blocks. By synchronizing on blocks, a transaction cannot be
+processed until a duration called "block time" has passed. In Proof of Work
+consensus, these block times need to be very large (~10 minutes) to minimize
+the odds of multiple fullnodes producing a new valid block at the same time.
+There's no such constraint in Proof of Stake consensus, but without reliable
+timestamps, a fullnode cannot determine the order of incoming blocks.  The
+popular workaround is to tag each block with a [wallclock
+timestamp](https://en.bitcoin.it/wiki/Block_timestamp). Because of clock drift
+and variance in network latencies, the timestamp is only accurate within an
+hour or two. To work around the workaround, these systems lengthen block times
+to provide reasonable certainty that the median timestamp on each block is
+always increasing.
 
-> Perhaps the most striking difference between algorithms obtained by our
-> method and ones based upon timeout is that using timeout produces a
-> traditional distributed algorithm in which the processes operate
-> asynchronously, while our method produces a globally synchronous one in which
-> every process does the same thing at (approximately) the same time. Our
-> method seems to contradict the whole purpose of distributed processing, which
-> is to permit different processes to operate independently and perform
-> different functions. However, if a distributed system is really a single
-> system, then the processes must be synchronized in some way. Conceptually,
-> the easiest way to synchronize processes is to get them all to do the same
-> thing at the same time. Therefore, our method is used to implement a kernel
-> that performs the necessary synchronization--for example, making sure that
-> two different processes do not try to modify a file at the same time.
-> Processes might spend only a small fraction of their time executing the
-> synchronizing kernel; the rest of the time, they can operate
-> independently--e.g., accessing different files. This is an approach we have
-> advocated even when fault-tolerance is not required. The method's basic
-> simplicity makes it easier to understand the precise properties of a system,
-> which is crucial if one is to know just how fault-tolerant the system is.
-> [\[L.Lamport
-> (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
+Solana takes a very different approach, which it calls *Proof of History* or
+*PoH*. Leader nodes "timestamp" blocks with cryptographic proofs that some
+duration of time has passed since the last proof. All data hashed into the
+proof must have existed before the proof was generated. The node
+then shares the new block with validator nodes, which are able to verify those
+proofs. The blocks can arrive at validators in any order or could even be
+replayed years later. With such reliable synchronization guarantees, Solana is
+able to break blocks into smaller batches of transactions called *entries*.
+Entries are streamed to validators in realtime, before any notion of block
+consensus.
 
-## Verifiable Delay Functions
+Solana technically never sends a *block*, but uses the term to describe the
+sequence of entries that fullnodes vote on to achieve *confirmation*. In that
+way, Solana's confirmation times can be compared apples to apples to
+block-based systems. The current implementation sets block time to 800ms.
 
-A Verifiable Delay Function is conceptually a water clock where its water marks
-can be recorded and later verified that the water most certainly passed
-through.  Anatoly describes the water clock analogy in detail here:
-
-[water clock
-analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)
-
-The same technique has been used in Bitcoin since day one. The Bitcoin feature
-is called nLocktime and it can be used to postdate transactions using block
-height instead of a timestamp. As a Bitcoin client, you'd use block height
-instead of a timestamp if you don't trust the network. Block height turns out
-to be an instance of what's being called a Verifiable Delay Function in
-cryptography circles. It's a cryptographically secure way to say time has
-passed. In Solana, we use a far more granular verifiable delay function, a SHA
-256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we
-implement Optimistic Concurrency Control and are now well en route towards that
-theoretical limit of 710,000 transactions per second.
-
-## Proof of History
-
-[Proof of History
-overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)
-
-### Relationship to Consensus Mechanisms
-
-Most confusingly, a Proof of History (PoH) is more similar to a Verifiable
-Delay Function (VDF) than a Proof of Work or Proof of Stake consensus
-mechanism. The name unfortunately requires some historical context to
-understand. Proof of History was developed by Anatoly Yakovenko in November of
-2017, roughly 6 months before we saw a [paper using the term
-VDF](https://eprint.iacr.org/2018/601.pdf). At that time, it was commonplace to
-publish new proofs of some desirable property used to build most any blockchain
-component. Some time shortly after, the crypto community began charting out all
-the different consensus mechanisms and because most of them started with "Proof
-of", the prefix became synonymous with a "consensus" suffix. Proof of History
-is not a consensus mechanism, but it is used to improve the performance of
-Solana's Proof of Stake consensus. It is also used to improve the performance
-of the replication and storage protocols. To minimize confusion, Solana may
-rebrand PoH to some flavor of the term VDF.
+What's happening under the hood is that entries are streamed to validators as
+quickly as a leader node can batch a set of valid transactions into an entry.
+Validators process those entries long before it is time to vote on their
+validity. By processing the transactions optimistically, there is effectively
+no delay between the time the last entry is received and the time when the node
+can vote. In the event consensus is **not** achieved, a node simply rolls back
+its state. This optimistic processing technique was introduced in 1981 and
+called [Optimistic Concurrency
+Control](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735).  It
+can be applied to blockchain architecture where a cluster votes on a hash that
+represents the full ledger up to some *block height*. In Solana, it is
+implemented trivially using the last entry's PoH hash.
 
 ### Relationship to VDFs
 
+The Proof of History technique was first described for use in blockchain by
+Solana in November of 2017. In June of the following year, a similar technique
+was described at Stanford and called a [verifiable delay
+function](https://eprint.iacr.org/2018/601.pdf) or *VDF*.
+
 A desirable property of a VDF is that verification time is very fast. Solana's
 approach to verifying its delay function is proportional to the time it took to
 create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's
@@ -90,13 +62,26 @@ just the subset with certain performance characteristics. Until that's
 resolved, Solana will likely continue using the term PoH for its
 application-specific VDF.
 
-Another difference between PoH and VDFs used only for tracking duration, is
-that PoH's hash chain includes hashes of any data the application observed.
-That data is a double-edged sword. On one side, the data "proves history" -
-that the data most certainly existed before hashes after it. On the side, it
-means the application can manipulate the hash chain by changing *when* the data
-is hashed. The PoH chain therefore does not serve as a good source of
-randomness whereas a VDF without that data could. Solana's [leader rotation
-algorithm](#leader-rotation), for example, is derived only from the VDF
-*height* and not its hash at that height.
+Another difference between PoH and VDFs is that a VDF is used only for tracking
+duration. PoH's hash chain, on the other hand, includes hashes of any data the
+application observed.  That data is a double-edged sword. On one side, the data
+"proves history" - that the data most certainly existed before hashes after it.
+On the other side, it means the application can manipulate the hash chain by
+changing *when* the data is hashed. The PoH chain therefore does not serve as a
+good source of randomness, whereas a VDF without that data could. Solana's
+[leader rotation algorithm](#leader-rotation), for example, is derived only
+from the VDF *height* and not its hash at that height.
 
+### Relationship to Consensus Mechanisms
+
+Proof of History is not a consensus mechanism, but it is used to improve the
+performance of Solana's Proof of Stake consensus. It is also used to improve
+the performance of the data plane and replication protocols.
+
+### More on Proof of History
+
+* [water clock
+  analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)
+
+* [Proof of History
+  overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)
diff --git a/book/src/terminology.md b/book/src/terminology.md
index 0506443d34..5ce4e5abb2 100644
--- a/book/src/terminology.md
+++ b/book/src/terminology.md
@@ -18,9 +18,14 @@ A fraction of a [block](#block); the smallest unit sent between
 
 #### block
 
-A contiguous set of [entries](#entry) on the ledger covered by a [vote](#ledger-vote).
-The duration of a block is some number of [ticks](#tick), configured via the
-[control plane](#control-plane). Also called [voting period](#voting-period).
+A contiguous set of [entries](#entry) on the ledger covered by a
+[vote](#ledger-vote).  The duration of a block is some cluster-configured
+number of [ticks](#tick).  Also called [voting period](#voting-period).
+
+#### block height
+
+The number of [blocks](#block) beneath the current block plus one. The [genesis
+block](#genesis-block), for example, has block height 1.
 
 #### bootstrap leader
 
@@ -153,6 +158,10 @@ A computer particpating in a [cluster](#cluster).
 
 The number of [fullnodes](#fullnode) participating in a [cluster](#cluster).
 
+#### PoH
+
+See [Proof of History](#proof-of-history).
+
 #### program
 
 The code that interprets [instructions](#instruction).
@@ -161,6 +170,13 @@ The code that interprets [instructions](#instruction).
 
 The public key of the [account](#account) containing a [program](#program).
 
+#### Proof of History
+
+A stack of proofs, each of which proves that some data existed before the proof
+was created and that a precise duration of time passed since the previous
+proof. Like a [VDF](#verifiable-delay-function), a Proof of History can be
+verified in less time than it took to produce.
+
 #### public key
 
 The public key of a [keypair](#keypair).
@@ -224,6 +240,15 @@ A set of [transactions](#transaction) that may be executed in parallel.
 The role of a [fullnode](#fullnode) when it is validating the
 [leader's](#leader) latest [entries](#entry).
 
+#### VDF
+
+See [verifiable delay function](#verifiable-delay-function).
+
+#### verifiable delay function
+
+A function that takes a fixed amount of time to execute and produces a proof
+that it ran, which can then be verified in less time than it took to produce.
+
 #### vote
 
 See [ledger vote](#ledger-vote).