Reduce Avalanche redundancy and implement traditional fanout (#4174)

* Reduce Avalanche redundancy and implement traditional fanout
* Revert tiny fanout
* Update diagrams and docs based on review comments
2019-05-07 13:24:58 -07:00
parent 4f3b22d04e
commit 2107e15bd3
8 changed files with 218 additions and 214 deletions
--- a/book/art/data-plane-fanout.bob
+++ b/book/art/data-plane-fanout.bob
@@ -0,0 +1,19 @@
+------------------------------------------------------------------+
+|                                                                  |
+|   +-----------------+    Neighborhood 0    +-----------------+   |
+|   |                 +--------------------->+                 |   |
+|   |   Validator 1   |                      |   Validator 2   |   |
+|   |                 +<---------------------+                 |   |
+|   +--------+-+------+                      +------+-+--------+   |
+|            | |                                    | |            |
+|            | +-----------------------------+      | |            |
+|            |      +------------------------+------+ |            |
+|            |      |                        |        |            |
+------------------------------------------------------------------+
+             |      |                        |        |
+             v      v                        v        v
+   +---------+------+---+                  +-+--------+---------+
+   |                    |                  |                    |
+   |   Neighborhood 1   |                  |   Neighborhood 2   |
+   |                    |                  |                    |
+   +--------------------+                  +--------------------+
--- a/book/art/data-plane-seeding.bob
+++ b/book/art/data-plane-seeding.bob
@@ -0,0 +1,15 @@
+                          +--------------+
+                          |              |
+             +------------+    Leader    +------------+
+             |            |              |            |
+             |            +--------------+            |
+             v                                        v
+------------+----------------------------------------+------------+
+|                                                                  |
+|   +-----------------+    Neighborhood 0    +-----------------+   |
+|   |                 +--------------------->+                 |   |
+|   |   Validator 1   |                      |   Validator 2   |   |
+|   |                 +<---------------------+                 |   |
+|   +-----------------+                      +-----------------+   |
+|                                                                  |
+------------------------------------------------------------------+
--- a/book/art/data-plane.bob
+++ b/book/art/data-plane.bob
@@ -1,28 +1,18 @@
-
-                                          +--------------+
-                                          |              |
-                             +------------+    Leader    +------------+
-                             |            |              |            |
-                             |            +--------------+            |
-                             v                                        v
-                    +--------+--------+                      +--------+--------+
-                    |                 +--------------------->+                 |
-  +-----------------+   Validator 1   |                      |   Validator 2   +-------------+
-  |                 |                 +<---------------------+                 |             |
-  |                 +------+-+-+------+                      +---+-+-+---------+             |
-  |                        | | |                                 | | |                       |
-  |                        | | |                                 | | |                       |
-  |                +---------------------------------------------+ | |                       |
-  |                |       | | |                                   | |                       |
-  |                |       | | |            +----------------------+ |                       |
-  |                |       | | |            |                        |                       |
-  |                |       | | +--------------------------------------------+                |
-  |                |       | |              |                        |      |                |
-  |                |       | +----------------------+                |      |                |
-  |                |       |                |       |                |      |                |
-  v                v       v                v       v                v      v                v
-+--------------------+   +--------------------+   +--------------------+  +--------------------+
-|                    |   |                    |   |                    |  |                    |
-|   Neighborhood 1   |   |   Neighborhood 2   |   |   Neighborhood 3   |  |   Neighborhood 4   |
-|                    |   |                    |   |                    |  |                    |
-+--------------------+   +--------------------+   +--------------------+  +--------------------+
+                                  +--------------------+
+                                  |                    |
+                         +--------+   Neighborhood 0   +----------+
+                         |        |                    |          |
+                         |        +--------------------+          |
+                         v                                        v
+               +---------+----------+                  +----------+---------+
+               |                    |                  |                    |
+               |   Neighborhood 1   |                  |   Neighborhood 2   |
+               |                    |                  |                    |
+               +---+-----+----------+                  +----------+-----+---+
+                   |     |                                        |     |
+                   v     v                                        v     v
+------------------+-+ +-+------------------+  +------------------+-+ +-+------------------+
+|                    | |                    |  |                    | |                    |
+|   Neighborhood 3   | |   Neighborhood 4   |  |   Neighborhood 5   | |   Neighborhood 6   |
+|                    | |                    |  |                    | |                    |
+--------------------+ +--------------------+  +--------------------+ +--------------------+
--- a/book/src/data-plane-fanout.md
+++ b/book/src/data-plane-fanout.md
@@ -5,16 +5,15 @@ broadcast transaction blobs to all nodes in a very quick and efficient manner.
 In order to establish the fanout, the cluster divides itself into small
 collections of nodes, called *neighborhoods*. Each node is responsible for
 sharing any data it receives with the other nodes in its neighborhood, as well
-as propagating the data on to a small set of nodes in other neighborhoods.
+as propagating the data on to a small set of nodes in other neighborhoods. 
+This way each node only has to communicate with a small number of nodes.

 During its slot, the leader node distributes blobs between the validator nodes
-in one neighborhood (layer 1). Each validator shares its data within its
-neighborhood, but also retransmits the blobs to one node in each of multiple
-neighborhoods in the next layer (layer 2). The layer-2 nodes each share their
-data with their neighborhood peers, and retransmit to nodes in the next layer,
-etc, until all nodes in the cluster have received all the blobs.
-
-<img alt="Two layer cluster" src="img/data-plane.svg" class="center"/>
+in the first neighborhood (layer 0). Each validator shares its data within its
+neighborhood, but also retransmits the blobs to one node in some neighborhoods
+in the next layer (layer 1). The layer-1 nodes each share their data with their
+neighborhood peers, and retransmit to nodes in the next layer, and so on, until
+all nodes in the cluster have received all the blobs.

 ## Neighborhood Assignment - Weighted Selection

@@ -23,48 +22,50 @@ cluster is divided into neighborhoods. To achieve this, all the recognized
 validator nodes (the TVU peers) are sorted by stake and stored in a list. This
 list is then indexed in different ways to figure out neighborhood boundaries and
 retransmit peers. For example, the leader will simply select the first nodes to
-make up layer 1. These will automatically be the highest stake holders, allowing
-the heaviest votes to come back to the leader first. Layer-1 and lower-layer
-nodes use the same logic to find their neighbors and lower layer peers.
+make up layer 0. These will automatically be the highest stake holders, allowing
+the heaviest votes to come back to the leader first. Layer-0 and lower-layer
+nodes use the same logic to find their neighbors and next layer peers.

 ## Layer and Neighborhood Structure

 The current leader makes its initial broadcasts to at most `DATA_PLANE_FANOUT`
-nodes. If this layer 1 is smaller than the number of nodes in the cluster, then
+nodes. If this layer 0 is smaller than the number of nodes in the cluster, then
 the data plane fanout mechanism adds layers below. Subsequent layers follow
 these constraints to determine layer-capacity: Each neighborhood contains
-`NEIGHBORHOOD_SIZE` nodes and each layer may have up to `DATA_PLANE_FANOUT/2`
-neighborhoods.
+`DATA_PLANE_FANOUT` nodes. Layer-0 starts with 1 neighborhood with fanout nodes.
+The number of nodes in each additional layer grows by a factor of fanout.

 As mentioned above, each node in a layer only has to broadcast its blobs to its
-neighbors and to exactly 1 node in each next-layer neighborhood, instead of to
-every TVU peer in the cluster. In the default mode, each layer contains
-`DATA_PLANE_FANOUT/2` neighborhoods. The retransmit mechanism also supports a
-second, `grow`, mode of operation that squares the number of neighborhoods
-allowed each layer. This dramatically reduces the number of layers needed to
-support a large cluster, but can also have a negative impact on the network
-pressure on each node in the lower layers. A good way to think of the default
-mode (when `grow` is disabled) is to imagine it as chain of layers, where the
-leader sends blobs to layer-1 and then layer-1 to layer-2 and so on, the `layer
-capacities` remain constant, so all layers past layer-2 will have the same
-number of nodes until the whole cluster is covered. When `grow` is enabled, this
-becomes a traditional fanout where layer-3 will have the square of the number of
-nodes in layer-2 and so on.
+neighbors and to exactly 1 node in some next-layer neighborhoods,
+instead of to every TVU peer in the cluster. A good way to think about this is:
+layer-0 starts with 1 neighborhood of fanout nodes, layer-1 adds "fanout"
+neighborhoods, each with fanout nodes, and layer-2 will have
+`fanout * number of nodes in layer-1` nodes, and so on.
+
+This way each node only has to communicate with a maximum of `2 * DATA_PLANE_FANOUT - 1` nodes: its `DATA_PLANE_FANOUT - 1` neighbors plus up to `DATA_PLANE_FANOUT` next-layer nodes.
+
+The following diagram shows how the Leader sends blobs with a Fanout of 2 to 
+Neighborhood 0 in Layer 0 and how the nodes in Neighborhood 0 share their data
+with each other.
+
+<img alt="Leader sends blobs to Neighborhood 0 in Layer 0" src="img/data-plane-seeding.svg" class="center"/>
+
+The following diagram shows how Neighborhood 0 fans out to Neighborhoods 1 and 2.
+
+<img alt="Neighborhood 0 fanout to Neighborhoods 1 and 2" src="img/data-plane-fanout.svg" class="center"/>
+
+Finally, the following diagram shows a two layer cluster with a Fanout of 2.
+
+<img alt="Two layer cluster with a Fanout of 2" src="img/data-plane.svg" class="center"/>

 #### Configuration Values

-`DATA_PLANE_FANOUT` - Determines the size of layer 1. Subsequent
-layers have `DATA_PLANE_FANOUT/2` neighborhoods when `grow` is inactive.
-
-`NEIGHBORHOOD_SIZE` - The number of nodes allowed in a neighborhood.
+`DATA_PLANE_FANOUT` - Determines the size of layer 0. Subsequent
+layers grow by a factor of `DATA_PLANE_FANOUT`.
+The number of nodes in a neighborhood is equal to the fanout value.
 Neighborhoods will fill to capacity before new ones are added, i.e if a
 neighborhood isn't full, it _must_ be the last one.

-`GROW_LAYER_CAPACITY` - Whether or not retransmit should be behave like a
-_traditional fanout_, i.e if each additional layer should have growing
-capacities. When this mode is disabled (default), all layers after layer 1 have
-the same capacity, keeping the network pressure on all nodes equal.
-
 Currently, configuration is set when the cluster is launched. In the future,
 these parameters may be hosted on-chain, allowing modification on the fly as the
 cluster sizes change.
@@ -72,13 +73,10 @@ cluster sizes change.
 ## Neighborhoods

 The following diagram shows how two neighborhoods in different layers interact.
-What this diagram doesn't capture is that each neighbor actually receives
-blobs from one validator per neighborhood above it. This means that, to
-cripple a neighborhood, enough nodes (erasure codes +1 per neighborhood) from
-the layer above need to fail.  Since multiple neighborhoods exist in the upper
-layer and a node will receive blobs from a node in each of those neighborhoods,
-we'd need a big network failure in the upper layers to end up with incomplete
-data.
+To cripple a neighborhood, enough nodes (erasure codes +1) from the neighborhood 
+above need to fail. Since each neighborhood receives blobs from multiple nodes 
+in a neighborhood in the upper layer, we'd need a big network failure in the upper 
+layers to end up with incomplete data.

 <img alt="Inner workings of a neighborhood"
 src="img/data-plane-neighborhood.svg" class="center"/>
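
The layer arithmetic described in the updated `data-plane-fanout.md` can be sketched as follows. This is only an illustration derived from the prose in this diff, not the code touched by the commit: the function names `layer_sizes`, `locate`, and `max_peers` are hypothetical, and the sketch assumes nodes are already sorted by stake and identified by their index in that list.

```python
from typing import List, Tuple


def layer_sizes(num_nodes: int, fanout: int) -> List[int]:
    """Number of nodes placed in each layer, layer 0 first.

    Assumes layer 0 holds one neighborhood of `fanout` nodes and each
    further layer can hold `fanout` times as many nodes as the previous one.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    sizes = []
    capacity = fanout  # layer 0: a single neighborhood of `fanout` nodes
    remaining = num_nodes
    while remaining > 0:
        placed = min(capacity, remaining)  # the last layer may be partial
        sizes.append(placed)
        remaining -= placed
        capacity *= fanout  # the next layer grows by a factor of fanout
    return sizes


def locate(index: int, fanout: int) -> Tuple[int, int]:
    """Map a node's position in the stake-sorted list to
    (layer, neighborhood index within that layer)."""
    if fanout < 1 or index < 0:
        raise ValueError("fanout must be >= 1 and index must be >= 0")
    layer, capacity, start = 0, fanout, 0
    while index >= start + capacity:
        start += capacity
        capacity *= fanout
        layer += 1
    # Neighborhoods fill to capacity in order, `fanout` nodes each.
    return layer, (index - start) // fanout


def max_peers(fanout: int) -> int:
    """Upper bound on peers per node: (fanout - 1) neighbors
    plus up to `fanout` nodes in the next layer."""
    return 2 * fanout - 1


if __name__ == "__main__":
    # A fanout of 2 with 14 nodes matches the three diagrams: layers of 2, 4, 8.
    print(layer_sizes(14, 2))
    print(locate(5, 2))
    print(max_peers(2))
```

With a fanout of 2, the node at index 5 of the stake-sorted list lands in the second neighborhood of layer 1, which corresponds to Neighborhood 2 in the `data-plane.bob` diagram.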