From f96563c3f204b347a5084d4be9ae4666a2d80df5 Mon Sep 17 00:00:00 2001
From: Greg Fitzgerald
Date: Wed, 7 Nov 2018 16:49:19 -0700
Subject: [PATCH] Add documentation for pipelining

---
 src/fullnode.md | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/src/fullnode.md b/src/fullnode.md
index 9e5f104746..b5ef4b3bbb 100644
--- a/src/fullnode.md
+++ b/src/fullnode.md
@@ -4,4 +4,44 @@
 
 ## Pipelining
 
-## Pipeline Stages
+The fullnodes make extensive use of an optimization common in CPU design,
+called *pipelining*. Pipelining is the right tool for the job when there's a
+stream of input data that needs to be processed by a sequence of steps, and
+there's different hardware responsible for each. The quintessential example is
+using a washer and dryer to wash/dry/fold several loads of laundry. Washing
+must occur before drying and drying before folding, but each of the three
+operations is performed by a separate unit. To maximize efficiency, one creates
+a pipeline of *stages*. We'll call the washer one stage, the dryer another, and
+the folding process a third. To run the pipeline, one adds a second load of
+laundry to the washer just after the first load is added to the dryer.
+Likewise, the third load is added to the washer after the second is in the
+dryer and the first is being folded. In this way, one can make progress on
+three loads of laundry simultaneously. Given infinite loads, the pipeline will
+consistently complete a load at the rate of the slowest stage in the pipeline.
+
+## Pipelining in the fullnode
+
+The fullnode contains two pipelined processes, one used in leader mode called
+the Tpu and one used in validator mode called the Tvu. In both cases, the
+hardware being pipelined is the same: the network input, the GPU cards, the CPU
+cores, and the network output. What each does with that hardware is different.
+The Tpu exists to create ledger entries whereas the Tvu exists to validate
+them.
+
+## Pipeline stages in Rust
+
+The approach to creating a pipeline stage in Rust may be unique to Solana. We
+haven't seen the same technique used in other Rust projects and there may be
+better ways to do it. The Solana approach defines a stage as an object that
+communicates with its previous and next stages using channels. By convention,
+each stage accepts a *receiver* for input and creates a second channel for
+output. The second channel is used to pass data to the next stage, and so its
+sender is moved into the stage's thread and the receiver is returned from its
+constructor.
+
+A well-written stage should create a thread that calls a short `run()` method.
+The method should read input from its input channel, call a function from
+another module that processes it, and then send the output to the output
+channel. The functionality in the second module will likely not use threads or
+channels.
+
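
The stage convention described under "Pipeline stages in Rust" can be sketched roughly as follows. This is a minimal illustration built on the standard library's `std::sync::mpsc` channels, not code from the Solana repository: `DoublerStage`, the `doubler` module, and the `u64` payload are hypothetical names chosen for the example.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Processing logic kept free of threads and channels.
/// (Hypothetical module, named for illustration only.)
mod doubler {
    pub fn process(input: u64) -> u64 {
        input * 2
    }
}

/// A hypothetical stage that doubles every number it receives.
pub struct DoublerStage {
    thread_hdl: JoinHandle<()>,
}

impl DoublerStage {
    /// Accepts the receiver for input, creates a second channel for output,
    /// and returns that channel's receiver alongside the stage.
    pub fn new(input: Receiver<u64>) -> (Self, Receiver<u64>) {
        let (sender, output) = channel();
        // The sender is moved into the stage's thread.
        let thread_hdl = thread::spawn(move || Self::run(&input, &sender));
        (DoublerStage { thread_hdl }, output)
    }

    /// Short loop: read input, delegate to `doubler::process`, send output.
    fn run(input: &Receiver<u64>, output: &Sender<u64>) {
        // `recv()` returns an error once the upstream sender is dropped.
        while let Ok(item) = input.recv() {
            if output.send(doubler::process(item)).is_err() {
                break; // The downstream receiver was dropped.
            }
        }
    }

    /// Waits for the stage's thread to exit.
    pub fn join(self) -> thread::Result<()> {
        self.thread_hdl.join()
    }
}

fn main() {
    // Chain two stages: each stage's output receiver feeds the next stage.
    let (source, input) = channel();
    let (first, mid) = DoublerStage::new(input);
    let (second, output) = DoublerStage::new(mid);

    for n in 1..=3 {
        source.send(n).unwrap();
    }
    drop(source); // Closing the source lets both stages drain and exit.

    let results: Vec<u64> = output.iter().collect();
    println!("{:?}", results); // prints [4, 8, 12]

    first.join().unwrap();
    second.join().unwrap();
}
```

Because the processing lives in a plain function, it can be unit tested without spawning threads, while the stage itself stays a thin wrapper around channel plumbing.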