docs: add initial stream! protocol specification (#1454)

* docs: add initial pull sync spec
This commit is contained in:
acud
2019-06-20 12:21:07 +02:00
committed by GitHub
parent f2a12b38c4
commit 8afb316399
11 changed files with 315 additions and 0 deletions

View File

@@ -0,0 +1,160 @@
stream! protocol
======
| Subject | Description |
|---|---|
| Authors | @zelig, @acud, @nonsense |
| Status | Draft |
| Created | 2019-06-11 |
### definition of stream
a protocol that facilitates data transmission between two swarm nodes, specifically targeting sequential data in the form of a sequence of chunks as defined by swarm. the protocol should cater for the following requirements:
- client should be able to request arbitrary ranges from the server
- client can be assumed to have some of the data already and can therefore opt in to selectively request chunks based on their hashes

As mentioned, the client is typically expected to have some of the data in the stream. To mitigate duplicate data transmission the stream protocol provides a configurable message roundtrip before batch delivery, which allows the downstream peer to selectively request the chunks which it does not store at the time of the request.

When delivery batches are pre-negotiated (i.e. when the client selectively tells the server which chunks it would like to receive), we can conclude that the delivery batches are optimised for _urgency_ rather than for maximising batch utilisation (since the batch offered by the server may be reduced to a smaller one by the client before it is actually transmitted).
### the protocol defines the notions of:
- **stream** - a data source composed of a sequence of hashes, referenced by monotonically increasing integers; contiguity of the indexes is not guaranteed within any one stream.
- **client** _(downstream peer)_ - the peer which requests data and does not possess it
- **server** _(upstream peer)_ - the peer that has the data and sends it to the downstream peer
- **range** - based on the notion of integer indexes, a range designates an interval on the stream.
- **batch** - the set of chunks constituting an interval in a range, with a length not exceeding a ceiling value negotiated when establishing streams
- **batch delivery** - the transmission of a batch; the end of a batch delivery should be indicated by an explicit message from the server
- **roundtrip** - a configurable extra message exchange (negotiated on the initial message exchange) meant to avoid requesting the same data from different peers. when the roundtrip is not used the stream is assumed to be continuous and the order of delivery should be guaranteed, or the stream should be particularly ordered. a roundtrip consists of the following two messages, which always go together:
  - **offered hashes** - from the server to the client
  - **wanted hashes** - at the discretion of the client in response to offered hashes
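
The client side of the roundtrip can be sketched as follows. This is a minimal illustration, not the reference implementation: the 32-byte hash size, the LSB-first bit order and the `have` lookup function are assumptions that the spec does not fix.

```go
package main

import "fmt"

// hashSize is an assumption: offered hashes are taken to be 32 bytes each,
// concatenated back to back in one byte slice.
const hashSize = 32

// wantedBitVector returns one bit per offered hash. Bit i is set when the
// client does not store hash i yet. have is a hypothetical local store lookup.
// Bits are packed LSB-first within each byte (also an assumption).
func wantedBitVector(offered []byte, have func(hash []byte) bool) ([]byte, error) {
	if len(offered)%hashSize != 0 {
		return nil, fmt.Errorf("offered hashes length %d is not a multiple of %d", len(offered), hashSize)
	}
	n := len(offered) / hashSize
	bv := make([]byte, (n+7)/8)
	for i := 0; i < n; i++ {
		h := offered[i*hashSize : (i+1)*hashSize]
		if !have(h) {
			bv[i/8] |= 1 << uint(i%8)
		}
	}
	return bv, nil
}

func main() {
	// three offered hashes, filled with 0x01, 0x02 and 0x03 respectively
	offered := make([]byte, 3*hashSize)
	for i := range offered {
		offered[i] = byte(i/hashSize + 1)
	}
	// the client already stores the second hash
	have := func(h []byte) bool { return h[0] == 0x02 }
	bv, err := wantedBitVector(offered, have)
	if err != nil {
		panic(err)
	}
	fmt.Printf("wanted bitvector: %08b\n", bv)
}
```

A bitvector keeps the reply small: the client answers with one bit per offered hash instead of echoing the hashes back.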
### responsibilities:
- client is able to request a range but doesn't know how many results the server will return for the interval
- client does not know if interval is continuous or has gaps
- range is defined by client and should be strictly respected and followed by server
- all intervals specified in protocol messages are closed (inclusive)
- when roundtrip is configured - chunk deliveries can be handled concurrently (therefore their order is not guaranteed), but a server end-of-batch with topmost session index must be sent to signal the end of a batch
- when roundtrip is not configured - chunks are expected to be sent in order, one after the other
- when a client requests an unbounded range (i.e. FROM=..., TO=nil):
  - if there are no chunks available - server waits until something becomes available, then sends it to the client
  - it is the server's responsibility to give as much as possible, as fast as possible, up to the batch size limit
  - one range query should result in ONE roundtrip + batch delivery
- when a client requests a bounded range, server should respond to the client range requests with either offered hashes (if roundtrip is required) or chunks (if not) or an end-of-batch message if there are no more to offer. If none of these responses arrive within a timeout interval, client must drop the upstream peer.
- the server should always respond to the client
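
The range rules above (closed intervals, possible gaps, a batch size ceiling, and a nil upper bound for unbounded requests) can be illustrated with a small selection routine. This is a sketch under assumptions: `selectBatch` and its parameters are hypothetical names, and the server's available indexes are modelled as a sorted slice.

```go
package main

import "fmt"

// selectBatch picks the stream indexes a server would serve for one range
// request. available must be sorted ascending and may contain gaps.
// to == nil means an unbounded range. Intervals are closed, so both from
// and *to are included. It returns the selected indexes and the topmost
// index served; last is 0 when nothing matched (stream indexes are > 0).
func selectBatch(available []uint64, from uint64, to *uint64, batchSize int) (batch []uint64, last uint64) {
	for _, idx := range available {
		if idx < from {
			continue
		}
		if to != nil && idx > *to {
			break
		}
		if len(batch) == batchSize {
			break
		}
		batch = append(batch, idx)
		last = idx
	}
	return batch, last
}

func main() {
	available := []uint64{1, 2, 5, 6, 9, 12, 13}

	// bounded request, cut short by the batch size ceiling
	to := uint64(9)
	batch, last := selectBatch(available, 2, &to, 3)
	fmt.Println(batch, last)

	// unbounded request: serve whatever is available, up to one batch
	batch, last = selectBatch(available, 6, nil, 10)
	fmt.Println(batch, last)
}
```

The returned `last` value is what an end-of-batch message would carry, so the client knows where its next range request should start.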
#### stream termination condition:
- on a timeout or a dead connection, we get an error and remove the client; the server also gets an error from the p2p layer, removes all servers/clients and drops the peer
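
The timeout condition can be sketched with a channel and a timer. This is only an illustration: the channel of string messages stands in for the real p2p message loop, and `awaitResponse`, `errPeerTimeout` and `demoTimeout` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errPeerTimeout signals that the upstream peer should be dropped.
var errPeerTimeout = errors.New("no response from upstream peer within timeout")

// demoTimeout is an arbitrary value chosen for this example only.
const demoTimeout = 20 * time.Millisecond

// awaitResponse waits for any server response to a range request
// (offered hashes, chunks or end-of-batch) and fails after timeout.
func awaitResponse(responses <-chan string, timeout time.Duration) (string, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case msg, ok := <-responses:
		if !ok {
			return "", errors.New("connection closed")
		}
		return msg, nil
	case <-timer.C:
		return "", errPeerTimeout
	}
}

func main() {
	responses := make(chan string, 1)
	responses <- "OfferedHashes"
	msg, err := awaitResponse(responses, demoTimeout)
	fmt.Println(msg, err)

	// nothing arrives on a silent peer, so the caller would drop it
	_, err = awaitResponse(make(chan string), demoTimeout)
	fmt.Println(err)
}
```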
### considerations:
- server must make sure that a chunk got to the client in order to account for it in SWAP (synchronous). if the send does not result in an error - the send should be accounted
- there is always a max batch size so that clients cannot grief servers with very large ranges
### syncing contracts:
- stream indexes always > 0
- syncing is an implementation of the stream protocol
- client is expected to manage all intervals, and therefore:
  - server is designed to be stateless, except for the case of managing an offered/wanted roundtrip and the knowledge of the boundedness of a stream (e.g. the server knows that syncing streams are always unbounded from the localstore perspective - data can always enter the system, however this is not the case for a live video stream for example)
  - the server does not terminate streams - it is at the discretion of the downstream peer
  - the server does not initiate any messages unless instructed to
  - the server does not instruct the client on which bins to subscribe to
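
Since the client is expected to manage all intervals, it needs some bookkeeping of what it has already fetched. The following is a minimal sketch of such bookkeeping, assuming closed intervals and indexes starting at 1; the `interval` and `intervals` types are hypothetical and not taken from the swarm codebase.

```go
package main

import "fmt"

// interval is a closed range [Start, End] of stream indexes.
type interval struct{ Start, End uint64 }

// intervals is a sorted list of non-overlapping, non-adjacent fetched ranges.
type intervals []interval

// add merges the closed range [start, end] into the set and returns the
// new set. Overlapping and adjacent ranges are absorbed into one.
// Overflow at the maximum uint64 index is not handled in this sketch.
func (s intervals) add(start, end uint64) intervals {
	var out intervals
	inserted := false
	for _, iv := range s {
		switch {
		case iv.End+1 < start: // strictly before the new range
			out = append(out, iv)
		case end+1 < iv.Start: // strictly after the new range
			if !inserted {
				out = append(out, interval{start, end})
				inserted = true
			}
			out = append(out, iv)
		default: // overlapping or adjacent: grow the new range
			if iv.Start < start {
				start = iv.Start
			}
			if iv.End > end {
				end = iv.End
			}
		}
	}
	if !inserted {
		out = append(out, interval{start, end})
	}
	return out
}

// next returns the lowest index not covered yet (stream indexes are > 0).
func (s intervals) next() uint64 {
	if len(s) == 0 || s[0].Start > 1 {
		return 1
	}
	return s[0].End + 1
}

func main() {
	var s intervals
	s = s.add(1, 100)
	s = s.add(150, 200)
	s = s.add(101, 120) // adjacent to the first range, so they merge
	fmt.Println(s, s.next())
}
```

With state like this on the client, the server can stay stateless: every `GetRange` carries the exact range the client is still missing.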
Wire Protocol Specifications
=======
### The wire protocol defines the following messages:
| Msg Name | From->To | Params | Example |
| -------- | -------- | -------- | ------- |
| StreamInfoReq | Client->Server | Streams`[]string` | `SYNC\|6, SYNC\|5` |
| StreamInfoRes | Server->Client | Streams`[]StreamDescriptor` <br>Stream`string`<br>Cursor`uint64`<br>Bounded`bool` | `SYNC\|6;CUR=1632;bounded, SYNC\|7;CUR=18433;bounded` |
| GetRange | Client->Server| Ruid`uint`<br>Stream `string`<br>From`uint`<br>To`*uint`(nullable)<br>Roundtrip`bool` | `Ruid: 21321, Stream: SYNC\|6, From: 1, To: 100, Roundtrip: true` (bounded)<br>`Stream: SYNC\|7, From: 109, Roundtrip: true` (unbounded) |
| OfferedHashes | Server->Client| Ruid`uint`<br>Hashes `[]byte` | `Ruid: 21321, Hashes: [cbcbbaddda, bcbbbdbbdc, ....]` |
| WantedHashes | Client->Server | Ruid`uint`<br>Bitvector`[]byte` | `Ruid: 21321, Bitvector: [0100100100] ` |
| ChunkDelivery | Server->Client | Ruid`uint`<br>[]Chunk `[]byte` | `Ruid: 21321, Chunk: [001000101]` |
| BatchDone | Server->Client| Ruid `uint`<br>Last `uint` | `Ruid: 21321, Last: 113331` |
| StreamState | Client<->Server | Stream`string`<br>Code`uint16`<br>Message`string`| `Stream: SYNC\|6, Code:1, Message:"Stream became bounded"`<br>`Stream: SYNC\|5, Code:2, Message: "No such stream"` |
Notes:
* communicating the last bin index when roundtrip is configured - can be done on top of the OfferedHashes message (alongside the hashes), or by reusing the ACK from the no-roundtrip config
* there are two notions of bounded - on the stream level and on the localstore level
* if TO is not specified - we assume an unbounded stream, and we send whatever is available, up to at most one full batch.
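
The stream names in the examples of the message table have the shape `SYNC|6`. The sketch below splits such a name, assuming that the part before the `|` is a stream kind and the part after it is a numeric bin; the spec does not define this format, so `parseStreamName` is purely illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStreamName splits a name such as "SYNC|6" into its kind and bin.
// The "KIND|BIN" layout is an assumption derived from the table examples.
func parseStreamName(name string) (kind string, bin uint8, err error) {
	parts := strings.SplitN(name, "|", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", 0, fmt.Errorf("malformed stream name %q", name)
	}
	n, err := strconv.ParseUint(parts[1], 10, 8)
	if err != nil {
		return "", 0, fmt.Errorf("malformed stream key in %q: %v", name, err)
	}
	return parts[0], uint8(n), nil
}

func main() {
	for _, name := range []string{"SYNC|6", "SYNC|5", "SYNC"} {
		kind, bin, err := parseStreamName(name)
		fmt.Println(kind, bin, err)
	}
}
```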
### Message struct definitions:
```go
type StreamInfoReq struct {
	Streams []string
}
```
```go
type StreamInfoRes struct {
	Streams []StreamDescriptor
}
```
```go
type StreamDescriptor struct {
	Name    string
	Cursor  uint
	Bounded bool
}
```
```go
type GetRange struct {
	Ruid      uint
	Stream    string
	From      uint
	To        *uint `rlp:"nil"` // nil means an unbounded range
	BatchSize uint
	Roundtrip bool
}
```
```go
type OfferedHashes struct {
	Ruid      uint
	LastIndex uint
	Hashes    []byte
}
```
```go
type WantedHashes struct {
	Ruid      uint
	BitVector []byte
}
```
```go
type ChunkDelivery struct {
	Ruid      uint
	LastIndex uint
	Chunks    [][]byte
}
```
```go
type BatchDone struct {
	Ruid uint
	Last uint
}
```
```go
type StreamState struct {
	Stream  string
	Code    uint16
	Message string
}
```
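
As an illustration of the nullable `To` field, the sketch below builds a bounded and an unbounded `GetRange` and runs a sanity check on them. The struct is redeclared without RLP tags so that the example compiles on its own; the `validate` and `bounded` helpers and the `maxBatch` limit are hypothetical and not part of the spec.

```go
package main

import (
	"errors"
	"fmt"
)

// GetRange is a local copy of the message struct, with To as a pointer
// so that nil can express an unbounded range.
type GetRange struct {
	Ruid      uint
	Stream    string
	From      uint
	To        *uint
	BatchSize uint
	Roundtrip bool
}

// bounded reports whether the request has an upper bound.
func (r GetRange) bounded() bool { return r.To != nil }

// validate is a hypothetical server-side sanity check of a request.
func (r GetRange) validate(maxBatch uint) error {
	if r.From == 0 {
		return errors.New("stream indexes start at 1")
	}
	if r.To != nil && *r.To < r.From {
		return errors.New("empty range: To is smaller than From")
	}
	if r.BatchSize == 0 || r.BatchSize > maxBatch {
		return errors.New("batch size out of bounds")
	}
	return nil
}

func main() {
	to := uint(100)
	reqs := []GetRange{
		{Ruid: 21321, Stream: "SYNC|6", From: 1, To: &to, BatchSize: 128, Roundtrip: true},
		{Ruid: 21322, Stream: "SYNC|7", From: 109, BatchSize: 128, Roundtrip: true},
	}
	for _, r := range reqs {
		fmt.Println(r.Stream, "bounded:", r.bounded(), "err:", r.validate(128))
	}
}
```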
Message exchange examples:
======
Initial handshake - client queries server for stream states<br>
![handshake](https://raw.githubusercontent.com/ethersphere/swarm/stream-spec/docs/diagrams/stream-handshake.png)
<br>
GetRange (bounded) - client requests a bounded range within a stream<br>
![bounded-range](https://raw.githubusercontent.com/ethersphere/swarm/stream-spec/docs/diagrams/stream-bounded.png)
<br>
GetRange (unbounded) - client requests an unbounded range (specifies only `From` parameter)<br>
![unbounded-range](https://raw.githubusercontent.com/ethersphere/swarm/stream-spec/docs/diagrams/stream-unbounded.png)
<br>
GetRange (no roundtrip) - client requests an unbounded or bounded range with no roundtrip configured<br>
![no-roundtrip](https://raw.githubusercontent.com/ethersphere/swarm/stream-spec/docs/diagrams/stream-no-roundtrip.png)
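
For the exchanges with a roundtrip, the server has to map the bitvector of a WantedHashes message back to the hashes it offered before delivering chunks. A minimal sketch of that step follows, assuming one bit per offered hash packed LSB-first within each byte; the bit order and the `wantedIndexes` helper are assumptions, not part of the spec.

```go
package main

import (
	"errors"
	"fmt"
)

// wantedIndexes decodes a bitvector against the number of hashes that were
// offered and returns the positions of the hashes the client asked for.
func wantedIndexes(bitVector []byte, offered int) ([]int, error) {
	if len(bitVector) != (offered+7)/8 {
		return nil, errors.New("bitvector length does not match offered hashes")
	}
	var wanted []int
	for i := 0; i < offered; i++ {
		if bitVector[i/8]&(1<<uint(i%8)) != 0 {
			wanted = append(wanted, i)
		}
	}
	return wanted, nil
}

func main() {
	// ten hashes were offered; the client wants the hashes at 0, 2 and 9
	wanted, err := wantedIndexes([]byte{0x05, 0x02}, 10)
	if err != nil {
		panic(err)
	}
	for _, i := range wanted {
		fmt.Println("deliver chunk for offered hash", i)
	}
	fmt.Println("then send BatchDone")
}
```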

Binary file not shown.


View File

@@ -0,0 +1,105 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd">
<svg width="24cm" height="27cm" viewBox="257 -21 480 532" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="560" y1="55.4873" x2="559.021" y2="498.679"/>
<polygon style="fill: #1b1b1b" points="559.005,506.179 554.027,496.168 559.021,498.679 564.027,496.19 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="559.005,506.179 554.027,496.168 559.021,498.679 564.027,496.19 "/>
</g>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="372" y="46.5">
<tspan x="372" y="46.5">Client</tspan>
</text>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="545" y="46.5">
<tspan x="545" y="46.5">Server</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="402.1" y1="80.1" x2="540.364" y2="80.1"/>
<polygon style="fill: #1b1b1b" points="547.864,80.1 537.864,85.1 540.364,80.1 537.864,75.1 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="547.864,80.1 537.864,85.1 540.364,80.1 537.864,75.1 "/>
</g>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="445.018" y="65.1">
<tspan x="445.018" y="65.1">GetRange</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="183.25" x2="417.618" y2="182.181"/>
<polygon style="fill: #1b1b1b" points="410.118,182.119 420.159,177.201 417.618,182.181 420.076,187.201 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,182.119 420.159,177.201 417.618,182.181 420.076,187.201 "/>
</g>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="429.318" y="171.1">
<tspan x="429.318" y="171.1">OfferedHashes</tspan>
</text>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="257.05" y="82.1">
<tspan x="257.05" y="82.1">1. Client requests</tspan>
<tspan x="257.05" y="98.1">an arbitrary</tspan>
<tspan x="257.05" y="114.1">range of a stream</tspan>
<tspan x="257.05" y="130.1">e.g. From: 1, To: 13</tspan>
</text>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="585.287" y="157.1">
<tspan x="585.287" y="157.1">2. Server replies with</tspan>
<tspan x="585.287" y="173.1">possible offered hashes</tspan>
<tspan x="585.287" y="189.1">in the requested range</tspan>
</text>
<text font-size="12.8" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="413.05" y="-8.51248">
<tspan x="413.05" y="-8.51248">Bounded GetRange</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="407.1" y1="272.25" x2="535.364" y2="272.25"/>
<polygon style="fill: #1b1b1b" points="542.864,272.25 532.864,277.25 535.364,272.25 532.864,267.25 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="542.864,272.25 532.864,277.25 535.364,272.25 532.864,267.25 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="428.893" y="259.25">
<tspan x="428.893" y="259.25">WantedHashes</tspan>
</text>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="273" y="259.25">
<tspan x="273" y="259.25">3. Client replies</tspan>
<tspan x="273" y="275.25">with a (sub)set</tspan>
<tspan x="273" y="291.25">of wanted chunk</tspan>
<tspan x="273" y="307.25">hashes</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="351.185" x2="417.618" y2="350.115"/>
<polygon style="fill: #1b1b1b" points="410.118,350.053 420.159,345.136 417.618,350.115 420.076,355.136 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,350.053 420.159,345.136 417.618,350.115 420.076,355.136 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="423.518" y="336">
<tspan x="423.518" y="336">Chunk Deliveries</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="370.185" x2="417.618" y2="369.115"/>
<polygon style="fill: #1b1b1b" points="410.118,369.053 420.159,364.136 417.618,369.115 420.076,374.136 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,369.053 420.159,364.136 417.618,369.115 420.076,374.136 "/>
</g>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="388.185" x2="417.618" y2="387.115"/>
<polygon style="fill: #1b1b1b" points="410.118,387.053 420.159,382.136 417.618,387.115 420.076,392.136 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,387.053 420.159,382.136 417.618,387.115 420.076,392.136 "/>
</g>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="407.185" x2="417.618" y2="406.115"/>
<polygon style="fill: #1b1b1b" points="410.118,406.053 420.159,401.136 417.618,406.115 420.076,411.136 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,406.053 420.159,401.136 417.618,406.115 420.076,411.136 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="585.287" y="373">
<tspan x="585.287" y="373">4. Server delivers</tspan>
<tspan x="585.287" y="389">requested chunks</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="546.782" y1="452.185" x2="417.618" y2="451.115"/>
<polygon style="fill: #1b1b1b" points="410.118,451.053 420.159,446.136 417.618,451.115 420.076,456.136 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="410.118,451.053 420.159,446.136 417.618,451.115 420.076,456.136 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="448" y="440">
<tspan x="448" y="440">BatchDone</tspan>
</text>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="585.287" y="446">
<tspan x="585.287" y="446">5. Server reports to</tspan>
<tspan x="585.287" y="462">client that batch has</tspan>
<tspan x="585.287" y="478">completed with the </tspan>
<tspan x="585.287" y="494">last delivered bin index</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="388.309" y1="55.4873" x2="387.331" y2="498.679"/>
<polygon style="fill: #1b1b1b" points="387.314,506.179 382.336,496.168 387.331,498.679 392.336,496.19 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="387.314,506.179 382.336,496.168 387.331,498.679 392.336,496.19 "/>
</g>
</svg>


View File

@@ -0,0 +1,50 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd">
<svg width="23cm" height="13cm" viewBox="269 -21 460 256" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="560" y1="56.0723" x2="560.992" y2="223.573"/>
<polygon style="fill: #1b1b1b" points="561.037,231.073 555.978,221.102 560.992,223.573 565.977,221.043 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="561.037,231.073 555.978,221.102 560.992,223.573 565.977,221.043 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="372" y="46.5">
<tspan x="372" y="46.5">Client</tspan>
</text>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="545" y="46.5">
<tspan x="545" y="46.5">Server</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="402.1" y1="80.1" x2="540.364" y2="80.1"/>
<polygon style="fill: #1b1b1b" points="547.864,80.1 537.864,85.1 540.364,80.1 537.864,75.1 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="547.864,80.1 537.864,85.1 540.364,80.1 537.864,75.1 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="455.1" y="65.1">
<tspan x="455.1" y="65.1">Stream</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="553.1" y1="109.1" x2="415.808" y2="177.746"/>
<polygon style="fill: #1b1b1b" points="409.1,181.1 415.808,172.156 415.808,177.746 420.28,181.1 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="409.1,181.1 415.808,172.156 415.808,177.746 420.28,181.1 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="426.1" y="131.1">
<tspan x="426.1" y="131.1">StreamAck</tspan>
</text>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="269.05" y="83.1">
<tspan x="269.05" y="83.1">1. Client requests</tspan>
<tspan x="269.05" y="99.1">info about</tspan>
<tspan x="269.05" y="115.1">streams</tspan>
</text>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="592.05" y="96.1">
<tspan x="592.05" y="96.1">2. Server replies with</tspan>
<tspan x="592.05" y="112.1">stream info: names,</tspan>
<tspan x="592.05" y="128.1">session indexes and</tspan>
<tspan x="592.05" y="144.1">boundedness</tspan>
</text>
<g>
<line style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" x1="387.808" y1="56.0723" x2="388.801" y2="223.573"/>
<polygon style="fill: #1b1b1b" points="388.845,231.073 383.786,221.102 388.801,223.573 393.786,221.043 "/>
<polygon style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #1b1b1b" points="388.845,231.073 383.786,221.102 388.801,223.573 393.786,221.043 "/>
</g>
<text font-size="12.7998" style="fill: #1b1b1b;text-anchor:start;font-family:sans-serif;font-style:normal;font-weight:normal" x="413.05" y="-8.51248">
<tspan x="413.05" y="-8.51248">Establishing a stream</tspan>
</text>
</svg>
