Drop the recommendation that --expected-shred-version be set by validators (bp #12242) (#12243)

* Drop the recommendation that `--expected-shred-version` be set by validators

`--expected-shred-version` is another knob for users to get wrong and is
documentation that can get stale due to cluster restarts.  Turns out
it's also generally not required anymore either because:
1. The cluster entrypoint can always be expected to be using the correct
   shred version, and that shred version will be adopted by the new node
   (earlier this was not the case when the `solana-gossip spy` node on
   mainnet-beta.solana.com:8001 ran with shred version 0)
2. On a cluster restart, `--expected-bank-hash` is a much stronger
   assertion that the validator is starting from the correct place (and
   didn't exist when `--expected-shred-version` was first recommended)

(cherry picked from commit 4ada4d43f2)

# Conflicts:
#	docs/src/clusters.md
#	docs/src/running-validator/restart-cluster.md

* Update clusters.md

Co-authored-by: Michael Vines <mvines@gmail.com>
mergify[bot]
2020-09-15 17:43:00 +00:00
committed by GitHub
parent 388a285517
commit 4b649a71df
3 changed files with 81 additions and 7 deletions

@ -44,9 +44,8 @@ $ solana-validator \
--ledger ~/validator-ledger \
--rpc-port 8899 \
--dynamic-port-range 8000-8010 \
- --entrypoint devnet.solana.com:8001 \
- --expected-genesis-hash HzyuivuNXMHJKjM6q6BE2qBsR3etqW21BSvuJTpJFj9A \
- --expected-shred-version 61357 \
+ --entrypoint entrypoint.devnet.solana.com:8001 \
+ --expected-genesis-hash Ap36zrBt2jLWpwUjaF48hRULVgmvSE3ViFxiQgjZX2XC \
--limit-ledger-size
```
@ -89,7 +88,6 @@ $ solana-validator \
--dynamic-port-range 8000-8010 \
--entrypoint 35.203.170.30:8001 \
--expected-genesis-hash 4uhcVJyU9pJkvQyS88uRDiswHXSCkY3zQawwpjk2NsNY \
- --expected-shred-version 1579 \
--limit-ledger-size
```
@ -135,7 +133,6 @@ $ solana-validator \
--dynamic-port-range 8000-8010 \
--entrypoint mainnet-beta.solana.com:8001 \
--expected-genesis-hash 5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d \
- --expected-shred-version 64864 \
--limit-ledger-size
```

@ -21,7 +21,6 @@ solana-validator \
--ledger <LEDGER_PATH> \
--entrypoint <CLUSTER_ENTRYPOINT> \
--expected-genesis-hash <EXPECTED_GENESIS_HASH> \
- --expected-shred-version <EXPECTED_SHRED_VERSION> \
--rpc-port 8899 \
--no-voting \
--enable-rpc-transaction-history \
@ -32,7 +31,7 @@ solana-validator \
Customize `--ledger` to your desired ledger storage location, and `--rpc-port` to the port you want to expose.
-The `--entrypoint`, `--expected-genesis-hash`, and `--expected-shred-version` parameters are all specific to the cluster you are joining. The shred version will change on any hard forks in the cluster, so including `--expected-shred-version` ensures you are receiving current data from the cluster you expect.
+The `--entrypoint` and `--expected-genesis-hash` parameters are all specific to the cluster you are joining.
[Current parameters for Mainnet Beta](../clusters.md#example-solana-validator-command-line-2)
The `--limit-ledger-size` parameter allows you to specify how many ledger [shreds](../terminology.md#shred) your node retains on disk. If you do not include this parameter, the validator will keep the entire ledger until it runs out of disk space. The default value is good for at least a couple of days, but larger values may be used by adding an argument to `--limit-ledger-size` if desired. Check `solana-validator --help` for the default limit value used by `--limit-ledger-size`.

@ -0,0 +1,78 @@
## Restarting a cluster
### Step 1. Identify the slot that the cluster will be restarted at
This will probably be the last root that was made. Call this slot `SLOT_X`.
### Step 2. Stop the validator(s)
### Step 3. Install the new solana version
### Step 4. Create a new snapshot for slot `SLOT_X` with a hard fork at slot `SLOT_X`
```bash
$ solana-ledger-tool -l ledger create-snapshot SLOT_X ledger --hard-fork SLOT_X
```
The ledger directory should now contain the new snapshot.
`solana-ledger-tool create-snapshot` will also output the new shred version and bank hash value;
call these NEW\_SHRED\_VERSION and NEW\_BANK\_HASH respectively.
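As a quick sanity check that the snapshot really landed in the ledger directory, something like the following sketch may help. It assumes snapshot archives are named `snapshot-<slot>-<hash>.tar.<ext>`, which is an assumption about the release in use; the function name `snapshots_for_slot` is hypothetical and not part of any Solana tool.

```bash
# Sketch: list snapshot archives for a given slot in a ledger directory.
# ASSUMPTION: archives are named snapshot-<slot>-<hash>.tar.<ext>; adjust
# the glob below if your release names them differently.
snapshots_for_slot() {
  local ledger_dir="$1" slot="$2" f found=1
  for f in "$ledger_dir"/snapshot-"$slot"-*.tar.*; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "$f" ] || continue
    printf '%s\n' "$f"
    found=0
  done
  # Exit status 0 if at least one archive was found, 1 otherwise.
  return "$found"
}
```

For example, `snapshots_for_slot ledger SLOT_X` prints any matching archive paths and returns a non-zero status when none are found.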
Adjust your validator's arguments:
```bash
--wait-for-supermajority SLOT_X
--expected-bank-hash NEW_BANK_HASH
```
Then restart the validator.
Confirm with the log that the validator booted and is now in a holding pattern at `SLOT_X`, waiting for a supermajority.
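The check above could be scripted roughly as follows. This is a minimal sketch, not an official tool: it relies on the `solana --url ... slot` command and the `N% of active stake visible in gossip` log message quoted in the announcement in Step 6, while the function names, the `RPC_URL` variable, and the log file path are illustrative assumptions.

```bash
# Sketch: helpers for confirming a restarted validator is holding at SLOT_X.
# ASSUMPTION: RPC is exposed locally on port 8899; override RPC_URL otherwise.
RPC_URL="${RPC_URL:-http://127.0.0.1:8899}"

# Succeeds (exit 0) when the slot reported over RPC equals the expected slot,
# returns 1 when it differs, and 2 when the RPC query itself fails.
is_holding_at_slot() {
  local expected_slot="$1"
  local current_slot
  current_slot="$(solana --url "$RPC_URL" slot)" || return 2
  [ "$current_slot" = "$expected_slot" ]
}

# Prints the most recent "N% of active stake visible in gossip" line from the
# given validator log file (prints nothing if no such line exists yet).
latest_gossip_stake_line() {
  local log_file="$1"
  grep 'of active stake visible in gossip' "$log_file" | tail -n 1
}
```

A typical use would be `is_holding_at_slot SLOT_X && echo "holding"`, together with `latest_gossip_stake_line path/to/validator.log` to see how much stake is currently visible.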
### Step 5. Update shred documentation
Edit `https://github.com/solana-labs/solana/blob/master/docs/src/clusters.md`,
replacing the old shred version with NEW\_SHRED\_VERSION. Ensure the edits make it into the release channel.
### Step 6. Announce the restart on Discord
Post something like the following to #announcements (adjusting the text as appropriate):
> Hi @Validators,
>
> We've released v1.1.12 and are ready to get TdS back up again.
>
> Steps:
> 1. Install the v1.1.12 release: https://github.com/solana-labs/solana/releases/tag/v1.1.12
> 2.
> a. Preferred method, start from your local ledger with:
>
> ```bash
> solana-validator
> --wait-for-supermajority 12961040 # <-- NEW! IMPORTANT!
> --expected-bank-hash 6q2oTgs8FiJxra2Zo1N8tzWqo5b6uGbjmFgoWDsXxchY # <-- NEW! IMPORTANT!
> --hard-fork 56096 # <-- NEW! IMPORTANT!
> --entrypoint 35.203.170.30:8001 # <-- Same as before
> --trusted-validator 5D1fNXzvv5NjV1ysLjirC4WY92RNsVH18vjmcszZd8on
> --expected-genesis-hash 4uhcVJyU9pJkvQyS88uRDiswHXSCkY3zQawwpjk2NsNY
> --no-untrusted-rpc
> --limit-ledger-size
> --no-snapshot-fetch
> --no-genesis-fetch
> ... # <-- your other --identity/--vote-account/etc arguments
> ```
> b. If your validator doesn't have ledger up to slot 21042873, have it download a snapshot by removing
> the `--no-snapshot-fetch --no-genesis-fetch --hard-fork 21042873` arguments.
> You can check which slots your ledger has with: `solana-ledger-tool -l path/to/ledger bounds`
>
> 3. Wait until 80% of the stake comes online
>
> To confirm your restarted validator is correctly waiting for the 80%:
> a. Look for `N% of active stake visible in gossip` log messages
> b. Ask it over RPC what slot it's on: `solana --url http://127.0.0.1:8899 slot`. It should return `12961040` until we get to 80% stake
>
> Thanks!
### Step 7. Wait and listen
Monitor the validators as they restart. Answer questions, help folks,