document optimistic confirmation and slashing roadmap (#10164)
* docs
* book nits
* Update docs/src/proposals/optimistic-confirmation-and-slashing.md
* Update optimistic-confirmation-and-slashing.md
* fixups

Co-authored-by: Michael Vines <mvines@gmail.com>
committed by GitHub
parent 12a3b1ba6a
commit c78fd2b36d
@@ -96,6 +96,7 @@
   * [Commitment](implemented-proposals/commitment.md)
   * [Snapshot Verification](implemented-proposals/snapshot-verification.md)
 * [Accepted Design Proposals](proposals/README.md)
+  * [Optimistic Confirmation and Slashing](proposals/optimistic-confirmation-and-slashing.md)
   * [Secure Vote Signing](proposals/vote-signing-to-implement.md)
   * [Cluster Test Framework](proposals/cluster-test-framework.md)
   * [Validator](proposals/validator-proposal.md)

89 docs/src/proposals/optimistic-confirmation-and-slashing.md Normal file
@@ -0,0 +1,89 @@
# Optimistic Confirmation and Slashing

Progress on optimistic confirmation can be tracked here:

https://github.com/solana-labs/solana/projects/52

At the end of May, the mainnet-beta is moving to 1.1, and testnet
is moving to 1.2. With 1.2, testnet will behave as if it has 1-block
confirmation as long as no more than 4.66% of the validators are
acting maliciously. Applications can assume that 2/3+ votes observed
in gossip confirm a block, or that at least 4.66% of the network is
violating the protocol.

## How does it work?

The general idea is that validators must continue voting, following
their last fork, unless they can construct a proof that their fork
may not reach finality. Validators construct this proof by collecting
votes for all the other forks, excluding their own. If this set of
valid votes represents over 1/3+X of the epoch stake weight, there
may be no way for the validator's current fork to reach 2/3+
finality. The validator hashes the proof (creates a witness) and
submits it with its vote for the alternative fork. But if 2/3+ of
the stake votes for the same block, it is impossible for any node
to construct such a proof, so no node can switch forks and the
block will eventually be finalized.

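The switching rule above can be sketched in a few lines of plain Python. This is an illustrative model only: the function name, the stake map, and the threshold constant are assumptions for exposition, not Solana's actual data structures or API.

```python
# Hypothetical sketch of the switching rule described above.
# A validator may abandon its fork only if the stake observed voting
# on *other* forks exceeds 1/3 + X of the epoch stake, which proves
# its own fork might never gather the 2/3+ needed for finality.

X = 0.0466                    # minimum slashable stake fraction
SWITCH_THRESHOLD = 1 / 3 + X  # ~37.99% of epoch stake

def can_switch_forks(epoch_stake, votes_by_fork, my_fork):
    """Return True if a switching proof can be constructed: stake
    voting on forks other than `my_fork` exceeds 1/3 + X of the
    total epoch stake."""
    other_stake = sum(stake for fork, stake in votes_by_fork.items()
                      if fork != my_fork)
    return other_stake / epoch_stake > SWITCH_THRESHOLD

# With 2/3+ of the stake on fork "A", less than 1/3 is left for the
# other forks combined, so no validator on "A" can build the proof:
votes = {"A": 70.0, "B": 20.0, "C": 10.0}
print(can_switch_forks(100.0, votes, "A"))  # False: only 30% elsewhere
print(can_switch_forks(100.0, votes, "B"))  # True: 80% elsewhere
```

Note how the 2/3+ case falls out of the arithmetic: once one fork holds more than 2/3 of the stake, the remaining stake can never exceed 1/3+X, so the proof is unconstructible for validators on that fork.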
## Tradeoffs

The safety margin is 1/3+X, where X represents the minimum amount
of stake that will be slashed if the protocol is violated. The
tradeoff is that liveness is reduced by 2X in the worst case: if
more than 1/3 - 2X of the network is unavailable, the network may
stall, and it will only resume finalizing blocks after the network
recovers. So far, we haven't observed a large availability hit on
our mainnet, Cosmos, or Tezos. With the threshold percentage
currently set to 4.66%, the network may stop finalizing blocks if
23.68% of the stake fails. For our network, which is primarily
composed of high-availability systems, a 23.68% drop in availability
seems unlikely: roughly 1:10^12 odds, assuming five nodes with 4.7%
of the stake each and 0.995 uptime.

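These figures can be checked with a short back-of-envelope script. This is pure arithmetic on the values quoted above, not protocol code; note that 1/3 - 2X with X = 4.66% works out to about 24.01%, close to the 23.68% quoted in the text.

```python
# Liveness-tradeoff arithmetic from the text above (illustrative only).

X = 0.0466                       # minimum slashable stake fraction
switch_threshold = 1 / 3 + X     # stake on other forks needed for a switching proof
stall_threshold = 1 / 3 - 2 * X  # offline stake beyond which the network may stall

print(f"switching proofs need > {switch_threshold:.2%} of stake")  # ~37.99%
print(f"network may stall past {stall_threshold:.2%} offline")     # ~24.01%

# Odds that five nodes, each holding ~4.7% of the stake (~23.5% total)
# and each with 0.995 uptime, are all down at the same time:
p_all_down = (1 - 0.995) ** 5
print(f"p(all five down) = {p_all_down:.3e}")  # 3.125e-12, roughly 1:10^12
```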
## Security

The long-term average has been 670,000,000 votes over 12,000,000
slots, or 55 out of 64 voting validators per slot. This includes
blocks missed due to block producer failures. When a client sees
55/64, or ~86%, of the stake confirming a block, it can expect that
~24%, or (86 - 66.666.. + 4.666..)%, of the network must be slashed
for this block to fail full finalization.

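The ~24% figure follows directly from the expression in the text; a plain-Python check (arithmetic only, variable names are illustrative):

```python
# Arithmetic behind the "~24% must be slashed" claim (illustrative only).

votes_seen = 55 / 64  # ~85.94% of stake observed confirming the block
finality = 2 / 3      # stake required for full finalization
X = 0.0466            # minimum slashable stake fraction

# Stake that must have violated the protocol (and so be slashed)
# for the optimistically confirmed block to miss full finalization:
must_be_slashed = votes_seen - finality + X
print(f"{must_be_slashed:.2%}")  # 23.93%, i.e. roughly 24%
```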
## Why Solana?

This approach can be built on other networks, but the implementation
complexity is significantly reduced on Solana because our votes have
provable VDF-based timeouts. It's not clear that switching proofs
can be easily constructed in networks with weak assumptions about
time.

## Slashing roadmap

Slashing is a hard problem, and it becomes harder when the goal of
the network is to be the fastest possible implementation. The
tradeoffs are especially apparent when optimizing for latency. For
example, we would like validators to cast and propagate their votes
before the memory has been synced to disk, which means that the
risk of local state corruption is much higher.

Fundamentally, our goal for slashing is to slash 100% in cases where
a node is maliciously trying to violate safety rules and 0% during
routine operation. Our aim is to achieve that by first implementing
slashing proofs without any automatic slashing whatsoever.

Right now, for regular consensus, the network will halt after a
safety violation. We can analyze the data, figure out who was
responsible, and propose that the stake be slashed after restart.
A similar approach will be used for optimistic confirmation. An
optimistic confirmation safety violation is easily observable, but
under normal circumstances it may not halt the network. Once the
violation has been observed, validators will freeze the affected
stake in the next epoch and decide in the next upgrade whether the
violation requires slashing.

In the long term, transactions should be able to recover a portion
of the slashing collateral if the optimistic safety violation is
proven. In that scenario, each block is effectively insured by the
network.