* Mostly implement key-value store and add integration points
Essential key-value store functionality is implemented; it still needs to be integrated, tested, and activated.
Behind the `kvstore` feature.
* wake up replay stage when the poh bank is cleared
* bump ticks per second
* Increase ticks per slot to match faster tick rate
* Remove check that working bank must be the bank for the greatest slot
* Make start_leader() skip starting TPU for slots we've already been leader for
This code wasn't updated after we started batching instructions.
The current code does allocations instead of using CreateAccount.
The runtime shouldn't allow that, so we're getting this code out of
the way before we lock down the runtime.
Vote program currently offers no path to vote with the
authorized voter. There should be a
VoteInstruction::new_authorized_vote() that accepts the
keypair of the authorized voter and the pubkey of the vote
account. The only option in the current code is
VoteInstruction::new_vote() that accepts the voter's keypair
and assumes that pubkey is the vote account.
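A hypothetical sketch of the missing constructor, with simplified stand-in types (not the real solana-vote-program definitions):

```rust
// Stand-in types for illustration only; the real SDK types differ.
#[derive(Clone, Copy)]
struct Pubkey([u8; 32]);

struct Keypair {
    pubkey: Pubkey,
}

struct Vote {
    tick_height: u64,
}

enum VoteInstruction {
    // Records which account the vote is credited to.
    Vote { vote_account: Pubkey, vote: Vote },
}

impl VoteInstruction {
    /// Existing path: assumes the signer's pubkey *is* the vote account.
    fn new_vote(voter: &Keypair, vote: Vote) -> Self {
        VoteInstruction::Vote { vote_account: voter.pubkey, vote }
    }

    /// Proposed path: sign with the authorized voter's keypair, but target a
    /// separately specified vote account.
    fn new_authorized_vote(authorized_voter: &Keypair, vote_account: Pubkey, vote: Vote) -> Self {
        let _signer = authorized_voter.pubkey; // the signature would come from this keypair
        VoteInstruction::Vote { vote_account, vote }
    }
}
```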
Multiple threads can pass the read-lock check and
each store a new empty set into account_maps.
Check again after taking write lock to make sure
only one thread actually inserts the new entry.
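A minimal, self-contained sketch of that double-checked pattern (not the actual accounts code):

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

// Re-check under the write lock: several threads may all have seen the key
// missing while they only held the read lock.
fn get_or_create(account_maps: &RwLock<HashMap<u64, HashSet<u64>>>, key: u64) {
    if account_maps.read().unwrap().contains_key(&key) {
        return; // fast path: entry already exists
    }
    let mut maps = account_maps.write().unwrap();
    // Another thread may have inserted between our read and write locks.
    maps.entry(key).or_insert_with(HashSet::new);
}
```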
Also:
* Add an assertion to the transaction builder if not enough
keypairs were provided for all keys that require signatures.
* Expose bugs in the runtime.
vote_index was not being maintained correctly during a squash.
The tokens==0 shielding accounts were being inserted with
owner=default Pubkey, so they weren't recognized as vote accounts
and didn't update the vote accounts set.
* Moved supermajority functions into new module, staking_utils
* Move staking functions out of bank, and into staking_utils, change get_supermajority_slot to only use state from epoch boundary
* Move bank slot height in staked_nodes_at_slot() to be bank id
* Add slot 3 back to ASCII art
* New slot-oriented diagrams
With one block per slot, slots are drawn vertically. That's the ideal
case. An abandoned block is what should look like something forking
off to the side.
* Add lockouts to vote program
* Rename MAX_VOTE_HISTORY to MAX_LOCKOUT_HISTORY, change process_vote() to only pop votes after MAX_LOCKOUT_HISTORY + 1 votes have arrived
* Correctly calculate serialized size of an Option, rename root_block to root_slot
* add bank.id() which can be used by BankForks, blocktree_processor
* add bank.hash(), make hash_internal_state() private
* add bank.freeze()/is_frozen(), also useful for blocktree_processor, eventual freeze()ing in replay
We have lots of tests that work off the genesis block. Also, one
might want to generate a future leader schedule under the assumption
that the stakers stay the same.
I had suggested the last_id, but that puts an unnecessary dependency
on LastIdsQueue. Using epoch height is pretty interesting in that
given the same set of stakers, you simply increment the seed once
per epoch.
Also, tighten up the LeaderSchedule code.
Cleanup poh_recorder and poh_service.
* ticks are sent only if poh.tick_height > WorkingBank::min_tick_height and <= WorkingBank::max_tick_height
* entries are recorded only if poh.tick_height >= WorkingBank::min_tick_height and < WorkingBank::max_tick_height
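A minimal sketch of those gating rules; the `WorkingBank` field names are taken from the bullets above, not necessarily the exact poh_recorder definitions:

```rust
struct WorkingBank {
    min_tick_height: u64,
    max_tick_height: u64,
}

// Ticks are forwarded only inside the half-open range (min, max].
fn should_send_tick(poh_tick_height: u64, bank: &WorkingBank) -> bool {
    poh_tick_height > bank.min_tick_height && poh_tick_height <= bank.max_tick_height
}

// Entries are recorded only inside the half-open range [min, max).
fn should_record_entries(poh_tick_height: u64, bank: &WorkingBank) -> bool {
    poh_tick_height >= bank.min_tick_height && poh_tick_height < bank.max_tick_height
}
```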
Also, update LeaderScheduler's code to use node_id as well.
Unfortunately, there are no unit tests for this, because there's currently
only one way to set staker_id/node_id, and they are both set
to the same value.
It's useful for unit-testing, but generally isn't a variable
validators should be modifying. Blockstream and BlockstreamService
were the only ones using it. Switching them from a hard-coded 10
to the default didn't cause any test failures, so running with it.
This ensures GenesisBlock is always configured with the same
ticks_per_slot as LeaderScheduler. This will make it easier
to migrate to bank-generated schedules.
Anything that affects how the ledger is interpreted needs to be
in the genesis block or someplace on the ledger before later
parts of the ledger are interpreted. We currently don't have an
on-chain program for cluster parameters, so that leaves only
the genesis block option.
This test passes consistently when the test suite is run with a
single thread. It fails consistently on MacOS when run as part
of the unit-test suite.
No idea why it passes in CI.
new_from_parent
* Use the accounts list from parents up to finalized bank for Account::load apis.
* Borrow checker
* query the previous parents accounts
* cleanup!
* s/tree/parents
* Tests! Last_ids need to be inherited as well otherwise nothing works.
Functional change: the leader scheduler is no longer implicitly
updated by PohRecorder via register_tick(). That's intended to
be a "feature" (crossing fingers).
By passing a config and not a Arc'ed LeaderScheduler, callers
need to use `Bank::leader_scheduler` to access the scheduler.
By using the new constructor, there should be no incentive to
reach into the bank for that object.
That way we don't need TODOs saying "don't forget to iterate
over checkpoints too". It should be assumed that when the bank
references its previous checkpoint all its methods would
acknowledge it.
Fullnode was the only real consumer of process_ledger and it was
only there to process a Blocktree. Blocktree is a tree, and a
ledger is a sequence, so something's clearly not right here.
Drop all other dependencies on process_ledger (only one test) so
that it can be fixed up in isolation.
Depending on `solana_vote_program` is not an option because
then vote_program's entrypoint conflicts with reward_program's
entrypoint.
This unfortunately turns the SDK into a dumping ground for all
things shared between vote_program and other programs. Better
would be to create a solana-vote-api crate similar to the
solana-rewards-api crate.
Now clients can use all the libraries to create transactions
and dissect account data without needing to be constrained about
what can be compiled into a shared object or BPF.
Likewise, program development can move forward without being
concerned with bloating the shared object.
* Modify db_ledger to support per_slot metadata, add signal for updates, and add chaining to slots in db_ledger
* Modify replay stage to ask db_ledger for updates based on slots
* Add repair send/receive metrics
* Add repair service, remove old repair code
* Fix tmp_copy_ledger and setup for tests to account for multiple slots and tick limits within slots
The assert_counters() helper creates unreadable tests and makes
us have to update every test any time a counter is added. Instead,
we can just assert the values of any particular counters the test
may have affected.
* Modify replay stage to ask db_ledger for updates instead of reading from upstream channel
* Add signal for db_ledger to update listeners about updates
* fix flaky test
- Update leader/validator pipeline stage graph, as any node can be
doing either of the roles
- Update network stats graphs to remove hostname based filtering
And add a new 'staker_id' VoteState member variable to offer a path to
work our way out. Update leader scheduler to use staker_id, but
continue setting it to 'from_id' for the moment.
No functional changes here.
The bpfloader crate was triggering cargo to perform excessive rebuilds
of in-workspace dependencies. Unclear why exactly, but seems related to
the special dual crate-type employed by bpfloader.
* Groundwork for entry tree, align constants with definitions in the book
* Fix edge case in test, node can keep generating ticks between handle_role_transition and exit() call
* Connect TPU's broadcast service with TVU's blob fetch stage
- This is needed since ledger is being written only in TVU now
* fix clippy warnings
* fix failing test
* fix broken tests
* fixed failing tests
Split up StatusDeque into different modules
* LastIdQueue tracks last_ids
* StatusCache keeps track of signature statuses
* StatusCache stores success as a bit in a bloom filter
* Overhead for 1M Ok transactions is about 4 MB in memory (a quick check follows this list)
* Less concurrency between the objects, last_id and status_cache are read and written to at different points in the pipeline
* Each object has its own strategy for merging into the root checkpoint
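As a rough check on that figure (assuming "4 MB" means 4 × 2^20 bytes):

```latex
\frac{4 \times 2^{20} \,\text{bytes} \times 8 \,\text{bits/byte}}{10^{6} \,\text{transactions}} \approx 33.6 \,\text{bits per transaction}
```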
* Add timeout to Replicator::new; used when polling for leader
* Add timeout functionality to replicator ledger download
Shares the same timeout as polling for leader
Defaults to 30 seconds
* Add docs for Replicator::new
* split load+execute from commit in bank, insert record between them in TPU code
* clippy
* remove clear_signatures() race with commit_transactions()
* add #[test] back
* Allow chained BudgetExpr via indirection
Change `And`, `Or`, and `After` expressions to contain
`Box<BudgetExpr>`s instead of directly holding payments (see the sketch after this list)
* run cargo fmt
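A rough sketch of the resulting shape, with placeholder types (not the exact budget program definitions):

```rust
struct Payment {
    tokens: u64,
}

// Placeholder for timestamp/signature conditions.
struct Condition;

enum BudgetExpr {
    /// Terminal case: pay out.
    Pay(Payment),
    /// One condition gates a nested expression.
    After(Condition, Box<BudgetExpr>),
    /// Both conditions must be met before the nested expression applies.
    And(Condition, Condition, Box<BudgetExpr>),
    /// Either arm may complete, each gating its own nested expression.
    Or((Condition, Box<BudgetExpr>), (Condition, Box<BudgetExpr>)),
}
```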
* Revert "Fix link to book in Local Testnet section (#2416)"
This reverts commit 710c0c9980.
* Revert "Add current leader information to dashboard (#2413)"
This reverts commit f0300c1711.
* Revert "remove window code from most places (#2389)"
This reverts commit e3c0bd5a3f.
* Add trait for RpcRequestHandler trait for RpcClient and add MockRpcClient for unit tests
* Add request_airdrop integration test
* Add timestamp_tx, witness_tx, and cancel_tx to wallet integration tests; add wallet integration tests to test-stable
* Add test cases
* Ignore plentiful sleeps in unit tests
* Use unwrap() on locks
An error there generally indicates a programmer error, not a
runtime error, so a detailed runtime message is not generally useful.
* Only clone Arcs when passing them across thread boundaries
* Cleanup TPU forwarder
By separating the query from the update, all the branches get easier to
test. Also, the update operation gets so simple, that we see it
belongs over in packet.rs.
* Thanks clippy
cute
This trait is for bloom, not crds.
Slightly better would be to put it in the SDK so that the trait
implementations could go into hash and pubkey, but if we don't
want compatibility constraints, this is the next best thing.
Unfortunately due to our multi-crate repo, as soon as
|./scripts/increment-cargo-version.sh| is run after a release, |cargo
package| will fail for crates that depend on other in-tree crates, as
the new crate version has not yet been published to crates.io.
For now this means that we need to continue flying blind and be prepared
to deal with minor publishing issues on each new release.
* test_encrypt_files_many_keys_multiple_keys passing
- buffer chunk size unified between the single-key and multiple-key paths,
which shouldn't be necessary, but we can fix that later.
* test_encrypt_file_many_keys_bad_key_length passing
* Use vote signer service in fullnode
* Use native types for signature and pubkey, and address other review comments
* Start local vote signer if a remote service address is not provided
* Rebased to master
* Fixes after rebase
Add a command-line argument (min-hashes) to restrict the entries
processed by ledger-tool. For example, --min-hashes 1 will strip
"empty" Entries, i.e. those with num_hashes = 0.
Add basic ledger tool test
* implement per-thread, per-batch sleep in ms (throttling allows easier UI development)
* tidy up sleep() call with Duration::from_millis() instead of Duration::new()
* fixup indentation style
* removing multinode-demo/client-throttled.sh (same functionality available via arguments)
* Also implement more storage contract logic
* Add transactions for proof validation,
* Move storage state members into system storage account userdata
Not sure how else to comment, but I'd encourage keeping the content timeless rather than an update. So don't talk about recent releases or what we're doing next. (just a suggestion ;)
* Parallelize entry processing in replay stage in validators
- single threaded entry processing is not utilizing CPU cores to the fullest
* fix tests and address review comments
* Break out last_ids into its own module
* Boot SignatureNotFound from BankError
* No longer return BankError from LastIds methods
* No longer piggyback on BankError for a LastIds signature status
* Drop all dependencies on the bank
* SignatureStatus -> Status and LastIds -> StatusDeque
* Unstable tests, issue 2193
Separate CLI/clap related code, create a new `Config` struct to hold all
configuration/CLI args
Remove most code from `main.rs`
Add a little documentation
* Vote every number of ticks
* address review comments
* fix for failing leader rotation tests
* remove check for vote failure from replay tests
(as votes will be cached and transmitted when leader is available)
* Remove logging init from storage program: saw a crash in a test
indicating the logger being init'ed twice.
* Add entry_height mining proof to indicate which segment the result is
for
* Add an interface to get storage miner pubkeys for a given entry_height
* Add an interface to get the current storage mining entry_height
* Set the tvu socket to 0.0.0.0:0 in replicator to stop getting entries
after the desired ledger segment is downloaded.
* Use signature of PoH height to determine which block to download for
replicator.
* Reduce args in Tvu::new to under 8
Now pass in sockets through the crate::tvu::Sockets struct
Move ClusterInfo.keypair to pub(crate) in order to remove redundant
signing keypair parameter
* remove commented code
Use 'network' for the networking stack. Examples:
* The network drops packets.
* The cluster rejects bad transactions.
* The Solana cluster runs on a gigabit network.
* Insert blobs into db_ledger in broadcast stage to support leader to validator transitions
* Add transmitting real slots to broadcast stage
* Handle real slots instead of default slots in window
* Switch to dummy repair on slots and modify erasure to support leader rotation
* Shorten length of holding locks
* Remove logger from replicator test
We followed the precedent set by the Rust book here, but now that
proposals are integrated, each proposal can simply include its own
terminology section.
* ledger block -> ledger segment
The book already defines a *block* to be a slight variation of
how block-based changes define it. It's the thing the cluster
confirms should be the next set of transactions on the ledger.
* Boot storage description from the book
* Move more of the replicator logic into the replicator class
* Add support for the RPC interface to query the storage last_id value
that the replicator would sign and use to pick a block.
* Fix replicator connecting to gossip and change test to exercise that
scenario.
First explain how a client interacts with existing programs and why
you'd do that. Next, mention that users can contribute their own programs.
Then explain how those programs can be written in any language.
Finally, mention persistent storage, which is only needed by
stateful programs.
* Change erasure to consume new RocksDb window
* Change tests for erasure
* Remove erasure from window
* Integrate erasure decoding back into window
* Remove corrupted blobs from ledger
* Replace Erasure result with result module's Result
Previous version talked about concurrency, which is described
in detail in the Anatomy of a Fullnode chapter. App developers
probably don't care that their programs run in parallel with
other programs. From their perspective, there's no difference
between 10x parallelism and a 10x faster CPU.
* Initial vote signing service implementation
- Does not use enclave for secure signing
* fix clippy errors
* added some tests
* more tests
* Address review comments + more tests
* Vote signing JSON RPC service
- barebone service that listens for RPC requests
* Daemon for vote signer service
* Add request APIs for JSON RPC
* Cleanup of cargo dependencies
* Fix compiler error
* Cleanup book
* Distinguish upstream from downstream validators
* Add BroadcastStage to Fullnode/Tpu diagrams
* First attempt to re-describe the runtime
* Reorg book
Push back details of the fullnode implementation
* Add db_window module for windowing functions from RocksDb
* Replace window with db_window functions in window_service
* Fix tests
* Make note of change in db_window
* Create RocksDb ledger in bin/fullnode
* Make db_ledger functions generic
* Add db_ledger to bin/replicator
First, talk about how a client interacts with Solana to do useful
things. Then describe how the fullnode you're talking to works and
why it's so very fast. Last, why that fullnode you don't trust
does what you asked it to anyway.
* Merge Leader and Validator diagrams
* New sdk-tools diagram
* Move terminology to just after introduction
* Purge use of LAMPORT as an acronym
* Add notes about persistent storage
An RPC client that fetches the signature status before the bank finishes
executing the corresponding Transaction should receive SignatureNotFound
instead of Confirmed
* Set drone address to always be the initial network entry point, so that even when leaders rotate the client can still find the drone
* Extract drone address as a separate argument to bench-tps
* Add drone port to client.sh instead of setting it in bench-tps
* Add drone entrypoint to scripts
* Fix build error
* Cluster Replicated Data Store
Separate the data storage and merge strategy from the network IO boundary.
Implement an eager push overlay for transporting recent messages.
Simulation shows fast convergence with 20k nodes.
* Add stamps RFC
* Don't use the language 'load the program'
* Replace stamps RFC with new more general drone design
* Fix typo
* Describe potential techniques for getting recent last_ids
That's supposed to be an ASCII format, but we're not making use
of it. We can switch back to that some day, but if we do, it shouldn't
be JSON-encoded.
* Add the solana-wallet documentation
There doesn't seem to be a way to publish bin docs to crates.io.
Until there is, we can include CLI documentation in the appendix
of the markdown book.
* A command to generate all the usage docs
Usage:
$ scripts/wallet-help.sh >> src/wallet.md
* Reorg TVU code to look like TVU diagram
And move channel creation into LedgerWriteStage so that it can
be used in the same way as all the other stages.
* Delete commented out code
* Stub out architecture documentation
* Add book HTML generation and book tests to CI
* Add heading
* Better table of contents
* Reference existing documentation
Move ASCII art from code comments into rendered SVG
* Attempt to fix CI
* Add lamport docs
And truncate lines to 80 characters
* Fix links
And reference shorter, newer description of PoH.
* Replace ASCII art with SVG
* Streamline for Pillbox
* Update path before optional install
* Use $CARGO_HOME instead of $HOME
* Delete code
Attempt to describe all data structures without code.
* Boot RPU from docs, add JsonRpcService
Also, use Rust naming conventions in the block diagrams to
minimize the jump from docs to code.
* Latest code uses tick_height
* Rename bob/ folder to art/
A home for any ASCII art
* Import JSON RPC API
* More mdbook docs
* Add Ncp
* Cleanup links
* Move pipelining description into fullnode description
* Move high-level transaction docs into top-level doc
* Delete unused files
add linked-list capability to accounts
change accounts from a linked list to a VecDeque
add checkpoint and rollback for lastids
add subscriber notifications for rollbacks
checkpoint transaction count, too
* Move finality computation into a service run from the banking stage, ComputeLeaderFinalityService
* Change last ids nth to tick height, remove separate tick height from bank
* Add first leader to genesis entries, consume in genesis.sh
* Set bootstrap leader in the bank on startup, remove instantiation of bootstrap leader from bin/fullnode
* Remove need to initialize bootstrap leader in leader_scheduler, now can be read from genesis entries
* Add separate interface new_with_leader() in mint for creating genesis leader entries
Deserialize operations are faster when done serially ahead of the
multithreaded banking stage, and the reduced thread context switching
improves performance.
* Added tests to thin client to test VoteContract calls, fix VoteContract sizing errors
* Calculate upper bound on VoteProgram size at runtime, add test for serializing/deserializing a max sized VoteProgram state
This may or may not fix high latencies seen on the snap build on v100.
GPU driver will not have to JIT the device code for V100 though which
is an improvement.
* Add Vote Contract
* Move ownership of LeaderScheduler from Fullnode to the bank
* Modified ReplicateStage to consume leader information from bank
* Restart RPC Services in Leader To Validator Transition
* Make VoteContract Context Free
* Remove voting from ClusterInfo and Tpu
* Remove dependency on ActiveValidators in LeaderScheduler
* Switch VoteContract to have two steps 1) Register 2) Vote. Change thin client to create + register a voting account on fullnode startup
* Remove check in leader_to_validator transition for unique references to bank, b/c jsonrpc service and rpcpubsub hold references through jsonhttpserver
* Use IV to make unique identities
* Use hex! macro for hex literal and not string converted to u8 slice
* fix sha sampling to control init/end of sha state
* Move ledger write to its own stage
- Also, rename write_stage to leader_vote_stage, as write functionality
is moved to a different stage
* Address review comments
* Fix leader rotation test failure
* address review comments
Addresses the following problem
- Validators are not able to keep up with the leader
- The future blobs (outside of window) get dropped
- The validators won't process repair requests for these future blobs
* Add PoH height to process_ledger()
* Moved broadcast_stage Leader Scheduling logic to use Poh height instead of entry_height
* Moved LeaderScheduler logic to PoH in ReplicateStage
* Fix Leader scheduling tests to use PoH instead of entry height
* Change is_leader detection in repair() to use PoH instead of entry height
* Add tests to LeaderScheduler for new functionality
* fix Entry::new and genesis block PoH counts
* Moved LeaderScheduler to PoH ticks
* Cleanup to resolve PR comments
Budget now assumes the source account holds all tokens the program
should spend.
Note: the static guarantees implied by verify_plan() are meaningless
under the new contract engine. The bank no longer calls it. This
serves as a nice example of where comparing code coverage between
integration tests and unit tests would have shown us where a
change rendered unit tests meaningless.
Debits no longer need to be applied before credits. Instead, we
lock any accounts we'd debit and so error out on the second attempt
to lock the same account.
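A minimal sketch of that lock-then-execute idea (standalone, not the bank's actual account-locking code):

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32];

#[derive(Default)]
struct AccountLocks {
    locked: HashSet<Pubkey>,
}

impl AccountLocks {
    /// Lock every account a transaction touches, or fail the whole
    /// transaction if any of them is already in use.
    fn lock_accounts(&mut self, keys: &[Pubkey]) -> Result<(), &'static str> {
        if keys.iter().any(|k| self.locked.contains(k)) {
            return Err("AccountInUse");
        }
        self.locked.extend(keys.iter().copied());
        Ok(())
    }

    /// Release the locks after the transaction is executed and committed.
    fn unlock_accounts(&mut self, keys: &[Pubkey]) {
        for k in keys {
            self.locked.remove(k);
        }
    }
}
```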
In the old bank (before the contract engine), Contract wasn't specific
to Budget. It provided the same service as what is now called
SystemProgram::Move, but without requiring a separate account.
- Testnet dashboard shows that channel pressure for write stage
is incrementing on every iteration of write.
- This change optimizes ledger writing by removing cloning of map
and reducing calls to flush
The integration tests are allowed to open sockets, so running them
in parallel may cause "Too many open files" errors. This patch
runs the unit tests in parallel and the integration test serially.
* Escape RUST_LOG configuration in remote-node.sh
- If it was set to #, it was causing other parameters to be commented out
* escape other variables as well
* disabled shell check
* Fix shellcheck error
* Upload bench output as build artifacts
* Fix tags types
* Pull previous stats from metrics
* Change the default branch for comparison
* Fix formatting
* Fix build errors
* Address review comments
* Dedup some common code
* Add eval for channel info to find branch name
Generate tick entry ids and only register ticks as the last_id expected by the bank. Since the bank is multi-threaded, the in-flight pipeline of transactions cannot be close to the end of the queue, or there is a high possibility that a starved thread will encode an expired last_id into the ledger. The banking_stage therefore uses a shorter age limit for encoded last_ids than the validators do.
The bench client doesn't send transactions that are older than 30 seconds.
* Added LeaderScheduler module and tests
* plumbing for LeaderScheduler in Fullnode + tests. Add vote processing for active set to ReplicateStage and WriteStage
* Add LeaderScheduler plumbing for Tvu, window, and tests
* Fix bank and switch tests to use new LeaderScheduler
* move leader rotation check from window service to replicate stage
* Add replicate_stage leader rotation exit test
* removed leader scheduler from the window service and associated modules/tests
* Corrected is_leader calculation in repair() function in window.rs
* Integrate LeaderScheduler with write_stage for leader to validator transitions
* Integrated LeaderScheduler with BroadcastStage
* Removed gossip leader rotation from crdt
* Add multi validator, leader test
* Comments and cleanup
* Remove unneeded checks from broadcast stage
* Fix case where a validator/leader need to immediately transition on startup after reading ledger and seeing they are not in the correct role
* Set new leader in validator -> validator transitions
* Clean up for PR comments, refactor LeaderScheduler from process_entry/process_ledger_tail
* Cleaned out LeaderScheduler options, implemented LeaderScheduler strategy that only picks the bootstrap leader to support existing tests, drone/airdrops
* Ignore test_full_leader_validator_network test due to bug where the next leader in line fails to get the last entry before rotation (b/c it hasn't started up yet). Added a test, test_dropped_handoff_recovery, to track this bug
* Change format of data for TPS/Finality metrics in testnet automation
* Revert number of nodes for testnet automation
* Split python command to its own script
* Fix python command line arguments
lua_State is not preserved across runs and account userdata is not converted into
Lua values. All this allows us to do is manipulate the number of tokens
in each account and DoS the Fullnode with those three little words,
"repeat until false".
Why bother? Research. rlua's project goals are well-aligned with the LAMPORT runtime.
What's next:
* rlua to add security limits, such as number of instructions executed
* Add a way to deserialize Account::userdata OR use Account::program_id
to look up a metatable for lua_newuserdata().
* Add support for different node counts
* Update variable names
* Delete network even after failures
* Add array for node counts
* Changed number of nodes to a space separated string of numbers
* Adjust number of nodes
* Snap will not be published if the env variable DO_NOT_PUBLISH_SNAP is set
* Address review comments
* Replaced influx db URL
Uploads two reports to Buildkite, one from cargo-cov and one from lcov via grcov. The lcov one is busted on linux and is what we need to bring codecov.io back up again. It works great on macos if you wanted to generate them locally and prefer lcov HTML reports.
* Also comment out non-coverage build to speed things up.
This reduces how much memory is written to the last_id_sigs table on every TX, and has a 40% impact on
`cargo +nightly watch -x 'bench bench_banking_stage'`
See #1157 for details. The `from` account should be cloned
before execute_transaction(), and that's the only one that should
be stored if there's an error executing the program.
* Add check in window_service to exit in checks for leader rotation, and propagate that service exit up to fullnode
* Added logic to shutdown Tvu once ReplicateStage finishes
* Added test for successfully shutting down validator and starting up leader
* Add test for leader validator interaction
* fix streamer to check for exit signal before checking socket again to prevent busy leaders from never returning
* PR comments - Rewrite make_consecutive_blobs() function, revert genesis function change
lastidnotfound step 2:
* move "record stage", aka poh_service into banking stage
* remove Entry.has_more, is incompatible with leader rotation
* rewrite entry_next_hash in terms of Poh
* simplify and unify transaction hashing (no embedded nulls)
* register_last_entry from banking stage, fixes #1171 (w00t!)
* new PoH doesn't generate empty ledger entries, so some fixes necessary in
multinode tests that rely on that (e.g. giving validators airdrops)
* make window repair less patient, if we've been waiting for an answer,
don't be shy about most recent blobs
* delete recorder and record stage
* make thin_client error reporting more verbose
* more tracing in window (sigh)
* Add hooks for executing the storage contract
* Add store_ledger stage
Similar to replicate_stage but no voting/banking stuff, just convert
blobs to entries and write the ledger out
* Add storage_addr to tests and add new NodeInfo constructor
to reduce duplication...
* Move recycler instances to the point of allocation
* sinks no longer need to call `recycle`
* Remove the recycler arguments from all the apis that no longer need them
- The write stage will output vector of entries
- Broadcast stage will create blobs out of the entries
- Helps reduce MIPS requirements for write stage
* Move register_entry_id() call out of write stage
- Write stage is MIPS intensive and has become a bottleneck for
TPU pipeline
- This will reduce the MIPS requirements for the stage
* Fix rust format issues
* budget and system contracts and verification
* contract check_id methods
* system call contract
* verify contract execution rules
* move system into its own file
* allocate before transfer for budget
* store error in budget context
* budget contract and tests without bank
* moved budget out of bank
* Tweak GCE scripts for higher node count
- Some validators were unable to rsync config from leader when
the node count was high (e.g. 25). Looks like the leader node was
getting more rsync requests in parallel than it could handle.
- This change staggers the validators bootup, and rsync time
* Address review comments
* Use multiple sockets for receiving blobs on validators
- The blobs that are broadcasted by leader or retransmitted by peer
validators are received on replicate_port
- Using reuse_addr/reuse_port, multiple sockets can be opened for
the same port
- This allows the kernel to queue data to user space app on multiple
socket queues, preventing over-running one queue
- This helps with reducing packets dropped due to queue over-runs
Fixes #1224
* Fixed failing tests
* Implemented recvmmsg() for UDP packets
- This change implements binding between libc API for recvmmsg()
- The function can receive multiple packets using one system call
Fixes #1141
* Added unit tests for recvmmsg()
* Added a recv_mmsg() wrapper for non-Linux OSes (see the fallback sketch after this list)
* Address review comments for recvmmsg()
* Remove unnecessary imports
* Moved target specific dependencies to the function
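A hedged sketch of what a non-Linux fallback could look like: batch packets by looping over `recv_from()`, switching the socket to non-blocking after the first packet. Names and sizes here are assumptions, not the actual recvmmsg.rs code.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

const NUM_RCVMMSGS: usize = 16; // assumed batch size
const PACKET_DATA_SIZE: usize = 1280; // assumed packet buffer size

fn recv_mmsg(
    socket: &UdpSocket,
    packets: &mut Vec<(Vec<u8>, Option<SocketAddr>)>,
) -> io::Result<usize> {
    let mut received = 0;
    for _ in 0..NUM_RCVMMSGS {
        let mut buf = vec![0u8; PACKET_DATA_SIZE];
        match socket.recv_from(&mut buf) {
            Ok((nbytes, from)) => {
                buf.truncate(nbytes);
                packets.push((buf, Some(from)));
                received += 1;
                // Don't block waiting for a full batch once one packet arrived.
                socket.set_nonblocking(true)?;
            }
            // Batch is done: nothing more queued right now.
            Err(e) if e.kind() == io::ErrorKind::WouldBlock && received > 0 => break,
            Err(e) => {
                let _ = socket.set_nonblocking(false);
                return Err(e);
            }
        }
    }
    socket.set_nonblocking(false)?;
    Ok(received)
}
```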
In the error case that i>0 (we have blobs to send)
we break out of the loop and do not push the allocated r
to the v array. We should recycle this blob, otherwise it
will be dropped.
* fix poll_gossip_for_leader() loop to actually wait
for 30 seconds
* reduce reuseaddr use to only when necessary,
try to avoid already bound sockets
* move nat.rs to netutil.rs
* add gossip tracing to thin_client and bench-tps
* Migrate Budget DSL to use the Account state instead of global bank data structures.
* Serialize Instruction into Transaction::userdata (see the sketch after this list).
* Store the pending set in the Account::userdata
* Enforce the token balance rules on contract execution. This becomes the entry point for generic contracts.
* This pr will have a performance impact on the bank. The next set of changes will fix this by locking each account during multi threaded execution of all the contracts.
* With this change a contract transaction needs to store its state under an address. That address could be the destination of the tokens, or any random address. For the latter, an extra step would be needed to claim the tokens which isn't implemented by budget_dsl at the moment.
* test tracking issue 1157
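The `Transaction::userdata` serialization mentioned above would look roughly like this; the `Instruction` variants are simplified placeholders, not the real budget program types (uses `serde` and `bincode`):

```rust
use serde::{Deserialize, Serialize};

// Simplified placeholder for the budget program's instruction enum.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
enum Instruction {
    NewBudget { tokens: u64 },
    ApplySignature,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Serialize the instruction into the transaction's userdata bytes...
    let userdata = bincode::serialize(&Instruction::NewBudget { tokens: 42 })?;
    // ...and decode it again inside the contract's entry point.
    let decoded: Instruction = bincode::deserialize(&userdata)?;
    assert_eq!(decoded, Instruction::NewBudget { tokens: 42 });
    Ok(())
}
```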
* remove client.sh from snap
* default to ephemeral instead of ~/.config key
* rework CLI for bench-tps
* remove multinode-demo stuff from remote-client.sh
* remove multinode-demo from remote-sanity and localnet-sanity
It needed to be passed the lock before, because it contained a
branch where one side didn't require locking. Now that that
defensive programming was hoisted, we can hoist the write lock
as well, leaving a simpler function for unit testing.
Even if transactions are dropped, accounts will have a buffer
of tokens. Should reduce or eliminate AccountNotFound errors seen in the
leader while bench-tps is running.
* Reuse UDP port and open multiple sockets for transaction address
* Fixed failing crdt tests
* Add tests for reusing UDP ports
* Address review comments
* Updated bench-streamer to use multiple receive sockets
* Fix minimum number of recv sockets for bench-streamer
* Address review comments
Fixes #1132
* Moved bind_to function to nat.rs
* use USER instead of whoami
make gcloud_FigureRemoteUsername robust against unsolicited output
(that I get on login ;) )
validate --prefix argument
* Update gcloud.sh
If balance goes to 0, then bank removes the account
from its account table and returns a no-account error. The thin client
should also update the account to this state or it will
still have the cached balance from the last successful get_balance().
* rename NodeInfo field of Node from "data" to "info"
(touches a lot of files)
* update client to use gossip to find leader, a la drone
* rework multinode scripts
* move more stuff into rust
* added usage to all
* no more rsync unless you're a validator (TODO: whack that, too)
* fullnode doesn't bail if drone isn't up yet, just keeps trying
* drone doesn't bail if network isn't up yet, just keeps trying
* remove trailing whitespace in ci/audit.sh
* code review fixups
* rename GOSSIP_PORT_RANGE => SOLANA_PORT_RANGE
* remove out-of-date TODO in localnet-sanity.sh
* remove features=test and code that was using it (localhost prohibitions in
crdt) added TODO in crdt.rs, maybe we should boot localhost in production
networks?
* boot tvu_window from NodeInfo: instead, send repair requests from the repair
socket (to gossip on peer) and answer repair requests via the sockaddr
from the repair request
* remove various unused pub functions
* banish SocketAddr parse().unwrap() to a macro that can also accept simpler stuff
* move gossip/NCP off assuming anything about its address
* use a single socket to send and receive gossip
* remove --addr/-a from CLIs
* rearrange networking utility code
* use Arc<UdpSocket> to share the Sync-safe UdpSocket among threads
* rename TestNode to Node
TODO:
* re-enable 127.0.0.1 as a valid address in crdt
* change repair request/response to a similar, single socket
* pick cloned sockets or Arc<UdpSocket> for all these (rpu uses tryclone())
* update contact_info with network truthiness instead of what the node
says?
And update transaction offsets to use the same approach as packet.rs.
Maybe this should be serialized_size(), but thanks to this
GenericArray update, those values are the same.
* Added poll_balance_with_timeout method
- updated bench-tps, fullnode and wallet to use this method instead
of repeatedly calling poll_get_balance()
* Address review comments
- Revert some changes to use wrapper poll_get_balance()
* Reverting bench-tps to use poll_get_balance
- The original code is checking if the balance has been updated,
instead of just retrieving the balance. The logic is different
than poll_balance_with_timeout()
* Reverting wallet to use poll_get_balance
- The break condition in the loop is different than poll_balance_with_timeout().
It's checking if the balance has been updated.
The situation is that there can be bad entries in
the bench-tps CRDT table until they get purged later. Threads, however,
are created for those bad entries and then will hang on trying
to get the transaction_count from those bad addresses and never end.
* Revert benchmarks back to libtest
Criterion has too many dependencies, its execution was slower, and
we didn't see the kind of precision we had hoped for to use it to
block CI builds.
* Ignore benchmarks that take more than a few milliseconds per iteration
* Revert "Ignore benchmarks that take more than a few milliseconds per iteration"
This reverts commit b87cdf6ef4.
* Don't run benchmarks in CI
They are already built in the nightly build. Executing them in CI
doesn't add much value until the results are precise enough to act
on.
# This is the 1st commit message:
Fix testnet readme
# This is the commit message #2:
update
# This is the commit message #3:
typo
# This is the commit message #4:
cleanup
comments
fixups!
fixups!
fixups for a real Result<> from get_balance()
on 2nd thought, be more rigorous
Merge branch 'rob-solana-accounts_with_state' into accounts_with_state
update
review comments
comments
get rid of option
Solana™ is a new blockchain architecture built from the ground up for scale. The architecture supports up to 710 thousand transactions per second on a standard gigabit network.
Introduction
===
It's possible for a centralized database to process 710,000 transactions per second on a standard gigabit network if the transactions are, on average, no more than 176 bytes. A centralized database can also replicate itself and maintain high availability without significantly compromising that transaction rate using the distributed system technique known as Optimistic Concurrency Control [\[H.T.Kung, J.T.Robinson (1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At Solana, we're demonstrating that these same theoretical limits apply just as well to blockchain on an adversarial network. The key ingredient? Finding a way to share time when nodes can't trust one-another. Once nodes can trust time, suddenly ~40 years of distributed systems research becomes applicable to blockchain!
> Perhaps the most striking difference between algorithms obtained by our method and ones based upon timeout is that using timeout produces a traditional distributed algorithm in which the processes operate asynchronously, while our method produces a globally synchronous one in which every process does the same thing at (approximately) the same time. Our method seems to contradict the whole purpose of distributed processing, which is to permit different processes to operate independently and perform different functions. However, if a distributed system is really a single system, then the processes must be synchronized in some way. Conceptually, the easiest way to synchronize processes is to get them all to do the same thing at the same time. Therefore, our method is used to implement a kernel that performs the necessary synchronization--for example, making sure that two different processes do not try to modify a file at the same time. Processes might spend only a small fraction of their time executing the synchronizing kernel; the rest of the time, they can operate independently--e.g., accessing different files. This is an approach we have advocated even when fault-tolerance is not required. The method's basic simplicity makes it easier to understand the precise properties of a system, which is crucial if one is to know just how fault-tolerant the system is. [\[L.Lamport (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
Furthermore, and much to our surprise, it can be implemented using a mechanism that has existed in Bitcoin since day one. The Bitcoin feature is called nLocktime and it can be used to postdate transactions using block height instead of a timestamp. As a Bitcoin client, you'd use block height instead of a timestamp if you don't trust the network. Block height turns out to be an instance of what's being called a Verifiable Delay Function in cryptography circles. It's a cryptographically secure way to say time has passed. In Solana, we use a far more granular verifiable delay function, a SHA 256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we implement Optimistic Concurrency Control and are now well en route towards that theoretical limit of 710,000 transactions per second.
Architecture
===
The Solana repo contains all the scripts you might need to spin up your own
local testnet. Depending on what you're looking to achieve, you may want to
run a different variation, as the full-fledged, performance-enhanced
multinode testnet is considerably more complex to set up than a Rust-only,
singlenode testnet. If you are looking to develop high-level features, such
as experimenting with smart contracts, save yourself some setup headaches and
stick to the Rust-only singlenode demo. If you're doing performance optimization
of the transaction pipeline, consider the enhanced singlenode demo. If you're
doing consensus work, you'll need at least a Rust-only multinode demo. If you want
to reproduce our TPS metrics, run the enhanced multinode demo.
Before you jump into the code, review the online book [Solana: Blockchain Rebuilt for Scale](https://solana-labs.github.io/book/).
For all four variations, you'd need the latest Rust toolchain and the Solana
to see the debug and info sections for streamer and server respectively. Generally
we are using debug for infrequent debug messages, trace for potentially frequent messages and
info for performance-related logging.
Start your own testnet locally; instructions are in the book [Solana: Blockchain Rebuilt for Scale: Getting Started](https://solana-labs.github.io/book/getting-started.html).
Attaching to a running process with gdb:
Remote Testnets
---
```
$ sudo gdb
attach <PID>
set logging on
thread apply all bt
```
We maintain several testnets:
* `testnet` - public stable testnet accessible via testnet.solana.com, with an https proxy for web apps at api.testnet.solana.com. Runs 24/7
* `testnet-beta` - public beta channel testnet accessible via beta.testnet.solana.com. Runs 24/7
* `testnet-edge` - public edge channel testnet accessible via edge.testnet.solana.com. Runs 24/7
* `testnet-perf` - permissioned stable testnet running a 24/7 soak test
* `testnet-beta-perf` - permissioned beta channel testnet running a multi-hour soak test weekday mornings
* `testnet-edge-perf` - permissioned edge channel testnet running a multi-hour soak test weekday mornings
## Deploy process
They are deployed with the `ci/testnet-manager.sh` script through a list of [scheduled
After a block reaches finality, all blocks from that one on down
to the genesis block form a linear chain with the familiar name
blockchain. Until that point, however, the validator must maintain all
potentially valid chains, called *forks*. The process by which forks
naturally form as a result of leader rotation is described in
[fork generation](fork-generation.md). The *blocktree* data structure
described here is how a validator copes with those forks until blocks
are finalized.
The blocktree allows a validator to record every blob it observes
on the network, in any order, as long as the blob is signed by the expected
leader for a given slot.
Blobs are moved to a fork-able key space: the tuple of `leader slot` + `blob
index` (within the slot). This permits the skip-list structure of the Solana
protocol to be stored in its entirety, without a priori choosing which fork to
follow, which Entries to persist, or when to persist them.
Repair requests for recent blobs are served out of RAM or recent files and out
of deeper storage for less recent blobs, as implemented by the store backing
Blocktree.
### Functionalities of Blocktree
1. Persistence: the Blocktree lives at the front of a node's verification
pipeline, right behind network receive and signature verification. If the
blob received is consistent with the leader schedule (i.e. was signed by the
leader for the indicated slot), it is immediately stored.
2. Repair: repair is the same as window repair above, but able to serve any
blob that's been received. Blocktree stores blobs with signatures,
preserving the chain of origination.
3. Forks: Blocktree supports random access of blobs, so can support a
validator's need to rollback and replay from a Bank checkpoint.
4. Restart: with proper pruning/culling, the Blocktree can be replayed by
ordered enumeration of entries from slot 0. The logic of the replay stage
(i.e. dealing with forks) will have to be used for the most recent entries in
the Blocktree.
### Blocktree Design
1. Entries in the Blocktree are stored as key-value pairs, where the key is the concatenated
slot index and blob index for an entry, and the value is the entry data. Note blob indexes are zero-based for each slot (i.e. they're slot-relative).
2. The Blocktree maintains metadata for each slot, in the `SlotMeta` struct containing (sketched in Rust after this list):
* `slot_index` - The index of this slot
* `num_blocks` - The number of blocks in the slot (used for chaining to a previous slot)
* `consumed` - The highest blob index `n`, such that for all `m < n`, there exists a blob in this slot with blob index equal to `m` (i.e. the highest consecutive blob index).
* `received` - The highest received blob index for the slot
* `next_slots` - A list of future slots this slot could chain to. Used when rebuilding
the ledger to find possible fork points.
* `last_index` - The index of the blob that is flagged as the last blob for this slot. This flag on a blob will be set by the leader for a slot when they are transmitting the last blob for a slot.
* `is_rooted` - True iff every block from 0...slot forms a full sequence without any holes. We can derive is_rooted for each slot with the following rules. Let slot(n) be the slot with index `n`, and slot(n).is_full() is true if the slot with index `n` has all the ticks expected for that slot. Let is_rooted(n) be the statement that "the slot(n).is_rooted is true". Then:
is_rooted(0)
is_rooted(n+1) iff (is_rooted(n) and slot(n).is_full())
3. Chaining - When a blob for a new slot `x` arrives, we check the number of blocks (`num_blocks`) for that new slot (this information is encoded in the blob). We then know that this new slot chains to slot `x - num_blocks`.
4. Subscriptions - The Blocktree records a set of slots that have been "subscribed" to. This means entries that chain to these slots will be sent on the Blocktree channel for consumption by the ReplayStage. See the `Blocktree APIs` for details.
5. Update notifications - The Blocktree notifies listeners when slot(n).is_rooted is flipped from false to true for any `n`.
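A Rust sketch of that metadata, derived from the field list above (the real blocktree struct may differ):

```rust
struct SlotMeta {
    /// Index of this slot.
    slot_index: u64,
    /// Number of blocks in the slot, used for chaining to a previous slot.
    num_blocks: u64,
    /// Highest consecutive blob index received for this slot.
    consumed: u64,
    /// Highest blob index received for this slot.
    received: u64,
    /// Future slots this slot could chain to (possible fork points).
    next_slots: Vec<u64>,
    /// Index of the blob the leader flagged as the last blob of the slot.
    last_index: u64,
    /// True iff every slot from 0 up to and including this one is full.
    is_rooted: bool,
}
```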
### Blocktree APIs
The Blocktree offers a subscription based API that ReplayStage uses to ask for entries it's interested in. The entries will be sent on a channel exposed by the Blocktree. These subscription API's are as follows:
1. `fn get_slots_since(slot_indexes: &[u64]) -> Vec<SlotMeta>`: Returns new slots connecting to any element of the list `slot_indexes`.
2. `fn get_slot_entries(slot_index: u64, entry_start_index: usize, max_entries: Option<u64>) -> Vec<Entry>`: Returns the entry vector for the slot starting with `entry_start_index`, capping the result at `max` if `max_entries == Some(max)`; otherwise, no upper limit on the length of the return vector is imposed.
Note: Cumulatively, this means that the replay stage will now have to know when a slot is finished, and subscribe to the next slot it's interested in to get the next set of entries. Previously, the burden of chaining slots fell on the Blocktree.
### Interfacing with Bank
The bank exposes to replay stage:
1. `prev_hash`: which PoH chain it's working on as indicated by the hash of the last
entry it processed
2. `tick_height`: the ticks in the PoH chain currently being verified by this
bank
3. `votes`: a stack of records that contain:
   1. `prev_hashes`: what anything after this vote must chain to in PoH
   2. `tick_height`: the tick height at which this vote was cast
   3. `lockout period`: how long a chain must be observed to be in the ledger to
   be able to be chained below this vote
Replay stage uses Blocktree APIs to find the longest chain of entries it can
hang off a previous vote. If that chain of entries does not hang off the
latest vote, the replay stage rolls back the bank to that vote and replays the
chain from there.
### Pruning Blocktree
Once Blocktree entries are old enough, representing all the possible forks
becomes less useful, perhaps even problematic for replay upon restart. Once a
validator's votes have reached max lockout, however, any Blocktree contents
that are not on the PoH chain for that vote can be pruned or expunged.
Replicator nodes will be responsible for storing really old ledger contents,
and validators need only persist their bank periodically.
A colluding validation-client may take the strategy of marking PoReps from non-colluding replicator nodes as invalid in an attempt to maximize the rewards for the colluding replicator nodes. In this case, it isn’t feasible for the offended-against replicator nodes to petition the network for resolution, as this would result in a network-wide vote on each offending PoRep and create too much overhead for the network to progress adequately. Also, this mitigation attempt would still be vulnerable to a >= 51% staked colluder.
Alternatively, transaction fees from submitted PoReps are pooled and distributed across validation-clients in proportion to the number of valid PoReps discounted by the number of invalid PoReps as voted by each validator-client. Thus invalid votes are directly dis-incentivized through this reward channel. Invalid votes that are revealed by replicator nodes as fishing PoReps will not be discounted from the payout PoRep count.
Another collusion attack involves a validator-client who may take the strategy of ignoring invalid PoReps from colluding replicators and voting them as valid. In this case, colluding replicator-clients would not have to store the data while still receiving rewards for validated PoReps. Additionally, colluding validator nodes would also receive rewards for validating these PoReps. To mitigate this attack, validators must randomly sample PoReps corresponding to the ledger block they are validating, and because of this, there will be multiple validators that receive the colluding replicator’s invalid submissions. These non-colluding validators will be incentivized to mark these PoReps as invalid as they have no way to determine whether the proposed invalid PoRep is actually a fishing PoRep, for which a confirmation vote would result in the validator’s stake being slashed.
In this case, the proportion of time a colluding pair will be successful has an upper limit determined by the % of stake of the network claimed by the colluding validator. This also sets bounds to the value of such an attack. For example, if a colluding validator controls 10% of the total validator stake, transaction fees will be lost (likely sent to mining pool) by the colluding replicator 90% of the time, and so the attack vector is only profitable if the per-PoRep reward is at least 90% higher than the average PoRep transaction fee. While, probabilistically, some colluding replicator-client PoReps will find their way to colluding validation-clients, the network can also monitor rates of paired (validator + replicator) discrepancies in voting patterns and censor identified colluders in these cases.
Long term economic sustainability is one of the guiding principles of Solana’s economic design. While it is impossible to predict how decentralized economies will develop over time, especially economies with flexible decentralized governances, we can arrange economic components such that, under certain conditions, a sustainable economy may take shape in the long term. In the case of Solana’s network, these components take the form of the remittances and deposits into and out of the reserve ‘mining pool’.
The dominant remittances from the Solana mining pool are validator and replicator rewards. The deposit mechanism is a flat, protocol-specified and adjusted, % of each transaction fee.
The Replicator rewards are to be delivered to replicators from the mining pool after successful PoRep validation. The per-PoRep reward amount is determined as a function of the total network storage redundancy at the time of the PoRep validation and the network goal redundancy. This function is likely to take the form of a discount from a base reward to be delivered when the network has achieved and maintained its goal redundancy. An example of such a reward function is shown in **Figure 3**
**Figure 3**: Example PoRep reward design as a function of global network storage redundancy.
In the example shown in Figure 3, multiple per-PoRep base rewards are explored (as a % of Tx Fee) to be delivered when the global ledger replication redundancy meets 10X. When the global ledger replication redundancy is less than 10X, the base reward is discounted as a function of the square of the ratio of the actual ledger replication redundancy to the goal redundancy (i.e. 10X).
The other protocol-based remittance goes to validation-clients as a reward distributed in proportion to stake-weight for voting to validate the ledger state. The functional issuance of this reward is described in [State-validation Protocol-based Rewards](ed_vce_state_validation_protocol_based_rewards.md) and is designed to reduce over time until validators are incentivized solely through collection of transaction fees. Therefore, in the long-run, protocol-based rewards to replication-nodes will be the only remittances from the mining pool, and will have to be countered by the portion of each non-PoRep transaction fee that is directed back into the mining pool. I.e. for a long-term self-sustaining economy, replicator-client rewards must be subsidized through a minimum fee on each non-PoRep transaction pre-allocated to the mining pool. Through this constraint, we can write the following inequality:
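A hedged rendering of that constraint, with notation assumed rather than taken from the original text: the fraction α of each non-PoRep transaction fee pre-allocated to the mining pool must cover the replication rewards paid out.

```latex
\alpha \sum_{\text{non-PoRep tx}} \mathrm{fee}(tx) \;\ge\; \sum_{\text{validated PoReps}} \mathrm{reward}(\mathrm{PoRep})
```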
Solana’s crypto-economic system is designed to promote a healthy, long term self-sustaining economy with participant incentives aligned to the security and decentralization of the network. The main participants in this economy are validation-clients and replication-clients. Their contributions to the network, state validation and data storage respectively, and their requisite remittance mechanisms are discussed below.
The main channels of participant remittances are referred to as protocol-based rewards and transaction fees. Protocol-based rewards are protocol-derived issuances from a network-controlled reserve of tokens (sometimes referred to as the ‘mining pool’). These rewards will constitute the total reward delivered to replication clients and a portion of the total rewards for validation clients, with the remainder sourced from transaction fees. In the early days of the network, it is likely that protocol-based rewards, deployed based on a predefined issuance schedule, will drive the majority of participant incentives to join the network.
These protocol-based rewards, to be distributed to participating validation and replication clients, are to be specified as annual interest rates calculated per real-time Solana epoch [DEFINITION]. As discussed further below, the issuance rates are determined as a function of the total network validator staked percentage and the total replication provided by replicators in each previous epoch. The choice for validator and replicator client rewards to be based on participation rates, rather than a global fixed inflation or interest rate, emphasizes a protocol priority of overall economic security, rather than monetary supply predictability. Due to Solana’s hard total supply cap of 1B tokens and the bounds of client participant rates in the protocol, we believe that global interest and supply-issuance scenarios should be able to be modeled with reasonable uncertainties.
Transaction fees are market-based participant-to-participant transfers, attached to network interactions as a necessary motivation and compensation for the inclusion and execution of a proposed transaction (be it a state execution or proof-of-replication verification). A mechanism for continuous and long-term funding of the mining pool through a pre-dedicated portion of transaction fees is also discussed below.
A high-level schematic of Solana’s crypto-economic design is shown below in **Figure 1**. The specifics of validation-client economics are described in sections: [Validation-client Economics](ed_validation_client_economics.md), [State-validation Protocol-based Rewards](ed_vce_state_validation_protocol_based_rewards.md), [State-validation Transaction Fees](ed_vce_state_validation_transaction_fees.md) and [Replication-validation Transaction Fees](ed_vce_replication_validation_transaction_fees.md). Also, the chapter titled [Validation Stake Delegation](ed_vce_validation_stake_delegation.md) closes with a discussion of validator delegation opportunities and marketplace. The [Replication-client Economics](ed_replication_client_economics.md) chapter will review the Solana network design for global ledger storage/redundancy and replicator-client economics ([Storage-replication rewards](ed_rce_storage_replication_rewards.md)) along with a replicator-to-validator delegation mechanism designed to aid participant on-boarding into the Solana economy, discussed in [Replication-client Reward Auto-delegation](ed_rce_replication_client_reward_auto_delegation.md). The [Economic Sustainability](ed_economic_sustainability.md) section dives deeper into Solana’s design for long-term economic sustainability and outlines the constraints and conditions for a self-sustaining economy. Finally, in the [Attack Vectors](ed_attack_vectors.md) chapter, various attack vectors will be described and potential vulnerabilities explored and parameterized.
The ability for Solana network participants to earn rewards by providing storage service is a unique on-boarding path that requires little hardware overhead and minimal upfront capital. It offers an avenue for individuals with extra storage space on their home laptops or PCs to contribute to the security of the network and become integrated into the Solana economy.
To enhance this on-boarding ramp and facilitate further participation and investment in the Solana economy, replication-clients have the opportunity to auto-delegate their rewards to validation-clients of their choice. Much like the automatic reinvestment of stock dividends, in this scenario a replicator-client can earn Solana tokens by providing some storage capacity to the network (i.e. via submitting valid PoReps), have the protocol-based rewards automatically assigned as delegation to a staked validator node, and thereby earn interest from the validation-client reward pool.
Replicator-clients download, encrypt and submit PoReps for ledger block sections<sup>3</sup>. PoReps submitted to the PoH stream, and subsequently validated, function as evidence that the submitting replicator-client is indeed storing the assigned ledger block sections on local hard drive space as a service to the network. Therefore, replicator-clients should earn protocol rewards proportional to the amount of storage, and the number of successfully validated PoReps, that they are verifiably providing to the network.
Additionally, replicator-clients have the opportunity to capture a portion of slashed bounties [TBD] of dishonest validator-clients. This can be accomplished by a replicator-client submitting a verifiably false PoRep that a dishonest validator-client receives and signs as valid. This reward incentive is to prevent lazy validators and minimize validator-replicator collusion attacks; more on this below.
Replication-clients should be rewarded for providing the network with storage space. Incentivization of the set of replicators provides data security through redundancy of the historical ledger. Replication nodes are rewarded in proportion to the amount of ledger data storage provided. These rewards are captured by generating and entering Proofs of Replication (PoReps) into the PoH stream which can be validated by Validation nodes as described above in the [Replication-validation Transaction Fees](ed_vce_replication_validation_transaction_fees.md) chapter.
Validator-clients are eligible to receive protocol-based (i.e. via mining pool) rewards issued via stake-based annual interest rates by providing compute (CPU+GPU) resources to validate and vote on a given PoH state. These protocol-based rewards are determined through an algorithmic schedule as a function of the total amount of Solana tokens staked in the system and the duration since network launch (genesis block). Additionally, these clients may earn revenue through two types of transaction fees: state-validation transaction fees and pooled Proof-of-Replication (PoRep) transaction fees. The distribution of these two types of transaction fees to the participating validation set is designed independently, as the economic goals and attack vectors are unique between the state-generation/validation mechanism and the ledger replication/validation mechanism. For clarity, we separately describe the design and motivation of the three types of potential revenue streams for validation-clients below: state-validation protocol-based rewards, state-validation transaction fees and PoRep-validation transaction fees.
As previously mentioned, validator-clients will also be responsible for validating PoReps submitted into the PoH stream by replicator-clients. In this case, validators are providing compute (CPU/GPU) and light storage resources to confirm that these replication proofs could only be generated by a client that is storing the referenced PoH ledger block<sup>2</sup>.
While replication-clients are incentivized and rewarded through a protocol-based rewards schedule (see [Replication-client Economics](ed_replication_client_economics.md)), validator-clients will be incentivized to include and validate PoReps in PoH through the distribution of the transaction fees associated with the submitted PoReps. As described in detail in Section 3.1, replication-client rewards are protocol-based and designed to reward based on a global data redundancy factor. I.e. the protocol will incentivize replication-client participation through rewards based on a target ledger redundancy (e.g. 10x data redundancy). We chose not to distribute these rewards to PoRep validators, and to rely only on the collection of PoRep-attached transaction fees, because the confluence of two participation incentive modes (state-validation inflation rate via global staked % and replication-validation rewards based on a global redundancy factor) on the incentives of a single network participant (a validator-client) would open up a significant incentive-driven attack surface.
The validation of PoReps by validation-clients is computationally more expensive than state-validation (detailed in the [Economic Sustainability](ed_economic_sustainability.md) chapter), thus the transaction fees are expected to be proportionally higher. However, because replication-client rewards are distributed in proportion to, and only after, submitted PoReps are validated, replication-clients are uniquely motivated to have their proofs included and validated. This pressure is expected to generate an adequate market economy between replication-clients and validation-clients. Additionally, transaction fees submitted with PoReps have no minimum amount pre-allocated to the mining pool, unlike state-validation transaction fees.
There are various attack vectors available for colluding validation and replication clients, as described in detail below in [Economic Sustainability](ed_economic_sustainability.md). To protect against various collusion attack vectors, for a given epoch, PoRep transaction fees are pooled and redistributed across participating validation-clients in proportion to the number of validated PoReps in the epoch less the number of invalidated PoReps [DIAGRAM]. This design rewards validators proportional to the number of PoReps they process and validate, while providing negative pressure against validation-clients submitting lazy or malicious invalid votes on submitted PoReps (note that it is computationally prohibitive to determine whether a validator-client has marked a valid PoRep as invalid).
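To make the pooling rule above concrete, here is a minimal sketch of distributing an epoch's pooled PoRep fees in proportion to each validator's validated-minus-invalidated PoRep count. The struct, field names and integer fee arithmetic are hypothetical illustrations, not protocol definitions.

```rust
/// Hypothetical per-validator PoRep tally for one epoch.
struct PoRepTally {
    validator: String, // validator identity (placeholder type)
    validated: u64,    // PoReps this validator validated successfully
    invalidated: u64,  // PoReps this validator voted on that were invalidated
}

/// Split `pooled_fees` across validators in proportion to
/// max(validated - invalidated, 0). Returns (validator, payout) pairs.
fn distribute_porep_fees(pooled_fees: u64, tallies: &[PoRepTally]) -> Vec<(String, u64)> {
    let weight = |t: &PoRepTally| t.validated.saturating_sub(t.invalidated);
    let total: u64 = tallies.iter().map(weight).sum();
    if total == 0 {
        return Vec::new();
    }
    tallies
        .iter()
        .map(|t| (t.validator.clone(), pooled_fees * weight(t) / total))
        .collect()
}
```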
Validator-clients have two functional roles in the Solana network:
* Validate (vote on) the current global state of the PoH along with any Proofs-of-Replication (see [Replication Client Economics](ed_replication_client_economics.md)) that they are eligible to validate
* Be elected as ‘leader’ on a stake-weighted round-robin schedule during which time they are responsible for collecting outstanding transactions and Proofs-of-Replication and incorporating them into the PoH, thus updating the global state of the network and providing chain continuity.
Validator-client rewards for these services are to be distributed at the end of each Solana epoch. Compensation for validator-clients is provided via a protocol-based annual interest rate disbursed in proportion to the stake-weight of each validator (see below), along with leader-claimed transaction fees available during each leader rotation. I.e. during the time a given validator-client is elected as leader, it has the opportunity to keep a portion of each non-PoRep transaction fee, less a protocol-specified amount that is returned to the mining pool (see [Validation-client State Transaction Fees](ed_vce_state_validation_transaction_fees.md)). PoRep transaction fees are not collected directly by the leader client but are pooled and returned to the validator set in proportion to the number of successfully validated PoReps (see [Replication-client Transaction Fees](ed_vce_replication_validation_transaction_fees.md)).
The protocol-based annual interest-rate (%) per epoch to be distributed to validation-clients is to be a function of:
* the current fraction of staked SOL out of the total circulating supply,
* the global time since the genesis block instantiation
* the up-time/participation [% of available slots/blocks that validator had opportunity to vote on?] of a given validator over the previous epoch.
The first two factors are protocol parameters only (i.e. independent of validator behavior in a given epoch) and describe a global validation reward schedule designed to incentivize both early participation and optimal security in the network. This schedule sets a maximum annual validator-client interest rate per epoch.
At any given point in time, this interest rate is pegged to a defined value given a specific % staked SOL out of the circulating supply (e.g. 10% interest rate when 66% of circulating SOL is staked). The interest rate adjusts as the square-root [TBD] of the % staked, leading to higher validation-client interest rates as the % staked drops below the targeted goal, thus incentivizing more participation leading to more security in the network. An example of such a schedule, for a specified point in time (e.g. network launch) is shown in **Table 1**.
**Table 1:** Example interest rate schedule based on % SOL staked out of circulating supply. In this case, interest rates are fixed at 10% for 66% of staked circulating supply
Over time, the interest rate, at any network staked percentage, will drop as described by an algorithmic schedule. Validation-client interest rates are designed to be higher in the early days of the network to incentivize participation and jumpstart the network economy. This mining-pool-provided interest rate will reduce over time until a network-chosen baseline value is reached. This is a fixed, long-term interest rate to be provided to validator-clients. This value does not represent the total interest available to validator-clients, as transaction fees for both state-validation and ledger storage replication (PoReps) are not accounted for here. A validation-client interest rate schedule as a function of % network staked and time is shown in **Figure 2**.
**Figure 2:** In this example schedule, the annual interest rate [%] reduces at around 16.7% per year, until it reaches the long-term, fixed, 4% rate.
This epoch-specific protocol-defined interest rate sets an upper limit on the *protocol-generated* annual interest rate (not the absolute total interest rate) that can be delivered to any validator-client per epoch. The distributed interest rate per epoch is then discounted from this value based on the participation of the validator-client during the previous epoch. Each epoch is comprised of XXX slots. The protocol-defined interest rate is then discounted by the log [TBD] of the % of slots a given validator submitted a vote on a PoH branch during that epoch, see **Figure XX**.
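To make the relationship concrete, the sketch below computes a hypothetical effective per-validator rate from the pieces described above: a base rate pegged at a target staked fraction, a square-root adjustment for the actual staked fraction, and a log-style participation discount. All constants and the exact functional forms are illustrative placeholders (the text itself marks them [TBD]), not protocol parameters.

```rust
/// Illustrative only: a validator's effective protocol-generated annual
/// interest rate for an epoch.
fn effective_rate(
    base_rate: f64,     // e.g. 0.10 (10%) at the target staked fraction
    target_staked: f64, // e.g. 0.66 (66% of circulating supply staked)
    actual_staked: f64, // fraction of circulating supply actually staked
    participation: f64, // fraction of slots the validator voted on last epoch
) -> f64 {
    // Square-root adjustment: the rate rises as staking falls below target.
    let protocol_rate = base_rate * (target_staked / actual_staked).sqrt();
    // Log-style participation discount (placeholder for the [TBD] log discount).
    let discount = 1.0 + participation.clamp(1e-9, 1.0).ln();
    protocol_rate * discount.max(0.0)
}
```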
Each message sent through the network, to be processed by the current leader validation-client and confirmed as a global state transaction, must contain a transaction fee. Transaction fees offer many benefits in the Solana economic design, for example they:
* provide unit compensation to the validator network for the CPU/GPU resources necessary to process the state transaction,
* reduce network spam by introducing real cost to transactions,
* open avenues for a transaction market to incentivize validation-clients to collect and process submitted transactions in their function as leader,
* and provide potential long-term economic stability of the network through a protocol-captured minimum fee amount per transaction, as described below.
Many current blockchain economies (e.g. Bitcoin, Ethereum) rely on protocol-based rewards to support the economy in the short term, with the assumption that the revenue generated through transaction fees will support the economy in the long term, when the protocol-derived rewards expire. In an attempt to create a sustainable economy through protocol-based rewards and transaction fees, a fixed portion of each transaction fee is sent to the mining pool, with the remaining fee going to the current leader processing the transaction. These pooled fees then re-enter the system through rewards distributed to validation-clients, through the process described above, and to replication-clients, as discussed below.
The intent of this design is to retain leader incentive to include as many transactions as possible within the leader-slot time, while providing a redistribution avenue that protects against "tax evasion" attacks (i.e. side-channel fee payments)<sup>[1](ed_referenced.md)</sup>. Constraints on the fixed portion of transaction fees going to the mining pool, to establish long-term economic sustainability, are established and discussed in detail in the [Economic Sustainability](ed_economic_sustainability.md) section.
This minimum, protocol-earmarked, portion of each transaction fee can be dynamically adjusted depending on historical gas usage. In this way, the protocol can use the minimum fee to target a desired hardware utilization. By monitoring protocol-specified gas usage with respect to a desired, target usage amount (e.g. 50% of a block's capacity), the minimum fee can be raised/lowered, which should, in turn, lower/raise the actual gas usage per block until it reaches the target amount. This adjustment process can be thought of as similar to the difficulty adjustment algorithm in the Bitcoin protocol, however in this case it is adjusting the minimum transaction fee to guide the transaction processing hardware usage to a desired level.
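A minimal sketch of such an adjustment loop is shown below; the gain factor and bounds are hypothetical values, not protocol constants.

```rust
/// Illustrative controller: nudge the protocol-captured minimum fee so that
/// observed gas usage per block trends toward a target fraction of capacity.
fn adjust_min_fee(current_min_fee: u64, observed_usage: f64, target_usage: f64) -> u64 {
    let gain = 0.1; // hypothetical damping factor
    // Usage above target raises the fee; usage below target lowers it.
    let error = (observed_usage - target_usage) / target_usage;
    let adjusted = current_min_fee as f64 * (1.0 + gain * error);
    // Keep the fee within sane (hypothetical) bounds.
    adjusted.clamp(1.0, 1_000_000.0).round() as u64
}
```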
Additionally, the minimum protocol-captured fee can be a consideration in fork selection. In the case of a PoH fork with a malicious, censoring leader, we would expect the total protocol-captured fee to be less than that of a comparable honest fork, due to the fees lost from censoring. If the censoring leader is to compensate for these lost protocol fees, they would have to replace the fees on their fork themselves, thus potentially reducing the incentive to censor in the first place.
Running a Solana validation-client requires relatively modest upfront hardware capital investment. **Table 2** provides an example hardware configuration to support ~1M tx/s with estimated ‘off-the-shelf’ costs:
|Component|Example|Estimated Cost|
|--- |--- |--- |
|GPU|2x 2080 Ti|$2500|
|or|4x 1080 Ti|$2800|
|OS/Ledger Storage|Samsung 860 Evo 2TB|$370|
|Accounts storage|2x Samsung 970 Pro M.2 512GB|$340|
|RAM|32 GB|$300|
|Motherboard|AMD x399|$400|
|CPU|AMD Threadripper 2920x|$650|
|Case||$100|
|Power supply|EVGA 1600W|$300|
|Network|> 500 Mbps||
|Network (1)|Google webpass business bay area 1gbps unlimited|$5500/mo|
|Network (2)|Hurricane Electric bay area colo 1gbps|$500/mo|
**Table 2:** Example high-end hardware setup for running a Solana client.
Despite the low barrier to entry as a validation-client from a capital investment perspective, as in any developing economy there will be much opportunity and need for trusted validation services, as evidenced by node reliability, UX/UI, APIs and other software accessibility tools. Additionally, although Solana’s validator node startup costs are nominal when compared to similar networks, they may still be somewhat restrictive for some potential participants. In the spirit of developing a true decentralized, permissionless network, these interested parties still have two options to become involved in the Solana network/economy:
1. Delegation of previously acquired tokens to a reliable validation node to earn a portion of the interest generated
2. Provide local storage space as a replication-client and receive rewards by submitting Proof-of-Replication (see [Replication-client Economics](ed_replication_client_economics.md)).
a. This participant has the additional option to directly delegate their earned storage rewards ([Replication-client Reward Auto-delegation](ed_rce_replication_client_reward_auto_delegation.md))
Delegation of tokens to validation-clients, via option 1, provides a way for passive Solana token holders to become part of the active Solana economy and earn interest rates proportional to the interest rate generated by the delegated validation-client. Additionally, this feature creates a healthy validation-client market, with potential validation-client nodes competing to build reliable, transparent and profitable delegation services.
You can observe the effects of your client's transactions on our [dashboard](https://metrics.solana.com:3000/d/testnet/testnet-hud?orgId=2&from=now-30m&to=now&refresh=5s&var-testnet=testnet)
Solana nodes accept HTTP requests using the [JSON-RPC 2.0](https://www.jsonrpc.org/specification) specification.
To interact with a Solana node inside a JavaScript application, use the [solana-web3.js](https://github.com/solana-labs/solana-web3.js) library, which gives a convenient interface for the RPC methods.
Returns the status of a given signature. This method is similar to
[confirmTransaction](#confirmtransaction) but provides more resolution for error
events.
##### Parameters:
* `string` - Signature of Transaction to confirm, as base-58 encoded string
##### Results:
* `string` - Transaction status:
  * `Confirmed` - Transaction was successful
  * `SignatureNotFound` - Unknown transaction
  * `ProgramRuntimeError` - An error occurred in the program that processed this Transaction
  * `AccountInUse` - Another Transaction held a write lock on one of the Accounts specified in this Transaction. The Transaction may succeed if retried
  * `GenericFailure` - Some other error occurred. **Note**: In the future new Transaction statuses may be added to this list. It's safe to assume that all new statuses will be more specific error conditions that previously presented as `GenericFailure`
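For reference, a request to this endpoint is a standard JSON-RPC 2.0 POST body. The Rust sketch below just constructs and prints that body with `serde_json`; the method name `getSignatureStatus` is an assumption (the method heading is not shown in this excerpt), and the signature string is a made-up placeholder.

```rust
use serde_json::json;

fn main() {
    // Placeholder base-58 transaction signature to query.
    let signature = "5VERv8NMvzbJMEkV8xnrLkEaWRtSz9CosKDYjCJjBRnb...";

    // JSON-RPC 2.0 request body, assuming the method is exposed as `getSignatureStatus`.
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "getSignatureStatus",
        "params": [signature]
    });

    // POST this body to a node's RPC port with any HTTP client.
    println!("{}", request);
}
```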
Initial Proof of Stake (PoS) design ideas (i.e. using the in-protocol asset, SOL,
to provide secure consensus) are outlined here. Solana will implement a proof of
stake reward/security scheme for node validators in the cluster. The purpose is
threefold:
- Align validator incentives with that of the greater cluster through
skin-in-the-game deposits at risk
- Avoid 'nothing at stake' fork voting issues by implementing slashing rules
aimed at promoting fork convergence
- Provide an avenue for validator rewards as a function of validator
  participation in the cluster.
While many of the details of the specific implementation are currently under
consideration and are expected to come into focus through specific modeling
studies and parameter exploration on the Solana testnet, we outline here our
current thinking on the main components of the PoS system. Much of this
thinking is based on the current status of Casper FFG, with optimizations and
specific attributes to be modified as is allowed by Solana's Proof of History
(PoH) blockchain data structure.
### General Overview
Solana's ledger validation design is based on a rotating, stake-weighted leader broadcasting transactions in a PoH data
structure to validating nodes. These nodes, upon receiving the leader's
broadcast, have the opportunity to vote on the current state and PoH height by
signing a transaction into the PoH stream.
To become a Solana validator, a fullnode must deposit/lock-up some amount
of SOL in a contract. This SOL will not be accessible for a specific time
period. The precise duration of the staking lockup period has not been
determined. However, we can consider three phases of this time for which
specific parameters will be necessary:
- *Warm-up period*: in which SOL is deposited and inaccessible to the node,
  but PoH transaction validation has not yet begun. Most likely on the order
  of days to weeks
- *Validation period*: a minimum duration for which the deposited SOL will be
  inaccessible, at risk of slashing (see slashing rules below) and earning
  rewards for validator participation. Likely a duration of months to a
  year.
- *Cool-down period*: a duration of time following the submission of a
'withdrawal' transaction. During this period validation responsibilities have
been removed and the funds continue to be inaccessible. Accumulated rewards
should be delivered at the end of this period, along with the return of the
initial deposit.
Solana's trustless sense of time and ordering provided by its PoH data
structure, along with its
[avalanche](https://www.youtube.com/watch?v=qt_gDRXHrHQ&t=1s) data broadcast
and transmission design, should provide sub-second transaction confirmation times that scale
with the log of the number of nodes in the cluster. This means we shouldn't
have to restrict the number of validating nodes with a prohibitive 'minimum
deposit', and we expect nodes to be able to become validators with nominal amounts
of SOL staked. At the same time, Solana's focus on high throughput should create incentive for validation-clients to provide high-performance and reliable hardware. Combined with a potential minimum network speed threshold to join as a validation-client, we expect a healthy validation delegation market to emerge. To this end, Solana's testnet will lead into a "Tour de SOL" validation-client competition, focusing on throughput and uptime to rank and reward testnet validators.
### Slashing rules
Unlike Proof of Work (PoW) where off-chain capital expenses are already
deployed at the time of block construction/voting, PoS systems require
capital-at-risk to prevent a logical/optimal strategy of multiple chain voting.
We intend to implement slashing rules which, if broken, result in some amount of
the offending validator's deposited stake being removed from circulation. Given
the ordering properties of the PoH data structure, we believe we can simplify
our slashing rules to the level of a voting lockout time assigned per vote.
I.e. each vote has an associated lockout time (PoH duration) that represents a
duration by which any additional vote from that validator must be in a PoH that
contains the original vote, or a portion of that validator's stake is
slashable. This duration time is a function of the initial vote PoH count and
all additional vote PoH counts. It will likely take the form:
Lockout<sub>i</sub>(PoH<sub>i</sub>, PoH<sub>j</sub>) = PoH<sub>j</sub> + K *
exp((PoH<sub>j</sub> - PoH<sub>i</sub>) / K)
Where PoH<sub>i</sub> is the height of the vote that the lockout is to be
applied to and PoH<sub>j</sub> is the height of the current vote on the same
fork. If the validator submits a vote on a different PoH fork on any
PoH<sub>k</sub> where k > j > i and PoH<sub>k</sub> < Lockout(PoH<sub>i</sub>,
PoH<sub>j</sub>), then a portion of that validator's stake is at risk of being
slashed.
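A direct transcription of this rule into code might look like the following sketch; `K` is the protocol constant from the formula above and is left as a parameter here since the text does not fix its value.

```rust
/// Lockout(PoH_i, PoH_j) = PoH_j + K * exp((PoH_j - PoH_i) / K)
/// `poh_i` is the height of the vote the lockout applies to and `poh_j` is
/// the height of the latest vote on the same fork.
fn lockout(poh_i: u64, poh_j: u64, k: f64) -> f64 {
    poh_j as f64 + k * (((poh_j - poh_i) as f64) / k).exp()
}

/// A later vote at height `poh_k` on a *different* fork is slashable if it
/// lands before the lockout expires.
fn is_slashable(poh_i: u64, poh_j: u64, poh_k: u64, k: f64) -> bool {
    poh_k > poh_j && (poh_k as f64) < lockout(poh_i, poh_j, k)
}
```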
In addition to the functional form lockout described above, an early
implementation may use a numerical approximation based on a First In, First Out
(FIFO) data structure and the following logic (a sketch follows this list):
- FIFO queue holding 32 votes per active validator
- new votes are pushed on top of queue (`push_front`)
- expired votes are popped off top (`pop_front`)
- as votes are pushed into the queue, the lockout of each queued vote doubles
- votes are removed from back of queue if `queue.len() > 32`
- the earliest and latest height that has been removed from the back of the
queue should be stored
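A minimal sketch of this FIFO approximation, using a `VecDeque` with the newest vote at the front; the expiry test, the unit of the lockout, and the rooted-range bookkeeping are illustrative choices, not a specification.

```rust
use std::collections::VecDeque;

const MAX_VOTES: usize = 32;

#[derive(Clone, Copy)]
struct Vote {
    height: u64,  // PoH height the vote refers to
    lockout: u64, // current lockout, in PoH height units (illustrative)
}

#[derive(Default)]
struct VoteQueue {
    votes: VecDeque<Vote>,
    // earliest and latest heights removed from the back of the queue
    rooted_range: Option<(u64, u64)>,
}

impl VoteQueue {
    fn push_vote(&mut self, height: u64) {
        // Expired votes are popped off the top (front): the newest votes have
        // the smallest lockouts, so they expire first.
        while let Some(&front) = self.votes.front() {
            if front.height + front.lockout < height {
                self.votes.pop_front();
            } else {
                break;
            }
        }
        // As a new vote is pushed, the lockout of each queued vote doubles.
        for v in self.votes.iter_mut() {
            v.lockout = v.lockout.saturating_mul(2);
        }
        self.votes.push_front(Vote { height, lockout: 1 });
        // Votes falling off the back are effectively rooted; remember the range.
        while self.votes.len() > MAX_VOTES {
            if let Some(back) = self.votes.pop_back() {
                self.rooted_range = Some(match self.rooted_range {
                    Some((earliest, _)) => (earliest.min(back.height), back.height),
                    None => (back.height, back.height),
                });
            }
        }
    }
}
```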
It is likely that a reward will be offered as a % of the slashed amount to any
node that submits proof of this slashing condition being violated to the PoH.
#### Partial Slashing
In the schema described so far, when a validator votes on a given PoH stream,
they are committing themselves to that fork for a time determined by the vote
lockout. An open question is whether validators will be hesitant to begin
voting on an available fork if the penalties are perceived too harsh for an
honest mistake or flipped bit.
One way to address this concern would be a partial slashing design that results
in a slashable amount as a function of either:
1. the fraction of validators, out of the total validator pool, that were also
slashed during the same time period (à la Casper)
2. the amount of time since the vote was cast (e.g. a linearly increasing % of
total deposited as slashable amount over time), or both.
This is an area currently under exploration.
### Penalties
As discussed in the [Economic Design](ed_overview.md) section, annual validator interest rates are to be specified as a
function of total percentage of circulating supply that has been staked. The cluster rewards validators who are online
and actively participating in the validation process throughout the entirety of
their *validation period*. For validators that go offline/fail to validate
transactions during this period, their annual reward is effectively reduced.
Similarly, we may consider an algorithmic reduction in a validator's active
staked amount in the case that they are offline. I.e. if a validator is
inactive for some amount of time, either due to a partition or otherwise, the
amount of their stake that is considered ‘active’ (eligible to earn rewards)
may be reduced. This design would be structured to help long-lived partitions
to eventually reach finality on their respective chains as the % of non-voting
total stake is reduced over time until a super-majority can be achieved by the
active validators in each partition. Similarly, upon re-engaging, the ‘active’
amount staked will come back online at some defined rate. Different rates of
stake reduction may be considered depending on the size of the partition/active set.
This design describes additional vote signing behavior that will make the
process more secure.
Currently, Solana implements a vote-signing service that evaluates each vote to
ensure it does not violate a slashing condition. The service could potentially
have different variations, depending on the hardware platform capabilities. In
particular, it could be used in conjunction with a secure enclave (such as SGX).
The enclave could generate an asymmetric key, exposing an API for user
(untrusted) code to sign the vote transactions, while keeping the vote-signing
private key in its protected memory.
The following sections outline how this architecture would work:
## Message Flow
1. The node initializes the enclave at startup
* The enclave generates an asymmetric key and returns the public key to the
node
* The keypair is ephemeral. A new keypair is generated on node bootup. A
new keypair might also be generated at runtime based on some TBD
criteria.
* The enclave returns its attestation report to the node
2. The node performs attestation of the enclave (e.g. using Intel's IAS APIs)
* The node ensures that the Secure Enclave is running on a TPM and is
signed by a trusted party
3. The stakeholder of the node grants ephemeral key permission to use its stake.
This process is TBD.
4. The node's untrusted, non-enclave software calls trusted enclave software
using its interface to sign transactions and other data.
* In case of vote signing, the node needs to verify the PoH. The PoH
verification is an integral part of signing. The enclave would be
presented with some verifiable data to check before signing the vote.
* The process of generating the verifiable data in untrusted space is TBD
## PoH Verification
1. When the node votes on an entry `X`, there's a lockout period `N`, for
which it cannot vote on a fork that does not contain `X` in its history.
2. Every time the node votes on the derivative of `X`, say `X+y`, the lockout
period for `X` increases by a factor `F` (i.e. the duration for which the node
cannot vote on a fork that does not contain `X` increases).
* The lockout period for `X+y` is still `N` until the node votes again.
3. The lockout period increment is capped (e.g. factor `F` applies maximum 32
times).
4. The signing enclave must not sign a vote that violates this policy (a sketch
of this check follows the list). This means
* Enclave is initialized with `N`, `F` and `Factor cap`
* Enclave stores `Factor cap` number of entry IDs on which the node had
previously voted
* The sign request contains the entry ID for the new vote
* Enclave verifies that new vote's entry ID is on the correct fork
(following the rules #1 and #2 above)
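A simplified sketch of the state such an enclave could keep and the check it could perform before signing; the field names, the parent-entry check standing in for "the correct fork", and the lockout formula are illustrative, since the real interface is still TBD.

```rust
/// Illustrative enclave-side vote policy state.
struct VotePolicy {
    base_lockout: u64,            // `N` in the text
    factor: u64,                  // `F` in the text
    factor_cap: u32,              // maximum number of times `F` is applied
    voted_entries: Vec<[u8; 32]>, // last `factor_cap` entry IDs voted on
}

impl VotePolicy {
    /// Current lockout for the oldest tracked vote: N * F^(stacked votes),
    /// with `F` applied at most `factor_cap` times.
    fn current_lockout(&self) -> u64 {
        let applications = self.voted_entries.len().min(self.factor_cap as usize) as u32;
        self.base_lockout.saturating_mul(self.factor.saturating_pow(applications))
    }

    /// Approve a vote only if it extends the fork the enclave has been voting
    /// on; here the fork check is reduced to matching the previous entry ID.
    fn approve_vote(&mut self, new_entry: [u8; 32], claimed_parent: [u8; 32]) -> bool {
        if let Some(last) = self.voted_entries.last() {
            if *last != claimed_parent {
                return false; // signing this vote would violate the policy
            }
        }
        self.voted_entries.push(new_entry);
        if self.voted_entries.len() > self.factor_cap as usize {
            self.voted_entries.remove(0);
        }
        true
    }
}
```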
## Ancestor Verification
This is an alternate, albeit less certain, approach to verifying the voting fork.
1. The validator maintains an active set of nodes in the cluster
2. It observes the votes from the active set in the last voting period
3. It stores the ancestor/last_tick at which each node voted
4. It sends a new vote request to the vote-signing service
* It includes previous votes from nodes in the active set, and their
corresponding ancestors
5. The signer checks whether the previous votes contain a vote from the validator,
and whether the vote's ancestor matches that of the majority of the nodes (see the sketch below)
* It signs the new vote if the check is successful
* It asserts (raises an alarm of some sort) if the check is unsuccessful
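A sketch of the signer-side check in step 5, assuming previous votes arrive as (node identity, ancestor) pairs; the types and the simple majority rule are illustrative.

```rust
/// Illustrative signer check: the request must include the validator's own
/// previous vote, and that vote's ancestor must match the ancestor reported
/// by a majority of the active set.
fn check_previous_votes(validator: &str, previous_votes: &[(String, u64)]) -> bool {
    let our_ancestor = match previous_votes.iter().find(|(node, _)| node.as_str() == validator) {
        Some((_, ancestor)) => *ancestor,
        None => return false, // our last vote is missing: raise an alarm upstream
    };
    let matching = previous_votes
        .iter()
        .filter(|(_, ancestor)| *ancestor == our_ancestor)
        .count();
    // Sign only if a majority of observed nodes voted on the same ancestor.
    matching * 2 > previous_votes.len()
}
```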
The premise is that the validator can be spoofed at most once to vote on
incorrect data. If someone hijacks the validator and submits a vote request for
bogus data, that vote will not be included in the PoH (as it'll be rejected by
the cluster). The next time the validator sends a request to sign a vote, the
signing service will detect that the validator's last vote is missing (as part of
step 5 above).
## Fork determination
Because the enclave cannot process PoH, it has no direct knowledge of the
fork history of a submitted validator vote. Each enclave should be initiated
with the current *active set* of public keys. A validator should submit its
current vote along with the votes of the active set (including itself) that it
observed in the slot of its previous vote. In this way, the enclave can surmise
the votes accompanying the validator's previous vote and thus the fork being
voted on. This is not possible for the validator's initial submitted vote, as
it will not have a 'previous' slot to reference. To account for this, a short
voting freeze should apply until the second vote is submitted containing the
votes within the active set, along with its own vote, at the height of the
initial vote.
## Enclave configuration
A staking client should be configurable to prevent voting on inactive forks.
This mechanism should use the client's known active set `N_active` along with a
threshold vote `N_vote` and a threshold depth `N_depth` to determine whether or
not to continue voting on a submitted fork. This configuration should take the
form of a rule such that the client will only vote on a fork if it observes
more than `N_vote` votes at depth `N_depth`. Practically, this means the client confirms
that it has observed some probability of economic finality of the
submitted fork at a depth where an additional vote would create a lockout for
an undesirable amount of time if that fork turns out not to be live.
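A minimal sketch of this rule, assuming the client can already count how many active-set validators it has observed voting on a candidate fork at or beyond a given depth (names mirror the `N_*` parameters above; everything else is a placeholder):

```rust
/// Illustrative staking-client check: vote on a fork only if more than
/// `n_vote` of the `n_active` known validators have been observed voting on
/// it at depth `n_depth` or deeper.
fn should_vote_on_fork(
    observed_votes: usize, // active-set votes seen on this fork at >= n_depth
    n_active: usize,
    n_vote: usize,
    n_depth: u64,
    fork_depth: u64, // depth of the fork tip relative to the client's last vote
) -> bool {
    debug_assert!(n_vote <= n_active);
    fork_depth >= n_depth && observed_votes > n_vote
}
```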
## Challenges
1. Generation of verifiable data in untrusted space for PoH verification in the
enclave.
2. Need infrastructure for granting stake to an ephemeral key.