* Multi-version snapshot support
* rustfmt
* Remove CLI options and runtime support for selection output snapshot version.
Address some clippy complaints.
* Muzzle clippy type complexity warning.
Despite clippy's suggestion, it is not currently possible to create type aliases
for traits and so everything within the 'Box<...>' cannot be type aliased.
This then leaves creating full blown traits, and either implementing
said traits by closure (somehow) or moving the closures into new structs
implementing said traits which seems a bit of a palaver.
Alternatively it is possible to define and use the type alias 'type ResultBox<T> = Result<Box<T>>'
which does seems rather pointless and not a great reduction in complexity but is enough to keep clippy happy.
In the end I simply went with squelching the clippy warning.
* Remove now unused Serialize/Deserialize trait implementations for AccountStorageEntry and AppendVec
* refactor versioned de/serialisers
* rename serde_utils to serde_snapshot
* move call to accounts_db.generate_index() back down to context_accountsdb_from_stream()
* update version 1.1.1 to 1.2.0
remove nested use of serialize_bytes
* cleanups
* Add back measurement of account storage entry serialization.
Remove construction of Vec and HashMap temporaries during serialization.
* consolidate serialisation test cases into serde_snapshot.
clean up leakage of implementation details in serde_snapshot.
* move short term / legacy snapshot code into child module
* add serialize_iter_as_tuple
* preliminary integration of following commit
commit 6d58b73c47294bfb93465d5a83cd2175660b6e6d
Author: Ryo Onodera <ryoqun@gmail.com>
Date: Wed May 20 14:02:02 2020 +0900
Confine snapshot 1.1 relic to versioned codepath
* refactored serde_snapshot, rustfmt
legacy accounts_db format now "owns" both leading u64s, legacy bank_rc format has none
* reduce type complexity (clippy)
* Fixup commitment-aggregation metric
* Trigger notifications after commitment-cache update
* Fixup fn name
* Add single-confirmation commitment level
* Rename to highest_confirmed_slot
* Pass commitment-cache info directly to notifications
* Use match
* Update commitment docs
* Update out of date pubsub docs
* Don't discard transaction records
Not enough certainty, because only half the blockhashes
are returned via the RPC call. Even if all 300, there's
still the possiblity other validators are behind, and
a superamajority will vote on on a block that includes
the transaction.
* fmt
* lamports->SOL in user-facing error msg
* Check for sufficient balance for spend and fee
* Add ALL option to solana transfer
* Rework TransferAmount to check for sign_only in parse
* Refactor TransferAmount & fee-check handling to be more general
* Add addl checks mechanism
* Move checks out of cli.rs
* Rename to SpendAmount to be more general & move
* Impl ALL/spend helpers for create-nonce-account
* Impl spend helpers for create-vote-account
* Impl ALL/spend helpers for create-stake-account
* Impl spend helpers for ping
* Impl ALL/spend helpers for pay
* Impl spend helpers for validator-info
* Remove unused fns
* Remove retry_get_balance
* Add a couple unit tests
* Rework send_util fn signatures
* Initial commit
* Execute transfers
* Refactor for testing
* Cleanup readme
* Rewrite
* Cleanup
* Cleanup
* Cleanup client
* Use a Null Client to move prints closer to where messages are sent
* Upgrade Solana
* Move core functionality into its own module
* Handle transaction errors
* Merge allocations
* Fixes
* Cleanup readme
* Fix markdown
* Add example input
* Add integration test - currently fails
* Add integration test
* Add metrics
* Use RpcClient in dry-run, just don't send messages
* More metrics
* Fix dry run with no keys
* Only require one approval if fee-payer is the sender keypair
* Fix bugs
* Don't create the transaction log if nothing to put into it;
otherwise the next innvocation won't add the header
* Apply previous transactions to allocations with matching recipients
* Bail out of any account already has a balance
* Polish
* Add new 'balances' command
* 9 decimal places
* Add missing file
* Better dry-run; keypair options now optional
* Change field name from 'bid' to 'accepted'
Also, tolerate precision change from 2 decimal places to 4
* Write to transaction log immediately
* Rename allocations_csv to bids_csv
So that we can bypass bids_csv with an allocations CSV file
* Upgrade Solana
* Remove faucet from integration test
* Cleaner integration test
Won't work until this lands and is released:
https://github.com/solana-labs/solana/pull/9717
* Update README
* Add TravicCI script to build and test (#1)
* Add distribute-stake command (#2)
* Distribute -> DistributeTokens (#3)
* Cache cargo deps (#4)
* Add docs (#5)
* Switch to latest Solana 1.1 release (#7)
* distribute -> distribute-tokens (#9)
* Switch from CSV to a pickledb database (#8)
* Switch from CSV to a pickledb database
* Allow PickleDb errors to bubble up
* Dedup
* Hoist db
* Add finalized field to TransactionInfo
* Don't allow RPC client to resign transactions
* Remove dead code
* Use transport::Result
* Record unconfirmed transaction
* Fix: separate stake account per allocation
* Catch transport errors
* Panic if we attempt to replay a transaction that hasn't been finalized
* Attempt to fix CI
PickleDb isn't calling flush() or close() after writing to files.
No issue on MacOS, but looks racy in CI.
* Revert "Attempt to fix CI"
This reverts commit 1632394f636c54402b3578120e8817dd1660e19b.
* Poll for signature before returning
* Add --sol-for-fees option for stake distributions
* Add --allocations-csv option (#14)
* Add allocations-csv option
* Add tests or GTFO
* Apply review feedback
* apply feedback
* Add read_allocations function
* Update arg_parser.rs
* Fix balances command (#17)
* Fix balances command
* Fix readme
* Add --force to transfer to non-empty accounts (#18)
* Add --no-wait (#16)
* Add ThinClient methods to implement --no-wait
* Plumb --no-wait through
No tests yet
* Check transaction status on startup
* Easier to test
* Wait until transaction is finalized before checking if it failed with an error
It's possible that a minority fork thinks it failed.
* Add unit tests
* Remove dead code and rustfmt
* Don't flush database to file if doing a dry-run
* Continue when transactions not yet finalized (#20)
If those transactions are dropped, the next run will execute them.
* Return the number of confirmations (#21)
* Add read_allocations() unit-test (#22)
Delete the copy-pasted top-level test.
Fixes#19
* Add a CSV printer (#23)
* Remove all the copypasta (#24)
* Move resolve_distribute_stake_args into its own function
* Add stake args to token args
* Unify option names
* Move Command::DistributeStake into DistributeTokens
* Remove process_distribute_stake
* Only unique signers
* Use sender keypair to fund new fee-payer accounts
* Unify distribute_tokens and distribute_stake
* Rename print-database command to transaction-log (#25)
* Send all transactions as quickly as possible, then wait (#26)
* Send all transactions as quickly as possible, then wait
* Exit when finalized or blockhashes have expired
* Don't need blockhash in the CSV output
* Better types
CSV library was choking on Pubkey as a type. PickleDb doesn't have that problem.
* Resend if blockhash has not expired
* Attempt to fix CI
* Move log to stderr
* Add constructor, tuck away client (#30)
* Add constructor, tuck away client
* Fix unwrap() caught by CI
* Fix optional option flagged as required
* Bunch of cleanup (#31)
* Remove untested --no-wait feature
* Make --transactions-db an option, not an arg
So that in the future, we can make it optional
* Remove more untested features
Too many false positives in that santity check. Use --dry-run
instead.
* Add dry-run mode to ThinClient
* Cleaner dry-run
* Make key parameters required
Just don't use them in --dry-run
* Add option to write the transaction log
--dry-run doesn't write to the database. Use this option if you
want a copy of the transaction log before the final run.
* Revert --transaction-log addition
Implement #27 first
* Fix CI
* Update readme
* Fix CI in copypasta
* Sort transaction log by finalized date (#33)
* Make --transaction-db option implicit (#34)
* Move db functionality into its own module (#35)
* Move db functionality into its own module
* Rename tokens module to commands
* Version bump
* Upgrade Solana
* Add solana-tokens to build
* Remove Cargo.lock
* Remove vscode file
* Remove TravisCI build script
* Install solana-tokens
Co-authored-by: Dan Albert <dan@solana.com>
* Switch AccountsIndex.account_maps from HashMap to BTreeMap
* Introduce eager rent collection
* Start to add tests
* Avoid too short eager rent collection cycles
* Add more tests
* Add more tests...
* Refacotr!!!!!!
* Refactoring follow up
* More tiny cleanups
* Don't rewrite 0-lamport accounts to be deterministic
* Refactor a bit
* Do hard fork, restore tests, and perf. mitigation
* Fix build...
* Refactor and add switch over for testnet (TdS)
* Use to_be_bytes
* cleanup
* More tiny cleanup
* Rebase cleanup
* Set Bank::genesis_hash when resuming from snapshot
* Reorder fns and clean ups
* Better naming and commenting
* Yet more naming clarifications
* Make prefix width strictly uniform for 2-base partition_count
* Fix typo...
* Revert cluster-dependent gate
* kick ci?
* kick ci?
* kick ci?
* Account for descendants < root not existing in BankForks, purge ancestors/descendants map for consistency with BankForks and progress map
Co-authored-by: Carl <carl@solana.com>
* Switch subscriptions to use commitment instead of confirmations
* Add bank method to return account and last-modified slot
* Add last_modified_slot to subscription data and use to filter account subscriptions
* Update tests to non-zero last_notified_slot
* Add accounts subscriptions to test; fails at higher tx load
* Pass BankForks to RpcSubscriptions
* Use BankForks on add_account_subscription to properly initialize last_notified_slot
* Bundle subscriptions
* Check for non-equality
* Use commitment to initialize last_notified_slot; revert context.slot chage
* Focus bench on squash
Squash performance does not depend on adding a small number accounts. It mainly depends on total number of accounts that need to be hashed during freeze operation. New bank means a clone of accounts db, so we don't get previous errors in log.
* Fix fmt and add slot counter
Indexing into accounts array does not match account_keys otherwise.
Also enforce program accounts not at index 0
Enforce at least 1 Read-write signing fee-payer account.
* Revert "Untar is called for shred archives that do not exist. (#9565)"
This reverts commit 729cb5eec6.
* Revert "Dont insert shred payload into rocksdb (#9366)"
This reverts commit 5ed39de8c5.
* Add largest_confirmed_root to BlockCommitmentCache
* clippy
* Add blockstore to BlockCommitmentCache to check root
* Add rooted_stake helper fn and test
* Nodes that are behind should correctly id confirmed roots
* Simplify rooted_stake collector
* Experiment to backpropagate Cargo.lock updates to all lock files
* Move most of dependabot-specific code to its own file
* Various cleanups
* Fine tune..
* Clean up shells and stop obscure API...
* Clean up node setup scripts for new CI boxes
* Move files under ci directory
* Set CUDA env var to setup cuda drivers
* Fixup and add README
* shellcheck
* Apply review feedback, rename dir and setup files
Co-authored-by: publish-docs.sh <maintainers@solana.com>
* Add filesystem wallet page
* Move validator paper wallet instructions to validator page
* Remove paper wallet staking section
* Add steps for multiple fs and paper wallets
* Add keypair convention page and better multi-wallet example
* Introduce background AppendVec shrink mechanism
* Support ledger tool
* Clean up
* save
* save
* Fix CI
* More clean up
* Add tests
* Clean up yet more
* Use account.hash...
* Fix typo....
* Add comment
* Rename accounts_cleanup_service
* Fix purging happening every slot when cleanup service is not started at slot 0
* Purge by shred count instead of slots since slots can have variable
number of shreds
* Add runtime methods to simply get status and slot
* Add helper function to get slot confirmation_count from BlockCommitmentCache
* Return cluster confirmations in getSignatureStatus
* Remove use of invalid get_signature_confirmation_status
* Remove unused methods
* Update pubsub to use cluster confirmations
* Fix test_check_signature_subscribe failure
* Refactor confirmations to read commitment cache only once
* Review comments
* Use bank, root from BlockCommitmentCache
* Update docs
* Add metric for block-commitment aggregations
Co-authored-by: Justin Starry <justin@solana.com>
* Separate client types into own crate, so ledger does not need it
Removes about 50 crates of dependency from ledger
* Drop Rpc name from transaction-status types
* CLI: Fix `create-nonce-account --seed ...`
* CLI: Add test another for `create-nonce-account --seed...`
Explicitly demonstrates a partner workflow with the following
requirements:
1) Nonce account address derived from an offline nonce
authority address
2) Fully online account creation
3) Account creation in a single signing session
* alphabetize
* SDK: Add `NullSigner` implementation
* SDK: Split `Transaction::verify()` to gain access to results
* CLI: Minor refactor of --sign_only result parsing
* CLI: Enable paritial signing
Signers specified by pubkey, but without a matching --signer arg
supplied fall back to a `NullSigner` when --sign-only is in effect.
This allows their pubkey to be used for TX construction as usual,
but leaves their `sign_message()` a NOP. As such, with --sign-only
in effect, signing and verification must be done separately, with
the latter's per-signature results considered
* CLI: Surface/report missing/bad signers to user
* CLI: Suppress --sign-only JSON output
* nits
* Docs for multi-session offline signing
* Accounts hash consistency halting
* Add option to inject account hash faults for testing.
Enable option in local cluster test to see that node halts.
* feat: cli command for vote account withdraw
* Rework names
* Update to flexible signer, and make consistent with other cli apis
* Add integration test
* Clean up default help msg
Co-authored-by: Michael Vines <mvines@gmail.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* Add failing test
* Fix create-address-with-seed regression
* Add apis to enable generating a pubkey from all various signers
* Enable other signers as --from in create-with-seed
* Copy current state version to v0
* Add `FeeCalculator` to nonce state
* fixup compile
* Dump v0 handling...
Since we new account data is all zeros, new `Current` versioned accounts
look like v0. We could hack around this with some data size checks, but
the `account_utils::*State` traits are applied to `Account`, not the
state data, so we're kind SOL...
* Create more representative test `RecentBlockhashes`
* Improve CLI nonce account display
Co-Authored-By: Michael Vines <mvines@gmail.com>
* Fix that last bank test...
* clippy/fmt
Co-authored-by: Michael Vines <mvines@gmail.com>
* watchtower: send error message as notification
* watchtower: send all clear notification when ok again
* watchtower: add twilio sms notifications
* watchtower: flag to suppress duplicate notifications
* remove trailing space character
* changes as per suggestion on PR
* all changes together
* cargo fmt
* Do periodic inbound compaction for rooted slots
* Add comment
* nits
* Consider not_compacted_roots in cleanup_dead_slot
* Renames in AccountsIndex
* Rename to reflect expansion of removed accounts
* Fix a comment
* rename
* Parallelize clean over AccountsIndex
* Some niceties
* Reduce locks and real chunked parallelism
* Measure each step for sampling opportunities
* Just noticed par iter is maybe lazy
* Replace storage scan with optimized index scan
* Various clean-ups
* Clear uncleared_roots even if no updates
* SDK: Split new `FeeRateGovernor` out of `FeeCalculator`
Leaving `FeeCalculator` to *only* calculate transaction fees
* Replace `FeeCalculator` with `FeeRateGovernor` as appropriate
* Expose recent `FeeRateGovernor` to clients
* Move `burn()` back into `FeeCalculator`
Appease BPF tests
* Revert "Move `burn()` back into `FeeCalculator`"
This reverts commit f3035624307196722b62ff8b74c12cfcc13b1941.
* Adjust BPF `Fee` sysvar test to reflect removal of `burn()` from `FeeCalculator`
* Make `FeeRateGovernor`'s `lamports_per_signature` private
* rebase artifacts
* fmt
* Drop 'Recent'
* Drop _with_commitment variant
* Use a more portable integer for `target_signatures_per_slot`
* Add docs for `getReeRateCalculator` JSON RPC method
* Don't return `lamports_per_signature` in `getFeeRateGovernor` JSONRPC reply
* Rename (keypair util is not a thing)
* Add method to generate_unique_signers
* Cli: refactor signer handling and remote-wallet init
* Fixup unit tests
* Fixup intergation tests
* Update keypair path print statement
* Remove &None
* Use deterministic key in test
* Retain storage-account as index
* Make signer index-handling less brittle
* Cache pubkey on RemoteKeypair::new
* Make signer_of consistent + return pubkey
* Remove &matches double references
* Nonce authorities need special handling
* Add keypair_util_from_path helper
* Cli: impl config.keypair as a trait object
* SDK: Add Debug and PartialEq for dyn Signer
* ClapUtils: Arg parsing from pubkey+signers to Presigner
* Impl Signers for &dyn Signer collections
* CLI: Add helper for getting signers from args
* CLI: Replace SigningAuthority with Signer trait-objs
* CLI: Drop disused signers command field
* CLI: Drop redundant tests
* Add clap validator that handles all current signer types
* clap_utils: Factor Presigner resolution to helper
* SDK: `From` for boxing Signer implementors to trait objects
* SDK: Derive `Clone` for `Presigner`
* Remove panic
* Cli: dedup signers in transfer for remote-wallet ergonomics
* Update docs vis-a-vis ASK changes
* Cli: update transaction types to use new dynamic-signer methods
* CLI: Fix tests No. 1
what to do about write_keypair outstanding
* Work around `CliConfig`'s signer not necessarily being a `Keypair`
* CLI: Fix tests No. 2
* Remove unused arg
* Remove unused methods
* Move offline arg constants upstream
* Make cli signing fallible
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update epoch slots to include all missing slots
* new test for compress/decompress
* address review comments
* limit cache based on size, instead of comparing roots
* Add fallible methods to KeypairUtil
* Add RemoteKeypair struct and impl KeypairUtil
* Implement RemoteKeypair in keygen; also add parse_keypair_path for cleanup
* Show insufficient purge_zero_lamport_account logic
* Add another pass to detect non-deleted values and increment the count
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
* Fixup sign_transaction; pass derivation_path by reference
* Pass total message length as BE u16
* Remove live integration tests (to ledger-app-solana)
* CLI: Don't sanity-check stake account when offline
* Add test helper returning vote pubkey with validator
* Delegate to the BSL. No need to force
* Be sure our offline ops are truly offline
* Specify our authorities correctly
* checks
* Add CrdsValue timeout checks on Pull Responses
* Allow older values to enter Crds as long as a ContactInfo exists
* Allow staked contact infos to be inserted into crds if they haven't expired
* Try and handle oveflows
* Fix test
* Some comments
* Fix compile
* fix test deadlock
* Add a test for processing timed out values received via pull response
When a new root is created, the oldest slot is popped off
but when the logic checks for identical slots, it assumes
that any difference means a slot was popped off the front.
* Use solana-cli config keypair in solana-keygen
* s/infile/keypair for consistency across modules and more generality across access methods
* Move config into separate crate
* Consensus fix, don't consider threshold check if
lockouts are not increased
* Change partition tests to wait for epoch with > lockout slots
* Use atomic bool to signal partition
* Verb-noun-ify Nonce API
* Unify instruction naming with API naming
The more verbose nonce_account/NonceAccount was chosen for clarity
that these instructions work on a unique species of system account
* Rename bootstrap leader to bootstrap validator
It's a normal validator as soon as other validators enter the
leader schedule.
* cargo fmt
* Fix build
Thanks @CriesofCarrots!
* Split timestamp calculation into separate fn for math unit testing
* Add failing test
* Fix failing test; also bump stakes to near expected cluster max supply
* Don't error on timestamp of slot 0
* Spy just for RPC to avoid premature supermajority
* Make gossip_content_info private
Co-Authored-By: Michael Vines <mvines@gmail.com>
* Fix misindent...
Co-authored-by: Michael Vines <mvines@gmail.com>
* Consolidate entry tick verification into one function
* Mark bad slots as dead in blocktree processor
* more feedback
* Add bank.is_complete
* feedback
* Bank: Return nonce pubkey/account from `check_tx_durable_nonce`
* Forward account with HashAgeKind::DurableNonce
* Add durable nonce helper for HashAgeKind
* Add nonce util for advancing stored nonce in runtime
* Advance nonce in runtime
* Store rolled back nonce account on TX InstructionError
* nonce: Add test for replayed InstErr fee theft
* save limit deserialize
* save
* Save
* Clean up
* rustfmt
* rustfmt
* Just comment out to please CI
* Fix ci...
* Move code
* Rustfmt
* Crean up control flow
* Add another comment
* Introduce predetermined constant limit on snapshot data files (deserialize side)
* Introduce predetermined constant limit on snapshot data files (serialize side)
* rustfmt
* Tweak message
* Revert dynamic memory limit
* Limit size of snapshot data file (de)serialization
* Fix test breakage
* Clean up
* Fix uses formatting
* Rename: deserialize_{for,from}_snapshot
* Simplify comment
* Use Slot
* Provide slot for status cache
* Align variable name with snapshot_status_cache_file_path
* Define serialize_snapshot_data_file_with_metrics
* Fix build.......
* De-marco serialize_snapshot_data_file_with_metrics
* Revert u64 => Slot
* Propose Solana ABI management
* Mention fuzz testing
* Address minor review comments
* Remove versioning and unit tests
* Rename
* Clean up a bit
* Pass through Grammarly
* Yet more tweaks...
* Check append vec file size
* Don't use panic
* Clean up a bit
* Clean up
* Clean ups
* Change assertion into sanization check
* Remove...
* Clean up
* More clean up
* More clean up
* Use assert_matches
This reverts commit a217920561.
This commit is causing trouble when the TdS cluster is reset and
validators running an older genesis config are still present.
Occasionally an RPC URL from an older validator will be selected,
causing a new node to fail to boot.
The blocksteamer instance is the TdS cluster entrypoint. Running an
additional solana-gossip node allows other participants to join a
cluster even if the validator node on the blocksteamer instance goes down.
* Stabilize fn coverage by pruning all updated files
* Pruning didn't work; Switch to clean room dir
* Oh, shellcheck...
* Remove the data_dir variable
* Comment about relationale for find + while read
Solana™ is a new blockchain architecture built from the ground up for scale. The architecture supports
up to 710 thousand transactions per second on a gigabit network.
Disclaimer
===
All claims, content, designs, algorithms, estimates, roadmaps, specifications, and performance measurements described in this project are done with the author's best effort. It is up to the reader to check and validate their accuracy and truthfulness. Furthermore nothing in this project constitutes a solicitation for investment.
Introduction
===
It's possible for a centralized database to process 710,000 transactions per second on a standard gigabit network if the transactions are, on average, no more than 176 bytes. A centralized database can also replicate itself and maintain high availability without significantly compromising that transaction rate using the distributed system technique known as Optimistic Concurrency Control [\[H.T.Kung, J.T.Robinson (1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At Solana, we're demonstrating that these same theoretical limits apply just as well to blockchain on an adversarial network. The key ingredient? Finding a way to share time when nodes can't trust one-another. Once nodes can trust time, suddenly ~40 years of distributed systems research becomes applicable to blockchain!
> Perhaps the most striking difference between algorithms obtained by our method and ones based upon timeout is that using timeout produces a traditional distributed algorithm in which the processes operate asynchronously, while our method produces a globally synchronous one in which every process does the same thing at (approximately) the same time. Our method seems to contradict the whole purpose of distributed processing, which is to permit different processes to operate independently and perform different functions. However, if a distributed system is really a single system, then the processes must be synchronized in some way. Conceptually, the easiest way to synchronize processes is to get them all to do the same thing at the same time. Therefore, our method is used to implement a kernel that performs the necessary synchronization--for example, making sure that two different processes do not try to modify a file at the same time. Processes might spend only a small fraction of their time executing the synchronizing kernel; the rest of the time, they can operate independently--e.g., accessing different files. This is an approach we have advocated even when fault-tolerance is not required. The method's basic simplicity makes it easier to understand the precise properties of a system, which is crucial if one is to know just how fault-tolerant the system is. [\[L.Lamport (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
Furthermore, and much to our surprise, it can be implemented using a mechanism that has existed in Bitcoin since day one. The Bitcoin feature is called nLocktime and it can be used to postdate transactions using block height instead of a timestamp. As a Bitcoin client, you'd use block height instead of a timestamp if you don't trust the network. Block height turns out to be an instance of what's being called a Verifiable Delay Function in cryptography circles. It's a cryptographically secure way to say time has passed. In Solana, we use a far more granular verifiable delay function, a SHA 256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we implement Optimistic Concurrency Control and are now well en route towards that theoretical limit of 710,000 transactions per second.
Architecture
===
Before you jump into the code, review the online book [Solana: Blockchain Rebuilt for Scale](https://docs.solana.com/book/).
(The _latest_ development version of the online book is also [available here](https://docs.solana.com/book/v/master/).)
Release Binaries
===
Official release binaries are available at [Github Releases](https://github.com/solana-labs/solana/releases).
Additionally we provide pre-release binaries for the latest code on the edge and
beta channels. Note that these pre-release binaries may be less stable than an
Start your own testnet locally, instructions are in the [online docs](https://docs.solana.com/bench-tps).
Start your own testnet locally, instructions are in the book [Solana: Blockchain Rebuild for Scale: Getting Started](https://docs.solana.com/book/getting-started).
### Accessing the remote testnet
*`testnet` - public stable testnet accessible via devnet.solana.com. Runs 24/7
Remote Testnets
---
We maintain several testnets:
*`testnet` - public stable testnet accessible via testnet.solana.com. Runs 24/7
*`testnet-beta` - public beta channel testnet accessible via beta.testnet.solana.com. Runs 24/7
*`testnet-edge` - public edge channel testnet accessible via edge.testnet.solana.com. Runs 24/7
## Deploy process
They are deployed with the `ci/testnet-manager.sh` script through a list of [scheduled
First install the nightly build of rustc. `cargo bench` requires use of the
unstable features only available in the nightly build.
@ -213,13 +79,11 @@ Run the benchmarks:
$ cargo +nightly bench
```
Release Process
---
# Release Process
The release process for this project is described [here](RELEASE.md).
Code coverage
---
# Code coverage
To generate code coverage statistics:
@ -228,7 +92,6 @@ $ scripts/coverage.sh
$ open target/cov/lcov-local/index.html
```
Why coverage? While most see coverage as a code quality metric, we see it primarily as a developer
productivity metric. When a developer makes a change to the codebase, presumably it's a *solution* to
some problem. Our unit-test suite is how we encode the set of *problems* the codebase solves. Running
@ -240,3 +103,7 @@ problem is solved by this code?" On the other hand, if a test does fail and you
better way to solve the same problem, a Pull Request with your solution would most certainly be
welcome! Likewise, if rewriting a test can better communicate what code it's protecting, please
send us that patch!
# Disclaimer
All claims, content, designs, algorithms, estimates, roadmaps, specifications, and performance measurements described in this project are done with the author's best effort. It is up to the reader to check and validate their accuracy and truthfulness. Furthermore nothing in this project constitutes a solicitation for investment.
@ -138,30 +138,11 @@ There are three release channels that map to branches as follows:
### Update documentation
TODO: Documentation update procedure is WIP as we move to gitbook
Document the new recommended version by updating `book/src/running-archiver.md` and `book/src/validator-testnet.md` on the release (beta) branch to point at the `solana-install` for the upcoming release version.
Document the new recommended version by updating `docs/src/running-archiver.md` and `docs/src/validator-testnet.md` on the release (beta) branch to point at the `solana-install` for the upcoming release version.
#### Publish updated Book
We maintain three copies of the "book" as official documentation:
### Update software on devnet.solana.com
1) "Book" is the documentation for the latest official release. This should get manually updated whenever a new release is made. It is published here:
https://solana-labs.github.io/book/
2) "Book-edge" tracks the tip of the master branch and updates automatically.
https://solana-labs.github.io/book-edge/
3) "Book-beta" tracks the tip of the beta branch and updates automatically.
https://solana-labs.github.io/book-beta/
To manually trigger an update of the "Book", create a new job of the manual-update-book pipeline.
Set the tag of the latest release as the PUBLISH_BOOK_TAG environment variable.
Solana supports a node type called an _blockstreamer_. This validator variation is intended for applications that need to observe the data plane without participating in transaction validation or ledger replication.
A blockstreamer runs without a vote signer, and can optionally stream ledger entries out to a Unix domain socket as they are processed. The JSON-RPC service still functions as on any other node.
To run a blockstreamer, include the argument `no-signer` and \(optional\) `blockstream` socket location:
Solana nodes accept HTTP requests using the [JSON-RPC 2.0](https://www.jsonrpc.org/specification) specification.
To interact with a Solana node inside a JavaScript application, use the [solana-web3.js](https://github.com/solana-labs/solana-web3.js) library, which gives a convenient interface for the RPC methods.
The result field will be an array with two fields:
* Commitment
*`null` - Unknown block
*`object` - BlockCommitment
*`array` - commitment, array of u64 integers logging the amount of cluster stake in lamports that has voted on the block at each depth from 0 to `MAX_LOCKOUT_HISTORY`
* 'integer' - total active stake, in lamports, of the current epoch
#### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1, "method":"getBlockCommitment","params":[5]}' http://localhost:8899
*`slot` - (optional) Fetch the leader schedule for the epoch that corresponds to the provided slot. If unspecified, the leader schedule for the current epoch is fetched
Returns the status of a given signature. This method is similar to [confirmTransaction](jsonrpc-api.md#confirmtransaction) but provides more resolution for error events.
#### Parameters:
*`string` - Signature of Transaction to confirm, as base-58 encoded string
Subscribe to a transaction signature to receive notification when the transaction is confirmed On `signatureNotification`, the subscription is automatically cancelled
#### Parameters:
* `string` - Transaction Signature, as base-58 encoded string
* `integer` - optional, number of confirmed blocks to wait before notification.
* **header:** Details the account types of and signatures required by
the transaction
* **num\_required\_signatures:** The total number of signatures
required to make the transaction valid.
* **num\_credit\_only\_signed\_accounts:** The last
`num_readonly_signed_accounts` signatures refer to signing
credit only accounts. Credit only accounts can be used concurrently
by multiple parallel transactions, but their balance may only be
increased, and their account data is read-only.
* **num\_credit\_only\_unsigned\_accounts:** The last
`num_readonly_unsigned_accounts` public keys in `account_keys` refer
to non-signing credit only accounts
* **account\_keys:** List of public keys used by the transaction, including
by the instructions and for signatures. The first
`num_required_signatures` public keys must sign the transaction.
* **recent\_blockhash:** The ID of a recent ledger entry. Validators will
reject transactions with a `recent_blockhash` that is too old.
* **instructions:** A list of [instructions](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/instruction.md) that are
run sequentially and committed in one atomic transaction if all
succeed.
* **signatures:** A list of signatures applied to the transaction. The
list is always of length `num_required_signatures`, and the signature
at index `i` corresponds to the public key at index `i` in `account_keys`.
The list is initialized with empty signatures \(i.e. zeros\), and
populated as signatures are added.
## Transaction Signing
A `Transaction` is signed by using an ed25519 keypair to sign the serialization of the `message`. The resulting signature is placed at the index of `signatures` matching the index of the keypair's public key in `account_keys`.
## Transaction Serialization
`Transaction`s \(and their `message`s\) are serialized and deserialized using the [bincode](https://crates.io/crates/bincode) crate with a non-standard vector serialization that uses only one byte for the length if it can be encoded in 7 bits, 2 bytes if it fits in 14 bits, or 3 bytes if it requires 15 or 16 bits. The vector serialization is defined by Solana's [short-vec](https://github.com/solana-labs/solana/blob/master/sdk/src/short_vec.rs).
At full capacity on a 1gbps network solana will generate 4 petabytes of data per year. To prevent the network from centralizing around validators that have to store the full data set this protocol proposes a way for mining nodes to provide storage capacity for pieces of the data.
The basic idea to Proof of Replication is encrypting a dataset with a public symmetric key using CBC encryption, then hash the encrypted dataset. The main problem with the naive approach is that a dishonest storage node can stream the encryption and delete the data as it's hashed. The simple solution is to periodically regenerate the hash based on a signed PoH value. This ensures that all the data is present during the generation of the proof and it also requires validators to have the entirety of the encrypted data present for verification of every proof of every identity. So the space required to validate is `number_of_proofs * data_size`
## Optimization with PoH
Our improvement on this approach is to randomly sample the encrypted segments faster than it takes to encrypt, and record the hash of those samples into the PoH ledger. Thus the segments stay in the exact same order for every PoRep and verification can stream the data and verify all the proofs in a single batch. This way we can verify multiple proofs concurrently, each one on its own CUDA core. The total space required for verification is `1_ledger_segment + 2_cbc_blocks * number_of_identities` with core count equal to `number_of_identities`. We use a 64-byte chacha CBC block size.
## Network
Validators for PoRep are the same validators that are verifying transactions. If an archiver can prove that a validator verified a fake PoRep, then the validator will not receive a reward for that storage epoch.
Archivers are specialized _light clients_. They download a part of the ledger \(a.k.a Segment\) and store it, and provide PoReps of storing the ledger. For each verified PoRep archivers earn a reward of sol from the mining pool.
## Constraints
We have the following constraints:
* Verification requires generating the CBC blocks. That requires space of 2
blocks per identity, and 1 CUDA core per identity for the same dataset. So as
many identities at once should be batched with as many proofs for those
identities verified concurrently for the same dataset.
* Validators will randomly sample the set of storage proofs to the set that
they can handle, and only the creators of those chosen proofs will be
rewarded. The validator can run a benchmark whenever its hardware configuration
changes to determine what rate it can validate storage proofs.
## Validation and Replication Protocol
### Constants
1. SLOTS\_PER\_SEGMENT: Number of slots in a segment of ledger data. The
unit of storage for an archiver.
2. NUM\_KEY\_ROTATION\_SEGMENTS: Number of segments after which archivers
regenerate their encryption keys and select a new dataset to store.
3. NUM\_STORAGE\_PROOFS: Number of storage proofs required for a storage proof
claim to be successfully rewarded.
4. RATIO\_OF\_FAKE\_PROOFS: Ratio of fake proofs to real proofs that a storage
mining proof claim has to contain to be valid for a reward.
5. NUM\_STORAGE\_SAMPLES: Number of samples required for a storage mining
proof.
6. NUM\_CHACHA\_ROUNDS: Number of encryption rounds performed to generate
encrypted state.
7. NUM\_SLOTS\_PER\_TURN: Number of slots that define a single storage epoch or
a "turn" of the PoRep game.
### Validator behavior
1. Validators join the network and begin looking for archiver accounts at each
storage epoch/turn boundary.
2. Every turn, Validators sign the PoH value at the boundary and use that signature
to randomly pick proofs to verify from each storage account found in the turn boundary.
This signed value is also submitted to the validator's storage account and will be used by
archivers at a later stage to cross-verify.
3. Every `NUM_SLOTS_PER_TURN` slots the validator advertises the PoH value. This is value
is also served to Archivers via RPC interfaces.
4. For a given turn N, all validations get locked out until turn N+3 \(a gap of 2 turn/epoch\).
At which point all validations during that turn are available for reward collection.
5. Any incorrect validations will be marked during the turn in between.
### Archiver behavior
1. Since an archiver is somewhat of a light client and not downloading all the
ledger data, they have to rely on other validators and archivers for information.
Any given validator may or may not be malicious and give incorrect information, although
there are not any obvious attack vectors that this could accomplish besides having the
archiver do extra wasted work. For many of the operations there are a number of options
depending on how paranoid an archiver is:
* \(a\) archiver can ask a validator
* \(b\) archiver can ask multiple validators
* \(c\) archiver can ask other archivers
* \(d\) archiver can subscribe to the full transaction stream and generate
the information itself \(assuming the slot is recent enough\)
* \(e\) archiver can subscribe to an abbreviated transaction stream to
generate the information itself \(assuming the slot is recent enough\)
2. An archiver obtains the PoH hash corresponding to the last turn with its slot.
3. The archiver signs the PoH hash with its keypair. That signature is the
seed used to pick the segment to replicate and also the encryption key. The
archiver mods the signature with the slot to get which segment to
replicate.
4. The archiver retrives the ledger by asking peer validators and
archivers. See 6.5.
5. The archiver then encrypts that segment with the key with chacha algorithm
in CBC mode with `NUM_CHACHA_ROUNDS` of encryption.
6. The archiver initializes a chacha rng with the a signed recent PoH value as
the seed.
7. The archiver generates `NUM_STORAGE_SAMPLES` samples in the range of the
entry size and samples the encrypted segment with sha256 for 32-bytes at each
offset value. Sampling the state should be faster than generating the encrypted
segment.
8. The archiver sends a PoRep proof transaction which contains its sha state
at the end of the sampling operation, its seed and the samples it used to the
current leader and it is put onto the ledger.
9. During a given turn the archiver should submit many proofs for the same segment
and based on the `RATIO_OF_FAKE_PROOFS` some of those proofs must be fake.
10. As the PoRep game enters the next turn, the archiver must submit a
transaction with the mask of which proofs were fake during the last turn. This
transaction will define the rewards for both archivers and validators.
11. Finally for a turn N, as the PoRep game enters turn N + 3, archiver's proofs for
turn N will be counted towards their rewards.
### The PoRep Game
The Proof of Replication game has 4 primary stages. For each "turn" multiple PoRep games can be in progress but each in a different stage.
The 4 stages of the PoRep Game are as follows:
1. Proof submission stage
* Archivers: submit as many proofs as possible during this stage
* Validators: No-op
2. Proof verification stage
* Archivers: No-op
* Validators: Select archivers and verify their proofs from the previous turn
3. Proof challenge stage
* Archivers: Submit the proof mask with justifications \(for fake proofs submitted 2 turns ago\)
* Validators: No-op
4. Reward collection stage
* Archivers: Collect rewards for 3 turns ago
* Validators: Collect rewards for 3 turns ago
For each turn of the PoRep game, both Validators and Archivers evaluate each stage. The stages are run as separate transactions on the storage program.
### Finding who has a given block of ledger
1. Validators monitor the turns in the PoRep game and look at the rooted bank
at turn boundaries for any proofs.
2. Validators maintain a map of ledger segments and corresponding archiver public keys.
The map is updated when a Validator processes an archiver's proofs for a segment.
The validator provides an RPC interface to access the this map. Using this API, clients
can map a segment to an archiver's network address \(correlating it via cluster\_info table\).
The clients can then send repair requests to the archiver to retrieve segments.
3. Validators would need to invalidate this list every N turns.
## Sybil attacks
For any random seed, we force everyone to use a signature that is derived from a PoH hash at the turn boundary. Everyone uses the same count, so the same PoH hash is signed by every participant. The signatures are then each cryptographically tied to the keypair, which prevents a leader from grinding on the resulting value for more than 1 identity.
Since there are many more client identities then encryption identities, we need to split the reward for multiple clients, and prevent Sybil attacks from generating many clients to acquire the same block of data. To remain BFT we want to avoid a single human entity from storing all the replications of a single chunk of the ledger.
Our solution to this is to force the clients to continue using the same identity. If the first round is used to acquire the same block for many client identities, the second round for the same client identities will force a redistribution of the signatures, and therefore PoRep identities and blocks. Thus to get a reward for archivers need to store the first block for free and the network can reward long lived client identities more than new ones.
## Validator attacks
* If a validator approves fake proofs, archiver can easily out them by
showing the initial state for the hash.
* If a validator marks real proofs as fake, no on-chain computation can be done
to distinguish who is correct. Rewards would have to rely on the results from
multiple validators to catch bad actors and archivers from being denied rewards.
* Validator stealing mining proof results for itself. The proofs are derived
from a signature from an archiver, since the validator does not know the
private key used to generate the encryption key, it cannot be the generator of
the proof.
## Reward incentives
Fake proofs are easy to generate but difficult to verify. For this reason, PoRep proof transactions generated by archivers may require a higher fee than a normal transaction to represent the computational cost required by validators.
Some percentage of fake proofs are also necessary to receive a reward from storage mining.
## Notes
* We can reduce the costs of verification of PoRep by using PoH, and actually
make it feasible to verify a large number of proofs for a global dataset.
* We can eliminate grinding by forcing everyone to sign the same PoH hash and
use the signatures as the seed
* The game between validators and archivers is over random blocks and random
encryption identities and random data samples. The goal of randomization is
to prevent colluding groups from having overlap on data or validation.
* Archiver clients fish for lazy validators by submitting fake proofs that
they can prove are fake.
* To defend against Sybil client identities that try to store the same block we
force the clients to store for multiple rounds before receiving a reward.
* Validators should also get rewarded for validating submitted storage proofs
as incentive for storing the ledger. They can only validate proofs if they
Each transaction that is submitted to the Solana ledger imposes costs. Transaction fees paid by the submitter, and collected by a validator, in theory, account for the acute, transacitonal, costs of validating and adding that data to the ledger. At the same time, our compensation design for archivers (see [Replication-client Economics](ed_replication_client_economics.md)), in theory, accounts for the long term storage of the historical ledger. Unaccounted in this process is the mid-term storage of active ledger state, necessarily maintined by the rotating validator set. This type of storage imposes costs not only to validators but also to the broader network as active state grows so does data transmission and validation overhead. To account for these costs, we describe here our preliminary design and implementation of storage rent.
Storage rent can be paid via one of two methods:
Method 1: Set it and forget it
With this approach, accounts with two-years worth of rent deposits secured are exempt from network rent charges. By maintaining this minimum-balance, the broader network benefits from reduced liquitity and the account holder can trust that their `Account::data` will be retained for continual access/usage.
Method 2: Pay per byte
If an account has less than two-years worth of deposited rent the network charges rent on a per-epoch basis, in credit for the next epoch (but in arrears when necessary). This rent is deducted at a rate specified in genesis, in lamports per kilobyte-year.
For information on the technical implementation details of this design, see the [Rent](rent.md) section.
Solana’s crypto-economic system is designed to promote a healthy, long term self-sustaining economy with participant incentives aligned to the security and decentralization of the network. The main participants in this economy are validation-clients and replication-clients. Their contributions to the network, state validation and data storage respectively, and their requisite incentive mechanisms are discussed below.
The main channels of participant remittances are referred to as protocol-based rewards and transaction fees. Protocol-based rewards are issuances from a global, protocol-defined, inflation rate. These rewards will constitute the total reward delivered to replication and validation clients, the remaining sourced from transaction fees. In the early days of the network, it is likely that protocol-based rewards, deployed based on predefined issuance schedule, will drive the majority of participant incentives to participate in the network.
These protocol-based rewards, to be distributed to participating validation and replication clients, are to be a result of a global supply inflation rate, calculated per Solana epoch and distributed amongst the active validator set. As discussed further below, the per annum inflation rate is based on a pre-determined disinflationary schedule. This provides the network with monetary supply predictability which supports long term economic stability and security.
Transaction fees are market-based participant-to-participant transfers, attached to network interactions as a necessary motivation and compensation for the inclusion and execution of a proposed transaction \(be it a state execution or proof-of-replication verification\). A mechanism for long-term economic stability and forking protection through partial burning of each transaction fee is also discussed below.
A high-level schematic of Solana’s crypto-economic design is shown below in **Figure 1**. The specifics of validation-client economics are described in sections: [Validation-client Economics](ed_validation_client_economics/), [State-validation Protocol-based Rewards](ed_validation_client_economics/ed_vce_state_validation_protocol_based_rewards.md), [State-validation Transaction Fees](ed_validation_client_economics/ed_vce_state_validation_transaction_fees.md) and [Replication-validation Transaction Fees](ed_validation_client_economics/ed_vce_replication_validation_transaction_fees.md). Also, the chapter titled [Validation Stake Delegation](ed_validation_client_economics/ed_vce_validation_stake_delegation.md) closes with a discussion of validator delegation opportunties and marketplace. Additionally, in [Storage Rent Economics](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/ed_storage_rent_economics.md), we describe an implementation of storage rent to account for the externality costs of maintaining the active state of the ledger. The [Replication-client Economics](ed_replication_client_economics/) chapter will review the Solana network design for global ledger storage/redundancy and archiver-client economics \([Storage-replication rewards](ed_replication_client_economics/ed_rce_storage_replication_rewards.md)\) along with an archiver-to-validator delegation mechanism designed to aide participant on-boarding into the Solana economy discussed in [Replication-client Reward Auto-delegation](ed_replication_client_economics/ed_rce_replication_client_reward_auto_delegation.md). An outline of features for an MVP economic design is discussed in the [Economic Design MVP](ed_mvp.md) section. Finally, in chapter [Attack Vectors](ed_attack_vectors.md), various attack vectors will be described and potential vulnerabilities explored and parameterized.
**Figure 1**: Schematic overview of Solana economic incentive design.
A colluding validation-client, may take the strategy to mark PoReps from non-colluding archiver nodes as invalid as an attempt to maximize the rewards for the colluding archiver nodes. In this case, it isn’t feasible for the offended-against archiver nodes to petition the network for resolution as this would result in a network-wide vote on each offending PoRep and create too much overhead for the network to progress adequately. Also, this mitigation attempt would still be vulnerable to a >= 51% staked colluder.
Alternatively, transaction fees from submitted PoReps are pooled and distributed across validation-clients in proportion to the number of valid PoReps discounted by the number of invalid PoReps as voted by each validator-client. Thus invalid votes are directly dis-incentivized through this reward channel. Invalid votes that are revealed by archiver nodes as fishing PoReps, will not be discounted from the payout PoRep count.
Another collusion attack involves a validator-client who may take the strategy to ignore invalid PoReps from colluding archiver and vote them as valid. In this case, colluding archiver-clients would not have to store the data while still receiving rewards for validated PoReps. Additionally, colluding validator nodes would also receive rewards for validating these PoReps. To mitigate this attack, validators must randomly sample PoReps corresponding to the ledger block they are validating and because of this, there will be multiple validators that will receive the colluding archiver’s invalid submissions. These non-colluding validators will be incentivized to mark these PoReps as invalid as they have no way to determine whether the proposed invalid PoRep is actually a fishing PoRep, for which a confirmation vote would result in the validator’s stake being slashed.
In this case, the proportion of time a colluding pair will be successful has an upper limit determined by the % of stake of the network claimed by the colluding validator. This also sets bounds to the value of such an attack. For example, if a colluding validator controls 10% of the total validator stake, transaction fees will be lost \(likely sent to mining pool\) by the colluding archiver 90% of the time and so the attack vector is only profitable if the per-PoRep reward at least 90% higher than the average PoRep transaction fee. While, probabilistically, some colluding archiver-client PoReps will find their way to colluding validation-clients, the network can also monitor rates of paired \(validator + archiver\) discrepancies in voting patterns and censor identified colluders in these cases.
Long term economic sustainability is one of the guiding principles of Solana’s economic design. While it is impossible to predict how decentralized economies will develop over time, especially economies with flexible decentralized governances, we can arrange economic components such that, under certain conditions, a sustainable economy may take shape in the long term. In the case of Solana’s network, these components take the form of token issuance \(via inflation\) and token burning’.
The dominant remittances from the Solana mining pool are validator and archiver rewards. The disinflationary mechanism is a flat, protocol-specified and adjusted, % of each transaction fee.
The Archiver rewards are to be delivered to archivers as a portion of the network inflation after successful PoRep validation. The per-PoRep reward amount is determined as a function of the total network storage redundancy at the time of the PoRep validation and the network goal redundancy. This function is likely to take the form of a discount from a base reward to be delivered when the network has achieved and maintained its goal redundancy. An example of such a reward function is shown in **Figure 3**
**Figure 3**: Example PoRep reward design as a function of global network storage redundancy.
In the example shown in Figure 1, multiple per PoRep base rewards are explored \(as a % of Tx Fee\) to be delivered when the global ledger replication redundancy meets 10X. When the global ledger replication redundancy is less than 10X, the base reward is discounted as a function of the square of the ratio of the actual ledger replication redundancy to the goal redundancy \(i.e. 10X\).
The preceeding sections, outlined in the [Economic Design Overview](./), describe a long-term vision of a sustainable Solana economy. Of course, we don't expect the final implementation to perfectly match what has been described above. We intend to fully engage with network stakeholders throughout the implementation phases \(i.e. pre-testnet, testnet, mainnet\) to ensure the system supports, and is representative of, the various network participants' interests. The first step toward this goal, however, is outlining a some desired MVP economic features to be available for early pre-testnet and testnet participants. Below is a rough sketch outlining basic economic functionality from which a more complete and functional system can be developed.
## MVP Economic Features
* Faucet to deliver testnet SOLs to validators for staking and application development.
* Mechanism by which validators are rewarded via network inflation.
* Ability to delegate tokens to validator nodes
* Validator set commission fees on interest from delegated tokens.
* Archivers to receive fixed, arbitrary reward for submitting validated PoReps. Reward size mechanism \(i.e. PoRep reward as a function of total ledger redundancy\) to come later.
* Pooling of archiver PoRep transaction fees and weighted distribution to validators based on PoRep verification \(see [Replication-validation Transaction Fees](ed_validation_client_economics/ed_vce_replication_validation_transaction_fees.md). It will be useful to test this protection against attacks on testnet.
* Nice-to-have: auto-delegation of archiver rewards to validator.
Replication-clients should be rewarded for providing the network with storage space. Incentivization of the set of archivers provides data security through redundancy of the historical ledger. Replication nodes are rewarded in proportion to the amount of ledger data storage provided, as proved by successfully submitting Proofs-of-Replication to the cluster.. These rewards are captured by generating and entering Proofs of Replication \(PoReps\) into the PoH stream which can be validated by Validation nodes as described above in the [Replication-validation Transaction Fees](../ed_validation_client_economics/ed_vce_replication_validation_transaction_fees.md) chapter.
The ability for Solana network participants to earn rewards by providing storage service is a unique on-boarding path that requires little hardware overhead and minimal upfront capital. It offers an avenue for individuals with extra-storage space on their home laptops or PCs to contribute to the security of the network and become integrated into the Solana economy.
To enhance this on-boarding ramp and facilitate further participation and investment in the Solana economy, replication-clients have the opportunity to auto-delegate their rewards to validation-clients of their choice. Much like the automatic reinvestment of stock dividends, in this scenario, an archiver-client can earn Solana tokens by providing some storage capacity to the network \(i.e. via submitting valid PoReps\), have the protocol-based rewards automatically assigned as delegation to a staked validator node of the archiver's choice and earn interest, less a fee, from the validation-client's network participation.
Archiver-clients download, encrypt and submit PoReps for ledger block sections.3 PoReps submitted to the PoH stream, and subsequently validated, function as evidence that the submitting archiver client is indeed storing the assigned ledger block sections on local hard drive space as a service to the network. Therefore, archiver clients should earn protocol rewards proportional to the amount of storage, and the number of successfully validated PoReps, that they are verifiably providing to the network.
Additionally, archiver clients have the opportunity to capture a portion of slashed bounties \[TBD\] of dishonest validator clients. This can be accomplished by an archiver client submitting a verifiably false PoRep for which a dishonest validator client receives and signs as a valid PoRep. This reward incentive is to prevent lazy validators and minimize validator-archiver collusion attacks, more on this below.
Validator-clients are eligible to receive protocol-based \(i.e. inflation-based\) rewards issued via stake-based annual interest rates \(calculated per epoch\) by providing compute \(CPU+GPU\) resources to validate and vote on a given PoH state. These protocol-based rewards are determined through an algorithmic disinflationary schedule as a function of total amount of circulating tokens. The network is expected to launch with an annual inflation rate around 15%, set to decrease by 15% per year until a long-term stable rate of 1-2% is reached. These issuances are to be split and distributed to participating validators and archivers, with around 90% of the issued tokens allocated for validator rewards. Because the network will be distributing a fixed amount of inflation rewards across the stake-weighted valdiator set, any individual validator's interest rate will be a function of the amount of staked SOL in relation to the circulating SOL.
Additionally, validator clients may earn revenue through fees via state-validation transactions and Proof-of-Replication \(PoRep\) transactions. For clarity, we separately describe the design and motivation of these revenue distriubutions for validation-clients below: state-validation protocol-based rewards, state-validation transaction fees and rent, and PoRep-validation transaction fees.
As previously mentioned, validator-clients will also be responsible for validating PoReps submitted into the PoH stream by archiver-clients. In this case, validators are providing compute \(CPU/GPU\) and light storage resources to confirm that these replication proofs could only be generated by a client that is storing the referenced PoH leger block.
While replication-clients are incentivized and rewarded through protocol-based rewards schedule \(see [Replication-client Economics](../ed_replication_client_economics/)\), validator-clients will be incentivized to include and validate PoReps in PoH through collection of transaction fees associated with the submitted PoReps and distribution of protocol rewards proportional to the validated PoReps. As will be described in detail in the Section 3.1, replication-client rewards are protocol-based and designed to reward based on a global data redundancy factor. I.e. the protocol will incentivize replication-client participation through rewards based on a target ledger redundancy \(e.g. 10x data redundancy\).
The validation of PoReps by validation-clients is computationally more expensive than state-validation \(detail in the [Economic Sustainability](../ed_economic_sustainability.md) chapter\), thus the transaction fees are expected to be proportionally higher.
There are various attack vectors available for colluding validation and replication clients, also described in detail below in [Economic Sustainability](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/ed_economic_sustainability/README.md). To protect against various collusion attack vectors, for a given epoch, validator rewards are distributed across participating validation-clients in proportion to the number of validated PoReps in the epoch less the number of PoReps that mismatch the archivers challenge. The PoRep challenge game is described in [Ledger Replication](https://github.com/solana-labs/solana/blob/master/book/src/ledger-replication.md#the-porep-game). This design rewards validators proportional to the number of PoReps they process and validate, while providing negative pressure for validation-clients to submit lazy or malicious invalid votes on submitted PoReps \(note that it is computationally prohibitive to determine whether a validator-client has marked a valid PoRep as invalid\).
Validator-clients have two functional roles in the Solana network:
* Validate \(vote\) the current global state of that PoH along with any Proofs-of-Replication \(see [Replication Client Economics](../ed_replication_client_economics/)\) that they are eligible to validate.
* Be elected as ‘leader’ on a stake-weighted round-robin schedule during which time they are responsible for collecting outstanding transactions and Proofs-of-Replication and incorporating them into the PoH, thus updating the global state of the network and providing chain continuity.
Validator-client rewards for these services are to be distributed at the end of each Solana epoch. As previously discussed, compensation for validator-clients is provided via a protocol-based annual inflation rate dispersed in proportion to the stake-weight of each validator \(see below\) along with leader-claimed transaction fees available during each leader rotation. I.e. during the time a given validator-client is elected as leader, it has the opportunity to keep a portion of each transaction fee, less a protocol-specified amount that is destroyed \(see [Validation-client State Transaction Fees](ed_vce_state_validation_transaction_fees.md)\). PoRep transaction fees are also collected by the leader client and validator PoRep rewards are distributed in proportion to the number of validated PoReps less the number of PoReps that mismatch an archiver's challenge. \(see [Replication-client Transaction Fees](ed_vce_replication_validation_transaction_fees.md)\)
The effective protocol-based annual interest rate \(%\) per epoch received by validation-clients is to be a function of:
* the current global inflation rate, derived from the pre-determined dis-inflationary issuance schedule \(see [Validation-client Economics](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/ed_validartion_client_economics.md)\)
* the fraction of staked SOLs out of the current total circulating supply,
* the up-time/participation \[% of available slots that validator had opportunity to vote on\] of a given validator over the previous epoch.
The first factor is a function of protocol parameters only \(i.e. independent of validator behavior in a given epoch\) and results in a global validation reward schedule designed to incentivize early participation, provide clear montetary stability and provide optimal security in the network.
At any given point in time, a specific validator's interest rate can be determined based on the porportion of circulating supply that is staked by the network and the validator's uptime/activity in the previous epoch. For example, consider a hypothetical instance of the network with an initial circulating token supply of 250MM tokens with an additional 250MM vesting over 3 years. Additionally an inflation rate is specified at network launch of 7.5%, and a disinflationary schedule of 20% decrease in inflation rate per year \(the actual rates to be implemented are to be worked out during the testnet experimentation phase of mainnet launch\). With these broad assumptions, the 10-year inflation rate \(adjusted daily for this example\) is shown in **Figure 2**, while the total circulating token supply is illustrated in **Figure 3**. Neglected in this toy-model is the inflation supression due to the portion of each transaction fee that is to be destroyed.
 \*\*Figure 2:\*\* In this example schedule, the annual inflation rate \[%\] reduces at around 20% per year, until it reaches the long-term, fixed, 1.5% rate.
 \*\*Figure 3:\*\* The total token supply over a 10-year period, based on an initial 250MM tokens with the disinflationary inflation schedule as shown in \*\*Figure 2\*\* Over time, the interest rate, at a fixed network staked percentage, will reduce concordant with network inflation. Validation-client interest rates are designed to be higher in the early days of the network to incentivize participation and jumpstart the network economy. As previously mentioned, the inflation rate is expected to stabalize near 1-2% which also results in a fixed, long-term, interest rate to be provided to validator-clients. This value does not represent the total interest available to validator-clients as transaction fees for state-validation and ledger storage replication \(PoReps\) are not accounted for here. Given these example parameters, annualized validator-specific interest rates can be determined based on the global fraction of tokens bonded as stake, as well as their uptime/activity in the previous epoch. For the purpose of this example, we assume 100% uptime for all validators and a split in interest-based rewards between validators and archiver nodes of 80%/20%. Additionally, the fraction of staked circulating supply is assummed to be constant. Based on these assumptions, an annualized validation-client interest rate schedule as a function of % circulating token supply that is staked is shown in\*\* Figure 4\*\*.
**Figure 4:** Shown here are example validator interest rates over time, neglecting transaction fees, segmented by fraction of total circulating supply bonded as stake.
This epoch-specific protocol-defined interest rate sets an upper limit of _protocol-generated_ annual interest rate \(not absolute total interest rate\) possible to be delivered to any validator-client per epoch. The distributed interest rate per epoch is then discounted from this value based on the participation of the validator-client during the previous epoch.
The RepairService is in charge of retrieving missing shreds that failed to be delivered by primary communication protocols like Avalanche. It is in charge of managing the protocols described below in the `Repair Protocols` section below.
## Challenges:
1\) Validators can fail to receive particular shreds due to network failures
2\) Consider a scenario where blocktree contains the set of slots {1, 3, 5}. Then Blocktree receives shreds for some slot 7, where for each of the shreds b, b.parent == 6, so then the parent-child relation 6 -> 7 is stored in blocktree. However, there is no way to chain these slots to any of the existing banks in Blocktree, and thus the `Shred Repair` protocol will not repair these slots. If these slots happen to be part of the main chain, this will halt replay progress on this node.
3\) Validators that find themselves behind the cluster by an entire epoch struggle/fail to catch up because they do not have a leader schedule for future epochs. If nodes were to blindly accept repair shreds in these future epochs, this exposes nodes to spam.
## Repair Protocols
The repair protocol makes best attempts to progress the forking structure of Blocktree.
The different protocol strategies to address the above challenges:
1. Shred Repair \(Addresses Challenge \#1\): This is the most basic repair protocol, with the purpose of detecting and filling "holes" in the ledger. Blocktree tracks the latest root slot. RepairService will then periodically iterate every fork in blocktree starting from the root slot, sending repair requests to validators for any missing shreds. It will send at most some `N` repair reqeusts per iteration.
Note: Validators will only accept shreds within the current verifiable epoch \(epoch the validator has a leader schedule for\).
2. Preemptive Slot Repair \(Addresses Challenge \#2\): The goal of this protocol is to discover the chaining relationship of "orphan" slots that do not currently chain to any known fork.
* Blocktree will track the set of "orphan" slots in a separate column family.
* RepairService will periodically make `RequestOrphan` requests for each of the orphans in blocktree.
`RequestOrphan(orphan)` request - `orphan` is the orphan slot that the requestor wants to know the parents of `RequestOrphan(orphan)` response - The highest shreds for each of the first `N` parents of the requested `orphan`
On receiving the responses `p`, where `p` is some shred in a parent slot, validators will:
* Insert an empty `SlotMeta` in blocktree for `p.slot` if it doesn't already exist.
* If `p.slot` does exist, update the parent of `p` based on `parents`
Note: that once these empty slots are added to blocktree, the `Shred Repair` protocol should attempt to fill those slots.
Note: Validators will only accept responses containing shreds within the current verifiable epoch \(epoch the validator has a leader schedule for\).
3. Repairmen \(Addresses Challenge \#3\): This part of the repair protocol is the primary mechanism by which new nodes joining the cluster catch up after loading a snapshot. This protocol works in a "forward" fashion, so validators can verify every shred that they receive against a known leader schedule.
Each validator advertises in gossip:
* Current root
* The set of all completed slots in the confirmed epochs \(an epoch that was calculated based on a bank <= current root\) past the current root
Observers of this gossip message with higher epochs \(repairmen\) send shreds to catch the lagging node up with the rest of the cluster. The repairmen are responsible for sending the slots within the epochs that are confrimed by the advertised `root` in gossip. The repairmen divide the responsibility of sending each of the missing slots in these epochs based on a random seed \(simple shred.index iteration by N, seeded with the repairman's node\_pubkey\). Ideally, each repairman in an N node cluster \(N nodes whose epochs are higher than that of the repairee\) sends 1/N of the missing shreds. Both data and coding shreds for missing slots are sent. Repairmen do not send shreds again to the same validator until they see the message in gossip updated, at which point they perform another iteration of this protocol.
Gossip messages are updated every time a validator receives a complete slot within the epoch. Completed slots are detected by blocktree and sent over a channel to RepairService. It is important to note that we know that by the time a slot X is complete, the epoch schedule must exist for the epoch that contains slot X because WindowService will reject shreds for unconfirmed epochs. When a newly completed slot is detected, we also update the current root if it has changed since the last update. The root is made available to RepairService through Blocktree, which holds the latest root.
Solana is an open source project implementing a new, high-performance, permissionless blockchain. Solana is also the name of a company headquartered in San Francisco that maintains the open source project.
## About this Book
This book describes the Solana open source project, a blockchain built from the ground up for scale. The book covers why Solana is useful, how to use it, how it works, and why it will continue to work long after the company Solana closes its doors. The goal of the Solana architecture is to demonstrate there exists a set of software algorithms that when used in combination to implement a blockchain, removes software as a performance bottleneck, allowing transaction throughput to scale proportionally with network bandwidth. The architecture goes on to satisfy all three desirable properties of a proper blockchain: it is scalable, secure and decentralized.
The architecture describes a theoretical upper bound of 710 thousand transactions per second \(tps\) on a standard gigabit network and 28.4 million tps on 40 gigabit. Furthermore, the architecture supports safe, concurrent execution of programs authored in general purpose programming languages such as C or Rust.
## Disclaimer
All claims, content, designs, algorithms, estimates, roadmaps, specifications, and performance measurements described in this project are done with the author's best effort. It is up to the reader to check and validate their accuracy and truthfulness. Furthermore, nothing in this project constitutes a solicitation for investment.
## History of the Solana Codebase
In November of 2017, Anatoly Yakovenko published a whitepaper describing Proof of History, a technique for keeping time between computers that do not trust one another. From Anatoly's previous experience designing distributed systems at Qualcomm, Mesosphere and Dropbox, he knew that a reliable clock makes network synchronization very simple. When synchronization is simple the resulting network can be blazing fast, bound only by network bandwidth.
Anatoly watched as blockchain systems without clocks, such as Bitcoin and Ethereum, struggled to scale beyond 15 transactions per second worldwide when centralized payment systems such as Visa required peaks of 65,000 tps. Without a clock, it was clear they'd never graduate to being the global payment system or global supercomputer most had dreamed them to be. When Anatoly solved the problem of getting computers that don’t trust each other to agree on time, he knew he had the key to bring 40 years of distributed systems research to the world of blockchain. The resulting cluster wouldn't be just 10 times faster, or a 100 times, or a 1,000 times, but 10,000 times faster, right out of the gate!
Anatoly's implementation began in a private codebase and was implemented in the C programming language. Greg Fitzgerald, who had previously worked with Anatoly at semiconductor giant Qualcomm Incorporated, encouraged him to reimplement the project in the Rust programming language. Greg had worked on the LLVM compiler infrastructure, which underlies both the Clang C/C++ compiler as well as the Rust compiler. Greg claimed that the language's safety guarantees would improve software productivity and that its lack of a garbage collector would allow programs to perform as well as those written in C. Anatoly gave it a shot and just two weeks later, had migrated his entire codebase to Rust. Sold. With plans to weave all the world's transactions together on a single, scalable blockchain, Anatoly called the project Loom.
On February 13th of 2018, Greg began prototyping the first open source implementation of Anatoly's whitepaper. The project was published to GitHub under the name Silk in the loomprotocol organization. On February 28th, Greg made his first release, demonstrating 10 thousand signed transactions could be verified and processed in just over half a second. Shortly after, another former Qualcomm cohort, Stephen Akridge, demonstrated throughput could be massively improved by offloading signature verification to graphics processors. Anatoly recruited Greg, Stephen and three others to co-found a company, then called Loom.
Around the same time, Ethereum-based project Loom Network sprung up and many people were confused about whether they were the same project. The Loom team decided it would rebrand. They chose the name Solana, a nod to a small beach town North of San Diego called Solana Beach, where Anatoly, Greg and Stephen lived and surfed for three years when they worked for Qualcomm. On March 28th, the team created the Solana Labs GitHub organization and renamed Greg's prototype Silk to Solana.
In June of 2018, the team scaled up the technology to run on cloud-based networks and on July 19th, published a 50-node, permissioned, public testnet consistently supporting bursts of 250,000 transactions per second. In a later release in December, called v0.10 Pillbox, the team published a permissioned testnet running 150 nodes on a gigabit network and demonstrated soak tests processing an _average_ of 200 thousand transactions per second with bursts over 500 thousand. The project was also extended to support on-chain programs written in the C programming language and run concurrently in a safe execution environment called BPF.
## What is a Solana Cluster?
A cluster is a set of computers that work together and can be viewed from the outside as a single system. A Solana cluster is a set of independently owned computers working together \(and sometimes against each other\) to verify the output of untrusted, user-submitted programs. A Solana cluster can be utilized any time a user wants to preserve an immutable record of events in time or programmatic interpretations of those events. One use is to track which of the computers did meaningful work to keep the cluster running. Another use might be to track the possession of real-world assets. In each case, the cluster produces a record of events called the ledger. It will be preserved for the lifetime of the cluster. As long as someone somewhere in the world maintains a copy of the ledger, the output of its programs \(which may contain a record of who possesses what\) will forever be reproducible, independent of the organization that launched it.
## What are SOLs?
A SOL is the name of Solana's native token, which can be passed to nodes in a Solana cluster in exchange for running an on-chain program or validating its output. The system may perform micropayments of fractional SOLs, which are called _lamports_. They are named in honor of Solana's biggest technical influence, [Leslie Lamport](https://en.wikipedia.org/wiki/Leslie_Lamport). A lamport has a value of 0.000000001 SOL.
Currently, there is no way to create instruction `pay_and_launch_missiles` that executes `token_instruction::pay` from the `acme` program. The workaround is to extend the `acme` program with the implementation of the `token` program, and create `token` accounts with `ACME_PROGRAM_ID`, which the `acme` program is permitted to modify. With that workaround, `acme` can modify token-like accounts created by the `acme` program, but not token accounts created by the `token` program.
## Proposed Solution
The goal of this design is to modify Solana's runtime such that an on-chain program can invoke an instruction from another program.
Given two on-chain programs `token` and `acme`, each implementing instructions `pay()` and `launch_missiles()` respectively, we would ideally like to implement the `acme` module with a call to a function defined in the `token` module:
The above code would require that the `token` crate be dynamically linked, so that a custom linker could intercept calls and validate accesses to `keyed_accounts`. That is, even though the client intends to modify both `token` and `acme` accounts, only `token` program is permitted to modify the `token` account, and only the `acme` program is permitted to modify the `acme` account.
Backing off from that ideal cross-program call, a slightly more verbose solution is to expose token's existing `process_instruction()` entrypoint to the acme program:
let instruction = token_instruction::pay(&alice_pubkey);
process_instruction(&instruction)?;
launch_missiles(keyed_accounts)?;
}
```
where `process_instruction()` is built into Solana's runtime and responsible for routing the given instruction to the `token` program via the instruction's `program_id` field. Before invoking `pay()`, the runtime must also ensure that `acme` didn't modify any accounts owned by `token`. It does this by calling `runtime::verify_account_changes()` and then afterward updating all the `pre_*` variables to tentatively commit `acme`'s account modifications. After `pay()` completes, the runtime must again ensure that `token` didn't modify any accounts owned by `acme`. It should call `verify_account_changes()` again, but this time with the `token` program ID. Lastly, after `pay_and_launch_missiles()` completes, the runtime must call `verify_account_changes()` one more time, where it normally would, but using all updated `pre_*` variables. If executing `pay_and_launch_missiles()` up to `pay()` made no invalid account changes, `pay()` made no invalid changes, and executing from `pay()` until `pay_and_launch_missiles()` returns made no invalid changes, then the runtime can transitively assume `pay_and_launch_missiles()` as whole made no invalid account changes, and therefore commit all account modifications.
### Setting `KeyedAccount.is_signer`
When `process_instruction()` is invoked, the runtime must create a new `KeyedAccounts` parameter using the signatures from the _original_ transaction data. Since the `token` program is immutable and existed on-chain prior to the `acme` program, the runtime can safely treat the transaction signature as a signature of a transaction with a `token` instruction. When the runtime sees the given instruction references `alice_pubkey`, it looks up the key in the transaction to see if that key corresponds to a transaction signature. In this case it does and so sets `KeyedAccount.is_signer`, thereby authorizing the `token` program to modify Alice's account.
The storage epoch should be the number of slots which results in around 100GB-1TB of ledger to be generated for archivers to store. Archivers will start storing ledger when a given fork has a high probability of not being rolled back.
## Validator behavior
1. Every NUM\_KEY\_ROTATION\_TICKS it also validates samples received from
archivers. It signs the PoH hash at that point and uses the following
algorithm with the signature as the input:
* The low 5 bits of the first byte of the signature creates an index into
another starting byte of the signature.
* The validator then looks at the set of storage proofs where the byte of
the proof's sha state vector starting from the low byte matches exactly
with the chosen byte\(s\) of the signature.
* If the set of proofs is larger than the validator can handle, then it
increases to matching 2 bytes in the signature.
* Validator continues to increase the number of matching bytes until a
workable set is found.
* It then creates a mask of valid proofs and fake proofs and sends it to
the leader. This is a storage proof confirmation transaction.
2. After a lockout period of NUM\_SECONDS\_STORAGE\_LOCKOUT seconds, the
validator then submits a storage proof claim transaction which then causes the
distribution of the storage reward if no challenges were seen for the proof to
the validators and archivers party to the proofs.
## Archiver behavior
1. The archiver then generates another set of offsets which it submits a fake
proof with an incorrect sha state. It can be proven to be fake by providing the
seed for the hash result.
* A fake proof should consist of an archiver hash of a signature of a PoH
value. That way when the archiver reveals the fake proof, it can be
verified on chain.
2. The archiver monitors the ledger, if it sees a fake proof integrated, it
creates a challenge transaction and submits it to the current leader. The
transacation proves the validator incorrectly validated a fake storage proof.
The archiver is rewarded and the validator's staking balance is slashed or
frozen.
## Storage proof contract logic
Each archiver and validator will have their own storage account. The validator's account would be separate from their gossip id similiar to their vote account. These should be implemented as two programs one which handles the validator as the keysigner and one for the archiver. In that way when the programs reference other accounts, they can check the program id to ensure it is a validator or archiver account they are referencing.
### SubmitMiningProof
```text
SubmitMiningProof {
slot: u64,
sha_state: Hash,
signature: Signature,
};
keys = [archiver_keypair]
```
Archivers create these after mining their stored ledger data for a certain hash value. The slot is the end slot of the segment of ledger they are storing, the sha\_state the result of the archiver using the hash function to sample their encrypted ledger segment. The signature is the signature that was created when they signed a PoH value for the current storage epoch. The list of proofs from the current storage epoch should be saved in the account state, and then transfered to a list of proofs for the previous epoch when the epoch passes. In a given storage epoch a given archiver should only submit proofs for one segment.
The program should have a list of slots which are valid storage mining slots. This list should be maintained by keeping track of slots which are rooted slots in which a significant portion of the network has voted on with a high lockout value, maybe 32-votes old. Every SLOTS\_PER\_SEGMENT number of slots would be added to this set. The program should check that the slot is in this set. The set can be maintained by receiving a AdvertiseStorageRecentBlockHash and checking with its bank/Tower BFT state.
The program should do a signature verify check on the signature, public key from the transaction submitter and the message of the previous storage epoch PoH value.
A validator will submit this transaction to indicate that a set of proofs for a given segment are valid/not-valid or skipped where the validator did not look at it. The keypairs for the archivers that it looked at should be referenced in the keys so the program logic can go to those accounts and see that the proofs are generated in the previous epoch. The sampling of the storage proofs should be verified ensuring that the correct proofs are skipped by the validator according to the logic outlined in the validator behavior of sampling.
The included archiver keys will indicate the the storage samples which are being referenced; the length of the proof\_mask should be verified against the set of storage proofs in the referenced archiver account\(s\), and should match with the number of proofs submitted in the previous storage epoch in the state of said archiver account.
### ClaimStorageReward
```text
ClaimStorageReward {
}
keys = [validator_keypair or archiver_keypair, validator/archiver_keypairs (unsigned)]
```
Archivers and validators will use this transaction to get paid tokens from a program state where SubmitStorageProof, ProofValidation and ChallengeProofValidations are in a state where proofs have been submitted and validated and there are no ChallengeProofValidations referencing those proofs. For a validator, it should reference the archiver keypairs to which it has validated proofs in the relevant epoch. And for an archiver it should reference validator keypairs for which it has validated and wants to be rewarded.
### ChallengeProofValidation
```text
ChallengeProofValidation {
proof_index: u64,
hash_seed_value: Vec<u8>,
}
keys = [archiver_keypair, validator_keypair]
```
This transaction is for catching lazy validators who are not doing the work to validate proofs. An archiver will submit this transaction when it sees a validator has approved a fake SubmitMiningProof transaction. Since the archiver is a light client not looking at the full chain, it will have to ask a validator or some set of validators for this information maybe via RPC call to obtain all ProofValidations for a certain segment in the previous storage epoch. The program will look in the validator account state see that a ProofValidation is submitted in the previous storage epoch and hash the hash\_seed\_value and see that the hash matches the SubmitMiningProof transaction and that the validator marked it as valid. If so, then it will save the challenge to the list of challenges that it has in its state.
### AdvertiseStorageRecentBlockhash
```text
AdvertiseStorageRecentBlockhash {
hash: Hash,
slot: u64,
}
```
Validators and archivers will submit this to indicate that a new storage epoch has passed and that the storage proofs which are current proofs should now be for the previous epoch. Other transactions should check to see that the epoch that they are referencing is accurate according to current chain state.
It is often useful to allow low resourced clients to participate in a Solana cluster. Be this participation economic or contract execution, verification that a client's activity has been accepted by the network is typically expensive. This proposal lays out a mechanism for such clients to confirm that their actions have been committed to the ledger state with minimal resource expenditure and third-party trust.
## A Naive Approach
Validators store the signatures of recently confirmed transactions for a short period of time to ensure that they are not processed more than once. Validators provide a JSON RPC endpoint, which clients can use to query the cluster if a transaction has been recently processed. Validators also provide a PubSub notification, whereby a client registers to be notified when a given signature is observed by the validator. While these two mechanisms allow a client to verify a payment, they are not a proof and rely on completely trusting a validator.
We will describe a way to minimize this trust using Merkle Proofs to anchor the validator's response in the ledger, allowing the client to confirm on their own that a sufficient number of their preferred validators have confirmed a transaction. Requiring multiple validator attestations further reduces trust in the validator, as it increases both the technical and economic difficulty of compromising several other network participants.
## Light Clients
A 'light client' is a cluster participant that does not itself run a validator. This light client would provide a level of security greater than trusting a remote validator, without requiring the light client to spend a lot of resources verifying the ledger.
Rather than providing transaction signatures directly to a light client, the validator instead generates a Merkle Proof from the transaction of interest to the root of a Merkle Tree of all transactions in the including block. This Merkle Root is stored in a ledger entry which is voted on by validators, providing it consensus legitimacy. The additional level of security for a light client depends on an initial canonical set of validators the light client considers to be the stakeholders of the cluster. As that set is changed, the client can update its internal set of known validators with [receipts](simple-payment-and-state-verification.md#receipts). This may become challenging with a large number of delegated stakes.
Validators themselves may want to use light client APIs for performance reasons. For example, during the initial launch of a validator, the validator may use a cluster provided checkpoint of the state and verify it with a receipt.
## Receipts
A receipt is a minimal proof that; a transaction has been included in a block, that the block has been voted on by the client's preferred set of validators and that the votes have reached the desired confirmation depth.
The receipts for both state and payments start with a Merkle Path from the value into a Bank-Merkle that has been voted on and included in the ledger. A chain of PoH Entries containing subsequent validator votes, deriving from the Bank-Merkle, is the confirmation proof.
Clients can examine this ledger data and compute the finality using Solana's fork selection rules.
### Payment Merkle Path
A payment receipt is a data structure that contains a Merkle Path from a transaction to the required set of validator votes.
An Entry-Merkle is a Merkle Root including all transactions in the entry, sorted by signature.
A Block-Merkle is a Merkle root of all the Entry-Merkles sequenced in the block. Transaction status is necessary for the receipt because the state receipt is constructed for the block. Two transactions over the same state can appear in the block, and therefore, there is no way to infer from just the state whether a transaction that is committed to the ledger has succeeded or failed in modifying the intended state. It may not be necessary to encode the full status code, but a single status bit to indicate the transaction's success.
### State Merkle Path
A state receipt provides a confirmation that a specific state is committed at the end of the block. Inter-block state transitions do not generate a receipt.
For example:
* A sends 5 Lamports to B
* B spends 5 Lamports
* C sends 5 Lamports to A
At the end of the block, A and B are in the exact same starting state, and any state receipt would point to the same value for A or B.
The Bank-Merkle is computed from the Merkle Tree of the new state changes, along with the Previous Bank-Merkle, and the Block-Merkle.
A state receipt contains only the state changes occurring in the block. A direct Merkle Path to the current Bank-Merkle guarantees the state value at that bank hash, but it cannot be used to generate a “current” receipt to the latest state if the state modification occurred in some previous block. There is no guarantee that the path provided by the validator is the latest one available out of all the previous Bank-Merkles.
Clients that want to query the chain for a receipt of the "latest" state would need to create a transaction that would update the Merkle Path for that account, such as a credit of 0 Lamports.
### Validator Votes
Leaders should coalesce the validator votes by stake weight into a single entry. This will reduce the number of entries necessary to create a receipt.
### Chain of Entries
A receipt has a PoH link from the payment or state Merkle Path root to a list of consecutive validation votes.
/// This Entry definition skips over the transactions and only contains the
/// hash of the transactions used to modify PoH.
LightEntry {
/// The number of hashes since the previous Entry ID.
pub num_hashes: u64,
/// The SHA-256 hash `num_hashes` after the previous Entry ID.
hash: Hash,
/// The Merkle Root of the transactions encoded into the Entry.
entry_hash: Hash,
}
```
The light entries are reconstructed from Entries and simply show the entry Merkle Root that was mixed in to the PoH hash, instead of the full transaction set.
Clients do not need the starting vote state. The [fork selection](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/book/src/fork-selection.md) algorithm is defined such that only votes that appear after the transaction provide finality for the transaction, and finality is independent of the starting state.
### Verification
A light client that is aware of the supermajority set validators can verify a receipt by following the Merkle Path to the PoH chain. The Bank-Merkle is the Merkle Root and will appear in votes included in an Entry. The light client can simulate [fork selection](https://github.com/solana-labs/solana/tree/aacead62c0eb052068172eba6b53fc85874d6d54/book/src/book/src/fork-selection.md) for the consecutive votes and verify that the receipt is confirmed at the desired lockout threshold.
### Synthetic State
Synthetic state should be computed into the Bank-Merkle along with the bank generated state.
For example:
* Epoch validator accounts and their stakes and weights.
* Computed fee rates
These values should have an entry in the Bank-Merkle. They should live under known accounts, and therefore have an exact address in the Merkle Path.
When we first started Solana, the goal was to de-risk our TPS claims. We knew that between optimistic concurrency control and sufficiently long leader slots, that PoS consensus was not the biggest risk to TPS. It was GPU-based signature verification, software pipelining and concurrent banking. Thus, the TPU was born. After topping 100k TPS, we split the team into one group working toward 710k TPS and another to flesh out the validator pipeline. Hence, the TVU was born. The current architecture is a consequence of incremental development with that ordering and project priorities. It is not a reflection of what we ever believed was the most technically elegant cross-section of those technologies. In the context of leader rotation, the strong distinction between leading and validating is blurred.
## Difference between validating and leading
The fundamental difference between the pipelines is when the PoH is present. In a leader, we process transactions, removing bad ones, and then tag the result with a PoH hash. In the validator, we verify that hash, peel it off, and process the transactions in exactly the same way. The only difference is that if a validator sees a bad transaction, it can't simply remove it like the leader does, because that would cause the PoH hash to change. Instead, it rejects the whole block. The other difference between the pipelines is what happens _after_ banking. The leader broadcasts entries to downstream validators whereas the validator will have already done that in RetransmitStage, which is a confirmation time optimization. The validation pipeline, on the other hand, has one last step. Any time it finishes processing a block, it needs to weigh any forks it's observing, possibly cast a vote, and if so, reset its PoH hash to the block hash it just voted on.
## Proposed Design
We unwrap the many abstraction layers and build a single pipeline that can toggle leader mode on whenever the validator's ID shows up in the leader schedule.
This document describes how to setup an archiver in the testnet
Please note some of the information and instructions described here may change in future releases.
## Overview
Archivers are specialized light clients. They download a part of the ledger \(a.k.a Segment\) and store it. They earn rewards for storing segments.
The testnet features a validator running at testnet.solana.com, which serves as the entrypoint to the cluster for your archiver node.
Additionally there is a blockexplorer available at [http://testnet.solana.com/](http://testnet.solana.com/).
The testnet is configured to reset the ledger daily, or sooner should the hourly automated cluster sanity test fail.
## Machine Requirements
Archivers don't need specialized hardware. Anything with more than 128GB of disk space will be able to participate in the cluster as an archiver node.
Currently the disk space requirements are very low but we expect them to change in the future.
Prebuilt binaries are available for Linux x86\_64 \(Ubuntu 18.04 recommended\), macOS, and Windows.
### Confirm The Testnet Is Reachable
Before starting an archiver node, sanity check that the cluster is accessible to your machine by running some simple commands. If any of the commands fail, please retry 5-10 minutes later to confirm the testnet is not just restarting itself before debugging further.
Fetch the current transaction count over JSON RPC:
```bash
curl -X POST -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1, "method":"getTransactionCount"}' http://testnet.solana.com:8899
```
Inspect the blockexplorer at [http://testnet.solana.com/](http://testnet.solana.com/) for activity.
View the [metrics dashboard](https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta?var-testnet=testnet) for more detail on cluster activity.
## Archiver Setup
#### Obtaining The Software
#### Bootstrap with `solana-install`
The `solana-install` tool can be used to easily install and upgrade the cluster software.
#### Linux and mac OS
```bash
curl -sSf https://raw.githubusercontent.com/solana-labs/solana/v0.18.0/install/solana-install-init.sh | sh -s
```
Alternatively build the `solana-install` program from source and run the following command to obtain the same result:
```bash
solana-install init
```
#### Windows
Download and install **solana-install-init** from [https://github.com/solana-labs/solana/releases/latest](https://github.com/solana-labs/solana/releases/latest)
After a successful install, `solana-install update` may be used to easily update the software to a newer version at any time.
#### Download Prebuilt Binaries
If you would rather not use `solana-install` to manage the install, you can manually download and install the binaries.
#### Linux
Download the binaries by navigating to [https://github.com/solana-labs/solana/releases/latest](https://github.com/solana-labs/solana/releases/latest), download **solana-release-x86\_64-unknown-linux-gnu.tar.bz2**, then extract the archive:
```bash
tar jxf solana-release-x86_64-unknown-linux-gnu.tar.bz2
cd solana-release/
exportPATH=$PWD/bin:$PATH
```
#### mac OS
Download the binaries by navigating to [https://github.com/solana-labs/solana/releases/latest](https://github.com/solana-labs/solana/releases/latest), download **solana-release-x86\_64-apple-darwin.tar.bz2**, then extract the archive:
```bash
tar jxf solana-release-x86_64-apple-darwin.tar.bz2
cd solana-release/
exportPATH=$PWD/bin:$PATH
```
#### Windows
Download the binaries by navigating to [https://github.com/solana-labs/solana/releases/latest](https://github.com/solana-labs/solana/releases/latest), download **solana-release-x86\_64-pc-windows-msvc.tar.bz2**, then extract it into a folder. It is a good idea to add this extracted folder to your windows PATH.
## Starting The Archiver
Try running following command to join the gossip network and view all the other nodes in the cluster:
Transactions encode lists of instructions that are executed sequentially, and only committed if all the instructions complete successfully. All account updates are reverted upon the failure of a transaction. Each transaction details the accounts used, including which must sign and which are read only, a recent blockhash, the instructions, and any signatures.
## Accounts and Signatures
Each transaction explicitly lists all account public keys referenced by the transaction's instructions. A subset of those public keys are each accompanied by a transaction signature. Those signatures signal on-chain programs that the account holder has authorized the transaction. Typically, the program uses the authorization to permit debiting the account or modifying its data.
The transaction also marks some accounts as _read-only accounts_. The runtime permits read-only accounts to be read concurrently. If a program attempts to modify a read-only account, the transaction is rejected by the runtime.
## Recent Blockhash
A Transaction includes a recent blockhash to prevent duplication and to give transactions lifetimes. Any transaction that is completely identical to a previous one is rejected, so adding a newer blockhash allows multiple transactions to repeat the exact same action. Transactions also have lifetimes that are defined by the blockhash, as any transaction whose blockhash is too old will be rejected.
## Instructions
Each instruction specifies a single program account \(which must be marked executable\), a subset of the transaction's accounts that should be passed to the program, and a data byte array instruction that is passed to the program. The program interprets the data array and operates on the accounts specified by the instructions. The program can return successfully, or with an error code. An error return causes the entire transaction to fail immediately.
After a block reaches finality, all blocks from that one on down to the genesis block form a linear chain with the familiar name blockchain. Until that point, however, the validator must maintain all potentially valid chains, called _forks_. The process by which forks naturally form as a result of leader rotation is described in [fork generation](../../cluster/fork-generation.md). The _blocktree_ data structure described here is how a validator copes with those forks until blocks are finalized.
The blocktree allows a validator to record every shred it observes on the network, in any order, as long as the shred is signed by the expected leader for a given slot.
Shreds are moved to a fork-able key space the tuple of `leader slot` + `shred index` \(within the slot\). This permits the skip-list structure of the Solana protocol to be stored in its entirety, without a-priori choosing which fork to follow, which Entries to persist or when to persist them.
Repair requests for recent shreds are served out of RAM or recent files and out of deeper storage for less recent shreds, as implemented by the store backing Blocktree.
## Functionalities of Blocktree
1. Persistence: the Blocktree lives in the front of the nodes verification
pipeline, right behind network receive and signature verification. If the
shred received is consistent with the leader schedule \(i.e. was signed by the
leader for the indicated slot\), it is immediately stored.
2. Repair: repair is the same as window repair above, but able to serve any
shred that's been received. Blocktree stores shreds with signatures,
preserving the chain of origination.
3. Forks: Blocktree supports random access of shreds, so can support a
validator's need to rollback and replay from a Bank checkpoint.
4. Restart: with proper pruning/culling, the Blocktree can be replayed by
ordered enumeration of entries from slot 0. The logic of the replay stage
\(i.e. dealing with forks\) will have to be used for the most recent entries in
the Blocktree.
## Blocktree Design
1. Entries in the Blocktree are stored as key-value pairs, where the key is the concatenated slot index and shred index for an entry, and the value is the entry data. Note shred indexes are zero-based for each slot \(i.e. they're slot-relative\).
2. The Blocktree maintains metadata for each slot, in the `SlotMeta` struct containing:
*`slot_index` - The index of this slot
*`num_blocks` - The number of blocks in the slot \(used for chaining to a previous slot\)
*`consumed` - The highest shred index `n`, such that for all `m < n`, there exists a shred in this slot with shred index equal to `n` \(i.e. the highest consecutive shred index\).
*`received` - The highest received shred index for the slot
*`next_slots` - A list of future slots this slot could chain to. Used when rebuilding
the ledger to find possible fork points.
*`last_index` - The index of the shred that is flagged as the last shred for this slot. This flag on a shred will be set by the leader for a slot when they are transmitting the last shred for a slot.
*`is_rooted` - True iff every block from 0...slot forms a full sequence without any holes. We can derive is\_rooted for each slot with the following rules. Let slot\(n\) be the slot with index `n`, and slot\(n\).is\_full\(\) is true if the slot with index `n` has all the ticks expected for that slot. Let is\_rooted\(n\) be the statement that "the slot\(n\).is\_rooted is true". Then:
is\_rooted\(0\) is\_rooted\(n+1\) iff \(is\_rooted\(n\) and slot\(n\).is\_full\(\)
3. Chaining - When a shred for a new slot `x` arrives, we check the number of blocks \(`num_blocks`\) for that new slot \(this information is encoded in the shred\). We then know that this new slot chains to slot `x - num_blocks`.
4. Subscriptions - The Blocktree records a set of slots that have been "subscribed" to. This means entries that chain to these slots will be sent on the Blocktree channel for consumption by the ReplayStage. See the `Blocktree APIs` for details.
5. Update notifications - The Blocktree notifies listeners when slot\(n\).is\_rooted is flipped from false to true for any `n`.
## Blocktree APIs
The Blocktree offers a subscription based API that ReplayStage uses to ask for entries it's interested in. The entries will be sent on a channel exposed by the Blocktree. These subscription API's are as follows: 1. `fn get_slots_since(slot_indexes: &[u64]) -> Vec<SlotMeta>`: Returns new slots connecting to any element of the list `slot_indexes`.
1.`fn get_slot_entries(slot_index: u64, entry_start_index: usize, max_entries: Option<u64>) -> Vec<Entry>`: Returns the entry vector for the slot starting with `entry_start_index`, capping the result at `max` if `max_entries == Some(max)`, otherwise, no upper limit on the length of the return vector is imposed.
Note: Cumulatively, this means that the replay stage will now have to know when a slot is finished, and subscribe to the next slot it's interested in to get the next set of entries. Previously, the burden of chaining slots fell on the Blocktree.
## Interfacing with Bank
The bank exposes to replay stage:
1.`prev_hash`: which PoH chain it's working on as indicated by the hash of the last
entry it processed
2.`tick_height`: the ticks in the PoH chain currently being verified by this
bank
3.`votes`: a stack of records that contain: 1. `prev_hashes`: what anything after this vote must chain to in PoH 2. `tick_height`: the tick height at which this vote was cast 3. `lockout period`: how long a chain must be observed to be in the ledger to
be able to be chained below this vote
Replay stage uses Blocktree APIs to find the longest chain of entries it can hang off a previous vote. If that chain of entries does not hang off the latest vote, the replay stage rolls back the bank to that vote and replays the chain from there.
## Pruning Blocktree
Once Blocktree entries are old enough, representing all the possible forks becomes less useful, perhaps even problematic for replay upon restart. Once a validator's votes have reached max lockout, however, any Blocktree contents that are not on the PoH chain for that vote for can be pruned, expunged.
Archiver nodes will be responsible for storing really old ledger contents, and validators need only persist their bank periodically.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.