* sends slots (instead of stakes) through broadcast flow

  The current broadcast code computes the stakes for each slot before
  sending them down the channel:
  https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
  https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349
  Since the stakes are a function of the epoch the slot belongs to (and so
  do not necessarily change from one slot to the next), forwarding the slot
  itself allows better caching downstream. In addition, the cache needs to
  be invalidated if the epoch changes (which the current code does not do),
  and that requires knowing which slot (and so which epoch) the shreds
  being broadcast belong to:
  https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344

  (cherry picked from commit 44b11154ca)
  # Conflicts:
  #   core/src/broadcast_stage/broadcast_duplicates_run.rs
  #   core/src/broadcast_stage/standard_broadcast_run.rs

* implements cluster-nodes cache

  Cluster-nodes are cached keyed by the epoch from which the stakes are
  obtained, so if the epoch changes the cluster-nodes are recomputed. A
  time-to-live eviction policy is enforced to refresh entries in case
  gossip contact-infos are updated (a minimal sketch follows this commit
  list).

  (cherry picked from commit ecc1c7957f)
* uses cluster-nodes cache in retransmit stage

  The new cluster-nodes cache will:
  * ensure that cluster-nodes are recalculated if the epoch (and so the
    epoch staked nodes) changes.
  * encapsulate the time-to-live eviction policy.

  (cherry picked from commit 30bec3921e)
* uses cluster-nodes cache in broadcast-stage

  The current caching mechanism does not update cluster-nodes when the
  epoch (and so the epoch staked nodes) changes:
  https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
  Additionally, the cache update has a concurrency bug in which the thread
  doing the compare_and_swap may be blocked when it tries to obtain the
  write-lock on the cache, while other threads keep running ahead with the
  outdated cache, since the atomic timestamp is already updated (the race
  is reduced to a sketch after this list).

  In the new ClusterNodesCache, entries are keyed by epoch, so if the
  epoch changes the cluster-nodes are recalculated. The time-to-live
  eviction policy is also encapsulated and rigidly enforced.

  (cherry picked from commit aa32738dd5)
  # Conflicts:
  #   core/src/broadcast_stage/broadcast_duplicates_run.rs
  #   core/src/broadcast_stage/fail_entry_verification_broadcast_run.rs
  #   core/src/broadcast_stage/standard_broadcast_run.rs

* unifies cluster-nodes computation & caching across turbine stages

  Broadcast-stage uses epoch_staked_nodes based on the same slot that the
  shreds belong to:
  https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
  https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349
  But retransmit-stage uses the bank-epoch of the working-bank:
  https://github.com/solana-labs/solana/blob/19bd30262/core/src/retransmit_stage.rs#L272-L289
  So the two are inconsistent at epoch boundaries, where some nodes may
  have a working bank (or similarly a root bank) lagging behind other
  nodes. As a result, a node which receives a packet may construct the
  turbine broadcast tree inconsistently with its parent node in the tree,
  and so some packets may fail to reach all nodes in the tree.

  (cherry picked from commit 50d0e830c9)
* adds fallback & metric for when epoch staked-nodes are none

  (cherry picked from commit fb69f45f14)
* allows only one thread to update cluster-nodes cache entry for an epoch

  If two threads simultaneously call into ClusterNodesCache::get for the
  same epoch, and the cache entry is outdated, then both threads recompute
  the cluster-nodes for the epoch and redundantly overwrite each other.
  This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
  when needed only one thread does the computation to update the entry
  (see the corresponding sketch after this list).

  (cherry picked from commit eaf927cf49)
* falls back on working-bank if root-bank::epoch-staked-nodes is none

  bank.get_leader_schedule_epoch(shred_slot) is one epoch after
  epoch_schedule.get_epoch(shred_slot). At epoch boundaries, the shred is
  already one epoch past the root-slot, so we need epoch-stakes two epochs
  ahead of the root. But the root bank only has epoch-stakes for one epoch
  ahead, and as a result looking up the epoch staked-nodes from the
  root-bank fails. To remain backward compatible with the current master
  code, this commit implements a fallback on the working-bank if the epoch
  staked-nodes obtained from the root-bank are none (see the final sketch
  after this list).

  (cherry picked from commit e4be00fece)
* removes backport merge conflicts

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
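For illustration, here is a minimal single-threaded sketch of the epoch-keyed, TTL-evicting cache described in "implements cluster-nodes cache" above. Every name is a stand-in for the real types in core/src/cluster_nodes.rs, and ClusterNodes is reduced to a stub, so treat this as a sketch of the idea rather than the actual implementation:

use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::time::{Duration, Instant};

type Epoch = u64;

// Stub for the per-epoch cluster-nodes data (stakes, contact-infos, ...).
struct ClusterNodes {
    num_nodes: usize,
}

struct ClusterNodesCache {
    // Refresh period, so that updated gossip contact-infos are picked up.
    ttl: Duration,
    entries: HashMap<Epoch, (Instant, ClusterNodes)>,
}

impl ClusterNodesCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    // Returns the cached cluster-nodes for the epoch, recomputing them if
    // the entry is missing (e.g. the epoch changed) or older than the TTL.
    fn get(
        &mut self,
        epoch: Epoch,
        recompute: impl FnOnce() -> ClusterNodes,
    ) -> &ClusterNodes {
        let now = Instant::now();
        match self.entries.entry(epoch) {
            Entry::Occupied(mut entry) => {
                if now.duration_since(entry.get().0) > self.ttl {
                    *entry.get_mut() = (now, recompute());
                }
                &entry.into_mut().1
            }
            Entry::Vacant(entry) => &entry.insert((now, recompute())).1,
        }
    }
}

fn main() {
    let mut cache = ClusterNodesCache::new(Duration::from_secs(5));
    let nodes = cache.get(200, || ClusterNodes { num_nodes: 1_000 });
    println!("epoch 200: {} nodes", nodes.num_nodes);
}

Keying entries by epoch makes the epoch-change invalidation automatic, while the TTL bounds how stale the gossip contact-infos baked into an entry can get.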
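The concurrency bug called out in "uses cluster-nodes cache in broadcast-stage" reduces to the pattern sketched below. This is a hypothetical reconstruction, not the actual broadcast-stage code, and compare_exchange stands in for the deprecated compare_and_swap:

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

const TTL_MS: u64 = 5_000;

// Placeholder for the expensive cluster-nodes computation.
fn recompute() -> Vec<u8> {
    Vec::new()
}

// The timestamp is bumped *before* the cache contents are rewritten: if the
// winning thread blocks on the write-lock, every other thread observes a
// fresh timestamp and keeps running ahead with the outdated cache.
fn maybe_refresh(last_update_ms: &AtomicU64, cache: &RwLock<Vec<u8>>, now_ms: u64) {
    let prev_ms = last_update_ms.load(Ordering::Acquire);
    let expired = now_ms.saturating_sub(prev_ms) > TTL_MS;
    if expired
        && last_update_ms
            .compare_exchange(prev_ms, now_ms, Ordering::AcqRel, Ordering::Relaxed)
            .is_ok()
    {
        let mut guard = cache.write().unwrap(); // may block on readers...
        *guard = recompute(); // ...who meanwhile treat the stale cache as fresh
    }
}

fn main() {
    let last_update_ms = AtomicU64::new(0);
    let cache = RwLock::new(recompute());
    maybe_refresh(&last_update_ms, &cache, 10_000);
}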
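And a sketch of the fix from "allows only one thread to update cluster-nodes cache entry for an epoch": each cache entry gets its own Arc<Mutex<...>>, so concurrent callers for the same epoch serialize on that entry and exactly one of them recomputes. Again, all names are stand-ins for the real code:

use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};
use std::time::{Duration, Instant};

type Epoch = u64;

struct ClusterNodes {
    num_nodes: usize,
}

struct ClusterNodesCache {
    ttl: Duration,
    // Each entry is individually wrapped in Arc<Mutex<...>>, so refreshing
    // one epoch's entry neither blocks other epochs nor lets two threads
    // recompute the same entry and overwrite each other.
    entries: RwLock<HashMap<Epoch, Arc<Mutex<Option<(Instant, Arc<ClusterNodes>)>>>>>,
}

impl ClusterNodesCache {
    fn get(
        &self,
        epoch: Epoch,
        recompute: impl FnOnce() -> ClusterNodes,
    ) -> Arc<ClusterNodes> {
        // Obtain (or create) the per-epoch entry under a short-lived map lock.
        let entry = {
            let mut entries = self.entries.write().unwrap();
            entries
                .entry(epoch)
                .or_insert_with(|| Arc::new(Mutex::new(None)))
                .clone()
        };
        // Lock only this entry: concurrent callers for the same epoch queue
        // up here, and whichever thread wins recomputes exactly once.
        let mut entry = entry.lock().unwrap();
        if let Some((asof, nodes)) = entry.as_ref() {
            if asof.elapsed() < self.ttl {
                return nodes.clone();
            }
        }
        let nodes = Arc::new(recompute());
        *entry = Some((Instant::now(), nodes.clone()));
        nodes
    }
}

fn main() {
    let cache = ClusterNodesCache {
        ttl: Duration::from_secs(5),
        entries: RwLock::new(HashMap::new()),
    };
    let nodes = cache.get(200, || ClusterNodes { num_nodes: 1_000 });
    println!("epoch 200: {} nodes", nodes.num_nodes);
}

Note that the map lock is held only long enough to clone the entry's Arc, so a slow recomputation for one epoch never blocks lookups for another.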
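Finally, a sketch of the epoch arithmetic behind "falls back on working-bank if root-bank::epoch-staked-nodes is none". Bank here is a minimal stand-in for the solana-runtime type, and the fixed 32-slot epoch is a toy simplification (the real bank derives epochs from its EpochSchedule):

use std::collections::HashMap;

type Epoch = u64;
type Slot = u64;
// Simplified stakes map: node identity -> stake.
type StakedNodes = HashMap<&'static str, u64>;

const SLOTS_PER_EPOCH: u64 = 32; // toy value; the real schedule is configurable

// Minimal stand-in for solana-runtime's Bank, only what this sketch needs.
struct Bank {
    epoch_stakes: HashMap<Epoch, StakedNodes>,
}

impl Bank {
    // The leader schedule for a slot is fixed one epoch in advance, so this
    // is one epoch *after* the epoch the slot itself belongs to.
    fn get_leader_schedule_epoch(&self, slot: Slot) -> Epoch {
        slot / SLOTS_PER_EPOCH + 1
    }

    fn epoch_staked_nodes(&self, epoch: Epoch) -> Option<&StakedNodes> {
        self.epoch_stakes.get(&epoch)
    }
}

// At an epoch boundary the shred slot may already be one epoch past the
// root slot, so the leader-schedule epoch is two epochs past the root's
// epoch, while the root bank only carries stakes one epoch ahead; hence
// the fallback on the working bank.
fn staked_nodes_for_shred<'a>(
    root_bank: &'a Bank,
    working_bank: &'a Bank,
    shred_slot: Slot,
) -> Option<&'a StakedNodes> {
    let epoch = root_bank.get_leader_schedule_epoch(shred_slot);
    root_bank
        .epoch_staked_nodes(epoch)
        .or_else(|| working_bank.epoch_staked_nodes(epoch))
}

fn main() {
    let stakes = |n: u64| StakedNodes::from([("node-a", n)]);
    let root_bank = Bank {
        epoch_stakes: HashMap::from([(0, stakes(1)), (1, stakes(2))]),
    };
    let working_bank = Bank {
        epoch_stakes: HashMap::from([(1, stakes(2)), (2, stakes(3))]),
    };
    // Shred slot 32 is in epoch 1; its leader-schedule epoch is 2, which
    // only the working bank knows about yet.
    assert!(staked_nodes_for_shred(&root_bank, &working_bank, 32).is_some());
    println!("fallback resolved staked nodes for epoch 2");
}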
Building
1. Install rustc, cargo and rustfmt.
$ curl https://sh.rustup.rs -sSf | sh
$ source $HOME/.cargo/env
$ rustup component add rustfmt
Please make sure you are always using the latest stable Rust version by running:
$ rustup update
On Linux systems you may need to install libssl-dev, pkg-config, zlib1g-dev, etc. On Ubuntu:
$ sudo apt-get update
$ sudo apt-get install libssl-dev libudev-dev pkg-config zlib1g-dev llvm clang make
On M1 Macs, make sure you set up your terminal & Homebrew to use Rosetta. You can install it with:
$ softwareupdate --install-rosetta
2. Download the source code.
$ git clone https://github.com/solana-labs/solana.git
$ cd solana
3. Build.
$ cargo build
Testing
Run the test suite:
$ cargo test
Starting a local testnet
To start your own testnet locally, see the instructions in the online docs.
Accessing the remote development cluster
devnet
- a stable public cluster for development, accessible via devnet.solana.com. Runs 24/7. Learn more about the public clusters.
Benchmarking
First install the nightly build of rustc. cargo bench requires the use of unstable features only available in the nightly build.
$ rustup install nightly
Run the benchmarks:
$ cargo +nightly bench
Release Process
The release process for this project is described here.
Code coverage
To generate code coverage statistics:
$ scripts/coverage.sh
$ open target/cov/lcov-local/index.html
Why coverage? While most see coverage as a code quality metric, we see it primarily as a developer productivity metric. When a developer makes a change to the codebase, presumably it's a solution to some problem. Our unit-test suite is how we encode the set of problems the codebase solves. Running the test suite should indicate that your change didn't infringe on anyone else's solutions. Adding a test protects your solution from future changes. If you don't understand why a line of code exists, try deleting it and running the unit-tests. The nearest test failure should tell you what problem was solved by that code. If no test fails, go ahead and submit a Pull Request that asks, "what problem is solved by this code?" On the other hand, if a test does fail and you can think of a better way to solve the same problem, a Pull Request with your solution would most certainly be welcome! Likewise, if rewriting a test can better communicate what code it's protecting, please send us that patch!
Disclaimer
All claims, content, designs, algorithms, estimates, roadmaps, specifications, and performance measurements described in this project are done with the Solana Foundation's ("SF") best efforts. It is up to the reader to check and validate their accuracy and truthfulness. Furthermore, nothing in this project constitutes a solicitation for investment.
Any content produced by SF or developer resources that SF provides, are for educational and inspiration purposes only. SF does not encourage, induce or sanction the deployment, integration or use of any such applications (including the code comprising the Solana blockchain protocol) in violation of applicable laws or regulations and hereby prohibits any such deployment, integration or use. This includes use of any such applications by the reader (a) in violation of export control or sanctions laws of the United States or any other applicable jurisdiction, (b) if the reader is located in or ordinarily resident in a country or territory subject to comprehensive sanctions administered by the U.S. Office of Foreign Assets Control (OFAC), or (c) if the reader is or is working on behalf of a Specially Designated National (SDN) or a person subject to similar blocking or denied party prohibitions.
The reader should be aware that U.S. export control and sanctions laws prohibit U.S. persons (and other persons that are subject to such laws) from transacting with persons in certain countries and territories or that are on the SDN list. As a project based primarily on open-source software, it is possible that such sanctioned persons may nevertheless bypass prohibitions, obtain the code comprising the Solana blockchain protocol (or other project code or applications) and deploy, integrate, or otherwise use it. Accordingly, there is a risk to individuals that other persons using the Solana blockchain protocol may be sanctioned persons and that transactions with such persons would be a violation of U.S. export controls and sanctions law. This risk applies to individuals, organizations, and other ecosystem participants that deploy, integrate, or use the Solana blockchain protocol code directly (e.g., as a node operator), and individuals that transact on the Solana blockchain through light clients, third party interfaces, and/or wallet software.