mergify[bot] d68377e927 unifies cluster-nodes computation & caching across turbine stages (backport #18971) (#20231)
* sends slots (instead of stakes) through broadcast flow

Current broadcast code is computing stakes for each slot before sending
them down the channel:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349

Since the stakes are a function of the epoch the slot belongs to (and so
do not necessarily change from one slot to another), forwarding the
slot itself would allow better caching downstream.

In addition, we need to invalidate the cache if the epoch changes (which
the current code does not do), and that requires knowing which slot (and
so which epoch) the shreds currently being broadcast belong to:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
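
A minimal sketch of the idea above, not the code from this commit: the
sender forwards the slot alongside the payload, and the receiver derives the
epoch and only computes stakes when it sees an epoch for the first time. The
names epoch_of and drain_and_count_stake_lookups are hypothetical, and
fixed-length epochs are an assumption made for brevity (slots_per_epoch must
be non-zero).

```rust
use std::collections::HashMap;
use std::sync::mpsc;

type Slot = u64;
type Epoch = u64;

/// Assumption: every epoch has the same number of slots.
pub fn epoch_of(slot: Slot, slots_per_epoch: u64) -> Epoch {
    slot / slots_per_epoch
}

/// Drains `(slot, payload)` messages until all senders are dropped and
/// returns how many times stakes had to be computed.
pub fn drain_and_count_stake_lookups(
    receiver: mpsc::Receiver<(Slot, Vec<u8>)>,
    slots_per_epoch: u64,
) -> usize {
    let mut stakes_by_epoch: HashMap<Epoch, u64> = HashMap::new();
    let mut lookups = 0;
    for (slot, _payload) in receiver {
        let epoch = epoch_of(slot, slots_per_epoch);
        // The closure runs only when this epoch is not cached yet.
        stakes_by_epoch.entry(epoch).or_insert_with(|| {
            lookups += 1;
            epoch // placeholder for the expensive stakes computation
        });
    }
    lookups
}
```

Because the receiver sees the slot rather than precomputed stakes, it can
decide for itself when a lookup is needed, which is what makes the
downstream caching possible.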

(cherry picked from commit 44b11154ca)

# Conflicts:
#	core/src/broadcast_stage/broadcast_duplicates_run.rs
#	core/src/broadcast_stage/standard_broadcast_run.rs

* implements cluster-nodes cache

Cluster nodes are cached, keyed by the epoch from which stakes are
obtained, so if the epoch changes, cluster-nodes will be recomputed.

A time-to-live eviction policy is enforced to refresh entries in case
gossip contact-infos are updated.
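
As a rough illustration of an epoch-keyed cache with time-to-live eviction,
the sketch below recomputes an entry when it is missing or older than the
TTL. EpochCache and the simplified ClusterNodes struct are hypothetical
stand-ins, not the types introduced by this commit, and the current time is
passed in explicitly only to keep the example deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

type Epoch = u64;

/// Hypothetical stand-in for the real cluster-nodes structure.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterNodes {
    pub epoch: Epoch,
}

/// Sketch of a cache keyed by epoch with a time-to-live per entry.
pub struct EpochCache {
    ttl: Duration,
    entries: HashMap<Epoch, (Instant, ClusterNodes)>,
}

impl EpochCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached value for `epoch`, calling `compute` when the
    /// entry is missing or at least `ttl` old as of `now`.
    pub fn get<F>(&mut self, epoch: Epoch, now: Instant, compute: F) -> ClusterNodes
    where
        F: FnOnce(Epoch) -> ClusterNodes,
    {
        if let Some((asof, nodes)) = self.entries.get(&epoch) {
            if now.saturating_duration_since(*asof) < self.ttl {
                return nodes.clone();
            }
        }
        let nodes = compute(epoch);
        self.entries.insert(epoch, (now, nodes.clone()));
        nodes
    }
}
```

A new epoch misses the cache because it is a different key, and a stale
entry for the same epoch is refreshed once the TTL has elapsed.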

(cherry picked from commit ecc1c7957f)

* uses cluster-nodes cache in retransmit stage

The new cluster-nodes cache will:
  * ensure cluster-nodes are recalculated if the epoch (and so the epoch
    staked nodes) changes.
  * encapsulate time-to-live eviction policy.

(cherry picked from commit 30bec3921e)

* uses cluster-nodes cache in broadcast-stage

* Current caching mechanism does not update cluster-nodes when the epoch
  (and so epoch staked nodes) changes:
  https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344

* Additionally, the cache update has a concurrency bug in which the
  thread which does compare_and_swap may be blocked when it tries to
  obtain the write-lock on cache, while other threads will keep running
  ahead with the outdated cache (since the atomic timestamp is already
  updated).

In the new ClusterNodesCache, entries are keyed by epoch, and so if
epoch changes cluster-nodes will be recalculated. The time-to-live
eviction policy is also encapsulated and rigidly enforced.

(cherry picked from commit aa32738dd5)

# Conflicts:
#	core/src/broadcast_stage/broadcast_duplicates_run.rs
#	core/src/broadcast_stage/fail_entry_verification_broadcast_run.rs
#	core/src/broadcast_stage/standard_broadcast_run.rs

* unifies cluster-nodes computation & caching across turbine stages

Broadcast-stage is using epoch_staked_nodes based on the same slot that
shreds belong to:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349

But retransmit-stage is using bank-epoch of the working-bank:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/retransmit_stage.rs#L272-L289

So the two are not consistent at epoch boundaries, where some nodes may
have a working bank (or similarly a root bank) lagging behind other nodes.
As a result, a node which receives a packet may construct the turbine
broadcast tree inconsistently with its parent node in the tree, and so some
packets may fail to reach all nodes in the tree.

(cherry picked from commit 50d0e830c9)

* adds fallback & metric for when epoch staked-nodes are none

(cherry picked from commit fb69f45f14)

* allows only one thread to update cluster-nodes cache entry for an epoch

If two threads simultaneously call into ClusterNodesCache::get for the
same epoch, and the cache entry is outdated, then both threads recompute
cluster-nodes for the epoch and redundantly overwrite each other.

This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
when needed only one thread does the computations to update the entry.
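
The per-entry locking described above could look roughly like the following
sketch. It is an illustration under assumptions, not the actual
ClusterNodesCache: the type PerEntryLockedCache and its layout are
hypothetical. The outer lock protects only the map, while each entry has its
own mutex, so threads asking for the same epoch queue up on that entry and
only the first one recomputes.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

type Epoch = u64;
type Entry<T> = Option<(Instant, Arc<T>)>;

pub struct PerEntryLockedCache<T> {
    ttl: Duration,
    // The outer lock guards the map only; each entry has its own lock.
    entries: Mutex<HashMap<Epoch, Arc<Mutex<Entry<T>>>>>,
}

impl<T> PerEntryLockedCache<T> {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: Mutex::new(HashMap::new()) }
    }

    pub fn get<F: FnOnce() -> T>(&self, epoch: Epoch, compute: F) -> Arc<T> {
        // Hold the map lock just long enough to fetch or create the entry.
        let entry = {
            let mut map = self.entries.lock().unwrap();
            map.entry(epoch).or_default().clone()
        };
        // Threads racing on the same epoch serialize here. The first one
        // recomputes; the rest then observe the fresh value.
        let mut guard = entry.lock().unwrap();
        if let Some((asof, value)) = &*guard {
            if asof.elapsed() < self.ttl {
                return Arc::clone(value);
            }
        }
        let value = Arc::new(compute());
        *guard = Some((Instant::now(), Arc::clone(&value)));
        value
    }
}
```

Lookups for different epochs do not block each other while a value is being
computed, because the map lock is released before the entry lock is taken.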

(cherry picked from commit eaf927cf49)

* falls back on working-bank if root-bank::epoch-staked-nodes is none

bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).

At epoch boundaries, the shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.

To be backward compatible with the current master code, this commit
implements a fallback on working-bank if epoch staked-nodes obtained
from the root-bank is none.
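
The fallback can be pictured with the sketch below. The Bank struct here is
a hypothetical, heavily simplified stand-in that only stores staked nodes
per epoch, and staked_nodes_with_fallback is an illustrative name, not a
function from the codebase.

```rust
use std::collections::HashMap;

type Epoch = u64;
/// Simplified: node identifier to stake.
type StakedNodes = HashMap<String, u64>;

/// Hypothetical stand-in for a bank; only holds staked nodes per epoch.
pub struct Bank {
    pub epoch_stakes: HashMap<Epoch, StakedNodes>,
}

impl Bank {
    pub fn epoch_staked_nodes(&self, epoch: Epoch) -> Option<&StakedNodes> {
        self.epoch_stakes.get(&epoch)
    }
}

/// Prefers the root bank and consults the working bank only when the
/// root bank has no staked nodes for `epoch`.
pub fn staked_nodes_with_fallback<'a>(
    root_bank: &'a Bank,
    working_bank: &'a Bank,
    epoch: Epoch,
) -> Option<&'a StakedNodes> {
    root_bank
        .epoch_staked_nodes(epoch)
        .or_else(|| working_bank.epoch_staked_nodes(epoch))
}
```

If neither bank knows the epoch, the lookup still returns None, and the
caller has to handle that case separately.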

(cherry picked from commit e4be00fece)

* removes backport merge conflicts

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-09-26 23:45:42 +00:00

Solana


Building

1. Install rustc, cargo and rustfmt.

$ curl https://sh.rustup.rs -sSf | sh
$ source $HOME/.cargo/env
$ rustup component add rustfmt

Please make sure you are always using the latest stable rust version by running:

$ rustup update

On Linux systems you may need to install libssl-dev, pkg-config, zlib1g-dev, etc. On Ubuntu:

$ sudo apt-get update
$ sudo apt-get install libssl-dev libudev-dev pkg-config zlib1g-dev llvm clang make

On Mac M1s, make sure you set up your terminal & homebrew to use Rosetta. You can install it with:

$ softwareupdate --install-rosetta

2. Download the source code.

$ git clone https://github.com/solana-labs/solana.git
$ cd solana

3. Build.

$ cargo build

Testing

Run the test suite:

$ cargo test

Starting a local testnet

Start your own testnet locally; instructions are in the online docs.

Accessing the remote development cluster

  • devnet - stable public cluster for development accessible via devnet.solana.com. Runs 24/7. Learn more about the public clusters

Benchmarking

First install the nightly build of rustc. cargo bench requires use of the unstable features only available in the nightly build.

$ rustup install nightly

Run the benchmarks:

$ cargo +nightly bench

Release Process

The release process for this project is described here.

Code coverage

To generate code coverage statistics:

$ scripts/coverage.sh
$ open target/cov/lcov-local/index.html

Why coverage? While most see coverage as a code quality metric, we see it primarily as a developer productivity metric. When a developer makes a change to the codebase, presumably it's a solution to some problem. Our unit-test suite is how we encode the set of problems the codebase solves. Running the test suite should indicate that your change didn't infringe on anyone else's solutions. Adding a test protects your solution from future changes. Say you don't understand why a line of code exists: try deleting it and running the unit tests. The nearest test failure should tell you what problem was solved by that code. If no test fails, go ahead and submit a Pull Request that asks, "what problem is solved by this code?" On the other hand, if a test does fail and you can think of a better way to solve the same problem, a Pull Request with your solution would most certainly be welcome! Likewise, if rewriting a test can better communicate what code it's protecting, please send us that patch!

Disclaimer

All claims, content, designs, algorithms, estimates, roadmaps, specifications, and performance measurements described in this project are done with the Solana Foundation's ("SF") best efforts. It is up to the reader to check and validate their accuracy and truthfulness. Furthermore nothing in this project constitutes a solicitation for investment.

Any content produced by SF or developer resources that SF provides, are for educational and inspiration purposes only. SF does not encourage, induce or sanction the deployment, integration or use of any such applications (including the code comprising the Solana blockchain protocol) in violation of applicable laws or regulations and hereby prohibits any such deployment, integration or use. This includes use of any such applications by the reader (a) in violation of export control or sanctions laws of the United States or any other applicable jurisdiction, (b) if the reader is located in or ordinarily resident in a country or territory subject to comprehensive sanctions administered by the U.S. Office of Foreign Assets Control (OFAC), or (c) if the reader is or is working on behalf of a Specially Designated National (SDN) or a person subject to similar blocking or denied party prohibitions.

The reader should be aware that U.S. export control and sanctions laws prohibit U.S. persons (and other persons that are subject to such laws) from transacting with persons in certain countries and territories or that are on the SDN list. As a project based primarily on open-source software, it is possible that such sanctioned persons may nevertheless bypass prohibitions, obtain the code comprising the Solana blockchain protocol (or other project code or applications) and deploy, integrate, or otherwise use it. Accordingly, there is a risk to individuals that other persons using the Solana blockchain protocol may be sanctioned persons and that transactions with such persons would be a violation of U.S. export controls and sanctions law. This risk applies to individuals, organizations, and other ecosystem participants that deploy, integrate, or use the Solana blockchain protocol code directly (e.g., as a node operator), and individuals that transact on the Solana blockchain through light clients, third party interfaces, and/or wallet software.

Description
Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.