* removes raw indexing from streamer (#19183)
Raw indexing is verbose and error-prone. This same code had an indexing
bug causing validator nodes to panic just a few months ago:
https://github.com/solana-labs/solana/commit/482b8c6be
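
To illustrate the difference, here is a minimal sketch of checked slice
access replacing raw indexing. The helper `read_u32_le` is hypothetical and
is not taken from the streamer code; it only shows how `slice::get` turns an
out-of-range read into `None` instead of a panic.

```rust
/// Hypothetical helper (not from the streamer crate): reads a little-endian
/// u32 starting at `offset`, returning None when the range is out of bounds.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against overflow of the end index itself.
    let end = offset.checked_add(4)?;
    // `get` returns None for an out-of-range slice; `buf[offset..end]`
    // would panic in the same situation.
    let bytes = buf.get(offset..end)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes);
    Some(u32::from_le_bytes(raw))
}
```

A caller can then decide how to handle a short buffer (drop the packet,
count a metric) rather than taking the whole process down.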
(cherry picked from commit 8229a4fbf6)
# Conflicts:
# streamer/Cargo.toml
* removes backport merge conflicts
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
* sends slots (instead of stakes) through broadcast flow
Current broadcast code is computing stakes for each slot before sending
them down the channel:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349
Since the stakes are a function of the epoch the slot belongs to (and so
do not necessarily change from one slot to another), forwarding the
slot itself would allow better caching downstream.
In addition, we need to invalidate the cache if the epoch changes (which
the current code does not do), and that requires knowing which slot (and
so which epoch) the shreds currently being broadcast belong to:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
(cherry picked from commit 44b11154ca)
# Conflicts:
# core/src/broadcast_stage/broadcast_duplicates_run.rs
# core/src/broadcast_stage/standard_broadcast_run.rs
* implements cluster-nodes cache
Cluster nodes are cached, keyed by the epoch from which stakes are
obtained, so if the epoch changes cluster-nodes will be recomputed.
A time-to-live eviction policy is enforced to refresh entries in case
gossip contact-infos are updated.
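
The idea can be sketched as a small epoch-keyed cache with time-to-live
eviction. This is an illustration only: `EpochCache`, its fields, and the
`compute` callback are assumed names, not the actual ClusterNodesCache API.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

type Epoch = u64;

/// Hypothetical cache: one entry per epoch, refreshed once it is older
/// than `ttl`.
struct EpochCache<T> {
    ttl: Duration,
    entries: HashMap<Epoch, (Instant, T)>,
}

impl<T: Clone> EpochCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached value for `epoch`, recomputing it when the entry
    /// is missing or has outlived the time-to-live.
    fn get(&mut self, epoch: Epoch, compute: impl FnOnce() -> T) -> T {
        let now = Instant::now();
        if let Some((created, value)) = self.entries.get(&epoch) {
            if now.duration_since(*created) < self.ttl {
                return value.clone();
            }
        }
        // A new epoch is a new key, so it always lands here and recomputes.
        let value = compute();
        self.entries.insert(epoch, (now, value.clone()));
        value
    }
}
```

Because the epoch is the key, an epoch change misses the cache without any
explicit invalidation, while the TTL bounds how stale an entry can get
within one epoch.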
(cherry picked from commit ecc1c7957f)
* uses cluster-nodes cache in retransmit stage
The new cluster-nodes cache will:
* ensure cluster-nodes are recalculated if the epoch (and so the epoch
staked nodes) changes.
* encapsulate time-to-live eviction policy.
(cherry picked from commit 30bec3921e)
* uses cluster-nodes cache in broadcast-stage
* Current caching mechanism does not update cluster-nodes when the epoch
(and so epoch staked nodes) changes:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
* Additionally, the cache update has a concurrency bug in which the
thread which does compare_and_swap may be blocked when it tries to
obtain the write-lock on cache, while other threads will keep running
ahead with the outdated cache (since the atomic timestamp is already
updated).
In the new ClusterNodesCache, entries are keyed by epoch, and so if
epoch changes cluster-nodes will be recalculated. The time-to-live
eviction policy is also encapsulated and rigidly enforced.
(cherry picked from commit aa32738dd5)
# Conflicts:
# core/src/broadcast_stage/broadcast_duplicates_run.rs
# core/src/broadcast_stage/fail_entry_verification_broadcast_run.rs
# core/src/broadcast_stage/standard_broadcast_run.rs
* unifies cluster-nodes computation & caching across turbine stages
Broadcast-stage is using epoch_staked_nodes based on the same slot that
shreds belong to:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349
But retransmit-stage is using bank-epoch of the working-bank:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/retransmit_stage.rs#L272-L289
So the two are not consistent at epoch boundaries, where some nodes may
have a working bank (or similarly a root bank) lagging other nodes. As a
result, the node which obtains a packet may construct the turbine
broadcast tree inconsistently with its parent node in the tree, and so
some packets may fail to reach all nodes in the tree.
(cherry picked from commit 50d0e830c9)
* adds fallback & metric for when epoch staked-nodes are none
(cherry picked from commit fb69f45f14)
* allows only one thread to update cluster-nodes cache entry for an epoch
If two threads simultaneously call into ClusterNodesCache::get for the
same epoch, and the cache entry is outdated, then both threads recompute
cluster-nodes for the epoch and redundantly overwrite each other.
This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
when needed only one thread does the computations to update the entry.
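
A minimal sketch of this locking pattern follows, assuming a simplified
cache. `SharedEpochCache`, `Slot`, and the `compute` callback are
illustrative names and do not reproduce the real ClusterNodesCache. The
outer lock is held only to look up the slot, and the per-entry mutex
serializes recomputation for a single epoch.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

type Epoch = u64;
/// One slot per epoch; None until the first computation finishes.
type Slot<T> = Arc<Mutex<Option<(Instant, Arc<T>)>>>;

/// Hypothetical cache used only to illustrate per-entry locking.
struct SharedEpochCache<T> {
    ttl: Duration,
    entries: Mutex<HashMap<Epoch, Slot<T>>>,
}

impl<T> SharedEpochCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: Mutex::new(HashMap::new()) }
    }

    fn get(&self, epoch: Epoch, compute: impl FnOnce() -> T) -> Arc<T> {
        // Hold the map lock just long enough to fetch or create the slot.
        let slot = {
            let mut entries = self.entries.lock().unwrap();
            Arc::clone(entries.entry(epoch).or_default())
        };
        // Threads asking for the same epoch queue up on this lock, so only
        // the first one runs `compute`; the rest see the fresh entry.
        let mut guard = slot.lock().unwrap();
        if let Some((created, value)) = &*guard {
            if created.elapsed() < self.ttl {
                return Arc::clone(value);
            }
        }
        let value = Arc::new(compute());
        *guard = Some((Instant::now(), Arc::clone(&value)));
        value
    }
}
```

Lookups for different epochs do not block each other during recomputation,
since each epoch has its own mutex.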
(cherry picked from commit eaf927cf49)
* falls back on working-bank if root-bank::epoch-staked-nodes is none
bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).
At epoch boundaries, the shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.
To be backward compatible with the current master code, this commit
implements a fallback on working-bank if epoch staked-nodes obtained
from the root-bank is none.
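
The fallback can be pictured with the sketch below. `Bank` here is a
hypothetical stand-in that only stores per-epoch stakes, and
`staked_nodes_with_fallback` is an assumed name; neither mirrors the real
bank API.

```rust
use std::collections::HashMap;

type Epoch = u64;
/// Stake per node, keyed by a string identity for illustration.
type Stakes = HashMap<String, u64>;

/// Hypothetical stand-in for a bank: it only knows the stakes of the
/// epochs it has already computed.
struct Bank {
    epoch_stakes: HashMap<Epoch, Stakes>,
}

impl Bank {
    fn epoch_staked_nodes(&self, epoch: Epoch) -> Option<&Stakes> {
        self.epoch_stakes.get(&epoch)
    }
}

/// Prefers the root bank and falls back on the working bank when the root
/// bank has no stakes for `epoch`.
fn staked_nodes_with_fallback<'a>(
    root_bank: &'a Bank,
    working_bank: &'a Bank,
    epoch: Epoch,
) -> Option<&'a Stakes> {
    root_bank
        .epoch_staked_nodes(epoch)
        .or_else(|| working_bank.epoch_staked_nodes(epoch))
}
```

If both banks lack the epoch the result is still `None`, so callers need a
policy for that case as well.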
(cherry picked from commit e4be00fece)
* removes backport merge conflicts
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
Rationalize usage of sendmmsg(2). Skip packets which failed to send and track failures.
(cherry picked from commit ae5ad5cf9b)
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
* Revert "temporarily disable new audit"
This reverts commit 3dfbd95ddc.
* Bump version of zeroize_derive from v1.0.0 to v1.2.0
(cherry picked from commit 0c62a6fe3f)
Co-authored-by: Justin Starry <justin@solana.com>
* sigverify to identify and mark simple vote transaction (#20021)
* check vote tx at get_packet_offsets to cover both cpu and gpu paths
* add pubkey_len to PacketOffsets to reduce the redundant bytes counting
* allow vote to have 1 or 2 sigs (#20082)
* windows: Make solana-test-validator work (#20099)
* windows: Make solana-test-validator work
The important changes to get this going on Windows:
* ledger lock needs to be done on a file instead of the directory
* IPC service needs to use the Windows pipe naming scheme
* always disable the JIT
* file logging not possible yet because we can't redirect stderr,
but this will change once env_logger fixes the pipe output target!
* Integrate review feedback
(cherry picked from commit 567f30aa1a)
# Conflicts:
# validator/src/bin/solana-test-validator.rs
# validator/src/lib.rs
# validator/src/main.rs
* Fix merge conflicts
Co-authored-by: Jon Cinque <jon.cinque@gmail.com>
* Add stable_log output when a program is loaded as native code instead of BPF
(cherry picked from commit 34f5020457)
* Add ProgramTest::add_builtin_program()
This permits the unit testing of builtin programs in the ProgramTest environment
(cherry picked from commit 830ca369f1)
Co-authored-by: Michael Vines <mvines@gmail.com>
* move `./run.sh` into `./scripts`
(cherry picked from commit 92e343da26)
* add some guidance in place of `./run.sh`
(cherry picked from commit 33de7b856f)
Co-authored-by: Trent Nelson <trent@solana.com>
* client: Add retry logic on Pubsub 429s (#19990)
(cherry picked from commit e9b066d497)
* Use exponential backoff for older version of tungstenite
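
For reference, a generic sketch of retrying with capped exponential
backoff. It is not the client code from this change: `backoff_delay`,
`retry_with_backoff`, and the `is_retryable` predicate are assumed names,
and the blocking `std::thread::sleep` is a simplification.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// capped at `max_delay`.
fn backoff_delay(attempt: u32, base: Duration, max_delay: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing on large attempts.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor)
        .map_or(max_delay, |delay| delay.min(max_delay))
}

/// Runs `op` until it succeeds, fails with a non-retryable error, or
/// `max_retries` retries have been used up.
fn retry_with_backoff<T, E>(
    max_retries: u32,
    base: Duration,
    max_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    is_retryable: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Err(err) if attempt < max_retries && is_retryable(&err) => {
                std::thread::sleep(backoff_delay(attempt, base, max_delay));
                attempt += 1;
            }
            // Success, a non-retryable error, or retries exhausted.
            result => return result,
        }
    }
}
```

In the pubsub case the predicate would match an HTTP 429 response, so that
other errors surface immediately.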
Co-authored-by: Jon Cinque <jon.cinque@gmail.com>
* Optimize RPC pubsub for multiple clients with the same subscription (#18943)
* reimplement rpc pubsub with a broadcast queue
* update tests for new pubsub implementation
* fix: fix review suggestions
* chore(rpc): add additional pubsub metrics
* integrate max subscriptions check into SubscriptionTracker to reduce locking
* separate subscription control from tracker
* limit memory usage of items in pubsub broadcast queue, improve error handling
* add more pubsub metrics
* add final count metrics to pubsub
* add metric for total number of subscriptions
* fix small review suggestions
* remove by_params from SubscriptionTracker and add node_progress_watchers map instead
* add subscription tracker tests
* add metrics for number of pubsub notifications as a counter
* ignore clippy lint in TokenCounter
* fix underflow in token counter
* reduce queue capacity in pubsub tests
* fix(rpc): fix test timeouts
* fix race in account subscription test
* Add RpcSubscriptions::new_for_tests
Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
(cherry picked from commit 65227f44dc)
# Conflicts:
# Cargo.lock
# core/Cargo.toml
# core/src/replay_stage.rs
# core/src/validator.rs
# replica-node/src/replica_node.rs
# rpc/Cargo.toml
* Fix conflicts (and standardize naming to make future subscription backports easier)
Co-authored-by: Pavel Strakhov <ri@idzaaus.org>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* rpc: performance fix for getProgramAccounts
The accounts were gradually pushed into a vector, which produced
significant slowdowns for very large responses.
* rpc: rewrite loops using iterators
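
As a generic illustration of the loop-to-iterator rewrite (not the actual
getProgramAccounts code), the sketch below builds the response with an
iterator chain and a single `collect` instead of a manual `push` loop. The
`(String, u64)` account tuples, the `min_lamports` filter, and the function
name are assumptions made up for this example.

```rust
/// Hypothetical example: keeps accounts with at least `min_lamports` and
/// encodes each one as "key:lamports".
fn encode_accounts(accounts: &[(String, u64)], min_lamports: u64) -> Vec<String> {
    accounts
        .iter()
        // Filter first so only matching accounts are encoded.
        .filter(|(_, lamports)| *lamports >= min_lamports)
        .map(|(key, lamports)| format!("{}:{}", key, lamports))
        // One collect builds the vector; no mutable accumulator is needed.
        .collect()
}
```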
Co-authored-by: Christian Kamm <ckamm@delightful-solutions.de>
(cherry picked from commit f1bbf1d8b0)
Co-authored-by: Christian Kamm <mail@ckamm.de>