* Add function for changing thread's nice value
Linux only.
* Add validator option to change niceness of snapshot packager thread
* Add validator option to change niceness of RPC server threads
Fixes https://github.com/solana-labs/solana/issues/14556
* Run `./scripts/cargo-for-all-lock-files.sh tree`
* Add missing websocket methods to rust RPC PubSub client (#21065)
- Added accountSubscribe, programSubscribe, slotSubscribe and rootSubscribe to rust RpcClient
- Removed duplication on cleanup threads
- Moved RPCVote from rpc/ to client/rpc_response
(cherry picked from commit a0f9e0e8ee)
# Conflicts:
# Cargo.lock
# client-test/Cargo.toml
# core/tests/client.rs
* Fix conflicts
* Make test result not depend on TestValidator setup
Co-authored-by: Manuel Gil <manugildev@gmail.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
Merge AccountsDb plugin framework to v1.8 (#20518)
Summary of Changes
Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows
Data stores of connection strings/credentials to be configured,
Accounts with patterns to be streamed
PostgreSQL implementation of the streaming for different destination stores to be plugged in.
The code comprises 4 major parts:
accountsdb-plugin-intf: defines the plugin interface which concrete plugin should implement.
accountsdb-plugin-manager: manages the load/unload of plugins and provide interfaces which the validator can notify of accounts update to plugins.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL
The validator integrations: updated streamed right after snapshot restore and after account update from transaction processing or other real updates.
The plugin is optionally loaded on demand by new validator CLI argument -- there is no impact if the plugin is not loaded.
* Optimize RPC pubsub for multiple clients with the same subscription (#18943)
* reimplement rpc pubsub with a broadcast queue
* update tests for new pubsub implementation
* fix: fix review suggestions
* chore(rpc): add additional pubsub metrics
* integrate max subscriptions check into SubscriptionTracker to reduce locking
* separate subscription control from tracker
* limit memory usage of items in pubsub broadcast queue, improve error handling
* add more pubsub metrics
* add final count metrics to pubsub
* add metric for total number of subscriptions
* fix small review suggestions
* remove by_params from SubscriptionTracker and add node_progress_watchers map instead
* add subscription tracker tests
* add metrics for number of pubsub notifications as a counter
* ignore clippy lint in TokenCounter
* fix underflow in token counter
* reduce queue capacity in pubsub tests
* fix(rpc): fix test timeouts
* fix race in account subscription test
* Add RpcSubscriptions::new_for_tests
Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
(cherry picked from commit 65227f44dc)
# Conflicts:
# Cargo.lock
# core/Cargo.toml
# core/src/replay_stage.rs
# core/src/validator.rs
# replica-node/src/replica_node.rs
# rpc/Cargo.toml
* Fix conflicts (and standardize naming to make future subscription backports easier
Co-authored-by: Pavel Strakhov <ri@idzaaus.org>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* Make account shrink configurable #17544 (#17778)
1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true
Fixes#17544
(cherry picked from commit 269d995832)
# Conflicts:
# core/tests/snapshots.rs
# ledger/src/bank_forks_utils.rs
# runtime/src/accounts_db.rs
# runtime/src/snapshot_utils.rs
* fix some merge errors
Co-authored-by: Lijun Wang <83639177+lijunwangs@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
* verify bank hash on startup with ledger tool option (#17939)
(cherry picked from commit f558b9b6bf)
# Conflicts:
# core/tests/snapshots.rs
# ledger/src/bank_forks_utils.rs
# runtime/src/snapshot_utils.rs
* fix merge errors
Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
* add metrics for startup
* roll timings up higher
* fix test
* fix duplicate
(cherry picked from commit 471b34132e)
# Conflicts:
# ledger/src/bank_forks_utils.rs
# runtime/src/snapshot_utils.rs
conflicts because #17778 is not present.
Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>
* Avoid full-range compactions with periodic filtered b.g. ones (#16697)
* Update rocksdb to v0.16.0
* Promote the infrequent and important log to info!
* Force background compaction by ttl without manual compaction
* Fix test
* Support no compaction mode in test_ledger_cleanup_compaction
* Fix comment
* Make compaction_interval customizable
* Avoid major compaction with periodic filtering...
* Adress lazy_static, special cfs and range check
* Clean up a bit and add comment
* Add comment
* More comments...
* Config code cleanup
* Add comment
* Use .conflicts_with()
* Nullify unneeded delete_range ops for special CFs
* Some clean ups
* Clarify the locking intention
* Ensure special CFs' consistency with PurgeType::CompactionFilter
* Fix comment
* Fix bad copy paste
* Fix various types...
* Don't use tuples
* Add a unit test for compaction_filter
* Fix typo...
* Remove flag and just use new behavior always
* Fix wrong condition negation...
* Doc. about no set_last_purged_slot in purge_slots
* Write a test and fix off-by-one bug....
* Apply suggestions from code review
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Follow up to github review suggestions
* Fix line-wrapping
* Fix conflict
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
(cherry picked from commit 1f97b2365f)
# Conflicts:
# ledger/src/blockstore_db.rs
* Fix conflict
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
* Move gossip modules to solana-gossip
* Update Protocol abi digest due to move
* Move gossip benches and hook up CI
* Remove unneeded Result entries
* Single use statements
For all code paths (gossip push, pull, purge, etc) that remove or
override a crds value, it is necessary to record hash of values purged
from crds table, in order to exclude them from subsequent pull-requests;
otherwise the next pull request will likely return outdated values,
wasting bandwidth:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491
Currently this is done all over the place in multiple modules, and this
has caused bugs in the past where purged values were not recorded.
This commit encapsulated this bookkeeping into crds module, so that any
code path which removes or overrides a crds value, also records the hash
of purged value in-place.
In order to remove port-based forwarding logic in turbine, we need to
first track how often the turbine retransmit/broadcast trees mismatch
across nodes.
One consistency condition is that if the node is on the critical path
(i.e. the first node in each neighborhood), then we expect that the
packet arrives at tvu socket as opposed to tvu-forwards.
This commit adds a metric to track how often above condition is not met.
If stakes are unknown, then timeouts will be short, resulting in values
being purged from the crds table, and consequently higher pull-response
load when they are obtained again from gossip. In particular, this slows
down validator start where almost all values obtained from entrypoint
are immediately discarded.
* Upgrade Rust to 1.52.0
update nightly_version to newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0, include this version to script
* upgrade to Rust 1.52.1
* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
* purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subjected to this new options.
* The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain. And the options are propagated down the call chains. Their corresponding shell scripts were changed accordingly.
* SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain
* Unit tests are developed to cover purge_old_snapshot_archives
Having an ordinal index on crds values based on insert order allows to
efficiently filter values using a cursor. In particular
CrdsGossipPush::push_messages hash-map can be replaced with a cursor,
saving on the bookkeepings, purging, etc
VersionedCrdsValue.insert_timestamp is used for fetching crds values
inserted since last query:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298
So it is crucial that insert_timestamp does not go backward in time when
new values are inserted into the table. However std::time::SystemTime is
not monotonic, or due to workload, lock contention, thread scheduling,
etc, ... new values may be inserted with a stalled timestamp way in the
past. Additionally, reading system time for the above purpose is
inefficient/unnecessary.
This commit adds an ordinal index to crds values indicating their insert
order. Additionally, it implements a new Cursor type for fetching values
inserted since last query.
IP addresses need to be validated before sending packets to them.
This commit, sends a ping packet to nodes before any pull requests.
Pull requests are then only sent to the nodes which have responded with
the correct hash of their respective ping packet.
rocksdb compaction can cause long stalls, so
make it more configurable to try and reduce those stalls
and also to coordinate between multiple nodes to not induce
stall at the same time.
* Update tonic & prost, and regenerate proto
* Reignore doc code
* Revert pull #14367, but pin tokio to v0.2 for jsonrpc
* Bump backoff and goauth -> and therefore tokio
* Bump tokio in faucet, net-utils
* Bump remaining tokio, plus tarpc