Commit Graph

2929 Commits

Author SHA1 Message Date
behzad nouri
c3048b451d samples repair peers using WeightedIndex (#13919)
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.
2020-12-03 14:26:07 +00:00
Trent Nelson
404fc1570d runtime: Replace HashAgeKind with NonceRollbackInfo 2020-12-02 20:10:08 +00:00
Tyera Eulberg
10c81a2448 Remove rpc_banks from validator (#13882)
* Remove rpc_banks from validator

* Bump abi-digest
2020-12-02 03:25:09 +00:00
Michael Vines
0a8bc347a1 Restore discover_cluster to avoid test panics 2020-12-01 17:58:28 -08:00
Michael Vines
3eece38ffa Add expects() to improve error logs on join failures 2020-12-01 17:58:28 -08:00
Michael Vines
73111b005f Reduce the number of snapshots 2020-12-01 11:13:37 -08:00
Tyera Eulberg
8fd1e55805 Add logging in check_blockstore_max_root (#13887) 2020-12-01 07:44:18 +00:00
Michael Vines
90d557d916 Strengthen EpochSlots sanitization 2020-11-30 14:40:25 -08:00
behzad nouri
e1793e5a13 caches vote-state de-serialized from vote accounts (#13795)
Gossip and other places repeatedly de-serialize vote-state stored in
vote accounts. Ideally the first de-serialization should cache the
result.

This commit adds new VoteAccount type which lazily de-serializes
VoteState from Account data and caches the result internally.

Serialize and Deserialize traits are manually implemented to match
existing code. So, despite changes to frozen_abi, this commit should be
backward compatible.
2020-11-30 17:18:33 +00:00
Michael Vines
43b82b31e5 More TestValidator cleanup 2020-11-26 08:56:25 +00:00
Michael Vines
b5f7e39be8 TestValidator public interface cleanup 2020-11-25 17:04:37 -08:00
Tyera Eulberg
4ff0f0949a Separate blockstore checks for not (yet) rooted and cleaned up (#13814) 2020-11-25 22:59:38 +00:00
Michael Vines
4ef2da0ff0 Add solana logs command 2020-11-25 11:44:41 -08:00
sakridge
b70abdc645 Nonce updates (#13799)
* runtime: Add `FeeCalculator` resolution method to `HashAgeKind`

* runtime: Plumb fee-collected accounts for failed nonce tx rollback

* runtime: Use fee-collected nonce/fee account for nonced TX error rollback

* runtime: Add test for failed nonced TX accounts rollback

* Fee payer test

* fixup: replace nonce account when it pays the fee

* fixup: nonce fee-payer collect test

* fixup: fixup: clippy/fmt for replace...

* runtime: Test for `HashAgeKind::fee_calculator()`

* Clippy

Co-authored-by: Trent Nelson <trent@solana.com>
2020-11-24 23:53:51 -08:00
Michael Vines
215ddecaa5 Add base64+zstd encoding for RPC account data 2020-11-25 02:03:23 +00:00
Michael Vines
61ab2072bd Clean up default commitment handling for subscriptions 2020-11-23 22:54:47 -08:00
Tyera Eulberg
7befad2f6d Check SlotNotRooted if confirmed block not found in blockstore or bigtable (#13776) 2020-11-24 03:36:20 +00:00
behzad nouri
26bf2b7e45 processes pull-request callers only once per unique caller (#13750)
process_pull_requests acquires a write lock on crds table to update
records timestamp for each of the pull-request callers:
https://github.com/solana-labs/solana/blob/3087c9049/core/src/crds_gossip_pull.rs#L287-L300
However, pull-requests overlap a lot in callers and this function ends
up doing a lot of redundant duplicate work.

This commit obtains unique callers before acquiring an exclusive lock on
crds table.
2020-11-22 17:51:14 +00:00
sakridge
c1eb350c47 Allow contact debug interval to be adjusted (#13737) 2020-11-20 14:47:37 -08:00
Ryo Onodera
b74d7b5758 Fix fragile tests in prep of stake rewrite pr (#13654)
* Fix fragile tests in prep of stake rewrite pr

* Restore BOOTSTRAP_VALIDATOR_LAMPORTS where appropriate

* Further clean up

* Further clean up

* Aligh with other call site change

* Remove false warn!

* fix ci!
2020-11-20 17:21:03 +09:00
behzad nouri
a8c29505f0 sanitizes bloom filters to avoid division by zero (#13714)
Pull requests received over the wire can cause a validator to panic
because of division by zero in bloom filters:
https://github.com/solana-labs/solana/blob/af08ba93e/runtime/src/bloom.rs#L86-L88
2020-11-19 23:35:22 +00:00
dependabot[bot]
856693ac1f chore: bump lru from 0.6.0 to 0.6.1
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.6.0 to 0.6.1.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.6.0...0.6.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-19 14:28:50 -08:00
behzad nouri
b58f69297f makes crds fields private (#13703)
Crds fields should maintain several invariants between themselves, so
exposing them as public fields can be bug prone. In addition these
invariants are asserted on every write:
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L138-L154
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L239-L262
which adds extra instructions and is not optimal. Should these fields be
private the asserts will be redundant.
2020-11-19 20:57:40 +00:00
behzad nouri
1ffab5de77 breaks prunes data into chunks to fit into packets (#13613)
Validator logs show that prune messages are dropped because they exceed
packet data size:
https://github.com/solana-labs/solana/blob/f25c969ad/perf/src/packet.rs#L90-L92
This can exacerbate gossip traffic by redundantly increasing push
messages across network. The workaround is to break prunes into smaller
chunks and send over in multiple messages.
2020-11-19 16:38:01 +00:00
Trent Nelson
f2a1a0ac5c RPC: Demote missing block error to warning
It frightens the tourists
2020-11-19 04:54:49 +00:00
Tyera Eulberg
3e4acba72f Quiet notification logs when no subscriptions (#13629) 2020-11-17 06:37:38 +00:00
Tyera Eulberg
ef99689592 Improve TestValidator instantiation (#13627)
* Add TestValidator::new_with_fees constructor, and warning for low bootstrap_validator_lamports

* Add logging to solana-tokens integration test to help catch low bootstrap_validator_lamports in the future

* Reasonable TestValidator mint_lamports
2020-11-16 23:27:36 -07:00
behzad nouri
5e8490ab9d packs more crds-values in a single gossip packet (#13500)
split_gossip_messages:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L1536-L1574
splits crds-values into chunks to fit into a gossip packet. However it is
using a global upper-bound for the header-size across all protocols:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L90-L93
This can be wasteful as the specific gossip protocol can have smaller
header than this upper-bound (e.g. Protocol::PushMessage is 170 bytes
smaller). Adding more crds-values in one gossip packet can avoid the
overheads of separate packets and reduce total number of bytes sent over
the wire.

This commit updates the splitting function to take a max-chunk-size
argument. At call-site, this value is set to the size of the protocol
which the values are sent over.
2020-11-15 18:23:59 +00:00
behzad nouri
cbea9ebc34 indexes nodes' contact infos in crds table (#13553)
In several places in gossip code, the entire crds table is scanned only
to filter out nodes' contact infos. Currently on mainnet, crds table is
of size ~70k, while there are only ~470 nodes. So the full table scan is
inefficient. Instead we may maintain an index of only nodes' contact
infos.
2020-11-15 16:38:04 +00:00
Michael Vines
5d72e52ad0 Disable the PubSub vote subscription by default
The --rpc-pubsub-enable-vote-subscription flag may be used to enable it.
The current vote subscription is problematic because it emits a
notification for *every* vote, so hundreds a second in a real cluster.
Critically it's also missing information about *who* is voting,
rendering all those notifications practically useless.

Until these two issues can be resolved, the vote subscription is not
much more than a potential DoS vector.
2020-11-14 12:36:37 -08:00
Tyera Eulberg
88ae321d3f Add counter metrics to rpc-subscriptions (#13596) 2020-11-14 12:40:24 -07:00
Michael Vines
baa6b3a261 Add stable program logging for BPF and native programs 2020-11-14 08:26:01 -08:00
Tyera Eulberg
34bf80ba9c Send pubsub metrics to metrics server (#13584) 2020-11-13 19:31:23 +00:00
sakridge
c1f3f9d27b Stop searching for incorrect shred version after a minute (#13512) 2020-11-12 14:01:13 -08:00
behzad nouri
4e4e12b384 filters out offline nodes from pull options (#13533)
Inactive nodes are still observing incoming gossip traffic:
https://discord.com/channels/428295358100013066/670512312339398668/776140351291260968
likely because of pull-requests.

Previous related issues and commits:
https://github.com/solana-labs/solana/issues/12409
https://github.com/solana-labs/solana/pull/12620
https://github.com/solana-labs/solana/pull/12674

This commit implements same logic as
https://github.com/solana-labs/solana/pull/12674
to exclude inactive nodes from pull options, with the same periodic
retry logic for offline staked nodes in order to mitigate eclipse
attack.
2020-11-12 16:09:37 +00:00
carllin
9821a7754c Discard pre hard fork persisted tower if hard-forking (#13536)
* Discard pre hard fork persisted tower if hard-forking

* Relax config.require_tower

* Add cluster test

* nits

* Remove unnecessary check

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Co-authored-by: Carl Lin <carl@solana.com>
2020-11-12 23:29:04 +09:00
Ryo Onodera
89b474e192 Fix slow/stuck unstaking due to toggling in epoch (#13501)
* Fix slow/stuck unstaking due to toggling in epoch

* nits

* nits

* Add stake_program_v2 feature status check to cli

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2020-11-11 14:11:57 -07:00
Trent Nelson
38f15e41b5 Validator: Periodically log what we're waiting for during --wait-for-supermajority 2020-11-11 20:03:26 +00:00
sakridge
70c4626efe Fix signature access (#13491) 2020-11-10 08:35:03 -08:00
Justin Starry
a97c04b400 Send RPC notification when account is deleted (#13440)
* Send RPC notification when account is deleted

* Remove unwrap
2020-11-10 19:48:42 +08:00
sakridge
b4cf968e14 Add back shredding broadcast stats (#13463) 2020-11-09 23:04:27 -08:00
sakridge
c644b05c54 Fix avx check with newest nightly compiler (#13465) 2020-11-09 08:04:34 -08:00
behzad nouri
73ac104df2 propagates errors out of Packet::from_data (#13445)
Packet::from_data is ignoring serialization errors:
https://github.com/solana-labs/solana/blob/d08c3232e/sdk/src/packet.rs#L42-L48
This is likely never useful as the packet will be sent over the wire
taking bandwidth but at the receiving end will either fail to
deserialize or it will be invalid.
This commit will propagate the errors out of the function to the
call-site, allowing the call-site to handle the error.
2020-11-08 15:10:03 +00:00
behzad nouri
7f4debdad5 drops older gossip packets when load shedding (#13364)
Gossip drops incoming packets when overloaded:
https://github.com/solana-labs/solana/blob/f6a73098a/core/src/cluster_info.rs#L2462-L2475
However newer packets are dropped in favor of the older ones.
This is probably not ideal as newer packets are more likely to contain
more recent data, so dropping them will keep the validator state
lagging.
2020-11-05 17:14:28 +00:00
behzad nouri
8f0796436a shares the lock on gossip when processing prune messages (#13339)
Processing prune messages acquires an exclusive lock on gossip:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L1824-L1825
This can be reduced to a shared lock if active-sets are changed to use
atomic bloom filters:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/crds_gossip_push.rs#L50
2020-11-05 15:42:00 +00:00
behzad nouri
118ce47b97 measures processing time of each kind of gossip packets (#13366) 2020-11-05 15:34:34 +00:00
Michael Vines
8838a55a1a Bump spl-token and spl-memo crate versions 2020-11-04 21:44:33 +00:00
behzad nouri
10fa4f45ab uses thread-pool when handling push messages (#13338)
From runtime profiles, the majority time of solana-listen thread:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L2720
is spent handling push messages. The code here:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L2272-L2364
may utilize the idle gossip thread-pool.
2020-11-04 19:15:58 +00:00
sakridge
0d663158d0 Reduce repair_stats debug (#13393) 2020-11-04 10:32:48 -08:00
Tyera Eulberg
eb2560e782 Prevent block times from ever going backward 2020-10-31 21:30:42 -07:00