1992 Commits

Author SHA1 Message Date
behzad nouri
1d267eae6b std::process::exit to kill all threads 2020-12-09 10:24:23 -08:00
behzad nouri
895d7d6a65 removes RwLock on ClusterInfo.instance 2020-12-09 10:24:23 -08:00
behzad nouri
542198180a pushes node-instance along with version early in gossip 2020-12-09 10:24:23 -08:00
behzad nouri
8cd5eb9863 checks for duplicate validator instances using gossip 2020-12-09 10:24:23 -08:00
sakridge
f6600810d7 Use LRU cache and blake3 hash of shreds to filter duplicates (#13976) 2020-12-07 16:42:39 -08:00
Michael Vines
6e9dbb4f6e Add --rpc-max-multiple-accounts to override the getMultipleAccounts JSON RPC maximum 2020-12-07 16:31:01 -08:00
carllin
239a191612 Remove unneeded BankWeight fork choice (#13978)
Co-authored-by: Carl Lin <carl@solana.com>
2020-12-07 13:47:14 -08:00
Tyera Eulberg
6ae4d2e5cb Fix logsSubscribe (#13996) 2020-12-07 19:00:52 +00:00
Ryo Onodera
3d9d7557c8 core/validator: Wrap std::process::exit(1) for easier testing (#13990) 2020-12-07 16:43:03 +00:00
Alexander Meißner
a706706572 Validator CLI option to enable just-in-time compilation of BPF (#13789)
* Adds a CLI option to the validator to enable just-in-time compilation of BPF.

* Refactoring to use bpf_loader_program instead of feature_set to pass JIT flag from the validator CLI to the executor.
2020-12-07 09:49:55 +01:00
behzad nouri
6706f2b3bb removes recursive read-locks on gossip (#13973)
ClusterInfo::tvu_peers acquires a read-lock on gossip:
https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1171-L1185
and so, ClusterInfo::repair_peers is recursively locking gossip for
read twice:
https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1202-L1223
But std::sync::RwLock is not re-entrant (recursive).
2020-12-06 15:14:49 +00:00
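
The recursive read-lock problem described in 6706f2b3bb can be illustrated with a minimal sketch. The `Cluster` and `Gossip` types, the `tvu_peers_locked` helper, and the filtering rules below are hypothetical stand-ins, not the actual `ClusterInfo` code. The idea is that each public method takes the read lock exactly once and hands the locked data to a helper that never locks, so no thread asks for a read lock it already holds.

```rust
use std::sync::RwLock;

// Hypothetical stand-in for the gossip state; not the real ClusterInfo.
struct Gossip {
    peers: Vec<u64>,
}

struct Cluster {
    gossip: RwLock<Gossip>,
}

impl Cluster {
    // Works on data that is already locked, so it never touches the lock.
    fn tvu_peers_locked(gossip: &Gossip) -> Vec<u64> {
        // Placeholder rule: treat even ids as TVU peers.
        gossip.peers.iter().copied().filter(|p| p % 2 == 0).collect()
    }

    fn tvu_peers(&self) -> Vec<u64> {
        let gossip = self.gossip.read().unwrap();
        Self::tvu_peers_locked(&gossip)
    }

    // Takes the read lock once. Calling self.tvu_peers() here instead would
    // request a second read lock while the first guard is still alive.
    fn repair_peers(&self, min_id: u64) -> Vec<u64> {
        let gossip = self.gossip.read().unwrap();
        Self::tvu_peers_locked(&gossip)
            .into_iter()
            .filter(|p| *p >= min_id)
            .collect()
    }
}
```

With peers `[1, 2, 3, 4, 6]`, `tvu_peers()` yields `[2, 4, 6]` and `repair_peers(4)` yields `[4, 6]`, each under a single lock acquisition.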
Tyera Eulberg
ca35bb3ac8 Report highest_confirmed_root and _slot in commitment metric (#13964) 2020-12-05 00:50:00 +00:00
carllin
34b68288c8 Fix propagation skip check (#13933)
Co-authored-by: Carl Lin <carl@solana.com>
2020-12-03 12:31:38 -08:00
behzad nouri
c3048b451d samples repair peers using WeightedIndex (#13919)
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, sampling takes only
O(log(n)) work, which can be achieved if RepairCache caches the
weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to reduce redundant
unlock-then-lock sequences.
2020-12-03 14:26:07 +00:00
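
The sampling argument in c3048b451d can be sketched without external crates. The code below is not the `rand` crate's `WeightedIndex`; it is a std-only illustration of the same idea, with the hypothetical name `CachedWeightedIndex`. The cumulative weights are built once in O(n), and each sample then needs one random number and one binary search. The caller is assumed to supply `roll`, drawn uniformly from `0..total()`.

```rust
// Std-only sketch of a cached weighted index (hypothetical type).
struct CachedWeightedIndex {
    // cumulative[i] = weights[0] + ... + weights[i]
    cumulative: Vec<u64>,
}

impl CachedWeightedIndex {
    // O(n) construction; returns None if the weights are empty, sum to zero,
    // or overflow u64.
    fn new(weights: &[u64]) -> Option<Self> {
        let mut total = 0u64;
        let mut cumulative = Vec::with_capacity(weights.len());
        for w in weights {
            total = total.checked_add(*w)?;
            cumulative.push(total);
        }
        if total == 0 {
            None
        } else {
            Some(Self { cumulative })
        }
    }

    fn total(&self) -> u64 {
        // new() guarantees at least one entry.
        *self.cumulative.last().unwrap()
    }

    // O(log n) per sample: find the first cumulative weight above `roll`.
    fn sample(&self, roll: u64) -> usize {
        assert!(roll < self.total());
        self.cumulative.partition_point(|&c| c <= roll)
    }
}
```

For weights `[1, 0, 3]` the cumulative table is `[1, 1, 4]`, so a roll of 0 selects index 0, rolls 1 through 3 select index 2, and the zero-weight entry is never selected.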
Trent Nelson
404fc1570d runtime: Replace HashAgeKind with NonceRollbackInfo 2020-12-02 20:10:08 +00:00
Tyera Eulberg
10c81a2448 Remove rpc_banks from validator (#13882)
* Remove rpc_banks from validator

* Bump abi-digest
2020-12-02 03:25:09 +00:00
Michael Vines
0a8bc347a1 Restore discover_cluster to avoid test panics 2020-12-01 17:58:28 -08:00
Michael Vines
3eece38ffa Add expects() to improve error logs on join failures 2020-12-01 17:58:28 -08:00
Michael Vines
73111b005f Reduce the number of snapshots 2020-12-01 11:13:37 -08:00
Tyera Eulberg
8fd1e55805 Add logging in check_blockstore_max_root (#13887) 2020-12-01 07:44:18 +00:00
Michael Vines
90d557d916 Strengthen EpochSlots sanitization 2020-11-30 14:40:25 -08:00
behzad nouri
e1793e5a13 caches vote-state de-serialized from vote accounts (#13795)
Gossip and other places repeatedly de-serialize vote-state stored in
vote accounts. Ideally the first de-serialization should cache the
result.

This commit adds new VoteAccount type which lazily de-serializes
VoteState from Account data and caches the result internally.

Serialize and Deserialize traits are manually implemented to match
existing code. So, despite changes to frozen_abi, this commit should be
backward compatible.
2020-11-30 17:18:33 +00:00
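
The lazy de-serialization described in e1793e5a13 can be sketched as follows. This is an illustration under stated assumptions, not the actual `VoteAccount` type: `LazyAccount`, `ParsedState`, the 8-byte layout, and the `parse_calls` counter are all hypothetical, and `std::sync::OnceLock` (Rust 1.70 or later) is used here as one possible caching primitive.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Hypothetical parsed form; the real VoteState is far richer.
#[derive(Debug, PartialEq)]
struct ParsedState {
    credits: u64,
}

// Assumed layout for this sketch: the first 8 bytes are a little-endian u64.
fn parse(data: &[u8]) -> Option<ParsedState> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(data.get(..8)?);
    Some(ParsedState { credits: u64::from_le_bytes(bytes) })
}

struct LazyAccount {
    data: Vec<u8>,
    // Filled on first access; None inside means the data failed to parse.
    parsed: OnceLock<Option<ParsedState>>,
    // Counts parse attempts, only to make the caching observable.
    parse_calls: AtomicUsize,
}

impl LazyAccount {
    fn new(data: Vec<u8>) -> Self {
        Self {
            data,
            parsed: OnceLock::new(),
            parse_calls: AtomicUsize::new(0),
        }
    }

    // The first call parses and caches; later calls return the cached result.
    fn state(&self) -> Option<&ParsedState> {
        self.parsed
            .get_or_init(|| {
                self.parse_calls.fetch_add(1, Ordering::Relaxed);
                parse(&self.data)
            })
            .as_ref()
    }
}
```

Calling `state()` any number of times on the same account runs `parse` once, and a failed parse is cached as well, so malformed data is not re-parsed on every access.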
Michael Vines
43b82b31e5 More TestValidator cleanup 2020-11-26 08:56:25 +00:00
Michael Vines
b5f7e39be8 TestValidator public interface cleanup 2020-11-25 17:04:37 -08:00
Tyera Eulberg
4ff0f0949a Separate blockstore checks for not (yet) rooted and cleaned up (#13814) 2020-11-25 22:59:38 +00:00
Michael Vines
4ef2da0ff0 Add solana logs command 2020-11-25 11:44:41 -08:00
sakridge
b70abdc645 Nonce updates (#13799)
* runtime: Add `FeeCalculator` resolution method to `HashAgeKind`

* runtime: Plumb fee-collected accounts for failed nonce tx rollback

* runtime: Use fee-collected nonce/fee account for nonced TX error rollback

* runtime: Add test for failed nonced TX accounts rollback

* Fee payer test

* fixup: replace nonce account when it pays the fee

* fixup: nonce fee-payer collect test

* fixup: fixup: clippy/fmt for replace...

* runtime: Test for `HashAgeKind::fee_calculator()`

* Clippy

Co-authored-by: Trent Nelson <trent@solana.com>
2020-11-24 23:53:51 -08:00
Michael Vines
215ddecaa5 Add base64+zstd encoding for RPC account data 2020-11-25 02:03:23 +00:00
Michael Vines
61ab2072bd Clean up default commitment handling for subscriptions 2020-11-23 22:54:47 -08:00
Tyera Eulberg
7befad2f6d Check SlotNotRooted if confirmed block not found in blockstore or bigtable (#13776) 2020-11-24 03:36:20 +00:00
behzad nouri
26bf2b7e45 processes pull-request callers only once per unique caller (#13750)
process_pull_requests acquires a write lock on the crds table to update
the record timestamp for each of the pull-request callers:
https://github.com/solana-labs/solana/blob/3087c9049/core/src/crds_gossip_pull.rs#L287-L300
However, pull-requests overlap a lot in callers and this function ends
up doing a lot of redundant work.

This commit obtains unique callers before acquiring an exclusive lock on
crds table.
2020-11-22 17:51:14 +00:00
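
The change in 26bf2b7e45 amounts to de-duplicating before locking. A minimal sketch, assuming a hypothetical table that maps a caller id to its last-seen timestamp (the real crds table and `process_pull_requests` signature differ):

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

// Hypothetical table: caller id -> last-seen timestamp.
type Timestamps = RwLock<HashMap<u64, u64>>;

// Returns how many distinct callers were recorded.
fn record_callers(table: &Timestamps, callers: &[u64], now: u64) -> usize {
    // De-duplicate first, while no lock is held.
    let unique: HashSet<u64> = callers.iter().copied().collect();

    // The exclusive lock is held only for one update per unique caller.
    let mut table = table.write().unwrap();
    for caller in &unique {
        table.insert(*caller, now);
    }
    unique.len()
}
```

For callers `[1, 2, 1, 1, 2, 3]` the write lock covers three updates instead of six.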
sakridge
c1eb350c47 Allow contact debug interval to be adjusted (#13737) 2020-11-20 14:47:37 -08:00
Ryo Onodera
b74d7b5758 Fix fragile tests in prep of stake rewrite pr (#13654)
* Fix fragile tests in prep of stake rewrite pr

* Restore BOOTSTRAP_VALIDATOR_LAMPORTS where appropriate

* Further clean up

* Further clean up

* Align with other call site change

* Remove false warn!

* fix ci!
2020-11-20 17:21:03 +09:00
behzad nouri
a8c29505f0 sanitizes bloom filters to avoid division by zero (#13714)
Pull requests received over the wire can cause a validator to panic
because of division by zero in bloom filters:
https://github.com/solana-labs/solana/blob/af08ba93e/runtime/src/bloom.rs#L86-L88
2020-11-19 23:35:22 +00:00
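
The failure mode fixed in a8c29505f0 can be sketched as a modulo by the number of bits in a filter that arrived over the wire. The `Bloom` struct, `SanitizeError`, `handle_pull_request`, and the placeholder hash mixing below are hypothetical simplifications, not the code in `runtime/src/bloom.rs`. The point is only that a zero-sized filter is rejected before any position is computed.

```rust
#[derive(Debug, PartialEq)]
enum SanitizeError {
    NoBits,
}

// Hypothetical, simplified filter as it might look after deserialization.
struct Bloom {
    keys: Vec<u64>,
    num_bits: u64,
}

impl Bloom {
    // Reject filters that would make `% num_bits` divide by zero.
    fn sanitize(&self) -> Result<(), SanitizeError> {
        if self.num_bits == 0 {
            Err(SanitizeError::NoBits)
        } else {
            Ok(())
        }
    }

    // Only safe to call after sanitize() has passed.
    fn pos(&self, item: u64, key: u64) -> u64 {
        // Placeholder mixing; the real filter hashes the item with each key.
        item.wrapping_mul(key | 1) % self.num_bits
    }
}

// Untrusted filters are sanitized before any bit position is computed.
fn handle_pull_request(filter: &Bloom, item: u64) -> Result<Vec<u64>, SanitizeError> {
    filter.sanitize()?;
    Ok(filter.keys.iter().map(|k| filter.pos(item, *k)).collect())
}
```

A filter with `num_bits == 0` now produces an error for the caller to drop rather than a panic inside `pos`.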
dependabot[bot]
856693ac1f chore: bump lru from 0.6.0 to 0.6.1
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.6.0 to 0.6.1.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.6.0...0.6.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-19 14:28:50 -08:00
behzad nouri
b58f69297f makes crds fields private (#13703)
Crds fields should maintain several invariants between themselves, so
exposing them as public fields can be bug prone. In addition these
invariants are asserted on every write:
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L138-L154
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L239-L262
which adds extra instructions and is not optimal. Should these fields be
private the asserts will be redundant.
2020-11-19 20:57:40 +00:00
behzad nouri
1ffab5de77 breaks prunes data into chunks to fit into packets (#13613)
Validator logs show that prune messages are dropped because they exceed
packet data size:
https://github.com/solana-labs/solana/blob/f25c969ad/perf/src/packet.rs#L90-L92
This can exacerbate gossip traffic by redundantly increasing push
messages across network. The workaround is to break prunes into smaller
chunks and send over in multiple messages.
2020-11-19 16:38:01 +00:00
Trent Nelson
f2a1a0ac5c RPC: Demote missing block error to warning
It frightens the tourists
2020-11-19 04:54:49 +00:00
Tyera Eulberg
3e4acba72f Quiet notification logs when no subscriptions (#13629) 2020-11-17 06:37:38 +00:00
Tyera Eulberg
ef99689592 Improve TestValidator instantiation (#13627)
* Add TestValidator::new_with_fees constructor, and warning for low bootstrap_validator_lamports

* Add logging to solana-tokens integration test to help catch low bootstrap_validator_lamports in the future

* Reasonable TestValidator mint_lamports
2020-11-16 23:27:36 -07:00
behzad nouri
5e8490ab9d packs more crds-values in a single gossip packet (#13500)
split_gossip_messages:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L1536-L1574
splits crds-values into chunks to fit into a gossip packet. However it is
using a global upper-bound for the header-size across all protocols:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L90-L93
This can be wasteful as the specific gossip protocol can have smaller
header than this upper-bound (e.g. Protocol::PushMessage is 170 bytes
smaller). Adding more crds-values in one gossip packet can avoid the
overheads of separate packets and reduce total number of bytes sent over
the wire.

This commit updates the splitting function to take a max-chunk-size
argument. At call-site, this value is set to the size of the protocol
which the values are sent over.
2020-11-15 18:23:59 +00:00
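
The effect of passing a per-protocol max-chunk-size, as described in 5e8490ab9d, can be sketched with a greedy packer. The function below is hypothetical and works on serialized sizes only; the real `split_gossip_messages` operates on crds-values, and the byte counts in the usage note are made up for illustration.

```rust
// Greedily pack values (represented here by their serialized sizes) into
// chunks whose total size does not exceed `max_chunk_size`.
fn split_by_size(sizes: &[usize], max_chunk_size: usize) -> Vec<Vec<usize>> {
    let mut chunks = Vec::new();
    let mut current = Vec::new();
    let mut used = 0;
    for &size in sizes {
        if size > max_chunk_size {
            // A value that can never fit is dropped in this sketch.
            continue;
        }
        if used + size > max_chunk_size {
            // Close the current chunk and start a new one.
            chunks.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(size);
        used += size;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

With sizes `[400, 400, 400, 900, 100]`, a budget of 1000 bytes gives three chunks, while a budget of 1200 bytes gives two. A smaller header leaves a larger payload budget, which is how the same values end up in fewer packets.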
behzad nouri
cbea9ebc34 indexes nodes' contact infos in crds table (#13553)
In several places in gossip code, the entire crds table is scanned only
to filter out nodes' contact infos. Currently on mainnet, crds table is
of size ~70k, while there are only ~470 nodes. So the full table scan is
inefficient. Instead we may maintain an index of only nodes' contact
infos.
2020-11-15 16:38:04 +00:00
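
The secondary index proposed in cbea9ebc34 can be sketched as a set of keys maintained alongside the main table. The `Table` and `Value` types and the string keys below are hypothetical; the real crds table uses different key and value types.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical value kinds; the real crds table holds many more.
#[derive(Debug, PartialEq)]
enum Value {
    ContactInfo(String),
    Vote(u64),
}

#[derive(Default)]
struct Table {
    values: HashMap<String, Value>,
    // Keys of the entries that are contact infos, kept in sync on insert.
    node_keys: HashSet<String>,
}

impl Table {
    fn insert(&mut self, key: &str, value: Value) {
        match value {
            Value::ContactInfo(_) => {
                self.node_keys.insert(key.to_string());
            }
            _ => {
                // The key may previously have held a contact info.
                self.node_keys.remove(key);
            }
        }
        self.values.insert(key.to_string(), value);
    }

    // Visits only the indexed keys instead of scanning the whole table.
    fn contact_infos(&self) -> Vec<&Value> {
        self.node_keys
            .iter()
            .filter_map(|key| self.values.get(key))
            .collect()
    }
}
```

With the figures quoted in the commit message (about 70k entries and about 470 nodes), a lookup like `contact_infos` would touch hundreds of entries rather than tens of thousands.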
Michael Vines
5d72e52ad0 Disable the PubSub vote subscription by default
The --rpc-pubsub-enable-vote-subscription flag may be used to enable it.
The current vote subscription is problematic because it emits a
notification for *every* vote, so hundreds a second in a real cluster.
Critically it's also missing information about *who* is voting,
rendering all those notifications practically useless.

Until these two issues can be resolved, the vote subscription is not
much more than a potential DoS vector.
2020-11-14 12:36:37 -08:00
Tyera Eulberg
88ae321d3f Add counter metrics to rpc-subscriptions (#13596) 2020-11-14 12:40:24 -07:00
Michael Vines
baa6b3a261 Add stable program logging for BPF and native programs 2020-11-14 08:26:01 -08:00
Tyera Eulberg
34bf80ba9c Send pubsub metrics to metrics server (#13584) 2020-11-13 19:31:23 +00:00
sakridge
c1f3f9d27b Stop searching for incorrect shred version after a minute (#13512) 2020-11-12 14:01:13 -08:00
behzad nouri
4e4e12b384 filters out offline nodes from pull options (#13533)
Inactive nodes are still observing incoming gossip traffic:
https://discord.com/channels/428295358100013066/670512312339398668/776140351291260968
likely because of pull-requests.

Previous related issues and commits:
https://github.com/solana-labs/solana/issues/12409
https://github.com/solana-labs/solana/pull/12620
https://github.com/solana-labs/solana/pull/12674

This commit implements the same logic as
https://github.com/solana-labs/solana/pull/12674
to exclude inactive nodes from pull options, with the same periodic
retry logic for offline staked nodes in order to mitigate eclipse
attacks.
2020-11-12 16:09:37 +00:00
carllin
9821a7754c Discard pre hard fork persisted tower if hard-forking (#13536)
* Discard pre hard fork persisted tower if hard-forking

* Relax config.require_tower

* Add cluster test

* nits

* Remove unnecessary check

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Co-authored-by: Carl Lin <carl@solana.com>
2020-11-12 23:29:04 +09:00
Ryo Onodera
89b474e192 Fix slow/stuck unstaking due to toggling in epoch (#13501)
* Fix slow/stuck unstaking due to toggling in epoch

* nits

* nits

* Add stake_program_v2 feature status check to cli

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2020-11-11 14:11:57 -07:00