2295 Commits

Author SHA1 Message Date
behzad nouri
4e4e12b384
filters out offline nodes from pull options (#13533)
Inactive nodes are still observing incoming gossip traffic:
https://discord.com/channels/428295358100013066/670512312339398668/776140351291260968
likely because of pull-requests.

Previous related issues and commits:
https://github.com/solana-labs/solana/issues/12409
https://github.com/solana-labs/solana/pull/12620
https://github.com/solana-labs/solana/pull/12674

This commit implements same logic as
https://github.com/solana-labs/solana/pull/12674
to exclude inactive nodes from pull options, with the same periodic
retry logic for offline staked nodes in order to mitigate eclipse
attack.
2020-11-12 16:09:37 +00:00
carllin
9821a7754c
Discard pre hard fork persisted tower if hard-forking (#13536)
* Discard pre hard fork persisted tower if hard-forking

* Relax config.require_tower

* Add cluster test

* nits

* Remove unnecessary check

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Co-authored-by: Carl Lin <carl@solana.com>
2020-11-12 23:29:04 +09:00
Ryo Onodera
89b474e192
Fix slow/stuck unstaking due to toggling in epoch (#13501)
* Fix slow/stuck unstaking due to toggling in epoch

* nits

* nits

* Add stake_program_v2 feature status check to cli

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2020-11-11 14:11:57 -07:00
Trent Nelson
38f15e41b5 Validator: Periodically log what we're waiting for during --wait-for-supermajority 2020-11-11 20:03:26 +00:00
sakridge
70c4626efe
Fix signature access (#13491) 2020-11-10 08:35:03 -08:00
Justin Starry
a97c04b400
Send RPC notification when account is deleted (#13440)
* Send RPC notification when account is deleted

* Remove unwrap
2020-11-10 19:48:42 +08:00
sakridge
b4cf968e14
Add back shredding broadcast stats (#13463) 2020-11-09 23:04:27 -08:00
sakridge
c644b05c54
Fix avx check with newest nightly compiler (#13465) 2020-11-09 08:04:34 -08:00
behzad nouri
73ac104df2
propagates errors out of Packet::from_data (#13445)
Packet::from_data is ignoring serialization errors:
https://github.com/solana-labs/solana/blob/d08c3232e/sdk/src/packet.rs#L42-L48
This is likely never useful as the packet will be sent over the wire
taking bandwidth but at the receiving end will either fail to
deserialize or it will be invalid.
This commit will propagate the errors out of the function to the
call-site, allowing the call-site to handle the error.
2020-11-08 15:10:03 +00:00
behzad nouri
7f4debdad5
drops older gossip packets when load shedding (#13364)
Gossip drops incoming packets when overloaded:
https://github.com/solana-labs/solana/blob/f6a73098a/core/src/cluster_info.rs#L2462-L2475
However newer packets are dropped in favor of the older ones.
This is probably not ideal as newer packets are more likely to contain
more recent data, so dropping them will keep the validator state
lagging.
2020-11-05 17:14:28 +00:00
behzad nouri
8f0796436a
shares the lock on gossip when processing prune messages (#13339)
Processing prune messages acquires an exclusive lock on gossip:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L1824-L1825
This can be reduced to a shared lock if active-sets are changed to use
atomic bloom filters:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/crds_gossip_push.rs#L50
2020-11-05 15:42:00 +00:00
behzad nouri
118ce47b97
measures processing time of each kind of gossip packets (#13366) 2020-11-05 15:34:34 +00:00
Michael Vines
8838a55a1a Bump spl-token and spl-memo crate versions 2020-11-04 21:44:33 +00:00
behzad nouri
10fa4f45ab
uses thread-pool when handling push messages (#13338)
From runtime profiles, the majority time of solana-listen thread:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L2720
is spent handling push messages. The code here:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L2272-L2364
may utilize the idle gossip thread-pool.
2020-11-04 19:15:58 +00:00
sakridge
0d663158d0
Reduce repair_stats debug (#13393) 2020-11-04 10:32:48 -08:00
Tyera Eulberg
eb2560e782 Prevent block times from ever going backward 2020-10-31 21:30:42 -07:00
Tyera Eulberg
90778615f6 Use bounded timestamp-correction when feature enabled 2020-10-31 21:30:42 -07:00
sakridge
8415c22b59
Reduce debug (#13319) 2020-10-30 16:18:44 -07:00
Ryo Onodera
1df15d85c3
Fix tower/blockstore unsync due to external causes (#12671)
* Fix tower/blockstore unsync due to external causes

* Add and clean up long comments

* Clean up test

* Comment about warped_slot_history

* Run test_future_tower with master-only/master-slave

* Update comments about false leader condition
2020-10-30 19:31:23 +09:00
Michael Vines
df8dab9d2b Native/builtin programs now receive an InvokeContext 2020-10-29 21:45:24 -07:00
Greg Fitzgerald
ca00197009
Upgrade tarpc and tokio (#13293) 2020-10-29 19:21:18 -06:00
behzad nouri
3738611f5c
adds more parallel processing to gossip packets handling (#12988) 2020-10-29 15:17:19 +00:00
behzad nouri
be80f6d5c5
excludes origin from prune set (#13204)
On the receiving end, prune messages are ignored if the origin points to
the node itself:
https://github.com/solana-labs/solana/blob/631f029fe/core/src/crds_gossip_push.rs#L285-L295
So to avoid sending these over the wire, the requester can exclude
origin from the prune set.
2020-10-29 12:50:58 +00:00
Jack May
c458d4b213
move Account to solana-sdk (#13198) 2020-10-28 22:01:07 -07:00
behzad nouri
ae91270961
implements ping-pong packets between nodes (#12794)
https://hackerone.com/reports/991106

> It’s possible to use UDP gossip protocol to amplify DDoS attacks. An attacker
> can spoof IP address in UDP packet when sending PullRequest to the node.
> There's no any validation if provided source IP address is not spoofed and
> the node can send much larger PullResponse to victim's IP. As I checked,
> PullRequest is about 290 bytes, while PullResponse is about 10 kB. It means
> that amplification is about 34x. This way an attacker can easily perform DDoS
> attack both on Solana node and third-party server.
>
> To prevent it, need for example to implement ping-pong mechanism similar as
> in Ethereum: Before accepting requests from remote client needs to validate
> his IP. Local node sends Ping packet to the remote node and it needs to reply
> with Pong packet that contains hash of matching Ping packet. Content of Ping
> packet is unpredictable. If hash from Pong packet matches, local node can
> remember IP where Ping packet was sent as correct and allow further
> communication.
>
> More info:
> https://github.com/ethereum/devp2p/blob/master/discv4.md#endpoint-proof
> https://github.com/ethereum/devp2p/blob/master/discv4.md#wire-protocol

The commit adds a PingCache, which maintains records of remote nodes
which have returned a valid response to a ping message, and on-the-fly
ping messages pending a pong response from the remote node.

When handling pull-requests, those from addresses which have not passed
the ping-pong check are filtered out, and additionally ping packets are
added for addresses which need to be (re)verified.
2020-10-28 17:03:02 +00:00
carllin
f96ab5a818
Fix log (#13207)
Co-authored-by: Carl Lin <carl@solana.com>
2020-10-27 18:56:57 -07:00
Tyera Eulberg
39686ef098
Use bank timestamp to populate Blockstore::blocktime_cf when correction active (#13158) 2020-10-26 19:23:45 +00:00
behzad nouri
4bfda3e766
marks pull request creation time only once per peer (#13113)
mark_pull_request_creation time requires an exclusive lock on gossip:
https://github.com/solana-labs/solana/blob/16944e218/core/src/cluster_info.rs#L1547-L1548
Current code is redundantly marking each peer once for each request.
There are at most only 2 unique peers, whereas there are hundreds of
requests per each. So the lock is acquired hundreds of time longer than
necessary.
2020-10-26 17:11:31 +00:00
Ryo Onodera
66c7a98009
Allow existence of vote on root in saved tower (#13135) 2020-10-26 11:08:20 +09:00
Michael Vines
a4956844bd Update frozen_abi hashes
The movement of files in sdk/ caused ABI hashes to change
2020-10-24 08:37:55 -07:00
Josh
766406fd23
add precompile verification to simulate_transaction (#13080) 2020-10-23 20:47:51 -07:00
Ryo Onodera
0264147d42
Clean up opt conf verifier and vote state tracker (#13081)
* Clean up opt conf verifier and vote state tracker

* Update test to follow new message and some knob

* Rename
2020-10-24 10:19:12 +09:00
behzad nouri
37c8842bcb
scans crds table in parallel for finding old labels (#13073)
From runtime profiles, the majority time of ClusterInfo::handle_purge
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1605-L1626
is spent scanning crds table finding old labels:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/crds.rs#L175-L197

This can be done in parallel given that gossip thread-pool:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1637-L1641
is idle when handle_purge is invoked:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1681
2020-10-23 14:17:37 +00:00
Justin Starry
c95f6c4b83
Remove spammy invalid rpc log (#13100) 2020-10-23 07:05:29 +00:00
Justin Starry
8b0242a5d8
Allow nodes to advertise a different rpc address over gossip (#13053)
* Allow nodes to advertise a different rpc address over gossip

* Feedback
2020-10-22 03:31:48 +00:00
Michael Vines
959880db60 Remove unused pubkey::Pubkey imports 2020-10-21 19:08:13 -07:00
Michael Vines
17c391121a Run codemod --extensions rs Hash::new_rand solana_sdk:#️⃣:new_rand 2020-10-21 19:08:13 -07:00
Michael Vines
7bc073defe Run codemod --extensions rs Pubkey::new_rand solana_sdk::pubkey::new_rand 2020-10-21 19:08:13 -07:00
Ryo Onodera
0776fa05c7
Add ledger-tool dead-slots and improve purge a lot (#13065)
* Add ledger-tool dead-slots and improve purge a lot

* Reduce batch size...

* Add --dead-slots-only and fixed purge ordering
2020-10-21 17:45:21 +00:00
Ryo Onodera
efdb560e97
Various clean-ups before assert adjustment (#13006)
* Various clean-ups before assert adjustment

* oops
2020-10-21 10:26:20 +09:00
Michael Vines
6858950f76 Remove frozen ABI modules from solana-sdk 2020-10-20 16:11:30 -07:00
Ryo Onodera
c0675968b1
Support Debug Bank (#13017) 2020-10-21 01:05:45 +09:00
Trent Nelson
3b3f7341fa validator: Activate RPC before halting on slot 2020-10-20 02:09:07 +00:00
behzad nouri
75d62ca095
improves threads' utilization in processing gossip packets (#12962)
ClusterInfo::process_packets handles incoming packets in a thread_pool:
https://github.com/solana-labs/solana/blob/87311cce7/core/src/cluster_info.rs#L2118-L2134

However, profiling runtime shows that threads are not well utilized and
a lot of the processing is done sequentially.

This commit redistributes the work done in parallel. Testing on a gce
cluster shows 20%+ improvement in processing gossip packets with much
smaller variations.
2020-10-19 19:03:38 +00:00
Ryo Onodera
54517ea454
Follow up to persistent tower with tests and API cleaning (#12350)
* Follow up to persistent tower

* Ignore for now...

* Hard-code validator identities for easy reasoning

* Add a test for opt. conf violation without tower

* Fix compile with rust < 1.47

* Remove unused method

* More move of assert tweak to the asser pr

* Add comments

* Clean up

* Clean the test addressing various review comments

* Clean up a bit
2020-10-19 16:37:03 +09:00
Ryo Onodera
fd8ec27fe8
Another some tower logging improvements (#12940) 2020-10-16 14:44:07 +09:00
behzad nouri
48283161c3
passes through feature-set to gossip requests handling (#12878)
* passes through feature-set to down to gossip requests handling
* takes the feature-set from root_bank instead of working_bank
2020-10-15 20:54:21 +00:00
Tyera Eulberg
d008dfb7ad
Bump spl-memo and spl-token versions (#12917) 2020-10-15 18:23:41 +00:00
behzad nouri
05cf15a382
implements DataBudget using atomics (#12856) 2020-10-15 11:33:58 +00:00
Ryo Onodera
a44e4d386f
Better tower logs for SwitchForkDecision and etc (#12875)
* Better tower logs for SwitchForkDecision and etc

* nits

* Update comment
2020-10-15 18:30:33 +09:00