mergify[bot]
5057aaddc0
Send votes to next leader's TPU instead of our TPU ( #16663 )
...
(cherry picked from commit c8b474cd0b )
Co-authored-by: Michael Vines <mvines@gmail.com >
2021-04-20 08:45:58 +00:00
Michael Vines
a1b0f2f681
Increase test timeout
2021-04-19 04:12:16 +00:00
mergify[bot]
719db7eed0
uses timeouts based on stake for filtering pull responses ( #16549 ) ( #16551 )
...
filter_pull_responses uses the default timeout when discarding pull
responses (except for ContactInfo):
https://github.com/solana-labs/solana/blob/f804ce63c/core/src/crds_gossip_pull.rs#L349-L350
But purging code uses timeouts based on stake:
https://github.com/solana-labs/solana/blob/f804ce63c/core/src/cluster_info.rs#L1867-L1870
So the crds value will not be purged from the sender's table and will be
sent again over the next pull request.
(cherry picked from commit d92721aab9 )
Co-authored-by: behzad nouri <behzadnouri@gmail.com >
2021-04-14 21:43:48 +00:00
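A minimal sketch of the fix described above, with hypothetical names and simplified types (not the actual crds_gossip_pull API): the discard timeout is chosen per origin from stakes, matching the purge path, instead of a flat default.

```rust
use std::collections::HashMap;

// Hypothetical sketch: choose the pull-response discard timeout from the
// origin's stake, mirroring the stake-based timeouts used when purging,
// so the sender does not keep resending values the receiver discards.
fn crds_value_timeout(
    origin: &[u8; 32],               // origin pubkey, simplified
    stakes: &HashMap<[u8; 32], u64>, // stake per node
    default_timeout_ms: u64,         // timeout for unstaked nodes
    epoch_duration_ms: u64,          // longer timeout for staked nodes
) -> u64 {
    match stakes.get(origin) {
        Some(&stake) if stake > 0 => epoch_duration_ms,
        _ => default_timeout_ms,
    }
}
```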
mergify[bot]
4ddb72a32d
prioritizes contact-infos in pull responses ( #16541 ) ( #16550 )
...
Expired crds values whose contact-info does not exist are wasted:
https://github.com/solana-labs/solana/blob/f804ce63c/core/src/crds_gossip_pull.rs#L353-L378
and then are sent again over the next pull-request.
Also, the stake of the first response (which can be anything) is used to
weight all pull-responses to a node, while the rest of the responses can
have different stakes.
https://github.com/solana-labs/solana/blob/f804ce63c/core/src/cluster_info.rs#L2231
(cherry picked from commit f35a6a8be0 )
Co-authored-by: behzad nouri <behzadnouri@gmail.com >
2021-04-14 20:14:22 +00:00
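A hedged sketch of one way to order a pull response so ContactInfo values land first (the enum and helper are hypothetical stand-ins, not the crate's types):

```rust
// Simplified stand-in for the crds value payloads.
enum CrdsData {
    ContactInfo(String),
    Vote(String),
    EpochSlots(String),
}

// Sort so ContactInfo entries come first; the receiver then inserts a
// node's contact-info before judging the rest of its values, instead of
// discarding them for lack of it. `false` sorts before `true`.
fn prioritize(mut response: Vec<CrdsData>) -> Vec<CrdsData> {
    response.sort_by_key(|v| !matches!(v, CrdsData::ContactInfo(_)));
    response
}
```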
Justin Starry
579065443a
v1.6: Use blake3 message hash in status cache ( #16507 )
2021-04-13 16:57:20 +08:00
mergify[bot]
79ee0e06b2
Cluster info shred spies (bp #16389 ) ( #16395 )
...
* cluster-info: Don't subtract non-shred spies from node count
(cherry picked from commit b6b08706b9 )
* cluster-info: Get rid of some integer math while we're here
(cherry picked from commit b71875df61 )
Co-authored-by: Trent Nelson <trent@solana.com >
2021-04-06 01:37:16 +00:00
mergify[bot]
8f852d8a6b
makes test_pull_request_time_pruning smaller ( #16128 ) ( #16144 )
...
(cherry picked from commit b041b55028 )
Co-authored-by: behzad nouri <behzadnouri@gmail.com >
2021-03-26 01:20:26 +00:00
mergify[bot]
7475a6f444
makes turbine peer computation consistent between broadcast and retransmit ( #14910 ) ( #16143 )
...
get_broadcast_peers is using tvu_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/broadcast_stage.rs#L362-L370
which is potentially inconsistent with retransmit_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1332-L1345
Also, the leader does not include its own contact-info when broadcasting
shreds:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1324
but on the retransmit side, the slot leader is removed only _after_ neighbors and
children are computed:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/retransmit_stage.rs#L383-L384
So the turbine broadcast tree is different between the two stages.
This commit:
* Removes retransmit_peers. Broadcast and retransmit stages will use tvu_peers
consistently.
* Retransmit stage removes slot leader _before_ computing children and
neighbors.
(cherry picked from commit 570fd3f810 )
Co-authored-by: behzad nouri <behzadnouri@gmail.com >
2021-03-26 00:16:48 +00:00
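A rough sketch of the invariant this commit establishes (simplified types; the real computation also involves a weighted shuffle): both stages start from the same peer list and drop the slot leader before the tree is built.

```rust
// Hypothetical sketch: derive the turbine tree from tvu_peers on both the
// broadcast and retransmit sides, removing the slot leader *before* the
// neighbors/children split so the two stages agree on the tree.
fn turbine_tree_peers(
    mut tvu_peers: Vec<[u8; 32]>,
    slot_leader: &[u8; 32],
) -> Vec<[u8; 32]> {
    tvu_peers.retain(|peer| peer != slot_leader); // leader excluded first
    // ... weighted-shuffle, then split into neighbors and children ...
    tvu_peers
}
```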
mergify[bot]
dd2d25d698
limits CrdsGossipPull::pull_request_time size ( #15793 ) ( #16097 )
...
There is no pruning logic on CrdsGossipPull::pull_request_time
https://github.com/solana-labs/solana/blob/79ac1997d/core/src/crds_gossip_pull.rs#L172-L174
potentially allowing this map to take too much memory.
Additionally, CrdsGossipPush::last_pushed_to is pruning recent push
timestamps:
https://github.com/solana-labs/solana/blob/79ac1997d/core/src/crds_gossip_push.rs#L275-L279
instead of the older ones.
Co-authored-by: Nathan Hawkins <utsl@utsl.org >
(cherry picked from commit a6c23648cb )
Co-authored-by: behzad nouri <behzadnouri@gmail.com >
2021-03-24 20:05:04 +00:00
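A toy sketch of the pruning direction the commit wants (a linear scan here for clarity; a real implementation would likely use an LRU structure): evict the oldest timestamps, not the most recent ones.

```rust
use std::collections::HashMap;

// Hypothetical sketch: cap pull_request_time by evicting entries with the
// *oldest* timestamps once the map grows past `capacity`.
fn prune_oldest(times: &mut HashMap<[u8; 32], u64>, capacity: usize) {
    while times.len() > capacity {
        let oldest = times
            .iter()
            .min_by_key(|(_, &ts)| ts)
            .map(|(&pubkey, _)| pubkey)
            .expect("map is non-empty");
        times.remove(&oldest);
    }
}
```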
behzad nouri
f2865dfd63
requires stakes for propagating crds values through gossip ( #15561 )
2021-03-12 15:50:14 +00:00
behzad nouri
56923c91bf
limits number of unique pubkeys in the crds table ( #15539 )
2021-03-10 20:46:05 +00:00
behzad nouri
5a9896706c
indexes epoch slots in crds table ( #15459 )
...
ClusterInfo::get_epoch_slots_since scans the entire crds table to obtain
epoch-slots inserted since a timestamp:
https://github.com/solana-labs/solana/blob/013daa8f4/core/src/cluster_info.rs#L1245-L1262
The alternative is to index epoch-slots in the crds table ordered by their
insert timestamp.
2021-02-26 14:12:04 +00:00
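A minimal sketch of such an index (hypothetical types): a BTreeMap keyed by a monotonically increasing insert ordinal turns get_epoch_slots_since into a range query.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: secondary index of epoch-slots entries ordered by
// insert ordinal, so "since X" queries avoid scanning the whole table.
#[derive(Default)]
struct EpochSlotsIndex {
    next_ordinal: u64,
    index: BTreeMap<u64, usize>, // insert ordinal -> crds table slot
}

impl EpochSlotsIndex {
    fn record_insert(&mut self, table_slot: usize) {
        self.index.insert(self.next_ordinal, table_slot);
        self.next_ordinal += 1;
    }
    // All epoch-slots entries inserted at or after `ordinal`.
    fn since(&self, ordinal: u64) -> impl Iterator<Item = usize> + '_ {
        self.index.range(ordinal..).map(|(_, &slot)| slot)
    }
}
```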
carllin
c2e8814dce
Add limit and shrink policy for recycler ( #15320 )
2021-02-24 00:15:58 -08:00
Michael Vines
5df36aec7d
Pacify clippy
2021-02-19 20:08:41 -08:00
behzad nouri
aa3aac766f
adds metrics for inbound/outbound gossip packets counts ( #15407 )
2021-02-19 22:49:35 +00:00
behzad nouri
076c20f1ca
checks that prune-messages have the same inner/outer pubkey ( #15352 )
2021-02-16 21:06:18 +00:00
behzad nouri
0ad063f4e9
adds flag to disable duplicate instance check ( #15006 )
2021-02-03 16:26:17 +00:00
dependabot[bot]
1df93fa2be
chore: bump serde from 1.0.112 to 1.0.118 ( #14828 )
...
* chore: bump serde from 1.0.112 to 1.0.122
Bumps [serde](https://github.com/serde-rs/serde ) from 1.0.112 to 1.0.122.
- [Release notes](https://github.com/serde-rs/serde/releases )
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.112...v1.0.122 )
Signed-off-by: dependabot[bot] <support@github.com >
* [auto-commit] Update all Cargo lock files
* Update frozen_abi digest following serde update
* Revert "chore: bump serde from 1.0.112 to 1.0.122"
This reverts commit a3ef4442a4 .
* Revert "[auto-commit] Update all Cargo lock files"
This reverts commit c41c3b005f .
* chore: bump serde from 1.0.112 to 1.0.118
Bumps [serde](https://github.com/serde-rs/serde ) from 1.0.112 to 1.0.118.
- [Release notes](https://github.com/serde-rs/serde/releases )
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.112...v1.0.118 )
Signed-off-by: dependabot[bot] <support@github.com >
* [auto-commit] Update all Cargo lock files
* Remove serum-dex pinning
* blind commit!
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com >
Co-authored-by: Ryo Onodera <ryoqun@gmail.com >
2021-02-02 23:28:16 +09:00
behzad nouri
e1021d9f83
removes redundant epoch stakes cache in retransmit ( #14781 )
...
Following d6d76219b , staked nodes computed from vote accounts are
already cached in runtime::Stakes, so the caching in retransmit_stage is
redundant.
2021-01-24 21:15:09 +00:00
behzad nouri
491b059755
broadcasts duplicate shreds through gossip ( #14699 )
2021-01-24 15:47:43 +00:00
behzad nouri
8e581601d6
patches crds vote-index assignment bug ( #14438 )
...
If tower is full, old votes are evicted from the front of the deque:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L367-L373
whereas recent votes, if expired, are evicted from the back:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L529-L537
As a result, from a single tower_index scalar, we cannot infer which crds-vote
should be overwritten:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L576
In addition, there is an off-by-one bug in the existing code. tower_index is
bounded by MAX_LOCKOUT_HISTORY - 1:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/consensus.rs#L382
So, it is at most 30, whereas MAX_VOTES is 32:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L29
Which means that this branch is never taken:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L590-L593
so the crds table always keeps the 29 **oldest** votes by wallclock, and then
only overwrites the 30th one each time (i.e. a tally of only the two most
recent votes).
2021-01-21 13:08:07 +00:00
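A self-contained illustration of the ambiguity (a plain VecDeque standing in for the tower): a full tower evicts from the front while an expired lockout pops from the back, so a single scalar index cannot say which stored vote was replaced.

```rust
use std::collections::VecDeque;

fn main() {
    let mut tower: VecDeque<u64> = (0..5).collect(); // votes on slots 0..4
    tower.pop_front(); // tower full: the *oldest* vote is evicted
    tower.pop_back();  // lockout expired: the *newest* vote is evicted
    tower.push_back(7);
    // The two evictions hit opposite ends of the deque, yet a single
    // tower_index would describe them the same way.
    println!("{:?}", tower); // [1, 2, 3, 7]
}
```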
behzad nouri
b5fd0ed859
rewrites turbine retransmit peers computation ( #14584 )
2021-01-19 04:18:47 +00:00
Michael Vines
9ddd6f08e8
Persist gossip contact info
2020-12-27 20:46:54 -08:00
behzad nouri
2fd38d9912
indexes votes in crds table ( #14272 )
2020-12-27 13:31:05 +00:00
behzad nouri
49019c6613
obtains staked-nodes from the root-bank ( #14257 )
...
... as opposed to the working bank
2020-12-27 13:28:05 +00:00
Michael Vines
ace360ade2
Multiple entrypoint support
2020-12-22 18:35:31 -08:00
Michael Vines
3373082ffa
Update entrypoint contact info even when shred version adoption is not requested
2020-12-22 18:35:31 -08:00
behzad nouri
a14cfd660a
removes &Arc<Self> receivers ( #14234 )
2020-12-22 23:51:53 +00:00
behzad nouri
691031fefd
limits number of crds values returned when responding to pull requests ( #13739 )
...
Crds values buffered when responding to pull-requests can be very large,
taking a lot of memory. This adds a limit on the number of buffered crds
values based on an outbound data budget.
2020-12-18 18:45:12 +00:00
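A hedged sketch of budget-based truncation (hypothetical helper; the real values are serialized crds entries): stop buffering once the outbound byte budget is spent.

```rust
// Hypothetical sketch: keep pre-serialized values only while they fit in
// the remaining outbound data budget; the rest are simply not buffered.
fn take_within_budget(values: Vec<Vec<u8>>, mut budget_bytes: usize) -> Vec<Vec<u8>> {
    values
        .into_iter()
        .take_while(|value| {
            if value.len() <= budget_bytes {
                budget_bytes -= value.len();
                true
            } else {
                false
            }
        })
        .collect()
}
```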
behzad nouri
6a3797e164
adds crds-value for broadcasting duplicate shreds through gossip ( #14133 )
...
In gossip, the header overhead we get from:
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/cluster_info.rs#L434-L435
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L31-L36
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L73
already exceeds SIZE_OF_NONCE in shreds. We also need additional
meta-data (wallclock, source pubkey, ...), which means that given the
SHRED_PAYLOAD_SIZE, we cannot fit all of this in PACKET_DATA_SIZE:
https://github.com/solana-labs/solana/blob/de9ac43eb/ledger/src/shred.rs#L80
On top of that, we need 2 shred payloads as the proof of duplicate. So
each DuplicateShred crds value includes only a chunk of the payload,
along with the meta-data to reconstruct the full payload from the chunks
on the receiving end.
2020-12-18 14:32:43 +00:00
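A rough sketch of the chunked value's shape (field names are illustrative, not the exact crds_value definition):

```rust
// Hypothetical sketch: one gossip-sized chunk of a duplicate-shred proof.
// The two conflicting shred payloads are concatenated, split into chunks,
// and reassembled on the receiving end using the indices below.
struct DuplicateShredChunk {
    from: [u8; 32],  // source pubkey
    wallclock: u64,
    slot: u64,
    shred_index: u32,
    num_chunks: u8,  // total chunks in this proof
    chunk_index: u8, // which chunk this value carries
    chunk: Vec<u8>,  // slice of the concatenated payloads
}
```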
behzad nouri
d6d76219b6
caches staked nodes computed from vote-accounts ( #13929 )
2020-12-17 21:22:50 +00:00
Michael Vines
7143aaa89b
Clippy
2020-12-14 08:03:29 -08:00
behzad nouri
409fe3bca1
adds the instance token to crds-labels for node-instance crds-values ( #14037 )
...
If a node "a" receives instance-info from node "b1" it will override any
instance-info associated with "b1" pubkey in its crds table. This makes
it less likely that when "b1" receives crds values from "a" (either
through pull or push), it sees other instances of itself (because node
"a" discarded them when it received "b1" instance info).
In order for the crds table to contain all instance-info associated with
the same pubkey at the same time, we need to add the instance tokens to
the keys in the crds table (i.e. the CrdsValueLabel).
2020-12-10 17:01:55 +00:00
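A minimal sketch of the key change (simplified variants; the real CrdsValueLabel has many more): the node-instance label carries the instance token alongside the pubkey, so distinct instances get distinct table entries.

```rust
// Hypothetical sketch of the label: NodeInstance values are keyed by both
// the pubkey and the random per-process instance token.
#[derive(PartialEq, Eq, Hash)]
enum CrdsValueLabel {
    ContactInfo([u8; 32]),       // keyed by pubkey only
    NodeInstance([u8; 32], u64), // pubkey *and* instance token
}
```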
behzad nouri
1d267eae6b
std::process::exit to kill all threads
2020-12-09 10:24:23 -08:00
behzad nouri
895d7d6a65
removes RwLock on ClusterInfo.instance
2020-12-09 10:24:23 -08:00
behzad nouri
542198180a
pushes node-instance along with version early in gossip
2020-12-09 10:24:23 -08:00
behzad nouri
8cd5eb9863
checks for duplicate validator instances using gossip
2020-12-09 10:24:23 -08:00
behzad nouri
6706f2b3bb
removes recursive read-locks on gossip ( #13973 )
...
ClusterInfo::tvu_peers acquires a read-lock on gossip:
https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1171-L1185
and so ClusterInfo::repair_peers recursively locks gossip for
read twice:
https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1202-L1223
But std::sync::RwLock is not re-entrant (recursive).
2020-12-06 15:14:49 +00:00
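A sketch of the hazard and the fix (simplified types): std::sync::RwLock can deadlock if the same thread read-locks twice while a writer is queued in between, so the remedy is to acquire the lock once and do all reads under that guard.

```rust
use std::sync::RwLock;

struct Gossip {
    peers: Vec<String>,
}

// Anti-pattern: a helper that takes its own read lock; calling it while
// already holding the lock is a recursive acquisition, which
// std::sync::RwLock does not guarantee to be safe.
fn tvu_peers(gossip: &RwLock<Gossip>) -> Vec<String> {
    gossip.read().unwrap().peers.clone()
}

// Fix sketch: take the read lock once and filter under the same guard,
// never re-entering the lock.
fn repair_peers(gossip: &RwLock<Gossip>) -> Vec<String> {
    let guard = gossip.read().unwrap();
    guard.peers.iter().filter(|p| !p.is_empty()).cloned().collect()
}
```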
behzad nouri
c3048b451d
samples repair peers using WeightedIndex ( #13919 )
...
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, each sample takes only
O(log(n)) work; this can be achieved if RepairCache caches the weighted
index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83
Also, the repair-peers code can be reorganized to remove redundant
unlock-then-lock code.
2020-12-03 14:26:07 +00:00
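For reference, sampling with rand's WeightedIndex looks roughly like this (the weights here are made up):

```rust
use rand::distributions::{Distribution, WeightedIndex};

fn main() {
    // Stake-like weights for four hypothetical repair peers.
    let weights = [10u64, 1, 100, 5];
    // Building the index does the O(n) prefix-sum work once ...
    let dist = WeightedIndex::new(&weights).unwrap();
    let mut rng = rand::thread_rng();
    // ... and each sample costs one random number plus a binary search.
    let peer = dist.sample(&mut rng);
    println!("sampled peer index: {}", peer);
}
```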
Tyera Eulberg
10c81a2448
Remove rpc_banks from validator ( #13882 )
...
* Remove rpc_banks from validator
* Bump abi-digest
2020-12-02 03:25:09 +00:00
behzad nouri
26bf2b7e45
processes pull-request callers only once per unique caller ( #13750 )
...
process_pull_requests acquires a write lock on crds table to update
records timestamp for each of the pull-request callers:
https://github.com/solana-labs/solana/blob/3087c9049/core/src/crds_gossip_pull.rs#L287-L300
However, pull-requests overlap a lot in callers, so this function ends
up doing a lot of redundant work.
This commit obtains unique callers before acquiring an exclusive lock on
crds table.
2020-11-22 17:51:14 +00:00
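A minimal sketch of the reordering (a hypothetical callback stands in for the crds write path): dedupe callers first, then update each unique caller once under the exclusive lock.

```rust
use std::collections::HashSet;

// Hypothetical sketch: collect unique callers *before* taking the write
// lock, so each record timestamp is touched exactly once.
fn update_record_timestamps(
    callers: &[[u8; 32]],
    now: u64,
    mut update: impl FnMut(&[u8; 32], u64),
) {
    let unique: HashSet<&[u8; 32]> = callers.iter().collect();
    // ... acquire the exclusive crds lock here, once ...
    for caller in unique {
        update(caller, now);
    }
}
```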
sakridge
c1eb350c47
Allow contact debug interval to be adjusted ( #13737 )
2020-11-20 14:47:37 -08:00
behzad nouri
b58f69297f
makes crds fields private ( #13703 )
...
Crds fields should maintain several invariants between themselves, so
exposing them as public fields can be bug-prone. In addition, these
invariants are asserted on every write:
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L138-L154
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L239-L262
which adds extra instructions and is not optimal. Were these fields
private, the asserts would be redundant.
2020-11-19 20:57:40 +00:00
behzad nouri
1ffab5de77
breaks prunes data into chunks to fit into packets ( #13613 )
...
Validator logs show that prune messages are dropped because they exceed
packet data size:
https://github.com/solana-labs/solana/blob/f25c969ad/perf/src/packet.rs#L90-L92
This can exacerbate gossip traffic by redundantly increasing push
messages across the network. The workaround is to break prunes into smaller
chunks and send them over multiple messages.
2020-11-19 16:38:01 +00:00
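A toy sketch of the workaround (sizes are illustrative): cap the number of pruned pubkeys per message so each message serializes under the packet limit.

```rust
// Hypothetical sketch: split a large prune list into bounded chunks, each
// sent as its own message. `max_prunes_per_msg` would be derived from
// PACKET_DATA_SIZE minus the prune-message header, divided by pubkey size.
fn prune_chunks(prunes: &[[u8; 32]], max_prunes_per_msg: usize) -> Vec<Vec<[u8; 32]>> {
    assert!(max_prunes_per_msg > 0);
    prunes
        .chunks(max_prunes_per_msg)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```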
behzad nouri
5e8490ab9d
packs more crds-values in a single gossip packet ( #13500 )
...
split_gossip_messages:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L1536-L1574
splits crds-values into chunks that fit into a gossip packet. However, it
uses a global upper-bound for the header size across all protocols:
https://github.com/solana-labs/solana/blob/a97c04b40/core/src/cluster_info.rs#L90-L93
This can be wasteful, as the specific gossip protocol can have a smaller
header than this upper-bound (e.g. Protocol::PushMessage is 170 bytes
smaller). Packing more crds-values into one gossip packet avoids the
overhead of separate packets and reduces the total number of bytes sent
over the wire.
This commit updates the splitting function to take a max-chunk-size
argument. At call-site, this value is set to the size of the protocol
which the values are sent over.
2020-11-15 18:23:59 +00:00
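A hedged sketch of the size-aware split (pre-serialized values for simplicity): the caller passes a max-chunk-size equal to the packet limit minus the specific protocol's header.

```rust
// Hypothetical sketch: greedily pack serialized crds values into chunks of
// at most `max_chunk_size` bytes; a value larger than the limit still gets
// its own (oversized) chunk for the caller to reject.
fn split_gossip_messages(values: Vec<Vec<u8>>, max_chunk_size: usize) -> Vec<Vec<Vec<u8>>> {
    let mut chunks = Vec::new();
    let mut chunk: Vec<Vec<u8>> = Vec::new();
    let mut chunk_size = 0;
    for value in values {
        if !chunk.is_empty() && chunk_size + value.len() > max_chunk_size {
            chunks.push(std::mem::take(&mut chunk));
            chunk_size = 0;
        }
        chunk_size += value.len();
        chunk.push(value);
    }
    if !chunk.is_empty() {
        chunks.push(chunk);
    }
    chunks
}
```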
behzad nouri
cbea9ebc34
indexes nodes' contact infos in crds table ( #13553 )
...
In several places in gossip code, the entire crds table is scanned only
to filter out nodes' contact-infos. Currently on mainnet, the crds table
holds ~70k values while there are only ~470 nodes, so the full table scan
is inefficient. Instead, we can maintain an index of only the nodes'
contact-infos.
2020-11-15 16:38:04 +00:00
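A minimal sketch of such an index (simplified key/value types): the set of contact-info keys is kept in sync on every write, so iterating nodes touches ~470 entries instead of ~70k.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical sketch: a side index of which table keys are contact-infos.
struct Crds {
    table: HashMap<[u8; 32], Vec<u8>>, // key -> serialized value, simplified
    nodes: HashSet<[u8; 32]>,          // keys holding contact-infos
}

impl Crds {
    fn insert(&mut self, key: [u8; 32], value: Vec<u8>, is_contact_info: bool) {
        self.table.insert(key, value);
        if is_contact_info {
            self.nodes.insert(key); // index maintained on the write path
        }
    }
}
```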
behzad nouri
73ac104df2
propagates errors out of Packet::from_data ( #13445 )
...
Packet::from_data is ignoring serialization errors:
https://github.com/solana-labs/solana/blob/d08c3232e/sdk/src/packet.rs#L42-L48
This is likely never useful, as the packet will be sent over the wire,
consuming bandwidth, but at the receiving end it will either fail to
deserialize or be invalid.
This commit will propagate the errors out of the function to the
call-site, allowing the call-site to handle the error.
2020-11-08 15:10:03 +00:00
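A sketch of the signature change (assuming bincode serialization, as the sdk uses; the fixed packet buffer handling is elided): the error goes to the caller instead of being ignored.

```rust
// Hypothetical sketch: return the serialization result instead of
// swallowing it, so the call-site can drop the packet before it is sent.
fn packet_from_data<T: serde::Serialize>(data: &T) -> bincode::Result<Vec<u8>> {
    let bytes = bincode::serialize(data)?; // was previously ignored on error
    Ok(bytes)
}
```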
behzad nouri
7f4debdad5
drops older gossip packets when load shedding ( #13364 )
...
Gossip drops incoming packets when overloaded:
https://github.com/solana-labs/solana/blob/f6a73098a/core/src/cluster_info.rs#L2462-L2475
However, newer packets are dropped in favor of the older ones.
This is probably not ideal, as newer packets are more likely to contain
more recent data, so dropping them keeps the validator state lagging.
2020-11-05 17:14:28 +00:00
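A toy sketch of the shedding direction (packets stand in as byte vectors): when over capacity, drop from the older end of the queue.

```rust
// Hypothetical sketch: keep the newest `max_packets` packets, dropping the
// older ones, since newer packets likely carry more recent gossip state.
fn shed_load(mut packets: Vec<Vec<u8>>, max_packets: usize) -> Vec<Vec<u8>> {
    let len = packets.len();
    if len > max_packets {
        packets.drain(..len - max_packets); // discard the oldest packets
    }
    packets
}
```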
behzad nouri
8f0796436a
shares the lock on gossip when processing prune messages ( #13339 )
...
Processing prune messages acquires an exclusive lock on gossip:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L1824-L1825
This can be reduced to a shared lock if active-sets are changed to use
atomic bloom filters:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/crds_gossip_push.rs#L50
2020-11-05 15:42:00 +00:00
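A minimal sketch of an atomic bloom filter (hashing and sizing elided): bits can be set through a shared reference, so prune processing only needs gossip's read lock.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical sketch: setting a bit needs only &self, unlike a Vec<u64>
// bloom filter, which would require &mut self and hence a write lock.
struct AtomicBloom {
    bits: Vec<AtomicU64>, // assumed non-empty
}

impl AtomicBloom {
    fn set(&self, hash: u64) {
        let num_bits = self.bits.len() as u64 * 64;
        let bit = hash % num_bits;
        self.bits[(bit / 64) as usize].fetch_or(1 << (bit % 64), Ordering::Relaxed);
    }
}
```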
behzad nouri
118ce47b97
measures processing time of each kind of gossip packets ( #13366 )
2020-11-05 15:34:34 +00:00