solana

Author	SHA1	Message	Date
Michael Vines	a1ef2bd74d	Ignore flaky test_pull_request_time_pruning	2021-04-21 12:07:36 -07:00
behzad nouri	37b8587d4e	expands number of erasure coding shreds in the last batch in slots (#16484 ) Number of parity coding shreds is always less than the number of data shreds in FEC blocks: https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719 Data shreds are batched in chunks of 32 shreds each: https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714 However the very last batch of data shreds in a slot can be small, in which case the loss rate can be exacerbated. This commit expands the number of coding shreds in the last FEC block in slots to: 64 - number of data shreds; so that FEC blocks are always 64 data and parity coding shreds each. As a consequence of this, the last FEC block has more parity coding shreds than data shreds. So for some shred indices we will have a coding shred but no data shreds. This should not cause any kind of overlapping FEC blocks as in: https://github.com/solana-labs/solana/pull/10095 since this is done only for the very last batch in a slot, and the next slot will reset the shred index.	2021-04-21 12:47:50 +00:00
Tyera Eulberg	0924c2d070	Add port and gossip options to solana-test-validator (#16696 )	2021-04-21 02:40:52 +00:00
behzad nouri	bc90e04e64	uses current local timestamp when recording purged values CrdsGossipPull.purged_values is meant to record recently purged values so that they are excluded from imminent pull requests, until the entire cluster has synced to the updated value: https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds_gossip_pull.rs#L449-L454 However, VersionedCrdsValue.local_timestamp represents the local time when the value was last updated, and given that crds values may have different timeouts based on stake, it does not necessarily represent how recently the value was purged: https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds.rs#L75-L76 As such, recording current local timestamp when purging values is more appropriate. Additionally, purge_purged assumes that purged_values is sorted by timestamp when draining the old ones; which is not true if those timestamps are VersionedCrdsValue.local_timestamp: https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds_gossip_pull.rs#L563-L571	2021-04-20 11:21:00 +00:00
Michael Vines	c8b474cd0b	Send votes to next leader's TPU instead of our TPU	2021-04-20 00:38:21 -07:00
Michael Vines	b06e93fe5b	Increase test timeout	2021-04-18 20:55:02 -07:00
behzad nouri	e405747409	Revert "Add limit and shrink policy for recycler (#15320 )" This reverts commit `c2e8814dce`.	2021-04-18 19:29:24 +00:00
behzad nouri	d92721aab9	uses timeouts based on stake for filtering pull responses (#16549 ) filter_pull_responses is using default timeout when discarding pull responses (except for ContactInfo): https://github.com/solana-labs/solana/blob/f804ce63c/core/src/crds_gossip_pull.rs#L349-L350 But purging code uses timeouts based on stake: https://github.com/solana-labs/solana/blob/f804ce63c/core/src/cluster_info.rs#L1867-L1870 So the crds value will not be purged from the sender's table and will be sent again over the next pull request.	2021-04-14 20:18:00 +00:00
behzad nouri	f35a6a8be0	prioritizes contact-infos in pull responses (#16541 ) Expired crds values where the contact-info does not exist are wasted: https://github.com/solana-labs/solana/blob/f804ce63c/core/src/crds_gossip_pull.rs#L353-L378 and then are sent again over the next pull-request. Also, the stake of the first response (which can be anything) is used to weight all pull-responses to a node, while the rest of responses can have different stake. https://github.com/solana-labs/solana/blob/f804ce63c/core/src/cluster_info.rs#L2231	2021-04-14 18:45:20 +00:00
Justin Starry	85eb37fab0	Merge pull request from GHSA-8v47-8c53-wwrc * Track transaction check time separately from account loads * banking packet process metrics * Remove signature clone in status cache lookup * Reduce allocations when converting packets to transactions * Add blake3 hash of transaction messages in status cache * Bug fixes * fix tests and run fmt * Address feedback * fix simd tx entry verification * Fix rebase * Feedback * clean up * Add tests * Remove feature switch and fall back to signature check * Bump programs/bpf Cargo.lock * clippy * nudge benches * Bump `BankSlotDelta` frozen ABI hash * Add blake3 to sdk/programs/Cargo.lock * nudge bpf tests * short circuit status cache checks Co-authored-by: Trent Nelson <trent@solana.com>	2021-04-13 00:28:08 -06:00
Christian Drappi	54a04bac3d	Apple M1 compatibility (#16346 ) Co-authored-by: Christian Drappi <christiandrappi@Christians-MacBook-Pro.local>	2021-04-09 17:21:01 -07:00
behzad nouri	22a18a68e3	stops consuming pinned vectors with a recycler (#16441 ) If the vector is pinned and has a recycler, From<PinnedVec> implementation of Vec should clone (instead of consuming) the underlying vector so that the next allocation of a PinnedVec will recycle an already pinned one.	2021-04-09 16:55:24 +00:00
Trent Nelson	b71875df61	cluster-info: Get rid of some integer math while we're here	2021-04-06 00:09:37 +00:00
Trent Nelson	b6b08706b9	cluster-info: Don't subtract non-shred spies from node count	2021-04-06 00:09:37 +00:00
behzad nouri	b041b55028	makes test_pull_request_time_pruning smaller (#16128 )	2021-03-25 22:44:43 +00:00
behzad nouri	a6c23648cb	limits CrdsGossipPull::pull_request_time size (#15793 ) There is no pruning logic on CrdsGossipPull::pull_request_time https://github.com/solana-labs/solana/blob/79ac1997d/core/src/crds_gossip_pull.rs#L172-L174 potentially allowing this to take too much memory. Additionally, CrdsGossipPush::last_pushed_to is pruning recent push timestamps: https://github.com/solana-labs/solana/blob/79ac1997d/core/src/crds_gossip_push.rs#L275-L279 instead of the older ones. Co-authored-by: Nathan Hawkins <utsl@utsl.org>	2021-03-24 18:33:56 +00:00
behzad nouri	570fd3f810	makes turbine peer computation consistent between broadcast and retransmit (#14910 ) get_broadcast_peers is using tvu_peers: https://github.com/solana-labs/solana/blob/84e52b606/core/src/broadcast_stage.rs#L362-L370 which is potentially inconsistent with retransmit_peers: https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1332-L1345 Also, the leader does not include its own contact-info when broadcasting shreds: https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1324 but on the retransmit side, slot leader is removed only _after_ neighbors and children are computed: https://github.com/solana-labs/solana/blob/84e52b606/core/src/retransmit_stage.rs#L383-L384 So the turbine broadcast tree is different between the two stages. This commit: * Removes retransmit_peers. Broadcast and retransmit stages will use tvu_peers consistently. * Retransmit stage removes slot leader _before_ computing children and neighbors.	2021-03-24 13:34:48 +00:00
behzad nouri	f2865dfd63	requires stakes for propagating crds values through gossip (#15561 )	2021-03-12 15:50:14 +00:00
behzad nouri	56923c91bf	limits number of unique pubkeys in the crds table (#15539 )	2021-03-10 20:46:05 +00:00
behzad nouri	5a9896706c	indexes epoch slots in crds table (#15459 ) ClusterInfo::get_epoch_slots_since scans the entire crds table to obtain epoch-slots inserted since a timestamp: https://github.com/solana-labs/solana/blob/013daa8f4/core/src/cluster_info.rs#L1245-L1262 The alternative is to index epoch-slots in crds table ordered by their insert timestamp.	2021-02-26 14:12:04 +00:00
carllin	c2e8814dce	Add limit and shrink policy for recycler (#15320 )	2021-02-24 00:15:58 -08:00
Michael Vines	5df36aec7d	Pacify clippy	2021-02-19 20:08:41 -08:00
behzad nouri	aa3aac766f	adds metrics for inbound/outbound gossip packets counts (#15407 )	2021-02-19 22:49:35 +00:00
behzad nouri	076c20f1ca	checks that prune-messages have the same inner/outer pubkey (#15352 )	2021-02-16 21:06:18 +00:00
behzad nouri	0ad063f4e9	adds flag to disable duplicate instance check (#15006 )	2021-02-03 16:26:17 +00:00
dependabot[bot]	1df93fa2be	chore: bump serde from 1.0.112 to 1.0.118 (#14828 ) * chore: bump serde from 1.0.112 to 1.0.122 Bumps [serde](https://github.com/serde-rs/serde) from 1.0.112 to 1.0.122. - [Release notes](https://github.com/serde-rs/serde/releases) - [Commits](https://github.com/serde-rs/serde/compare/v1.0.112...v1.0.122) Signed-off-by: dependabot[bot] <support@github.com> * [auto-commit] Update all Cargo lock files * Update frozen_abi digest following serde update * Revert "chore: bump serde from 1.0.112 to 1.0.122" This reverts commit `a3ef4442a4`. * Revert "[auto-commit] Update all Cargo lock files" This reverts commit `c41c3b005f`. * chore: bump serde from 1.0.112 to 1.0.118 Bumps [serde](https://github.com/serde-rs/serde) from 1.0.112 to 1.0.118. - [Release notes](https://github.com/serde-rs/serde/releases) - [Commits](https://github.com/serde-rs/serde/compare/v1.0.112...v1.0.118) Signed-off-by: dependabot[bot] <support@github.com> * [auto-commit] Update all Cargo lock files * Remove serum-dex pinning * blind commit! Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com> Co-authored-by: Ryo Onodera <ryoqun@gmail.com>	2021-02-02 23:28:16 +09:00
behzad nouri	e1021d9f83	removes redundant epoch stakes cache in retransmit (#14781 ) Following `d6d76219b`, staked nodes computed from vote accounts are already cached in runtime::Stakes, so the caching in retransmit_stage is redundant.	2021-01-24 21:15:09 +00:00
behzad nouri	491b059755	broadcasts duplicate shreds through gossip (#14699 )	2021-01-24 15:47:43 +00:00
behzad nouri	8e581601d6	patches crds vote-index assignment bug (#14438 ) If tower is full, old votes are evicted from the front of the deque: https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L367-L373 whereas recent votes, if expired, are evicted from the back: https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L529-L537 As a result, from a single tower_index scalar, we cannot infer which crds-vote should be overwritten: https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L576 In addition there is an off-by-one bug in the existing code. tower_index is bounded by MAX_LOCKOUT_HISTORY - 1: https://github.com/solana-labs/solana/blob/2074e407c/core/src/consensus.rs#L382 So, it is at most 30, whereas MAX_VOTES is 32: https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L29 Which means that this branch is never taken: https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L590-L593 so crds table always keeps 29 oldest votes by wallclock, and then only overrides the 30th one each time (i.e. a tally of only two most recent votes).	2021-01-21 13:08:07 +00:00
behzad nouri	b5fd0ed859	rewrites turbine retransmit peers computation (#14584 )	2021-01-19 04:18:47 +00:00
Michael Vines	9ddd6f08e8	Persist gossip contact info	2020-12-27 20:46:54 -08:00
behzad nouri	2fd38d9912	indexes votes in crds table (#14272 )	2020-12-27 13:31:05 +00:00
behzad nouri	49019c6613	obtains staked-nodes from the root-bank (#14257 ) ... as opposed to the working bank	2020-12-27 13:28:05 +00:00
Michael Vines	ace360ade2	Multiple entrypoint support	2020-12-22 18:35:31 -08:00
Michael Vines	3373082ffa	Update entrypoint contact info even when shred version adoption is not requested	2020-12-22 18:35:31 -08:00
behzad nouri	a14cfd660a	removes &Arc<Self> receivers (#14234 )	2020-12-22 23:51:53 +00:00
behzad nouri	691031fefd	limits number of crds values returned when responding to pull requests (#13739 ) Crds values buffered when responding to pull-requests can be very large taking a lot of memory. Added a limit for number of buffered crds values based on outbound data budget.	2020-12-18 18:45:12 +00:00
behzad nouri	6a3797e164	adds crds-value for broadcasting duplicate shreds through gossip (#14133 ) In gossip, the header overhead we get from: https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/cluster_info.rs#L434-L435 https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L31-L36 https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L73 already exceeds SIZE_OF_NONCE in shreds. We also need additional meta-data (wallclock, source pubkey, ...), which means that given the SHRED_PAYLOAD_SIZE, we cannot fit all these in PACKET_DATA_SIZE: https://github.com/solana-labs/solana/blob/de9ac43eb/ledger/src/shred.rs#L80 On top of that, we need 2 shred payloads as the proof of duplicate. So each DuplicateShred crds value includes only a chunk of the payload, along with the meta-data to reconstruct the full payload from the chunks on the receiving end.	2020-12-18 14:32:43 +00:00
behzad nouri	d6d76219b6	caches staked nodes computed from vote-accounts (#13929 )	2020-12-17 21:22:50 +00:00
Michael Vines	7143aaa89b	Clippy	2020-12-14 08:03:29 -08:00
behzad nouri	409fe3bca1	adds the instance token to crds-labels for node-instance crds-values (#14037 ) If a node "a" receives instance-info from node "b1" it will override any instance-info associated with "b1" pubkey in its crds table. This makes it less likely that when "b1" receives crds values from "a" (either through pull or push), it sees other instances of itself (because node "a" discarded them when it received "b1" instance info). In order for the crds table to contain all instance-info associated with the same pubkey at the same time, we need to add the instance tokens to the keys in the crds table (i.e. the CrdsValueLabel).	2020-12-10 17:01:55 +00:00
behzad nouri	1d267eae6b	std::process::exit to kill all threads	2020-12-09 10:24:23 -08:00
behzad nouri	895d7d6a65	removes RwLock on ClusterInfo.instance	2020-12-09 10:24:23 -08:00
behzad nouri	542198180a	pushes node-instance along with version early in gossip	2020-12-09 10:24:23 -08:00
behzad nouri	8cd5eb9863	checks for duplicate validator instances using gossip	2020-12-09 10:24:23 -08:00
behzad nouri	6706f2b3bb	removes recursive read-locks on gossip (#13973 ) ClusterInfo::tvu_peers acquires a read-lock on gossip: https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1171-L1185 and so, ClusterInfo::repair_peers is recursively locking gossip for read twice: https://github.com/solana-labs/solana/blob/f0e934145/core/src/cluster_info.rs#L1202-L1223 But std::sync::RwLock is not re-entrant (recursive).	2020-12-06 15:14:49 +00:00
behzad nouri	c3048b451d	samples repair peers using WeightedIndex (#13919 ) To output one random sample, weighted_best generates n random numbers: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63 WeightedIndex does so with only one random number: https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240 Additionally, if the index is already constructed, it only does a total of O(log(n)) amount of work; which can be achieved if RepairCache caches the weighted index: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83 Also, the repair-peers code can be reorganized to have less redundant unlock-then-lock code.	2020-12-03 14:26:07 +00:00
Tyera Eulberg	10c81a2448	Remove rpc_banks from validator (#13882 ) * Remove rpc_banks from validator * Bump abi-digest	2020-12-02 03:25:09 +00:00
behzad nouri	26bf2b7e45	processes pull-request callers only once per unique caller (#13750 ) process_pull_requests acquires a write lock on crds table to update records timestamp for each of the pull-request callers: https://github.com/solana-labs/solana/blob/3087c9049/core/src/crds_gossip_pull.rs#L287-L300 However, pull-requests overlap a lot in callers and this function ends up doing a lot of redundant duplicate work. This commit obtains unique callers before acquiring an exclusive lock on crds table.	2020-11-22 17:51:14 +00:00
sakridge	c1eb350c47	Allow contact debug interval to be adjusted (#13737 )	2020-11-20 14:47:37 -08:00
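
The coding-shred arithmetic described in commit 37b8587d4e can be sketched as follows. This is a minimal illustration, not the code from ledger/src/shred.rs: the constant and both function names are hypothetical, and it assumes that every non-final FEC block carries as many coding shreds as data shreds, which the commit message implies but does not state outright.

```rust
/// Assumed batch size, taken from the commit message
/// ("Data shreds are batched in chunks of 32 shreds each").
const DATA_SHREDS_PER_FEC_BLOCK: usize = 32;

/// Hypothetical helper: number of parity coding shreds for one FEC block.
/// For the last block in a slot the count is 64 - num_data, so the block
/// always totals 64 shreds; otherwise it is assumed to equal num_data.
fn num_coding_shreds(num_data: usize, is_last_in_slot: bool) -> usize {
    assert!(num_data >= 1 && num_data <= DATA_SHREDS_PER_FEC_BLOCK);
    if is_last_in_slot {
        2 * DATA_SHREDS_PER_FEC_BLOCK - num_data
    } else {
        num_data
    }
}

/// Hypothetical helper: split the data shreds of one slot into FEC blocks
/// and return the (data, coding) shred counts of each block in order.
fn fec_layout(total_data: usize) -> Vec<(usize, usize)> {
    let mut blocks = Vec::new();
    let mut remaining = total_data;
    while remaining > 0 {
        let num_data = remaining.min(DATA_SHREDS_PER_FEC_BLOCK);
        remaining -= num_data;
        // Only the final batch of the slot gets the expanded parity count.
        let is_last = remaining == 0;
        blocks.push((num_data, num_coding_shreds(num_data, is_last)));
    }
    blocks
}
```

Under these assumptions, fec_layout(70) yields the blocks (32, 32), (32, 32) and (6, 58): the small trailing batch of 6 data shreds is protected by 58 coding shreds instead of 6.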


301 Commits