Commit Graph

2333 Commits

Author SHA1 Message Date
mergify[bot]
40c95dde4f prioritizes more recent values in pull responses (#17238) (#17384)
On the receiving end, the outdated values are discarded, and they will
only waste bandwidth:
https://github.com/solana-labs/solana/blob/3f0480d06/core/src/crds_gossip_pull.rs#L385-L400

This is also exacerbating validator start, since the entrypoint is
returning old values in pull responses, and the validator immediately
discards those; resulting in huge delay until the validator obtains
contact-info of the entrypoint and is able to adopt shred-version and
fully start.

(cherry picked from commit 5e6b00fe98)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-21 16:03:04 +00:00
mergify[bot]
0d38f11998 bumps up min number of bloom items in gossip pull requests (#17236) (#17383)
When a validator starts, it has an (almost) empty crds table and it only
sends one pull-request to the entrypoint. The bloom filter in the
pull-request targets 10% false rate given the number of items. So, if
the `num_items` is very wrong, it makes a very small bloom filter with a
very high false rate:
https://github.com/solana-labs/solana/blob/2ae57c172/runtime/src/bloom.rs#L70-L80
https://github.com/solana-labs/solana/blob/2ae57c172/core/src/crds_gossip_pull.rs#L48

As a result, it is very unlikely that the validator obtains entrypoint's
contact-info in response. This exacerbates how long the validator will
loop on:
    > Waiting to adopt entrypoint shred version
https://github.com/solana-labs/solana/blob/ed51cde37/validator/src/main.rs#L390-L412

This commit increases the min number of bloom items when making gossip
pull requests. Effectively this will break the entrypoint crds table
into 64 shards, one pull-request for each, a larger bloom filter for
each shard, and increases the chances that the response will include
entrypoint's contact-info, which is needed for adopting shred version
and validator start.

(cherry picked from commit e8b35a4f7b)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-21 15:21:52 +00:00
mergify[bot]
d2e98cb531 prunes received-cache only once per unique owner's key (#17039) (#17337)
(cherry picked from commit 0e646d10bb)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-19 22:50:42 +00:00
mergify[bot]
32681e2739 removes manual trait impl for contact-info (#17332) (#17335)
The current implementations use only the id and disregard other fields,
in particular wallclock. This can lead to bugs where an outdated
contact-info shadows or overrides a current one because they compare
equal.

(cherry picked from commit 13b032b2d4)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-19 22:33:32 +00:00
mergify[bot]
dc0b21fa83 patches flaky test_new_mark_creation_time (#17288) (#17336)
(cherry picked from commit f7b0184f81)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-19 22:22:24 +00:00
mergify[bot]
f80af6dc1c adds gossip metrics for number of staked nodes (#17330) (#17333)
(cherry picked from commit e7073ecab1)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-19 20:41:07 +00:00
Tyera Eulberg
409ac4dcfa Bump version to v1.6.10 (#17250) 2021-05-15 01:47:56 +00:00
mergify[bot]
a08a6d55fa test-validator: Display genesis hash in dashboard (backport #17216) (#17225)
* rpc: plumb shred_version through RpcContactInfo

(cherry picked from commit 67e6a3106f)

* test-validator: Display more cluster info in dash

(cherry picked from commit 754c708473)

Co-authored-by: Trent Nelson <trent@solana.com>
2021-05-14 09:56:27 +00:00
mergify[bot]
4313240b1b Return error for excluded secondary-index keys (backport #17193) (#17215)
* Return error for excluded secondary-index keys (#17193)

* Add runtime helpers to check secondary indexes for key

* Add custom rpc error

* Check secondary-index key inclusion in rpc

* Clone complete AccountSecondaryIndexes into rpc to avoid bank query

(cherry picked from commit 27004f1b76)

# Conflicts:
#	core/src/rpc.rs

* Fix conflicts

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-05-13 23:04:01 +00:00
mergify[bot]
733ef4b0b8 type AccountSecondaryIndexes = HashSet (backport #17108) (#17149)
* type AccountSecondaryIndexes = HashSet (#17108)

(cherry picked from commit f39dda00e0)

# Conflicts:
#	runtime/benches/accounts.rs
#	runtime/src/accounts.rs
#	runtime/src/accounts_db.rs
#	runtime/src/accounts_index.rs

* resolve merge errors

Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
2021-05-10 20:55:33 +00:00
mergify[bot]
0cf83887c6 Move block-time caching earlier (#17109) (#17150)
* Require that blockstore block-time only be recognized slot, instead of root

* Move cache_block_time to after Bank freeze

* Single use statement

* Pass transaction_status_sender by reference

* Remove unnecessary slot-existence check before caching block time altogether

* Move block-time existence check into Blockstore::cache_block_time, Blockstore no longer needed in blockstore_processor helper

(cherry picked from commit 6e9deaf1bd)

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-05-10 20:31:56 +00:00
mergify[bot]
094271be7d indexes crds values by their insert order (backport #16809) (#17132)
* indexes crds values by their insert order

(cherry picked from commit dfa3e7a61c)

* reads gossip push messages off crds ordinal index

Having an ordinal index on crds values based on insert order allows to
efficiently filter values using a cursor. In particular
CrdsGossipPush::push_messages hash-map can be replaced with a cursor,
saving on the bookkeepings, purging, etc

(cherry picked from commit 22c02b917e)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-10 00:00:00 +00:00
Michael Vines
65e1b881f9 Bump version to v1.6.9 2021-05-08 06:28:08 +00:00
mergify[bot]
28b9e5b572 getBlockProduction now correctly reports block production (#17116)
(cherry picked from commit d6c076f1b6)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-05-08 04:19:39 +00:00
mergify[bot]
330f42c375 implements cursor for gossip crds table queries (#16952) (#17084)
VersionedCrdsValue.insert_timestamp is used for fetching crds values
inserted since last query:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

So it is crucial that insert_timestamp does not go backward in time when
new values are inserted into the table. However std::time::SystemTime is
not monotonic, or due to workload, lock contention, thread scheduling,
etc, ... new values may be inserted with a stalled timestamp way in the
past. Additionally, reading system time for the above purpose is
inefficient/unnecessary.

This commit adds an ordinal index to crds values indicating their insert
order. Additionally, it implements a new Cursor type for fetching values
inserted since last query.

(cherry picked from commit fa86a335b0)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-06 15:23:01 +00:00
mergify[bot]
970bba495f Add --tower argument to specify where tower files are persisted (#17060)
(cherry picked from commit 9ba2c53b85)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-05-05 20:37:36 +00:00
Michael Vines
524b380a71 Bump version to 1.6.8 2021-05-04 12:46:57 -07:00
mergify[bot]
7723673038 test-validator: Plumb --limit-ledger-size (#17027)
(cherry picked from commit f17b80236f)

Co-authored-by: Trent Nelson <trent@solana.com>
2021-05-04 10:09:53 +00:00
mergify[bot]
32fc4e3d0f Add keys (backport #17014) (#17015)
* Rotate keys

(cherry picked from commit b2778f34f5)

* Key rotation

(cherry picked from commit b948a18841)

* Add keys

(cherry picked from commit 6318705607)

Co-authored-by: publish-docs.sh <maintainers@solana.com>
2021-05-04 01:34:53 +00:00
mergify[bot]
834c96a374 validates gossip addresses before sending pull-requests (backport #16748) (#17009)
* uses Mutex instead of RwLock for ping_cache

(cherry picked from commit 2231017b35)

* validates gossip addresses before sending pull-requests

IP addresses need to be validated before sending packets to them.
This commit, sends a ping packet to nodes before any pull requests.
Pull requests are then only sent to the nodes which have responded with
the correct hash of their respective ping packet.

(cherry picked from commit 7cea2c4466)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-03 19:40:02 +00:00
mergify[bot]
2195d980a2 patches local pending push messages processing (#16833) (#17007)
process_push_messages writes local pending push messages to the crds
table, but it discards the return value:
https://github.com/solana-labs/solana/blob/cf779c63c/core/src/crds_gossip.rs#L96-L102

In order to exclude outdated values from the next pull-request, we need
to record the hash of values purged/overridden by the local push
messages, otherwise pull-responses will return outdated values back to
the node:
https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L447-L452

Additionally, gossip packets arrive and are processed out of order. So,
local pending push messages should be flushed *before* generating bloom
filters for pull-requests, preventing pull-responses returning the same
values back to the node itself. This requires flipping order of
generating pull and push messages:
https://github.com/solana-labs/solana/blob/cf779c63c/core/src/cluster_info.rs#L1757-L1762

Both above bugs cause redundant traffic and bandwidth waste in gossip
pull-responses.

(cherry picked from commit a698e34744)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-05-03 17:27:38 +00:00
mergify[bot]
c6c7feb0c2 Retry latest vote if expired (#16735) (#16927)
(cherry picked from commit b5d30846d6)

Co-authored-by: carllin <carl@solana.com>
2021-05-03 05:13:29 +00:00
mergify[bot]
cc7fc447a4 Distinguish max replayed and max observed vote (#16936) (#16956)
(cherry picked from commit 5981399612)

Co-authored-by: carllin <carl@solana.com>
2021-04-30 00:48:56 +00:00
Michael Vines
49a415414f Add getBlockProduction RPC method 2021-04-28 21:38:53 -07:00
mergify[bot]
25aee12502 retains peer's contact-info when making pull requests (#16715) (#16907)
ClusterInfo::new_pull_requests has to lookup contact-infos:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/cluster_info.rs#L1663-L1673

when it was already available when making pull requests:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/crds_gossip_pull.rs#L232

(cherry picked from commit 25054bfd35)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-28 14:54:54 +00:00
mergify[bot]
d8e8528797 removes delayed crds inserts when upserting gossip table (#16806) (#16905)
It is crucial that VersionedCrdsValue::insert_timestamp does not go
backward in time:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L67-L79

Otherwise methods such as get_votes and get_epoch_slots_since will
break, which will break their downstream flow, including vote-listener
and optimistic confirmation:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

For that, Crds::new_versioned is intended to be called "atomically" with
Crds::insert_verioned (as the comment already says so):
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L126-L129

However, currently this is violated in the code. For example,
filter_pull_responses creates VersionedCrdsValues (with the current
timestamp), then acquires an exclusive lock on gossip, then
process_pull_responses writes those values to the crds table:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L2375-L2392

Depending on the workload and lock contention, the insert_timestamps may
well be in the past when these values finally are inserted into gossip.

To avoid such scenarios, this commit:
  * removes Crds::new_versioned and Crd::insert_versioned.
  * makes VersionedCrdsValue constructor private, only invoked in
    Crds::insert, so that insert_timestamp is populated right before
    insert.

This will improve insert_timestamp monotonicity as long as Crds::insert
is not called with a stalled timestamp. Following commits may further
improve this by calling timestamp() inside Crds::insert, and/or
switching to std::time::Instant which guarantees monotonicity.

(cherry picked from commit 1ac2a8cfa5)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-28 13:36:21 +00:00
mergify[bot]
ed8c796877 moves cluster-info metrics to a separate module (#16883) (#16898)
(cherry picked from commit b17d5eeaee)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-28 04:18:46 +00:00
mergify[bot]
4a35053fba uses current timestamp when flushing local pending push queue (#16808) (#16896)
local_message_pending_push_queue is recording timestamps at the time the
value is created, and uses that when the pending values are flushed:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L321
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds_gossip.rs#L96-L102

which is then used as the insert_timestamp when inserting values in the
crds table:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds_gossip_push.rs#L183

The flushing may happen 100ms after the values are created (or even
later if there is a lock contention). This will cause non-monotone
insert_timestamps in the crds table (where time goes backward),
hindering the usability of insert_timestamps for other computations.

For example both ClusterInfo::get_votes and get_epoch_slots_since rely
on monotone insert_timestamps when values are inserted into the table:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

This commit removes timestamps from local_message_pending_push_queue and
uses current timestamp when flushing the queue.

(cherry picked from commit b468ead1b1)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-28 01:59:29 +00:00
mergify[bot]
5a3bf5c90e limits to data_header.size when combining shreds' payloads (backport #16708) (#16870)
* limits to data_header.size when combining shreds' payloads (#16708)

Shredder::deshred is ignoring data_header.size when combining shreds' payloads:
https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/shred.rs#L940-L961

Also adding more sanity checks on the alignment of data shreds indices.

(cherry picked from commit 0f3ac51cf1)

# Conflicts:
#	ledger/src/shred.rs

* removes backport merge conflicts

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-27 14:44:58 +00:00
mergify[bot]
de6ec11efc records hash of values purged by expired pull-responses (#16800) (#16871)
process_pull_responses should record hash of values purged by expired
responses (as well as unexpired ones):
https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L385-L387

otherwise, these values are not excluded from following pull-requests
(from likely different nodes):
https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L447-L452

and would waste bandwidth should they be included in subsequent
pull-responses.

(cherry picked from commit 3b8d6b59fb)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-27 13:26:11 +00:00
mergify[bot]
7aec87c086 Add getVoteAccounts RPC method parameter to restrict results to a single vote account (#16859)
(cherry picked from commit 59fc33635a)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-27 05:43:44 +00:00
mergify[bot]
4f20798654 removes old runtime feature gates in gossip and turbine (#16633) (#16828)
(cherry picked from commit 9706512115)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-26 18:40:42 +00:00
mergify[bot]
57dd8a555a Disable flaky test_poh_service (#16772) (#16797)
(cherry picked from commit 63436cc2bf)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-24 04:38:27 +00:00
mergify[bot]
d9726e61bc retains crds values if the origin is still active (#16576) (#16771)
Local timestamps are updated for records associated with a pubkey if the
origin is still active:
https://github.com/solana-labs/solana/blob/c8ed14c64/core/src/crds.rs#L301-L311

However this is done inconsistently on some gossip paths (pull requests
and pull responses) but not all (e.g. push messages). Additionally
update_record_timestamp is inefficient since there can be ~800 values
associated with each pubkey.

This commit updates records timestamps only on contact-infos; and,
instead utilizes origin's timestamp when purging old values.

(cherry picked from commit 2c82f2154d)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-23 16:42:54 +00:00
mergify[bot]
786fa4f22e removes first_coding_index from erasure recovery code (#16646) (#16770)
first_coding_index is the same as the set_index and is so redundant:
https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/blockstore_meta.rs#L49-L60

(cherry picked from commit 03194145c0)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-23 13:21:27 +00:00
mergify[bot]
5b74678e37 Ingest votes from gossip into fork choice (#16560) (#16724)
(cherry picked from commit 4c94f8933f)

Co-authored-by: carllin <carl@solana.com>
2021-04-23 07:20:10 +00:00
mergify[bot]
d203bd1998 Add TPU client for sending txs to the current leader tpu port (#16736) (#16762)
* Add TPU client for sending txs to the current leader tpu port

* Update tpu_client.rs

(cherry picked from commit 75b8434b76)

Co-authored-by: Justin Starry <justin@solana.com>
2021-04-23 02:50:30 +00:00
mergify[bot]
fadf1efa41 Update getLeaderSchedule options (#16749) (#16752)
(cherry picked from commit 636b5987af)

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-04-22 21:02:46 +00:00
mergify[bot]
13e176a633 getLeaderSchedule now supports filtered results based on validator identity (#16731)
(cherry picked from commit 6004c0abf5)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-22 02:29:01 +00:00
mergify[bot]
e51d7af847 verify_pubkey() now takes a ref (#16725)
(cherry picked from commit 91b6888e15)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-21 23:22:13 +00:00
mergify[bot]
ae605f8f02 expands number of erasure coding shreds in the last batch in slots (backport #16484) (#16707)
* expands number of erasure coding shreds in the last batch in slots (#16484)

Number of parity coding shreds is always less than the number of data
shreds in FEC blocks:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719

Data shreds are batched in chunks of 32 shreds each:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714

However the very last batch of data shreds in a slot can be small, in
which case the loss rate can be exacerbated.

This commit expands the number of coding shreds in the last FEC block in
slots to: 64 - number of data shreds; so that FEC blocks are always 64
data and parity coding shreds each.

As a consequence of this, the last FEC block has more parity coding
shreds than data shreds. So for some shred indices we will have a coding
shred but no data shreds. This should not cause any kind of overlapping
FEC blocks as in:
https://github.com/solana-labs/solana/pull/10095
since this is done only for the very last batch in a slot, and the next
slot will reset the shred index.

(cherry picked from commit 37b8587d4e)

# Conflicts:
#	core/benches/shredder.rs
#	ledger/src/shred.rs

* removes backport merge conflicts

* ignore the flaky test for now

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-21 15:25:26 +00:00
mergify[bot]
e15ddbb979 Add port and gossip options to solana-test-validator (#16696) (#16698)
(cherry picked from commit 0924c2d070)

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-04-21 03:46:52 +00:00
mergify[bot]
a5794efe16 getVoteAccounts: Limit the length of the epoch_credits array (#16692)
(cherry picked from commit 34addee882)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-20 22:55:17 +00:00
mergify[bot]
a8836649cb uses current local timestamp when recording purged values (#16675)
CrdsGossipPull.purged_values is meant to record recently purged values
so that they are excluded from imminent pull requests, until the entire
cluster have synced to the updated value:
https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds_gossip_pull.rs#L449-L454

However, VersionedCrdsValue.local_timestamp represents the local time
when the value was last updated, and given that crds values may have
different timeouts based on stake, it does not necessarily represent how
recently the value was purged:
https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds.rs#L75-L76

As such, recording current local timestamp when purging values is more
appropriate. Additionally, purge_purged assumes that the purge_values is
sorted in timestamps when draining the old ones; which is not true if
those timestamps are VersionedCrdsValue.local_timestamp:
https://github.com/solana-labs/solana/blob/c826cddbb/core/src/crds_gossip_pull.rs#L563-L571

(cherry picked from commit bc90e04e64)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-04-20 12:42:30 +00:00
mergify[bot]
558a46f5d5 RPC: use finalized as default pubsub commitment level (#16659) (#16666)
* RPC: use finalized as default pubsub commitment level

* update docs

* Fix tests

(cherry picked from commit a7e65c0034)

Co-authored-by: Justin Starry <justin@solana.com>
2021-04-20 09:47:50 +00:00
mergify[bot]
5057aaddc0 Send votes to next leader's TPU instead of our TPU (#16663)
(cherry picked from commit c8b474cd0b)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-20 08:45:58 +00:00
Michael Vines
a1b0f2f681 Increase test timeout 2021-04-19 04:12:16 +00:00
Michael Vines
f59d4f29d9 clippy 2021-04-19 04:12:16 +00:00
mergify[bot]
25491780df Remove unnecessary clone (#16621)
(cherry picked from commit 6907a2366e)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-17 18:31:18 +00:00
Trent Nelson
4e94446fc3 Bump version to v1.6.7 2021-04-16 23:31:30 +00:00