Commit Graph

2137 Commits

Author SHA1 Message Date
Michael Vines
3d3bdcb966 Drop Error suffix from enum values to avoid the enum_variant_names clippy lint
(cherry picked from commit 4a12c715a3)
2021-06-18 19:59:20 -07:00
mergify[bot]
0e7512a225 Fix Nightly Clippy Warnings (backport #18065) (#18070)
* chore: cargo +nightly clippy --fix -Z unstable-options

(cherry picked from commit 6514096a67)

# Conflicts:
#	core/src/banking_stage.rs
#	core/src/cost_model.rs
#	core/src/cost_tracker.rs
#	core/src/execute_cost_table.rs
#	core/src/replay_stage.rs
#	core/src/tvu.rs
#	ledger-tool/src/main.rs
#	programs/bpf_loader/build.rs
#	rbpf-cli/src/main.rs
#	sdk/cargo-build-bpf/src/main.rs
#	sdk/cargo-test-bpf/src/main.rs
#	sdk/src/secp256k1_instruction.rs

* chore: cargo fmt

(cherry picked from commit 789f33e8db)

* Updates BPF program assert_instruction_count tests.

(cherry picked from commit c1e03f3410)

# Conflicts:
#	programs/bpf/tests/programs.rs

* Resolve conflicts

Co-authored-by: Alexander Meißner <AlexanderMeissner@gmx.net>
Co-authored-by: Michael Vines <mvines@gmail.com>
2021-06-18 20:02:48 +00:00
mergify[bot]
e9c234d89f Make account shrink configurable #17544 (backport #17778) (#18013)
* Make account shrink configurable #17544 (#17778)

1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true

Fixes #17544

(cherry picked from commit 269d995832)

# Conflicts:
#	core/tests/snapshots.rs
#	ledger/src/bank_forks_utils.rs
#	runtime/src/accounts_db.rs
#	runtime/src/snapshot_utils.rs

* fix some merge errors

Co-authored-by: Lijun Wang <83639177+lijunwangs@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
2021-06-17 17:32:03 +00:00
mergify[bot]
4df9da5c48 validator: run poh speed test earlier in start up (#18024)
(cherry picked from commit 5bc6c89adc)

Co-authored-by: Trent Nelson <trent@solana.com>
2021-06-16 23:28:15 +00:00
mergify[bot]
6479c11e9a Don't store votes unless we are leader soon (#17803) (#17894)
(cherry picked from commit 0feac57cb0)

Co-authored-by: sakridge <sakridge@gmail.com>
2021-06-16 21:28:06 +00:00
mergify[bot]
ef205593c5 removes port-based forwarding logic from turbine retransmit (#17716) (#17973)
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
https://github.com/solana-labs/solana/issues/6672
https://github.com/solana-labs/solana/pull/7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
https://github.com/solana-labs/solana/pull/17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.

(cherry picked from commit 161838655c)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-06-15 15:16:20 +00:00
mergify[bot]
e2e41a29eb Don't use pinned memory when unnecessary (#17832) (#17934)
Reports of excessive GPU memory usage and errors
from cudaHostRegister. There are some cases where pinning is
not required.

(cherry picked from commit eeee75c5be)

Co-authored-by: sakridge <sakridge@gmail.com>
2021-06-14 16:30:51 +00:00
mergify[bot]
274a238a00 Port unconfirmed duplicate tracking logic from ProgressMap to ForkChoice (#17779) (#17889)
(cherry picked from commit c8535be0e1)

Co-authored-by: carllin <carl@solana.com>
2021-06-11 11:41:59 +00:00
mergify[bot]
10507f0ade Account for duplicate before a bank is frozen or replayed (#17866) (#17883)
(cherry picked from commit afafa624a3)

Co-authored-by: carllin <carl@solana.com>
2021-06-11 07:06:42 +00:00
mergify[bot]
738df79394 Add local cluster tests that broadcast duplicate slots (#13995) (#17863)
* Add duplicate node local cluster test

* fix clippy

* remove dupe test

(cherry picked from commit 050bb5446d)

Co-authored-by: Justin Starry <justin@solana.com>
2021-06-10 03:17:44 +00:00
mergify[bot]
c9318f8dc2 Wrap long lines (#17842)
(cherry picked from commit e5e7390d44)

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-06-09 08:10:34 +00:00
mergify[bot]
9f35db28e5 Switch EpochSlots to be frozen slots, not completed slots (#17168) (#17776)
(cherry picked from commit 96ba2edfeb)

Co-authored-by: carllin <carl@solana.com>
2021-06-07 22:51:30 +00:00
mergify[bot]
fd68b4e7a8 Support out of band dumping of unrooted slots in AccountsDb (#17269) (#17777)
* Accounts dumping logic

* Add test for interaction between cache flush and remove_unrooted_slot()

* Update comments

* Rename

* renaming

* Add more comments

* Renaming

* Fixup test and bad check

(cherry picked from commit bbcdf073ba)

Co-authored-by: carllin <carl@solana.com>
2021-06-07 09:37:55 +00:00
mergify[bot]
e247625025 Create solana-poh and move remaining rpc modules to solana-rpc (backport #17698) (#17745)
* Create solana-poh and move remaining rpc modules to solana-rpc (#17698)

* Create solana-poh crate

* Move BigTableUploadService to solana-ledger

* Add solana-rpc to workspace

* Move dependencies to solana-rpc

* Move remaining rpc modules to solana-rpc

* Single use statement solana-poh

* Single use statement solana-rpc

(cherry picked from commit 544b3c0d17)

# Conflicts:
#	Cargo.lock
#	banking-bench/Cargo.toml
#	core/Cargo.toml
#	core/benches/banking_stage.rs
#	local-cluster/Cargo.toml
#	rpc/Cargo.toml
#	stake-monitor/Cargo.toml
#	validator/Cargo.toml

* Fix conflicts & versions

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-04 18:19:08 +00:00
mergify[bot]
0de1ce0c2c adds fallback logic if retransmit multicast fails (#17714) (#17742)
In retransmit-stage, based on the packet.meta.seed and resulting
children/neighbors, each packet is sent to a different set of peers:
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L421-L457

However, current code errors out as soon as a multicast call fails,
which will skip all the remaining packets:
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can exacerbate packets loss in turbine.

This commit:
  * keeps iterating over retransmit packets for loop even if some
    intermediate sends fail.
  * adds a fallback to UdpSocket::send_to if multicast fails.

Recent discord chat:
https://discord.com/channels/428295358100013066/689412830075551748/849530845052403733

(cherry picked from commit be957f25c9)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2021-06-04 16:20:44 +00:00
mergify[bot]
5e7db52087 Avoid full-range compactions with periodic filtered b.g. ones (backport #16697) (#17741)
* Avoid full-range compactions with periodic filtered b.g. ones (#16697)

* Update rocksdb to v0.16.0

* Promote the infrequent and important log to info!

* Force background compaction by ttl without manual compaction

* Fix test

* Support no compaction mode in test_ledger_cleanup_compaction

* Fix comment

* Make compaction_interval customizable

* Avoid major compaction with periodic filtering...

* Adress lazy_static, special cfs and range check

* Clean up a bit and add comment

* Add comment

* More comments...

* Config code cleanup

* Add comment

* Use .conflicts_with()

* Nullify unneeded delete_range ops for special CFs

* Some clean ups

* Clarify the locking intention

* Ensure special CFs' consistency with PurgeType::CompactionFilter

* Fix comment

* Fix bad copy paste

* Fix various types...

* Don't use tuples

* Add a unit test for compaction_filter

* Fix typo...

* Remove flag and just use new behavior always

* Fix wrong condition negation...

* Doc. about no set_last_purged_slot in purge_slots

* Write a test and fix off-by-one bug....

* Apply suggestions from code review

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Follow up to github review suggestions

* Fix line-wrapping

* Fix conflict

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
(cherry picked from commit 1f97b2365f)

# Conflicts:
#	ledger/src/blockstore_db.rs

* Fix conflict

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
2021-06-04 14:38:02 +00:00
mergify[bot]
893df9b277 Rename ValidatorExit and move to sdk (#17728) (#17729)
(cherry picked from commit 3a647c4bea)

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-06-04 04:38:49 +00:00
Tyera Eulberg
ab581dafc2 Add block height to ConfirmedBlock structs (#17523)
* Add BlockHeight CF to blockstore

* Rename CacheBlockTimeService to be more general

* Cache block-height using service

* Fixup previous proto mishandling

* Add block_height to block structs

* Add block-height to solana block

* Fallback to BankForks if block time or block height are not yet written to Blockstore

* Add docs

* Review comments
2021-05-26 22:16:16 -06:00
Michael Vines
9541411c15 Plumb transaction-level rewards (aka "rent debits") into the getTransaction RPC method 2021-05-27 03:05:05 +00:00
carllin
52dccc656a Purge slots greater than new last index (#16071) 2021-05-26 16:12:57 -07:00
Michael Vines
cbce440af4 simulateTransaction can now return accounts modified by the simulation 2021-05-26 14:20:23 -07:00
Tyera Eulberg
6abe089740 Add custom error for tx-history queries when node does not support (#17494) 2021-05-26 13:27:41 -06:00
Tyera Eulberg
9a5330b7eb Move gossip modules into solana-gossip crate (#17352)
* Move gossip modules to solana-gossip

* Update Protocol abi digest due to move

* Move gossip benches and hook up CI

* Remove unneeded Result entries

* Single use statements
2021-05-26 09:15:46 -06:00
Tyera Eulberg
e9bc1c6b07 Add last valid block height to rpc Fees (#17506)
* Add last_valid_block_height to fees rpc

* Add getBlockHeight rpc

* Update docs
2021-05-26 07:26:19 +00:00
Justin Starry
660d37aadf sigVerify conflicts with replace, add tests 2021-05-25 17:32:00 -07:00
Justin Starry
e14f3eb529 rename flag 2021-05-25 17:32:00 -07:00
Justin Starry
96cef5260c Add a flag to simulateTransaction to use most recent blockhash 2021-05-25 17:32:00 -07:00
carllin
d8bc56fa51 Refactor purge_slots_from_cache_and_store() and handle_reclaims() (#17319) 2021-05-24 13:51:17 -07:00
Tyera Eulberg
41ec1c8d50 Add blockstore-root-scan for api nodes on boot (#17402)
* Add blockstore-root-scan for api nodes on boot

* Ensure cluster-confirmed root and parents are set as root in blockstore in load_frozen_forks()

* Plumb rpc-scan-and-fix-roots validator flag
2021-05-24 13:24:47 -06:00
Michael Vines
30b60a976b Avoid ip_echo_server unwrap 2021-05-24 12:10:50 -07:00
behzad nouri
e867d7f3b8 removes Crds::lookup and lookup_versioned (#17438) 2021-05-24 18:21:54 +00:00
sakridge
a8dca3976b Refactor genesis download/load/check functions (#17276)
* Refactor genesis ingest functions

* Consolidate genesis.bin/genesis.tar.bz2 references
2021-05-24 16:45:36 +02:00
behzad nouri
9d112cf41f encapsulates purged values bookkeeping into crds module (#17265)
For all code paths (gossip push, pull, purge, etc) that remove or
override a crds value, it is necessary to record hash of values purged
from crds table, in order to exclude them from subsequent pull-requests;
otherwise the next pull request will likely return outdated values,
wasting bandwidth:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491

Currently this is done all over the place in multiple modules, and this
has caused bugs in the past where purged values were not recorded.

This commit encapsulated this bookkeeping into crds module, so that any
code path which removes or overrides a crds value, also records the hash
of purged value in-place.
2021-05-24 13:47:21 +00:00
behzad nouri
060332c704 indexes crds votes by insert order (#17340)
Crds::get_votes is scanning over all votes in the crds table only to
return those inserted since the given cursor:
https://github.com/solana-labs/solana/blob/2ae57c172/core/src/crds.rs#L250-L266

Having votes indexed by insert order avoids the table scan and will be
more efficient.
2021-05-24 13:35:01 +00:00
behzad nouri
5567305a5f rolls back min number of bloom items for debug builds (#17420)
coverage ci builds are have become flaky presumably because of the
overhead added in https://github.com/solana-labs/solana/pull/17236
for very small test clusters.

This commit uses a smaller min number of bloom items condition on that
if debug assertions are enabled or not.

Previous attempt at fixing the flakiness:
https://github.com/solana-labs/solana/pull/17408
2021-05-23 16:50:19 +00:00
carllin
8664b2cc39 Fix bad assertion (#17401) 2021-05-22 20:18:13 -07:00
behzad nouri
cf1acfb021 uses Duration type for gossip discover timeout 2021-05-22 19:17:36 +00:00
behzad nouri
d6496376ce increases timeout duration for gossip discover 2021-05-22 19:17:36 +00:00
behzad nouri
a7870cda8d records hash of timed-out pull responses
Gossip should record hash of pull responses which are timed out and
fail to insert:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L397-L400

so that they are excluded from the next pull request:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491

otherwise the next pull request will likely include the same timed out
values and waste bandwidth.
2021-05-22 17:02:24 +00:00
Nikita
d41266e4e9 rpc: add context toggle to getProgramAccounts (#17399)
* fix(rpc): return context in get_program_accounts

* doc(rpc): document withContext flag

* fix(rpc): fix comment

Co-authored-by: Michael Vines <mvines@gmail.com>

* fix(rpc): fix doc

Co-authored-by: Michael Vines <mvines@gmail.com>

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-05-22 07:12:21 +00:00
behzad nouri
9471ba61c5 removes redundant slots sort in push_epoch_slots 2021-05-21 19:17:15 +00:00
behzad nouri
9339a6f8f3 locks gossip only once in push_epoch_slots
push_epoch_slots is unlocking and locking again gossip when iterating
over epoch slot indices which is wasteful:
https://github.com/solana-labs/solana/blob/0486df02b/core/src/cluster_info.rs#L915-L929
2021-05-21 19:17:15 +00:00
behzad nouri
ff0e623d30 removes the nested for loop from retransmit-stage
The code can be simplified by just flattening the vector of packets.
2021-05-21 17:10:56 +00:00
behzad nouri
71de021177 adds metric for turbine retransmit tree mismatch
In order to remove port-based forwarding logic in turbine, we need to
first track how often the turbine retransmit/broadcast trees mismatch
across nodes.
One consistency condition is that if the node is on the critical path
(i.e. the first node in each neighborhood), then we expect that the
packet arrives at tvu socket as opposed to tvu-forwards.

This commit adds a metric to track how often above condition is not met.
2021-05-21 17:10:56 +00:00
behzad nouri
2adce67260 extends crds values timeouts if stakes are unknown (#17261)
If stakes are unknown, then timeouts will be short, resulting in values
being purged from the crds table, and consequently higher pull-response
load when they are obtained again from gossip. In particular, this slows
down validator start where almost all values obtained from entrypoint
are immediately discarded.
2021-05-21 15:55:22 +00:00
behzad nouri
5e6b00fe98 prioritizes more recent values in pull responses (#17238)
On the receiving end, the outdated values are discarded, and they will
only waste bandwidth:
https://github.com/solana-labs/solana/blob/3f0480d06/core/src/crds_gossip_pull.rs#L385-L400

This is also exacerbating validator start, since the entrypoint is
returning old values in pull responses, and the validator immediately
discards those; resulting in huge delay until the validator obtains
contact-info of the entrypoint and is able to adopt shred-version and
fully start.
2021-05-21 14:07:46 +00:00
behzad nouri
e8b35a4f7b bumps up min number of bloom items in gossip pull requests (#17236)
When a validator starts, it has an (almost) empty crds table and it only
sends one pull-request to the entrypoint. The bloom filter in the
pull-request targets 10% false rate given the number of items. So, if
the `num_items` is very wrong, it makes a very small bloom filter with a
very high false rate:
https://github.com/solana-labs/solana/blob/2ae57c172/runtime/src/bloom.rs#L70-L80
https://github.com/solana-labs/solana/blob/2ae57c172/core/src/crds_gossip_pull.rs#L48

As a result, it is very unlikely that the validator obtains entrypoint's
contact-info in response. This exacerbates how long the validator will
loop on:
    > Waiting to adopt entrypoint shred version
https://github.com/solana-labs/solana/blob/ed51cde37/validator/src/main.rs#L390-L412

This commit increases the min number of bloom items when making gossip
pull requests. Effectively this will break the entrypoint crds table
into 64 shards, one pull-request for each, a larger bloom filter for
each shard, and increases the chances that the response will include
entrypoint's contact-info, which is needed for adopting shred version
and validator start.
2021-05-21 13:59:26 +00:00
behzad nouri
13b032b2d4 removes manual trait impl for contact-info (#17332)
The current implementations use only the id and disregard other fields,
in particular wallclock. This can lead to bugs where an outdated
contact-info shadows or overrides a current one because they compare
equal.
2021-05-19 20:56:10 +00:00
behzad nouri
e7073ecab1 adds gossip metrics for number of staked nodes (#17330) 2021-05-19 19:25:21 +00:00
Tao Zhu
0781fe1b4f Upgrade Rust to 1.52.0 (#17096)
* Upgrade Rust to 1.52.0
update nightly_version to newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0, include this version to script

* upgrade to Rust 1.52.1

* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
2021-05-19 09:31:47 -05:00