Commit Graph

1959 Commits

Author SHA1 Message Date
1f1eb9f26e Add separate push queue to reduce push lock contention (#12713) 2020-10-13 18:10:25 -07:00
247228ee61 Implementation-defined RPC server errors are now accessible to client/ users 2020-10-13 10:05:44 -07:00
649fe6d3b6 get_vote_accounts: access HashMap directly instead of turning it into an iterator 2020-10-13 04:12:10 +00:00
c5c8da1ac0 Expose all rewards (fees, rent, voting and staking) in RPC getConfirmedBlock and the cli 2020-10-09 21:54:13 -07:00
1f4bcf70b0 Fix various ledger-tool error due to no builtins (#12759)
* Fix various ledger-tool error due to no builtins

* Add missing file...
2020-10-09 12:19:36 -06:00
2c5f83c264 Add new internal accounts (#12740)
Co-authored-by: publish-docs.sh <maintainers@solana.com>
2020-10-09 00:48:32 +00:00
8f5431551e Store program logs in blockstore / bigtable (TransactionWithStatusMeta) (#12678)
* introduce store program logs in blockstore / bigtable

* fix test, transaction logs created for successful transactions

* fix test for legacy bincode implementation around log_messages

* only api nodes should record logs

* truncate transaction logs to 100KB

* refactor log truncate for improved coverage
2020-10-08 12:06:15 -07:00
b5faa11f73 removes invalid/outdated pending push messages early (#12555)
In CrdsGossipPush::new_push_messages:
https://github.com/solana-labs/solana/blob/972619edb/core/src/crds_gossip_push.rs#L211-L228
we already have paid the cost of looking-up the label in crds table and
checking the hash value and wallclock only to find out that in some
cases the value is invalid or is outdated. So might as well remove the
value here rather than wait for the next call to
purge_old_pending_push_messages:
https://github.com/solana-labs/solana/blob/972619edb/core/src/crds_gossip_push.rs#L372
2020-10-07 18:29:20 +00:00
e35889542b RPC: Support base64 encoded transactions
Defaults to base58
2020-10-06 22:41:06 -06:00
7f67d36777 RPC: Check encoded transaction size before decoding 2020-10-06 22:41:06 -06:00
a5c6a78f6d filters out inactive nodes from push options (#12674)
* filters out inactive nodes from push options

https://github.com/solana-labs/solana/pull/12620
patched the DDOS issue with nodes which go offline:
https://github.com/solana-labs/solana/issues/12409

However, offline nodes still see (much lesser) traffic spike, likely
because no origins are pruned from their bloom filter in active set:
https://github.com/solana-labs/solana/blob/aaf3790d8/core/src/crds_gossip_push.rs#L276-L286
and so multiple nodes push redundant duplicate messages to them
simultaneously:
https://github.com/solana-labs/solana/blob/aaf3790d8/core/src/crds_gossip_push.rs#L254-L255

This commit will filter out inactive peers from potential push targets
entirely. To mitigate eclipse attacks, staked nodes are retried
periodically.

* uses current timestamp in test/crds_gossip
2020-10-06 13:48:32 +00:00
71c469c72b Weight push peers by how long we haven't pushed to them (#12620) 2020-10-02 13:57:26 -07:00
75b621160e Add GetConfirmedBlocksWithLimit RPC method 2020-10-01 22:56:17 -07:00
f41a73d76a Expose validator cli arguments for pubsub buffer tuning 2020-10-01 20:30:40 -07:00
1866521df6 retains hash value of outdated responses received from pull requests (#12513)
pull_response_fail_inserts has been increasing:
https://cdn.discordapp.com/attachments/478692221441409024/759096187587657778/pull_response_fail_insert.png
but for outdated values which fail to insert:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L332-L344
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds.rs#L104-L108
are not recorded anywhere, and so the next pull request may obtain the
same redundant payload again, unnecessary taking bandwidth.

This commit holds on to the hashes of failed-inserts for a while, similar
to purged_values:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L380
and filter them out for the next pull request:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L204
2020-10-01 00:39:22 +00:00
c31a34fbcb Include post balance information for rewards (#12598)
* Include post balance information for rewards

* Add post-balance to stored Reward struct

* Handle extended Reward in bigtable

Co-authored-by: Michael Vines <mvines@gmail.com>
2020-09-30 17:57:06 -06:00
3c7b9c2938 Move remaining nonce utils from runtime to SDK 2020-09-30 05:45:42 +00:00
537bbde22e builds crds filters in parallel (#12360)
Based on run-time profiles, the majority time of new_pull_requests is
spent building bloom filters, in hashing and bit-vec ops.

This commit builds crds filters in parallel using rayon constructs. The
added benchmark shows ~5x speedup (4-core machine, 8 threads).
2020-09-29 23:06:02 +00:00
96a7d4dbd8 Query BigTable for block time if does not exist in blockstore (#12560) 2020-09-29 21:39:36 +00:00
ce98088457 Track inserted repair shreds (#12455) 2020-09-29 14:13:21 -07:00
36d55c0667 Increase rpc pubsub max payload to unblock large account notifications (#12548) 2020-09-30 00:09:39 +08:00
0d5258b6d3 separates out ClusterInfo::{gossip,listen} thread-pools (#12535)
https://github.com/solana-labs/solana/pull/12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing https://github.com/solana-labs/solana/pull/12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts https://github.com/solana-labs/solana/pull/12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67
2020-09-29 09:05:31 +00:00
57ed4e4657 patches bug in Crds::find_old_labels with pubkey specific timeout (#12528)
Current code only returns values which are expired based on the default
timeout. Example from the added unit test:
  - value inserted at time 0
  - pubkey specific timeout = 1
  - default timeout = 3
Then at now = 2, the value is expired, but the function fails to return
the value because it compares with the default timeout.
2020-09-29 09:04:40 +00:00
89621adca7 Rpc -> proper optimistic confirmation (#12514)
* Add service to track the most recent optimistically confirmed bank

* Plumb service into ClusterInfoVoteListener and ReplayStage

* Clean up test

* Use OptimisticallyConfirmedBank in RPC

* Remove superfluous notifications from RpcSubscriptions

* Use crossbeam to avoid mpsc recv_timeout panic

* Review comments

* Remove superfluous last_checked_slots, but pass in OptimisticallyConfirmedBank for complete correctness
2020-09-28 20:43:05 -06:00
06f84c65f1 Fix rooted accounts cleanup, simplify locking (#12194)
Co-authored-by: Carl Lin <carl@solana.com>
2020-09-28 16:04:46 -07:00
c94fe9236f purges old pending push messages more efficiently (#12522) 2020-09-28 21:59:59 +00:00
31696a1d72 Port BPFLoader2 activation to FeatureSet and rework built-in program activation 2020-09-28 12:50:19 -07:00
6583c8cffe Add precompile verification to preflight (#12486) 2020-09-27 22:29:00 -07:00
7526bb96f3 Make test_process_rest_api less fragile 2020-09-25 11:40:36 -07:00
c10da16d7b Port instructions sysvar and secp256k1 program activation to FeatureSet 2020-09-25 11:40:36 -07:00
35f5f9fc7b Add feature set identifier to gossiped version information 2020-09-25 11:40:36 -07:00
1d04c1db94 introduce RpcPerfSample and modify getPerformanceSamples output (#12434)
* introduce RpcPerfSample and modify getPerformanceSamples output

* camelCase test results
2020-09-24 14:22:22 -07:00
42f1ef8acb moves gossip-work thread pool cons to ClusterInfo::new (#12402) 2020-09-24 18:36:31 +00:00
6601ec8f26 Record and store invoked instructions in transaction meta (#12311)
* Record invoked instructions and store in transaction meta

* Enable cpi recording if transaction sender is some

* Rename invoked to innerInstructions
2020-09-24 22:36:22 +08:00
731a943239 Remove transaction encoding from storage layer (#12404) 2020-09-24 13:10:29 +08:00
68e5a2ef56 Add RPC notify and banking keys debug (#12396) 2020-09-23 18:46:42 -07:00
e1a212fb79 Bump spl-token (#12395) 2020-09-22 17:08:54 -06:00
65a6bfad09 Add blockstore column to store performance sampling data (#12251)
* Add blockstore column to store performance sampling data

* introduce timer and write performance metrics to blockstore

* introduce getRecentPerformanceSamples rpc

* only run on rpc nodes enabled with transaction history

* add unit tests for get_recent_performance_samples

* remove RpcResponse from rpc call

* refactor to use Instant::now and elapsed for timer

* switch to root bank and ensure not negative subraction

* Add PerfSamples to purge/compaction

* refactor to use Instant::now and elapsed for timer

* switch to root bank and ensure not negative subraction

* remove duplicate constants

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2020-09-22 12:26:32 -07:00
208dd1de3a Move TestValidator into its own module 2020-09-19 08:35:26 -07:00
1a03afccb1 validator/ cleanup 2020-09-19 08:35:26 -07:00
cb8661bd49 Persistent tower (#10718)
* Save/restore Tower

* Avoid unwrap()

* Rebase cleanups

* Forcibly pass test

* Correct reconcilation of votes after validator resume

* d b g

* Add more tests

* fsync and fix test

* Add test

* Fix fmt

* Debug

* Fix tests...

* save

* Clarify error message and code cleaning around it

* Move most of code out of tower save hot codepath

* Proper comment for the lack of fsync on tower

* Clean up

* Clean up

* Simpler type alias

* Manage tower-restored ancestor slots without banks

* Add comment

* Extract long code blocks...

* Add comment

* Simplify returned tuple...

* Tweak too aggresive log

* Fix typo...

* Add test

* Update comment

* Improve test to require non-empty stray restored slots

* Measure tower save and dump all tower contents

* Log adjust and add threshold related assertions

* cleanup adjust

* Properly lower stray restored slots priority...

* Rust fmt

* Fix test....

* Clarify comments a bit and add TowerError::TooNew

* Further clean-up arround TowerError

* Truly create ancestors by excluding last vote slot

* Add comment for stray_restored_slots

* Add comment for stray_restored_slots

* Use BTreeSet

* Consider root_slot into post-replay adjustment

* Tweak logging

* Add test for stray_restored_ancestors

* Reorder some code

* Better names for unit tests

* Add frozen_abi to SavedTower

* Fold long lines

* Tweak stray ancestors and too old slot history

* Re-adjust error conditon of too old slot history

* Test normal ancestors is checked before stray ones

* Fix conflict, update tests, adjust behavior a bit

* Fix test

* Address review comments

* Last touch!

* Immediately after creating cleaning pr

* Revert stray slots

* Revert comment...

* Report error as metrics

* Revert not to panic! and ignore unfixable test...

* Normalize lockouts.root_slot more strictly

* Add comments for panic! and more assertions

* Proper initialize root without vote account

* Clarify code and comments based on review feedback

* Fix rebase

* Further simplify based on assured tower root

* Reorder code for more readability

Co-authored-by: Michael Vines <mvines@gmail.com>
2020-09-19 14:03:54 +09:00
c4913e3c9e SendTransactionServices now exit their thread on channel drop instead of by a flag 2020-09-18 17:29:10 +00:00
75c3690ccd Give the duplicate send_transaction_service a different thread name 2020-09-18 17:29:10 +00:00
9b866d79fb shards crds values based on their hash prefix (#12187)
filter_crds_values checks every crds filter against every hash value:
https://github.com/solana-labs/solana/blob/ee646aa7/core/src/crds_gossip_pull.rs#L432
which can be inefficient if the filter's bit-mask only matches small
portion of the entire crds table.

This commit shards crds values into separate tables based on shard_bits
first bits of their hash prefix. Given a (mask, mask_bits) filter,
filtering crds can be done by inspecting only relevant shards.

If CrdsFilter.mask_bits <= shard_bits, then precisely only the crds
values which match (mask, mask_bits) bit pattern are traversed.
If CrdsFilter.mask_bits > shard_bits, then approximately only
1/2^shard_bits of crds values are inspected.

Benchmarking on a gce cluster of 20 nodes, I see ~10% improvement in
generate_pull_responses metric, but with larger clusters, crds table and
2^mask_bits are both larger, so the impact should be more significant.
2020-09-17 14:05:16 +00:00
32dcce0ac1 RPC: Limit request payload size to 50kB 2020-09-16 20:21:59 +00:00
f6cda2579f Fix off-by-one max payload checks 2020-09-16 12:46:06 -07:00
c6eea94edc Remove stale comment 2020-09-16 08:42:26 -07:00
749208fa32 RPC sendTransaction now returns transaction logs on simulation failure 2020-09-16 08:42:26 -07:00
3930cb865a Add keccak-secp256k1 instruction (#11839)
* Implement keccak-secp256k1 instruction

Verifies eth addreses with ecrecover function

* Move secp256k1 test
2020-09-15 18:23:21 -07:00
daae638781 Add --gossip-validator argument 2020-09-14 20:18:27 -07:00