Compare commits

...

698 Commits
v0.7.0 ... v0.8

SHA1 Message Date
390af512de Permit testnets without a GPU 2018-09-26 10:38:10 -07:00
2a9be901da Mark --outfile parameter as required 2018-09-26 10:38:06 -07:00
d794fee66f Remove unused variables and imports after cherry-picking from master 2018-09-19 11:49:47 -07:00
9b66d4d363 Read multiple entries in write stage (#1259)
- Also use rayon to parallelize to_blobs() to maximize CPU usage
2018-09-19 11:49:47 -07:00
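The rayon change in #1259 above is easy to picture. A minimal sketch of fanning to_blobs() across a thread pool, with illustrative stand-ins for the project's Entry/Blob types:

```rust
use rayon::prelude::*;

struct Entry(Vec<u8>); // stand-in for the real Entry type
struct Blob(Vec<u8>);  // stand-in for the real Blob type

// Serialize one entry into a blob (placeholder body).
fn to_blob(entry: &Entry) -> Blob {
    Blob(entry.0.clone())
}

// par_iter() spreads the per-entry work across rayon's thread pool,
// which is the "maximize CPU usage" part of the commit message.
fn to_blobs(entries: &[Entry]) -> Vec<Blob> {
    entries.par_iter().map(to_blob).collect()
}
```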
bff8f2614b Move entry->blob creation out of write stage (#1257)
- The write stage will output vector of entries
- Broadcast stage will create blobs out of the entries
- Helps reduce MIPS requirements for write stage
2018-09-19 11:49:47 -07:00
8f0648e8fc Move register_entry_id() call out of write stage (#1253)
* Move register_entry_id() call out of write stage

- Write stage is MIPS intensive and has become a bottleneck for
  TPU pipeline
- This will reduce the MIPS requirements for the stage

* Fix rust format issues
2018-09-19 11:49:47 -07:00
d81eaf69db Update fetch-perf-libs.sh 2018-09-17 11:54:45 -07:00
b5935a3830 cargo fmt 2018-09-14 20:30:04 -07:00
c1b07d0f21 Upgrade rust stable to 1.29 2018-09-14 20:30:04 -07:00
a1579b5a47 Remove large-network test, it's ignored anyway 2018-09-14 20:11:46 -07:00
77949a4be6 cherry pick readme update 2018-09-13 19:19:48 -07:00
af58940964 Fix missing recycle in recv_from (#1205)
In the error case that i>0 (we have blobs to send)
we break out of the loop and do not push the allocated r
to the v array. We should recycle this blob, otherwise it
will be dropped.
2018-09-13 10:27:24 -07:00
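The bug described in #1205 is easiest to see in code. A sketch with simplified stand-ins for the project's BlobRecycler/SharedBlob types: the blob allocated on the failing iteration must be returned to the pool, or it is silently dropped.

```rust
use std::net::UdpSocket;

struct Blob { data: [u8; 1024], len: usize }
struct Recycler { pool: Vec<Box<Blob>> }

impl Recycler {
    fn allocate(&mut self) -> Box<Blob> {
        self.pool.pop()
            .unwrap_or_else(|| Box::new(Blob { data: [0; 1024], len: 0 }))
    }
    fn recycle(&mut self, blob: Box<Blob>) {
        self.pool.push(blob);
    }
}

// Assumes a nonblocking socket, so recv() eventually returns WouldBlock
// and ends the batch.
fn recv_from(recycler: &mut Recycler, sock: &UdpSocket) -> std::io::Result<Vec<Box<Blob>>> {
    let mut v = Vec::new();
    loop {
        let mut blob = recycler.allocate();
        match sock.recv(&mut blob.data) {
            Ok(n) => {
                blob.len = n;
                v.push(blob);
            }
            Err(e) => {
                recycler.recycle(blob); // the fix: return the unused blob
                if v.is_empty() {
                    return Err(e);
                }
                break; // the i > 0 case: keep what we already received
            }
        }
    }
    Ok(v)
}
```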
21963b8c82 fix "leak" in Blob::recv_from (#1198)
* fix "leak" in Blob::recv_from

fixes #1199
2018-09-13 10:27:24 -07:00
b52230097e groom Fullnode's new_with_bank() to match new() more 2018-09-12 09:24:42 -07:00
a8fdb8a5a7 use a single BlobRecycler per fullnode 2018-09-11 16:56:54 -07:00
297f859631 Change '>=' back to '>' to fix recycling of blobs/packets (#1192)
The recycler itself holds a strong ref to the item, so the count is at
least 1; checking '>=' would therefore always prevent recycling.
2018-09-11 16:52:45 -07:00
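In Arc terms, the corrected check looks something like this sketch (illustrative, not the crate's code):

```rust
use std::sync::Arc;

// The recycler holds one strong reference to every pooled item, so
// Arc::strong_count() is always at least 1 here. Only a count above 1
// means another owner still uses the item; a `>= 1` test is true for
// every item, so nothing would ever be recycled.
fn in_use<T>(item: &Arc<T>) -> bool {
    Arc::strong_count(item) > 1
}
```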
5d19b799af Fix snap configuration for netstat daemon (#1190)
- Also increased the frequency at which the stats are sent
- Fixed file permissions for snapcraft.yaml
2018-09-11 14:49:05 -07:00
af3eb5a16c .sh 2018-09-11 11:29:49 -07:00
b313b7f6f9 Revert "move rpc_server to drop() semantics instead of having its own thread"
This reverts commit 40aa0654fa.
2018-09-10 22:48:33 -07:00
016ee36808 remove -x 2018-09-10 21:40:14 -07:00
c3fc98c48f use gossip to find the leader for every airdrop request 2018-09-10 21:29:45 -07:00
40aa0654fa move rpc_server to drop() semantics instead of having its own thread 2018-09-10 20:25:53 -07:00
bace2880d0 Correct spelling 2018-09-10 19:58:21 -07:00
9d80eefb81 Log the number of accounts each 250k txes (#1178) 2018-09-10 17:40:00 -07:00
1c17c6dd2b Report UDP network statistics (#1176)
* Report UDP network statistics

Fixes #1093

* Address review comments

* Address additional review comments

* Fix shellcheck errors
2018-09-10 15:52:08 -07:00
2be0dbddbb Correct spelling 2018-09-10 13:48:43 -07:00
a91b785ba5 move fullnode trace generation into crdt 2018-09-10 13:47:57 -07:00
0ef05de889 Add sleep to prevent spinning thread 2018-09-10 12:50:28 -07:00
a093d5c809 Fix erasure build 2018-09-10 11:40:26 -06:00
fc64e1853c Initialize Window, not SharedWindow
Wrap with Arc<RwLock<Window>> when/if needed, no earlier.
2018-09-10 11:40:26 -06:00
7f669094de Split window into two modules 2018-09-10 11:40:26 -06:00
5025d89c88 Inline window method implementations 2018-09-10 11:40:26 -06:00
2b44c4504a Use WindowUtil for more idiomatic code 2018-09-10 11:40:26 -06:00
d2c9beb843 Add a trait to pretend Window is an object 2018-09-10 11:40:26 -06:00
9e6d3bf532 Correct spelling 2018-09-10 09:29:01 -07:00
a89b611e9e comments (#1165) 2018-09-09 07:07:38 -07:00
ebcac3c2d1 Use a common solana user on all testnet instances 2018-09-08 22:34:26 -07:00
7029e4395c Fix OOM reporting 2018-09-08 18:57:31 -07:00
5afcdcbbe6 More log grooming 2018-09-08 14:16:34 -07:00
3840b4b516 Groom log output 2018-09-08 14:10:18 -07:00
7aeb6d642b Display log file 2018-09-08 13:59:45 -07:00
1d6c4aacae Retry rsync a couple times before failing 2018-09-08 13:59:45 -07:00
9f5c86e60c Install earlyoom at gce instance startup 2018-09-08 13:59:45 -07:00
9f413fd656 Establish net/scripts/... for better scoping 2018-09-08 13:59:45 -07:00
97c3125a78 improve localnet-sanity's robustness (#1160)
* fix poll_gossip_for_leader() loop to actually wait
         for 30 seconds
    * reduce reuseaddr use to only when necessary,
         try to avoid already bound sockets
    * move nat.rs to netutil.rs
    * add gossip tracing to thin_client and bench-tps
2018-09-09 04:50:43 +09:00
a77aca75b2 Add NO_VALIDATOR_SANITY back 2018-09-07 22:37:05 -07:00
96bfd9478b make all the nodes have a pretty seq id (#1159) 2018-09-08 14:18:18 +09:00
e8206cb2d4 Echo the network address before entering a quiet polling loop 2018-09-07 21:20:00 -07:00
c3af0d9d25 Improve client.log 2018-09-07 21:20:00 -07:00
932c994dc9 Use new bench-tps command-line args 2018-09-07 21:20:00 -07:00
c34d911eaf Migrate Budget DSL to use the Account state (#979)
* Migrate Budget DSL to use the Account state instead of global bank data structures.

* Serialize Instruction into Transaction::userdata.
* Store the pending set in the Account::userdata
* Enforce the token balance rules on contract execution. This becomes the entry point for generic contracts.
* This PR will have a performance impact on the bank. The next set of changes will fix this by locking each account during multi-threaded execution of all the contracts.
* With this change a contract transaction needs to store its state under an address. That address could be the destination of the tokens, or any random address. For the latter, an extra step would be needed to claim the tokens which isn't implemented by budget_dsl at the moment.
* test tracking issue 1157
2018-09-07 20:18:36 -07:00
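A hedged sketch of the storage model the bullets above describe, serializing contract state into an account's userdata with bincode; type and field names are illustrative, not the crate's API:

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Default)]
struct BudgetState {
    pending: Vec<u64>, // stand-in for the pending payment-plan set
}

struct Account {
    tokens: u64,
    userdata: Vec<u8>, // contract state lives here, not in global bank tables
}

fn store_state(account: &mut Account, state: &BudgetState) -> bincode::Result<()> {
    account.userdata = bincode::serialize(state)?;
    Ok(())
}

fn load_state(account: &Account) -> bincode::Result<BudgetState> {
    bincode::deserialize(&account.userdata)
}
```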
ddd1871840 Install libssl1.1 for solanalabs/rust docker image compat 2018-09-07 19:57:41 -07:00
db825788fa Document how to get ssh access into CD testnets 2018-09-07 19:41:13 -07:00
b1b03ec13b Refine docker image tagging to avoid breaking stabilization branches on updates 2018-09-07 18:42:25 -07:00
73a8441add /var/snap is not writable by most users 2018-09-07 17:41:20 -07:00
bf29590f41 WSL needs ReuseAddr in addition to ReusePort (which it doesn't honor) (#1149) 2018-09-08 07:28:22 +09:00
51b27779c9 client changes for TODOs and looping (#1138)
* remove client.sh from snap
* default to ephemeral instead of ~/.config key
* rework CLI for bench-tps
* remove multinode-demo stuff from remote-client.sh
* remove multinode-demo from remote-sanity and localnet-sanity
2018-09-08 07:07:10 +09:00
5169c8d08f Add method to return hash of bank state 2018-09-07 15:38:53 -06:00
0d945e6a92 Groom testnet-sanity logging 2018-09-07 12:45:48 -07:00
1090254ba5 Add datapoints for leader/validator start 2018-09-07 12:45:48 -07:00
e51445d857 🙃 2018-09-07 12:24:34 -07:00
4b47abd3bf Fix --num-nodes argument parsing 2018-09-07 12:20:42 -07:00
71a617b4dc Fix erasure build 2018-09-07 13:18:19 -06:00
a722802c95 Window write lock to read lock 2018-09-07 13:18:19 -06:00
e9f44b6661 window -> window_service 2018-09-07 13:18:19 -06:00
9693de1867 Reposition parameters 2018-09-07 13:18:19 -06:00
f7ea95aed1 Hoist lock, reposition parameters 2018-09-07 13:18:19 -06:00
f07ce59be8 Toggle parameters 2018-09-07 13:18:19 -06:00
da423b6cf0 Hoist read lock 2018-09-07 13:18:19 -06:00
d5f60b68e4 Hoist window write lock 2018-09-07 13:18:19 -06:00
78b3a8f7f9 Hoist repair_window() branches
This probably would have been done if repair_window() was unit-tested.
2018-09-07 13:18:19 -06:00
d77699c126 Do the easy check first
All functions above operate on immutable values, so this shouldn't
change functionality, but no repair_window() tests to be certain.
2018-09-07 13:18:19 -06:00
09ba0dae15 Remove redundant clone() 2018-09-07 13:18:19 -06:00
a5c7575207 Rewrite find_next_missing, call it clear_slots 2018-09-07 13:18:19 -06:00
50f040530b Remove redundant cast 2018-09-07 13:18:19 -06:00
7f99c90539 Simplify using early return and Result::ok() 2018-09-07 13:18:19 -06:00
d8564b725c Don't reference window to get each slot 2018-09-07 13:18:19 -06:00
e4de25442a Hoist write lock
It needed to be passed the lock before, because it contained a
branch where one side didn't require locking. Now that that
defensive programming was hoisted, we can hoist the write lock
as well, leaving a simpler function for unit testing.
2018-09-07 13:18:19 -06:00
3b2ea8fd40 Hoist untested branch in window
If there were unit tests for this function, the author would have
written it this way to make their own life easier.
2018-09-07 13:18:19 -06:00
9a1832ed61 Bump ping timeout 2018-09-07 12:01:43 -07:00
9e45f1f5e2 Doc fixup 2018-09-07 12:01:43 -07:00
ee682d5bc3 Move wallet-sanity.sh out of multinode-demo/ 2018-09-07 12:01:43 -07:00
05decc863f Make set -x more buildkite friendly 2018-09-07 12:01:43 -07:00
506a81e8cc Assume -y 2018-09-07 12:01:43 -07:00
dcb30a8489 Delete leader node first 2018-09-07 12:01:43 -07:00
a2631e89f6 Use consistent style 2018-09-07 12:01:43 -07:00
ab208ddb77 Clean up arg handling 2018-09-07 12:01:43 -07:00
09a48d773a Run bench-tps in a tmux 2018-09-07 12:01:43 -07:00
88298bf321 Add -n option 2018-09-07 12:01:43 -07:00
d252f7f687 Revert "Default to 10 validators"
This reverts commit ed5fbaef06.
2018-09-07 12:01:43 -07:00
533ebc17f2 Install multilog automatically on a CI machine 2018-09-07 11:56:23 -07:00
f4947236dc Keep cargo-target-cache size under 6GB-ish 2018-09-07 11:45:27 -07:00
e088833b81 s/create/start/ 2018-09-06 21:07:11 -07:00
53e16f68d9 Improve error handling 2018-09-06 20:57:05 -07:00
ed5fbaef06 Default to 10 validators 2018-09-06 20:46:49 -07:00
b1bacf12a6 Add some log sections 2018-09-06 20:38:11 -07:00
66ff602659 Rewrite ci/testnet-{deploy,sanity}.sh in terms of net/ primitives 2018-09-06 19:54:39 -07:00
e175c9dea9 Remove ip address hardcode. Fixes #959 2018-09-06 19:54:39 -07:00
5a57d9b5d9 de-y 2018-09-06 19:54:39 -07:00
03e87e4169 Add more metrics 2018-09-06 19:54:39 -07:00
abfff66d53 Retry ssh a couple times before giving up 2018-09-06 19:54:39 -07:00
31dee553d5 Split start/version reporting 2018-09-06 19:54:39 -07:00
9ca6a2d25b Configure boot disk size 2018-09-06 19:54:39 -07:00
a3178c3bc7 Remove unused name tag 2018-09-06 19:54:39 -07:00
aa07bdfbaa Optionally suppress delete confirmation 2018-09-06 19:54:39 -07:00
eaef9be710 Clarify -f 2018-09-06 19:54:39 -07:00
cae345b416 Allow - in prefix 2018-09-06 19:54:39 -07:00
acb1171422 Add -e option 2018-09-06 19:54:39 -07:00
52d8f293b6 Add links to citations
And fix hyphens in quote.
2018-09-06 20:41:05 -06:00
636eb8d058 Add Leslie Lamport quote to README 2018-09-06 20:41:05 -06:00
0fa27f65bb Use the default Pubkey formatter instead of debug_id() 2018-09-06 16:31:47 -06:00
8f94e3f7ae Buffer tokens when switching directions to prevent errors (#1126)
Even if transactions are dropped, accounts will have a buffer
of tokens. This should reduce or eliminate the AccountNotFound errors seen in the
leader while bench-tps is running.
2018-09-06 14:20:01 -07:00
05460eec0d Open multiple sockets for transaction UDP port (#1128)
* Reuse UDP port and open multiple sockets for transaction address

* Fixed failing crdt tests

* Add tests for reusing UDP ports

* Address review comments

* Updated bench-streamer to use multiple receive sockets

* Fix minimum number of recv sockets for bench-streamer

* Address review comments

Fixes #1132

* Moved bind_to function to nat.rs
2018-09-06 14:13:40 -07:00
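The multiple-socket trick in #1128 relies on SO_REUSEPORT. A minimal, Unix-only sketch using the socket2 crate (an assumption; the PR's own bind_to helper lives in nat.rs):

```rust
use socket2::{Domain, Protocol, Socket, Type};
use std::net::{SocketAddr, UdpSocket};

// Bind `count` UDP sockets to the same address; the kernel load-balances
// incoming datagrams across them, so each receive thread can own a socket.
fn bind_shared(addr: SocketAddr, count: usize) -> std::io::Result<Vec<UdpSocket>> {
    (0..count)
        .map(|_| {
            let s = Socket::new(Domain::IPV4, Type::DGRAM, Some(Protocol::UDP))?;
            s.set_reuse_port(true)?; // SO_REUSEPORT: allow shared binding
            s.bind(&addr.into())?;
            Ok(s.into())
        })
        .collect()
}
```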
072d0b67e4 Send deploy metrics to the testnet-specific database 2018-09-06 08:30:03 -07:00
fdc48d521c use USER instead of whoami (#1134)
* use USER instead of whoami

make gcloud_FigureRemoteUsername robust against unsolicited output
   (that I get on login ;) )

validate --prefix argument

* Update gcloud.sh
2018-09-07 00:18:05 +09:00
6560b0e2cc s/whoami/id -un/ 2018-09-05 14:26:21 -07:00
ec38dba209 GCE leader nodes can now be provisioned with a static IP address 2018-09-05 14:26:21 -07:00
d9e4bce6ad Add drop stats to bench-tps (#1127)
See how many transactions made it through
2018-09-05 11:58:41 -07:00
1fd4343621 Add total count to stat (#1124) 2018-09-05 09:28:18 -07:00
8d87627a49 t 2018-09-05 09:09:50 -07:00
aacf27fb76 Add convienience link to current Snap log files 2018-09-05 09:02:02 -07:00
a51536d107 Add log tail hint 2018-09-05 09:02:02 -07:00
1c874fbc1b Make this a little more hacky 2018-09-05 09:02:02 -07:00
0362169671 Better scope leader and validator setup 2018-09-05 09:02:02 -07:00
e2e569cb43 Set rsync url for local deployments 2018-09-05 09:02:02 -07:00
8c51b47e85 Preserve existing ssh config 2018-09-05 09:02:02 -07:00
017eb10e76 Add file header doc 2018-09-05 09:02:02 -07:00
f50aeb0e58 Always add perf-libs to LD_LIBRARY_PATH 2018-09-05 09:02:02 -07:00
48c19d3100 Enable cargo features to be specified 2018-09-05 09:02:02 -07:00
aaf0a23134 Add Tips section 2018-09-05 09:02:02 -07:00
89db85dbf9 Work around concurrent |gcloud compute ssh| terminal issue 2018-09-05 09:02:02 -07:00
e677cda027 Private IP networks now work, and are the default 2018-09-05 09:02:02 -07:00
db9219ccc8 Improve error monitoring 2018-09-05 09:02:02 -07:00
06fd945f85 Set node config correctly 2018-09-05 09:02:02 -07:00
6ad4a81123 s/_/-/g in filenames 2018-09-05 09:02:02 -07:00
bcaa0fdcb1 net/ can now deploy Snaps 2018-09-05 09:02:02 -07:00
2cb1375217 Run gcloud_PrepInstancesForSsh in parallel 2018-09-05 09:02:02 -07:00
9365a47d42 Employ a startup script 2018-09-05 09:02:02 -07:00
6ffe205447 Add -g option 2018-09-05 09:02:02 -07:00
ec3e62dd58 Add net/ sanity 2018-09-05 09:02:02 -07:00
fa07c49cc9 net/ can now deploy Snaps 2018-09-05 09:02:02 -07:00
449d7042f0 Configure metrics correctly 2018-09-05 09:02:02 -07:00
7e2b65374d gce instance types are now configurable 2018-09-05 09:02:02 -07:00
8e39465700 Drop .sh extension to hide from shellcheck 2018-09-05 09:02:02 -07:00
43b4207101 Run oom-monitor in net/ testnets 2018-09-05 09:02:02 -07:00
ff991b87da Add support for deploying from non-Linux machines 2018-09-05 09:02:02 -07:00
c81c19234f Improve incremental speed of docker cargo builds outside of CI 2018-09-05 09:02:02 -07:00
399caf343c Morph gce_multinode-based scripts into net/ 2018-09-05 09:02:02 -07:00
ffb72136c8 Remove account from balances table after error seen (#1120)
If the balance goes to 0, the bank removes the account
from its account table and returns a "no account" error. The thin client
should also update the account to this state, or it will
still have the cached balance from the last successful get_balance().
2018-09-04 21:33:19 -07:00
1a615bde2b Update README.md (#1117)
* Update README.md

* Fix spelling

* Improved punctuation
2018-09-04 20:41:11 -07:00
cf2626a1c5 Update instructions to upgrade nightly docker image 2018-09-04 20:56:40 -06:00
68c72d6f34 Fix nightly build 2018-09-04 20:56:40 -06:00
65f78905cd Install cargo-cov on latest nightly 2018-09-04 20:56:40 -06:00
70a8ae4612 Fixed private IP variable in gcloud script (#1119) 2018-09-04 16:24:19 -07:00
d82ec2634c Fix is_leader boolean (#1115)
A node is the leader if the address is none
2018-09-04 13:38:24 -07:00
b4a7a18334 Update README.md 2018-09-04 13:29:00 -07:00
c44c5f0b09 take into account size of an Entry (#1116) 2018-09-05 05:07:58 +09:00
226d3b9471 Trace recycle() calls (#968)
* trace recycle() calls fixes #810
2018-09-05 05:07:02 +09:00
2752bde683 Print to indicate what drone is doing while waiting for gossip 2018-09-04 13:45:08 -06:00
b8816d722c Fix Block::to_blobs() benchmark
16% speedup, w00t!

name                                control  ns/iter  variable  ns/iter  diff ns/iter   diff %  speedup
bench_block_to_blobs_to_block       29,897            25,807                   -4,090  -13.68%   x 1.16
2018-09-04 07:50:23 -10:00
2aa72cc72e Return a Vec from to_blobs() instead of using a mut parameter 2018-09-04 07:50:23 -10:00
8cc030ef84 Use Vec instead of VecDeque for SharedBlobs 2018-09-04 07:50:23 -10:00
9a9f89293a Better error handling messages for airdrops 2018-09-04 06:46:43 -10:00
501deeef56 accounts should never be negative (#1083) 2018-09-04 06:43:18 -10:00
05f921d544 Don't call println in the test suite 2018-09-04 06:01:32 -10:00
ab7a2960b1 Don't use product name in solana library 2018-09-04 06:01:32 -10:00
4e2deaa33b Less mut 2018-09-04 06:01:32 -10:00
d5ef18337c Remove redundant return value
And don't log the same error twice.
2018-09-04 06:01:32 -10:00
d18ea501b7 Minimize unsafe code 2018-09-04 06:01:32 -10:00
c9a1ac9b8c Don't propagate errors we'll never handle 2018-09-04 06:01:32 -07:00
c2a4cb544e Borrow, don't clone entries 2018-09-04 06:01:32 -10:00
3ab12076e8 Convert voting functions to methods
More idiomatic Rust.
2018-09-04 05:53:58 -10:00
6a383c45fc Update sendTransaction example to reflect new array size 2018-09-04 05:44:10 -10:00
7cc27e7bd1 Doc requestAirdrop rpc method 2018-09-04 05:44:10 -10:00
0464087327 Add api definitions 2018-09-04 05:44:10 -10:00
c193c7de12 Add JSON-RPC API Documentation 2018-09-04 05:44:10 -10:00
61abee204f don't check for snap mode in common.sh, is only relevant to snap daemons (#1113)
snap mode is for daemons, remove it from client (i.e. common.sh)

supply leader info to client via snap
2018-09-04 14:31:54 +09:00
a99dbb2a0c set -x in client.sh 2018-09-04 11:55:04 +09:00
e834c76b40 --count => --num-nodes 2018-09-04 07:07:25 +09:00
7b3c7f148b supply leader and leader_address 2018-09-02 02:27:05 +09:00
fb4b33b81b make the repair_backoff test more robust (#1095)
* make the repair_backoff test more robust

* fix names and magic numbers
2018-08-31 12:40:56 -10:00
25d7dc7b96 fixups 2018-09-01 04:38:18 +09:00
d1f1cbe88f leader-address=>leader-ip 2018-09-01 04:38:18 +09:00
a4e7b6e90c more fixups for client.sh changes 2018-09-01 03:33:21 +09:00
fbc7c9c431 fix client_start to deal with new client.sh 2018-09-01 03:23:05 +09:00
8b248dcf09 specify port 2018-09-01 02:56:24 +09:00
4938aad939 fixups 2018-09-01 02:21:46 +09:00
7e882dfe62 inform all snaps where the network is 2018-09-01 02:21:46 +09:00
5c8cb96f88 rebase fixup 2018-08-31 23:21:07 +09:00
9d1eb4f9ea remove 'localhost' leader (redundant, un-dig-friendly) 2018-08-31 23:21:07 +09:00
210a4d0640 fixup 2018-08-31 23:21:07 +09:00
176e806d94 rework of network rendezvous
* rename NodeInfo field of Node from "data" to "info"
      (touches a lot of files)

  * update client to use gossip to find leader, a la drone

  * rework multinode scripts
      * move more stuff into rust
      * added usage to all
      * no more rsync unless you're a validator (TODO: whack that, too)
  * fullnode doesn't bail if drone isn't up yet, just keeps trying
  * drone doesn't bail if network isn't up yet, just keeps trying
2018-08-31 23:21:07 +09:00
eb4e5a7bd0 fixups 2018-08-31 23:21:07 +09:00
ba27596076 fixups 2018-08-31 23:21:07 +09:00
63e44dcc35 continue rendezvous refactor for gossip and repair
* remove trailing whitespace in ci/audit.sh

  * code review fixups
     * rename GOSSIP_PORT_RANGE => SOLANA_PORT_RANGE
     * remove out-of-date TODO in localnet-sanity.sh

  * remove features=test and code that was using it (localhost prohibitions in
      crdt) added TODO in crdt.rs, maybe we should boot localhost in production
      networks?

  * boot tvu_window from NodeInfo: instead, send repair requests from the repair
      socket (to gossip on peer) and answer repair requests via the sockaddr
      from the repair request

  * remove various unused pub functions

  * banish SocketAddr parse().unwrap() to a macro that can also accept simpler stuff
2018-08-31 23:21:07 +09:00
c0ba676658 fixup 2018-08-31 23:21:07 +09:00
1af4cee63b fix #1079
* move gossip/NCP off assuming anything about its address
  * use a single socket to send and receive gossip
  * remove --addr/-a from CLIs
  * rearrange networking utility code
  * use Arc<UdpSocket> to share the Sync-safe UdpSocket among threads
  * rename TestNode to Node

TODO:

  * re-enable 127.0.0.1 as a valid address in crdt
  * change repair request/response to a similar, single socket
  * pick cloned sockets or Arc<UdpSocket> for all these (rpu uses tryclone())
  * update contact_info with network truthiness instead of what the node
      says?
2018-08-31 23:21:07 +09:00
cb52a335bd re-enable localnet-sanity 2018-08-31 23:21:07 +09:00
e308a4279e Update RPC requestAirdrop endpoint to return airdrop tx signature 2018-08-28 18:27:41 -06:00
513a934ff6 Update request_airdrop utility function to pass along airdrop tx signature 2018-08-28 18:27:41 -06:00
77d820c842 Update drone module to return airdrop tx signature 2018-08-28 18:27:41 -06:00
30cbe7c6a9 Update jsonrpc crate version 2018-08-28 18:27:24 -06:00
18ef643dc7 Keep locals local 2018-08-28 08:11:44 -07:00
73a0bf8d30 Avoid unbounded /var/tmp growth 2018-08-28 08:11:44 -07:00
9d53208d68 Use gcloud_DeleteInstances 2018-08-28 08:11:44 -07:00
d26f135159 Find metrics-write-datapoint.sh again 2018-08-27 22:41:58 -07:00
c8e3ce26a9 Start of scripts/gcloud.sh 2018-08-27 22:35:14 -07:00
f88970a964 source oom-score-adj.sh from validator.sh 2018-08-28 10:01:41 +09:00
51d911e3f4 Update testnet-sanity.sh 2018-08-27 15:44:10 -07:00
bd5c6158ae Move some common scripts from multinode-demo/ to scripts/ 2018-08-27 13:52:38 -07:00
cd0db7842c Remove unused _config.yml 2018-08-27 13:52:38 -07:00
31d1087103 Documentation 2018-08-27 13:52:38 -07:00
0efd64df6f no need for sudo, move ledger copy out of SNAP_DATA 2018-08-28 05:42:05 +09:00
28bdf346f6 clean up after ledger sanity 2018-08-28 05:42:05 +09:00
48762834d9 Randomize repair requests (#1059)
* randomize packet repair requests

* exponential random repair requests

* use gen_range to get a uniform distribution
2018-08-27 07:05:48 -07:00
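A sketch of the idea in #1059, with illustrative parameters: widen the retry window exponentially and draw uniformly inside it with gen_range, so repair traffic from different nodes decorrelates instead of arriving in synchronized bursts.

```rust
use rand::{thread_rng, Rng};

// Decide whether to send a repair request on this tick. Each failed
// attempt doubles the window we draw from, so retries back off
// exponentially on average while staying randomized.
fn should_request_repair(failed_attempts: u32) -> bool {
    let window = 2u64.saturating_pow(failed_attempts.min(10));
    thread_rng().gen_range(0..window) == 0
}
```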
8d0d429acd update 2018-08-26 23:34:25 -07:00
e5408368f7 fmt 2018-08-26 22:35:26 -07:00
61492fd27e exit if no leader 2018-08-26 22:35:26 -07:00
bbce08a67b bench needs to discover leader as well 2018-08-26 22:35:26 -07:00
a002148098 retry transfer and poll 2018-08-26 16:10:46 -07:00
90ae662e4d Fix packet header offset
And update transaction offsets to use the same approach as packet.rs.
Maybe this should be serialized_size(), but thanks to this
GenericArray update, those values are the same.
2018-08-26 14:27:19 -06:00
60d8f5489f Update transaction layout offsets
24 fewer bytes in minimal transactions. 10% TPS boost?
2018-08-26 14:27:19 -06:00
59dd8b650d Update generic-array requirement from 0.11.1 to 0.12.0
Updates the requirements on [generic-array](https://github.com/fizyk20/generic-array) to permit the latest version.
- [Release notes](https://github.com/fizyk20/generic-array/releases)
- [Changelog](https://github.com/fizyk20/generic-array/blob/master/CHANGELOG.md)
- [Commits](https://github.com/fizyk20/generic-array/commits)

Signed-off-by: dependabot[bot] <support@dependabot.com>
2018-08-26 14:27:19 -06:00
738247ad44 advertise valid gossip address in drone and wallet (#1066)
* advertise valid gossip address in drone and wallet

get rid of asserts

check for valid ip address

check for valid address

ip address

* tests

* cleanup

* cleanup

* print error

* bump

* disable tests

* disable nightly
2018-08-26 11:36:27 -07:00
5b0bb7e607 Skip invalid nodes for finality (#1068)
* skip invalid nodes for finality

* check valid last_ids only

* fixup!

* fixup!
2018-08-25 23:12:41 -07:00
f7c0d30167 Disallow localhost in deployment (#1064)
* disallow localhost in deployment

* tests

* fmt

* integration tests do not have a flag to check

* fmt
2018-08-25 21:09:18 -07:00
8e98c7c9d6 fix purge test 2018-08-25 19:56:09 -07:00
50661e7b8d Added poll_balance_with_timeout method (#1062)
* Added poll_balance_with_timeout method

- updated bench-tps, fullnode and wallet to use this method instead
  of repeatedly calling poll_get_balance()

* Address review comments

- Revert some changes to use wrapper poll_get_balance()

* Reverting bench-tps to use poll_get_balance

- The original code is checking if the balance has been updated,
  instead of just retrieving the balance. The logic is different
  than poll_balance_with_timeout()

* Reverting wallet to use poll_get_balance

- The break condition in the loop is different than poll_balance_with_timeout().
  It's checking if the balance has been updated.
2018-08-25 18:24:25 -07:00
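The shape of the new method, as a minimal sketch (the real signature on the client type differs):

```rust
use std::time::{Duration, Instant};

// Retry a balance query until it succeeds or the deadline passes,
// sleeping between attempts instead of spinning.
fn poll_balance_with_timeout<E>(
    mut get_balance: impl FnMut() -> Result<u64, E>,
    poll_interval: Duration,
    timeout: Duration,
) -> Result<u64, E> {
    let deadline = Instant::now() + timeout;
    loop {
        match get_balance() {
            Ok(balance) => return Ok(balance),
            Err(e) if Instant::now() >= deadline => return Err(e),
            Err(_) => std::thread::sleep(poll_interval),
        }
    }
}
```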
ad159e0906 Fix crash in fullnode when poll_get_balance() returns error (#1058) 2018-08-25 15:25:13 -07:00
d3fac8a06f Dynamically bind to available UDP ports in Fullnode (#920)
* Dynamically bind to available UDP ports in Fullnode

* Added tests for dynamic port binding

- Also removed hard coding of port range from CRDT
2018-08-25 10:24:16 -07:00
c641ba1006 Up network buffers to 64MB max (#1057)
500ms of data at 1Gbps: 125MB/s × 0.5s ≈ 64MB
Seems to help tx rate in GCP network tests.
2018-08-24 18:17:48 -07:00
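Setting such a buffer from Rust might look like the sketch below, using the socket2 crate (an assumption; the change could equally be a sysctl in the deploy scripts). The kernel also caps SO_RCVBUF at net.core.rmem_max unless that limit is raised too.

```rust
use socket2::{Domain, Socket, Type};

// Ask for a 64MB receive buffer so roughly 500ms of 1Gbps traffic can
// queue without drops while the reader catches up.
fn make_big_buffer_socket() -> std::io::Result<Socket> {
    let s = Socket::new(Domain::IPV4, Type::DGRAM, None)?;
    s.set_recv_buffer_size(64 * 1024 * 1024)?; // SO_RCVBUF
    Ok(s)
}
```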
de379ed915 Fix sig verify counters to be unique and tweak perf counters (#1056)
print events and add current events to old value to report
2018-08-24 16:05:32 -07:00
d4554c6b78 RFC Branches, Channels, and Tags 2018-08-23 21:28:05 -07:00
6fc21a4223 Don't hang in transaction_count (#1052)
The situation is that there can be bad entries in
the bench-tps CRDT table until they get purged later. Threads, however,
are created for those bad entries and then hang trying
to get the transaction_count from those bad addresses, never ending.
2018-08-23 20:57:13 -07:00
71319978df Up drone request amount (#1051)
Multiple clients will request 500k each so up this to support them.
2018-08-23 15:30:35 -07:00
6147e54686 Cap repair requests timeout (#958) 2018-08-23 15:30:21 -07:00
0c8eec2563 Cleanup Fullnode construction
leader_id was already set by Fullnode constructor. And cleanup the
rest of that code while in the neighborhood.

Thanks @CriesofCarrots!
2018-08-23 13:42:54 -07:00
4ab58f069a Add back JsonRpcService changes 2018-08-23 13:42:54 -07:00
85f96d926a Pacify clippy 2018-08-23 13:42:54 -07:00
816de4f8ec Hoist shared code between leaders and validators 2018-08-23 13:42:54 -07:00
42229a1105 Hoist thread_hdls 2018-08-23 13:42:54 -07:00
d8820053af Inline create_leader_threads and create_validator_threads 2018-08-23 13:42:54 -07:00
731f8512c6 Hoist Arc<Bank> 2018-08-23 13:42:54 -07:00
a133784706 Rename mode-specific constructors and return only thread handles 2018-08-23 13:42:54 -07:00
be58fdf1bb Less constructors 2018-08-23 13:42:54 -07:00
57daeb35d2 Drop all references to new_leader and new_validator 2018-08-23 13:42:54 -07:00
9c5e69bf3d Don't offer two ways to specify a leader 2018-08-23 13:42:54 -07:00
cfac127e4c Extract lower-level constructor
Passing in the bank is useful for unit-tests since Fullnode doesn't
store it in a member variable.
2018-08-23 13:42:54 -07:00
fda4523cbf Fix broken doc 2018-08-23 13:42:54 -07:00
cabe80b129 Increment counter by number of packets received (#1049)
So that we can see the total packets/s
2018-08-23 12:32:50 -07:00
d4c41219f9 Improve gossip use for drone and wallet
- Add utility function
  - Add thread sleep
  - Enable configurable timeout for gossip poll
2018-08-23 13:08:59 -06:00
4fdd9fbfca Wallet: use gossip to identify leader's port config 2018-08-23 13:08:59 -06:00
bdf5ac9c1a Drone: use gossip to identify leader's port config 2018-08-23 13:08:59 -06:00
f1785c76a4 Rework counter increment outside apply_debits loop (#1046)
Reduces prints/atomics work inside the process_transactions loop
2018-08-23 09:42:59 -07:00
2de8fe9c5f Pass bank to rpc as reference 2018-08-23 09:06:17 -06:00
d910ed68a3 Use balance to verify requestAirdrop success 2018-08-23 09:06:17 -06:00
f7f7ecd4c6 Add json-rpc requestAirdrop endpoint 2018-08-23 09:06:17 -06:00
a9c3a28a3b Add json-rpc sendTransaction endpoint 2018-08-23 09:06:17 -06:00
96787ff4ac Use builtin sum 2018-08-22 16:24:19 -06:00
c3ed4d28de Change average TPS to max average tps seen for any node and...
add script to collect perf stats
2018-08-22 14:55:04 -07:00
f1e35c3bc6 GCE script change to use GCE private network for multinode tests (#1042)
- Also the user can specify the zone where the nodes should be created
2018-08-22 13:21:33 -07:00
db3fb3a27c Boot criterion (#1032)
* Revert benchmarks back to libtest

Criterion has too many dependencies, its execution is slower, and
we didn't see the kind of precision we had hoped for to use it to
block CI builds.

* Ignore benchmarks that take more than a few milliseconds per iteration

* Revert "Ignore benchmarks that take more than a few milliseconds per iteration"

This reverts commit b87cdf6ef4.

* Don't run benchmarks in CI

They are already built in the nightly build. Executing them in CI
doesn't add much value until the results are precise enough to act
on.
2018-08-22 08:57:07 -06:00
8282442956 fixes #927 2018-08-22 17:47:59 +09:00
a355d9f46c Add error catch for rpc server builder 2018-08-21 14:04:52 -06:00
be4824c955 Add custom panic hook for RPC port bind 2018-08-21 14:04:52 -06:00
86c1d97c13 Fix validator rpc addr to match leader 2018-08-20 22:35:06 -07:00
0b48aea937 echo commands, use PID (good form) 2018-08-21 11:41:00 +09:00
cdec0cead2 files have to appear in the snap 2018-08-21 11:41:00 +09:00
831709ce7e fixups 2018-08-21 10:36:03 +09:00
b7b8a31532 make a copy of the ledger for sanity check
we can't verify a live ledger, unfortunately, fixes #985
2018-08-21 10:36:03 +09:00
15406545d8 Document how to adjust the number of clients or validators on the testnet 2018-08-20 18:35:01 -07:00
5aced8224f Revert "make a copy of the ledger for sanity check"
This reverts commit af20a43b77.
2018-08-21 10:34:52 +09:00
af20a43b77 make a copy of the ledger for sanity check
we can't verify a live ledger, unfortunately, fixes #985
2018-08-21 09:45:52 +09:00
39c3280860 Don't block on large network test 2018-08-20 16:48:37 -06:00
2d35345c50 Boot unused crates 2018-08-20 16:48:37 -06:00
a02910be32 Remove pubkey from getBalance response 2018-08-20 15:02:48 -07:00
b9ec97a30b Add counter for bank transaction errors (#1015) 2018-08-20 14:56:01 -07:00
2e89999d88 # This is a combination of 4 commits.
# This is the 1st commit message:

Fix testnet readme

# This is the commit message #2:

update

# This is the commit message #3:

typo

# This is the commit message #4:

cleanup
2018-08-20 13:49:56 -07:00
24b0031925 Reduce number of nodes in multinode test (#1003) 2018-08-20 13:40:42 -07:00
9eeaf2d502 Bind RPC port on all interfaces 2018-08-20 12:45:50 -07:00
c9e6fb36c3 Avoid unnecessary cargo rebuilds in non-perf configuration 2018-08-20 12:03:44 -07:00
8de317113c clippy: remove identity conversion 2018-08-20 10:55:55 -07:00
a1ec549630 Pin nightly rust for more controlled updating 2018-08-20 10:55:55 -07:00
ecddff98f5 Add --nopull argument 2018-08-20 10:55:55 -07:00
10066d67bf Add llvm deb repository 2018-08-19 09:01:36 -07:00
a07f7435c6 \ 2018-08-19 08:49:29 -07:00
d3523ebbe5 Nightly image now derives from stable image 2018-08-19 08:47:59 -07:00
133ddb11ff typo in README 2018-08-18 18:24:42 -07:00
1bf15ae907 Temporarily disable cargo audit CI failure 2018-08-18 12:29:49 -06:00
f73f3941cd Revert ill-advised jsonrpc marker, and handle jsonrpc server close 2018-08-18 12:29:49 -06:00
d69d79612b Simplify Rpc request processing 2018-08-18 12:29:49 -06:00
64ea5126e0 Fix early return for invalid parameter 2018-08-18 12:29:49 -06:00
9df3aa50d5 Remove unnecessary solana_ prefixes 2018-08-18 12:29:49 -06:00
cab75b7829 Handle potential panics 2018-08-18 12:29:49 -06:00
d9fac86015 Use jsonrpc git repo, allowing removal of Default bound for Metadata 2018-08-18 12:29:49 -06:00
1eb8724a89 Disable Rpc module for other tests to prevent port conflicts 2018-08-18 12:29:49 -06:00
c6662a4512 Implement Rpc in Fullnode 2018-08-18 12:29:49 -06:00
d3c09b4e96 Update jsonrpc dependency syntax 2018-08-18 12:29:49 -06:00
124f6e83d2 Rpc get last id endpoint 2018-08-18 12:29:49 -06:00
569ff73b39 Rpc tests 2018-08-18 12:29:49 -06:00
fc1dbddd93 Implement json-rpc functionality 2018-08-18 12:29:49 -06:00
3ae867bdd6 fixups 2018-08-18 02:22:52 -07:00
bc5f29150b fix erasure, remove Entry "pad"
* fixes #997
 * Entry pad is no longer required since erasure coding aligns data length
2018-08-18 02:22:52 -07:00
46016b8c7e crashes generate_coding() 2018-08-18 02:22:52 -07:00
5dbecd6b6b add logging, more conservative reset 2018-08-18 02:22:52 -07:00
877920e61b Compute snap channel using ci/channel-info.sh 2018-08-17 23:15:48 -07:00
3d1e908dad Add script to fetch latest channel info 2018-08-17 23:15:48 -07:00
6880c2bef0 Exclude ci/semver_bash/; don't want to diverge from upstream 2018-08-17 23:15:48 -07:00
78872ffb4b Vendor https://github.com/cloudflare/semver_bash/tree/c1133faf0e 2018-08-17 23:15:48 -07:00
229d825fe0 Fix master-perf basename 2018-08-17 21:59:36 -07:00
edc5fc098e Make SNAP_CHANNEL more visible in build log 2018-08-17 21:39:54 -07:00
bbe815468d Add instructions on how to run the demo against testnet.solana.com and watch it on the dashboard 2018-08-17 21:26:06 -07:00
82e7725a42 Invert logic 2018-08-17 21:16:35 -07:00
dc61cf1c8d Keep v0.7 snap off the edge channel 2018-08-17 21:12:10 -07:00
aba63e2c6c Log expansion directive must be on its own line 2018-08-17 20:58:14 -07:00
c2ddd056e2 Add option to skip ledger verification 2018-08-17 20:41:30 -07:00
c9508e84f2 0.8.0 2018-08-17 17:56:35 -07:00
f6f0900506 Large network test to not poll validator for sigs (#998)
- The finality is already reached. The test will check the signature
  in validators once, instead of polling. This will help speed up the test.
2018-08-17 14:38:19 -07:00
7aeef27b99 not quite banishing build.rs, but better 2018-08-16 22:33:31 -07:00
98d0ef6df5 Add some wget retries 2018-08-16 20:22:49 -07:00
208a7f16cb Fix bench-tps nokey error 2018-08-16 19:38:26 -06:00
16cf31c3a3 fix #990 2018-08-16 15:52:30 -07:00
2b48daaeba accept multiple expected outputs 2018-08-16 14:44:51 -07:00
79d24ee227 fixed test according to @rob-solana 2018-08-16 14:44:51 -07:00
a284030ecc Account type with state
comments

fixups!

fixups!

fixups for a real Result<> from get_balance()

on 2nd thought, be more rigorous

Merge branch 'rob-solana-accounts_with_state' into accounts_with_state

update

review comments

comments

get rid of option
2018-08-16 14:44:51 -07:00
fc0d7f5982 updated nightly versions 2018-08-16 13:17:29 -07:00
f697632edb update clippy install instructions, from here:
https://github.com/rust-lang-nursery/rust-clippy

fixes #947 ?
2018-08-16 13:17:29 -07:00
73797c789b back to 4 TX threads 2018-08-16 12:02:11 -07:00
036fcced31 test -t nproc 2018-08-16 12:02:11 -07:00
1d3157fb80 fixups 2018-08-16 12:02:11 -07:00
0b11c2e119 restart testnet clients in case airdrop fails 2018-08-16 12:02:11 -07:00
96af892d95 Add docs about the testnet 2018-08-16 07:39:17 -07:00
c2983f824e Refactored large network test to use finality to assert success (#978) 2018-08-15 20:05:43 -07:00
88d6fea999 Revert "Accounts with state (#954)"
This reverts commit c23fa289c3.
2018-08-15 19:44:39 -07:00
c23fa289c3 Accounts with state (#954)
* Account type with state

* fixed test according to @rob-solana
2018-08-15 14:32:11 -07:00
db35f220f7 Run multinode test for enough iterations for a small node count test (#971) 2018-08-15 10:44:14 -07:00
982afa87a6 Retransmit blobs from leader from window (#975)
- Some nodes don't have leader information while the leader is broadcasting
  blobs to those nodes. Such blobs are not retransmitted. This change
  retransmits the blobs once the leader's identity is known.
2018-08-14 21:51:37 -07:00
dccae18b53 cfg=erasure fixes, use return value of align!() 2018-08-14 12:14:59 -07:00
53e86f2fa2 use align! 2018-08-14 12:14:59 -07:00
757dfd36a3 Report errors better in build log 2018-08-14 11:44:26 -07:00
708add0e64 fixups 2018-08-14 10:16:34 -07:00
d8991ae2ca fix UPnP backout, fixes #969 2018-08-14 10:16:34 -07:00
5f6cbe0cf8 fixups 2018-08-13 21:07:26 -07:00
f167b0c2c5 fixups 2018-08-13 21:07:26 -07:00
f784500fbb fixups
fixes #907
2018-08-13 21:07:26 -07:00
83df47323a initialize recycled data 2018-08-13 21:07:26 -07:00
c75d4abb0b Tuck away PoH duration 2018-08-13 20:17:16 -06:00
5216a723b1 Pacify clippy 2018-08-13 20:17:16 -06:00
b801ca477d Declare fullnode a word 2018-08-13 20:17:16 -06:00
c830c604f4 Make BroadcastStage an actual stage
TODO: Why isn't BroadcastStage/RetransmitStage managed by the NCP?
2018-08-13 20:17:16 -06:00
0e66606c7f Rename broadcaster to broadcast_stage
And move retransmitter code into retransmit_stage.

TODO: Add a BroadcastStage service
2018-08-13 20:17:16 -06:00
8707abe091 Fix erasure build 2018-08-13 20:17:16 -06:00
dc2a840985 Move FullNode::new_window into window module 2018-08-13 20:17:16 -06:00
2727067b94 Move window into its own module 2018-08-13 20:17:16 -06:00
6a8a494f5d Rename WindowStage to RetransmitStage
The window is used for both broadcasting from leader to validator
and retransmitting between validators.
2018-08-13 20:17:16 -06:00
a09d2e252a Move window dependencies out of streamer
No tests!?
2018-08-13 20:17:16 -06:00
3e9c463ff1 Offer only 1 way to create a fullnode with an empty window 2018-08-13 20:17:16 -06:00
46d50f5bde Remove p2p crate (and uPnP support) 2018-08-13 18:22:58 -07:00
e8da903c6c move tmp_ledger back to target dir 2018-08-13 16:52:53 -07:00
ab10b7676a use stable cache 2018-08-13 16:23:30 -07:00
fa44a71d3e move bench to a seprate, parallel step 2018-08-13 16:23:30 -07:00
c86e9e8568 pad max_data_size to jerasure's alignment requirements 2018-08-13 16:10:51 -07:00
9e22e23ce6 increase stable timeout until tomorrow 2018-08-13 15:45:50 -07:00
835f29a178 off by 2 2018-08-13 15:12:12 -07:00
9688f8fb64 Update IP address 2018-08-13 12:32:09 -07:00
df5cde74b0 Back out pre-0.7.1 workaround 2018-08-13 12:13:00 -07:00
231d5e5968 0.7.1 2018-08-13 12:12:27 -07:00
c2ba72fe1f Fix up validator sanity 2018-08-13 10:23:35 -07:00
d93786c86a Revert "turn off validator sanity while I work on it"
This reverts commit d4304eea28.
2018-08-13 10:23:35 -07:00
bf15cad36b Add get_finality request and use it from multinode test (#941) 2018-08-13 08:55:13 -07:00
288ed7a8ea Vote should be valid (#945)
* test that fails

* fix for test

* rename
2018-08-12 18:19:54 -07:00
f07c038266 Fix bank coalescing (#949)
* fix bank coalescing

* comments

* fix bench

* fix bench

* backout banking stage coalescing

* 120 nodes

* 100
2018-08-12 10:04:21 -07:00
8eed120c38 add missing backslash 2018-08-10 23:45:08 -07:00
5dbcb43abd more enhancements 2018-08-10 19:53:58 -07:00
dd1eefaf62 change verify-internal to precheck
update to new ledger API
2018-08-10 19:53:58 -07:00
35de159d00 better error messages 2018-08-10 19:53:58 -07:00
546a1e90d5 clippy fixups 2018-08-10 19:53:58 -07:00
b033e1d904 enhance ledger-tool
* add json, which does the thing with json, move print to Rust's {:?}
  * add --head NUM, to limit how much work gets done for print, json, verify
  * add verify-internal, which very carefully checks ledger format without
      trying first to "recover" it
  * exit with errors on mis-usage
2018-08-10 19:53:58 -07:00
96d6985895 rework read_ledger, LedgerWriter, and LedgerWindow for recover()
fixes #910
2018-08-10 18:07:23 -07:00
58f220a3b7 Add tic-tac-toe program flow concept 2018-08-10 17:39:54 -07:00
a206f2570d Add hostname to metrics on panic 2018-08-10 17:08:40 -07:00
2318ffc704 Use a different counter for validator account not found errors. (#931)
* Use a different counter for validator account not found errors. This is a useful signal of something going wrong with the ledger
2018-08-10 15:18:44 -07:00
d4304eea28 turn off validator sanity while I work on it 2018-08-10 14:56:46 -07:00
06af9de753 fixups 2018-08-10 11:41:31 -07:00
7f71e1e09f fixups 2018-08-10 11:41:31 -07:00
bb7eccd542 check validator startup in testnet-sanity.sh 2018-08-10 11:41:31 -07:00
b04c71acd9 check issue 910 in testnet-sanity 2018-08-10 11:41:31 -07:00
bbf9ea89c5 add some flushing to ledger 2018-08-10 11:41:31 -07:00
846ad61941 use ~/.solana instead of PWD to keep cargo happy, don't rsync --append 2018-08-10 11:41:31 -07:00
8b41c415b7 add equal sign 2018-08-10 08:05:48 -07:00
197ba8b395 Fixed punctuation 2018-08-09 16:39:04 -07:00
8d2a61a0c9 Alphabetize bins 2018-08-09 16:23:05 -06:00
7512317243 Alphabetize dependencies 2018-08-09 16:23:05 -06:00
bca2294655 cargo fmt 2018-08-09 13:41:37 -06:00
abd55e4159 Add coding guidelines document 2018-08-09 13:41:37 -06:00
4a980568ac Rename sig variables to signature
We'll avoid introducing three-letter terms to free up the namespace
for three-letter acronyms.

But recognize the term "sigverify", a verb, to verify a digital
signature.
2018-08-09 13:41:37 -06:00
9d436fc5f8 Rename pk variables to pubkey 2018-08-09 13:41:37 -06:00
ad331e6d56 Rename PublicKey type to Pubkey
Recognize pubkey as a noun meaning the public key of a keypair.
2018-08-09 13:41:37 -06:00
d7e4e57548 Rename public_key variables to pubkey 2018-08-09 13:41:37 -06:00
b2067d2721 Rename kp variables to keypair 2018-08-09 13:41:37 -06:00
c2bbe4344e Rename KeyPair to Keypair 2018-08-09 13:41:37 -06:00
8567253833 Ignore flaky test 2018-08-09 10:15:10 -06:00
ca7d4c42dd Rename cur_hashes to num_hashes 2018-08-09 10:15:10 -06:00
8ca514a5ca Remove unnecessary : 2018-08-08 22:45:39 -07:00
b605552079 Record network version in testnet-deploy start datapoint 2018-08-08 22:41:02 -07:00
74f5538bd3 Verify the ledger as a part of sanity 2018-08-08 16:10:54 -07:00
ff57c7b7df Include ledger tool 2018-08-08 15:12:40 -07:00
ce8a4fa831 allow received to outpace window, we're already constraining repair
correctly identify sender in ledger_window repair responses, enabling re-transmission
2018-08-08 15:10:44 -07:00
8331aab26a Enable Crdt debug messages to debug validators 2018-08-08 14:22:20 -07:00
a6857dbaaa Updated node count to 230. Increased wmem on CI large 2018-08-08 13:13:18 -07:00
054298d957 Retry snap install 3 times, sometimes the snap server 503s 2018-08-08 08:56:05 -07:00
cca240c279 Add SOLANA_NET_NAME, rename SOLANA_NET_URL to SOLANA_NET_ENTRYPOINT 2018-08-08 08:49:30 -07:00
89f17ceecf Route setup-args 2018-08-08 08:32:23 -07:00
fe97857c62 Add 'setup-args' snap configuration parameter, to override -p 2018-08-08 08:10:56 -07:00
75854cc234 Update dynamic network test with more nodes (#904)
- Check for correct OS params in test-large-network.sh
2018-08-08 06:52:57 -07:00
9783d47fd1 write a "unit" test for WindowLedger (it was working ;)
clear flags on fresh blobs, lest they sometimes impersonate coding blobs...

fix bug: advance *received whether the blob_index is in the window or not,
  failure to do so results in a stalled repair request pipeline
2018-08-08 04:28:09 -07:00
38be61bd22 Check for log level before doing perf counter work
Perf counters, especially when running the dynamic test can cause
functions like crdt::apply_updates to be really slow (>500ms).
2018-08-08 00:16:53 -07:00
c64e2acf8b set destination address for ledger window repair responses 2018-08-07 23:31:01 -07:00
a200cedb4b Lower UDP data size to 64k - 128 bytes
The Rust API gives errors for packets larger than ~65500 bytes, and
Wikipedia says 65507 is the maximum size; lowering ours avoids the errors.
2018-08-07 18:39:36 -07:00
5fec0ac82f Validators now rsync the ledger smarter
- Don't re-rsync parts of the ledger that are already present
- Disable compression
2018-08-07 17:38:26 -07:00
999534248b fixups 2018-08-07 17:27:53 -07:00
fbc754ea25 plug in LedgerWindow
fixes #872
2018-08-07 17:27:53 -07:00
ecea41a0ab Install EarlyOOM on testnet nodes 2018-08-07 16:58:46 -07:00
1b6d472cb2 Fixed counters for coalescing and broadcast index (#900) 2018-08-07 16:46:48 -07:00
f0446c7e88 Package curl in Snap for metrics_write_datapoint.sh 2018-08-07 22:41:26 +00:00
2a0025bb57 get buffered IO back for ledger 2018-08-07 15:34:15 -07:00
64d6d3015a Counters for broadcasted blob idx and coalesced packets (#897) 2018-08-07 14:54:26 -07:00
90550c5b58 Switch to slice arguments and remove clippy exceptions 2018-08-07 14:43:44 -07:00
53cd2cdd9f Only monitor for OOM kills when a leader, validator or drone is enabled 2018-08-07 14:20:52 -07:00
1ac5d300a4 Rearrange start hash for process_ledger and add a unit test 2018-08-07 14:10:36 -07:00
642c25bd3b Update path 2018-08-07 13:40:49 -07:00
df808dedd1 Add simple OOM Killer monitor 2018-08-07 13:35:01 -07:00
02f9cb415b Ignore failure to write oom_score_adj 2018-08-07 13:35:01 -07:00
e3cf1e6598 Bundle metrics_write_datapoint.sh in Snap 2018-08-07 13:35:01 -07:00
7681211c02 Pacify shellcheck 2018-08-07 13:35:01 -07:00
0ee935dd72 Adjust fullnode/drone oom_score_adj to goad the kernel into killing it first 2018-08-07 10:42:53 -07:00
16772d3d51 Coalesce multiple blobs if received close together (#869)
- This helps reduce unnecessary growth of the window if small blobs
  are received within a small space of time
2018-08-07 10:29:57 -07:00
1c38e40dee Validate ledger once all the tests complete 2018-08-07 10:00:52 -07:00
ceb5a76609 Refactor validator windowing
- a unit test for windowing functions
- issue #857
2018-08-07 08:17:32 -07:00
db2392a691 Use last_id from the entries stream instead of last_id from bank
The bank will only register ids when has_more is not set, because those are
the only ids it has advertised, so it will not register all ids. However,
the entry stream will contain an unbroken last_id chain, so we
need to track that to get the correct start hash.
2018-08-07 08:14:06 -07:00
9c1b6288a4 Use ? instead of unwrap()
This change addresses #833, while there are still some unwrap() though.
2018-08-07 08:10:22 -07:00
575179be8e y 2018-08-06 23:55:00 -07:00
5b6ffaecc0 s/r$/f/ 2018-08-06 23:36:09 -07:00
efc72b9572 Support -V/--version on all CLI apps
All CLI apps that use clap (in other words, except for bench-streamer)
can use crate_version! to take the version from Cargo.toml.

This change addresses #700.
2018-08-06 22:03:58 -07:00
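With clap this is a one-macro change; a minimal sketch (the binary name is illustrative):

```rust
#[macro_use]
extern crate clap;
use clap::App;

fn main() {
    // crate_version!() expands to the version field of Cargo.toml, so the
    // -V/--version output clap generates always matches the crate.
    let _matches = App::new("solana-fullnode")
        .version(crate_version!())
        .get_matches();
}
```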
5dc7177540 Remove manually created help text, use clap's text instead. 2018-08-06 19:15:52 -07:00
78a4b1287d Initialize logger 2018-08-06 19:04:04 -07:00
c5001869f1 Add verify subcommand 2018-08-06 19:04:04 -07:00
7c31f217d5 Add voting metric even when there are not enough validators 2018-08-06 15:58:10 -07:00
1152457691 avoid normal validator port a little better for sanity 2018-08-06 15:06:16 -07:00
3beb38ac8a /tmp/farf no good on multi-user machine 2018-08-06 14:53:40 -07:00
8cbaa19d2e Report the address that failed to bind 2018-08-06 11:21:25 -07:00
63d2b2eb42 adjust bank notion of entry_count to aid debugging 2018-08-06 11:20:52 -07:00
e02da9a15a Clean up tx_count usage 2018-08-06 11:00:25 -07:00
ae111a131c Condense stdout 2018-08-06 11:00:25 -07:00
4402e1128f Cleanup 2018-08-06 11:00:25 -07:00
f55bb6d95c Send/confirm a loopback payment after each batch of transactions 2018-08-06 11:00:25 -07:00
91741e20fa Add rustc/cargo version check 2018-08-06 09:32:08 -07:00
0514f5e573 sync() apparently imposes a serious performance penalty 2018-08-06 08:51:41 -07:00
637d403415 move bank.process_entries() to firsties 2018-08-06 08:51:41 -07:00
9fabd34156 remove trace! calls, re-arrange replicate_requests to have the bank earlier 2018-08-06 08:51:41 -07:00
039ed01abf on 2nd thought: do not copy_ledger() for this test 2018-08-06 08:51:41 -07:00
ead0eb2754 move copy_ledger() back into ledger.rs
Don't recover() for copy(), as copy() is already tolerant of things
    recover() guards against.  Note: recover() is problematic if the ledger is
    "live", i.e. is currently being written to.
2018-08-06 08:51:41 -07:00
c3db2df7eb tweak random access ledger
* add recover_ledger() to deal with expected common ledger corruptions
  * add verify_ledger() for future use cases (ledger-tool)
  * increase ledger testing
  * allow replicate stage to run without a ledger
  * ledger-tool to output valid json
2018-08-06 08:51:41 -07:00
ee6c15d2db start on ledger recovery with a description of what that might mean 2018-08-06 08:51:41 -07:00
715a3d50fe Revert "Revert "clippy fixup""
This reverts commit d173e6ef87.
2018-08-06 08:51:41 -07:00
692b125391 Revert "Revert "fixups""
This reverts commit e2c68d8775.
2018-08-06 08:51:41 -07:00
5193819d8e Revert "Revert "plug in new ledger""
This reverts commit 57e928d1d0.
2018-08-06 08:51:41 -07:00
210b9d346f Add voting metrics and -h/--help to get usage for client.sh script 2018-08-05 14:21:49 -07:00
4c4b0f551e clippy fixups 2018-08-05 13:30:45 -07:00
6800ff1882 solana-ledger-tool initial commit
does nothing but convert from random-access ledger to json
2018-08-05 13:30:45 -07:00
399a3852b1 Add sigverify_stage-total_verify_time datapoint 2018-08-04 21:45:58 -07:00
e7d3069f58 macOS: Adjust maxdgram to allow for large UDP packets 2018-08-04 21:42:59 -07:00
40ea3e3e61 tweak multinode-demo to work better in snap, validator-x be more stand-alone 2018-08-04 01:04:06 -07:00
dc9a11bae0 remove rsync size limit for validator's ledger 2018-08-03 23:31:25 -07:00
906d18a709 move VOTE to trace, info too verbose 2018-08-03 23:04:54 -07:00
a13058b6c4 Look for 3 nodes (1 leader, 2 validators) 2018-08-03 20:30:29 -07:00
98ee4b4672 fix up some nits in multinode-demo 2018-08-03 20:19:41 -07:00
7fd7310b96 Prevent a node from overrunning its receive window (#846)
- The node drops blobs that will cause it to overrun the window
- The node does not ask to repair a blob that overruns the window
2018-08-03 20:15:14 -07:00
28fa43d2a9 Use env_logger@0.5.12 2018-08-03 20:08:30 -07:00
1a9e6ffdd7 Try multiple times to confirm a non-zero balance 2018-08-03 19:57:38 -07:00
c998199954 fixups, add validator-x to sanity 2018-08-03 15:34:11 -07:00
19792192a7 support any number of self-setup validators on a single host 2018-08-03 15:34:11 -07:00
4aab413154 recycle the skipped, outside-window blob, fixes #843 2018-08-03 15:02:55 -07:00
15a6179b97 Stop installing rustfmt-preview, it's already present 2018-08-03 14:27:11 -07:00
83b308983f Include rustfmt-preview 2018-08-03 14:11:42 -07:00
f2b1a04bca cargo fmt fixups 2018-08-03 11:59:25 -07:00
3e36e6dcf8 Upgrade to rust 1.28 2018-08-03 11:30:40 -07:00
6feb6a27be Run localnet-sanity in test-stable-perf 2018-08-03 10:46:48 -07:00
c5ceb15e02 Skip network tuning on CI machines 2018-08-03 10:46:48 -07:00
57e928d1d0 Revert "plug in new ledger"
This reverts commit 46d9ba5ca0.
2018-08-03 10:24:51 -07:00
e2c68d8775 Revert "fixups"
This reverts commit b72e91f681.
2018-08-03 10:24:51 -07:00
d173e6ef87 Revert "clippy fixup"
This reverts commit 384b486b29.
2018-08-03 10:24:51 -07:00
c230360f4c Wait until recycled machines are reachable before provisioning them 2018-08-02 22:13:17 -07:00
384b486b29 clippy fixup 2018-08-02 21:50:47 -07:00
b72e91f681 fixups 2018-08-02 21:50:47 -07:00
46d9ba5ca0 plug in new ledger 2018-08-02 21:50:47 -07:00
a9240a42bf Delete unreachable validators to cause a fresh one to be spawned 2018-08-02 20:45:29 -07:00
a7204d5353 Use a local user to avoid GCP login quota limits 2018-08-02 19:43:35 -07:00
f570ef1c66 Defer repair request for blobs that may still be in avalanche transit (#814) 2018-08-02 19:12:57 -07:00
ee0195d588 Try to measure finality from time seen to when 2/3 of validator..
..set has voted. Add a timestamp to last_ids and use that to
see how long it takes until 2/3 of the validator set has voted on them.
2018-08-02 13:21:29 -07:00
448b8b1c17 Add Hash wrapper and supporting traits 2018-08-01 17:00:51 -07:00
4d77fa900b Add Signature wrapper and supporting traits 2018-08-01 17:00:51 -07:00
7ccd771ccc Only send sigverify to GPU if batch size is >64
Seems to be a decent crossover point for Xeon E5-2620 v4 8c,16t vs. nvidia 1080ti
2018-08-01 16:38:15 -07:00
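The dispatch rule is a simple size check; a sketch with placeholder backends standing in for the real CPU and CUDA verify paths:

```rust
const GPU_MIN_BATCH: usize = 64; // the crossover point measured above

// Route small batches to the CPU, where GPU kernel-launch and transfer
// overhead would dominate; hand larger batches to the GPU.
fn verify_batch(packets: &[Vec<u8>]) -> Vec<u8> {
    if packets.len() > GPU_MIN_BATCH {
        verify_gpu(packets)
    } else {
        verify_cpu(packets)
    }
}

fn verify_cpu(packets: &[Vec<u8>]) -> Vec<u8> {
    vec![1; packets.len()] // placeholder: 1 = signature ok
}

fn verify_gpu(packets: &[Vec<u8>]) -> Vec<u8> {
    vec![1; packets.len()] // placeholder for the CUDA path
}
```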
e9f8b5b9db Fix bench 2018-08-01 16:24:47 -07:00
2366c1ebaf Enable cargo audit in CI
Fixes #772
2018-08-01 16:24:47 -07:00
c5de237276 Upgrade ring and untrusted 2018-08-01 16:24:47 -07:00
aa9bc57b4d Implement GenKeys without SecureRandom 2018-08-01 16:24:47 -07:00
11df477b20 Make GenKey functions mut
We hide the mutability to implement SecureRandom, but that's going
away.
2018-08-01 16:24:47 -07:00
7141750668 new_key -> gen_keypair 2018-08-01 16:24:47 -07:00
68675bd1ab Less pub 2018-08-01 16:24:47 -07:00
19b3cacd60 Generate a fixed-size array instead of a vector 2018-08-01 16:24:47 -07:00
bcfaf5d994 Rebase ledger change 2018-08-01 16:15:14 -07:00
e9499ac5b8 Update PublicKey AsRef to slice 2018-08-01 16:15:14 -07:00
7ff721e563 Replace pub field with AsRef impl 2018-08-01 16:15:14 -07:00
fda3b9bbd4 Use new PublicKey format instead of hex 2018-08-01 16:15:14 -07:00
cf70e5ff2f Handle wrapped PublicKey struct 2018-08-01 16:15:14 -07:00
a86618faf3 Add PublicKey wrapper
Add custom formatting for PublicKey display and debug
2018-08-01 16:15:14 -07:00
6693386bc5 Lower errors to warnings so they don't print during tests
Negative tests should trigger the warnings, but errors look like
something is wrong.
2018-08-01 16:56:12 -06:00
4a8a0d03a3 Correct localhost address 2018-08-01 15:49:48 -07:00
2c9d288ca9 Add a CI metric data point upload timeout to prevent CI build stalls
5 seconds is somewhat arbitrary, seems like enough
2018-08-01 15:49:48 -07:00
bb0aabae75 Add cmake, which is needed to build cargo-audit 2018-08-01 16:43:49 -06:00
5cda0ed964 Airdrop from the leader 2018-08-01 15:21:20 -07:00
0aba74935b fixups 2018-08-01 14:42:58 -07:00
4eb666d4f9 provide ledger::copy() 2018-08-01 14:42:58 -07:00
d5e0cf81ff fixups 2018-08-01 14:42:58 -07:00
3ea784aff7 clippy fixups 2018-08-01 14:42:58 -07:00
fef93958c8 fixups, tests 2018-08-01 14:42:58 -07:00
cae88c90b1 add a persistent ledger of index and data files 2018-08-01 14:42:58 -07:00
1a8da769b6 ... 2018-08-01 14:42:58 -07:00
2b259aeb41 testnet now deploys successfully on days of the month < 10 2018-08-01 14:10:52 -07:00
de7e9b4b4c Remove retry
This was introduced to mask the occasional failure of racy tests. But this is misguided, as it helps hide the true problem (the racy test), and it causes builds that fail deterministically to retry only to fail once again.
2018-08-01 12:02:39 -07:00
0f95031b99 CI builds no longer turn red if a metrics write fails 2018-08-01 11:35:19 -07:00
d622742b84 Mark test-multinode-basic as ignore 2018-08-01 10:13:05 -06:00
ff254fbe5f re-instate traces 2018-08-01 09:08:38 -07:00
05153e4884 de-trace this function, new blob is not a dup 2018-08-01 09:08:38 -07:00
2ece27ee3a fix leak 2018-08-01 09:08:38 -07:00
a58df52205 Fix build
Last two PRs crossed in flight. A keypair is now required for all
types of FullNode, not just validators.
2018-08-01 08:53:21 -07:00
2ea6f86199 Submit leader's vote after observing 2/3 validator votes (#780)
* fixup!

* fixups!

* send the vote and count it

* actually vote

* test

* Spelling fixes

* Process the voting transaction in the leader's bank

* Send tokens to the leader

* Give leader tokens in more cases

* Test for write_stage::leader_vote

* Request airdrop inside fullnode and not the script

* Change readme to indicate that drone should be up before leader

And start drone before leader in snap scripts

* Rename _kp => _keypair for keypairs and other review fixups

* Remove empty else
* tweak test_leader_vote numbers to be closer to testing 2/3 boundary
* combine creating blob and transaction for leader/validator
2018-07-31 22:07:53 -07:00
7c5172a65e Converted sigverify disable flag to runtime check instead of "cfg" (#799) 2018-07-31 16:54:24 -07:00
821e3bc3ca Avoid race between test_lograte and test_lograte_env 2018-07-31 16:08:01 -07:00
5dd2f737a3 clear out old blobs in find_next_missing 2018-07-31 15:54:32 -07:00
c9bb5c1f5b Update snap log file documentation 2018-07-31 13:13:27 -07:00
5d936e5c8a Trap SIGINT for clean ^C shutdown 2018-07-30 17:15:50 -07:00
e985c2e7d5 .gitignore more generated files 2018-07-30 17:15:50 -07:00
308b6c3371 Follow Shared prefix convention for Window alias (#798)
Follow Shared prefix convention for Window alias.
2018-07-30 16:56:01 -07:00
ea7fa11b3e use size_of() instead of serialized_size() and magic number 8 2018-07-30 16:48:58 -07:00
5a40ea3fd7 Only map HOME when in CI 2018-07-30 16:36:26 -07:00
102510ac0e Clear apt cache to reduce image size 2018-07-30 16:36:26 -07:00
2158329058 Switch to docker-rust image 2018-07-30 16:36:26 -07:00
bc484ffe5f Add docker-rust image 2018-07-30 16:36:26 -07:00
6fcf4584d5 Propagate more BUILDKITE environment variables into containers 2018-07-30 16:36:26 -07:00
1adc83d148 Add localnet-sanity.sh 2018-07-30 16:36:26 -07:00
647053e973 Terminate child process when main script is interrupted 2018-07-30 16:36:26 -07:00
95b98b3845 Fix --addr option 2018-07-30 16:36:26 -07:00
f27613754a Report number of nodes found on failure too 2018-07-30 16:36:26 -07:00
3e351b0b13 Drop -t 2018-07-30 16:13:51 -07:00
79ece53e3c Don't panic the tokio worker thread when deserialize() fails 2018-07-30 14:56:53 -07:00
f341b2ec10 fixups 2018-07-30 14:26:44 -07:00
167b079e29 fixups 2018-07-30 14:26:44 -07:00
7ded5a70be fixups 2018-07-30 14:26:44 -07:00
fc476ff979 implement iterator for parsing length + data ledger 2018-07-30 14:26:44 -07:00
c3279c8a00 chugga 2018-07-30 14:26:44 -07:00
e471ea41da fixups 2018-07-30 14:26:44 -07:00
552d4adff5 use a binary ledger: newline-separated, newline-escaped entries instead of json 2018-07-30 14:26:44 -07:00
0c33c9e0d7 Dynamic network test changes (#795)
- No sigverify if feature sigverify_cpu_disable is used
- Purge validators in the test if lag count increases beyond
  SOLANA_DYNAMIC_NODES_PURGE_LAG environment variable
- Other useful log messages in the test
2018-07-30 13:57:10 -07:00
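A sketch of how such a purge knob can be read at runtime; the environment variable name comes from the commit message, everything else is illustrative:

```rust
use std::env;

// Parse the purge threshold from the environment, if set.
fn purge_lag_threshold() -> Option<u64> {
    env::var("SOLANA_DYNAMIC_NODES_PURGE_LAG").ok()?.parse().ok()
}

// With the variable unset, no validator is ever purged.
fn should_purge(node_lag: u64) -> bool {
    purge_lag_threshold().map_or(false, |max_lag| node_lag > max_lag)
}

fn main() {
    env::set_var("SOLANA_DYNAMIC_NODES_PURGE_LAG", "100");
    assert!(should_purge(101));
    assert!(!should_purge(99));
}
```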
fae9fff24c Unify logging initialization 2018-07-29 19:08:27 -07:00
79924e407c Include nanoseconds in log timestamp 2018-07-29 19:08:27 -07:00
18d4da0076 Fetch env_logger from github until 0.5.12 is available 2018-07-29 19:08:27 -07:00
416c141775 export SKIP_INSTALL=1 to reset the network without reinstalling the snap 2018-07-28 18:04:13 -07:00
af1a2e83bc Don't panic again when waiting for a panicked validator thread 2018-07-28 16:35:35 -07:00
4cdb9a73f8 Skip testnet-sanity on manual deploy 2018-07-28 12:37:29 -07:00
4433730610 Add support for deploying a locally built snap 2018-07-28 12:37:29 -07:00
71eb5bdecc Factor out vm_foreach 2018-07-28 12:37:29 -07:00
029e2db2cf Improve assert message 2018-07-28 10:40:50 -07:00
81db333490 Guard against rsyncing TBs of ledger 2018-07-27 23:53:20 -07:00
c68ee0040d No need to support migrating from the old ledger format anymore 2018-07-27 23:53:20 -07:00
d96e267624 Keep around 3GB of logs, 160MB is just not enough 2018-07-27 22:40:21 -07:00
0b47404ba6 Check for default leader and use cmp::max for a bit nicer code (#779) 2018-07-27 15:53:31 -07:00
7f4844f426 More stats in dynamic multinode test 2018-07-27 11:55:09 -07:00
50e1e0ae47 use rust's rotate (in place, yay!) 2018-07-27 11:44:02 -07:00
538c3b63e1 Log the last_id being voted on 2018-07-27 11:27:51 -07:00
678b2870ff i 2018-07-27 11:11:37 -07:00
308d8c254d poll_get_balance no longer fails intermittently for zero balance accounts
While polling for a non-zero balance, it's not uncommon for one of the
get_balance requests to fail with EWOULDBLOCK.  Previously when a get_balance
request failure occurred on the last iteration of the polling loop,
poll_get_balance returned an error even though the N-1 iterations may have
successfully retrieved a balance of 0.
2018-07-26 21:41:07 -07:00
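A minimal sketch of the described behavior, with a stubbed `get_balance` standing in for the real RPC call: transient failures are ignored, and an error propagates only if no iteration ever observed a balance.

```rust
use std::{thread::sleep, time::Duration};

// Stand-in for the RPC call; in the real client this can fail transiently
// (e.g. EWOULDBLOCK on a read timeout).
fn get_balance(attempt: u32) -> Result<i64, String> {
    if attempt % 2 == 0 { Err("would block".into()) } else { Ok(0) }
}

// Poll for a non-zero balance, remembering any successful observation so a
// failure on the final iteration cannot mask an earlier balance of 0.
fn poll_get_balance(tries: u32) -> Result<i64, String> {
    let mut balance = Err("no response".to_string());
    for attempt in 0..tries {
        match get_balance(attempt) {
            Ok(b) if b > 0 => return Ok(b),
            Ok(b) => balance = Ok(b), // remember the zero balance we saw
            Err(_) => {}              // ignore the transient failure
        }
        sleep(Duration::from_millis(100));
    }
    balance
}

fn main() {
    // Even though the last attempt fails, the earlier 0 observation wins.
    assert_eq!(poll_get_balance(4), Ok(0));
}
```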
f11aa4a57b Ensure non-zero exit code if 'balance' command fails 2018-07-26 21:41:07 -07:00
c52d4eca0b Stop validator first to stop voting before the leader stops 2018-07-26 17:29:32 -07:00
7672506b45 Validators now vote once a second regardless 2018-07-26 17:07:42 -07:00
80a02359f7 Add script to audit for security vulnerabilities 2018-07-26 13:42:12 -07:00
ab3968e3bf Dedup 2018-07-26 11:45:58 -07:00
42ebf9502a Agent cleaning is now performed in a separate pipeline 2018-07-26 11:37:36 -07:00
bd4fcf4ac6 Clean out stale buildkite agent build directories 2018-07-26 11:37:36 -07:00
4dceb73909 Reinstall client nodes in the background to speed up deploys 2018-07-26 09:49:00 -07:00
dd819cec3d fix off by one in packet.rs 2018-07-26 09:24:44 -07:00
5115cd7798 large network back to erasure 2018-07-25 20:45:16 -07:00
cbb8dee360 rework broadcast to understand a separate transmit index for coding blobs 2018-07-25 20:45:16 -07:00
e0cdcb0973 employ the simple choice for broadcast table of coding blobs: round-robin 2018-07-25 20:45:16 -07:00
a6a2a745ae fix broadcast of erasure coding blobs
erasure coding blobs were being counted as window slots, skewing transmit_index

erasure coding blobs were being skipped over for broadcast, because they're
  only generated when the last data blob in an erasure block is added to the
  window.... rewind the index to pick up and broadcast those coding blobs
2018-07-25 20:45:16 -07:00
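A sketch of the separate-index idea from these two commits, using invented names; the real broadcast code tracks positions in the window rather than plain counters:

```rust
// Illustrative only: track progress through data blobs and erasure coding
// blobs independently, so coding blobs neither consume data-window slots
// (skewing the data index) nor get skipped over during broadcast.
#[derive(Default, Debug)]
struct TransmitIndex {
    data: u64,
    coding: u64,
}

fn broadcast(ti: &mut TransmitIndex, new_data: u64, new_coding: u64) {
    // send data blobs [ti.data, new_data) and coding blobs
    // [ti.coding, new_coding), then advance each index separately
    ti.data = new_data;
    ti.coding = new_coding;
}

fn main() {
    let mut ti = TransmitIndex::default();
    broadcast(&mut ti, 8, 2);
    assert_eq!((ti.data, ti.coding), (8, 2));
}
```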
297896bc49 honor environment variable SOLANA_DYNAMIC_NODES, obviating need to edit+compile for re-test 2018-07-25 20:45:16 -07:00
f372840354 Collect some datapoints while bench-tps is running 2018-07-25 20:15:43 -07:00
4c4659be13 Add more stdout 2018-07-25 16:38:21 -07:00
1b79fe73a1 Emit a metrics datapoint if bench-tps terminates 2018-07-25 15:55:02 -07:00
5fa072cf16 Avoid quotes around net name 2018-07-25 15:55:02 -07:00
212874e155 Use BlobError for get_size return 2018-07-25 15:54:04 -07:00
75212f40e7 fix off by one for send_to() of blob 2018-07-25 15:16:56 -07:00
6fde65577e fixes #756 2018-07-25 11:07:03 -07:00
80ecef2832 Add --sustained to ci testnet deploy script 2018-07-25 10:16:46 -07:00
edf2ffaf4e Reduce complexity of main for clippy
...and readability
2018-07-25 10:16:46 -07:00
6c275ea5ef More knobs. Arg for tx count per batch and also sustained mode
sustained mode overlaps tx generation with transfer. This mode seems
to have lower peak performance but higher average performance
2018-07-25 10:16:46 -07:00
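A minimal sketch of that overlap using a bounded channel, with stub functions standing in for transaction generation and network transfer:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Toy stand-ins for CPU-bound signing and network-bound transfer.
fn generate_batch(n: usize) -> Vec<u64> {
    (0..n as u64).collect()
}

fn send_batch(batch: &[u64]) {
    let _ = batch.len(); // network I/O would happen here
}

fn main() {
    // A small buffer lets generation run ahead of sending, so the two
    // phases overlap instead of strictly alternating.
    let (tx, rx) = sync_channel(2);
    let producer = thread::spawn(move || {
        for _ in 0..10 {
            tx.send(generate_batch(1_000)).unwrap();
        }
    });
    for batch in rx {
        send_batch(&batch);
    }
    producer.join().unwrap();
}
```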
23ed65b339 Transfer and sign at the same time in bench-tps 2018-07-25 10:16:46 -07:00
9c7913ac9e trying to raise an error 2018-07-25 08:12:20 -07:00
8b01e6ac0b implement Blob::get_size(), the counterpart of Blob::set_size() 2018-07-25 08:12:20 -07:00
ff5854396a deserialize using get_data_size(), which refers to blob.data()'s length,
instead of using msg.meta.size, which refers to the entire blob's length

fixes #752
2018-07-25 08:12:20 -07:00
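A sketch of the distinction, under an assumed blob layout (an 8-byte little-endian length header); the real Blob's accessors differ in detail:

```rust
// Hypothetical layout: meta.size covers the whole wire payload (header +
// data + padding), while a header field records only the serialized entry
// bytes. Deserializing with meta.size would read trailing garbage.
struct Meta {
    size: usize, // total bytes received from the socket
}

struct Blob {
    meta: Meta,
    data: Vec<u8>, // 8-byte length header followed by entry bytes
}

impl Blob {
    fn get_data_size(&self) -> usize {
        let mut len = [0u8; 8];
        len.copy_from_slice(&self.data[..8]);
        u64::from_le_bytes(len) as usize
    }

    fn entry_bytes(&self) -> &[u8] {
        // The fix: bound the slice by the recorded data size,
        // not by self.meta.size.
        &self.data[8..8 + self.get_data_size()]
    }
}

fn main() {
    let mut data = 5u64.to_le_bytes().to_vec();
    data.extend_from_slice(b"hello\0\0\0"); // padding past the real payload
    let blob = Blob { meta: Meta { size: data.len() }, data };
    assert_eq!(blob.entry_bytes(), b"hello");
    assert!(blob.meta.size > 8 + blob.get_data_size());
}
```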
f0725b4900 Avoid panicking if poll_get_balance() fails while in the transaction loop 2018-07-24 23:31:28 -07:00
327ba5301d Log token balance throughout the transfer loop 2018-07-24 22:40:12 -07:00
dcce475f0b Propagate logging configuration to client nodes 2018-07-24 21:40:02 -07:00
aa2104a21b Reclaim tokens before exiting to avoid leaking tokens 2018-07-24 21:40:02 -07:00
0206020104 Make airdrops more robust 2018-07-24 21:40:02 -07:00
33bd1229d9 make next_entries() smarter about fitting Transactions into a Blob 2018-07-24 21:38:06 -07:00
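A minimal sketch of size-aware packing, assuming a fixed per-blob payload budget and operating on serialized sizes rather than real Transaction values:

```rust
// Illustrative capacity; the real code derives it from the blob layout.
const BLOB_DATA_BUDGET: usize = 64 * 1024;

// Greedily fill each chunk, starting a new one only when the next
// transaction would overflow the budget.
fn chunk_by_size(tx_sizes: &[usize]) -> Vec<Vec<usize>> {
    let mut chunks: Vec<Vec<usize>> = vec![Vec::new()];
    let mut used = 0;
    for &sz in tx_sizes {
        if used + sz > BLOB_DATA_BUDGET && !chunks.last().unwrap().is_empty() {
            chunks.push(Vec::new());
            used = 0;
        }
        chunks.last_mut().unwrap().push(sz);
        used += sz;
    }
    chunks
}

fn main() {
    let sizes = vec![40 * 1024, 30 * 1024, 30 * 1024];
    // The 40K transaction gets its own chunk; the two 30K ones fit
    // together under the 64K budget, so we end up with two chunks.
    assert_eq!(chunk_by_size(&sizes).len(), 2);
}
```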
195098ca2b Failure test case 2018-07-24 21:38:06 -07:00
9daa7bdbe2 Replace rayon with threads for dynamic network test (#745) 2018-07-24 17:54:29 -07:00
6bd18e18ea Add error messages to ledger verify 2018-07-24 17:35:41 -07:00
8f046cb1f8 disable erasure for large network testing 2018-07-24 16:54:52 -07:00
735a0ee16d Switch back to running bench-tps in 10 minute iterations 2018-07-24 15:43:25 -07:00
537be6a29d export SOLANA_DEFAULT_METRICS_RATE 2018-07-24 15:43:25 -07:00
2b528e2225 fixups 2018-07-24 13:04:34 -07:00
75505bbd72 fixups 2018-07-24 13:04:34 -07:00
e1fc7444f9 fixups 2018-07-24 13:04:34 -07:00
940caf7876 test large network with erasure 2018-07-24 13:04:34 -07:00
fcdb0403ba eliminate unused parameter received, this branch fixes #636 2018-07-24 13:04:34 -07:00
caeb55d066 placate clippy and reduce replicode 2018-07-24 13:04:34 -07:00
f11e60b801 fix major bug: re-used blobs need to have their flags cleared
plus: lots of additional debug-ability
2018-07-24 13:04:34 -07:00
54f2146429 fixups 2018-07-24 13:04:34 -07:00
f60ee87a52 zero the tails of data blobs during generate() and recover() to enable blob reuse 2018-07-24 13:04:34 -07:00
9c06fe25df enhance unit test to fail when erasure encodes stray bytes of data blobs 2018-07-24 13:04:34 -07:00
1eec8bf57f fixups 2018-07-24 13:04:34 -07:00
ddb24ebb61 fixups 2018-07-24 13:04:34 -07:00
a58c83d999 prevent infinite loop on window wraparound 2018-07-24 13:04:34 -07:00
6656ec816c protect generate and recover from u64->usize casting issues 2018-07-24 13:04:34 -07:00
8d2bd43100 fixups 2018-07-24 13:04:34 -07:00
429ea98ace mutable-coding-blocks 2018-07-24 13:04:34 -07:00
3d80926508 fixups 2018-07-24 13:04:34 -07:00
d713e3c2cf send coding in broadcast(), fixups 2018-07-24 13:04:34 -07:00
5d20d1ddbf get test_window_recover_basic() passing 2018-07-24 13:04:34 -07:00
257acdcda1 building now 2018-07-24 13:04:34 -07:00
dab98dcd81 coded => coding 2018-07-24 13:04:34 -07:00
99653a4d04 rework erasure to have data and coding blobs side-by-side in window 2018-07-24 13:04:34 -07:00
dda563a169 document process_blob() 2018-07-24 13:04:34 -07:00
782aa7b23b Cap at 4 threads 2018-07-24 11:35:03 -07:00
813e438d18 Improve panic message 2018-07-24 11:20:13 -07:00
7a71adaa8c Adjust threads by the number of cpus 2018-07-23 21:17:36 -07:00
ce8796bc2e Correctly calculate the expected number of full nodes 2018-07-23 19:55:09 -07:00
c7e1409f7b Not so much |set -x| 2018-07-23 19:55:09 -07:00
9de9379925 Add support for more than 1 client node 2018-07-23 19:34:34 -07:00
7d68b6edc8 Fixup arg processing 2018-07-23 16:51:39 -07:00
48b5344586 Check for 0 TPS explicitly 2018-07-23 16:51:39 -07:00
686b7d3737 Report panics if metrics are setup 2018-07-23 16:51:39 -07:00
7c65e2fbfc Rename variable to improve readability 2018-07-23 16:51:39 -07:00
96a6e09050 Enable metrics in the TPS client 2018-07-23 16:51:39 -07:00
b3f823d544 Alternate between token reclaim and distribution 2018-07-23 13:17:52 -07:00
ea21c7a43e Limit bench-tps last_id poll to prevent infinite loop 2018-07-23 13:17:52 -07:00
437fb1a8d7 Add -a argument to client in case you want to override the address
advertised by the client
2018-07-23 13:17:52 -07:00
166099b9d9 Start validators in parallel in multinode test (#727) 2018-07-23 09:27:06 -07:00
c707b3d2e7 Display the total number of transactions for each node once complete 2018-07-22 23:19:33 -07:00
f7d294de90 Don't rsync leader.json on every iteration 2018-07-22 17:25:00 -07:00
4ecd0a0e45 Improve bench-tps logging 2018-07-22 16:26:49 -07:00
7ebbaaeb2d Use bench-tps default duration 2018-07-22 16:26:49 -07:00
cdcf59ede0 Display a list of all discovered nodes 2018-07-22 11:32:44 -07:00
5d065133ef Add data point for testnet startup and shutdown 2018-07-21 23:27:24 -07:00
d403808564 Restart solana.bench-tps every 10 minutes to work around memory leak
cc: #728
2018-07-21 19:48:13 -07:00
3ffdca193d Rename client-demo to bench-tps catchup 2018-07-21 15:46:03 -07:00
69688a18c7 Fix clippy warnings
Seems clippy is not linting any of the benches.
2018-07-21 11:36:20 -04:00
7193bf28b6 Move streamer bench into standalone executable
It doesn't make use of criterion (or libtest)
2018-07-21 11:36:20 -04:00
637f890b91 Rename client-demo to bench-tps 2018-07-21 11:36:20 -04:00
009d5adcba Tell the client to transact for 1 hour blocks 2018-07-20 17:52:19 -07:00
52c55a0335 Log to /tmp/solana.log for easy runtime inspection of client activity 2018-07-20 17:45:13 -07:00
23428b0381 Migrate drone to poll_for_signature 2018-07-20 20:33:55 -04:00
0e305bd7dd Add poll_for_signature 2018-07-20 20:33:55 -04:00
c068ca4cb7 Return Signature from transfer_signed and send_airdrop 2018-07-20 20:33:55 -04:00
6a8379109d Sleep between retries
Don't congest a congested network.
2018-07-20 20:33:55 -04:00
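A minimal sketch of that policy with a stubbed request; the fixed 500 ms pause is illustrative, not the client's actual interval:

```rust
use std::{thread::sleep, time::Duration};

// Stand-in for a request that may fail while the network is congested.
fn try_request(attempt: u32) -> Result<(), String> {
    if attempt < 2 { Err("busy".into()) } else { Ok(()) }
}

// Pause between attempts so retries don't add load to an already
// congested network.
fn request_with_retries(max_tries: u32) -> Result<(), String> {
    let mut last_err = "never tried".to_string();
    for attempt in 0..max_tries {
        match try_request(attempt) {
            Ok(()) => return Ok(()),
            Err(e) => last_err = e,
        }
        sleep(Duration::from_millis(500));
    }
    Err(last_err)
}

fn main() {
    assert!(request_with_retries(5).is_ok());
}
```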
120add0e82 Add support for a client node running continuous transactions on the net 2018-07-20 17:07:36 -07:00
b92ee51c2d Add --loop flag to easily send transactions continuously 2018-07-20 17:07:36 -07:00
cba3b35ac9 Change not_enough_peers to the default log rate 2018-07-20 11:37:12 -07:00
313fed375c Add counter for tx count and limit error messages 2018-07-20 11:37:12 -07:00
1e63702c36 cargo fmt 2018-07-20 13:09:01 -04:00
478ee9a1c4 move tests for 'is_valid_address()' into its own test 2018-07-20 13:09:01 -04:00
eb1e5dcce4 add test for 'is_valid_address()' 2018-07-20 13:09:01 -04:00
84225beeef replace 'daddr' checks with 'is_valid_address()' 2018-07-20 13:09:01 -04:00
9cf0bd9b88 Adjust variable name 2018-07-20 09:50:24 -07:00
9d25d7611a Protect against unsupported configurations to prevent non-obvious errors later 2018-07-20 09:47:01 -07:00
1abefb2c7a Pass expected node count to testnet-sanity 2018-07-20 09:47:01 -07:00
bcc247f25f Clarify code comment 2018-07-20 12:31:23 -04:00
68ca9b2cb8 Generalize testnet deployment scripts 2018-07-20 09:10:28 -07:00
686e61d50c Display max TPS from all nodes at end of client demo (#716)
- Also lists nodes with 0 TPS and overall average TPS
2018-07-19 20:09:57 -07:00
17d927ac74 Count testnet nodes as a part of sanity 2018-07-19 12:05:21 -07:00
966c55f58e Trim CUDA runtime 2018-07-19 11:47:31 -07:00
d76d3162e5 Slow down deployment more 2018-07-19 10:11:04 -07:00
d0a2d46923 Don't shellcheck in target/ 2018-07-19 09:41:09 -07:00
a67f58e9a5 Add -c option to easily interrogate the number of nodes 2018-07-19 09:41:09 -07:00
fece91c4d1 Test multi node dynamic network ci (#696)
Buildkite automation for multinode test.  This test is ignored by default because it requires a large cpu machine to run.
2018-07-19 07:50:44 -07:00
149 changed files with 13218 additions and 5683 deletions

View File

@ -40,11 +40,6 @@ else
  point_fields="${point_fields// /\\ }" # Escape spaces
  point="job_stats,$point_tags $point_fields"
  echo "Influx data point: $point"
- if [[ -n $INFLUX_USERNAME && -n $INFLUX_PASSWORD ]]; then
-   echo "https://metrics.solana.com:8086/write?db=ci&u=${INFLUX_USERNAME}&p=${INFLUX_PASSWORD}" \
-     | xargs curl -XPOST --data-binary "$point"
- else
-   echo Influx user credentials not found
- fi
+ scripts/metrics-write-datapoint.sh "$point" || true
fi

View File

@ -1,13 +1,27 @@
#!/bin/bash -e

-[[ -n "$CARGO_TARGET_CACHE_NAME" ]] || exit 0
+# Ensure the pattern "+++ ..." never occurs when |set -x| is set, as buildkite
+# interprets this as the start of a log group.
+# Ref: https://buildkite.com/docs/pipelines/managing-log-output
+export PS4="++"

#
# Restore target/ from the previous CI build on this machine
#
-(
+[[ -n "$CARGO_TARGET_CACHE_NAME" ]] || (
  d=$HOME/cargo-target-cache/"$CARGO_TARGET_CACHE_NAME"
+ if [[ -d $d ]]; then
+   du -hs "$d"
+   read -r cacheSizeInGB _ < <(du -s --block-size=1000000000 "$d")
+   if [[ $cacheSizeInGB -gt 5 ]]; then
+     echo "$d has gotten too large, removing it"
+     rm -rf "$d"
+   fi
+ fi

  mkdir -p "$d"/target
  set -x
  rsync -a --delete --link-dest="$d" "$d"/target .
)

5
.gitignore vendored
View File

@ -1,5 +1,6 @@
Cargo.lock
/target/
**/*.rs.bk
.cargo
@ -9,3 +10,7 @@ Cargo.lock
/config-drone/
/config-validator/
/config-client/
+/multinode-demo/test/config-client/
+
+# test temp files, ledgers, etc.
+/farf/

53
CONTRIBUTING.md Normal file
View File

@ -0,0 +1,53 @@
Solana Coding Guidelines
===
The goal of these guidelines is to improve developer productivity by allowing developers to
jump into any file in the codebase and not need to adapt to inconsistencies in how the code is
written. The codebase should appear as if it had been authored by a single developer. If you
don't agree with a convention, submit a PR patching this document and let's discuss! Once
the PR is accepted, *all* code should be updated as soon as possible to reflect the new
conventions.
Rust coding conventions
---
* All Rust code is formatted using the latest version of `rustfmt`. Once installed, it will be
updated automatically when you update the compiler with `rustup`.
* All Rust code is linted with Clippy. If you'd prefer to ignore its advice, do so explicitly:
```rust
#[cfg_attr(feature = "cargo-clippy", allow(too_many_arguments))]
```
Note: Clippy defaults can be overridden in the top-level file `.clippy.toml`.
* For variable names, when in doubt, spell it out. The mapping from type names to variable names
is to lowercase the type name, putting an underscore before each capital letter. Variable names
should *not* be abbreviated unless being used as closure arguments and the brevity improves
readability. When a function has multiple instances of the same type, qualify each with a
prefix and underscore (i.e. alice_keypair) or a numeric suffix (i.e. tx0).
* For function and method names, use `<verb>_<subject>`. For unit tests, that verb should
always be `test` and for benchmarks the verb should always be `bench`. Avoid namespacing
function names with some arbitrary word. Avoid abbreviating words in function names (see the sketch after this list).
* As they say, "When in Rome, do as the Romans do." A good patch should acknowledge the coding
conventions of the code that surrounds it, even in the case where that code has not yet been
updated to meet the conventions described here.
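A small illustration of these rules, with invented names and types (nothing here is from the codebase):

```rust
struct Keypair;
struct Transaction;

fn new_transaction(_from: &Keypair, _amount: i64) -> Transaction {
    Transaction
}

// <verb>_<subject>: the function tests a transfer, so the verb is `test`.
#[test]
fn test_transfer() {
    // Two values of the same type, qualified with a prefix and underscore.
    let alice_keypair = Keypair;
    let _bob_keypair = Keypair;
    // Repeated instances of one type take a numeric suffix instead.
    let _tx0 = new_transaction(&alice_keypair, 1);
    let _tx1 = new_transaction(&alice_keypair, 2);
}
```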
Terminology
---
Inventing new terms is allowed, but should only be done when the term is widely used and
understood. Avoid introducing new 3-letter terms, which can be confused with 3-letter acronyms.
Some terms we currently use regularly in the codebase:
* fullnode: n. A fully participating network node.
* hash: n. A SHA-256 Hash.
* keypair: n. A Ed25519 key-pair, containing a public and private key.
* pubkey: n. The public key of a Ed25519 key-pair.
* sigverify: v. To verify a Ed25519 digital signature.

View File

@ -1,7 +1,7 @@
[package]
name = "solana"
description = "Blockchain, Rebuilt for Scale"
-version = "0.7.0"
+version = "0.8.0"
documentation = "https://docs.rs/solana"
homepage = "http://solana.com/"
readme = "README.md"
@ -18,21 +18,21 @@ authors = [
license = "Apache-2.0"

[[bin]]
-name = "solana-client-demo"
-path = "src/bin/client-demo.rs"
+name = "solana-bench-tps"
+path = "src/bin/bench-tps.rs"

[[bin]]
-name = "solana-wallet"
-path = "src/bin/wallet.rs"
+name = "solana-bench-streamer"
+path = "src/bin/bench-streamer.rs"

+[[bin]]
+name = "solana-drone"
+path = "src/bin/drone.rs"

[[bin]]
name = "solana-fullnode"
path = "src/bin/fullnode.rs"

-[[bin]]
-name = "solana-keygen"
-path = "src/bin/keygen.rs"

[[bin]]
name = "solana-fullnode-config"
path = "src/bin/fullnode-config.rs"
@ -42,8 +42,16 @@ name = "solana-genesis"
path = "src/bin/genesis.rs"

[[bin]]
-name = "solana-drone"
-path = "src/bin/drone.rs"
+name = "solana-ledger-tool"
+path = "src/bin/ledger-tool.rs"

+[[bin]]
+name = "solana-keygen"
+path = "src/bin/keygen.rs"

+[[bin]]
+name = "solana-wallet"
+path = "src/bin/wallet.rs"

[badges]
codecov = { repository = "solana-labs/solana", branch = "master", service = "github" }
@ -53,59 +61,54 @@ unstable = []
ipv6 = []
cuda = []
erasure = []
+test = []

[dependencies]
+atty = "0.2"
+bincode = "1.0.0"
+bs58 = "0.2.0"
+byteorder = "1.2.1"
+bytes = "0.4"
+chrono = { version = "0.4.0", features = ["serde"] }
+clap = "2.31"
+dirs = "1.0.2"
+env_logger = "0.5.12"
+generic-array = { version = "0.12.0", default-features = false, features = ["serde"] }
+getopts = "0.2"
+influx_db_client = "0.3.4"
+jsonrpc-core = { git = "https://github.com/paritytech/jsonrpc", rev = "4b6060b" }
+jsonrpc-http-server = { git = "https://github.com/paritytech/jsonrpc", rev = "4b6060b" }
+jsonrpc-macros = { git = "https://github.com/paritytech/jsonrpc", rev = "4b6060b" }
+itertools = "0.7.8"
+log = "0.4.2"
+matches = "0.1.6"
+nix = "0.11.0"
+pnet_datalink = "0.21.0"
+rand = "0.5.1"
rayon = "1.0.0"
+reqwest = "0.8.6"
+ring = "0.13.2"
sha2 = "0.7.0"
-generic-array = { version = "0.11.1", default-features = false, features = ["serde"] }
serde = "1.0.27"
serde_derive = "1.0.27"
serde_json = "1.0.10"
-ring = "0.12.1"
-untrusted = "0.5.1"
-bincode = "1.0.0"
-chrono = { version = "0.4.0", features = ["serde"] }
-log = "0.4.2"
-env_logger = "0.5.10"
-matches = "0.1.6"
-byteorder = "1.2.1"
-libc = "0.2.1"
-getopts = "0.2"
-atty = "0.2"
-rand = "0.5.1"
-pnet_datalink = "0.21.0"
+socket2 = "0.3.8"
+sys-info = "0.5.6"
tokio = "0.1"
tokio-codec = "0.1"
-tokio-core = "0.1.17"
-tokio-io = "0.1"
-itertools = "0.7.8"
-bs58 = "0.2.0"
-p2p = "0.5.2"
-futures = "0.1.21"
-clap = "2.31"
-reqwest = "0.8.6"
-influx_db_client = "0.3.4"
-dirs = "1.0.2"
+untrusted = "0.6.2"

-[dev-dependencies]
-criterion = "0.2"

[[bench]]
name = "bank"
-harness = false

[[bench]]
name = "banking_stage"
-harness = false

[[bench]]
name = "ledger"
-harness = false

[[bench]]
name = "signature"
-harness = false

[[bench]]
-name = "streamer"
+name = "sigverify"
-harness = false

123
README.md
View File

@ -17,7 +17,11 @@ All claims, content, designs, algorithms, estimates, roadmaps, specifications, a
Introduction
===

-It's possible for a centralized database to process 710,000 transactions per second on a standard gigabit network if the transactions are, on average, no more than 176 bytes. A centralized database can also replicate itself and maintain high availability without significantly compromising that transaction rate using the distributed system technique known as Optimistic Concurrency Control [H.T.Kung, J.T.Robinson (1981)]. At Solana, we're demonstrating that these same theoretical limits apply just as well to blockchain on an adversarial network. The key ingredient? Finding a way to share time when nodes can't trust one-another. Once nodes can trust time, suddenly ~40 years of distributed systems research becomes applicable to blockchain! Furthermore, and much to our surprise, it can implemented using a mechanism that has existed in Bitcoin since day one. The Bitcoin feature is called nLocktime and it can be used to postdate transactions using block height instead of a timestamp. As a Bitcoin client, you'd use block height instead of a timestamp if you don't trust the network. Block height turns out to be an instance of what's being called a Verifiable Delay Function in cryptography circles. It's a cryptographically secure way to say time has passed. In Solana, we use a far more granular verifiable delay function, a SHA 256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we implement Optimistic Concurrency Control and are now well in route towards that theoretical limit of 710,000 transactions per second.
+It's possible for a centralized database to process 710,000 transactions per second on a standard gigabit network if the transactions are, on average, no more than 176 bytes. A centralized database can also replicate itself and maintain high availability without significantly compromising that transaction rate using the distributed system technique known as Optimistic Concurrency Control [\[H.T.Kung, J.T.Robinson (1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At Solana, we're demonstrating that these same theoretical limits apply just as well to blockchain on an adversarial network. The key ingredient? Finding a way to share time when nodes can't trust one-another. Once nodes can trust time, suddenly ~40 years of distributed systems research becomes applicable to blockchain!
+
+> Perhaps the most striking difference between algorithms obtained by our method and ones based upon timeout is that using timeout produces a traditional distributed algorithm in which the processes operate asynchronously, while our method produces a globally synchronous one in which every process does the same thing at (approximately) the same time. Our method seems to contradict the whole purpose of distributed processing, which is to permit different processes to operate independently and perform different functions. However, if a distributed system is really a single system, then the processes must be synchronized in some way. Conceptually, the easiest way to synchronize processes is to get them all to do the same thing at the same time. Therefore, our method is used to implement a kernel that performs the necessary synchronization--for example, making sure that two different processes do not try to modify a file at the same time. Processes might spend only a small fraction of their time executing the synchronizing kernel; the rest of the time, they can operate independently--e.g., accessing different files. This is an approach we have advocated even when fault-tolerance is not required. The method's basic simplicity makes it easier to understand the precise properties of a system, which is crucial if one is to know just how fault-tolerant the system is. [\[L.Lamport (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
+
+Furthermore, and much to our surprise, it can be implemented using a mechanism that has existed in Bitcoin since day one. The Bitcoin feature is called nLocktime and it can be used to postdate transactions using block height instead of a timestamp. As a Bitcoin client, you'd use block height instead of a timestamp if you don't trust the network. Block height turns out to be an instance of what's being called a Verifiable Delay Function in cryptography circles. It's a cryptographically secure way to say time has passed. In Solana, we use a far more granular verifiable delay function, a SHA 256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we implement Optimistic Concurrency Control and are now well in route towards that theoretical limit of 710,000 transactions per second.
Testnet Demos
@ -58,7 +62,7 @@ your odds of success if you check out the
before proceeding:

```bash
-$ git checkout v0.7.0-beta
+$ git checkout v0.8.0
```

Configuration Setup
@ -71,26 +75,10 @@ These files can be generated by running the following script.
$ ./multinode-demo/setup.sh
```
-Singlenode Testnet
----
-Before you start a fullnode, make sure you know the IP address of the machine you
-want to be the leader for the demo, and make sure that udp ports 8000-10000 are
-open on all the machines you want to test with.
-
-Now start the server:
-
-```bash
-$ ./multinode-demo/leader.sh
-```
-
-Wait a few seconds for the server to initialize. It will print "Ready." when it's ready to
-receive transactions.

Drone
---
-In order for the below test client and validators to work, we'll also need to
+In order for the leader, client and validators to work, we'll need to
spin up a drone to give out some test tokens. The drone delivers Milton
Friedman-style "air drops" (free tokens to requesting clients) to be used in
test transactions.

@ -101,36 +89,54 @@ Start the drone on the leader node with:
$ ./multinode-demo/drone.sh
```

+Singlenode Testnet
+---
+Before you start a fullnode, make sure you know the IP address of the machine you
+want to be the leader for the demo, and make sure that udp ports 8000-10000 are
+open on all the machines you want to test with.
+
+Now start the server in a separate shell:
+
+```bash
+$ ./multinode-demo/leader.sh
+```
+
+Wait a few seconds for the server to initialize. It will print "leader ready..." when it's ready to
+receive transactions. The leader will request some tokens from the drone if it doesn't have any.
+The drone does not need to be running for subsequent leader starts.

Multinode Testnet
---
-To run a multinode testnet, after starting a leader node, spin up some validator nodes:
+To run a multinode testnet, after starting a leader node, spin up some validator nodes in
+separate shells:

```bash
-$ ./multinode-demo/validator.sh ubuntu@10.0.1.51:~/solana 10.0.1.51
+$ ./multinode-demo/validator.sh
```

To run a performance-enhanced leader or validator (on Linux),
[CUDA 9.2](https://developer.nvidia.com/cuda-downloads) must be installed on
your system:

```bash
$ ./fetch-perf-libs.sh
$ SOLANA_CUDA=1 ./multinode-demo/leader.sh
-$ SOLANA_CUDA=1 ./multinode-demo/validator.sh ubuntu@10.0.1.51:~/solana 10.0.1.51
+$ SOLANA_CUDA=1 ./multinode-demo/validator.sh
```
Testnet Client Demo
---
-Now that your singlenode or multinode testnet is up and running, in a separate shell, let's send it some transactions! Note we pass in
-the JSON configuration file here, not the genesis ledger.
+Now that your singlenode or multinode testnet is up and running let's send it
+some transactions!
+
+In a separate shell start the client:

```bash
-$ ./multinode-demo/client.sh ubuntu@10.0.1.51:~/solana 2 #The leader machine and the total number of nodes in the network
+$ ./multinode-demo/client.sh # runs against localhost by default
```

What just happened? The client demo spins up several threads to send 500,000 transactions
@ -142,21 +148,35 @@ demo completes after it has convinced itself the testnet won't process any addit
transactions. You should see several TPS measurements printed to the screen. In the
multinode variation, you'll see TPS measurements for each validator node as well.

+Public Testnet
+--------------
+In this example the client connects to our public testnet. To run validators on the testnet you would need to open udp ports `8000-10000`.
+
+```bash
+$ ./multinode-demo/client.sh --network $(dig +short testnet.solana.com):8001 --identity config-private/client-id.json --duration 60
+```
+
+You can observe the effects of your client's transactions on our [dashboard](https://metrics.solana.com:3000/d/testnet/testnet-hud?orgId=2&from=now-30m&to=now&refresh=5s&var-testnet=testnet)
Linux Snap
---
A Linux [Snap](https://snapcraft.io/) is available, which can be used to
easily get Solana running on supported Linux systems without building anything
from source. The `edge` Snap channel is updated daily with the latest
development from the `master` branch. To install:

```bash
$ sudo snap install solana --edge --devmode
```

(`--devmode` flag is required only for `solana.fullnode-cuda`)

Once installed the usual Solana programs will be available as `solana.*` instead
of `solana-*`. For example, `solana.fullnode` instead of `solana-fullnode`.

-Update to the latest version at any time with
+Update to the latest version at any time with:

```bash
$ snap info solana
$ sudo snap refresh solana --devmode
```

@ -166,10 +186,17 @@ $ sudo snap refresh solana --devmode
The snap supports running a leader, validator or leader+drone node as a system
daemon.

-Run `sudo snap get solana` to view the current daemon configuration, and
-`sudo snap logs -f solana` to view the daemon logs.
+Run `sudo snap get solana` to view the current daemon configuration. To view
+daemon logs:
+1. Run `sudo snap logs -n=all solana` to view the daemon initialization log
+2. Runtime logging can be found under `/var/snap/solana/current/leader/`,
+`/var/snap/solana/current/validator/`, or `/var/snap/solana/current/drone/` depending
+on which `mode=` was selected. Within each log directory the file `current`
+contains the latest log, and the files `*.s` (if present) contain older rotated
+logs.

Disable the daemon at any time by running:

```bash
$ sudo snap set solana mode=
```
@ -178,11 +205,13 @@ Runtime configuration files for the daemon can be found in
`/var/snap/solana/current/config`.

#### Leader daemon

```bash
$ sudo snap set solana mode=leader
```

If CUDA is available:

```bash
$ sudo snap set solana mode=leader enable-cuda=1
```

@ -205,26 +234,31 @@ to port tcp:873, tcp:9900 and the port range udp:8000-udp:10000**
To run both the Leader and Drone:

```bash
$ sudo snap set solana mode=leader+drone
```

#### Validator daemon

```bash
$ sudo snap set solana mode=validator
```

If CUDA is available:

```bash
$ sudo snap set solana mode=validator enable-cuda=1
```

By default the validator will connect to **testnet.solana.com**, override
the leader IP address by running:

```bash
$ sudo snap set solana mode=validator leader-address=127.0.0.1 #<-- change IP address
```

It's assumed that the leader will be running `rsync` configured as described in
the previous **Leader daemon** section.
@ -248,9 +282,10 @@ If your rustc version is lower than 1.26.1, please update it:
$ rustup update
```

-On Linux systems you may need to install libssl-dev and pkg-config. On Ubuntu:
+On Linux systems you may need to install libssl-dev, pkg-config, zlib1g-dev, etc. On Ubuntu:

```bash
-$ sudo apt-get install libssl-dev pkg-config
+$ sudo apt-get install libssl-dev pkg-config zlib1g-dev
```

Download the source code:

@ -270,6 +305,7 @@ $ cargo test
```

To emulate all the tests that will run on a Pull Request, run:

```bash
$ ./ci/run-local.sh
```
@ -278,17 +314,21 @@ Debugging
---

There are some useful debug messages in the code, you can enable them on a per-module and per-level
-basis with the normal RUST\_LOG environment variable. Run the fullnode with this syntax:
+basis. Before running a leader or validator set the normal RUST\_LOG environment variable.
+
+For example, to enable info everywhere and debug only in the solana::banking_stage module:

```bash
-$ RUST_LOG=solana::streamer=debug,solana::server=info cat genesis.log | ./target/release/solana-fullnode > transactions0.log
+$ export RUST_LOG=info,solana::banking_stage=debug
```

-to see the debug and info sections for streamer and server respectively. Generally
-we are using debug for infrequent debug messages, trace for potentially frequent messages and
-info for performance-related logging.
+Generally we are using debug for infrequent debug messages, trace for potentially frequent
+messages and info for performance-related logging.

-Attaching to a running process with gdb
+You can also attach to a running process with GDB. The leader's process is named
+_solana-fullnode_:

-```
+```bash
$ sudo gdb
attach <PID>
set logging on

@ -312,6 +352,11 @@ Run the benchmarks:
$ cargo +nightly bench --features="unstable"
```

+Release Process
+---
+The release process for this project is described [here](rfcs/rfc-005-branches-tags-and-channels.md).

Code coverage
---

View File

@ -1 +0,0 @@
theme: jekyll-theme-slate

View File

@ -1,18 +1,19 @@
-#[macro_use]
-extern crate criterion;
+#![feature(test)]
extern crate bincode;
extern crate rayon;
extern crate solana;
+extern crate test;

use bincode::serialize;
-use criterion::{Bencher, Criterion};
use rayon::prelude::*;
use solana::bank::*;
use solana::hash::hash;
use solana::mint::Mint;
-use solana::signature::{KeyPair, KeyPairUtil};
+use solana::signature::{Keypair, KeypairUtil};
use solana::transaction::Transaction;
+use test::Bencher;

+#[bench]
fn bench_process_transaction(bencher: &mut Bencher) {
    let mint = Mint::new(100_000_000);
    let bank = Bank::new(&mint);
@ -22,7 +23,7 @@ fn bench_process_transaction(bencher: &mut Bencher) {
        .into_par_iter()
        .map(|i| {
            // Seed the 'from' account.
-           let rando0 = KeyPair::new();
+           let rando0 = Keypair::new();
            let tx = Transaction::new(&mint.keypair(), rando0.pubkey(), 10_000, mint.last_id());
            assert!(bank.process_transaction(&tx).is_ok());
@ -30,37 +31,18 @@ fn bench_process_transaction(bencher: &mut Bencher) {
            let last_id = hash(&serialize(&i).unwrap()); // Unique hash
            bank.register_entry_id(&last_id);

-           let rando1 = KeyPair::new();
+           let rando1 = Keypair::new();
            let tx = Transaction::new(&rando0, rando1.pubkey(), 1, last_id);
            assert!(bank.process_transaction(&tx).is_ok());

            // Finally, return the transaction to the benchmark.
            tx
-       })
-       .collect();
+       }).collect();

-   bencher.iter_with_setup(
-       || {
-           // Since benchmarker runs this multiple times, we need to clear the signatures.
-           bank.clear_signatures();
-           transactions.clone()
-       },
-       |transactions| {
-           let results = bank.process_transactions(transactions);
-           assert!(results.iter().all(Result::is_ok));
-       },
-   )
+   bencher.iter(|| {
+       // Since benchmarker runs this multiple times, we need to clear the signatures.
+       bank.clear_signatures();
+       let results = bank.process_transactions(transactions.clone());
+       assert!(results.iter().all(Result::is_ok));
+   })
}
-
-fn bench(criterion: &mut Criterion) {
-    criterion.bench_function("bench_process_transaction", |bencher| {
-        bench_process_transaction(bencher);
-    });
-}
-
-criterion_group!(
-    name = benches;
-    config = Criterion::default().sample_size(2);
-    targets = bench
-);
-criterion_main!(benches);

View File

@ -1,21 +1,21 @@
+#![feature(test)]
extern crate bincode;
-#[macro_use]
-extern crate criterion;
extern crate rayon;
extern crate solana;
+extern crate test;

-use criterion::{Bencher, Criterion};
use rayon::prelude::*;
use solana::bank::Bank;
use solana::banking_stage::BankingStage;
use solana::mint::Mint;
use solana::packet::{to_packets_chunked, PacketRecycler};
use solana::record_stage::Signal;
-use solana::signature::{KeyPair, KeyPairUtil};
+use solana::signature::{Keypair, KeypairUtil};
use solana::transaction::Transaction;
use std::iter;
use std::sync::mpsc::{channel, Receiver};
use std::sync::Arc;
+use test::Bencher;

// use self::test::Bencher;
// use bank::{Bank, MAX_ENTRY_IDS};
@ -23,7 +23,7 @@ use std::sync::Arc;
// use hash::hash;
// use mint::Mint;
// use rayon::prelude::*;
-// use signature::{KeyPair, KeyPairUtil};
+// use signature::{Keypair, KeypairUtil};
// use std::collections::HashSet;
// use std::time::Instant;
// use transaction::Transaction;
@ -49,11 +49,11 @@ use std::sync::Arc;
// }
//
// // Seed the 'from' account.
-// let rando0 = KeyPair::new();
+// let rando0 = Keypair::new();
// let tx = Transaction::new(&mint.keypair(), rando0.pubkey(), 1_000, last_id);
// bank.process_transaction(&tx).unwrap();
//
-// let rando1 = KeyPair::new();
+// let rando1 = Keypair::new();
// let tx = Transaction::new(&rando0, rando1.pubkey(), 2, last_id);
// bank.process_transaction(&tx).unwrap();
//
@ -79,12 +79,15 @@ use std::sync::Arc;
// println!("{} tps", tps);
// }

-fn check_txs(batches: usize, receiver: &Receiver<Signal>, ref_tx_count: usize) {
+fn check_txs(receiver: &Receiver<Signal>, ref_tx_count: usize) {
    let mut total = 0;
-   for _ in 0..batches {
+   loop {
        let signal = receiver.recv().unwrap();
        if let Signal::Transactions(transactions) = signal {
            total += transactions.len();
+           if total >= ref_tx_count {
+               break;
+           }
        } else {
            assert!(false);
        }
@ -92,6 +95,7 @@ fn check_txs(receiver: &Receiver<Signal>, ref_tx_count: usize) {
    assert_eq!(total, ref_tx_count);
}

+#[bench]
fn bench_banking_stage_multi_accounts(bencher: &mut Bencher) {
    let tx = 10_000_usize;
    let mint_total = 1_000_000_000_000;
@ -99,9 +103,9 @@ fn bench_banking_stage_multi_accounts(bencher: &mut Bencher) {
    let num_dst_accounts = 8 * 1024;
    let num_src_accounts = 8 * 1024;

-   let srckeys: Vec<_> = (0..num_src_accounts).map(|_| KeyPair::new()).collect();
+   let srckeys: Vec<_> = (0..num_src_accounts).map(|_| Keypair::new()).collect();
    let dstkeys: Vec<_> = (0..num_dst_accounts)
-       .map(|_| KeyPair::new().pubkey())
+       .map(|_| Keypair::new().pubkey())
        .collect();

    let transactions: Vec<_> = (0..tx)
@ -112,8 +116,7 @@ fn bench_banking_stage_multi_accounts(bencher: &mut Bencher) {
            i as i64,
            mint.last_id(),
        )
-       })
-       .collect();
+       }).collect();

    let (verified_sender, verified_receiver) = channel();
    let (signal_sender, signal_receiver) = channel();
@ -127,8 +130,7 @@ fn bench_banking_stage_multi_accounts(bencher: &mut Bencher) {
            mint_total / num_src_accounts as i64,
            mint.last_id(),
        )
-       })
-       .collect();
+       }).collect();

    bencher.iter(move || {
        let bank = Arc::new(Bank::new(&mint));
@ -139,40 +141,37 @@ fn bench_banking_stage_multi_accounts(bencher: &mut Bencher) {
            .map(|x| {
                let len = (*x).read().unwrap().packets.len();
                (x, iter::repeat(1).take(len).collect())
-           })
-           .collect();
+           }).collect();
-       let verified_setup_len = verified_setup.len();
        verified_sender.send(verified_setup).unwrap();
        BankingStage::process_packets(&bank, &verified_receiver, &signal_sender, &packet_recycler)
            .unwrap();

-       check_txs(verified_setup_len, &signal_receiver, num_src_accounts);
+       check_txs(&signal_receiver, num_src_accounts);

        let verified: Vec<_> = to_packets_chunked(&packet_recycler, &transactions.clone(), 192)
            .into_iter()
            .map(|x| {
                let len = (*x).read().unwrap().packets.len();
                (x, iter::repeat(1).take(len).collect())
-           })
-           .collect();
+           }).collect();
-       let verified_len = verified.len();
        verified_sender.send(verified).unwrap();
        BankingStage::process_packets(&bank, &verified_receiver, &signal_sender, &packet_recycler)
            .unwrap();

-       check_txs(verified_len, &signal_receiver, tx);
+       check_txs(&signal_receiver, tx);
    });
}

+#[bench]
fn bench_banking_stage_single_from(bencher: &mut Bencher) {
    let tx = 10_000_usize;
    let mint = Mint::new(1_000_000_000_000);
    let mut pubkeys = Vec::new();
    let num_keys = 8;
    for _ in 0..num_keys {
-       pubkeys.push(KeyPair::new().pubkey());
+       pubkeys.push(Keypair::new().pubkey());
    }

    let transactions: Vec<_> = (0..tx)
@ -184,8 +183,7 @@ fn bench_banking_stage_single_from(bencher: &mut Bencher) {
            i as i64,
            mint.last_id(),
        )
-       })
-       .collect();
+       }).collect();

    let (verified_sender, verified_receiver) = channel();
    let (signal_sender, signal_receiver) = channel();
@ -198,29 +196,11 @@ fn bench_banking_stage_single_from(bencher: &mut Bencher) {
        .map(|x| {
            let len = (*x).read().unwrap().packets.len();
            (x, iter::repeat(1).take(len).collect())
-       })
-       .collect();
+       }).collect();
-       let verified_len = verified.len();
        verified_sender.send(verified).unwrap();
        BankingStage::process_packets(&bank, &verified_receiver, &signal_sender, &packet_recycler)
            .unwrap();
-       check_txs(verified_len, &signal_receiver, tx);
+       check_txs(&signal_receiver, tx);
    });
}
-
-fn bench(criterion: &mut Criterion) {
-    criterion.bench_function("bench_banking_stage_multi_accounts", |bencher| {
-        bench_banking_stage_multi_accounts(bencher);
-    });
-    criterion.bench_function("bench_process_stage_single_from", |bencher| {
-        bench_banking_stage_single_from(bencher);
-    });
-}
-
-criterion_group!(
-    name = benches;
-    config = Criterion::default().sample_size(2);
-    targets = bench
-);
-criterion_main!(benches);

View File

@ -1,40 +1,26 @@
-#[macro_use]
-extern crate criterion;
+#![feature(test)]
extern crate solana;
+extern crate test;

-use criterion::{Bencher, Criterion};
use solana::hash::{hash, Hash};
use solana::ledger::{next_entries, reconstruct_entries_from_blobs, Block};
use solana::packet::BlobRecycler;
-use solana::signature::{KeyPair, KeyPairUtil};
+use solana::signature::{Keypair, KeypairUtil};
use solana::transaction::Transaction;
-use std::collections::VecDeque;
+use test::Bencher;

+#[bench]
fn bench_block_to_blobs_to_block(bencher: &mut Bencher) {
    let zero = Hash::default();
-   let one = hash(&zero);
+   let one = hash(&zero.as_ref());
-   let keypair = KeyPair::new();
+   let keypair = Keypair::new();
    let tx0 = Transaction::new(&keypair, keypair.pubkey(), 1, one);
    let transactions = vec![tx0; 10];
    let entries = next_entries(&zero, 1, transactions);

    let blob_recycler = BlobRecycler::default();
    bencher.iter(|| {
-       let mut blob_q = VecDeque::new();
-       entries.to_blobs(&blob_recycler, &mut blob_q);
-       assert_eq!(reconstruct_entries_from_blobs(blob_q).unwrap(), entries);
+       let blobs = entries.to_blobs(&blob_recycler);
+       assert_eq!(reconstruct_entries_from_blobs(blobs).unwrap(), entries);
    });
}
-
-fn bench(criterion: &mut Criterion) {
-    criterion.bench_function("bench_block_to_blobs_to_block", |bencher| {
-        bench_block_to_blobs_to_block(bencher);
-    });
-}
-
-criterion_group!(
-    name = benches;
-    config = Criterion::default().sample_size(2);
-    targets = bench
-);
-criterion_main!(benches);

View File

@ -1,24 +1,12 @@
-#[macro_use]
-extern crate criterion;
+#![feature(test)]
extern crate solana;
+extern crate test;

-use criterion::{Bencher, Criterion};
use solana::signature::GenKeys;
+use test::Bencher;

+#[bench]
fn bench_gen_keys(b: &mut Bencher) {
-   let rnd = GenKeys::new([0u8; 32]);
+   let mut rnd = GenKeys::new([0u8; 32]);
    b.iter(|| rnd.gen_n_keypairs(1000));
}
-
-fn bench(criterion: &mut Criterion) {
-    criterion.bench_function("bench_gen_keys", |bencher| {
-        bench_gen_keys(bencher);
-    });
-}
-
-criterion_group!(
-    name = benches;
-    config = Criterion::default().sample_size(2);
-    targets = bench
-);
-criterion_main!(benches);

24
benches/sigverify.rs Normal file
View File

@ -0,0 +1,24 @@
#![feature(test)]
extern crate bincode;
extern crate rayon;
extern crate solana;
extern crate test;
use solana::packet::{to_packets, PacketRecycler};
use solana::sigverify;
use solana::transaction::test_tx;
use test::Bencher;
#[bench]
fn bench_sigverify(bencher: &mut Bencher) {
let tx = test_tx();
// generate packet vector
let packet_recycler = PacketRecycler::default();
let batches = to_packets(&packet_recycler, &vec![tx; 128]);
// verify packets
bencher.iter(|| {
let _ans = sigverify::ed25519_verify(&batches);
})
}

View File

@ -1,117 +0,0 @@
#[macro_use]
extern crate log;
extern crate solana;
#[macro_use]
extern crate criterion;
use criterion::{Bencher, Criterion};
use solana::packet::{Packet, PacketRecycler, BLOB_SIZE, PACKET_DATA_SIZE};
use solana::result::Result;
use solana::streamer::{receiver, PacketReceiver};
use std::net::{SocketAddr, UdpSocket};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};
use std::thread::sleep;
use std::thread::{spawn, JoinHandle};
use std::time::Duration;
use std::time::SystemTime;
fn producer(addr: &SocketAddr, recycler: PacketRecycler, exit: Arc<AtomicBool>) -> JoinHandle<()> {
let send = UdpSocket::bind("0.0.0.0:0").unwrap();
let msgs = recycler.allocate();
let msgs_ = msgs.clone();
msgs.write().unwrap().packets.resize(10, Packet::default());
for w in msgs.write().unwrap().packets.iter_mut() {
w.meta.size = PACKET_DATA_SIZE;
w.meta.set_addr(&addr);
}
spawn(move || loop {
if exit.load(Ordering::Relaxed) {
return;
}
let mut num = 0;
for p in msgs_.read().unwrap().packets.iter() {
let a = p.meta.addr();
assert!(p.meta.size < BLOB_SIZE);
send.send_to(&p.data[..p.meta.size], &a).unwrap();
num += 1;
}
assert_eq!(num, 10);
})
}
fn sink(
recycler: PacketRecycler,
exit: Arc<AtomicBool>,
rvs: Arc<Mutex<usize>>,
r: PacketReceiver,
) -> JoinHandle<()> {
spawn(move || loop {
if exit.load(Ordering::Relaxed) {
return;
}
let timer = Duration::new(1, 0);
match r.recv_timeout(timer) {
Ok(msgs) => {
*rvs.lock().unwrap() += msgs.read().unwrap().packets.len();
recycler.recycle(msgs);
}
_ => (),
}
})
}
fn bench_streamer_with_result() -> Result<()> {
let read = UdpSocket::bind("127.0.0.1:0")?;
read.set_read_timeout(Some(Duration::new(1, 0)))?;
let addr = read.local_addr()?;
let exit = Arc::new(AtomicBool::new(false));
let pack_recycler = PacketRecycler::default();
let (s_reader, r_reader) = channel();
let t_reader = receiver(read, exit.clone(), pack_recycler.clone(), s_reader);
let t_producer1 = producer(&addr, pack_recycler.clone(), exit.clone());
let t_producer2 = producer(&addr, pack_recycler.clone(), exit.clone());
let t_producer3 = producer(&addr, pack_recycler.clone(), exit.clone());
let rvs = Arc::new(Mutex::new(0));
let t_sink = sink(pack_recycler.clone(), exit.clone(), rvs.clone(), r_reader);
let start = SystemTime::now();
let start_val = *rvs.lock().unwrap();
sleep(Duration::new(5, 0));
let elapsed = start.elapsed().unwrap();
let end_val = *rvs.lock().unwrap();
let time = elapsed.as_secs() * 10000000000 + elapsed.subsec_nanos() as u64;
let ftime = (time as f64) / 10000000000f64;
let fcount = (end_val - start_val) as f64;
trace!("performance: {:?}", fcount / ftime);
exit.store(true, Ordering::Relaxed);
t_reader.join()?;
t_producer1.join()?;
t_producer2.join()?;
t_producer3.join()?;
t_sink.join()?;
Ok(())
}
fn bench_streamer(bencher: &mut Bencher) {
bencher.iter(|| {
bench_streamer_with_result().unwrap();
});
}
fn bench(criterion: &mut Criterion) {
criterion.bench_function("bench_streamer", |bencher| {
bench_streamer(bencher);
});
}
criterion_group!(
name = benches;
config = Criterion::default().sample_size(2);
targets = bench
);
criterion_main!(benches);

View File

@ -1,15 +1,33 @@
use std::env;
+use std::fs;

fn main() {
-    println!("cargo:rustc-link-search=native=.");
-    if !env::var("CARGO_FEATURE_CUDA").is_err() {
+    println!("cargo:rerun-if-changed=target/perf-libs");
+    println!("cargo:rerun-if-changed=build.rs");
+
+    // Ensure target/perf-libs/ exists.  It's been observed that
+    // a cargo:rerun-if-changed= directive with a non-existent
+    // directory triggers a rebuild on every |cargo build| invocation
+    fs::create_dir("target/perf-libs").unwrap_or_else(|err| {
+        if err.kind() != std::io::ErrorKind::AlreadyExists {
+            panic!("Unable to create target/perf-libs: {:?}", err);
+        }
+    });
+
+    let cuda = !env::var("CARGO_FEATURE_CUDA").is_err();
+    let erasure = !env::var("CARGO_FEATURE_ERASURE").is_err();
+    if cuda || erasure {
+        println!("cargo:rustc-link-search=native=target/perf-libs");
+    }
+    if cuda {
        println!("cargo:rustc-link-lib=static=cuda_verify_ed25519");
        println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
        println!("cargo:rustc-link-lib=dylib=cudart");
        println!("cargo:rustc-link-lib=dylib=cuda");
        println!("cargo:rustc-link-lib=dylib=cudadevrt");
    }
-    if !env::var("CARGO_FEATURE_ERASURE").is_err() {
+    if erasure {
        println!("cargo:rustc-link-lib=dylib=Jerasure");
        println!("cargo:rustc-link-lib=dylib=gf_complete");
    }

32
ci/audit.sh Executable file
View File

@ -0,0 +1,32 @@
#!/bin/bash -e
#
# Audits project dependencies for security vulnerabilities
#
cd "$(dirname "$0")/.."
export RUST_BACKTRACE=1
rustc --version
cargo --version
_() {
echo "--- $*"
"$@"
}
maybe_cargo_install() {
for cmd in "$@"; do
set +e
cargo "$cmd" --help > /dev/null 2>&1
declare exitcode=$?
set -e
if [[ $exitcode -eq 101 ]]; then
_ cargo install cargo-"$cmd"
fi
done
}
maybe_cargo_install audit tree
_ cargo tree
_ cargo audit || true

View File

@ -1,13 +1,18 @@
steps:
-  - command: "ci/docker-run.sh rust ci/test-stable.sh"
+  - command: "ci/docker-run.sh solanalabs/rust:1.29.0 ci/test-stable.sh"
    name: "stable [public]"
    env:
      CARGO_TARGET_CACHE_NAME: "stable"
    timeout_in_minutes: 30
+  # - command: "ci/docker-run.sh solanalabs/rust-nightly ci/test-bench.sh"
+  #   name: "bench [public]"
+  #   env:
+  #     CARGO_TARGET_CACHE_NAME: "nightly"
+  #   timeout_in_minutes: 30
  - command: "ci/shellcheck.sh"
    name: "shellcheck [public]"
    timeout_in_minutes: 20
-  - command: "ci/docker-run.sh solanalabs/rust-nightly ci/test-nightly.sh"
+  - command: "ci/docker-run.sh solanalabs/rust-nightly:2018-09-03 ci/test-nightly.sh || true"
    name: "nightly [public]"
    env:
      CARGO_TARGET_CACHE_NAME: "nightly"
@ -17,10 +22,6 @@ steps:
    env:
      CARGO_TARGET_CACHE_NAME: "stable-perf"
    timeout_in_minutes: 20
-    retry:
-      automatic:
-        - exit_status: "*"
-          limit: 2
    agents:
      - "queue=cuda"
  - command: "ci/pr-snap.sh"
@ -30,9 +31,6 @@ steps:
  - command: "ci/publish-crate.sh"
    timeout_in_minutes: 20
    name: "publish crate [public]"
-  - command: "ci/hoover.sh"
-    timeout_in_minutes: 20
-    name: "clean agent [public]"
  - trigger: "solana-snap"
    branches: "!pull/*"
    async: true

ci/channel-info.sh (new executable file)

@@ -0,0 +1,91 @@
#!/bin/bash
#
# Computes the current branch names of the edge, beta and stable
# channels, as well as the latest tagged release for beta and stable.
#
# stdout of this script may be eval-ed
#
here="$(dirname "$0")"
# shellcheck source=ci/semver_bash/semver.sh
source "$here"/semver_bash/semver.sh
remote=https://github.com/solana-labs/solana.git
# Fetch all vX.Y.Z tags
#
# NOTE: pre-release tags are explicitly ignored
#
# shellcheck disable=SC2207
tags=( \
$(git ls-remote --tags $remote \
| cut -c52- \
| grep '^v[[:digit:]][[:digit:]]*\.[[:digit:]][[:digit:]]*.[[:digit:]][[:digit:]]*$' \
| cut -c2- \
) \
)
# Fetch all the vX.Y branches
#
# shellcheck disable=SC2207
heads=( \
$(git ls-remote --heads $remote \
| cut -c53- \
| grep '^v[[:digit:]][[:digit:]]*\.[[:digit:]][[:digit:]]*$' \
| cut -c2- \
) \
)
# Figure the beta channel by looking for the largest vX.Y branch
beta=
for head in "${heads[@]}"; do
if [[ -n $beta ]]; then
if semverLT "$head.0" "$beta.0"; then
continue
fi
fi
beta=$head
done
# Figure the stable channel by looking for the second largest vX.Y branch
stable=
for head in "${heads[@]}"; do
if [[ $head = "$beta" ]]; then
continue
fi
if [[ -n $stable ]]; then
if semverLT "$head.0" "$stable.0"; then
continue
fi
fi
stable=$head
done
for tag in "${tags[@]}"; do
if [[ -n $beta && $tag = $beta* ]]; then
if [[ -n $beta_tag ]]; then
if semverLT "$tag" "$beta_tag"; then
continue
fi
fi
beta_tag=$tag
fi
if [[ -n $stable && $tag = $stable* ]]; then
if [[ -n $stable_tag ]]; then
if semverLT "$tag" "$stable_tag"; then
continue
fi
fi
stable_tag=$tag
fi
done
echo EDGE_CHANNEL=master
echo BETA_CHANNEL="${beta:+v$beta}"
echo STABLE_CHANNEL="${stable:+v$stable}"
echo BETA_CHANNEL_LATEST_TAG="${beta_tag:+v$beta_tag}"
echo STABLE_CHANNEL_LATEST_TAG="${stable_tag:+v$stable_tag}"
exit 0
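
Since the output is designed to be eval-ed, a typical consumer looks like the
following sketch (the variable names come from the echo lines above; ci/snap.sh
uses the same pattern):

```bash
eval "$(ci/channel-info.sh)"
echo "stable channel: $STABLE_CHANNEL ($STABLE_CHANNEL_LATEST_TAG)"
```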

@@ -1,32 +1,48 @@
 #!/bin/bash -e

 usage() {
-  echo "Usage: $0 [docker image name] [command]"
+  echo "Usage: $0 [--nopull] [docker image name] [command]"
   echo
   echo Runs command in the specified docker image with
-  echo a CI-appropriate environment
+  echo a CI-appropriate environment.
+  echo
+  echo "--nopull   Skip the dockerhub image update"
   echo
 }

 cd "$(dirname "$0")/.."

+NOPULL=false
+if [[ $1 = --nopull ]]; then
+  NOPULL=true
+  shift
+fi
+
 IMAGE="$1"
 if [[ -z "$IMAGE" ]]; then
   echo Error: image not defined
   exit 1
 fi

-docker pull "$IMAGE"
+$NOPULL || docker pull "$IMAGE"
 shift

 ARGS=(
   --workdir /solana
   --volume "$PWD:/solana"
-  --volume "$HOME:/home"
-  --env "CARGO_HOME=/home/.cargo"
   --rm
 )

+if [[ -n $CI ]]; then
+  # Share the real ~/.cargo between docker containers in CI for speed
+  ARGS+=(--volume "$HOME:/home")
+else
+  # Avoid sharing ~/.cargo when building locally to avoid a mixed macOS/Linux
+  # ~/.cargo
+  ARGS+=(--volume "$PWD:/home")
+fi
+ARGS+=(--env "CARGO_HOME=/home/.cargo")
+
 # kcov tries to set the personality of the binary which docker
 # doesn't allow by default.
 ARGS+=(--security-opt "seccomp=unconfined")
@@ -38,7 +54,10 @@ fi

 # Environment variables to propagate into the container
 ARGS+=(
+  --env BUILDKITE
+  --env BUILDKITE_AGENT_ACCESS_TOKEN
   --env BUILDKITE_BRANCH
+  --env BUILDKITE_JOB_ID
   --env BUILDKITE_TAG
   --env CODECOV_TOKEN
   --env CRATES_IO_TOKEN

@@ -1,6 +1,10 @@
-FROM rustlang/rust:nightly
+FROM solanalabs/rust
+ARG date

-RUN cargo install --force clippy cargo-cov && \
-    echo deb http://ftp.debian.org/debian stretch-backports main >> /etc/apt/sources.list && \
-    apt update && \
-    apt install -y llvm-6.0
+RUN set -x && \
+    rustup install nightly-$date && \
+    rustup default nightly-$date && \
+    rustup component add clippy-preview --toolchain=nightly-$date && \
+    rustc --version && \
+    cargo --version && \
+    cargo +nightly-$date install cargo-cov

@@ -1,6 +1,36 @@
 Docker image containing rust nightly and some preinstalled crates used in CI.

-This image may be manually updated by running `./build.sh` if you are a member
+This image may be manually updated by running `CI=true ./build.sh` if you are a member
 of the [Solana Labs](https://hub.docker.com/u/solanalabs/) Docker Hub
 organization, but it is also automatically updated periodically by
 [this automation](https://buildkite.com/solana-labs/solana-ci-docker-rust-nightly).
+
+## Moving to a newer nightly
+
+We pin the version of nightly (see the `ARG nightly=xyz` line in `Dockerfile`)
+to avoid the build breaking at unexpected times, as occasionally nightly will
+introduce breaking changes.
+
+To update the pinned version:
+1. Run `ci/docker-rust-nightly/build.sh` to rebuild the nightly image locally,
+   or potentially `ci/docker-rust-nightly/build.sh YYYY-MM-DD` if there's a
+   specific YYYY-MM-DD that is desired (default is today's build).
+1. Run `SOLANA_DOCKER_RUN_NOSETUID=1 ci/docker-run.sh --nopull solanalabs/rust-nightly:YYYY-MM-DD ci/test-nightly.sh`
+   to confirm the new nightly image builds.  Fix any issues as needed.
+1. Run `docker login` to enable pushing images to Docker Hub, if you're authorized.
+1. Run `CI=true ci/docker-rust-nightly/build.sh YYYY-MM-DD` to push the new nightly image to dockerhub.com.
+1. Modify the `solanalabs/rust-nightly:YYYY-MM-DD` reference in `ci/buildkite.yml` from the previous to
+   new *YYYY-MM-DD* value, send a PR with this change and any codebase adjustments needed.
+
+## Troubleshooting
+
+### Resource is denied
+
+When running `CI=true ci/docker-rust-nightly/build.sh`, you see:
+
+```
+denied: requested access to the resource is denied
+```
+
+Run `docker login` to enable pushing images to Docker Hub.  Contact @mvines or @garious
+to get write access.

@@ -2,5 +2,12 @@
 cd "$(dirname "$0")"

-docker build -t solanalabs/rust-nightly .
-docker push solanalabs/rust-nightly
+nightlyDate=${1:-$(date +%Y-%m-%d)}
+docker build -t solanalabs/rust-nightly:"$nightlyDate" --build-arg date="$nightlyDate" .
+
+maybeEcho=
+if [[ -z $CI ]]; then
+  echo "Not CI, skipping |docker push|"
+  maybeEcho="echo"
+fi
+$maybeEcho docker push solanalabs/rust-nightly:"$nightlyDate"

ci/docker-rust/Dockerfile (new file)

@@ -0,0 +1,23 @@
# Note: when the rust version (1.29.0) is changed, also modify
# ci/buildkite.yml to pick up the new image tag
FROM rust:1.29.0
RUN set -x && \
apt update && \
apt-get install apt-transport-https && \
echo deb https://apt.buildkite.com/buildkite-agent stable main > /etc/apt/sources.list.d/buildkite-agent.list && \
echo deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main > /etc/apt/sources.list.d/llvm.list && \
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 32A37959C2FA5C3C99EFBC32A79206696452D198 && \
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - && \
apt update && \
apt install -y \
buildkite-agent \
cmake \
llvm-6.0 \
rsync \
sudo \
&& \
rustup component add rustfmt-preview && \
rm -rf /var/lib/apt/lists/* && \
rustc --version && \
cargo --version

ci/docker-rust/README.md (new file)

@@ -0,0 +1,6 @@
Docker image containing rust and some preinstalled packages used in CI.
This image may be manually updated by running `./build.sh` if you are a member
of the [Solana Labs](https://hub.docker.com/u/solanalabs/) Docker Hub
organization, but it is also automatically updated periodically by
[this automation](https://buildkite.com/solana-labs/solana-ci-docker-rust).

ci/docker-rust/build.sh (new executable file)

@@ -0,0 +1,11 @@
#!/bin/bash -ex
cd "$(dirname "$0")"
docker build -t solanalabs/rust .
read -r rustc version _ < <(docker run solanalabs/rust rustc --version)
[[ $rustc = rustc ]]
docker tag solanalabs/rust:latest solanalabs/rust:"$version"
docker push solanalabs/rust

@@ -3,6 +3,7 @@
 # Regular maintenance performed on a buildkite agent to control disk usage
 #

 echo --- Delete all exited containers first
 (
   set -x
@@ -39,12 +40,35 @@ echo --- Remove unused docker networks
   docker network prune -f
 )

-echo "--- Delete /tmp files older than 1 day owned by $(whoami)"
+echo "--- Delete /tmp files older than 1 day owned by $(id -un)"
 (
   set -x
-  find /tmp -maxdepth 1 -user "$(whoami)" -mtime +1 -print0 | xargs -0 rm -rf
+  find /tmp -maxdepth 1 -user "$(id -un)" -mtime +1 -print0 | xargs -0 rm -rf
 )

+echo --- Deleting stale buildkite agent build directories
+if [[ ! -d ../../../../builds/$BUILDKITE_AGENT_NAME ]]; then
+  # We might not be where we think we are, do nothing
+  echo Warning: Skipping flush of stale agent build directories
+  echo "  PWD=$PWD"
+else
+  # NOTE: this will be horribly broken if we ever decide to run multiple
+  #       agents on the same machine.
+  (
+    for keepDir in "$BUILDKITE_PIPELINE_SLUG" \
+                   "$BUILDKITE_ORGANIZATION_SLUG" \
+                   "$BUILDKITE_AGENT_NAME"; do
+      cd .. || exit 1
+      for dir in *; do
+        if [[ -d $dir && $dir != "$keepDir" ]]; then
+          echo "Removing $dir"
+          rm -rf "${dir:?}"/
+        fi
+      done
+    done
+  )
+fi
+
 echo --- System Status
 (
   set -x

ci/localnet-sanity.sh (new executable file)

@@ -0,0 +1,93 @@
#!/bin/bash -e
#
# Perform a quick sanity test on a leader, drone, validator and client running
# locally on the same machine
#
cd "$(dirname "$0")"/..
source ci/upload_ci_artifact.sh
source scripts/configure-metrics.sh
multinode-demo/setup.sh
backgroundCommands="drone leader validator validator-x"
pids=()
for cmd in $backgroundCommands; do
echo "--- Start $cmd"
rm -f log-"$cmd".txt
multinode-demo/"$cmd".sh > log-"$cmd".txt 2>&1 &
declare pid=$!
pids+=("$pid")
echo "pid: $pid"
done
killBackgroundCommands() {
set +e
for pid in "${pids[@]}"; do
if kill "$pid"; then
wait "$pid"
else
echo -e "^^^ +++\\nWarning: unable to kill $pid"
fi
done
set -e
pids=()
}
shutdown() {
exitcode=$?
killBackgroundCommands
set +e
echo "--- Upload artifacts"
for cmd in $backgroundCommands; do
declare logfile=log-$cmd.txt
upload_ci_artifact "$logfile"
tail "$logfile"
done
exit $exitcode
}
trap shutdown EXIT INT
set -e
flag_error() {
echo Failed
echo "^^^ +++"
exit 1
}
echo "--- Wallet sanity"
(
set -x
scripts/wallet-sanity.sh
) || flag_error
echo "--- Node count"
(
source multinode-demo/common.sh
set -x
client_id=/tmp/client-id.json-$$
$solana_keygen -o $client_id
$solana_bench_tps --identity $client_id --num-nodes 3 --converge-only
rm -rf $client_id
) || flag_error
killBackgroundCommands
echo "--- Ledger verification"
(
source multinode-demo/common.sh
set -x
cp -R "$SOLANA_CONFIG_DIR"/ledger /tmp/ledger-$$
$solana_ledger_tool --ledger /tmp/ledger-$$ verify
rm -rf /tmp/ledger-$$
) || flag_error
echo +++
echo Ok
exit 0

ci/semver_bash/LICENSE (new file)

@@ -0,0 +1,26 @@
Copyright (c) 2013, Ray Bejjani
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The views and conclusions contained in the software and documentation are those
of the authors and should not be interpreted as representing official policies,
either expressed or implied, of the FreeBSD Project.

ci/semver_bash/README.md (new file)

@@ -0,0 +1,31 @@
semver_bash is a bash parser for semantic versioning
====================================================
[Semantic Versioning](http://semver.org/) is a set of guidelines that help keep
version and version management sane. This is a bash based parser to help manage
a project's versions. Use it from a Makefile or any scripts you use in your
project.
Usage
-----
semver_bash can be used from the command line as:
$ ./semver.sh "3.2.1" "3.2.1-alpha"
3.2.1 -> M: 3 m:2 p:1 s:
3.2.1-alpha -> M: 3 m:2 p:1 s:-alpha
3.2.1 == 3.2.1-alpha -> 1.
3.2.1 < 3.2.1-alpha -> 1.
3.2.1 > 3.2.1-alpha -> 0.
Alternatively, you can source it from within a script:
. ./semver.sh
local MAJOR=0
local MINOR=0
local PATCH=0
local SPECIAL=""
semverParseInto "1.2.3" MAJOR MINOR PATCH SPECIAL
semverParseInto "3.2.1" MAJOR MINOR PATCH SPECIAL

ci/semver_bash/semver.sh (new executable file)

@@ -0,0 +1,130 @@
#!/usr/bin/env sh
function semverParseInto() {
local RE='[^0-9]*\([0-9]*\)[.]\([0-9]*\)[.]\([0-9]*\)\([0-9A-Za-z-]*\)'
#MAJOR
eval $2=`echo $1 | sed -e "s#$RE#\1#"`
#MINOR
eval $3=`echo $1 | sed -e "s#$RE#\2#"`
#PATCH
eval $4=`echo $1 | sed -e "s#$RE#\3#"`
#SPECIAL
eval $5=`echo $1 | sed -e "s#$RE#\4#"`
}
function semverEQ() {
local MAJOR_A=0
local MINOR_A=0
local PATCH_A=0
local SPECIAL_A=0
local MAJOR_B=0
local MINOR_B=0
local PATCH_B=0
local SPECIAL_B=0
semverParseInto $1 MAJOR_A MINOR_A PATCH_A SPECIAL_A
semverParseInto $2 MAJOR_B MINOR_B PATCH_B SPECIAL_B
if [ $MAJOR_A -ne $MAJOR_B ]; then
return 1
fi
if [ $MINOR_A -ne $MINOR_B ]; then
return 1
fi
if [ $PATCH_A -ne $PATCH_B ]; then
return 1
fi
if [[ "_$SPECIAL_A" != "_$SPECIAL_B" ]]; then
return 1
fi
return 0
}
function semverLT() {
local MAJOR_A=0
local MINOR_A=0
local PATCH_A=0
local SPECIAL_A=0
local MAJOR_B=0
local MINOR_B=0
local PATCH_B=0
local SPECIAL_B=0
semverParseInto $1 MAJOR_A MINOR_A PATCH_A SPECIAL_A
semverParseInto $2 MAJOR_B MINOR_B PATCH_B SPECIAL_B
if [ $MAJOR_A -lt $MAJOR_B ]; then
return 0
fi
if [[ $MAJOR_A -le $MAJOR_B && $MINOR_A -lt $MINOR_B ]]; then
return 0
fi
if [[ $MAJOR_A -le $MAJOR_B && $MINOR_A -le $MINOR_B && $PATCH_A -lt $PATCH_B ]]; then
return 0
fi
if [[ "_$SPECIAL_A" == "_" ]] && [[ "_$SPECIAL_B" == "_" ]] ; then
return 1
fi
if [[ "_$SPECIAL_A" == "_" ]] && [[ "_$SPECIAL_B" != "_" ]] ; then
return 1
fi
if [[ "_$SPECIAL_A" != "_" ]] && [[ "_$SPECIAL_B" == "_" ]] ; then
return 0
fi
if [[ "_$SPECIAL_A" < "_$SPECIAL_B" ]]; then
return 0
fi
return 1
}
function semverGT() {
semverEQ $1 $2
local EQ=$?
semverLT $1 $2
local LT=$?
if [ $EQ -ne 0 ] && [ $LT -ne 0 ]; then
return 0
else
return 1
fi
}
if [ "___semver.sh" == "___`basename $0`" ]; then
MAJOR=0
MINOR=0
PATCH=0
SPECIAL=""
semverParseInto $1 MAJOR MINOR PATCH SPECIAL
echo "$1 -> M: $MAJOR m:$MINOR p:$PATCH s:$SPECIAL"
semverParseInto $2 MAJOR MINOR PATCH SPECIAL
echo "$2 -> M: $MAJOR m:$MINOR p:$PATCH s:$SPECIAL"
semverEQ $1 $2
echo "$1 == $2 -> $?."
semverLT $1 $2
echo "$1 < $2 -> $?."
semverGT $1 $2
echo "$1 > $2 -> $?."
fi

ci/semver_bash/semver_test.sh (new executable file)

@@ -0,0 +1,151 @@
#!/usr/bin/env bash
. ./semver.sh
semverTest() {
local A=R1.3.2
local B=R2.3.2
local C=R1.4.2
local D=R1.3.3
local E=R1.3.2a
local F=R1.3.2b
local G=R1.2.3
local MAJOR=0
local MINOR=0
local PATCH=0
local SPECIAL=""
semverParseInto $A MAJOR MINOR PATCH SPECIAL
echo "$A -> M:$MAJOR m:$MINOR p:$PATCH s:$SPECIAL. Expect M:1 m:3 p:2 s:"
semverParseInto $E MAJOR MINOR PATCH SPECIAL
echo "$E -> M:$MAJOR m:$MINOR p:$PATCH s:$SPECIAL. Expect M:1 m:3 p:2 s:a"
echo "Equality comparisions"
semverEQ $A $A
echo "$A == $A -> $?. Expect 0."
semverLT $A $A
echo "$A < $A -> $?. Expect 1."
semverGT $A $A
echo "$A > $A -> $?. Expect 1."
echo "Major number comparisions"
semverEQ $A $B
echo "$A == $B -> $?. Expect 1."
semverLT $A $B
echo "$A < $B -> $?. Expect 0."
semverGT $A $B
echo "$A > $B -> $?. Expect 1."
semverEQ $B $A
echo "$B == $A -> $?. Expect 1."
semverLT $B $A
echo "$B < $A -> $?. Expect 1."
semverGT $B $A
echo "$B > $A -> $?. Expect 0."
echo "Minor number comparisions"
semverEQ $A $C
echo "$A == $C -> $?. Expect 1."
semverLT $A $C
echo "$A < $C -> $?. Expect 0."
semverGT $A $C
echo "$A > $C -> $?. Expect 1."
semverEQ $C $A
echo "$C == $A -> $?. Expect 1."
semverLT $C $A
echo "$C < $A -> $?. Expect 1."
semverGT $C $A
echo "$C > $A -> $?. Expect 0."
echo "patch number comparisions"
semverEQ $A $D
echo "$A == $D -> $?. Expect 1."
semverLT $A $D
echo "$A < $D -> $?. Expect 0."
semverGT $A $D
echo "$A > $D -> $?. Expect 1."
semverEQ $D $A
echo "$D == $A -> $?. Expect 1."
semverLT $D $A
echo "$D < $A -> $?. Expect 1."
semverGT $D $A
echo "$D > $A -> $?. Expect 0."
echo "special section vs no special comparisions"
semverEQ $A $E
echo "$A == $E -> $?. Expect 1."
semverLT $A $E
echo "$A < $E -> $?. Expect 1."
semverGT $A $E
echo "$A > $E -> $?. Expect 0."
semverEQ $E $A
echo "$E == $A -> $?. Expect 1."
semverLT $E $A
echo "$E < $A -> $?. Expect 0."
semverGT $E $A
echo "$E > $A -> $?. Expect 1."
echo "special section vs special comparisions"
semverEQ $E $F
echo "$E == $F -> $?. Expect 1."
semverLT $E $F
echo "$E < $F -> $?. Expect 0."
semverGT $E $F
echo "$E > $F -> $?. Expect 1."
semverEQ $F $E
echo "$F == $E -> $?. Expect 1."
semverLT $F $E
echo "$F < $E -> $?. Expect 1."
semverGT $F $E
echo "$F > $E -> $?. Expect 0."
echo "Minor and patch number comparisons"
semverEQ $A $G
echo "$A == $G -> $?. Expect 1."
semverLT $A $G
echo "$A < $G -> $?. Expect 1."
semverGT $A $G
echo "$A > $G -> $?. Expect 0."
semverEQ $G $A
echo "$G == $A -> $?. Expect 1."
semverLT $G $A
echo "$G < $A -> $?. Expect 0."
semverGT $G $A
echo "$G > $A -> $?. Expect 1."
}
semverTest

@@ -5,7 +5,13 @@
 cd "$(dirname "$0")/.."

 set -x
-find . -name "*.sh" -not -regex ".*/.cargo/.*" -not -regex ".*/node_modules/.*" -print0 \
+find . -name "*.sh" \
+    -not -regex ".*/ci/semver_bash/.*" \
+    -not -regex ".*/.cargo/.*" \
+    -not -regex ".*/node_modules/.*" \
+    -not -regex ".*/target/.*" \
+    -print0 \
   | xargs -0 \
      ci/docker-run.sh koalaman/shellcheck --color=always --external-sources --shell=bash
 exit 0

@@ -7,16 +7,21 @@ if [[ -z $BUILDKITE_BRANCH ]] || ./ci/is-pr.sh; then
   DRYRUN="echo"
 fi

-# BUILDKITE_TAG is the normal environment variable set by Buildkite. However
-# when this script is run from a triggered pipeline, TRIGGERED_BUILDKITE_TAG is
-# used instead of BUILDKITE_TAG (due to Buildkite limitations that prevents
-# BUILDKITE_TAG from propagating through to triggered pipelines)
-if [[ -z "$BUILDKITE_TAG" && -z "$TRIGGERED_BUILDKITE_TAG" ]]; then
+eval "$(ci/channel-info.sh)"
+
+if [[ $BUILDKITE_BRANCH = "$STABLE_CHANNEL" ]]; then
+  SNAP_CHANNEL=stable
+elif [[ $BUILDKITE_BRANCH = "$EDGE_CHANNEL" ]]; then
   SNAP_CHANNEL=edge
-else
+elif [[ $BUILDKITE_BRANCH = "$BETA_CHANNEL" ]]; then
   SNAP_CHANNEL=beta
 fi

+if [[ -z $SNAP_CHANNEL ]]; then
+  echo Unable to determine channel to publish into, exiting.
+  exit 0
+fi
+
 if [[ -z $DRYRUN ]]; then
   [[ -n $SNAPCRAFT_CREDENTIALS_KEY ]] || {
     echo SNAPCRAFT_CREDENTIALS_KEY not defined
@@ -39,15 +44,18 @@ set -x

 echo --- checking for multilog
 if [[ ! -x /usr/bin/multilog ]]; then
-  echo "multilog not found, install with: sudo apt-get install -y daemontools"
-  exit 1
+  if [[ -z $CI ]]; then
+    echo "multilog not found, install with: sudo apt-get install -y daemontools"
+    exit 1
+  fi
+  sudo apt-get install -y daemontools
 fi

-echo --- build
+echo --- build: $SNAP_CHANNEL channel
 snapcraft

 source ci/upload_ci_artifact.sh
 upload_ci_artifact solana_*.snap

-echo --- publish
+echo --- publish: $SNAP_CHANNEL channel
 $DRYRUN snapcraft push solana_*.snap --release $SNAP_CHANNEL

ci/test-bench.sh (new executable file)

@@ -0,0 +1,13 @@
#!/bin/bash -e
cd "$(dirname "$0")/.."
ci/version-check.sh nightly
export RUST_BACKTRACE=1
_() {
echo "--- $*"
"$@"
}
_ cargo bench --features=unstable --verbose

ci/test-large-network.sh (new executable file)

@@ -0,0 +1,45 @@
#!/bin/bash -e
here=$(dirname "$0")
cd "$here"/..
if ! ci/version-check.sh stable; then
# This job doesn't run within a container, try once to upgrade tooling on a
# version check failure
rustup install stable
ci/version-check.sh stable
fi
export RUST_BACKTRACE=1
./fetch-perf-libs.sh
export LD_LIBRARY_PATH=$PWD/target/perf-libs:$LD_LIBRARY_PATH
export RUST_LOG=multinode=info
if [[ $(ulimit -n) -lt 65000 ]]; then
echo 'Error: nofiles too small, run "ulimit -n 65000" to continue'
exit 1
fi
if [[ $(sysctl -n net.core.rmem_default) -lt 1610612736 ]]; then
echo 'Error: rmem_default too small, run "sudo sysctl -w net.core.rmem_default=1610612736" to continue'
exit 1
fi
if [[ $(sysctl -n net.core.rmem_max) -lt 1610612736 ]]; then
echo 'Error: rmem_max too small, run "sudo sysctl -w net.core.rmem_max=1610612736" to continue'
exit 1
fi
if [[ $(sysctl -n net.core.wmem_default) -lt 1610612736 ]]; then
echo 'Error: rmem_default too small, run "sudo sysctl -w net.core.wmem_default=1610612736" to continue'
exit 1
fi
if [[ $(sysctl -n net.core.wmem_max) -lt 1610612736 ]]; then
echo 'Error: rmem_max too small, run "sudo sysctl -w net.core.wmem_max=1610612736" to continue'
exit 1
fi
set -x
exec cargo test --release --features=erasure test_multi_node_dynamic_network -- --ignored

@@ -2,9 +2,8 @@
 cd "$(dirname "$0")/.."

+ci/version-check.sh nightly
 export RUST_BACKTRACE=1
-rustc --version
-cargo --version

 _() {
   echo "--- $*"
@@ -12,8 +11,11 @@ _() {
 }

 _ cargo build --verbose --features unstable
-_ cargo test --verbose --features unstable
-_ cargo clippy -- --deny=warnings
+_ cargo test --verbose --features=unstable
+# TODO: Re-enable warnings-as-errors after clippy offers a way to not warn on unscoped lint names.
+#_ cargo clippy -- --deny=warnings
+_ cargo clippy
+
+exit 0
@@ -29,4 +31,3 @@ if [[ -z "$CODECOV_TOKEN" ]]; then
 else
   bash <(curl -s https://codecov.io/bash) -x 'llvm-cov-6.0 gcov'
 fi

@@ -2,11 +2,29 @@
 cd "$(dirname "$0")/.."

-./fetch-perf-libs.sh
-
-export LD_LIBRARY_PATH=$PWD:/usr/local/cuda/lib64
-export PATH=$PATH:/usr/local/cuda/bin
+if ! ci/version-check.sh stable; then
+  # This job doesn't run within a container, try once to upgrade tooling on a
+  # version check failure
+  rustup install stable
+  ci/version-check.sh stable
+fi
 export RUST_BACKTRACE=1

-set -x
-exec cargo test --features=cuda,erasure
+./fetch-perf-libs.sh
+export LD_LIBRARY_PATH=$PWD/target/perf-libs:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
+export PATH=$PATH:/usr/local/cuda/bin
+
+_() {
+  echo "--- $*"
+  "$@"
+}
+
+_ cargo test --features=cuda,erasure
+
+echo --- ci/localnet-sanity.sh
+(
+  set -x
+  # Assume |cargo build| has populated target/debug/ successfully.
+  export PATH=$PWD/target/debug:$PATH
+  USE_INSTALL=1 ci/localnet-sanity.sh
+)

@@ -2,17 +2,24 @@
 cd "$(dirname "$0")/.."

+ci/version-check.sh stable
 export RUST_BACKTRACE=1
-rustc --version
-cargo --version

 _() {
   echo "--- $*"
   "$@"
 }

-_ rustup component add rustfmt-preview
-_ cargo fmt -- --write-mode=check
+_ cargo fmt -- --check
 _ cargo build --verbose
 _ cargo test --verbose
-_ cargo bench --verbose
+
+echo --- ci/localnet-sanity.sh
+(
+  set -x
+  # Assume |cargo build| has populated target/debug/ successfully.
+  export PATH=$PWD/target/debug:$PATH
+  USE_INSTALL=1 ci/localnet-sanity.sh
+)
+
+_ ci/audit.sh || true

@@ -1,165 +1,121 @@
 #!/bin/bash -e
-#
-# Deploys the Solana software running on the testnet full nodes
-#
-# This script must be run by a user/machine that has successfully authenticated
-# with GCP and has sufficient permission.
-#
-cd "$(dirname "$0")/.."
-
-# TODO: Switch over to rolling updates
-ROLLING_UPDATE=false
-#ROLLING_UPDATE=true
-
-if [[ -z $SOLANA_METRICS_CONFIG ]]; then
-  echo Error: SOLANA_METRICS_CONFIG environment variable is unset
-  exit 1
-fi
-
-# Default to edge channel.  To select the beta channel:
-#   export SOLANA_SNAP_CHANNEL=beta
-if [[ -z $SOLANA_SNAP_CHANNEL ]]; then
-  SOLANA_SNAP_CHANNEL=edge
-fi
-
-case $SOLANA_SNAP_CHANNEL in
-edge)
-  publicUrl=master.testnet.solana.com
-  publicIp=$(dig +short $publicUrl | head -n1)
-  ;;
-beta)
-  publicUrl=testnet.solana.com
-  publicIp="" # Use default value
-  ;;
-*)
-  echo Error: Unknown SOLANA_SNAP_CHANNEL=$SOLANA_SNAP_CHANNEL
-  exit 1
-  ;;
-esac
-
-resourcePrefix=${publicUrl//./-}
-vmlist=("$resourcePrefix":us-west1-b) # Leader is hard coded as the first entry
-validatorNamePrefix=$resourcePrefix-validator-
-
-echo "--- Available validators for $publicUrl"
-filter="name~^$validatorNamePrefix"
-gcloud compute instances list --filter="$filter"
-while read -r vmName vmZone status; do
-  if [[ $status != RUNNING ]]; then
-    echo "Warning: $vmName is not RUNNING, ignoring it."
-    continue
-  fi
-  vmlist+=("$vmName:$vmZone")
-done < <(gcloud compute instances list --filter="$filter" --format 'value(name,zone,status)')
-
-wait_for_node() {
-  declare pid=$1
-  declare ok=true
-  wait "$pid" || ok=false
-  cat "log-$pid.txt"
-  if ! $ok; then
-    echo ^^^ +++
-    exit 1
-  fi
-}
-
-if ! $ROLLING_UPDATE; then
-  count=1
-  for info in "${vmlist[@]}"; do
-    nodePosition="($count/${#vmlist[*]})"
-    vmName=${info%:*}
-    vmZone=${info#*:}
-    echo "--- Shutting down $vmName in zone $vmZone $nodePosition"
-    gcloud compute ssh "$vmName" --zone "$vmZone" \
-      --ssh-flag="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" \
-      --command="echo sudo snap remove solana" &
-    if [[ $((count % 10)) = 0 ]]; then
-      # Slow down deployment to avoid triggering GCP login
-      # quota limits (each |ssh| counts as a login)
-      sleep 3
-    fi
-    count=$((count + 1))
-  done
-  wait
-fi
-
-echo "--- Refreshing leader for $publicUrl"
-leader=true
-pids=()
-count=1
-for info in "${vmlist[@]}"; do
-  nodePosition="($count/${#vmlist[*]})"
-  vmName=${info%:*}
-  vmZone=${info#*:}
-  echo "Starting refresh for $vmName $nodePosition"
-  (
-    SECONDS=0
-    echo "--- $vmName in zone $vmZone $nodePosition"
-    commonNodeConfig="\
-      rust-log=$RUST_LOG \
-      default-metrics-rate=$SOLANA_DEFAULT_METRICS_RATE \
-      metrics-config=$SOLANA_METRICS_CONFIG \
-    "
-    if $leader; then
-      nodeConfig="mode=leader+drone $commonNodeConfig"
-      if [[ -n $SOLANA_CUDA ]]; then
-        nodeConfig="$nodeConfig enable-cuda=1"
-      fi
-    else
-      nodeConfig="mode=validator leader-address=$publicIp $commonNodeConfig"
-    fi
-
-    set -x
-    gcloud compute ssh "$vmName" --zone "$vmZone" \
-      --ssh-flag="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -t" \
-      --command="\
-        set -ex; \
-        logmarker='solana deploy $(date)/$RANDOM'; \
-        sudo snap remove solana; \
-        logger \$logmarker; \
-        sudo snap install solana --$SOLANA_SNAP_CHANNEL --devmode; \
-        sudo snap set solana $nodeConfig; \
-        snap info solana; \
-        echo Slight delay to get more syslog output; \
-        sleep 2; \
-        sudo grep -Pzo \"\$logmarker(.|\\n)*\" /var/log/syslog \
-      "
-    echo "Succeeded in ${SECONDS} seconds"
-  ) > "log-$vmName.txt" 2>&1 &
-  pid=$!
-  # Rename log file so it can be discovered later by $pid
-  mv "log-$vmName.txt" "log-$pid.txt"
-
-  if $leader; then
-    echo Waiting for leader...
-    # Wait for the leader to initialize before starting the validators
-    # TODO: Remove this limitation eventually.
-    wait_for_node "$pid"
-
-    echo "--- Refreshing validators"
-  else
-    # Slow down deployment to ~20 machines a minute to avoid triggering GCP login
-    # quota limits (each |ssh| counts as a login)
-    sleep 3
-
-    pids+=("$pid")
-  fi
-  leader=false
-  count=$((count + 1))
-done
-
-echo --- Waiting for validators
-for pid in "${pids[@]}"; do
-  wait_for_node "$pid"
-done
-
-echo "--- $publicUrl sanity test"
-USE_SNAP=1 ci/testnet-sanity.sh $publicUrl
+
+cd "$(dirname "$0")"/..
+
+zone=
+leaderAddress=
+clientNodeCount=0
+validatorNodeCount=10
+publicNetwork=false
+snapChannel=edge
+delete=false
+enableGpu=false
+
+usage() {
+  exitcode=0
+  if [[ -n "$1" ]]; then
+    exitcode=1
+    echo "Error: $*"
+  fi
+  cat <<EOF
+usage: $0 [name] [zone] [options...]
+
+Deploys a CD testnet
+
+  name  - name of the network
+  zone  - GCE to deploy the network into
+
+  options:
+   -s edge|beta|stable  - Deploy the specified Snap release channel
+                          (default: $snapChannel)
+   -n [number]          - Number of validator nodes (default: $validatorNodeCount)
+   -c [number]          - Number of client nodes (default: $clientNodeCount)
+   -P                   - Use public network IP addresses (default: $publicNetwork)
+   -g                   - Enable GPU (default: $enableGpu)
+   -a [address]         - Set the leader node's external IP address to this GCE address
+   -d                   - Delete the network
+
+ Note: the SOLANA_METRICS_CONFIG environment variable is used to configure
+       metrics
+EOF
+  exit $exitcode
+}
+
+netName=$1
+zone=$2
+[[ -n $netName ]] || usage
+[[ -n $zone ]] || usage "Zone not specified"
+shift 2
+
+while getopts "h?p:Pn:c:s:ga:d" opt; do
+  case $opt in
+  h | \?)
+    usage
+    ;;
+  P)
+    publicNetwork=true
+    ;;
+  n)
+    validatorNodeCount=$OPTARG
+    ;;
+  c)
+    clientNodeCount=$OPTARG
+    ;;
+  s)
+    case $OPTARG in
+    edge|beta|stable)
+      snapChannel=$OPTARG
+      ;;
+    *)
+      usage "Invalid snap channel: $OPTARG"
+      ;;
+    esac
+    ;;
+  g)
+    enableGpu=true
+    ;;
+  a)
+    leaderAddress=$OPTARG
+    ;;
+  d)
+    delete=true
+    ;;
+  *)
+    usage "Error: unhandled option: $opt"
+    ;;
+  esac
+done
+
+gce_create_args=(
+  -a "$leaderAddress"
+  -c "$clientNodeCount"
+  -n "$validatorNodeCount"
+  -p "$netName"
+  -z "$zone"
+)
+
+if $enableGpu; then
+  gce_create_args+=(-g)
+fi
+
+if $publicNetwork; then
+  gce_create_args+=(-P)
+fi
+
+set -x
+
+echo --- gce.sh delete
+time net/gce.sh delete -p "$netName"
+if $delete; then
+  exit 0
+fi
+
+echo --- gce.sh create
+time net/gce.sh create "${gce_create_args[@]}"
+net/init-metrics.sh -e
+
+echo --- net.sh start
+time net/net.sh start -s "$snapChannel"

 exit 0

@@ -1,17 +1,36 @@
 #!/bin/bash -e
+#
+# Perform a quick sanity test on the specific testnet
+#

 cd "$(dirname "$0")/.."

-TESTNET=$1
-if [[ -z $TESTNET ]]; then
-  TESTNET=testnet.solana.com
-fi
+usage() {
+  exitcode=0
+  if [[ -n "$1" ]]; then
+    exitcode=1
+    echo "Error: $*"
+  fi
+  cat <<EOF
+usage: $0 [name]
+
+Sanity check a CD testnet
+
+  name  - name of the network
+
+ Note: the SOLANA_METRICS_CONFIG environment variable is used to configure
+       metrics
+EOF
+  exit $exitcode
+}

-echo "--- $TESTNET: wallet sanity"
-multinode-demo/test/wallet-sanity.sh $TESTNET
+netName=$1
+[[ -n $netName ]] || usage ""
+
+set -x
+echo --- gce.sh config
+net/gce.sh config -p "$netName"
+net/init-metrics.sh -e
+echo --- net.sh sanity
+net/net.sh sanity \
+  ${NO_LEDGER_VERIFY:+-o noLedgerVerify} \
+  ${NO_VALIDATOR_SANITY:+-o noValidatorSanity} \

+echo --- fin
 exit 0

ci/version-check.sh (new executable file)

@@ -0,0 +1,35 @@
#!/bin/bash -e
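#
# Checks that the locally installed rustc and cargo versions match what CI
# expects for the requested channel (stable or nightly)
#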
require() {
declare expectedProgram="$1"
declare expectedVersion="$2"
read -r program version _ < <($expectedProgram -V)
declare ok=true
[[ $program = "$expectedProgram" ]] || ok=false
[[ $version =~ $expectedVersion ]] || ok=false
echo "Found $program $version"
if ! $ok; then
echo Error: expected "$expectedProgram $expectedVersion"
exit 1
fi
}
case ${1:-stable} in
nightly)
require rustc 1.30.[0-9]+-nightly
require cargo 1.29.[0-9]+-nightly
;;
stable)
require rustc 1.29.[0-9]+
require cargo 1.29.[0-9]+
;;
*)
echo Error: unknown argument: "$1"
exit 1
;;
esac
exit 0
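
The CI test scripts invoke this as a guard. Jobs that run outside a container
retry once after upgrading the toolchain, as in ci/test-large-network.sh above:

```bash
if ! ci/version-check.sh stable; then
  # This job doesn't run within a container, try once to upgrade tooling on a
  # version check failure
  rustup install stable
  ci/version-check.sh stable
fi
```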

doc/json-rpc.md (new file)

@@ -0,0 +1,178 @@
Solana JSON RPC API
===
Solana nodes accept HTTP requests using the [JSON-RPC 2.0](https://www.jsonrpc.org/specification) specification.
To interact with a Solana node inside a JavaScript application, use the [solana-web3.js](https://github.com/solana-labs/solana-web3.js) library, which gives a convenient interface for the RPC methods.
RPC Endpoint
---
**Default port:** 8899
e.g. http://localhost:8899, http://192.168.1.88:8899
Methods
---
* [confirmTransaction](#confirmtransaction)
* [getAddress](#getaddress)
* [getBalance](#getbalance)
* [getLastId](#getlastid)
* [getTransactionCount](#gettransactioncount)
* [requestAirdrop](#requestairdrop)
* [sendTransaction](#sendtransaction)
Request Formatting
---
To make a JSON-RPC request, send an HTTP POST request with a `Content-Type: application/json` header. The JSON request data should contain 4 fields:
* `jsonrpc`, set to `"2.0"`
* `id`, a unique client-generated identifying integer
* `method`, a string containing the method to be invoked
* `params`, a JSON array of ordered parameter values
Example using curl:
```bash
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0", "id":1, "method":"getBalance", "params":["83astBRguLMdt2h5U1Tpdq5tjFoJ6noeGwaY3mDLVcri"]}' 192.168.1.88:8899
```
The response output will be a JSON object with the following fields:
* `jsonrpc`, matching the request specification
* `id`, matching the request identifier
* `result`, requested data or success confirmation
Requests can be sent in batches by sending an array of JSON-RPC request objects as the data for a single POST.
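
For example, this sketch batches two of the calls documented below into a
single POST:

```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '[{"jsonrpc":"2.0","id":1, "method":"getTransactionCount"}, {"jsonrpc":"2.0","id":2, "method":"getLastId"}]' http://localhost:8899
```

The response is a JSON array with one response object per request.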
Definitions
---
* Hash: A SHA-256 hash of a chunk of data.
* Pubkey: The public key of an Ed25519 key-pair.
* Signature: An Ed25519 signature of a chunk of data.
* Transaction: A Solana instruction signed by a client key-pair.
JSON RPC API Reference
---
### confirmTransaction
Returns a transaction receipt
##### Parameters:
* `string` - Signature of Transaction to confirm, as base-58 encoded string
##### Results:
* `boolean` - Transaction status, true if Transaction is confirmed
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0", "id":1, "method":"confirmTransaction", "params":["5VERv8NMvzbJMEkV8xnrLkEaWRtSz9CosKDYjCJjBRnbJLgp8uirBgmQpjKhoR4tjF3ZpRzrFmBV6UjKdiSZkQUW"]}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":true,"id":1}
```
---
### getBalance
Returns the balance of the account of provided Pubkey
##### Parameters:
* `string` - Pubkey of account to query, as base-58 encoded string
##### Results:
* `integer` - quantity, as a signed 64-bit integer
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0", "id":1, "method":"getBalance", "params":["83astBRguLMdt2h5U1Tpdq5tjFoJ6noeGwaY3mDLVcri"]}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":0,"id":1}
```
---
### getLastId
Returns the last entry ID from the ledger
##### Parameters:
None
##### Results:
* `string` - the ID of last entry, a Hash as base-58 encoded string
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1, "method":"getLastId"}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":"GH7ome3EiwEr7tu9JuTh2dpYWBJK3z69Xm1ZE3MEE6JC","id":1}
```
---
### getTransactionCount
Returns the current Transaction count from the ledger
##### Parameters:
None
##### Results:
* `integer` - count, as unsigned 64-bit integer
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1, "method":"getTransactionCount"}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":268,"id":1}
```
---
### requestAirdrop
Requests an airdrop of tokens to a Pubkey
##### Parameters:
* `string` - Pubkey of account to receive tokens, as base-58 encoded string
* `integer` - token quantity, as a signed 64-bit integer
##### Results:
* `string` - Transaction Signature of airdrop, as base-58 encoded string
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1, "method":"requestAirdrop", "params":["83astBRguLMdt2h5U1Tpdq5tjFoJ6noeGwaY3mDLVcri", 50]}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":"5VERv8NMvzbJMEkV8xnrLkEaWRtSz9CosKDYjCJjBRnbJLgp8uirBgmQpjKhoR4tjF3ZpRzrFmBV6UjKdiSZkQUW","id":1}
```
---
### sendTransaction
Creates a new transaction
##### Parameters:
* `array` - array of octets containing a fully-signed Transaction
##### Results:
* `string` - Transaction Signature, as base-58 encoded string
##### Example:
```bash
// Request
curl -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1, "method":"sendTransaction", "params":[[61, 98, 55, 49, 15, 187, 41, 215, 176, 49, 234, 229, 228, 77, 129, 221, 239, 88, 145, 227, 81, 158, 223, 123, 14, 229, 235, 247, 191, 115, 199, 71, 121, 17, 32, 67, 63, 209, 239, 160, 161, 2, 94, 105, 48, 159, 235, 235, 93, 98, 172, 97, 63, 197, 160, 164, 192, 20, 92, 111, 57, 145, 251, 6, 40, 240, 124, 194, 149, 155, 16, 138, 31, 113, 119, 101, 212, 128, 103, 78, 191, 80, 182, 234, 216, 21, 121, 243, 35, 100, 122, 68, 47, 57, 13, 39, 0, 0, 0, 0, 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 50, 0, 0, 0, 0, 0, 0, 0, 40, 240, 124, 194, 149, 155, 16, 138, 31, 113, 119, 101, 212, 128, 103, 78, 191, 80, 182, 234, 216, 21, 121, 243, 35, 100, 122, 68, 47, 57, 11, 12, 106, 49, 74, 226, 201, 16, 161, 192, 28, 84, 124, 97, 190, 201, 171, 186, 6, 18, 70, 142, 89, 185, 176, 154, 115, 61, 26, 163, 77, 1, 88, 98, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]}' http://localhost:8899
// Result
{"jsonrpc":"2.0","result":"2EBVM6cB8vAAD93Ktr6Vd8p67XPbQzCJX47MpReuiCXJAtcjaxpvWpcg9Ege1Nr5Tk3a2GFrByT7WPBjdsTycY9b","id":1}
```
---

doc/testnet.md (new file)

@@ -0,0 +1,44 @@
# TestNet debugging info
Currently we have two testnets, 'perf' and 'master', both on the master branch of the solana repo. Deploys happen
at the top of every hour with the latest code. 'perf' has more cores for the client machine to flood the network
with transactions until failure.
## Deploy process
They are deployed with the `ci/testnet-deploy.sh` script. A scheduled buildkite job runs the deploy;
look at `testnet-deploy` to see the agent which ran it and the logs. There is also a job for triggering the deploy manually.
Validators are selected based on their machine names, and every node gets its binaries installed from the Snap.
## Where are the testnet logs?
For the client they are put in `/tmp/solana`; for validators and leaders they are in `/var/snap/solana/current/`.
You can also see the backtrace of the client by ssh'ing into the client node and doing:
```bash
$ sudo -u testnet-deploy
$ tmux attach -t solana
```
## How do I reset the testnet?
Through buildkite.
## How can I scale the tx generation rate?
Increase the TX rate by increasing the number of cores on the client machine that is running
`bench-tps`, or run multiple clients. Decrease it by lowering the core count or by setting the
rayon environment variable `RAYON_NUM_THREADS=<xx>`.
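
For example, a client started by hand could be throttled like this (a sketch;
adjust the invocation to match how the client was launched):

```bash
RAYON_NUM_THREADS=4 multinode-demo/client.sh
```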
## How can I test a change on the testnet?
Currently, a merged PR is the only way to test a change on the testnet.
## Adjusting the number of clients or validators on the testnet
1. Go to the [GCP Instance Group](https://console.cloud.google.com/compute/instanceGroups/list?project=principal-lane-200702) tab
2. Find the client or validator instance group you'd like to adjust
3. Edit it (pencil icon), change the "Number of instances", then click the "Save" button
4. Refresh until the change to number of instances has been executed
5. Click the "New Build" button on the [testnet-deploy](https://buildkite.com/solana-labs/testnet-deploy/)
buildkite job to initiate a redeploy of the network with the updated instance count.

@@ -10,28 +10,30 @@ if [[ $(uname -m) != x86_64 ]]; then
   exit 1
 fi

+mkdir -p target/perf-libs
 (
-  set -x
-  curl -o solana-perf.tgz \
-    https://solana-perf.s3.amazonaws.com/master/x86_64-unknown-linux-gnu/solana-perf.tgz
-  tar zxvf solana-perf.tgz
-)
+  cd target/perf-libs
+  (
+    set -x
+    curl https://solana-perf.s3.amazonaws.com/v0.8.0/x86_64-unknown-linux-gnu/solana-perf.tgz | tar zxvf -
+  )

-if [[ -r /usr/local/cuda/version.txt && -r cuda-version.txt ]]; then
-  if ! diff /usr/local/cuda/version.txt cuda-version.txt > /dev/null; then
+  if [[ -r /usr/local/cuda/version.txt && -r cuda-version.txt ]]; then
+    if ! diff /usr/local/cuda/version.txt cuda-version.txt > /dev/null; then
+      echo ==============================================
+      echo Warning: possible CUDA version mismatch
+      echo
+      echo "Expected version: $(cat cuda-version.txt)"
+      echo "Detected version: $(cat /usr/local/cuda/version.txt)"
+      echo ==============================================
+    fi
+  else
     echo ==============================================
-    echo Warning: possible CUDA version mismatch
-    echo
-    echo "Expected version: $(cat cuda-version.txt)"
-    echo "Detected version: $(cat /usr/local/cuda/version.txt)"
+    echo Warning: unable to validate CUDA version
     echo ==============================================
   fi
-else
-  echo ==============================================
-  echo Warning: unable to validate CUDA version
-  echo ==============================================
-fi

-echo "Downloaded solana-perf version: $(cat solana-perf-HEAD.txt)"
+  echo "Downloaded solana-perf version: $(cat solana-perf-HEAD.txt)"
+)
 exit 0

@@ -1,33 +1,25 @@
-#!/bin/bash
-#
-# usage: $0 <rsync network path to solana repo on leader machine> <number of nodes in the network>"
-#
+#!/bin/bash -e

 here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
 source "$here"/common.sh

-leader=$1
-if [[ -z $leader ]]; then
-  if [[ -d "$SNAP" ]]; then
-    leader=testnet.solana.com # Default to testnet when running as a Snap
-  else
-    leader=$here/.. # Default to local solana repo
-  fi
-fi
-count=${2:-1}
-
-rsync_leader_url=$(rsync_url "$leader")
-
-set -ex
-mkdir -p "$SOLANA_CONFIG_CLIENT_DIR"
-$rsync -vPz "$rsync_leader_url"/config/leader.json "$SOLANA_CONFIG_CLIENT_DIR"/
-
-client_json="$SOLANA_CONFIG_CLIENT_DIR"/client.json
-[[ -r $client_json ]] || $solana_keygen -o "$client_json"
-
-$solana_client_demo \
-  -n "$count" \
-  -l "$SOLANA_CONFIG_CLIENT_DIR"/leader.json \
-  -k "$SOLANA_CONFIG_CLIENT_DIR"/client.json \
+usage() {
+  if [[ -n $1 ]]; then
+    echo "$*"
+    echo
+  fi
+  echo "usage: $0 [extra args]"
+  echo
+  echo " Run bench-tps "
+  echo
+  echo "   extra args: additional arguments are pass along to solana-bench-tps"
+  echo
+  exit 1
+}
+
+if [[ -z $1 ]]; then # default behavior
+  $solana_bench_tps --identity config-private/client-id.json --network 127.0.0.1:8001 --duration 90
+else
+  $solana_bench_tps "$@"
+fi

@@ -1,23 +1,34 @@
 # |source| this file
 #
-# Disable complaints about unused variables in this file:
+# Common utilities shared by other scripts in this directory
+#
+# The following directive disable complaints about unused variables in this
+# file:
 # shellcheck disable=2034
-#
-# shellcheck disable=2154 # 'here' is referenced but not assigned
-if [[ -z $here ]]; then
-  echo "|here| is not defined"
-  exit 1
-fi

 rsync=rsync
 leader_logger="cat"
 validator_logger="cat"
 drone_logger="cat"

-if [[ -d "$SNAP" ]]; then # Running inside a Linux Snap?
+if [[ $(uname) != Linux ]]; then
+  # Protect against unsupported configurations to prevent non-obvious errors
+  # later. Arguably these should be fatal errors but for now prefer tolerance.
+  if [[ -n $USE_SNAP ]]; then
+    echo "Warning: Snap is not supported on $(uname)"
+    USE_SNAP=
+  fi
+  if [[ -n $SOLANA_CUDA ]]; then
+    echo "Warning: CUDA is not supported on $(uname)"
+    SOLANA_CUDA=
+  fi
+fi
+
+if [[ -d $SNAP ]]; then # Running inside a Linux Snap?
   solana_program() {
     declare program="$1"
-    if [[ "$program" = wallet || "$program" = client-demo ]]; then
+    if [[ "$program" = wallet || "$program" = bench-tps ]]; then
       # TODO: Merge wallet.sh/client.sh functionality into
       #       solana-wallet/solana-demo-client proper and remove this special case
       printf "%s/bin/solana-%s" "$SNAP" "$program"
@@ -26,7 +37,7 @@ if [[ -d "$SNAP" ]]; then # Running inside a Linux Snap?
     fi
   }
   rsync="$SNAP"/bin/rsync
-  multilog="$SNAP/bin/multilog t s16777215"
+  multilog="$SNAP/bin/multilog t s16777215 n200"
   leader_logger="$multilog $SNAP_DATA/leader"
   validator_logger="$multilog t $SNAP_DATA/validator"
   drone_logger="$multilog $SNAP_DATA/drone"
@@ -34,17 +45,12 @@ if [[ -d "$SNAP" ]]; then # Running inside a Linux Snap?
   #       0700
   mkdir -p "$SNAP_DATA"/{drone,leader,validator}

-  SOLANA_METRICS_CONFIG="$(snapctl get metrics-config)"
-  SOLANA_DEFAULT_METRICS_RATE="$(snapctl get default-metrics-rate)"
-  SOLANA_CUDA="$(snapctl get enable-cuda)"
-  RUST_LOG="$(snapctl get rust-log)"
-
-elif [[ -n "$USE_SNAP" ]]; then # Use the Linux Snap binaries
+elif [[ -n $USE_SNAP ]]; then # Use the Linux Snap binaries
   solana_program() {
     declare program="$1"
     printf "solana.%s" "$program"
   }
-elif [[ -n "$USE_INSTALL" ]]; then # Assume |cargo install| was run
+elif [[ -n $USE_INSTALL ]]; then # Assume |cargo install| was run
   solana_program() {
     declare program="$1"
     printf "solana-%s" "$program"
@@ -59,19 +65,25 @@ else
       program=${BASH_REMATCH[1]}
       features="--features=cuda"
     fi
-    if [[ -z "$DEBUG" ]]; then
+    if [[ -z $DEBUG ]]; then
       maybe_release=--release
     fi
     printf "cargo run $maybe_release --bin solana-%s %s -- " "$program" "$features"
   }
   if [[ -n $SOLANA_CUDA ]]; then
+    # shellcheck disable=2154 # 'here' is referenced but not assigned
+    if [[ -z $here ]]; then
+      echo "|here| is not defined"
+      exit 1
+    fi
+
     # Locate perf libs downloaded by |./fetch-perf-libs.sh|
-    LD_LIBRARY_PATH=$(cd "$here" && dirname "$PWD"):$LD_LIBRARY_PATH
+    LD_LIBRARY_PATH=$(cd "$here" && dirname "$PWD"/target/perf-libs):$LD_LIBRARY_PATH
     export LD_LIBRARY_PATH
   fi
 fi

-solana_client_demo=$(solana_program client-demo)
+solana_bench_tps=$(solana_program bench-tps)
 solana_wallet=$(solana_program wallet)
 solana_drone=$(solana_program drone)
 solana_fullnode=$(solana_program fullnode)
@@ -79,56 +91,18 @@ solana_fullnode_config=$(solana_program fullnode-config)
 solana_fullnode_cuda=$(solana_program fullnode-cuda)
 solana_genesis=$(solana_program genesis)
 solana_keygen=$(solana_program keygen)
+solana_ledger_tool=$(solana_program ledger-tool)

 export RUST_LOG=${RUST_LOG:-solana=info} # if RUST_LOG is unset, default to info
 export RUST_BACKTRACE=1

-# The SOLANA_METRICS_CONFIG environment variable is formatted as a
-# comma-delimited list of parameters. All parameters are optional.
-#
-# Example:
-#   export SOLANA_METRICS_CONFIG="host=<metrics host>,db=<database name>,u=<username>,p=<password>"
-#
-configure_metrics() {
-  [[ -n $SOLANA_METRICS_CONFIG ]] || return
-
-  declare metrics_params
-  IFS=',' read -r -a metrics_params <<< "$SOLANA_METRICS_CONFIG"
-  for param in "${metrics_params[@]}"; do
-    IFS='=' read -r -a pair <<< "$param"
-    if [[ "${#pair[@]}" != 2 ]]; then
-      echo Error: invalid metrics parameter: "$param" >&2
-    else
-      declare name="${pair[0]}"
-      declare value="${pair[1]}"
-      case "$name" in
-      host)
-        export INFLUX_HOST="$value"
-        echo INFLUX_HOST="$INFLUX_HOST" >&2
-        ;;
-      db)
-        export INFLUX_DATABASE="$value"
-        echo INFLUX_DATABASE="$INFLUX_DATABASE" >&2
-        ;;
-      u)
-        export INFLUX_USERNAME="$value"
-        echo INFLUX_USERNAME="$INFLUX_USERNAME" >&2
-        ;;
-      p)
-        export INFLUX_PASSWORD="$value"
-        echo INFLUX_PASSWORD="********" >&2
-        ;;
-      *)
-        echo Error: Unknown metrics parameter name: "$name" >&2
-        ;;
-      esac
-    fi
-  done
-}
-configure_metrics
+# shellcheck source=scripts/configure-metrics.sh
+source "$(dirname "${BASH_SOURCE[0]}")"/../scripts/configure-metrics.sh

 tune_networking() {
+  # Skip in CI
+  [[ -z $CI ]] || return 0
+
   # Reference: https://medium.com/@CameronSparr/increase-os-udp-buffers-to-improve-performance-51d167bb1360
   if [[ $(uname) = Linux ]]; then
     (
@@ -136,33 +110,93 @@ tune_networking() {
      # test the existence of the sysctls before trying to set them
      # go ahead and return true and don't exit if these calls fail
      sysctl net.core.rmem_max 2>/dev/null 1>/dev/null &&
-          sudo sysctl -w net.core.rmem_max=26214400 1>/dev/null 2>/dev/null
+          sudo sysctl -w net.core.rmem_max=67108864 1>/dev/null 2>/dev/null

      sysctl net.core.rmem_default 2>/dev/null 1>/dev/null &&
          sudo sysctl -w net.core.rmem_default=26214400 1>/dev/null 2>/dev/null
     ) || true
   fi

+  if [[ $(uname) = Darwin ]]; then
+    (
+      if [[ $(sysctl net.inet.udp.maxdgram | cut -d\  -f2) != 65535 ]]; then
+        echo "Adjusting maxdgram to allow for large UDP packets, see BLOB_SIZE in src/packet.rs:"
+        set -x
+        sudo sysctl net.inet.udp.maxdgram=65535
+      fi
+    )
+  fi
 }

 SOLANA_CONFIG_DIR=${SNAP_DATA:-$PWD}/config
 SOLANA_CONFIG_PRIVATE_DIR=${SNAP_DATA:-$PWD}/config-private
-SOLANA_CONFIG_VALIDATOR_DIR=${SNAP_DATA:-$PWD}/config-validator
 SOLANA_CONFIG_CLIENT_DIR=${SNAP_USER_DATA:-$PWD}/config-client

 rsync_url() { # adds the 'rsync://` prefix to URLs that need it
   declare url="$1"

-  if [[ "$url" =~ ^.*:.*$ ]]; then
+  if [[ $url =~ ^.*:.*$ ]]; then
     # assume remote-shell transport when colon is present, use $url unmodified
     echo "$url"
-    return
+    return 0
   fi

-  if [[ -d "$url" ]]; then
+  if [[ -d $url ]]; then
     # assume local directory if $url is a valid directory, use $url unmodified
     echo "$url"
-    return
+    return 0
   fi

   # Default to rsync:// URL
   echo "rsync://$url"
 }
+
+# called from drone, validator, client
+find_leader() {
+  declare leader leader_address
+  declare shift=0
+
+  if [[ -d $SNAP ]]; then
+    if [[ -n $1 ]]; then
+      usage "Error: unexpected parameter: $1"
+    fi
+
+    # Select leader from the Snap configuration
+    leader_ip=$(snapctl get leader-ip)
+    if [[ -z $leader_ip ]]; then
+      leader=testnet.solana.com
+      leader_ip=$(dig +short "${leader%:*}" | head -n1)
+      if [[ -z $leader_ip ]]; then
+        usage "Error: unable to resolve IP address for $leader"
+      fi
+    fi
+    leader=$leader_ip
+    leader_address=$leader_ip:8001
+  else
+    if [[ -z $1 ]]; then
+      leader=${here}/..             # Default to local tree for rsync
+      leader_address=127.0.0.1:8001 # Default to local leader
+    elif [[ -z $2 ]]; then
+      leader=$1
+
+      declare leader_ip
+      leader_ip=$(dig +short "${leader%:*}" | head -n1)
+
+      if [[ -z $leader_ip ]]; then
+        usage "Error: unable to resolve IP address for $leader"
+      fi
+
+      leader_address=$leader_ip:8001
+      shift=1
+    else
+      leader=$1
+      leader_address=$2
+      shift=2
+    fi
+  fi
+
+  echo "$leader" "$leader_address" "$shift"
+}
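
Callers split find_leader's three-token output (leader, leader address,
argument shift count) with read, as the updated drone.sh below does:

```bash
read -r _ leader_address shift < <(find_leader "${@:1:1}")
shift "$shift"
```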

@@ -1,28 +1,26 @@
 #!/bin/bash
 #
-# usage: $0 <rsync network path to solana repo on leader machine>
+# Starts an instance of solana-drone
 #

 here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
 source "$here"/common.sh

-SOLANA_CONFIG_DIR="$SOLANA_CONFIG_DIR"-drone
-
-if [[ -d "$SNAP" ]]; then
-  # Exit if mode is not yet configured
-  # (typically the case after the Snap is first installed)
-  [[ -n "$(snapctl get mode)" ]] || exit 0
-
-  # Select leader from the Snap configuration
-  leader_address="$(snapctl get leader-address)"
-  if [[ -z "$leader_address" ]]; then
-    # Assume drone is running on the same node as the leader by default
-    leader_address="localhost"
-  fi
-  leader="$leader_address"
-else
-  leader=${1:-${here}/..} # Default to local solana repo
-fi
+usage() {
+  if [[ -n $1 ]]; then
+    echo "$*"
+    echo
+  fi
+  echo "usage: $0 [network entry point]"
+  echo
+  echo " Run an airdrop drone for the specified network"
+  echo
+  exit 1
+}
+
+read -r _ leader_address shift < <(find_leader "${@:1:1}")
+shift "$shift"

 [[ -f "$SOLANA_CONFIG_PRIVATE_DIR"/mint.json ]] || {
   echo "$SOLANA_CONFIG_PRIVATE_DIR/mint.json not found, create it by running:"
@@ -31,12 +29,12 @@ fi
   exit 1
 }

-rsync_leader_url=$(rsync_url "$leader")
-
 set -ex
-mkdir -p "$SOLANA_CONFIG_DIR"
-$rsync -vPz "$rsync_leader_url"/config/leader.json "$SOLANA_CONFIG_DIR"/

-set -o pipefail
+trap 'kill "$pid" && wait "$pid"' INT TERM
 $solana_drone \
-  -l "$SOLANA_CONFIG_DIR"/leader.json -k "$SOLANA_CONFIG_PRIVATE_DIR"/mint.json \
-  2>&1 | $drone_logger
+  --keypair "$SOLANA_CONFIG_PRIVATE_DIR"/mint.json \
+  --network "$leader_address" \
+  > >($drone_logger) 2>&1 &
+pid=$!
+wait "$pid"

@@ -1,80 +0,0 @@
#!/bin/bash
command=$1
prefix=
num_nodes=
out_file=
image_name="ubuntu-16-04-cuda-9-2-new"
shift
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 <create|delete> <-p prefix> <-n num_nodes> <-o file> [-i image-name]
Manage a GCE multinode network
create|delete - Create or delete the network
-p prefix - A common prefix for node names, to avoid collision
-n num_nodes - Number of nodes
-o out_file - Used for create option. Outputs an array of IP addresses
of new nodes to the file
-i image_name - Existing image on GCE (default $image_name)
EOF
exit $exitcode
}
while getopts "h?p:i:n:o:" opt; do
case $opt in
h | \?)
usage
;;
p)
prefix=$OPTARG
;;
i)
image_name=$OPTARG
;;
o)
out_file=$OPTARG
;;
n)
num_nodes=$OPTARG
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
set -e
[[ -n $command ]] || usage "Need a command (create|delete)"
[[ -n $prefix ]] || usage "Need a prefix for GCE instance names"
[[ -n $num_nodes ]] || usage "Need number of nodes"
nodes=()
for i in $(seq 1 "$num_nodes"); do
nodes+=("$prefix$i")
done
if [[ $command == "create" ]]; then
[[ -n $out_file ]] || usage "Need an outfile to store IP Addresses"
ip_addr_list=$(gcloud beta compute instances create "${nodes[@]}" --zone=us-west1-b --tags=testnet \
--image="$image_name" | awk '/RUNNING/ {print $5}')
echo "ip_addr_array=($ip_addr_list)" >"$out_file"
elif [[ $command == "delete" ]]; then
gcloud beta compute instances delete "${nodes[@]}"
else
usage "Unknown command: $command"
fi

multinode-demo/leader.sh

@@ -1,9 +1,15 @@
 #!/bin/bash
+#
+# Starts a leader node
+#
 
 here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
 source "$here"/common.sh
+# shellcheck source=scripts/oom-score-adj.sh
+source "$here"/../scripts/oom-score-adj.sh
 
 if [[ -d "$SNAP" ]]; then
   # Exit if mode is not yet configured
   # (typically the case after the Snap is first installed)
@@ -13,7 +19,7 @@ fi
 [[ -f "$SOLANA_CONFIG_DIR"/leader.json ]] || {
   echo "$SOLANA_CONFIG_DIR/leader.json not found, create it by running:"
   echo
-  echo "  ${here}/setup.sh -t leader"
+  echo "  ${here}/setup.sh"
   exit 1
 }
@@ -25,8 +31,11 @@ fi
 tune_networking
 
-set -xo pipefail
+trap 'kill "$pid" && wait "$pid"' INT TERM
 $program \
   --identity "$SOLANA_CONFIG_DIR"/leader.json \
-  --ledger "$SOLANA_CONFIG_DIR"/ledger.log \
-  2>&1 | $leader_logger
+  --ledger "$SOLANA_CONFIG_DIR"/ledger \
+  > >($leader_logger) 2>&1 &
+pid=$!
+oom_score_adj "$pid" 1000
+wait "$pid"

multinode-demo/remote_leader.sh

@ -1,14 +0,0 @@
#!/bin/bash -e
[[ -n $FORCE ]] || exit
chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_rsa
PATH="$HOME"/.cargo/bin:"$PATH"
./fetch-perf-libs.sh
# Run setup
USE_INSTALL=1 ./multinode-demo/setup.sh -p
USE_INSTALL=1 SOLANA_CUDA=1 ./multinode-demo/leader.sh >leader.log 2>&1 &
USE_INSTALL=1 ./multinode-demo/drone.sh >drone.log 2>&1 &


@ -1,185 +0,0 @@
#!/bin/bash
command=$1
ip_addr_file=
remote_user=
ssh_keys=
shift
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 <start|stop> <-f IP Addr Array file> <-u username> [-k ssh-keys]
Manage a GCE multinode network
start|stop - Create or delete the network
-f file - A bash script that exports an array of IP addresses, ip_addr_array.
Elements of the array are public IP address of remote nodes.
-u username - The username for logging into remote nodes.
-k ssh-keys - Path to public/private key pair that remote nodes can use to perform
rsync and ssh among themselves. Must contain pub, and priv keys.
EOF
exit $exitcode
}
while getopts "h?f:u:k:" opt; do
case $opt in
h | \?)
usage
;;
f)
ip_addr_file=$OPTARG
;;
u)
remote_user=$OPTARG
;;
k)
ssh_keys=$OPTARG
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
set -e
# Sample IP Address array file contents
# ip_addr_array=(192.168.1.1 192.168.1.5 192.168.2.2)
[[ -n $command ]] || usage "Need a command (start|stop)"
[[ -n $ip_addr_file ]] || usage "Need a file with IP address array"
[[ -n $remote_user ]] || usage "Need the username for remote nodes"
ip_addr_array=()
# Get IP address array
# shellcheck source=/dev/null
source "$ip_addr_file"
build_project() {
echo "Build started at $(date)"
SECONDS=0
# Build and install locally
PATH="$HOME"/.cargo/bin:"$PATH"
cargo install --force
echo "Build took $SECONDS seconds"
}
common_start_setup() {
ip_addr=$1
# Killing sshguard for now. TODO: Find a better solution
# sshguard is blacklisting IP address after ssh-keyscan and ssh login attempts
ssh "$remote_user@$ip_addr" " \
set -ex; \
sudo service sshguard stop; \
sudo apt-get --assume-yes install rsync libssl-dev; \
mkdir -p ~/.ssh ~/solana ~/.cargo/bin; \
" >log/"$ip_addr".log
# If provided, deploy SSH keys
if [[ -n $ssh_keys ]]; then
{
rsync -vPrz "$ssh_keys"/id_rsa "$remote_user@$ip_addr":~/.ssh/
rsync -vPrz "$ssh_keys"/id_rsa.pub "$remote_user@$ip_addr":~/.ssh/
rsync -vPrz "$ssh_keys"/id_rsa.pub "$remote_user@$ip_addr":~/.ssh/authorized_keys
rsync -vPrz ./multinode-demo "$remote_user@$ip_addr":~/solana/
} >>log/"$ip_addr".log
fi
}
start_leader() {
common_start_setup "$1"
{
rsync -vPrz ~/.cargo/bin/solana* "$remote_user@$ip_addr":~/.cargo/bin/
rsync -vPrz ./fetch-perf-libs.sh "$remote_user@$ip_addr":~/solana/
ssh -n -f "$remote_user@$ip_addr" 'cd solana; FORCE=1 ./multinode-demo/remote_leader.sh'
} >>log/"$1".log
leader_ip=$1
leader_time=$SECONDS
SECONDS=0
}
start_validator() {
common_start_setup "$1"
ssh -n -f "$remote_user@$ip_addr" "cd solana; FORCE=1 ./multinode-demo/remote_validator.sh $leader_ip" >>log/"$1".log
}
start_all_nodes() {
echo "Deployment started at $(date)"
SECONDS=0
count=0
leader_ip=
leader_time=
mkdir -p log
for ip_addr in "${ip_addr_array[@]}"; do
if ((!count)); then
# Start the leader on the first node
echo "Leader node $ip_addr, killing previous instance and restarting"
start_leader "$ip_addr"
else
# Start validator on all other nodes
echo "Validator[$count] node $ip_addr, killing previous instance and restarting"
start_validator "$ip_addr" &
# TBD: Remove the sleep or reduce time once GCP login quota is increased
sleep 2
fi
((count = count + 1))
done
wait
((validator_count = count - 1))
echo "Deployment finished at $(date)"
echo "Leader deployment too $leader_time seconds"
echo "$validator_count Validator deployment took $SECONDS seconds"
}
stop_all_nodes() {
SECONDS=0
local count=0
for ip_addr in "${ip_addr_array[@]}"; do
ssh-keygen -R "$ip_addr" >log/local.log
ssh-keyscan "$ip_addr" >>~/.ssh/known_hosts 2>/dev/null
echo "Stopping node[$count] $ip_addr. Remote user $remote_user"
ssh -n -f "$remote_user@$ip_addr" " \
set -ex; \
sudo service sshguard stop; \
pkill -9 solana-; \
pkill -9 validator; \
pkill -9 leader; \
"
sleep 2
echo "Stopped node[$count] $ip_addr"
((count = count + 1))
done
echo "Stopping $count nodes took $SECONDS seconds"
}
if [[ $command == "start" ]]; then
build_project
stop_all_nodes
start_all_nodes
elif [[ $command == "stop" ]]; then
stop_all_nodes
else
usage "Unknown command: $command"
fi

multinode-demo/remote_validator.sh

@ -1,17 +0,0 @@
#!/bin/bash -e
[[ -n $FORCE ]] || exit
chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_rsa
PATH="$HOME"/.cargo/bin:"$PATH"
touch ~/.ssh/known_hosts
ssh-keygen -R "$1" 2>/dev/null
ssh-keyscan "$1" >>~/.ssh/known_hosts 2>/dev/null
rsync -vPrz "$1":~/.cargo/bin/solana* ~/.cargo/bin/
# Run setup
USE_INSTALL=1 ./multinode-demo/setup.sh -p
USE_INSTALL=1 ./multinode-demo/validator.sh "$1":~/solana "$1" >validator.log 2>&1

multinode-demo/setup.sh

@@ -1,4 +1,7 @@
 #!/bin/bash
+#
+# Creates a fullnode configuration
+#
 
 here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
@@ -31,6 +34,7 @@ ip_address_arg=-l
 num_tokens=1000000000
 node_type_leader=true
 node_type_validator=true
+node_type_client=true
 
 while getopts "h?n:lpt:" opt; do
   case $opt in
   h|\?)
@@ -52,10 +56,17 @@ while getopts "h?n:lpt:" opt; do
     leader)
       node_type_leader=true
       node_type_validator=false
+      node_type_client=false
       ;;
     validator)
       node_type_leader=false
       node_type_validator=true
+      node_type_client=false
+      ;;
+    client)
+      node_type_leader=false
+      node_type_validator=false
+      node_type_client=true
       ;;
     *)
       usage "Error: unknown node type: $node_type"
@@ -69,42 +80,49 @@ while getopts "h?n:lpt:" opt; do
 done
 
-leader_address_args=("$ip_address_arg")
-validator_address_args=("$ip_address_arg" -b 9000)
-leader_id_path="$SOLANA_CONFIG_PRIVATE_DIR"/leader-id.json
-validator_id_path="$SOLANA_CONFIG_PRIVATE_DIR"/validator-id.json
-mint_path="$SOLANA_CONFIG_PRIVATE_DIR"/mint.json
-
 set -e
 
-echo "Cleaning $SOLANA_CONFIG_DIR"
-rm -rvf "$SOLANA_CONFIG_DIR"
-mkdir -p "$SOLANA_CONFIG_DIR"
+for i in "$SOLANA_CONFIG_DIR" "$SOLANA_CONFIG_VALIDATOR_DIR" "$SOLANA_CONFIG_PRIVATE_DIR"; do
+  echo "Cleaning $i"
+  rm -rvf "$i"
+  mkdir -p "$i"
+done
 
-rm -rvf "$SOLANA_CONFIG_PRIVATE_DIR"
-mkdir -p "$SOLANA_CONFIG_PRIVATE_DIR"
-
-$solana_keygen -o "$leader_id_path"
-$solana_keygen -o "$validator_id_path"
+if $node_type_client; then
+  client_id_path="$SOLANA_CONFIG_PRIVATE_DIR"/client-id.json
+  $solana_keygen -o "$client_id_path"
+  ls -lhR "$SOLANA_CONFIG_PRIVATE_DIR"/
+fi
 
 if $node_type_leader; then
+  leader_address_args=("$ip_address_arg")
+  leader_id_path="$SOLANA_CONFIG_PRIVATE_DIR"/leader-id.json
+  mint_path="$SOLANA_CONFIG_PRIVATE_DIR"/mint.json
+
+  $solana_keygen -o "$leader_id_path"
+
   echo "Creating $mint_path with $num_tokens tokens"
   $solana_keygen -o "$mint_path"
 
-  echo "Creating $SOLANA_CONFIG_DIR/ledger.log"
-  $solana_genesis --tokens="$num_tokens" < "$mint_path" > "$SOLANA_CONFIG_DIR"/ledger.log
+  echo "Creating $SOLANA_CONFIG_DIR/ledger"
+  $solana_genesis --tokens="$num_tokens" --ledger "$SOLANA_CONFIG_DIR"/ledger < "$mint_path"
 
   echo "Creating $SOLANA_CONFIG_DIR/leader.json"
   $solana_fullnode_config --keypair="$leader_id_path" "${leader_address_args[@]}" > "$SOLANA_CONFIG_DIR"/leader.json
+
+  ls -lhR "$SOLANA_CONFIG_DIR"/
+  ls -lhR "$SOLANA_CONFIG_PRIVATE_DIR"/
 fi
 
 if $node_type_validator; then
-  echo "Creating $SOLANA_CONFIG_DIR/validator.json"
-  $solana_fullnode_config --keypair="$validator_id_path" "${validator_address_args[@]}" > "$SOLANA_CONFIG_DIR"/validator.json
-fi
+  validator_address_args=("$ip_address_arg" -b 9000)
+  validator_id_path="$SOLANA_CONFIG_PRIVATE_DIR"/validator-id.json
 
-ls -lh "$SOLANA_CONFIG_DIR"/
-if $node_type_leader; then
-  ls -lh "$SOLANA_CONFIG_PRIVATE_DIR"
+  $solana_keygen -o "$validator_id_path"
+
+  echo "Creating $SOLANA_CONFIG_VALIDATOR_DIR/validator.json"
+  $solana_fullnode_config --keypair="$validator_id_path" "${validator_address_args[@]}" > "$SOLANA_CONFIG_VALIDATOR_DIR"/validator.json
+
+  ls -lhR "$SOLANA_CONFIG_VALIDATOR_DIR"/
 fi

multinode-demo/validator-x.sh Executable file

@ -0,0 +1,8 @@
#!/bin/bash
#
# Start a dynamically-configured validator node
#
here=$(dirname "$0")
exec "$here"/validator.sh -x "$@"

multinode-demo/validator.sh

@@ -1,96 +1,100 @@
 #!/bin/bash
+#
+# Start a validator node
+#
 
 here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
 source "$here"/common.sh
-
-usage() {
-  if [[ -n "$1" ]]; then
-    echo "$*"
-    echo
-  fi
-  echo "usage: $0 [rsync network path to solana repo on leader machine] [network ip address of leader]"
-  exit 1
-}
-
-if [[ "$1" = "-h" || -n "$3" ]]; then
-  usage
-fi
+# shellcheck source=scripts/oom-score-adj.sh
+source "$here"/../scripts/oom-score-adj.sh
 
 if [[ -d "$SNAP" ]]; then
   # Exit if mode is not yet configured
   # (typically the case after the Snap is first installed)
   [[ -n "$(snapctl get mode)" ]] || exit 0
-
-  # Select leader from the Snap configuration
-  leader_address="$(snapctl get leader-address)"
-  if [[ -z "$leader_address" ]]; then
-    # Assume public testnet by default
-    leader_address=35.230.65.68 # testnet.solana.com
-  fi
-  leader="$leader_address"
-else
-  if [[ -n "$3" ]]; then
-    usage
-  fi
-  if [[ -z "$1" ]]; then
-    leader=${1:-${here}/..}        # Default to local solana repo
-    leader_address=${2:-127.0.0.1} # Default to local leader
-  elif [[ -z "$2" ]]; then
-    leader="$1"
-    leader_address=$(dig +short "$1" | head -n1)
-    if [[ -z "$leader_address" ]]; then
-      usage "Error: unable to resolve IP address for $leader"
-    fi
-  else
-    leader="$1"
-    leader_address="$2"
-  fi
 fi
-leader_port=8001
 
-if [[ -n "$SOLANA_CUDA" ]]; then
-  program="$solana_fullnode_cuda"
-else
-  program="$solana_fullnode"
-fi
-
-[[ -f "$SOLANA_CONFIG_DIR"/validator.json ]] || {
-  echo "$SOLANA_CONFIG_DIR/validator.json not found, create it by running:"
+usage() {
+  if [[ -n $1 ]]; then
+    echo "$*"
+    echo
+  fi
+  echo "usage: $0 [-x] [rsync network path to leader] [network entry point]"
+  echo
+  echo " Start a validator on the specified network"
   echo
-  echo "  ${here}/setup.sh -t validator"
+  echo " -x: runs a new, dynamically-configured validator"
+  echo
   exit 1
 }
 
+if [[ $1 = -h ]]; then
+  usage
+fi
+
+if [[ $1 == -x ]]; then
+  self_setup=1
+  shift
+else
+  self_setup=0
+fi
+
+if [[ -n $3 ]]; then
+  usage
+fi
+
+read -r leader leader_address shift < <(find_leader "${@:1:2}")
+shift "$shift"
+
+if [[ -n $SOLANA_CUDA ]]; then
+  program=$solana_fullnode_cuda
+else
+  program=$solana_fullnode
+fi
+
+if ((!self_setup)); then
+  [[ -f $SOLANA_CONFIG_VALIDATOR_DIR/validator.json ]] || {
+    echo "$SOLANA_CONFIG_VALIDATOR_DIR/validator.json not found, create it by running:"
+    echo
+    echo "  ${here}/setup.sh"
+    exit 1
+  }
+  validator_json_path=$SOLANA_CONFIG_VALIDATOR_DIR/validator.json
+  SOLANA_LEADER_CONFIG_DIR=$SOLANA_CONFIG_VALIDATOR_DIR/leader-config
+else
+  mkdir -p "$SOLANA_CONFIG_PRIVATE_DIR"
+  validator_id_path=$SOLANA_CONFIG_PRIVATE_DIR/validator-id-x$$.json
+  $solana_keygen -o "$validator_id_path"
+
+  mkdir -p "$SOLANA_CONFIG_VALIDATOR_DIR"
+  validator_json_path=$SOLANA_CONFIG_VALIDATOR_DIR/validator-x$$.json
+
+  port=9000
+  (((port += ($$ % 1000)) && (port == 9000) && port++))
+
+  $solana_fullnode_config --keypair="$validator_id_path" -l -b "$port" > "$validator_json_path"
+
+  SOLANA_LEADER_CONFIG_DIR=$SOLANA_CONFIG_VALIDATOR_DIR/leader-config-x$$
+fi
+
 rsync_leader_url=$(rsync_url "$leader")
 
 tune_networking
 
-SOLANA_LEADER_CONFIG_DIR="$SOLANA_CONFIG_DIR"/leader-config
-rm -rf "$SOLANA_LEADER_CONFIG_DIR"
 set -ex
-$rsync -vPrz "$rsync_leader_url"/config/ "$SOLANA_LEADER_CONFIG_DIR"
-
-# migrate from old ledger format? why not...
-if [[ ! -f "$SOLANA_LEADER_CONFIG_DIR"/ledger.log &&
-      -f "$SOLANA_LEADER_CONFIG_DIR"/genesis.log ]]; then
-  (shopt -s nullglob &&
-    cat "$SOLANA_LEADER_CONFIG_DIR"/genesis.log \
-      "$SOLANA_LEADER_CONFIG_DIR"/tx-*.log) > "$SOLANA_LEADER_CONFIG_DIR"/ledger.log
-fi
-
-# Ensure the validator has at least 1 token before connecting to the network
-# TODO: Remove this workaround
-while ! $solana_wallet \
-          -l "$SOLANA_LEADER_CONFIG_DIR"/leader.json \
-          -k "$SOLANA_CONFIG_PRIVATE_DIR"/validator-id.json airdrop --tokens 1; do
-  sleep 1
-done
-
-set -o pipefail
+$rsync -vPr "$rsync_leader_url"/config/ "$SOLANA_LEADER_CONFIG_DIR"
+[[ -d $SOLANA_LEADER_CONFIG_DIR/ledger ]] || {
+  echo "Unable to retrieve ledger from $rsync_leader_url"
+  exit 1
+}
+
+trap 'kill "$pid" && wait "$pid"' INT TERM
 $program \
-  --identity "$SOLANA_CONFIG_DIR"/validator.json \
-  --testnet "$leader_address:$leader_port" \
-  --ledger "$SOLANA_LEADER_CONFIG_DIR"/ledger.log \
-  2>&1 | $validator_logger
+  --identity "$validator_json_path" \
+  --network "$leader_address" \
+  --ledger "$SOLANA_LEADER_CONFIG_DIR"/ledger \
+  > >($validator_logger) 2>&1 &
+pid=$!
+oom_score_adj "$pid" 1000
+wait "$pid"

multinode-demo/wallet.sh

@@ -1,5 +1,7 @@
 #!/bin/bash
 #
+# Runs solana-wallet against the specified network
+#
 # usage: $0 <rsync network path to solana repo on leader machine>"
 #
@@ -7,6 +9,9 @@ here=$(dirname "$0")
 # shellcheck source=multinode-demo/common.sh
 source "$here"/common.sh
+# shellcheck source=scripts/oom-score-adj.sh
+source "$here"/../scripts/oom-score-adj.sh
 
 # if $1 isn't host:path, something.com, or a valid local path
 if [[ ${1%:} != "$1" || "$1" =~ [^.]\.[^.] || -d $1 ]]; then
   leader=$1 # interpret
@@ -42,4 +47,4 @@ fi
 # shellcheck disable=SC2086 # $solana_wallet should not be quoted
 exec $solana_wallet \
-  -l "$SOLANA_CONFIG_CLIENT_DIR"/leader.json -k "$client_id_path" "$@"
+  -l "$SOLANA_CONFIG_CLIENT_DIR"/leader.json -k "$client_id_path" --timeout 10 "$@"

net/.gitignore Normal file

@ -0,0 +1,2 @@
/config/
/log/

net/README.md Normal file

@ -0,0 +1,66 @@
# Network Management
This directory contains scripts useful for working with a test network. It's
intended to be both dev and CD friendly.
### User Account Prerequisites
Log in to GCP with:
```bash
$ gcloud auth login
```
Also ensure that `$(whoami)` is the name of an InfluxDB user account with enough
access to create a new database.
## Quick Start
```bash
$ cd net/
$ ./gce.sh create -n 5 -c 1 #<-- Create a GCE testnet with 5 validators, 1 client (billing starts here)
$ ./init-metrics.sh $(whoami) #<-- Configure a metrics database for the testnet
$ ./net.sh start #<-- Deploy the network from the local workspace
$ ./ssh.sh #<-- Details on how to ssh into any testnet node
$ ./gce.sh delete #<-- Dispose of the network (billing stops here)
```
## Tips
### Running the network over public IP addresses
By default private IP addresses are used with all instances in the same
availability zone to avoid GCE network egress charges. However, to run the
network over public IP addresses:
```bash
$ ./gce.sh create -P ...
```
### Deploying a Snap-based network
To deploy the latest pre-built `edge` channel Snap (i.e., latest from the `master`
branch), once the testnet has been created run:
```bash
$ ./net.sh start -s edge
```
### Enabling CUDA
First ensure the network instances are created with GPU enabled:
```bash
$ ./gce.sh create -g ...
```
If deploying a Snap-based network nothing further is required, as GPU presence
is detected at runtime and the CUDA build is auto selected.
If deploying a locally-built network, first run `./fetch-perf-libs.sh` then
ensure the `cuda` feature is specified at network start:
```bash
$ ./net.sh start -f "cuda,erasure"
```
### How to interact with a CD testnet deployed by ci/testnet-deploy.sh
Taking **master-testnet-solana-com** as an example, configure your workspace for
the testnet using:
```
$ ./gce.sh config -p master-testnet-solana-com
$ ./ssh.sh # <-- Details on how to ssh into any testnet node
```

net/common.sh Normal file

@ -0,0 +1,58 @@
# |source| this file
#
# Common utilities shared by other scripts in this directory
#
# The following directive disable complaints about unused variables in this
# file:
# shellcheck disable=2034
#
netDir=$(
cd "$(dirname "${BASH_SOURCE[0]}")" || exit
echo "$PWD"
)
netConfigDir="$netDir"/config
netLogDir="$netDir"/log
mkdir -p "$netConfigDir" "$netLogDir"
# shellcheck source=scripts/configure-metrics.sh
source "$(dirname "${BASH_SOURCE[0]}")"/../scripts/configure-metrics.sh
configFile="$netConfigDir/config"
entrypointIp=
publicNetwork=
leaderIp=
netBasename=
sshPrivateKey=
clientIpList=()
sshOptions=()
validatorIpList=()
buildSshOptions() {
sshOptions=(
-o "BatchMode=yes"
-o "StrictHostKeyChecking=no"
-o "UserKnownHostsFile=/dev/null"
-o "User=solana"
-o "IdentityFile=$sshPrivateKey"
-o "LogLevel=ERROR"
-F /dev/null
)
}
loadConfigFile() {
[[ -r $configFile ]] || usage "Config file unreadable: $configFile"
# shellcheck source=/dev/null
source "$configFile"
[[ -n "$entrypointIp" ]] || usage "Config file invalid, entrypointIp unspecified: $configFile"
[[ -n "$publicNetwork" ]] || usage "Config file invalid, publicNetwork unspecified: $configFile"
[[ -n "$leaderIp" ]] || usage "Config file invalid, leaderIp unspecified: $configFile"
[[ -n "$netBasename" ]] || usage "Config file invalid, netBasename unspecified: $configFile"
[[ -n $sshPrivateKey ]] || usage "Config file invalid, sshPrivateKey unspecified: $configFile"
[[ ${#validatorIpList[@]} -gt 0 ]] || usage "Config file invalid, validatorIpList unspecified: $configFile"
buildSshOptions
configureMetrics
}
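
To make loadConfigFile's expectations concrete, here is a hypothetical config file of the shape gce.sh writes below (all names and IPs are made up):

```bash
# net/config/config -- sourced by loadConfigFile
netBasename=testnet-dev-alice
publicNetwork=false
sshPrivateKey=/path/to/net/config/id_testnet-dev-alice
leaderIp=()
leaderIp+=(10.128.0.2) # testnet-dev-alice-leader
entrypointIp=10.128.0.2
validatorIpList=()
validatorIpList+=(10.128.0.3) # testnet-dev-alice-validator1
validatorIpList+=(10.128.0.4) # testnet-dev-alice-validator2
clientIpList=()
```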

net/gce.sh Executable file

@ -0,0 +1,336 @@
#!/bin/bash -e
here=$(dirname "$0")
# shellcheck source=net/scripts/gcloud.sh
source "$here"/scripts/gcloud.sh
# shellcheck source=net/common.sh
source "$here"/common.sh
prefix=testnet-dev-${USER//[^A-Za-z0-9]/}
validatorNodeCount=5
clientNodeCount=1
leaderBootDiskSize=1TB
leaderMachineType=n1-standard-16
leaderAccelerator=
validatorMachineType=n1-standard-4
validatorBootDiskSize=$leaderBootDiskSize
validatorAccelerator=
clientMachineType=n1-standard-16
clientBootDiskSize=40GB
clientAccelerator=
imageName="ubuntu-16-04-cuda-9-2-new"
publicNetwork=false
zone="us-west1-b"
leaderAddress=
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 [create|config|delete] [common options] [command-specific options]
Configure a GCE-based testnet
create - create a new testnet (implies 'config')
config - configure the testnet and write a config file describing it
delete - delete the testnet
common options:
-p [prefix] - Optional common prefix for instance names to avoid
collisions (default: $prefix)
create-specific options:
-n [number] - Number of validator nodes (default: $validatorNodeCount)
-c [number] - Number of client nodes (default: $clientNodeCount)
-P - Use public network IP addresses (default: $publicNetwork)
-z [zone] - GCP Zone for the nodes (default: $zone)
-i [imageName] - Existing image on GCE (default: $imageName)
-g - Enable GPU
-a [address] - Set the leader node's external IP address to this GCE address
config-specific options:
none
delete-specific options:
none
EOF
exit $exitcode
}
command=$1
[[ -n $command ]] || usage
shift
[[ $command = create || $command = config || $command = delete ]] || usage "Invalid command: $command"
while getopts "h?p:Pi:n:c:z:ga:" opt; do
case $opt in
h | \?)
usage
;;
p)
[[ ${OPTARG//[^A-Za-z0-9-]/} == "$OPTARG" ]] || usage "Invalid prefix: \"$OPTARG\", alphanumeric only"
prefix=$OPTARG
;;
P)
publicNetwork=true
;;
i)
imageName=$OPTARG
;;
n)
validatorNodeCount=$OPTARG
;;
c)
clientNodeCount=$OPTARG
;;
z)
zone=$OPTARG
;;
g)
leaderAccelerator="count=4,type=nvidia-tesla-k80"
;;
a)
leaderAddress=$OPTARG
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
shift $((OPTIND - 1))
[[ -z $1 ]] || usage "Unexpected argument: $1"
sshPrivateKey="$netConfigDir/id_$prefix"
prepareInstancesAndWriteConfigFile() {
$metricsWriteDatapoint "testnet-deploy net-config-begin=1"
cat >> "$configFile" <<EOF
# autogenerated at $(date)
netBasename=$prefix
publicNetwork=$publicNetwork
sshPrivateKey=$sshPrivateKey
EOF
buildSshOptions
recordInstanceIp() {
declare name="$1"
declare publicIp="$3"
declare privateIp="$4"
declare arrayName="$6"
echo "$arrayName+=($publicIp) # $name" >> "$configFile"
if [[ $arrayName = "leaderIp" ]]; then
if $publicNetwork; then
echo "entrypointIp=$publicIp" >> "$configFile"
else
echo "entrypointIp=$privateIp" >> "$configFile"
fi
fi
}
waitForStartupComplete() {
declare name="$1"
declare publicIp="$3"
echo "Waiting for $name to finish booting..."
(
for i in $(seq 1 30); do
if (set -x; ssh "${sshOptions[@]}" "$publicIp" "test -f /.gce-startup-complete"); then
break
fi
sleep 2
echo "Retry $i..."
done
)
}
echo "Looking for leader instance..."
gcloud_FindInstances "name=$prefix-leader" show
[[ ${#instances[@]} -eq 1 ]] || {
echo "Unable to find leader"
exit 1
}
echo "Fetching $sshPrivateKey from $leaderName"
(
rm -rf "$sshPrivateKey"{,pub}
declare leaderName
declare leaderZone
declare leaderIp
IFS=: read -r leaderName leaderZone leaderIp _ < <(echo "${instances[0]}")
set -x
# Try to ping the machine first. There can be a delay between when the
# instance is reported as RUNNING and when it's reachable over the network
timeout 30s bash -c "set -o pipefail; until ping -c 3 $leaderIp | tr - _; do echo .; done"
# Try to scp in a couple times, sshd may not yet be up even though the
# machine can be pinged...
set -o pipefail
for i in $(seq 1 10); do
if gcloud compute scp --zone "$leaderZone" \
"$leaderName:/solana-id_ecdsa" "$sshPrivateKey"; then
break
fi
sleep 1
echo "Retry $i..."
done
chmod 400 "$sshPrivateKey"
)
echo "leaderIp=()" >> "$configFile"
gcloud_ForEachInstance recordInstanceIp leaderIp
gcloud_ForEachInstance waitForStartupComplete
echo "Looking for validator instances..."
gcloud_FindInstances "name~^$prefix-validator" show
[[ ${#instances[@]} -gt 0 ]] || {
echo "Unable to find validators"
exit 1
}
echo "validatorIpList=()" >> "$configFile"
gcloud_ForEachInstance recordInstanceIp validatorIpList
gcloud_ForEachInstance waitForStartupComplete
echo "clientIpList=()" >> "$configFile"
echo "Looking for client instances..."
gcloud_FindInstances "name~^$prefix-client" show
[[ ${#instances[@]} -eq 0 ]] || {
gcloud_ForEachInstance recordInstanceIp clientIpList
gcloud_ForEachInstance waitForStartupComplete
}
echo "Wrote $configFile"
$metricsWriteDatapoint "testnet-deploy net-config-complete=1"
}
case $command in
delete)
$metricsWriteDatapoint "testnet-deploy net-delete-begin=1"
# Delete the leader node first to prevent unusual metrics on the dashboard
# during shutdown.
# TODO: It would be better to fully cut-off metrics reporting before any
# instances are deleted.
for filter in "^$prefix-leader" "^$prefix-"; do
gcloud_FindInstances "name~$filter"
if [[ ${#instances[@]} -eq 0 ]]; then
echo "No instances found matching '$filter'"
else
gcloud_DeleteInstances true
fi
done
rm -f "$configFile"
$metricsWriteDatapoint "testnet-deploy net-delete-complete=1"
;;
create)
[[ -n $validatorNodeCount ]] || usage "Need number of nodes"
$metricsWriteDatapoint "testnet-deploy net-create-begin=1"
rm -rf "$sshPrivateKey"{,.pub}
ssh-keygen -t ecdsa -N '' -f "$sshPrivateKey"
printNetworkInfo() {
cat <<EOF
========================================================================================
Network composition:
Leader = $leaderMachineType (GPU=${leaderAccelerator:-none})
Validators = $validatorNodeCount x $validatorMachineType (GPU=${validatorAccelerator:-none})
Client(s) = $clientNodeCount x $clientMachineType (GPU=${clientAccelerator:-none})
========================================================================================
EOF
}
printNetworkInfo
declare startupScript="$netConfigDir"/gce-startup-script.sh
cat > "$startupScript" <<EOF
#!/bin/bash -ex
# autogenerated at $(date)
cat > /etc/motd <<EOM
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
This instance has not been fully configured.
See "startup-script" log messages in /var/log/syslog for status:
$ sudo cat /var/log/syslog | grep startup-script
To block until setup is complete, run:
$ until [[ -f /.gce-startup-complete ]]; do sleep 1; done
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
EOM
# Place the generated private key at /solana-id_ecdsa so it's retrievable by anybody
# who is able to log into this machine
cat > /solana-id_ecdsa <<EOK
$(cat "$sshPrivateKey")
EOK
cat > /solana-id_ecdsa.pub <<EOK
$(cat "$sshPrivateKey.pub")
EOK
chmod 444 /solana-id_ecdsa
USER=\$(id -un)
$(
cd "$here"/scripts/
cat \
disable-background-upgrades.sh \
create-solana-user.sh \
install-earlyoom.sh \
install-rsync.sh \
install-libssl-compatability.sh \
)
cat > /etc/motd <<EOM
$(printNetworkInfo)
EOM
touch /.gce-startup-complete
EOF
gcloud_CreateInstances "$prefix-leader" 1 "$zone" \
"$imageName" "$leaderMachineType" "$leaderBootDiskSize" "$leaderAccelerator" \
"$startupScript" "$leaderAddress"
gcloud_CreateInstances "$prefix-validator" "$validatorNodeCount" "$zone" \
"$imageName" "$validatorMachineType" "$validatorBootDiskSize" "$validatorAccelerator" \
"$startupScript" ""
if [[ $clientNodeCount -gt 0 ]]; then
gcloud_CreateInstances "$prefix-client" "$clientNodeCount" "$zone" \
"$imageName" "$clientMachineType" "$clientBootDiskSize" "$clientAccelerator" \
"$startupScript" ""
fi
$metricsWriteDatapoint "testnet-deploy net-create-complete=1"
prepareInstancesAndWriteConfigFile
;;
config)
prepareInstancesAndWriteConfigFile
;;
*)
usage "Unknown command: $command"
esac
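
Putting the options together, a plausible session (prefix, node counts and zone are illustrative):

```bash
# Create a GPU-enabled testnet with 3 validators and 1 client in a
# custom zone, then dispose of it when done.
./gce.sh create -p my-testnet -n 3 -c 1 -g -z us-west1-b
./gce.sh delete -p my-testnet
```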

net/init-metrics.sh Executable file

@ -0,0 +1,80 @@
#!/bin/bash -e
here=$(dirname "$0")
# shellcheck source=net/common.sh
source "$here"/common.sh
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 [-e] [-d] [username]
Creates a testnet dev metrics database
username InfluxDB user with access to create a new database
-d Delete the database instead of creating it
-e Assume database already exists and SOLANA_METRICS_CONFIG is
defined in the environment already
EOF
exit $exitcode
}
loadConfigFile
useEnv=false
delete=false
while getopts "hde" opt; do
case $opt in
h|\?)
usage
exit 0
;;
d)
delete=true
;;
e)
useEnv=true
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
shift $((OPTIND - 1))
if $useEnv; then
[[ -n $SOLANA_METRICS_CONFIG ]] ||
usage "Error: SOLANA_METRICS_CONFIG is not defined in the environment"
else
username=$1
[[ -n "$username" ]] || usage "username not specified"
read -rs -p "InfluxDB password for $username: " password
[[ -n $password ]] || { echo "Password not specified"; exit 1; }
echo
query() {
echo "$*"
curl -XPOST \
"https://metrics.solana.com:8086/query?u=${username}&p=${password}" \
--data-urlencode "q=$*"
}
query "DROP DATABASE \"$netBasename\""
! $delete || exit 0
query "CREATE DATABASE \"$netBasename\""
query "ALTER RETENTION POLICY autogen ON \"$netBasename\" DURATION 7d"
query "GRANT READ ON \"$netBasename\" TO \"ro\""
query "GRANT WRITE ON \"$netBasename\" TO \"scratch_writer\""
SOLANA_METRICS_CONFIG="db=$netBasename,u=scratch_writer,p=topsecret"
fi
echo "export SOLANA_METRICS_CONFIG=\"$SOLANA_METRICS_CONFIG\"" >> "$configFile"
exit 0
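
Typical usage, assuming gce.sh has already written the config file this script loads:

```bash
# Create a metrics database named after the testnet (prompts for the
# InfluxDB password), and later drop it with -d.
./init-metrics.sh "$(whoami)"
./init-metrics.sh -d "$(whoami)"
```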

net/net.sh Executable file

@ -0,0 +1,352 @@
#!/bin/bash -e
here=$(dirname "$0")
SOLANA_ROOT="$(cd "$here"/..; pwd)"
# shellcheck source=net/common.sh
source "$here"/common.sh
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 [start|stop|restart|sanity] [command-specific options]
Operate a configured testnet
start - Start the network
sanity - Sanity check the network
stop - Stop the network
restart - Shortcut for stop then start
start-specific options:
-S [snapFilename] - Deploy the specified Snap file
-s edge|beta|stable - Deploy the latest Snap on the specified Snap release channel
-f [cargoFeatures] - List of |cargo --features=| to activate
(ignored if -s or -S is specified)
Note: if RUST_LOG is set in the environment it will be propagated into the
network nodes.
sanity/start-specific options:
-o noLedgerVerify - Skip ledger verification
-o noValidatorSanity - Skip validator sanity
stop-specific options:
none
EOF
exit $exitcode
}
snapChannel=
snapFilename=
deployMethod=local
sanityExtraArgs=
cargoFeatures=
command=$1
[[ -n $command ]] || usage
shift
while getopts "h?S:s:o:f:" opt; do
case $opt in
h | \?)
usage
;;
S)
snapFilename=$OPTARG
[[ -f $snapFilename ]] || usage "Snap not readable: $snapFilename"
deployMethod=snap
;;
s)
case $OPTARG in
edge|beta|stable)
snapChannel=$OPTARG
deployMethod=snap
;;
*)
usage "Invalid snap channel: $OPTARG"
;;
esac
;;
f)
cargoFeatures=$OPTARG
;;
o)
case $OPTARG in
noLedgerVerify|noValidatorSanity)
sanityExtraArgs="$sanityExtraArgs -o $OPTARG"
;;
*)
echo "Error: unknown option: $OPTARG"
exit 1
;;
esac
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
loadConfigFile
expectedNodeCount=$((${#validatorIpList[@]} + 1))
build() {
declare MAYBE_DOCKER=
if [[ $(uname) != Linux ]]; then
MAYBE_DOCKER="ci/docker-run.sh solanalabs/rust"
fi
SECONDS=0
(
cd "$SOLANA_ROOT"
echo "--- Build started at $(date)"
set -x
rm -rf farf
$MAYBE_DOCKER cargo install --features="$cargoFeatures" --root farf
)
echo "Build took $SECONDS seconds"
}
startCommon() {
declare ipAddress=$1
test -d "$SOLANA_ROOT"
ssh "${sshOptions[@]}" "$ipAddress" "mkdir -p ~/solana ~/.cargo/bin"
rsync -vPrc -e "ssh ${sshOptions[*]}" \
"$SOLANA_ROOT"/{fetch-perf-libs.sh,scripts,net,multinode-demo} \
"$ipAddress":~/solana/
}
startLeader() {
declare ipAddress=$1
declare logFile="$2"
echo "--- Starting leader: $leaderIp"
echo "start log: $logFile"
# Deploy local binaries to leader. Validators and clients later fetch the
# binaries from the leader.
(
set -x
startCommon "$ipAddress" || exit 1
case $deployMethod in
snap)
rsync -vPrc -e "ssh ${sshOptions[*]}" "$snapFilename" "$ipAddress:~/solana/solana.snap"
;;
local)
rsync -vPrc -e "ssh ${sshOptions[*]}" "$SOLANA_ROOT"/farf/bin/* "$ipAddress:~/.cargo/bin/"
;;
*)
usage "Internal error: invalid deployMethod: $deployMethod"
;;
esac
ssh "${sshOptions[@]}" -n "$ipAddress" \
"./solana/net/remote/remote-node.sh $deployMethod leader $publicNetwork $entrypointIp $expectedNodeCount \"$RUST_LOG\""
) >> "$logFile" 2>&1 || {
cat "$logFile"
echo "^^^ +++"
exit 1
}
}
startValidator() {
declare ipAddress=$1
declare logFile="$netLogDir/validator-$ipAddress.log"
echo "--- Starting validator: $leaderIp"
echo "start log: $logFile"
(
set -x
startCommon "$ipAddress"
ssh "${sshOptions[@]}" -n "$ipAddress" \
"./solana/net/remote/remote-node.sh $deployMethod validator $publicNetwork $entrypointIp $expectedNodeCount \"$RUST_LOG\""
) >> "$logFile" 2>&1 &
declare pid=$!
ln -sfT "validator-$ipAddress.log" "$netLogDir/validator-$pid.log"
pids+=("$pid")
}
startClient() {
declare ipAddress=$1
declare logFile="$2"
echo "--- Starting client: $ipAddress"
echo "start log: $logFile"
(
set -x
startCommon "$ipAddress"
ssh "${sshOptions[@]}" -f "$ipAddress" \
"./solana/net/remote/remote-client.sh $deployMethod $entrypointIp $expectedNodeCount \"$RUST_LOG\""
) >> "$logFile" 2>&1 || {
cat "$logFile"
echo "^^^ +++"
exit 1
}
}
sanity() {
declare expectedNodeCount=$((${#validatorIpList[@]} + 1))
declare ok=true
echo "--- Sanity"
$metricsWriteDatapoint "testnet-deploy net-sanity-begin=1"
(
set -x
# shellcheck disable=SC2029 # remote-client.sh args are expanded on client side intentionally
ssh "${sshOptions[@]}" "$leaderIp" \
"./solana/net/remote/remote-sanity.sh $sanityExtraArgs"
) || ok=false
$metricsWriteDatapoint "testnet-deploy net-sanity-complete=1"
$ok || exit 1
}
start() {
case $deployMethod in
snap)
if [[ -n $snapChannel ]]; then
rm -f "$SOLANA_ROOT"/solana_*.snap
if [[ $(uname) != Linux ]]; then
(
set -x
SOLANA_DOCKER_RUN_NOSETUID=1 "$SOLANA_ROOT"/ci/docker-run.sh ubuntu:18.04 bash -c "
set -ex;
apt-get -qq update;
apt-get -qq -y install snapd;
snap download --channel=$snapChannel solana;
"
)
else
(
cd "$SOLANA_ROOT"
snap download --channel="$snapChannel" solana
)
fi
snapFilename="$(echo "$SOLANA_ROOT"/solana_*.snap)"
[[ -r $snapFilename ]] || {
echo "Error: Snap not readable: $snapFilename"
exit 1
}
fi
;;
local)
build
;;
*)
usage "Internal error: invalid deployMethod: $deployMethod"
;;
esac
echo "Deployment started at $(date)"
$metricsWriteDatapoint "testnet-deploy net-start-begin=1"
SECONDS=0
declare leaderDeployTime=
startLeader "$leaderIp" "$netLogDir/leader-$leaderIp.log"
leaderDeployTime=$SECONDS
$metricsWriteDatapoint "testnet-deploy net-leader-started=1"
SECONDS=0
pids=()
for ipAddress in "${validatorIpList[@]}"; do
startValidator "$ipAddress"
done
for pid in "${pids[@]}"; do
declare ok=true
wait "$pid" || ok=false
if ! $ok; then
cat "$netLogDir/validator-$pid.log"
echo ^^^ +++
exit 1
fi
done
$metricsWriteDatapoint "testnet-deploy net-validators-started=1"
validatorDeployTime=$SECONDS
sanity
SECONDS=0
for ipAddress in "${clientIpList[@]}"; do
startClient "$ipAddress" "$netLogDir/client-$ipAddress.log"
done
clientDeployTime=$SECONDS
$metricsWriteDatapoint "testnet-deploy net-start-complete=1"
if [[ $deployMethod = "snap" ]]; then
declare networkVersion=unknown
IFS=\ read -r _ networkVersion _ < <(
ssh "${sshOptions[@]}" "$leaderIp" \
"snap info solana | grep \"^installed:\""
)
networkVersion=${networkVersion/0+git./}
$metricsWriteDatapoint "testnet-deploy version=\"$networkVersion\""
fi
echo
echo "+++ Deployment Successful"
echo "Leader deployment took $leaderDeployTime seconds"
echo "Validator deployment (${#validatorIpList[@]} instances) took $validatorDeployTime seconds"
echo "Client deployment (${#clientIpList[@]} instances) took $clientDeployTime seconds"
echo "Network start logs in $netLogDir:"
ls -l "$netLogDir"
}
stopNode() {
local ipAddress=$1
echo "--- Stopping node: $ipAddress"
(
set -x
ssh "${sshOptions[@]}" "$ipAddress" "
set -x
if snap list solana; then
sudo snap set solana mode=
sudo snap remove solana
fi
! tmux list-sessions || tmux kill-session
for pattern in solana- remote- oom-monitor net-stats; do
pkill -9 \$pattern
done
"
) || true
}
stop() {
SECONDS=0
$metricsWriteDatapoint "testnet-deploy net-stop-begin=1"
stopNode "$leaderIp"
for ipAddress in "${validatorIpList[@]}" "${clientIpList[@]}"; do
stopNode "$ipAddress"
done
$metricsWriteDatapoint "testnet-deploy net-stop-complete=1"
echo "Stopping nodes took $SECONDS seconds"
}
case $command in
restart)
stop
start
;;
start)
start
;;
sanity)
sanity
;;
stop)
stop
;;
*)
echo "Internal error: Unknown command: $command"
exit 1
esac
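
A representative sequence using the options documented above (the feature list and sanity flag are just examples):

```bash
# Deploy the locally-built workspace with CUDA, then re-check it
# while skipping the slower ledger verification step.
./net.sh start -f "cuda,erasure"
./net.sh sanity -o noLedgerVerify
```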

net/remote/README.md Normal file

@ -0,0 +1 @@
Scripts that run on the remote testnet nodes

net/remote/remote-client.sh Executable file

@ -0,0 +1,83 @@
#!/bin/bash -e
cd "$(dirname "$0")"/../..
echo "$(date) | $0 $*" > client.log
deployMethod="$1"
entrypointIp="$2"
numNodes="$3"
RUST_LOG="$4"
export RUST_LOG=${RUST_LOG:-solana=info} # if RUST_LOG is unset, default to info
missing() {
echo "Error: $1 not specified"
exit 1
}
[[ -n $deployMethod ]] || missing deployMethod
[[ -n $entrypointIp ]] || missing entrypointIp
[[ -n $numNodes ]] || missing numNodes
source net/common.sh
loadConfigFile
threadCount=$(nproc)
if [[ $threadCount -gt 4 ]]; then
threadCount=4
fi
case $deployMethod in
snap)
net/scripts/rsync-retry.sh -vPrc "$entrypointIp:~/solana/solana.snap" .
sudo snap install solana.snap --devmode --dangerous
solana_bench_tps=/snap/bin/solana.bench-tps
solana_keygen=/snap/bin/solana.keygen
;;
local)
PATH="$HOME"/.cargo/bin:"$PATH"
export USE_INSTALL=1
export SOLANA_DEFAULT_METRICS_RATE=1
net/scripts/rsync-retry.sh -vPrc "$entrypointIp:~/.cargo/bin/solana*" ~/.cargo/bin/
solana_bench_tps=solana-bench-tps
solana_keygen=solana-keygen
;;
*)
echo "Unknown deployment method: $deployMethod"
exit 1
esac
scripts/oom-monitor.sh > oom-monitor.log 2>&1 &
scripts/net-stats.sh > net-stats.log 2>&1 &
! tmux list-sessions || tmux kill-session
clientCommand="\
$solana_bench_tps \
--network $entrypointIp:8001 \
--identity client.json \
--num-nodes $numNodes \
--duration 600 \
--sustained \
--threads $threadCount \
"
keygenCommand="$solana_keygen -o client.json"
tmux new -s solana-bench-tps -d "
[[ -r client.json ]] || {
echo '$ $keygenCommand' | tee -a client.log
$keygenCommand >> client.log 2>&1
}
while true; do
echo === Client start: \$(date) | tee -a client.log
$metricsWriteDatapoint 'testnet-deploy client-begin=1'
echo '$ $clientCommand' | tee -a client.log
$clientCommand >> client.log 2>&1
$metricsWriteDatapoint 'testnet-deploy client-complete=1'
done
"
sleep 1
tmux capture-pane -t solana-bench-tps -p -S -100
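
Because the bench-tps loop runs detached inside tmux, inspecting it on a client node is a matter of standard tmux/tail commands:

```bash
# Attach to the looping bench-tps session (Ctrl-b d to detach)...
tmux attach -t solana-bench-tps
# ...or just follow the log it appends to.
tail -f client.log
```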

net/remote/remote-node.sh Executable file

@ -0,0 +1,113 @@
#!/bin/bash -e
cd "$(dirname "$0")"/../..
deployMethod="$1"
nodeType="$2"
publicNetwork="$3"
entrypointIp="$4"
numNodes="$5"
RUST_LOG="$6"
missing() {
echo "Error: $1 not specified"
exit 1
}
[[ -n $deployMethod ]] || missing deployMethod
[[ -n $nodeType ]] || missing nodeType
[[ -n $publicNetwork ]] || missing publicNetwork
[[ -n $entrypointIp ]] || missing entrypointIp
[[ -n $numNodes ]] || missing numNodes
cat > deployConfig <<EOF
deployMethod="$deployMethod"
entrypointIp="$entrypointIp"
numNodes="$numNodes"
EOF
source net/common.sh
loadConfigFile
if [[ $publicNetwork = true ]]; then
setupArgs="-p"
else
setupArgs="-l"
fi
case $deployMethod in
snap)
SECONDS=0
[[ $nodeType = leader ]] ||
net/scripts/rsync-retry.sh -vPrc "$entrypointIp:~/solana/solana.snap" .
sudo snap install solana.snap --devmode --dangerous
commonNodeConfig="\
leader-ip=$entrypointIp \
default-metrics-rate=1 \
metrics-config=$SOLANA_METRICS_CONFIG \
rust-log=$RUST_LOG \
setup-args=$setupArgs \
"
if [[ -e /dev/nvidia0 ]]; then
commonNodeConfig="$commonNodeConfig enable-cuda=1"
fi
if [[ $nodeType = leader ]]; then
nodeConfig="mode=leader+drone $commonNodeConfig"
ln -sf -T /var/snap/solana/current/leader/current leader.log
ln -sf -T /var/snap/solana/current/drone/current drone.log
else
nodeConfig="mode=validator $commonNodeConfig"
ln -sf -T /var/snap/solana/current/validator/current validator.log
fi
logmarker="solana deploy $(date)/$RANDOM"
logger "$logmarker"
# shellcheck disable=SC2086 # Don't want to double quote "$nodeConfig"
sudo snap set solana $nodeConfig
snap info solana
sudo snap get solana
echo Slight delay to get more syslog output
sleep 2
sudo grep -Pzo "$logmarker(.|\\n)*" /var/log/syslog
echo "Succeeded in ${SECONDS} seconds"
;;
local)
PATH="$HOME"/.cargo/bin:"$PATH"
export USE_INSTALL=1
export RUST_LOG
export SOLANA_DEFAULT_METRICS_RATE=1
./fetch-perf-libs.sh
export LD_LIBRARY_PATH="$PWD/target/perf-libs:$LD_LIBRARY_PATH"
scripts/oom-monitor.sh > oom-monitor.log 2>&1 &
scripts/net-stats.sh > net-stats.log 2>&1 &
case $nodeType in
leader)
./multinode-demo/setup.sh -t leader $setupArgs
./multinode-demo/drone.sh > drone.log 2>&1 &
./multinode-demo/leader.sh > leader.log 2>&1 &
;;
validator)
net/scripts/rsync-retry.sh -vPrc "$entrypointIp:~/.cargo/bin/solana*" ~/.cargo/bin/
./multinode-demo/setup.sh -t validator $setupArgs
./multinode-demo/validator.sh "$entrypointIp":~/solana "$entrypointIp:8001" >validator.log 2>&1 &
;;
*)
echo "Error: unknown node type: $nodeType"
exit 1
;;
esac
;;
*)
echo "Unknown deployment method: $deployMethod"
exit 1
esac
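
This script is normally invoked over ssh by net.sh, but the positional interface makes a manual run easy to sketch (all values below are placeholders):

```bash
# args: deployMethod nodeType publicNetwork entrypointIp numNodes RUST_LOG
./solana/net/remote/remote-node.sh local validator false 10.128.0.2 4 solana=info
```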

net/remote/remote-sanity.sh Executable file

@ -0,0 +1,138 @@
#!/bin/bash -e
#
# This script is to be run on the leader node
#
cd "$(dirname "$0")"/../..
deployMethod=
entrypointIp=
numNodes=
[[ -r deployConfig ]] || {
echo deployConfig missing
exit 1
}
# shellcheck source=/dev/null # deployConfig is written by remote-node.sh
source deployConfig
missing() {
echo "Error: $1 not specified"
exit 1
}
[[ -n $deployMethod ]] || missing deployMethod
[[ -n $entrypointIp ]] || missing entrypointIp
[[ -n $numNodes ]] || missing numNodes
ledgerVerify=true
validatorSanity=true
while [[ $1 = -o ]]; do
opt="$2"
shift 2
case $opt in
noLedgerVerify)
ledgerVerify=false
;;
noValidatorSanity)
validatorSanity=false
;;
*)
echo "Error: unknown option: $opt"
exit 1
;;
esac
done
source net/common.sh
loadConfigFile
case $deployMethod in
snap)
PATH="/snap/bin:$PATH"
export USE_SNAP=1
entrypointRsyncUrl="$entrypointIp"
solana_bench_tps=solana.bench-tps
solana_ledger_tool=solana.ledger-tool
solana_keygen=solana.keygen
ledger=/var/snap/solana/current/config/ledger
client_id=~/snap/solana/current/config/client-id.json
;;
local)
PATH="$HOME"/.cargo/bin:"$PATH"
export USE_INSTALL=1
entrypointRsyncUrl="$entrypointIp:~/solana"
solana_bench_tps=solana-bench-tps
solana_ledger_tool=solana-ledger-tool
solana_keygen=solana-keygen
ledger=config/ledger
client_id=config/client-id.json
;;
*)
echo "Unknown deployment method: $deployMethod"
exit 1
esac
echo "--- $entrypointIp: wallet sanity"
(
set -x
scripts/wallet-sanity.sh "$entrypointRsyncUrl"
)
echo "+++ $entrypointIp: node count ($numNodes expected)"
(
set -x
$solana_keygen -o "$client_id"
$solana_bench_tps --network "$entrypointIp:8001" --identity "$client_id" --num-nodes "$numNodes" --converge-only
)
echo "--- $entrypointIp: verify ledger"
if $ledgerVerify; then
if [[ -d $ledger ]]; then
(
set -x
rm -rf /var/tmp/ledger-verify
du -hs "$ledger"
time cp -r "$ledger" /var/tmp/ledger-verify
time $solana_ledger_tool --ledger /var/tmp/ledger-verify verify
)
else
echo "^^^ +++"
echo "Ledger verify skipped: directory does not exist: $ledger"
fi
else
echo "^^^ +++"
echo "Note: ledger verify disabled"
fi
echo "--- $entrypointIp: validator sanity"
if $validatorSanity; then
(
set -ex -o pipefail
./multinode-demo/setup.sh -t validator
timeout 10s ./multinode-demo/validator.sh "$entrypointRsyncUrl" "$entrypointIp:8001" 2>&1 | tee validator.log
) || {
exitcode=$?
[[ $exitcode -eq 124 ]] || exit $exitcode
}
wc -l validator.log
if grep -C100 panic validator.log; then
echo "^^^ +++"
echo "Panic observed"
exit 1
else
echo "Validator log looks ok"
fi
else
echo "^^^ +++"
echo "Note: validator sanity disabled"
fi
echo --- Pass
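
Run on the leader node, the sanity check can be narrowed with the -o flags parsed above, for example:

```bash
# Check wallet sanity, node count and validator startup, but skip the
# ledger copy/verify step.
./solana/net/remote/remote-sanity.sh -o noLedgerVerify
```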

net/scripts/create-solana-user.sh

@ -0,0 +1,27 @@
#!/bin/bash -ex
[[ $(uname) = Linux ]] || exit 1
[[ $USER = root ]] || exit 1
adduser solana --gecos "" --disabled-password --quiet
adduser solana sudo
echo "solana ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
id solana
[[ -r /solana-id_ecdsa ]] || exit 1
[[ -r /solana-id_ecdsa.pub ]] || exit 1
sudo -u solana bash -c "
mkdir -p /home/solana/.ssh/
cd /home/solana/.ssh/
cp /solana-id_ecdsa.pub authorized_keys
umask 377
cp /solana-id_ecdsa id_ecdsa
echo \"
Host *
BatchMode yes
IdentityFile ~/.ssh/id_ecdsa
StrictHostKeyChecking no
\" > config
"

net/scripts/disable-background-upgrades.sh

@ -0,0 +1,20 @@
#!/bin/bash -ex
#
# Prevent background upgrades that block |apt-get|
#
# TODO: This approach is pretty uncompromising. An alternative solution that
# doesn't involve deleting system files would be welcome.
[[ $(uname) = Linux ]] || exit 1
[[ $USER = root ]] || exit 1
rm -rf /usr/lib/apt/apt.systemd.daily
rm -rf /usr/bin/unattended-upgrade
killall apt.systemd.daily || true
killall unattended-upgrade || true
while fuser /var/lib/dpkg/lock; do
echo Waiting for lock release...
sleep 1
done

net/scripts/gcloud.sh Normal file

@ -0,0 +1,187 @@
# |source| this file
#
# Utilities for working with gcloud
#
#
# gcloud_FindInstances [filter] [options]
#
# Find instances matching the specified pattern.
#
# For each matching instance, an entry in the `instances` array will be added with the
# following information about the instance:
# "name:zone:public IP:private IP"
#
# filter - The instances to filter on
# options - If set to the string "show", the list of instances will be echoed
# to stdout
#
# examples:
# $ gcloud_FindInstances "name=exact-machine-name"
# $ gcloud_FindInstances "name~^all-machines-with-a-common-machine-prefix"
#
gcloud_FindInstances() {
declare filter="$1"
declare options="$2"
instances=()
declare name zone publicIp privateIp status
while read -r name zone publicIp privateIp status; do
if [[ $status != RUNNING ]]; then
echo "Warning: $name is not RUNNING, ignoring it."
continue
fi
if [[ $options = show ]]; then
printf "%-30s | %-16s publicIp=%-16s privateIp=%s\n" "$name" "$zone" "$publicIp" "$privateIp"
fi
instances+=("$name:$zone:$publicIp:$privateIp")
done < <(gcloud compute instances list \
--filter="$filter" \
--format 'value(name,zone,networkInterfaces[0].accessConfigs[0].natIP,networkInterfaces[0].networkIP,status)')
}
#
# gcloud_ForEachInstance [cmd] [extra args to cmd]
#
# Execute a command for each element in the `instances` array
#
# cmd - The command to execute on each instance
# The command will receive arguments followed by any
# additional arguments supplied to gcloud_ForEachInstance:
# name - name of the instance
# zone - zone the instance is located in
# publicIp - The public IP address of this instance
# privateIp - The private IP address of this instance
# count - Monotonically increasing count for each
# invocation of cmd, starting at 1
# ... - Extra args to cmd..
#
#
gcloud_ForEachInstance() {
declare cmd="$1"
shift
[[ -n $cmd ]] || { echo gcloud_ForEachInstance: cmd not specified; exit 1; }
declare count=1
for info in "${instances[@]}"; do
declare name zone publicIp privateIp
IFS=: read -r name zone publicIp privateIp < <(echo "$info")
eval "$cmd" "$name" "$zone" "$publicIp" "$privateIp" "$count" "$@"
count=$((count + 1))
done
}
#
# gcloud_CreateInstances [namePrefix] [numNodes] [zone] [imageName]
# [machineType] [bootDiskSize] [accelerator]
# [startupScript] [address]
#
# Creates one or more identical instances.
#
# namePrefix - unique string to prefix all the instance names with
# numNodes - number of instances to create
# zone - zone to create the instances in
# imageName - Disk image for the instances
# machineType - GCE machine type
# bootDiskSize - Optional size of the boot disk
# accelerator - Optional accelerator to attach to the instance(s);
#               e.g., request 4 K80 GPUs with "count=4,type=nvidia-tesla-k80"
# startupScript - Optional startup script to execute when the instance boots
# address - Optional name of the GCE static IP address to attach to the
# instance. Requires that |numNodes| = 1 and that the address
# has been provisioned in the GCE region that is hosting |zone|
#
# Tip: use gcloud_FindInstances to locate the instances once this function
# returns
gcloud_CreateInstances() {
declare namePrefix="$1"
declare numNodes="$2"
declare zone="$3"
declare imageName="$4"
declare machineType="$5"
declare optionalBootDiskSize="$6"
declare optionalAccelerator="$7"
declare optionalStartupScript="$8"
declare optionalAddress="$9"
declare nodes
if [[ $numNodes = 1 ]]; then
nodes=("$namePrefix")
else
read -ra nodes <<<$(seq -f "${namePrefix}%0${#numNodes}g" 1 "$numNodes")
fi
declare -a args
args=(
"--zone=$zone"
"--tags=testnet"
"--image=$imageName"
"--machine-type=$machineType"
)
if [[ -n $optionalBootDiskSize ]]; then
args+=(
"--boot-disk-size=$optionalBootDiskSize"
)
fi
if [[ -n $optionalAccelerator ]]; then
args+=(
"--accelerator=$optionalAccelerator"
--maintenance-policy TERMINATE
--restart-on-failure
)
fi
if [[ -n $optionalStartupScript ]]; then
args+=(
--metadata-from-file "startup-script=$optionalStartupScript"
)
fi
if [[ -n $optionalAddress ]]; then
[[ $numNodes = 1 ]] || {
echo "Error: address may not be supplied when provisioning multiple nodes: $optionalAddress"
exit 1
}
args+=(
"--address=$optionalAddress"
)
fi
(
set -x
gcloud beta compute instances create "${nodes[@]}" "${args[@]}"
)
}
#
# gcloud_DeleteInstances [yes]
#
# Deletes all the instances listed in the `instances` array
#
# If yes = "true", skip the delete confirmation
#
gcloud_DeleteInstances() {
declare maybeQuiet=
if [[ $1 = true ]]; then
maybeQuiet=--quiet
fi
if [[ ${#instances[@]} -eq 0 ]]; then
echo No instances to delete
return
fi
declare names=("${instances[@]/:*/}")
# Assume all instances are in the same zone
# TODO: One day this assumption will be invalid
declare zone
IFS=: read -r _ zone _ < <(echo "${instances[0]}")
(
set -x
gcloud beta compute instances delete --zone "$zone" $maybeQuiet "${names[@]}"
)
}
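
A minimal sketch combining the two helpers above; the instance prefix is illustrative:

```bash
source net/scripts/gcloud.sh

# Print "name (zone): publicIp" for every RUNNING validator.
printIp() {
  declare name=$1 zone=$2 publicIp=$3
  echo "$name ($zone): $publicIp"
}

gcloud_FindInstances "name~^testnet-dev-alice-validator"
gcloud_ForEachInstance printIp
```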

net/scripts/install-earlyoom.sh Executable file

@ -0,0 +1,30 @@
#!/bin/bash -ex
#
# Install EarlyOOM
#
[[ $(uname) = Linux ]] || exit 1
[[ $USER = root ]] || exit 1
# 64 - enable signalling of processes (term, kill, oom-kill)
# TODO: This setting will not persist across reboots
sysctl -w kernel.sysrq=$(( $(cat /proc/sys/kernel/sysrq) | 64 ))
if command -v earlyoom; then
systemctl status earlyoom
else
wget http://ftp.us.debian.org/debian/pool/main/e/earlyoom/earlyoom_1.1-2_amd64.deb
apt install --quiet --yes ./earlyoom_1.1-2_amd64.deb
cat > earlyoom <<OOM
# use the kernel OOM killer, trigger at 20% available RAM,
EARLYOOM_ARGS="-k -m 20"
OOM
cp earlyoom /etc/default/
rm earlyoom
systemctl stop earlyoom
systemctl enable earlyoom
systemctl start earlyoom
fi

net/scripts/install-libssl-compatability.sh

@ -0,0 +1,18 @@
#!/bin/bash -ex
[[ $(uname) = Linux ]] || exit 1
[[ $USER = root ]] || exit 1
# Install libssl-dev to be compatible with binaries built on an Ubuntu machine...
apt-get update
apt-get --assume-yes install libssl-dev
# Install libssl1.1 to be compatible with binaries built in the
# solanalabs/rust docker image
#
# cc: https://github.com/solana-labs/solana/issues/1090
# cc: https://packages.ubuntu.com/bionic/amd64/libssl1.1/download
wget http://security.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.0g-2ubuntu4.1_amd64.deb
dpkg -i libssl1.1_1.1.0g-2ubuntu4.1_amd64.deb
rm libssl1.1_1.1.0g-2ubuntu4.1_amd64.deb

net/scripts/install-rsync.sh Executable file

@ -0,0 +1,19 @@
#!/bin/bash -ex
#
# Rsync setup for Snap builds
#
[[ $(uname) = Linux ]] || exit 1
[[ $USER = root ]] || exit 1
apt-get --assume-yes install rsync
cat > /etc/rsyncd.conf <<-EOF
[config]
path = /var/snap/solana/current/config
hosts allow = *
read only = true
EOF
systemctl enable rsync
systemctl start rsync

net/scripts/rsync-retry.sh Executable file

@ -0,0 +1,12 @@
#!/bin/bash
#
# rsync wrapper that retries a few times on failure
#
for i in $(seq 1 5); do
(
set -x
rsync "$@"
) && exit 0
echo Retry "$i"...
done
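
It accepts the same flags as rsync itself; for example, the call the remote scripts above make (the host variable is a placeholder here):

```bash
# Retry a flaky copy of the Snap package up to 5 times.
net/scripts/rsync-retry.sh -vPrc "$entrypointIp:~/solana/solana.snap" .
```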

net/ssh.sh Executable file

@ -0,0 +1,69 @@
#!/bin/bash
here=$(dirname "$0")
# shellcheck source=net/common.sh
source "$here"/common.sh
usage() {
exitcode=0
if [[ -n "$1" ]]; then
exitcode=1
echo "Error: $*"
fi
cat <<EOF
usage: $0 [ipAddress] [extra ssh arguments]
ssh into a node
ipAddress - IP address of the desired node.
If ipAddress is unspecified, a list of available nodes will be displayed.
EOF
exit $exitcode
}
while getopts "h?" opt; do
case $opt in
h | \?)
usage
;;
*)
usage "Error: unhandled option: $opt"
;;
esac
done
loadConfigFile
ipAddress=$1
shift
if [[ -n "$ipAddress" ]]; then
set -x
exec ssh "${sshOptions[@]}" "$ipAddress" "$@"
fi
printNode() {
declare nodeType=$1
declare ip=$2
printf " %-25s | For logs run: $0 $ip tail -f solana/$nodeType.log\n" "$0 $ip"
}
echo Leader:
printNode leader "$leaderIp"
echo
echo Validators:
for ipAddress in "${validatorIpList[@]}"; do
printNode validator "$ipAddress"
done
echo
echo Clients:
if [[ ${#clientIpList[@]} -eq 0 ]]; then
echo " None"
else
for ipAddress in "${clientIpList[@]}"; do
printNode client "$ipAddress"
done
fi
exit 0


@ -4,7 +4,7 @@ The goal of this RFC is to define a set of constraints for APIs and runtime such
## Version ## Version
version 0.1 version 0.2
## Toolchain Stack ## Toolchain Stack
@ -37,154 +37,175 @@ version 0.1
In Figure 1 an untrusted client, creates a program in the front-end language of her choice, (like C/C++/Rust/Lua), and compiles it with LLVM to a position independent shared object ELF, targeting BPF bytecode. Solana will safely load and execute the ELF. In Figure 1 an untrusted client, creates a program in the front-end language of her choice, (like C/C++/Rust/Lua), and compiles it with LLVM to a position independent shared object ELF, targeting BPF bytecode. Solana will safely load and execute the ELF.
## Bytecode
Our bytecode is based on Berkley Packet Filter. The requirements for BPF overlap almost exactly with the requirements we have:
1. Deterministic amount of time to execute the code
2. Bytecode that is portable between machine instruction sets
3. Verified memory accesses
4. Fast to load the object, verify the bytecode and JIT to local machine instruction set
For 1, that means that loops are unrolled, and for any jumps back we can guard them with a check against the number of instruction that have been executed at this point. If the limit is reached, the program yields its execution. This involves saving the stack and current instruction index.
For 2, the BPF bytecode already easily maps to x8664, arm64 and other instruction sets. 
For 3, every load and store that is relative can be checked to be within the expected memory that is passed into the ELF. Dynamic load and stores can do a runtime check against available memory, these will be slow and should be avoided.
For 4, Fully linked PIC ELF with just a single RX segment. Effectively we are linking a shared object with `-fpic -target bpf` and with a linker script to collect everything into a single RX segment. Writable globals are not supported.
### Address Checks
The interface to the module takes a `&mut Vec<Vec<u8>>` in rust, or a `int sz, void* data[sz], int szs[sz]` in `C`. Given the module's bytecode, for each method, we need to analyze the bounds on load and stores into each buffer the module uses. This check needs to be done `on chain`, and after those bounds are computed we can verify that the user supplied array of buffers will not cause a memory fault. For load and stores that we cannot analyze, we can replace with a `safe_load` and `safe_store` instruction that will check the table for access.
## Loader
The loader is our first smart contract. The job of this contract is to load the actual program with its own instance data. The loader will verify the bytecode and that the object implements the expected entry points.
Since there is only one RX segment, the context for the contract instance is passed into each entry point as well as the event data for that entry point.
A client will create a transaction to create a new loader instance:
`Solana_NewLoader(Loader Instance PubKey, proof of key ownership, space I need for my elf)`
A client will then do a bunch of transactions to load its elf into the loader instance they created:
`Loader_UploadElf(Loader Instance PubKey, proof of key ownership, pos start, pos end, data)`
At this point the client can create a new instance of the module with its own instance address:
`Loader_NewInstance(Loader Instance PubKey, proof of key ownership, Instance PubKey, proof of key ownership)`
Once the instance has been created, the client may need to upload more user data to solana to configure this instance:
`Instance_UploadModuleData(Instance PubKey, proof of key ownership, pos start, pos end, data)`
Now clients can `start` the instance:
`Instance_Start(Instance PubKey, proof of key ownership)`
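Taken together, the loader flow reads as a small message protocol. A hedged sketch, with `Pubkey`/`Signature` abbreviated to fixed-size byte arrays and variant names mirroring the calls above (none of this is a wire format):
```
type Pubkey = [u8; 32];    // stand-in for the real key type
type Signature = [u8; 64]; // stand-in for the real signature type

#[allow(dead_code)]
enum LoaderMessage {
    NewLoader { loader: Pubkey, proof: Signature, space: u64 },
    UploadElf { loader: Pubkey, proof: Signature, start: u64, end: u64, data: Vec<u8> },
    NewInstance { loader: Pubkey, loader_proof: Signature, instance: Pubkey, instance_proof: Signature },
    UploadModuleData { instance: Pubkey, proof: Signature, start: u64, end: u64, data: Vec<u8> },
    Start { instance: Pubkey, proof: Signature },
}

fn main() {
    let msg = LoaderMessage::Start { instance: [0u8; 32], proof: [0u8; 64] };
    if let LoaderMessage::Start { .. } = msg {
        println!("instance start requested");
    }
}
```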
## Runtime ## Runtime
Our goal with the runtime is to have a general purpose execution environment that is highly parallelizable and doesn't require dynamic resource management. We want to execute as many contracts as we can in parallel, and have them pass or fail without a destructive state change. The goal with the runtime is to have a general purpose execution environment that is highly parallelizable and doesn't require dynamic resource management. The goal is to execute as many contracts as possible in parallel, and have them pass or fail without a destructive state change.
### State and Entry Point
State is addressed by an account, which is at the moment simply the PubKey. Our goal is to eliminate dynamic memory allocation in the smart contract itself, so the contract is a function that takes a mapping of [(PubKey,State)] and returns [(PubKey, State')]. The output keys are a subset of the input keys. Three basic kinds of state exist:
* Instance State
* Participant State
* Caller State
There isn't any difference in how each is implemented, but conceptually Participant State is memory that is allocated for each participant in the contract. Instance State is memory that is allocated for the contract itself, and Caller State is memory that the transaction's caller has allocated.
### Call ### State
State is addressed by an account which is at the moment simply the Pubkey. Our goal is to eliminate memory allocation from within the smart contract itself. Thus the client of the contract provides all the state that is necessary for the contract to execute in the transaction itself. The runtime interacts with the contract through a state transition function, which takes a mapping of [(Pubkey,State)] and returns [(Pubkey, State')]. The State is an opaque type to the runtime, a `Vec<u8>`, the contents of which the contract has full control over.
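The state transition shape described here can be sketched as follows (the `transition` helper and the toy contract body are illustrative; the runtime treats each `State` as an opaque `Vec<u8>`):
```
type Pubkey = [u8; 32];
type State = Vec<u8>; // opaque to the runtime

/// The contract may rewrite the state of the pages it was handed,
/// but it cannot allocate memory or introduce new keys.
fn transition(contract: impl Fn(&mut Vec<(Pubkey, State)>), mut pages: Vec<(Pubkey, State)>) -> Vec<(Pubkey, State)> {
    contract(&mut pages);
    pages
}

fn main() {
    let pages = vec![([0u8; 32], vec![1, 2, 3])];
    let out = transition(|p| p[0].1 = vec![4, 5, 6], pages);
    assert_eq!(out[0].1, vec![4, 5, 6]);
}
```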
### Call Structure
```
void call(
    const struct instance_data *data,
    const uint8_t kind[], //instance|participant|caller|read|write
    const uint8_t *keys[],
    uint8_t *state[],
    int num,
    uint8_t dirty[], //dirty memory bits
    uint8_t *userdata, //current transaction data
);
```
```
/// Call definition
/// Signed portion
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct CallData {
    /// Each Pubkey in this vector is mapped to a corresponding `Page` that is loaded for contract execution.
    /// In a simple pay transaction `key[0]` is the token owner's key and `key[1]` is the recipient's key.
    pub keys: Vec<Pubkey>,

    /// The Pubkeys that are required to have a proof. The proofs are a `Vec<Signature>` which is encoded alongside this data structure.
    /// Each Signature signs the `required_proofs` vector as well as the `keys` vector. The transaction is valid if and only if all
    /// the required signatures are present and the public key vector is unchanged between signatures.
    pub required_proofs: Vec<u8>,

    /// PoH data
    /// last PoH hash observed by the sender
    pub last_id: Hash,

    /// Program
    /// The address of the program we want to call. ContractId is just a Pubkey that is the address of the loaded code that will execute this Call.
    pub contract_id: ContractId,

    /// OS scheduling fee
    pub fee: i64,

    /// struct version to prevent duplicate spends
    /// Calls with a version <= Page.version are rejected
    pub version: u64,

    /// method to call in the contract
    pub method: u8,

    /// userdata in bytes
    pub userdata: Vec<u8>,
}

#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Clone)]
pub struct Call {
    /// Signatures and Keys
    /// (signature, key index)
    /// This vector contains a tuple of signatures, and the key index the signature is for
    /// proofs[0] is always key[0]
    pub proofs: Vec<Signature>,

    pub data: CallData,
}
```
To call this operation, the transaction that is destined to the contract instance specifies what keyed state it should present to the `call` function. To allocate the state memory or a call context, the client has to first call a function on the contract with the designated address that will own the state. At its core, this is just a set of Pubkeys and Signatures with a bit of metadata. The contract Pubkey routes this transaction into that contract's entry point. `version` is used for dropping retransmitted requests.
At its core, this is a system call that requires cryptographic proof of ownership of memory regions instead of an OS that checks page tables for access rights. Contracts should be able to read any state that is part of runtime, but only write to state that the contract allocated.
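As a concrete example of the replay protection that `version` provides, a sketch of the acceptance check (field names follow the structs above; the rest is illustrative):
```
struct Page { version: u64 }
struct CallData { version: u64 }

/// Calls with a version <= Page.version are rejected, so a retransmitted
/// (or replayed) Call cannot be applied twice.
fn accept(call: &CallData, page: &Page) -> bool {
    call.version > page.version
}

fn main() {
    let page = Page { version: 7 };
    assert!(accept(&CallData { version: 8 }, &page));
    assert!(!accept(&CallData { version: 7 }, &page)); // retransmission dropped
}
```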
* `Instance_AllocateContext(Instance PubKey, My PubKey, Proof of key ownership)`
Any transaction can then call `call` on the contract with a set of keys. It's up to the contract itself to manage ownership:
* `Instance_Call(Instance PubKey, [Context PubKeys], proofs of ownership, userdata...)`
Contracts should be able to read any state that is part of solana, but only write to state that the contract allocated.
#### Caller State
Caller `state` is memory allocated for the `call` that belongs to the public key that is issuing the `call`. This is the caller's context.
#### Instance State
Instance `state` is memory that belongs to this contract instance. We may need module-wide `state` as well.
#### Participant State
Participant `state` is any other memory. In some cases it may make sense to have these allocated as part of the call by the caller.
### Reduce
Some operations on the contract will require iteration over all the keys. To make this parallelizable, the iteration is broken up into reduce calls, which are then combined.
```
void reduce_m(
const struct instance_data *data,
const uint8_t *keys[],
const uint8_t *state[],
int num,
uint8_t *reduce_data,
);
void reduce_r(
const struct instance_data *data,
const uint8_t *reduce_data[],
int num,
uint8_t *reduce_data,
);
```
### Execution ### Execution
Transactions are batched and processed in parallel at each stage.
```
+-----------+    +--------------+        +-----------+    +---------------+
| sigverify |-+->| debit commit |---+--->| execution |-+->| memory commit |
+-----------+ |  +--------------+   |    +-----------+ |  +---------------+
              |                     |                  |
              |  +---------------+  |                  |  +--------------+
              +->| memory verify |->+                  +->| debit undo   |
                 +---------------+                     |  +--------------+
                                                       |
                                                       |  +---------------+
                                                       +->| credit commit |
                                                          +---------------+
```
The `debit verify` stage is very similar to `memory verify`. Proof of key ownership is used to check whether the caller's key has some state allocated with the contract, then the memory is loaded and executed. After the execution stage, the dirty pages are written back by the contract. Because we know all the memory accesses during execution, we can batch transactions that do not interfere with each other. We can also apply the `debit undo` and `credit commit` stages of the transaction. `debit undo` is run in case of an exception during contract execution; only transfers may be reversed, fees are committed to solana.
### GPU execution
A single contract can read and write to separate key pairs without interference. These separate calls to the same contract can execute on the same GPU thread over different memory using different SIMD lanes.
Calls are batched and processed in a pipeline
```
+-----------+    +-------------+    +--------------+    +--------------------+
| sigverify |--->| lock memory |--->| validate fee |--->| allocate new pages |--->
+-----------+    +-------------+    +--------------+    +--------------------+

    +------------+    +---------+    +---------------+    +--------------+
--->| load pages |--->| execute |--->| unlock memory |--->| commit pages |
    +------------+    +---------+    +---------------+    +--------------+
```
At the `execute` stage, the loaded pages have no data dependencies, so all the contracts can be executed in parallel.
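The batching claim can be made concrete: because the `keys` of each Call name every page it will touch, calls whose key sets are disjoint can share a batch. A sketch of such a scheduler (not the actual pipeline code; `Pubkey` is abbreviated):
```
use std::collections::HashSet;

type Pubkey = [u8; 32];

/// Greedily place each call into the first batch whose locked pages
/// don't intersect the call's key set; calls in one batch can run in parallel.
fn batch_non_conflicting(calls: &[Vec<Pubkey>]) -> Vec<Vec<usize>> {
    let mut batches: Vec<(HashSet<Pubkey>, Vec<usize>)> = vec![];
    'next: for (i, keys) in calls.iter().enumerate() {
        for (locked, batch) in batches.iter_mut() {
            if keys.iter().all(|k| !locked.contains(k)) {
                locked.extend(keys.iter().cloned());
                batch.push(i);
                continue 'next;
            }
        }
        batches.push((keys.iter().cloned().collect(), vec![i]));
    }
    batches.into_iter().map(|(_, b)| b).collect()
}

fn main() {
    let (a, b, c) = ([1u8; 32], [2u8; 32], [3u8; 32]);
    // calls 0 and 1 share no pages; call 2 conflicts with call 0
    let batches = batch_non_conflicting(&[vec![a, b], vec![c], vec![a]]);
    assert_eq!(batches, vec![vec![0, 1], vec![2]]);
}
```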
## Memory Management
```
pub struct Page {
/// key that indexes this page
/// prove ownership of this key to spend from this Page
owner: Pubkey,
/// contract that owns this page
/// contract can write to the data that is in `memory` vector
contract: Pubkey,
/// balance that belongs to owner
balance: u64,
/// version of the structure, public for testing
version: u64,
/// hash of the page data
memhash: Hash,
/// The following could be in a separate structure
memory: Vec<u8>,
}
```
The guarantees that the runtime enforces:
1. The contract code is the only code that will modify the contents of `memory`
2. The total balance on all the pages is equal before and after execution of a call
3. The balance of each page not owned by the contract must be equal to or greater after the call than before the call.
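Guarantee 2 is straightforward to state as a check the runtime could run around every call; a sketch with `Page` trimmed to the one field involved:
```
struct Page { balance: u64 }

/// Tokens may move between pages during a call, but none may be
/// created or destroyed: the sums must match.
fn conserves_balance(before: &[Page], after: &[Page]) -> bool {
    let sum = |pages: &[Page]| pages.iter().map(|p| p.balance).sum::<u64>();
    sum(before) == sum(after)
}

fn main() {
    let before = vec![Page { balance: 10 }, Page { balance: 0 }];
    let after = vec![Page { balance: 7 }, Page { balance: 3 }];
    assert!(conserves_balance(&before, &after)); // a move, not a mint
}
```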
## Entry Point
Execution of the contract involves mapping the contract's public key to an entry point, which takes a pointer to the transaction and an array of loaded pages.
```
// Find the method
match (tx.contract, tx.method) {
// system interface
// everyone has the same reallocate
(_, 0) => system_0_realloc(&tx, &mut call_pages),
(_, 1) => system_1_assign(&tx, &mut call_pages),
// contract methods
(DEFAULT_CONTRACT, 128) => default_contract_128_move_funds(&tx, &mut call_pages),
(contract, method) => //...
```
The first 128 methods (0 to 127) are reserved for the system interface, which implements allocation and assignment of memory. The rest, including the method for moving funds, are implemented by the contract itself.
## System Interface
```
/// SYSTEM interface, same for every contract, methods 0 to 127
/// method 0
/// reallocate
/// resize the memory associated with the caller's page
pub fn system_0_realloc(call: &Call, pages: &mut Vec<Page>) {
if call.contract == DEFAULT_CONTRACT {
let size: u64 = deserialize(&call.userdata).unwrap();
pages[0].memory.resize(size as usize, 0u8);
}
}
/// method 1
/// assign
/// assign the page to a contract
pub fn system_1_assign(call: &Call, pages: &mut Vec<Page>) {
let contract: Pubkey = deserialize(&call.userdata).unwrap();
if call.contract == DEFAULT_CONTRACT {
pages[0].contract = contract;
//zero out the memory in pages[0].memory
//Contracts need to own the state of that data otherwise a user could fabricate the state and
//manipulate the contract
pages[0].memory.clear();
}
}
```
The first method resizes the memory that is associated with the caller's page. The second system call assigns the page to the contract. Both methods check that the call is addressed to the `DEFAULT_CONTRACT`; otherwise the method does nothing and the caller has simply spent their fee.
This ensures that when memory is assigned to the contract the initial state of all the bytes is 0, and the contract itself is the only thing that can modify that state.
## Simplest contract
```
/// DEFAULT_CONTRACT interface
/// Contract-defined methods start at 128
/// method 128
/// move_funds
/// spend the funds from the caller's page to the recipient's page
pub fn default_contract_128_move_funds(call: &Call, pages: &mut Vec<Page>) {
let amount: u64 = deserialize(&call.userdata).unwrap();
if pages[0].balance >= amount {
pages[0].balance -= amount;
pages[1].balance += amount;
}
}
```
This simply moves the amount from `pages[0]`, which is the caller's page, to `pages[1]`, which is the recipient's page.
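Worked through with concrete balances (a sketch that inlines the deserialization of `userdata` into `amount`), note that an underfunded call is simply a no-op while the fee is still spent:
```
struct Page { balance: u64 }

/// Mirrors `default_contract_128_move_funds` above, with the
/// deserialization of `userdata` inlined as a plain parameter.
fn move_funds(amount: u64, pages: &mut [Page]) {
    if pages[0].balance >= amount {
        pages[0].balance -= amount;
        pages[1].balance += amount;
    }
}

fn main() {
    let mut pages = vec![Page { balance: 10 }, Page { balance: 0 }];
    move_funds(4, &mut pages);
    assert_eq!((pages[0].balance, pages[1].balance), (6, 4));
    move_funds(100, &mut pages); // underfunded: a silent no-op
    assert_eq!((pages[0].balance, pages[1].balance), (6, 4));
}
```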
## Notes ## Notes
1. There is no dynamic memory allocation. 1. There is no dynamic memory allocation.
2. Persistant Memory is allocated to a Key with ownership 2. Persistent Memory is allocated to a Key with ownership
3. Contracts can `call` to update key owned state 3. Contracts can `call` to update key owned state
4. Contracts can `reduce` over the memory to aggregate state 4. `call` is just a *syscall* that does a cryptographic check of memory ownership
5. `call` is just a *syscall* that does a cryptographic check of memory owndershp 5. Kernel guarantees that when memory is assigned to the contract its state is 0
6. Kernel guarantees that contract is the only thing that can modify memory that its assigned to
7. Kernel guarantees that the contract can only spend tokens that are in pages that are assigned to it
8. Kernel guarantees the balances belonging to pages are balanced before and after the call


@ -0,0 +1,77 @@
Two players want to play tic-tac-toe with each other on Solana.
The tic-tac-toe program has already been provisioned on the network, and the
program author has advertised the following information to potential gamers:
* `tictactoe_publickey` - the program's public key
* `tictactoe_gamestate_size` - the number of bytes needed to maintain the game state
The game state is a well-documented data structure consisting of:
- Player 1's public key
- Player 2's public key
- Game status. An 8-bit value where:
* 0 = game uninitialized
* 1 = Player 1's turn
* 2 = Player 2's turn
* 3 = Player 1 won
* 4 = Player 2 won
- Current board configuration. A 3x3 character array containing the values '\0', 'X' or 'O'
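One possible in-memory layout of that game state, sketched in Rust (the program's actual byte layout is not specified here, so the field types are assumptions):
```
type Pubkey = [u8; 32];

#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq)]
enum Status {
    Uninitialized = 0,
    Player1Turn = 1,
    Player2Turn = 2,
    Player1Won = 3,
    Player2Won = 4,
}

struct GameState {
    player1: Pubkey,     // Player 1's public key
    player2: Pubkey,     // Player 2's public key
    status: Status,      // the 8-bit game status above
    board: [[u8; 3]; 3], // b'\0', b'X' or b'O'
}

fn main() {
    let game = GameState {
        player1: [1u8; 32],
        player2: [2u8; 32],
        status: Status::Uninitialized,
        board: [[0u8; 3]; 3],
    };
    assert!(game.status == Status::Uninitialized);
    assert_eq!(game.board[0][0], 0);
    let _ = (game.player1, game.player2);
}
```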
### Game Setup
1. Two players want to start a game. Player 2 sends Player 1 their public key,
`player2_publickey` off-chain (IM, email, etc)
2. Player 1 creates a new keypair to represent the game state, `(gamestate_publickey,
gamestate_privatekey)`.
3. Player 1 issues an allocate_memory transaction, assigning that memory page to the
tic-tac-toe program. The `memory_fee` is used to *rent* the memory page for the
duration of the game and is subtracted from the current account balance of Player
1:
```
allocate_memory(gamestate_publickey, tictactoe_publickey, tictactoe_gamestate_size, memory_fee)
```
4. Game state is then initialized by issuing a *new* call transaction to the
tic-tac-toe program. This transaction is signed by `gamestate_privatekey`, known only
to Player 1.
```
call(tictactoe_publickey, gamestate_publickey, 'new', player1_publickey, player2_publickey)
```
5. Once the game is initialized, Player 1 shares `gamestate_publickey` with
Player 2 off-chain (IM, email, etc)
Note that it's likely each player would prefer to generate a game-specific keypair
rather than sharing their primary public key (`player1_publickey`,
`player2_publickey`) with each other and the tic-tac-toe program.
### Game Play
Both players poll the network, via a **TBD off-chain RPC API**, to read the
current game state from the `gamestate_publickey` memory page.
When the *Game status* field indicates it's their turn, the player issues a
*move* call transaction passing in the board position (1..9) that they want to
mark as X or O:
```
call(tictactoe_publickey, gamestate_publickey, 'move', position)
```
The program will reject the transaction if it was not signed by the player whose
turn it is.
The outcome of the *move* call is also observed by polling the current game state via
the **TBD off-chain RPC API**.
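The turn check behind that rejection can be sketched as follows, where `signer` stands in for the public key recovered from the transaction's verified signature (the helper is illustrative, not the program's actual code):
```
type Pubkey = [u8; 32];

/// A *move* is only permitted when the game status says it is the
/// signer's turn; anything else is rejected.
fn may_move(status: u8, player1: &Pubkey, player2: &Pubkey, signer: &Pubkey) -> bool {
    match status {
        1 => signer == player1, // Player 1's turn
        2 => signer == player2, // Player 2's turn
        _ => false,             // uninitialized or finished game
    }
}

fn main() {
    let (p1, p2) = ([1u8; 32], [2u8; 32]);
    assert!(may_move(1, &p1, &p2, &p1));
    assert!(!may_move(1, &p1, &p2, &p2)); // not Player 2's turn
    assert!(!may_move(3, &p1, &p2, &p1)); // game already won
}
```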
### Game Cancellation
At any time Player 1 may conclude the game by issuing:
```
call(tictactoe_publickey, gamestate_publickey, 'abort')
```
causing any remaining *rent* tokens assigned to the `gamestate_publickey` page
to be transferred back to Player 1 by the tic-tac-toe program. Lastly, the
network recognizes the empty account and frees the `gamestate_publickey` memory
page.


@ -0,0 +1,59 @@
```
========================= master branch (edge channel) =======================>
         \                            \                        \
          \___v0.7.0 tag               \                        \
           \                            \             v0.9.0 tag__\
            \                v0.8.0 tag__\                         \
 v0.7.1 tag__\                            \                         v0.9 branch (beta channel)
              \___v0.7.2 tag               \___v0.8.1 tag
               \                            \
                \                            \
             v0.7 branch                 v0.8 branch (stable channel)
```
## Branches and Tags
### master branch
All new development occurs on the `master` branch.
Bug fixes that affect a `vX.Y` branch are first made on `master`. This is to
allow the fix some soak time on `master` before it is applied to one or more
stabilization branches.
Merging to `master` first also helps ensure that fixes applied to one release
are present for future releases. (Sometimes the joy of landing a critical
release blocker in a branch causes you to forget to propagate back to
`master`!)
Once the bug fix lands on `master` it is cherry-picked into the `vX.Y` branch
and potentially the `vX.Y-1` branch. The exception to this rule is when a bug
fix for `vX.Y` doesn't apply to `master` or `vX.Y-1`.
Immediately after a new stabilization branch is forged, the `Cargo.toml` minor
version (*Y*) in the `master` branch is incremented by the release engineer.
Incrementing the major version of the `master` branch is outside the scope of
this document.
### v*X.Y* stabilization branches
These are stabilization branches for a given milestone. They are created off
the `master` branch as late as possible prior to the milestone release.
### v*X.Y.Z* release tag
The release tags are created as desired by the owner of the given stabilization
branch, and cause that *X.Y.Z* release to be shipped to https://crates.io,
https://snapcraft.io/, and elsewhere.
Immediately after a new v*X.Y.Z* branch tag has been created, the `Cargo.toml`
patch version number (*Z*) of the stabilization branch is incremented by the
release engineer.
## Channels
Channels are used by end-users (humans and bots) to consume the branches
described in the previous section, so they may automatically update to the most
recent version matching their desired stability.
There are three release channels that map to branches as follows:
* edge - tracks the `master` branch, least stable.
* beta - tracks the largest (and latest) `vX.Y` stabilization branch, more stable.
* stable - tracks the second largest `vX.Y` stabilization branch, most stable.


@ -0,0 +1,51 @@
# |source| this file
#
# The SOLANA_METRICS_CONFIG environment variable is formatted as a
# comma-delimited list of parameters. All parameters are optional.
#
# Example:
# export SOLANA_METRICS_CONFIG="host=<metrics host>,db=<database name>,u=<username>,p=<password>"
#
# The following directive disables complaints about unused variables in this
# file:
# shellcheck disable=2034
#
metricsWriteDatapoint="$(dirname "${BASH_SOURCE[0]}")"/metrics-write-datapoint.sh
configureMetrics() {
[[ -n $SOLANA_METRICS_CONFIG ]] || return 0
declare metricsParams
IFS=',' read -r -a metricsParams <<< "$SOLANA_METRICS_CONFIG"
for param in "${metricsParams[@]}"; do
IFS='=' read -r -a pair <<< "$param"
if [[ ${#pair[@]} != 2 ]]; then
echo Error: invalid metrics parameter: "$param" >&2
else
declare name="${pair[0]}"
declare value="${pair[1]}"
case "$name" in
host)
export INFLUX_HOST="$value"
echo INFLUX_HOST="$INFLUX_HOST" >&2
;;
db)
export INFLUX_DATABASE="$value"
echo INFLUX_DATABASE="$INFLUX_DATABASE" >&2
;;
u)
export INFLUX_USERNAME="$value"
echo INFLUX_USERNAME="$INFLUX_USERNAME" >&2
;;
p)
export INFLUX_PASSWORD="$value"
echo INFLUX_PASSWORD="********" >&2
;;
*)
echo Error: Unknown metrics parameter name: "$name" >&2
;;
esac
fi
done
}
configureMetrics


@ -0,0 +1,19 @@
#!/bin/bash -e
#
# Send a metrics datapoint
#
point=$1
if [[ -z $point ]]; then
echo "Data point not specified"
exit 1
fi
echo "Influx data point: $point"
if [[ -z $INFLUX_DATABASE || -z $INFLUX_USERNAME || -z $INFLUX_PASSWORD ]]; then
echo Influx user credentials not found
exit 0
fi
echo "https://metrics.solana.com:8086/write?db=${INFLUX_DATABASE}&u=${INFLUX_USERNAME}&p=${INFLUX_PASSWORD}" \
| xargs curl --max-time 5 -XPOST --data-binary "$point"

49
scripts/net-stats.sh Executable file

@ -0,0 +1,49 @@
#!/bin/bash -e
#
# Reports network statistics
#
[[ $(uname) == Linux ]] || exit 0
cd "$(dirname "$0")"
# shellcheck source=scripts/configure-metrics.sh
source configure-metrics.sh
packets_received=0
packets_received_diff=0
receive_errors=0
receive_errors_diff=0
rcvbuf_errors=0
rcvbuf_errors_diff=0
update_netstat() {
declare net_stat
net_stat=$(netstat -suna)
declare stats
stats=$(echo "$net_stat" | awk 'BEGIN {tmp_var = 0} /packets received/ {tmp_var = $1} END { print tmp_var }')
packets_received_diff=$((stats - packets_received))
packets_received="$stats"
stats=$(echo "$net_stat" | awk 'BEGIN {tmp_var = 0} /packet receive errors/ {tmp_var = $1} END { print tmp_var }')
receive_errors_diff=$((stats - receive_errors))
receive_errors="$stats"
stats=$(echo "$net_stat" | awk 'BEGIN {tmp_var = 0} /RcvbufErrors/ {tmp_var = $2} END { print tmp_var }')
rcvbuf_errors_diff=$((stats - rcvbuf_errors))
rcvbuf_errors="$stats"
}
update_netstat
while true; do
update_netstat
report="packets_received=$packets_received_diff,receive_errors=$receive_errors_diff,rcvbuf_errors=$rcvbuf_errors_diff"
echo "$report"
./metrics-write-datapoint.sh "net-stats,hostname=$HOSTNAME $report"
sleep 1
done
exit 1

35
scripts/oom-monitor.sh Executable file

@ -0,0 +1,35 @@
#!/bin/bash -e
#
# Reports Linux OOM Killer activity
#
cd "$(dirname "$0")"
# shellcheck source=scripts/oom-score-adj.sh
source oom-score-adj.sh
# shellcheck source=scripts/configure-metrics.sh
source configure-metrics.sh
[[ $(uname) = Linux ]] || exit 0
syslog=/var/log/syslog
[[ -r $syslog ]] || {
echo Unable to read $syslog
exit 1
}
# Adjust OOM score to reduce the chance that this script will be killed
# during an Out of Memory event since the purpose of this script is to
# report such events
oom_score_adj "self" -500
while read -r victim; do
echo "Out of memory event detected, $victim killed"
./metrics-write-datapoint.sh "oom-killer,victim=$victim,hostname=$HOSTNAME killed=1"
done < <( \
tail --follow=name --retry -n0 $syslog \
| sed --unbuffered -n 's/^.* Out of memory: Kill process [1-9][0-9]* (\([^)]*\)) .*/\1/p' \
)
exit 1

20
scripts/oom-score-adj.sh Normal file

@ -0,0 +1,20 @@
# |source| this file
#
# Adjusts the OOM score for the specified process. Linux only
#
# usage: oom_score_adj [pid] [score]
#
oom_score_adj() {
declare pid=$1
declare score=$2
if [[ $(uname) != Linux ]]; then
return
fi
echo "$score" > "/proc/$pid/oom_score_adj" || true
declare currentScore
currentScore=$(cat "/proc/$pid/oom_score_adj" || true)
if [[ $score != "$currentScore" ]]; then
echo "Failed to set oom_score_adj to $score for pid $pid (current score: $currentScore)"
fi
}

70
scripts/perf-stats.py Executable file

@ -0,0 +1,70 @@
#!/usr/bin/env python3
import json
import sys
stages_data = {}
if len(sys.argv) != 2:
print("USAGE: {} <input file>".format(sys.argv[0]))
sys.exit(1)
with open(sys.argv[1]) as fh:
for line in fh.readlines():
if "COUNTER" in line:
json_part = line[line.find("{"):]
x = json.loads(json_part)
counter = x['name']
if not (counter in stages_data):
stages_data[counter] = {'first_ts': x['now'], 'last_ts': x['now'], 'last_count': 0,
'data': [], 'max_speed': 0, 'min_speed': 9999999999.0,
'count': 0,
'max_speed_ts': 0, 'min_speed_ts': 0}
stages_data[counter]['count'] += 1
count_since_last = x['counts'] - stages_data[counter]['last_count']
time_since_last = float(x['now'] - stages_data[counter]['last_ts'])
if time_since_last > 1:
speed = 1000.0 * (count_since_last / time_since_last)
stages_data[counter]['data'].append(speed)
if speed > stages_data[counter]['max_speed']:
stages_data[counter]['max_speed'] = speed
stages_data[counter]['max_speed_ts'] = x['now']
if speed < stages_data[counter]['min_speed']:
stages_data[counter]['min_speed'] = speed
stages_data[counter]['min_speed_ts'] = x['now']
stages_data[counter]['last_ts'] = x['now']
stages_data[counter]['last_count'] = x['counts']
for stage in stages_data.keys():
stages_data[stage]['data'].sort()
#mean_index = stages_data[stage]['count'] / 2
mean = 0
average = 0
eightieth = 0
data_len = len(stages_data[stage]['data'])
mean_index = int(data_len / 2)  # index of the median of the sorted samples
eightieth_index = int(data_len * 0.8)
#print("mean idx: {} data.len: {}".format(mean_index, data_len))
if data_len > 0:
mean = stages_data[stage]['data'][mean_index]
average = float(sum(stages_data[stage]['data'])) / data_len
eightieth = stages_data[stage]['data'][eightieth_index]
print("stage: {} max: {:,.2f} min: {:.2f} count: {} total: {} mean: {:,.2f} average: {:,.2f} 80%: {:,.2f}".format(stage,
stages_data[stage]['max_speed'],
stages_data[stage]['min_speed'],
stages_data[stage]['count'],
stages_data[stage]['last_count'],
mean, average, eightieth))
num = 5
idx = -1
if data_len >= num:
print(" top {}: ".format(num), end='')
for x in range(0, num):
print("{:,.2f} ".format(stages_data[stage]['data'][idx]), end='')
idx -= 1
if stages_data[stage]['data'][idx] < average:
break
print("")
print(" max_ts: {} min_ts: {}".format(stages_data[stage]['max_speed_ts'], stages_data[stage]['min_speed_ts']))
print("\n")

22
scripts/snap-config-to-env.sh Executable file

@ -0,0 +1,22 @@
#!/bin/bash
#
# Snap daemons have no access to the environment so |snap set solana ...| is
# used to set runtime configuration.
#
# This script exports the snap runtime configuration options back as
# environment variables before invoking the specified program
#
if [[ -d $SNAP ]]; then # Running inside a Linux Snap?
RUST_LOG="$(snapctl get rust-log)"
SOLANA_CUDA="$(snapctl get enable-cuda)"
SOLANA_DEFAULT_METRICS_RATE="$(snapctl get default-metrics-rate)"
SOLANA_METRICS_CONFIG="$(snapctl get metrics-config)"
export RUST_LOG
export SOLANA_CUDA
export SOLANA_DEFAULT_METRICS_RATE
export SOLANA_METRICS_CONFIG
fi
exec "$@"


@ -3,15 +3,14 @@
# Wallet sanity test # Wallet sanity test
# #
here=$(dirname "$0") cd "$(dirname "$0")"/..
cd "$here"
if [[ -n "$USE_SNAP" ]]; then if [[ -n "$USE_SNAP" ]]; then
# TODO: Merge wallet.sh functionality into solana-wallet proper and # TODO: Merge wallet.sh functionality into solana-wallet proper and
# remove this USE_SNAP case # remove this USE_SNAP case
wallet="solana.wallet $1" wallet="solana.wallet $1"
else else
wallet="../wallet.sh $1" wallet="multinode-demo/wallet.sh $1"
fi fi
# Tokens transferred to this address are lost forever... # Tokens transferred to this address are lost forever...
@ -35,7 +34,7 @@ pay_and_confirm() {
$wallet reset $wallet reset
$wallet address $wallet address
check_balance_output "Your balance is: 0" check_balance_output "No account found" "Your balance is: 0"
$wallet airdrop --tokens 60 $wallet airdrop --tokens 60
check_balance_output "Your balance is: 60" check_balance_output "Your balance is: 60"
$wallet airdrop --tokens 40 $wallet airdrop --tokens 40

17
snap/hooks/configure vendored

@ -4,27 +4,31 @@ echo Stopping daemons
snapctl stop --disable solana.daemon-drone snapctl stop --disable solana.daemon-drone
snapctl stop --disable solana.daemon-leader snapctl stop --disable solana.daemon-leader
snapctl stop --disable solana.daemon-validator snapctl stop --disable solana.daemon-validator
snapctl stop --disable solana.daemon-oom-monitor
snapctl stop --disable solana.daemon-net-stats
mode="$(snapctl get mode)" mode="$(snapctl get mode)"
if [[ -z "$mode" ]]; then if [[ -z "$mode" ]]; then
exit 0 exit 0
fi fi
ip_address_arg=-p # Use public IP address (TODO: make this configurable?)
num_tokens="$(snapctl get num-tokens)" num_tokens="$(snapctl get num-tokens)"
num_tokens="${num_tokens:+-n $num_tokens}"
setup_args="$(snapctl get setup-args)"
case $mode in case $mode in
leader+drone) leader+drone)
$SNAP/bin/setup.sh ${num_tokens:+-n $num_tokens} ${ip_address_arg} -t leader "$SNAP"/multinode-demo/setup.sh -t leader $num_tokens -p $setup_args
snapctl start --enable solana.daemon-leader
snapctl start --enable solana.daemon-drone snapctl start --enable solana.daemon-drone
snapctl start --enable solana.daemon-leader
;; ;;
leader) leader)
$SNAP/bin/setup.sh ${num_tokens:+-n $num_tokens} ${ip_address_arg} -t leader "$SNAP"/multinode-demo/setup.sh -t leader $num_tokens -p $setup_args
snapctl start --enable solana.daemon-leader snapctl start --enable solana.daemon-leader
;; ;;
validator) validator)
$SNAP/bin/setup.sh ${ip_address_arg} -t validator "$SNAP"/multinode-demo/setup.sh -t validator -p $setup_args
snapctl start --enable solana.daemon-validator snapctl start --enable solana.daemon-validator
;; ;;
*) *)
@ -32,3 +36,6 @@ validator)
exit 1 exit 1
;; ;;
esac esac
snapctl start --enable solana.daemon-oom-monitor
snapctl start --enable solana.daemon-net-stats


@ -44,53 +44,66 @@ apps:
command: solana-keygen command: solana-keygen
plugs: plugs:
- home - home
client-demo: ledger-tool:
# TODO: Merge client.sh functionality into solana-client-demo proper command: solana-ledger-tool
command: client.sh plugs:
#command: solana-client-demo - home
bench-tps:
command: solana-bench-tps
plugs: plugs:
- network - network
- network-bind - network-bind
- home - home
wallet: wallet:
# TODO: Merge wallet.sh functionality into solana-wallet proper # TODO: Merge wallet.sh functionality into solana-wallet proper
command: wallet.sh command: multinode-demo/wallet.sh
#command: solana-wallet #command: solana-wallet
plugs: plugs:
- network - network
- home - home
daemon-validator: daemon-validator:
daemon: simple daemon: simple
command: validator.sh command: scripts/snap-config-to-env.sh $SNAP/multinode-demo/validator.sh
plugs: plugs:
- network - network
- network-bind - network-bind
daemon-leader: daemon-leader:
daemon: simple daemon: simple
command: leader.sh command: scripts/snap-config-to-env.sh $SNAP/multinode-demo/leader.sh
plugs: plugs:
- network - network
- network-bind - network-bind
daemon-drone: daemon-drone:
daemon: simple daemon: simple
command: drone.sh command: scripts/snap-config-to-env.sh $SNAP/multinode-demo/drone.sh
plugs: plugs:
- network - network
- network-bind - network-bind
daemon-oom-monitor:
daemon: simple
command: scripts/snap-config-to-env.sh $SNAP/scripts/oom-monitor.sh
plugs:
- network
daemon-net-stats:
daemon: simple
command: scripts/snap-config-to-env.sh $SNAP/scripts/net-stats.sh
plugs:
- network
parts: parts:
solana: solana:
plugin: nil plugin: nil
prime: prime:
- bin - bin
- multinode-demo
- scripts
- usr/lib - usr/lib
override-build: | override-build: |
# Install CUDA 9.2 runtime # Install CUDA 9.2 runtime
mkdir -p $SNAPCRAFT_PART_INSTALL/usr/
cp -rav /usr/local/cuda-9.2/targets/x86_64-linux/lib/ $SNAPCRAFT_PART_INSTALL/usr/lib
mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/x86_64-linux-gnu/
cp -rav /usr/lib/x86_64-linux-gnu/libcuda.* $SNAPCRAFT_PART_INSTALL/usr/lib/x86_64-linux-gnu/
mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/nvidia-396/ mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/nvidia-396/
mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/x86_64-linux-gnu/
cp -rav /usr/local/cuda-9.2/targets/x86_64-linux/lib/libcudart.so* $SNAPCRAFT_PART_INSTALL/usr/lib
cp -rav /usr/lib/x86_64-linux-gnu/libcuda.so* $SNAPCRAFT_PART_INSTALL/usr/lib/x86_64-linux-gnu/
cp -v /usr/lib/nvidia-396/libnvidia-fatbinaryloader.so* $SNAPCRAFT_PART_INSTALL/usr/lib/nvidia-396/ cp -v /usr/lib/nvidia-396/libnvidia-fatbinaryloader.so* $SNAPCRAFT_PART_INSTALL/usr/lib/nvidia-396/
# Build/install solana-fullnode-cuda # Build/install solana-fullnode-cuda
@ -100,19 +113,25 @@ parts:
rm -rf $SNAPCRAFT_PART_INSTALL/bin/* rm -rf $SNAPCRAFT_PART_INSTALL/bin/*
mv $SNAPCRAFT_PART_INSTALL/solana-fullnode $SNAPCRAFT_PART_INSTALL/bin/solana-fullnode-cuda mv $SNAPCRAFT_PART_INSTALL/solana-fullnode $SNAPCRAFT_PART_INSTALL/bin/solana-fullnode-cuda
mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/ mkdir -p $SNAPCRAFT_PART_INSTALL/usr/lib/
cp -f libJerasure.so $SNAPCRAFT_PART_INSTALL/usr/lib/libJerasure.so.2 cp -f target/perf-libs/libJerasure.so $SNAPCRAFT_PART_INSTALL/usr/lib/libJerasure.so.2
cp -f libgf_complete.so $SNAPCRAFT_PART_INSTALL/usr/lib/libgf_complete.so.1 cp -f target/perf-libs/libgf_complete.so $SNAPCRAFT_PART_INSTALL/usr/lib/libgf_complete.so.1
# Build/install all other programs # Build/install all other programs
cargo install --root $SNAPCRAFT_PART_INSTALL --bins cargo install --root $SNAPCRAFT_PART_INSTALL --bins
# Install multinode scripts # Install multinode-demo/
mkdir -p $SNAPCRAFT_PART_INSTALL/bin mkdir -p $SNAPCRAFT_PART_INSTALL/multinode-demo/
cp -av multinode-demo/* $SNAPCRAFT_PART_INSTALL/bin/ cp -av multinode-demo/* $SNAPCRAFT_PART_INSTALL/multinode-demo/
# TODO: build rsync/multilog from source instead of sneaking it in from the host # Install scripts/
# system... mkdir -p $SNAPCRAFT_PART_INSTALL/scripts/
cp -av scripts/* $SNAPCRAFT_PART_INSTALL/scripts/
# TODO: build curl,dig,rsync/multilog from source instead of sneaking it
# in from the host system...
set -x set -x
mkdir -p $SNAPCRAFT_PART_INSTALL/bin mkdir -p $SNAPCRAFT_PART_INSTALL/bin
cp -av /usr/bin/rsync $SNAPCRAFT_PART_INSTALL/bin/ cp -av /usr/bin/curl $SNAPCRAFT_PART_INSTALL/bin/
cp -av /usr/bin/dig $SNAPCRAFT_PART_INSTALL/bin/
cp -av /usr/bin/multilog $SNAPCRAFT_PART_INSTALL/bin/ cp -av /usr/bin/multilog $SNAPCRAFT_PART_INSTALL/bin/
cp -av /usr/bin/rsync $SNAPCRAFT_PART_INSTALL/bin/

File diff suppressed because it is too large


@ -5,6 +5,7 @@
use bank::Bank; use bank::Bank;
use bincode::deserialize; use bincode::deserialize;
use counter::Counter; use counter::Counter;
use log::Level;
use packet::{PacketRecycler, Packets, SharedPackets}; use packet::{PacketRecycler, Packets, SharedPackets};
use rayon::prelude::*; use rayon::prelude::*;
use record_stage::Signal; use record_stage::Signal;
@ -51,8 +52,7 @@ impl BankingStage {
_ => error!("{:?}", e), _ => error!("{:?}", e),
} }
} }
}) }).unwrap();
.unwrap();
(BankingStage { thread_hdl }, signal_receiver) (BankingStage { thread_hdl }, signal_receiver)
} }
@ -65,8 +65,7 @@ impl BankingStage {
deserialize(&x.data[0..x.meta.size]) deserialize(&x.data[0..x.meta.size])
.map(|req| (req, x.meta.addr())) .map(|req| (req, x.meta.addr()))
.ok() .ok()
}) }).collect()
.collect()
} }
/// Process the incoming packets and send output `Signal` messages to `signal_sender`. /// Process the incoming packets and send output `Signal` messages to `signal_sender`.
@ -88,6 +87,7 @@ impl BankingStage {
timing::duration_as_ms(&recv_start.elapsed()), timing::duration_as_ms(&recv_start.elapsed()),
mms.len(), mms.len(),
); );
let bank_starting_tx_count = bank.transaction_count();
let count = mms.iter().map(|x| x.1.len()).sum(); let count = mms.iter().map(|x| x.1.len()).sum();
let proc_start = Instant::now(); let proc_start = Instant::now();
for (msgs, vers) in mms { for (msgs, vers) in mms {
@ -103,8 +103,7 @@ impl BankingStage {
} else { } else {
None None
}, },
}) }).collect();
.collect();
debug!("process_transactions"); debug!("process_transactions");
let results = bank.process_transactions(transactions); let results = bank.process_transactions(transactions);
@ -112,7 +111,7 @@ impl BankingStage {
signal_sender.send(Signal::Transactions(transactions))?; signal_sender.send(Signal::Transactions(transactions))?;
debug!("done process_transactions"); debug!("done process_transactions");
packet_recycler.recycle(msgs); packet_recycler.recycle(msgs, "process_transactions");
} }
let total_time_s = timing::duration_as_s(&proc_start.elapsed()); let total_time_s = timing::duration_as_s(&proc_start.elapsed());
let total_time_ms = timing::duration_as_ms(&proc_start.elapsed()); let total_time_ms = timing::duration_as_ms(&proc_start.elapsed());
@ -124,7 +123,11 @@ impl BankingStage {
reqs_len, reqs_len,
(reqs_len as f32) / (total_time_s) (reqs_len as f32) / (total_time_s)
); );
inc_new_counter!("banking_stage-process_packets", count); inc_new_counter_info!("banking_stage-process_packets", count);
inc_new_counter_info!(
"banking_stage-process_transactions",
bank.transaction_count() - bank_starting_tx_count
);
Ok(()) Ok(())
} }
} }

131
src/bin/bench-streamer.rs Normal file

@ -0,0 +1,131 @@
extern crate clap;
extern crate solana;
use clap::{App, Arg};
use solana::netutil::bind_to;
use solana::packet::{Packet, PacketRecycler, BLOB_SIZE, PACKET_DATA_SIZE};
use solana::result::Result;
use solana::streamer::{receiver, PacketReceiver};
use std::cmp::max;
use std::net::{IpAddr, Ipv4Addr, SocketAddr, UdpSocket};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::mpsc::channel;
use std::sync::Arc;
use std::thread::sleep;
use std::thread::{spawn, JoinHandle};
use std::time::Duration;
use std::time::SystemTime;
fn producer(addr: &SocketAddr, recycler: &PacketRecycler, exit: Arc<AtomicBool>) -> JoinHandle<()> {
let send = UdpSocket::bind("0.0.0.0:0").unwrap();
let msgs = recycler.allocate();
let msgs_ = msgs.clone();
msgs.write().unwrap().packets.resize(10, Packet::default());
for w in &mut msgs.write().unwrap().packets {
w.meta.size = PACKET_DATA_SIZE;
w.meta.set_addr(&addr);
}
spawn(move || loop {
if exit.load(Ordering::Relaxed) {
return;
}
let mut num = 0;
for p in &msgs_.read().unwrap().packets {
let a = p.meta.addr();
assert!(p.meta.size < BLOB_SIZE);
send.send_to(&p.data[..p.meta.size], &a).unwrap();
num += 1;
}
assert_eq!(num, 10);
})
}
fn sink(
recycler: PacketRecycler,
exit: Arc<AtomicBool>,
rvs: Arc<AtomicUsize>,
r: PacketReceiver,
) -> JoinHandle<()> {
spawn(move || loop {
if exit.load(Ordering::Relaxed) {
return;
}
let timer = Duration::new(1, 0);
if let Ok(msgs) = r.recv_timeout(timer) {
rvs.fetch_add(msgs.read().unwrap().packets.len(), Ordering::Relaxed);
recycler.recycle(msgs, "sink");
}
})
}
fn main() -> Result<()> {
let mut num_sockets = 1usize;
let matches = App::new("solana-bench-streamer")
.arg(
Arg::with_name("num-recv-sockets")
.long("num-recv-sockets")
.value_name("NUM")
.takes_value(true)
.help("Use NUM receive sockets"),
).get_matches();
if let Some(n) = matches.value_of("num-recv-sockets") {
num_sockets = max(num_sockets, n.to_string().parse().expect("integer"));
}
let mut port = 0;
let mut addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), 0);
let exit = Arc::new(AtomicBool::new(false));
let pack_recycler = PacketRecycler::default();
let mut read_channels = Vec::new();
let mut read_threads = Vec::new();
for _ in 0..num_sockets {
let read = bind_to(port, false).unwrap();
read.set_read_timeout(Some(Duration::new(1, 0))).unwrap();
addr = read.local_addr().unwrap();
port = addr.port();
let (s_reader, r_reader) = channel();
read_channels.push(r_reader);
read_threads.push(receiver(
Arc::new(read),
exit.clone(),
pack_recycler.clone(),
s_reader,
));
}
let t_producer1 = producer(&addr, &pack_recycler, exit.clone());
let t_producer2 = producer(&addr, &pack_recycler, exit.clone());
let t_producer3 = producer(&addr, &pack_recycler, exit.clone());
let rvs = Arc::new(AtomicUsize::new(0));
let sink_threads: Vec<_> = read_channels
.into_iter()
.map(|r_reader| sink(pack_recycler.clone(), exit.clone(), rvs.clone(), r_reader))
.collect();
let start = SystemTime::now();
let start_val = rvs.load(Ordering::Relaxed);
sleep(Duration::new(5, 0));
let elapsed = start.elapsed().unwrap();
let end_val = rvs.load(Ordering::Relaxed);
let time = elapsed.as_secs() * 1_000_000_000 + u64::from(elapsed.subsec_nanos()); // total elapsed time in nanoseconds
let ftime = (time as f64) / 1_000_000_000_f64; // elapsed time in seconds
let fcount = (end_val - start_val) as f64;
println!("performance: {:?}", fcount / ftime);
exit.store(true, Ordering::Relaxed);
for t_reader in read_threads {
t_reader.join()?;
}
t_producer1.join()?;
t_producer2.join()?;
t_producer3.join()?;
for t_sink in sink_threads {
t_sink.join()?;
}
Ok(())
}

751
src/bin/bench-tps.rs Normal file

@ -0,0 +1,751 @@
extern crate bincode;
#[macro_use]
extern crate clap;
extern crate influx_db_client;
extern crate rayon;
extern crate serde_json;
#[macro_use]
extern crate solana;
use clap::{App, Arg};
use influx_db_client as influxdb;
use rayon::prelude::*;
use solana::client::mk_client;
use solana::crdt::{Crdt, NodeInfo};
use solana::drone::DRONE_PORT;
use solana::hash::Hash;
use solana::logger;
use solana::metrics;
use solana::ncp::Ncp;
use solana::packet::BlobRecycler;
use solana::service::Service;
use solana::signature::{read_keypair, GenKeys, Keypair, KeypairUtil};
use solana::thin_client::{poll_gossip_for_leader, ThinClient};
use solana::timing::{duration_as_ms, duration_as_s};
use solana::transaction::Transaction;
use solana::wallet::request_airdrop;
use solana::window::default_window;
use std::collections::VecDeque;
use std::net::SocketAddr;
use std::process::exit;
use std::sync::atomic::{AtomicBool, AtomicIsize, AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};
use std::thread::sleep;
use std::thread::Builder;
use std::thread::JoinHandle;
use std::time::Duration;
use std::time::Instant;
pub struct NodeStats {
pub tps: f64, // Maximum TPS reported by this node
pub tx: u64, // Total transactions reported by this node
}
fn metrics_submit_token_balance(token_balance: i64) {
println!("Token balance: {}", token_balance);
metrics::submit(
influxdb::Point::new("bench-tps")
.add_tag("op", influxdb::Value::String("token_balance".to_string()))
.add_field("balance", influxdb::Value::Integer(token_balance as i64))
.to_owned(),
);
}
fn sample_tx_count(
exit_signal: &Arc<AtomicBool>,
maxes: &Arc<RwLock<Vec<(SocketAddr, NodeStats)>>>,
first_tx_count: u64,
v: &NodeInfo,
sample_period: u64,
) {
let mut client = mk_client(&v);
let mut now = Instant::now();
let mut initial_tx_count = client.transaction_count();
let mut max_tps = 0.0;
let mut total;
let log_prefix = format!("{:21}:", v.contact_info.tpu.to_string());
loop {
let tx_count = client.transaction_count();
assert!(
tx_count >= initial_tx_count,
"expected tx_count({}) >= initial_tx_count({})",
tx_count,
initial_tx_count
);
let duration = now.elapsed();
now = Instant::now();
let sample = tx_count - initial_tx_count;
initial_tx_count = tx_count;
let ns = duration.as_secs() * 1_000_000_000 + u64::from(duration.subsec_nanos());
let tps = (sample * 1_000_000_000) as f64 / ns as f64;
if tps > max_tps {
max_tps = tps;
}
if tx_count > first_tx_count {
total = tx_count - first_tx_count;
} else {
total = 0;
}
println!(
"{} {:9.2} TPS, Transactions: {:6}, Total transactions: {}",
log_prefix, tps, sample, total
);
sleep(Duration::new(sample_period, 0));
if exit_signal.load(Ordering::Relaxed) {
println!("{} Exiting validator thread", log_prefix);
let stats = NodeStats {
tps: max_tps,
tx: total,
};
maxes.write().unwrap().push((v.contact_info.tpu, stats));
break;
}
}
}
/// Send loopback payment of 0 tokens and confirm the network processed it
fn send_barrier_transaction(barrier_client: &mut ThinClient, last_id: &mut Hash, id: &Keypair) {
let transfer_start = Instant::now();
let mut poll_count = 0;
loop {
if poll_count > 0 && poll_count % 8 == 0 {
println!(
"polling for barrier transaction confirmation, attempt {}",
poll_count
);
}
*last_id = barrier_client.get_last_id();
let signature = barrier_client
.transfer(0, &id, id.pubkey(), last_id)
.expect("Unable to send barrier transaction");
let confirmation = barrier_client.poll_for_signature(&signature);
let duration_ms = duration_as_ms(&transfer_start.elapsed());
if confirmation.is_ok() {
println!("barrier transaction confirmed in {}ms", duration_ms);
metrics::submit(
influxdb::Point::new("bench-tps")
.add_tag(
"op",
influxdb::Value::String("send_barrier_transaction".to_string()),
).add_field("poll_count", influxdb::Value::Integer(poll_count))
.add_field("duration", influxdb::Value::Integer(duration_ms as i64))
.to_owned(),
);
// Sanity check that the client balance is still 1
let balance = barrier_client
.poll_balance_with_timeout(
&id.pubkey(),
&Duration::from_millis(100),
&Duration::from_secs(10),
).expect("Failed to get balance");
if balance != 1 {
panic!("Expected an account balance of 1 (balance: {})", balance);
}
break;
}
// Timeout after 3 minutes. When running a CPU-only leader+validator+drone+bench-tps on a dev
// machine, some batches of transactions can take upwards of 1 minute...
if duration_ms > 1000 * 60 * 3 {
println!("Error: Couldn't confirm barrier transaction!");
exit(1);
}
let new_last_id = barrier_client.get_last_id();
if new_last_id == *last_id {
if poll_count > 0 && poll_count % 8 == 0 {
println!("last_id is not advancing, still at {:?}", *last_id);
}
} else {
*last_id = new_last_id;
}
poll_count += 1;
}
}
fn generate_txs(
shared_txs: &Arc<RwLock<VecDeque<Vec<Transaction>>>>,
id: &Keypair,
keypairs: &[Keypair],
last_id: &Hash,
threads: usize,
reclaim: bool,
) {
let tx_count = keypairs.len();
println!("Signing transactions... {} (reclaim={})", tx_count, reclaim);
let signing_start = Instant::now();
let transactions: Vec<_> = keypairs
.par_iter()
.map(|keypair| {
if !reclaim {
Transaction::new(&id, keypair.pubkey(), 1, *last_id)
} else {
Transaction::new(keypair, id.pubkey(), 1, *last_id)
}
}).collect();
let duration = signing_start.elapsed();
let ns = duration.as_secs() * 1_000_000_000 + u64::from(duration.subsec_nanos());
let bsps = (tx_count) as f64 / ns as f64;
let nsps = ns as f64 / (tx_count) as f64;
println!(
"Done. {:.2} thousand signatures per second, {:.2} us per signature, {} ms total time",
bsps * 1_000_000_f64,
nsps / 1_000_f64,
duration_as_ms(&duration),
);
metrics::submit(
influxdb::Point::new("bench-tps")
.add_tag("op", influxdb::Value::String("generate_txs".to_string()))
.add_field(
"duration",
influxdb::Value::Integer(duration_as_ms(&duration) as i64),
).to_owned(),
);
let sz = transactions.len() / threads;
let chunks: Vec<_> = transactions.chunks(sz).collect();
{
let mut shared_txs_wl = shared_txs.write().unwrap();
for chunk in chunks {
shared_txs_wl.push_back(chunk.to_vec());
}
}
}
fn do_tx_transfers(
exit_signal: &Arc<AtomicBool>,
shared_txs: &Arc<RwLock<VecDeque<Vec<Transaction>>>>,
leader: &NodeInfo,
shared_tx_thread_count: &Arc<AtomicIsize>,
total_tx_sent_count: &Arc<AtomicUsize>,
) {
let client = mk_client(&leader);
loop {
let txs;
{
let mut shared_txs_wl = shared_txs.write().unwrap();
txs = shared_txs_wl.pop_front();
}
if let Some(txs0) = txs {
shared_tx_thread_count.fetch_add(1, Ordering::Relaxed);
println!(
"Transferring 1 unit {} times... to {}",
txs0.len(),
leader.contact_info.tpu
);
let tx_len = txs0.len();
let transfer_start = Instant::now();
for tx in txs0 {
client.transfer_signed(&tx).unwrap();
}
shared_tx_thread_count.fetch_add(-1, Ordering::Relaxed);
total_tx_sent_count.fetch_add(tx_len, Ordering::Relaxed);
println!(
"Tx send done. {} ms {} tps",
duration_as_ms(&transfer_start.elapsed()),
tx_len as f32 / duration_as_s(&transfer_start.elapsed()),
);
metrics::submit(
influxdb::Point::new("bench-tps")
.add_tag("op", influxdb::Value::String("do_tx_transfers".to_string()))
.add_field(
"duration",
influxdb::Value::Integer(duration_as_ms(&transfer_start.elapsed()) as i64),
).add_field("count", influxdb::Value::Integer(tx_len as i64))
.to_owned(),
);
}
if exit_signal.load(Ordering::Relaxed) {
break;
}
}
}
fn airdrop_tokens(client: &mut ThinClient, leader: &NodeInfo, id: &Keypair, tx_count: i64) {
let mut drone_addr = leader.contact_info.tpu;
drone_addr.set_port(DRONE_PORT);
let starting_balance = client.poll_get_balance(&id.pubkey()).unwrap_or(0);
metrics_submit_token_balance(starting_balance);
println!("starting balance {}", starting_balance);
if starting_balance < tx_count {
let airdrop_amount = tx_count - starting_balance;
println!(
"Airdropping {:?} tokens from {} for {}",
airdrop_amount,
drone_addr,
id.pubkey(),
);
if let Err(e) = request_airdrop(&drone_addr, &id.pubkey(), airdrop_amount as u64) {
panic!(
"Error requesting airdrop: {:?} to addr: {:?} amount: {}",
e, drone_addr, airdrop_amount
);
}
// TODO: return airdrop Result from Drone instead of polling the
// network
let mut current_balance = starting_balance;
for _ in 0..20 {
sleep(Duration::from_millis(500));
current_balance = client.poll_get_balance(&id.pubkey()).unwrap_or_else(|e| {
println!("airdrop error {}", e);
starting_balance
});
if starting_balance != current_balance {
break;
}
println!("current balance {}...", current_balance);
}
metrics_submit_token_balance(current_balance);
if current_balance - starting_balance != airdrop_amount {
println!(
"Airdrop failed! {} {} {}",
id.pubkey(),
current_balance,
starting_balance
);
exit(1);
}
}
}
fn compute_and_report_stats(
maxes: &Arc<RwLock<Vec<(SocketAddr, NodeStats)>>>,
sample_period: u64,
tx_send_elapsed: &Duration,
total_tx_send_count: usize,
) {
// Compute/report stats
let mut max_of_maxes = 0.0;
let mut max_tx_count = 0;
let mut nodes_with_zero_tps = 0;
let mut total_maxes = 0.0;
println!(" Node address | Max TPS | Total Transactions");
println!("---------------------+---------------+--------------------");
for (sock, stats) in maxes.read().unwrap().iter() {
let maybe_flag = match stats.tx {
0 => "!!!!!",
_ => "",
};
println!(
"{:20} | {:13.2} | {} {}",
(*sock).to_string(),
stats.tps,
stats.tx,
maybe_flag
);
if stats.tps == 0.0 {
nodes_with_zero_tps += 1;
}
total_maxes += stats.tps;
if stats.tps > max_of_maxes {
max_of_maxes = stats.tps;
}
if stats.tx > max_tx_count {
max_tx_count = stats.tx;
}
}
if total_maxes > 0.0 {
let num_nodes_with_tps = maxes.read().unwrap().len() - nodes_with_zero_tps;
let average_max = total_maxes / num_nodes_with_tps as f64;
println!(
"\nAverage max TPS: {:.2}, {} nodes had 0 TPS",
average_max, nodes_with_zero_tps
);
}
println!(
"\nHighest TPS: {:.2} sampling period {}s max transactions: {} clients: {} drop rate: {:.2}",
max_of_maxes,
sample_period,
max_tx_count,
maxes.read().unwrap().len(),
(total_tx_send_count as u64 - max_tx_count) as f64 / total_tx_send_count as f64,
);
println!(
"\tAverage TPS: {}",
max_tx_count as f32 / duration_as_s(tx_send_elapsed)
);
}
// First transfer 3/4 of the tokens to the dest accounts
// then ping-pong 1/4 of the tokens back to the other account
// this leaves 1/4 token buffer in each account
fn should_switch_directions(num_tokens_per_account: i64, i: i64) -> bool {
i % (num_tokens_per_account / 4) == 0 && (i >= (3 * num_tokens_per_account) / 4)
}
fn main() {
logger::setup();
metrics::set_panic_hook("bench-tps");
let matches = App::new("solana-bench-tps")
.version(crate_version!())
.arg(
Arg::with_name("network")
.short("n")
.long("network")
.value_name("HOST:PORT")
.takes_value(true)
.help("rendezvous with the network at this gossip entry point, defaults to 127.0.0.1:8001"),
)
.arg(
Arg::with_name("identity")
.short("i")
.long("identity")
.value_name("PATH")
.takes_value(true)
.required(true)
.help("file containing a client identity (keypair)"),
)
.arg(
Arg::with_name("num-nodes")
.short("N")
.long("num-nodes")
.value_name("NUM")
.takes_value(true)
.help("wait for NUM nodes to converge"),
)
.arg(
Arg::with_name("threads")
.short("t")
.long("threads")
.value_name("NUM")
.takes_value(true)
.help("number of threads"),
)
.arg(
Arg::with_name("duration")
.long("duration")
.value_name("SECS")
.takes_value(true)
.help("run benchmark for SECS seconds then exit, default is forever"),
)
.arg(
Arg::with_name("converge-only")
.long("converge-only")
.help("exit immediately after converging"),
)
.arg(
Arg::with_name("sustained")
.long("sustained")
.help("use sustained performance mode vs. peak mode. This overlaps the tx generation with transfers."),
)
.arg(
Arg::with_name("tx_count")
.long("tx_count")
.value_name("NUM")
.takes_value(true)
.help("number of transactions to send per batch")
)
.get_matches();
let network = if let Some(addr) = matches.value_of("network") {
addr.parse().unwrap_or_else(|e| {
eprintln!("failed to parse network: {}", e);
exit(1)
})
} else {
socketaddr!("127.0.0.1:8001")
};
let id =
read_keypair(matches.value_of("identity").unwrap()).expect("can't read client identity");
let threads = if let Some(t) = matches.value_of("threads") {
t.to_string().parse().expect("can't parse threads")
} else {
4usize
};
let num_nodes = if let Some(n) = matches.value_of("num-nodes") {
n.to_string().parse().expect("can't parse num-nodes")
} else {
1usize
};
let duration = if let Some(s) = matches.value_of("duration") {
Duration::new(s.to_string().parse().expect("can't parse duration"), 0)
} else {
Duration::new(std::u64::MAX, 0)
};
let tx_count = if let Some(s) = matches.value_of("tx_count") {
s.to_string().parse().expect("can't parse tx_count")
} else {
500_000
};
let sustained = matches.is_present("sustained");
println!("Looking for leader at {:?}", network);
let leader = poll_gossip_for_leader(network, None).expect("unable to find leader on network");
let exit_signal = Arc::new(AtomicBool::new(false));
let mut c_threads = vec![];
let (nodes, leader) = converge(&leader, &exit_signal, num_nodes, &mut c_threads);
if nodes.len() < num_nodes {
println!(
"Error: Insufficient nodes discovered. Expecting {} or more",
num_nodes
);
exit(1);
}
if leader.is_none() {
println!("no leader");
exit(1);
}
if matches.is_present("converge-only") {
return;
}
let leader = leader.unwrap();
println!("leader is at {} {}", leader.contact_info.rpu, leader.id);
let mut client = mk_client(&leader);
let mut barrier_client = mk_client(&leader);
let mut seed = [0u8; 32];
seed.copy_from_slice(&id.public_key_bytes()[..32]);
let mut rnd = GenKeys::new(seed);
println!("Creating {} keypairs...", tx_count / 2);
let keypairs = rnd.gen_n_keypairs(tx_count / 2);
let barrier_id = rnd.gen_n_keypairs(1).pop().unwrap();
println!("Get tokens...");
let num_tokens_per_account = 20;
// Sample the first keypair, see if it has tokens, if so then resume
// to avoid token loss
let keypair0_balance = client.poll_get_balance(&keypairs[0].pubkey()).unwrap_or(0);
if num_tokens_per_account > keypair0_balance {
airdrop_tokens(
&mut client,
&leader,
&id,
(num_tokens_per_account - keypair0_balance) * tx_count,
);
}
airdrop_tokens(&mut barrier_client, &leader, &barrier_id, 1);
println!("Get last ID...");
let mut last_id = client.get_last_id();
println!("Got last ID {:?}", last_id);
let first_tx_count = client.transaction_count();
println!("Initial transaction count {}", first_tx_count);
// Setup a thread per validator to sample every period
// collect the max transaction rate and total tx count seen
let maxes = Arc::new(RwLock::new(Vec::new()));
let sample_period = 1; // in seconds
println!("Sampling TPS every {} second...", sample_period);
let v_threads: Vec<_> = nodes
.into_iter()
.map(|v| {
let exit_signal = exit_signal.clone();
let maxes = maxes.clone();
Builder::new()
.name("solana-client-sample".to_string())
.spawn(move || {
sample_tx_count(&exit_signal, &maxes, first_tx_count, &v, sample_period);
}).unwrap()
}).collect();
let shared_txs: Arc<RwLock<VecDeque<Vec<Transaction>>>> =
Arc::new(RwLock::new(VecDeque::new()));
let shared_tx_active_thread_count = Arc::new(AtomicIsize::new(0));
let total_tx_sent_count = Arc::new(AtomicUsize::new(0));
let s_threads: Vec<_> = (0..threads)
.map(|_| {
let exit_signal = exit_signal.clone();
let shared_txs = shared_txs.clone();
let leader = leader.clone();
let shared_tx_active_thread_count = shared_tx_active_thread_count.clone();
let total_tx_sent_count = total_tx_sent_count.clone();
Builder::new()
.name("solana-client-sender".to_string())
.spawn(move || {
do_tx_transfers(
&exit_signal,
&shared_txs,
&leader,
&shared_tx_active_thread_count,
&total_tx_sent_count,
);
}).unwrap()
}).collect();
// generate and send transactions for the specified duration
let start = Instant::now();
let mut reclaim_tokens_back_to_source_account = false;
let mut i = keypair0_balance;
while start.elapsed() < duration {
let balance = client.poll_get_balance(&id.pubkey()).unwrap_or(-1);
metrics_submit_token_balance(balance);
// ping-pong between source and destination accounts for each loop iteration
// this seems to be faster than trying to determine the balance of individual
// accounts
generate_txs(
&shared_txs,
&id,
&keypairs,
&last_id,
threads,
reclaim_tokens_back_to_source_account,
);
// In sustained mode overlap the transfers with generation
// this has higher average performance but lower peak performance
// in tested environments.
if !sustained {
while shared_tx_active_thread_count.load(Ordering::Relaxed) > 0 {
sleep(Duration::from_millis(100));
}
}
// It's not feasible (would take too much time) to confirm each of the `tx_count / 2`
// transactions sent by `generate_txs()` so instead send and confirm a single transaction
// to validate the network is still functional.
send_barrier_transaction(&mut barrier_client, &mut last_id, &barrier_id);
i += 1;
if should_switch_directions(num_tokens_per_account, i) {
reclaim_tokens_back_to_source_account = !reclaim_tokens_back_to_source_account;
}
}
// Stop the sampling threads so they will collect the stats
exit_signal.store(true, Ordering::Relaxed);
println!("Waiting for validator threads...");
for t in v_threads {
if let Err(err) = t.join() {
println!(" join() failed with: {:?}", err);
}
}
// join the tx send threads
println!("Waiting for transmit threads...");
for t in s_threads {
if let Err(err) = t.join() {
println!(" join() failed with: {:?}", err);
}
}
let balance = client.poll_get_balance(&id.pubkey()).unwrap_or(-1);
metrics_submit_token_balance(balance);
compute_and_report_stats(
&maxes,
sample_period,
&start.elapsed(),
total_tx_sent_count.load(Ordering::Relaxed),
);
// join the crdt client threads
for t in c_threads {
t.join().unwrap();
}
}
fn converge(
leader: &NodeInfo,
exit_signal: &Arc<AtomicBool>,
num_nodes: usize,
threads: &mut Vec<JoinHandle<()>>,
) -> (Vec<NodeInfo>, Option<NodeInfo>) {
// let's spy on the network
let (node, gossip_socket) = Crdt::spy_node();
let mut spy_crdt = Crdt::new(node).expect("Crdt::new");
spy_crdt.insert(&leader);
spy_crdt.set_leader(leader.id);
let spy_ref = Arc::new(RwLock::new(spy_crdt));
let window = Arc::new(RwLock::new(default_window()));
let ncp = Ncp::new(
&spy_ref,
window,
BlobRecycler::default(),
None,
gossip_socket,
exit_signal.clone(),
);
let mut v: Vec<NodeInfo> = vec![];
// wait for the network to converge, 30 seconds should be plenty
for _ in 0..30 {
{
let spy_ref = spy_ref.read().unwrap();
println!("{}", spy_ref.node_info_trace());
if spy_ref.leader_data().is_some() {
v = spy_ref
.table
.values()
.filter(|x| Crdt::is_valid_address(&x.contact_info.rpu))
.cloned()
.collect();
if v.len() >= num_nodes {
println!("CONVERGED!");
break;
} else {
println!(
"{} node(s) discovered (looking for {} or more)",
v.len(),
num_nodes
);
}
}
}
sleep(Duration::new(1, 0));
}
threads.extend(ncp.thread_hdls().into_iter());
let leader = spy_ref.read().unwrap().leader_data().cloned();
(v, leader)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_switch_directions() {
assert_eq!(should_switch_directions(20, 0), false);
assert_eq!(should_switch_directions(20, 1), false);
assert_eq!(should_switch_directions(20, 14), false);
assert_eq!(should_switch_directions(20, 15), true);
assert_eq!(should_switch_directions(20, 16), false);
assert_eq!(should_switch_directions(20, 19), false);
assert_eq!(should_switch_directions(20, 20), true);
assert_eq!(should_switch_directions(20, 21), false);
assert_eq!(should_switch_directions(20, 99), false);
assert_eq!(should_switch_directions(20, 100), true);
assert_eq!(should_switch_directions(20, 101), false);
}
}
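The assertions above pin down the switching rule, but should_switch_directions() itself is outside this hunk. A minimal implementation consistent with these tests, as a sketch only (the committed body may differ):

// Sketch: switch once i reaches 3/4 of num_tokens_per_account, and at every
// quarter-count boundary after that; matches every assertion in the test above.
fn should_switch_directions(num_tokens_per_account: i64, i: i64) -> bool {
    i % (num_tokens_per_account / 4) == 0 && i >= (3 * num_tokens_per_account) / 4
}

fn main() {
    assert!(!should_switch_directions(20, 14));
    assert!(should_switch_directions(20, 15));
    assert!(should_switch_directions(20, 100));
    assert!(!should_switch_directions(20, 101));
}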


@@ -1,453 +0,0 @@
extern crate bincode;
extern crate clap;
extern crate env_logger;
extern crate rayon;
extern crate serde_json;
extern crate solana;
use bincode::serialize;
use clap::{App, Arg};
use rayon::prelude::*;
use solana::crdt::{Crdt, NodeInfo};
use solana::drone::{DroneRequest, DRONE_PORT};
use solana::fullnode::Config;
use solana::hash::Hash;
use solana::nat::{udp_public_bind, udp_random_bind};
use solana::ncp::Ncp;
use solana::service::Service;
use solana::signature::{read_keypair, GenKeys, KeyPair, KeyPairUtil};
use solana::streamer::default_window;
use solana::thin_client::ThinClient;
use solana::timing::{duration_as_ms, duration_as_s};
use solana::transaction::Transaction;
use std::error;
use std::fs::File;
use std::io::Write;
use std::net::{IpAddr, Ipv4Addr, SocketAddr, TcpStream, UdpSocket};
use std::process::exit;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};
use std::thread::sleep;
use std::thread::Builder;
use std::thread::JoinHandle;
use std::time::Duration;
use std::time::Instant;
fn sample_tx_count(
exit: &Arc<AtomicBool>,
maxes: &Arc<RwLock<Vec<(f64, u64)>>>,
first_count: u64,
v: &NodeInfo,
sample_period: u64,
) {
let mut client = mk_client(&v);
let mut now = Instant::now();
let mut initial_tx_count = client.transaction_count();
let mut max_tps = 0.0;
let mut total;
loop {
let tx_count = client.transaction_count();
let duration = now.elapsed();
now = Instant::now();
let sample = tx_count - initial_tx_count;
initial_tx_count = tx_count;
println!("{}: Transactions processed {}", v.contact_info.tpu, sample);
let ns = duration.as_secs() * 1_000_000_000 + u64::from(duration.subsec_nanos());
let tps = (sample * 1_000_000_000) as f64 / ns as f64;
if tps > max_tps {
max_tps = tps;
}
println!("{}: {:.2} tps", v.contact_info.tpu, tps);
total = tx_count - first_count;
println!(
"{}: Total Transactions processed {}",
v.contact_info.tpu, total
);
sleep(Duration::new(sample_period, 0));
if exit.load(Ordering::Relaxed) {
println!("exiting validator thread");
maxes.write().unwrap().push((max_tps, total));
break;
}
}
}
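The TPS math in the loop above, isolated into a stdlib-only sketch (not part of this diff):

use std::time::Duration;

// Transactions-per-second from a sampled count and the elapsed wall-clock
// time, using the same nanosecond arithmetic as sample_tx_count() above.
fn tps(sample: u64, elapsed: Duration) -> f64 {
    let ns = elapsed.as_secs() * 1_000_000_000 + u64::from(elapsed.subsec_nanos());
    (sample * 1_000_000_000) as f64 / ns as f64
}

fn main() {
    assert_eq!(tps(500, Duration::new(1, 0)), 500.0);
    println!("{:.2} tps", tps(1_234, Duration::from_millis(2_500))); // ~493.60
}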
fn generate_and_send_txs(
client: &mut ThinClient,
tx_clients: &[ThinClient],
id: &KeyPair,
keypairs: &[KeyPair],
leader: &NodeInfo,
txs: i64,
last_id: &mut Hash,
threads: usize,
reclaim: bool,
) {
println!("Signing transactions... {}", txs / 2,);
let signing_start = Instant::now();
let transactions: Vec<_> = if !reclaim {
keypairs
.par_iter()
.map(|keypair| Transaction::new(&id, keypair.pubkey(), 1, *last_id))
.collect()
} else {
keypairs
.par_iter()
.map(|keypair| Transaction::new(keypair, id.pubkey(), 1, *last_id))
.collect()
};
let duration = signing_start.elapsed();
let ns = duration.as_secs() * 1_000_000_000 + u64::from(duration.subsec_nanos());
let bsps = txs as f64 / ns as f64;
let nsps = ns as f64 / txs as f64;
println!(
"Done. {:.2} thousand signatures per second, {:.2} us per signature, {} ms total time",
bsps * 1_000_000_f64,
nsps / 1_000_f64,
duration_as_ms(&duration),
);
println!(
"Transfering {} transactions in {} batches",
txs / 2,
threads
);
let transfer_start = Instant::now();
let sz = transactions.len() / threads;
let chunks: Vec<_> = transactions.chunks(sz).collect();
chunks
.into_par_iter()
.zip(tx_clients)
.for_each(|(txs, client)| {
println!(
"Transferring 1 unit {} times... to {:?}",
txs.len(),
leader.contact_info.tpu
);
for tx in txs {
client.transfer_signed(tx).unwrap();
}
});
println!(
"Transfer done. {:?} ms {} tps",
duration_as_ms(&transfer_start.elapsed()),
txs as f32 / (duration_as_s(&transfer_start.elapsed()))
);
loop {
let new_id = client.get_last_id();
if *last_id != new_id {
*last_id = new_id;
break;
}
sleep(Duration::from_millis(100));
}
}
fn main() {
env_logger::init();
let mut threads = 4usize;
let mut num_nodes = 1usize;
let mut time_sec = 90;
let matches = App::new("solana-client-demo")
.arg(
Arg::with_name("leader")
.short("l")
.long("leader")
.value_name("PATH")
.takes_value(true)
.help("/path/to/leader.json"),
)
.arg(
Arg::with_name("keypair")
.short("k")
.long("keypair")
.value_name("PATH")
.takes_value(true)
.default_value("~/.config/solana/id.json")
.help("/path/to/id.json"),
)
.arg(
Arg::with_name("num_nodes")
.short("n")
.long("nodes")
.value_name("NUMBER")
.takes_value(true)
.help("number of nodes to converge to"),
)
.arg(
Arg::with_name("threads")
.short("t")
.long("threads")
.value_name("NUMBER")
.takes_value(true)
.help("number of threads"),
)
.arg(
Arg::with_name("seconds")
.short("s")
.long("sec")
.value_name("NUMBER")
.takes_value(true)
.help("send transactions for this many seconds"),
)
.get_matches();
let leader: NodeInfo;
if let Some(l) = matches.value_of("leader") {
leader = read_leader(l).node_info;
} else {
let server_addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), 8000);
leader = NodeInfo::new_leader(&server_addr);
};
let id = read_keypair(matches.value_of("keypair").unwrap()).expect("client keypair");
if let Some(t) = matches.value_of("threads") {
threads = t.to_string().parse().expect("integer");
}
if let Some(n) = matches.value_of("num_nodes") {
num_nodes = n.to_string().parse().expect("integer");
}
if let Some(s) = matches.value_of("seconds") {
time_sec = s.to_string().parse().expect("integer");
}
let mut drone_addr = leader.contact_info.tpu;
drone_addr.set_port(DRONE_PORT);
let signal = Arc::new(AtomicBool::new(false));
let mut c_threads = vec![];
let validators = converge(&leader, &signal, num_nodes, &mut c_threads);
println!("Network has {} node(s)", validators.len());
assert!(validators.len() >= num_nodes);
let mut client = mk_client(&leader);
let starting_balance = client.poll_get_balance(&id.pubkey()).unwrap();
let txs: i64 = 500_000;
if starting_balance < txs {
let airdrop_amount = txs - starting_balance;
println!("Airdropping {:?} tokens", airdrop_amount);
request_airdrop(&drone_addr, &id, airdrop_amount as u64).unwrap();
// TODO: return airdrop Result from Drone
sleep(Duration::from_millis(100));
let balance = client.poll_get_balance(&id.pubkey()).unwrap();
println!("Your balance is: {:?}", balance);
if balance < txs || (starting_balance == balance) {
println!("TPS airdrop limit reached; wait 60sec to retry");
exit(1);
}
}
println!("Get last ID...");
let mut last_id = client.get_last_id();
println!("Got last ID {:?}", last_id);
let mut seed = [0u8; 32];
seed.copy_from_slice(&id.public_key_bytes()[..32]);
let rnd = GenKeys::new(seed);
println!("Creating keypairs...");
let keypairs = rnd.gen_n_keypairs(txs / 2);
let first_count = client.transaction_count();
println!("initial count {}", first_count);
println!("Sampling tps every second...",);
// Setup a thread per validator to sample every period
// collect the max transaction rate and total tx count seen
let maxes = Arc::new(RwLock::new(Vec::new()));
let sample_period = 1; // in seconds
let v_threads: Vec<_> = validators
.into_iter()
.map(|v| {
let exit = signal.clone();
let maxes = maxes.clone();
Builder::new()
.name("solana-client-sample".to_string())
.spawn(move || {
sample_tx_count(&exit, &maxes, first_count, &v, sample_period);
})
.unwrap()
})
.collect();
let clients: Vec<_> = (0..threads).map(|_| mk_client(&leader)).collect();
// generate and send transactions for the specified duration
let time = Duration::new(time_sec / 2, 0);
let mut now = Instant::now();
while now.elapsed() < time {
generate_and_send_txs(
&mut client,
&clients,
&id,
&keypairs,
&leader,
txs,
&mut last_id,
threads,
false,
);
}
last_id = client.get_last_id();
now = Instant::now();
while now.elapsed() < time {
generate_and_send_txs(
&mut client,
&clients,
&id,
&keypairs,
&leader,
txs,
&mut last_id,
threads,
true,
);
}
// Stop the sampling threads so they will collect the stats
signal.store(true, Ordering::Relaxed);
for t in v_threads {
t.join().unwrap();
}
// Compute/report stats
let mut max_of_maxes = 0.0;
let mut total_txs = 0;
for (max, txs) in maxes.read().unwrap().iter() {
if *max > max_of_maxes {
max_of_maxes = *max;
}
total_txs += *txs;
}
println!(
"\nHighest TPS: {:.2} sampling period {}s total transactions: {} clients: {}",
max_of_maxes,
sample_period,
total_txs,
maxes.read().unwrap().len()
);
// join the crdt client threads
for t in c_threads {
t.join().unwrap();
}
}
fn mk_client(r: &NodeInfo) -> ThinClient {
let requests_socket = udp_random_bind(8000, 10000, 5).unwrap();
let transactions_socket = udp_random_bind(8000, 10000, 5).unwrap();
requests_socket
.set_read_timeout(Some(Duration::new(1, 0)))
.unwrap();
ThinClient::new(
r.contact_info.rpu,
requests_socket,
r.contact_info.tpu,
transactions_socket,
)
}
fn spy_node() -> (NodeInfo, UdpSocket) {
let gossip_socket_pair = udp_public_bind("gossip", 8000, 10000);
let pubkey = KeyPair::new().pubkey();
let daddr = "0.0.0.0:0".parse().unwrap();
assert!(!gossip_socket_pair.addr.ip().is_unspecified());
assert!(!gossip_socket_pair.addr.ip().is_multicast());
let node = NodeInfo::new(
pubkey,
//gossip.local_addr().unwrap(),
gossip_socket_pair.addr,
daddr,
daddr,
daddr,
daddr,
);
(node, gossip_socket_pair.receiver)
}
fn converge(
leader: &NodeInfo,
exit: &Arc<AtomicBool>,
num_nodes: usize,
threads: &mut Vec<JoinHandle<()>>,
) -> Vec<NodeInfo> {
// let's spy on the network
let daddr = "0.0.0.0:0".parse().unwrap();
let (spy, spy_gossip) = spy_node();
let mut spy_crdt = Crdt::new(spy).expect("Crdt::new");
spy_crdt.insert(&leader);
spy_crdt.set_leader(leader.id);
let spy_ref = Arc::new(RwLock::new(spy_crdt));
let window = default_window();
let gossip_send_socket = udp_random_bind(8000, 10000, 5).unwrap();
let ncp = Ncp::new(
&spy_ref,
window.clone(),
spy_gossip,
gossip_send_socket,
exit.clone(),
).expect("DataReplicator::new");
let mut rv = vec![];
// wait for the network to converge, 30 seconds should be plenty
for _ in 0..30 {
let v: Vec<NodeInfo> = spy_ref
.read()
.unwrap()
.table
.values()
.into_iter()
.filter(|x| x.contact_info.rpu != daddr)
.cloned()
.collect();
if v.len() >= num_nodes {
println!("CONVERGED!");
rv.extend(v.into_iter());
break;
} else {
println!(
"{} node(s) discovered (looking for {} or more)",
v.len(),
num_nodes
);
}
sleep(Duration::new(1, 0));
}
threads.extend(ncp.thread_hdls().into_iter());
rv
}
fn read_leader(path: &str) -> Config {
let file = File::open(path).unwrap_or_else(|_| panic!("file not found: {}", path));
serde_json::from_reader(file).unwrap_or_else(|_| panic!("failed to parse {}", path))
}
fn request_airdrop(
drone_addr: &SocketAddr,
id: &KeyPair,
tokens: u64,
) -> Result<(), Box<error::Error>> {
let mut stream = TcpStream::connect(drone_addr)?;
let req = DroneRequest::GetAirdrop {
airdrop_request_amount: tokens,
client_public_key: id.pubkey(),
};
let tx = serialize(&req).expect("serialize drone request");
stream.write_all(&tx).unwrap();
// TODO: add timeout to this function, in case of unresponsive drone
Ok(())
}
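The demo above seeds GenKeys with the first 32 bytes of the client's public key, so the same id regenerates the same destination keypairs on every run. A small sketch of that determinism, assuming only the GenKeys API as used in this file:

extern crate solana;

use solana::signature::{GenKeys, KeyPairUtil};

fn main() {
    let seed = [42u8; 32];
    // Two generators built from the same seed yield identical keypairs.
    let a = GenKeys::new(seed).gen_n_keypairs(2);
    let b = GenKeys::new(seed).gen_n_keypairs(2);
    assert_eq!(a[0].pubkey(), b[0].pubkey());
    assert_eq!(a[1].pubkey(), b[1].pubkey());
}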


@@ -1,97 +1,106 @@
extern crate bincode;
extern crate bytes;
#[macro_use]
extern crate clap;
extern crate log;
extern crate serde_json;
extern crate solana;
extern crate tokio;
extern crate tokio_codec;

use bincode::{deserialize, serialize};
use bytes::Bytes;
use clap::{App, Arg};
use solana::drone::{Drone, DroneRequest, DRONE_PORT};
use solana::logger;
use solana::metrics::set_panic_hook;
use solana::signature::read_keypair;
use std::error;
use std::io;
use std::net::{Ipv4Addr, SocketAddr};
use std::process::exit;
use std::sync::{Arc, Mutex};
use std::thread;
use tokio::net::TcpListener;
use tokio::prelude::*;
use tokio_codec::{BytesCodec, Decoder};

macro_rules! socketaddr {
    ($ip:expr, $port:expr) => {
        SocketAddr::from((Ipv4Addr::from($ip), $port))
    };
    ($str:expr) => {{
        let a: SocketAddr = $str.parse().unwrap();
        a
    }};
}

fn main() -> Result<(), Box<error::Error>> {
    logger::setup();
    set_panic_hook("drone");
    let matches = App::new("drone")
        .version(crate_version!())
        .arg(
            Arg::with_name("network")
                .short("n")
                .long("network")
                .value_name("HOST:PORT")
                .takes_value(true)
                .required(true)
                .help("rendezvous with the network at this gossip entry point"),
        ).arg(
            Arg::with_name("keypair")
                .short("k")
                .long("keypair")
                .value_name("PATH")
                .takes_value(true)
                .required(true)
                .help("File to read the client's keypair from"),
        ).arg(
            Arg::with_name("slice")
                .long("slice")
                .value_name("SECONDS")
                .takes_value(true)
                .help("Time slice over which to limit requests to drone"),
        ).arg(
            Arg::with_name("cap")
                .long("cap")
                .value_name("NUMBER")
                .takes_value(true)
                .help("Request limit for time slice"),
        ).get_matches();
    let network = matches
        .value_of("network")
        .unwrap()
        .parse()
        .unwrap_or_else(|e| {
            eprintln!("failed to parse network: {}", e);
            exit(1)
        });
    let mint_keypair =
        read_keypair(matches.value_of("keypair").unwrap()).expect("failed to read client keypair");
    let time_slice: Option<u64>;
    if let Some(secs) = matches.value_of("slice") {
        time_slice = Some(secs.to_string().parse().expect("failed to parse slice"));
    } else {
        time_slice = None;
    }
    let request_cap: Option<u64>;
    if let Some(c) = matches.value_of("cap") {
        request_cap = Some(c.to_string().parse().expect("failed to parse cap"));
    } else {
        request_cap = None;
    }
    let drone_addr = socketaddr!(0, DRONE_PORT);
    let drone = Arc::new(Mutex::new(Drone::new(
        mint_keypair,
        drone_addr,
        network,
        time_slice,
        request_cap,
    )));
@@ -112,38 +121,43 @@ fn main() {
        let drone2 = drone.clone();
        // let client_ip = socket.peer_addr().expect("drone peer_addr").ip();
        let framed = BytesCodec::new().framed(socket);
        let (writer, reader) = framed.split();
        let processor = reader.and_then(move |bytes| {
            let req: DroneRequest = deserialize(&bytes).or_else(|err| {
                Err(io::Error::new(
                    io::ErrorKind::Other,
                    format!("deserialize packet in drone: {:?}", err),
                ))
            })?;
            println!("Airdrop requested...");
            // let res = drone2.lock().unwrap().check_rate_limit(client_ip);
            let res1 = drone2.lock().unwrap().send_airdrop(req);
            match res1 {
                Ok(_) => println!("Airdrop sent!"),
                Err(_) => println!("Request limit reached for this time slice"),
            }
            let response = res1?;
            println!("Airdrop tx signature: {:?}", response);
            let response_vec = serialize(&response).or_else(|err| {
                Err(io::Error::new(
                    io::ErrorKind::Other,
                    format!("serialize signature in drone: {:?}", err),
                ))
            })?;
            let response_bytes = Bytes::from(response_vec.clone());
            Ok(response_bytes)
        });
        let server = writer
            .send_all(processor.or_else(|err| {
                Err(io::Error::new(
                    io::ErrorKind::Other,
                    format!("Drone response: {:?}", err),
                ))
            })).then(|_| Ok(()));
        tokio::spawn(server)
    });
    tokio::run(done);
    Ok(())
}
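The socketaddr! macro introduced above is plain std; a standalone sketch of its two arms (the port here is illustrative, not necessarily DRONE_PORT's real value):

use std::net::{Ipv4Addr, SocketAddr};

macro_rules! socketaddr {
    ($ip:expr, $port:expr) => {
        SocketAddr::from((Ipv4Addr::from($ip), $port))
    };
    ($str:expr) => {{
        let a: SocketAddr = $str.parse().unwrap();
        a
    }};
}

fn main() {
    let any = socketaddr!(0, 9900); // 0.0.0.0:9900, the form the drone binds with
    let local = socketaddr!("127.0.0.1:8000"); // parse a host:port literal
    assert!(any.ip().is_unspecified());
    assert_eq!(local.port(), 8000);
}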


@@ -1,58 +1,50 @@
#[macro_use]
extern crate clap;
extern crate dirs;
extern crate serde_json;
extern crate solana;

use clap::{App, Arg};
use solana::crdt::FULLNODE_PORT_RANGE;
use solana::fullnode::Config;
use solana::netutil::{get_ip_addr, get_public_ip_addr, parse_port_or_addr};
use solana::signature::read_pkcs8;
use std::io;
use std::net::SocketAddr;

fn main() {
    let matches = App::new("fullnode-config")
        .version(crate_version!())
        .arg(
            Arg::with_name("local")
                .short("l")
                .long("local")
                .takes_value(false)
                .help("detect network address from local machine configuration"),
        ).arg(
            Arg::with_name("keypair")
                .short("k")
                .long("keypair")
                .value_name("PATH")
                .takes_value(true)
                .help("/path/to/id.json"),
        ).arg(
            Arg::with_name("public")
                .short("p")
                .long("public")
                .takes_value(false)
                .help("detect public network address using public servers"),
        ).arg(
            Arg::with_name("bind")
                .short("b")
                .long("bind")
                .value_name("PORT")
                .takes_value(true)
                .help("bind to port or address"),
        ).get_matches();

    let bind_addr: SocketAddr = {
        let mut bind_addr = parse_port_or_addr(matches.value_of("bind"), FULLNODE_PORT_RANGE.0);
        if matches.is_present("local") {
            let ip = get_ip_addr().unwrap();
            bind_addr.set_ip(ip);


@@ -1,25 +1,34 @@
#[macro_use]
extern crate clap;
extern crate getopts;
#[macro_use]
extern crate log;
extern crate serde_json;
#[macro_use]
extern crate solana;

use clap::{App, Arg};
use solana::client::mk_client;
use solana::crdt::Node;
use solana::drone::DRONE_PORT;
use solana::fullnode::{Config, Fullnode};
use solana::logger;
use solana::metrics::set_panic_hook;
use solana::service::Service;
use solana::signature::{Keypair, KeypairUtil};
use solana::thin_client::poll_gossip_for_leader;
use solana::wallet::request_airdrop;
use std::fs::File;
use std::net::{Ipv4Addr, SocketAddr};
use std::process::exit;
use std::thread::sleep;
use std::time::Duration;

fn main() -> () {
    logger::setup();
    set_panic_hook("fullnode");
    let matches = App::new("fullnode")
        .version(crate_version!())
        .arg(
            Arg::with_name("identity")
                .short("i")
@@ -27,35 +36,29 @@ fn main() -> () {
                .value_name("FILE")
                .takes_value(true)
                .help("run with the identity found in FILE"),
        ).arg(
            Arg::with_name("network")
                .short("n")
                .long("network")
                .value_name("HOST:PORT")
                .takes_value(true)
                .help("connect/rendezvous with the network at this gossip entry point"),
        ).arg(
            Arg::with_name("ledger")
                .short("l")
                .long("ledger")
                .value_name("DIR")
                .takes_value(true)
                .required(true)
                .help("use DIR as persistent ledger location"),
        ).get_matches();

    let (keypair, ncp) = if let Some(i) = matches.value_of("identity") {
        let path = i.to_string();
        if let Ok(file) = File::open(path.clone()) {
            let parse: serde_json::Result<Config> = serde_json::from_reader(file);
            if let Ok(data) = parse {
                (data.keypair(), data.node_info.contact_info.ncp)
            } else {
                eprintln!("failed to parse {}", path);
                exit(1);
@@ -64,23 +67,62 @@ fn main() -> () {
            eprintln!("failed to read {}", path);
            exit(1);
        }
    } else {
        (Keypair::new(), socketaddr!(0, 8000))
    };

    let ledger_path = matches.value_of("ledger").unwrap();

    // socketaddr that is initial pointer into the network's gossip (ncp)
    let network = matches
        .value_of("network")
        .map(|network| network.parse().expect("failed to parse network address"));

    let node = Node::new_with_external_ip(keypair.pubkey(), &ncp);

    // save off some stuff for airdrop
    let node_info = node.info.clone();
    let pubkey = keypair.pubkey();

    let fullnode = Fullnode::new(node, ledger_path, keypair, network, false);

    // airdrop stuff, probably goes away at some point
    let leader = match network {
        Some(network) => {
            poll_gossip_for_leader(network, None).expect("can't find leader on network")
        }
        None => node_info,
    };

    let mut client = mk_client(&leader);

    // TODO: maybe have the drone put itself in gossip somewhere instead of hardcoding?
    let drone_addr = match network {
        Some(network) => SocketAddr::new(network.ip(), DRONE_PORT),
        None => SocketAddr::new(ncp.ip(), DRONE_PORT),
    };

    loop {
        let balance = client.poll_get_balance(&pubkey).unwrap_or(0);
        info!("balance is {}", balance);
        if balance >= 50 {
            info!("good to go!");
            break;
        }
        info!("requesting airdrop from {}", drone_addr);
        loop {
            if request_airdrop(&drone_addr, &pubkey, 50).is_ok() {
                break;
            }
            info!(
                "airdrop request, is the drone address correct {:?}, drone running?",
                drone_addr
            );
            sleep(Duration::from_secs(2));
        }
    }

    fullnode.join().expect("to never happen");
}


@@ -8,14 +8,15 @@ extern crate solana;
use atty::{is, Stream};
use clap::{App, Arg};
use solana::ledger::LedgerWriter;
use solana::mint::Mint;
use std::error;
use std::io::{stdin, Read};
use std::process::exit;

fn main() -> Result<(), Box<error::Error>> {
    let matches = App::new("solana-genesis")
        .version(crate_version!())
        .arg(
            Arg::with_name("tokens")
                .short("t")
@@ -24,10 +25,18 @@ fn main() -> Result<(), Box<error::Error>> {
                .takes_value(true)
                .required(true)
                .help("Number of tokens with which to initialize mint"),
        ).arg(
            Arg::with_name("ledger")
                .short("l")
                .long("ledger")
                .value_name("DIR")
                .takes_value(true)
                .required(true)
                .help("use DIR as persistent ledger location"),
        ).get_matches();

    let tokens = value_t_or_exit!(matches, "tokens", i64);
    let ledger_path = matches.value_of("ledger").unwrap();

    if is(Stream::Stdin) {
        eprintln!("nothing found on stdin, expected a json file");
@@ -44,7 +53,8 @@ fn main() -> Result<(), Box<error::Error>> {
    let pkcs8: Vec<u8> = serde_json::from_str(&buffer)?;
    let mint = Mint::new_with_pkcs8(tokens, pkcs8);

    let mut ledger_writer = LedgerWriter::open(&ledger_path, true)?;
    ledger_writer.write_entries(mint.create_entries())?;

    Ok(())
}


@@ -1,3 +1,4 @@
#[macro_use]
extern crate clap;
extern crate dirs;
extern crate ring;
@@ -13,15 +14,16 @@ use std::path::Path;
fn main() -> Result<(), Box<error::Error>> {
    let matches = App::new("solana-keygen")
        .version(crate_version!())
        .arg(
            Arg::with_name("outfile")
                .short("o")
                .long("outfile")
                .value_name("PATH")
                .takes_value(true)
                .required(true)
                .help("path to generated file"),
        ).get_matches();

    let rnd = SystemRandom::new();
    let pkcs8_bytes = Ed25519KeyPair::generate_pkcs8(&rnd)?;

src/bin/ledger-tool.rs Normal file

@@ -0,0 +1,137 @@
#[macro_use]
extern crate clap;
extern crate serde_json;
extern crate solana;
use clap::{App, Arg, SubCommand};
use solana::bank::Bank;
use solana::ledger::{read_ledger, verify_ledger};
use solana::logger;
use std::io::{stdout, Write};
use std::process::exit;
fn main() {
logger::setup();
let matches = App::new("ledger-tool")
.version(crate_version!())
.arg(
Arg::with_name("ledger")
.short("l")
.long("ledger")
.value_name("DIR")
.takes_value(true)
.required(true)
.help("use DIR for ledger location"),
)
.arg(
Arg::with_name("head")
.short("n")
.long("head")
.value_name("NUM")
.takes_value(true)
.help("at most the first NUM entries in ledger\n (only applies to verify, print, json commands)"),
)
.arg(
Arg::with_name("precheck")
.short("p")
.long("precheck")
.help("use ledger_verify() to check internal ledger consistency before proceeding"),
)
.subcommand(SubCommand::with_name("print").about("Print the ledger"))
.subcommand(SubCommand::with_name("json").about("Print the ledger in JSON format"))
.subcommand(SubCommand::with_name("verify").about("Verify the ledger's PoH"))
.get_matches();
let ledger_path = matches.value_of("ledger").unwrap();
if matches.is_present("precheck") {
if let Err(e) = verify_ledger(&ledger_path) {
eprintln!("ledger precheck failed, error: {:?} ", e);
exit(1);
}
}
let entries = match read_ledger(ledger_path, true) {
Ok(entries) => entries,
Err(err) => {
eprintln!("Failed to open ledger at {}: {}", ledger_path, err);
exit(1);
}
};
let head = match matches.value_of("head") {
Some(head) => head.parse().expect("please pass a number for --head"),
None => <usize>::max_value(),
};
match matches.subcommand() {
("print", _) => {
let entries = match read_ledger(ledger_path, true) {
Ok(entries) => entries,
Err(err) => {
eprintln!("Failed to open ledger at {}: {}", ledger_path, err);
exit(1);
}
};
for (i, entry) in entries.enumerate() {
if i >= head {
break;
}
let entry = entry.unwrap();
println!("{:?}", entry);
}
}
("json", _) => {
stdout().write_all(b"{\"ledger\":[\n").expect("open array");
for (i, entry) in entries.enumerate() {
if i >= head {
break;
}
let entry = entry.unwrap();
serde_json::to_writer(stdout(), &entry).expect("serialize");
stdout().write_all(b",\n").expect("newline");
}
stdout().write_all(b"\n]}\n").expect("close array");
}
("verify", _) => {
if head < 2 {
eprintln!("verify requires at least 2 entries to run");
exit(1);
}
let bank = Bank::default();
{
let genesis = match read_ledger(ledger_path, true) {
Ok(entries) => entries,
Err(err) => {
eprintln!("Failed to open ledger at {}: {}", ledger_path, err);
exit(1);
}
};
let genesis = genesis.take(2).map(|e| e.unwrap());
if let Err(e) = bank.process_ledger(genesis) {
eprintln!("verify failed at genesis err: {:?}", e);
exit(1);
}
}
let entries = entries.map(|e| e.unwrap());
let head = head - 2;
for (i, entry) in entries.skip(2).enumerate() {
if i >= head {
break;
}
if let Err(e) = bank.process_entry(entry) {
eprintln!("verify failed at entry[{}], err: {:?}", i + 2, e);
exit(1);
}
}
}
("", _) => {
eprintln!("{}", matches.usage());
exit(1);
}
_ => unreachable!(),
};
}
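The per-subcommand "if i >= head { break; }" checks above could equivalently use an iterator adapter; a stdlib-only sketch of the equivalence (illustrative, not a proposed change):

fn main() {
    let entries = vec!["e0", "e1", "e2", "e3"];
    let head = 2;
    // Same effect as enumerate() plus an explicit break once i >= head.
    for (i, entry) in entries.iter().enumerate().take(head) {
        println!("{}: {}", i, entry);
    }
}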


@@ -1,25 +1,26 @@
extern crate atty;
extern crate bincode;
extern crate bs58;
#[macro_use]
extern crate clap;
extern crate dirs;
extern crate serde_json;
#[macro_use]
extern crate solana;

use clap::{App, Arg, SubCommand};
use solana::client::mk_client;
use solana::crdt::NodeInfo;
use solana::drone::DRONE_PORT;
use solana::fullnode::Config;
use solana::logger;
use solana::signature::{read_keypair, Keypair, KeypairUtil, Pubkey, Signature};
use solana::thin_client::{poll_gossip_for_leader, ThinClient};
use solana::wallet::request_airdrop;
use std::error;
use std::fmt;
use std::fs::File;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::thread::sleep;
use std::time::Duration;
@@ -27,7 +28,7 @@ enum WalletCommand {
    Address,
    Balance,
    AirDrop(i64),
    Pay(i64, Pubkey),
    Confirm(Signature),
}
@@ -56,17 +57,17 @@ impl error::Error for WalletError {
struct WalletConfig {
    leader: NodeInfo,
    id: Keypair,
    drone_addr: SocketAddr,
    command: WalletCommand,
}

impl Default for WalletConfig {
    fn default() -> WalletConfig {
        let default_addr = socketaddr!(0, 8000);
        WalletConfig {
            leader: NodeInfo::new_with_socketaddr(&default_addr),
            id: Keypair::new(),
            drone_addr: default_addr,
            command: WalletCommand::Balance,
        }
@@ -75,6 +76,7 @@ impl Default for WalletConfig {
fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
    let matches = App::new("solana-wallet")
        .version(crate_version!())
        .arg(
            Arg::with_name("leader")
                .short("l")
@@ -82,50 +84,48 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
                .value_name("PATH")
                .takes_value(true)
                .help("/path/to/leader.json"),
        ).arg(
            Arg::with_name("keypair")
                .short("k")
                .long("keypair")
                .value_name("PATH")
                .takes_value(true)
                .help("/path/to/id.json"),
        ).arg(
            Arg::with_name("timeout")
                .long("timeout")
                .value_name("SECONDS")
                .takes_value(true)
                .help("Max SECONDS to wait to get necessary gossip from the network"),
        ).subcommand(
            SubCommand::with_name("airdrop")
                .about("Request a batch of tokens")
                .arg(
                    Arg::with_name("tokens")
                        .long("tokens")
                        .value_name("NUMBER")
                        .takes_value(true)
                        .required(true)
                        .help("The number of tokens to request"),
                ),
        ).subcommand(
            SubCommand::with_name("pay")
                .about("Send a payment")
                .arg(
                    Arg::with_name("tokens")
                        .long("tokens")
                        .value_name("NUMBER")
                        .takes_value(true)
                        .required(true)
                        .help("The number of tokens to send"),
                ).arg(
                    Arg::with_name("to")
                        .long("to")
                        .value_name("PUBKEY")
                        .takes_value(true)
                        .help("The pubkey of recipient"),
                ),
        ).subcommand(
            SubCommand::with_name("confirm")
                .about("Confirm your payment by signature")
                .arg(
@@ -135,8 +135,7 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
                        .required(true)
                        .help("The transaction signature to confirm"),
                ),
        ).subcommand(SubCommand::with_name("balance").about("Get your balance"))
        .subcommand(SubCommand::with_name("address").about("Get your public key"))
        .get_matches();
@@ -145,8 +144,14 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
        leader = read_leader(l)?.node_info;
    } else {
        let server_addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), 8000);
        leader = NodeInfo::new_with_socketaddr(&server_addr);
    };
    let timeout: Option<u64>;
    if let Some(secs) = matches.value_of("timeout") {
        timeout = Some(secs.to_string().parse().expect("integer"));
    } else {
        timeout = None;
    }

    let mut path = dirs::home_dir().expect("home directory");
    let id_path = if matches.is_present("keypair") {
@@ -156,13 +161,14 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
        path.to_str().unwrap()
    };
    let id = read_keypair(id_path).or_else(|err| {
        Err(WalletError::BadParameter(format!(
            "{}: Unable to open keypair file: {}",
            err, id_path
        )))
    })?;

    let leader = poll_gossip_for_leader(leader.contact_info.ncp, timeout)?;

    let mut drone_addr = leader.contact_info.tpu;
    drone_addr.set_port(DRONE_PORT);
@@ -177,11 +183,11 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
                    .into_vec()
                    .expect("base58-encoded public key");
                if pubkey_vec.len() != std::mem::size_of::<Pubkey>() {
                    eprintln!("{}", pay_matches.usage());
                    Err(WalletError::BadParameter("Invalid public key".to_string()))?;
                }
                Pubkey::new(&pubkey_vec)
            } else {
                id.pubkey()
            };
@@ -191,22 +197,22 @@ fn parse_args() -> Result<WalletConfig, Box<error::Error>> {
            Ok(WalletCommand::Pay(tokens, to))
        }
        ("confirm", Some(confirm_matches)) => {
            let signatures = bs58::decode(confirm_matches.value_of("signature").unwrap())
                .into_vec()
                .expect("base58-encoded signature");
            if signatures.len() == std::mem::size_of::<Signature>() {
                let signature = Signature::new(&signatures);
                Ok(WalletCommand::Confirm(signature))
            } else {
                eprintln!("{}", confirm_matches.usage());
                Err(WalletError::BadParameter("Invalid signature".to_string()))
            }
        }
        ("balance", Some(_balance_matches)) => Ok(WalletCommand::Balance),
        ("address", Some(_address_matches)) => Ok(WalletCommand::Address),
        ("", None) => {
            println!("{}", matches.usage());
            Err(WalletError::CommandNotRecognized(
                "no subcommand given".to_string(),
            ))
@@ -229,7 +235,7 @@ fn process_command(
    match config.command {
        // Check client balance
        WalletCommand::Address => {
            println!("{}", config.id.pubkey());
        }
        WalletCommand::Balance => {
            println!("Balance requested...");
@@ -253,15 +259,18 @@ fn process_command(
                "Requesting airdrop of {:?} tokens from {}",
                tokens, config.drone_addr
            );
            let previous_balance = client.poll_get_balance(&config.id.pubkey()).unwrap_or(0);
            request_airdrop(&config.drone_addr, &config.id.pubkey(), tokens as u64)?;
            // TODO: return airdrop Result from Drone instead of polling the
            // network
            let mut current_balance = previous_balance;
            for _ in 0..20 {
                sleep(Duration::from_millis(500));
                current_balance = client
                    .poll_get_balance(&config.id.pubkey())
                    .unwrap_or(previous_balance);
                if previous_balance != current_balance {
                    break;
                }
@@ -275,12 +284,12 @@ fn process_command(
        // If client has positive balance, spend tokens in {balance} number of transactions
        WalletCommand::Pay(tokens, to) => {
            let last_id = client.get_last_id();
            let signature = client.transfer(tokens, &config.id, to, &last_id)?;
            println!("{}", signature);
        }
        // Confirm the last client transaction by signature
        WalletCommand::Confirm(signature) => {
            if client.check_signature(&signature) {
                println!("Confirmed");
            } else {
                println!("Not found");
@@ -290,17 +299,6 @@ fn process_command(
    Ok(())
}

fn read_leader(path: &str) -> Result<Config, WalletError> {
    let file = File::open(path.to_string()).or_else(|err| {
        Err(WalletError::BadParameter(format!(
@@ -317,40 +315,9 @@ fn read_leader(path: &str) -> Result<Config, WalletError> {
    })
}

fn main() -> Result<(), Box<error::Error>> {
    logger::setup();
    let config = parse_args()?;
    let mut client = mk_client(&config.leader);
    process_command(&config, &mut client)
}
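The timeout flag above uses the declare-then-assign if-let pattern; Option::map expresses the same parse more compactly, as in this stdlib-only sketch (illustrative only):

// Equivalent to the timeout handling in parse_args() above.
fn parse_timeout(flag: Option<&str>) -> Option<u64> {
    flag.map(|secs| secs.parse().expect("integer"))
}

fn main() {
    assert_eq!(parse_timeout(Some("30")), Some(30));
    assert_eq!(parse_timeout(None), None);
}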


@@ -16,31 +16,25 @@ pub struct BlobFetchStage {
impl BlobFetchStage {
    pub fn new(
        socket: Arc<UdpSocket>,
        exit: Arc<AtomicBool>,
        recycler: &BlobRecycler,
    ) -> (Self, BlobReceiver) {
        Self::new_multi_socket(vec![socket], exit, recycler)
    }
    pub fn new_multi_socket(
        sockets: Vec<Arc<UdpSocket>>,
        exit: Arc<AtomicBool>,
        recycler: &BlobRecycler,
    ) -> (Self, BlobReceiver) {
        let (sender, receiver) = channel();
        let thread_hdls: Vec<_> = sockets
            .into_iter()
            .map(|socket| {
                streamer::blob_receiver(socket, exit.clone(), recycler.clone(), sender.clone())
            }).collect();

        (BlobFetchStage { exit, thread_hdls }, receiver)
    }

    pub fn close(&self) {

src/broadcast_stage.rs Normal file

@@ -0,0 +1,225 @@
//! The `broadcast_stage` broadcasts data from a leader node to validators
//!
use counter::Counter;
use crdt::{Crdt, CrdtError, NodeInfo};
use entry::Entry;
#[cfg(feature = "erasure")]
use erasure;
use ledger::Block;
use log::Level;
use packet::{BlobRecycler, SharedBlobs};
use rayon::prelude::*;
use result::{Error, Result};
use service::Service;
use std::mem;
use std::net::UdpSocket;
use std::sync::atomic::AtomicUsize;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::sync::{Arc, RwLock};
use std::thread::{self, Builder, JoinHandle};
use std::time::{Duration, Instant};
use timing::duration_as_ms;
use window::{self, SharedWindow, WindowIndex, WindowUtil, WINDOW_SIZE};
fn broadcast(
node_info: &NodeInfo,
broadcast_table: &[NodeInfo],
window: &SharedWindow,
recycler: &BlobRecycler,
receiver: &Receiver<Vec<Entry>>,
sock: &UdpSocket,
transmit_index: &mut WindowIndex,
receive_index: &mut u64,
) -> Result<()> {
let id = node_info.id;
let timer = Duration::new(1, 0);
let entries = receiver.recv_timeout(timer)?;
let mut num_entries = entries.len();
let mut ventries = Vec::new();
ventries.push(entries);
while let Ok(entries) = receiver.try_recv() {
num_entries += entries.len();
ventries.push(entries);
}
let to_blobs_start = Instant::now();
let dq: SharedBlobs = ventries
.into_par_iter()
.flat_map(|p| p.to_blobs(recycler))
.collect();
let to_blobs_elapsed = duration_as_ms(&to_blobs_start.elapsed());
// flatten deque to vec
let blobs_vec: Vec<_> = dq.into_iter().collect();
let blobs_chunking = Instant::now();
// We could receive more blobs than window slots so
// break them up into window-sized chunks to process
let blobs_chunked = blobs_vec.chunks(WINDOW_SIZE as usize).map(|x| x.to_vec());
let chunking_elapsed = duration_as_ms(&blobs_chunking.elapsed());
trace!("{}", window.read().unwrap().print(&id, *receive_index));
let broadcast_start = Instant::now();
for mut blobs in blobs_chunked {
let blobs_len = blobs.len();
trace!("{}: broadcast blobs.len: {}", id, blobs_len);
// Index the blobs
window::index_blobs(node_info, &blobs, receive_index)
.expect("index blobs for initial window");
// keep the cache of blobs that are broadcast
inc_new_counter_info!("streamer-broadcast-sent", blobs.len());
{
let mut win = window.write().unwrap();
assert!(blobs.len() <= win.len());
for b in &blobs {
let ix = b.read().unwrap().get_index().expect("blob index");
let pos = (ix % WINDOW_SIZE) as usize;
if let Some(x) = mem::replace(&mut win[pos].data, None) {
trace!(
"{} popped {} at {}",
id,
x.read().unwrap().get_index().unwrap(),
pos
);
recycler.recycle(x, "broadcast-data");
}
if let Some(x) = mem::replace(&mut win[pos].coding, None) {
trace!(
"{} popped {} at {}",
id,
x.read().unwrap().get_index().unwrap(),
pos
);
recycler.recycle(x, "broadcast-coding");
}
trace!("{} null {}", id, pos);
}
while let Some(b) = blobs.pop() {
let ix = b.read().unwrap().get_index().expect("blob index");
let pos = (ix % WINDOW_SIZE) as usize;
trace!("{} caching {} at {}", id, ix, pos);
assert!(win[pos].data.is_none());
win[pos].data = Some(b);
}
}
// Fill in the coding blob data from the window data blobs
#[cfg(feature = "erasure")]
{
erasure::generate_coding(
&id,
&mut window.write().unwrap(),
recycler,
*receive_index,
blobs_len,
&mut transmit_index.coding,
)?;
}
*receive_index += blobs_len as u64;
// Send blobs out from the window
Crdt::broadcast(
&node_info,
&broadcast_table,
&window,
&sock,
transmit_index,
*receive_index,
)?;
}
let broadcast_elapsed = duration_as_ms(&broadcast_start.elapsed());
info!(
"broadcast: {} entries, blob time {} chunking time {} broadcast time {}",
num_entries, to_blobs_elapsed, chunking_elapsed, broadcast_elapsed
);
Ok(())
}
pub struct BroadcastStage {
thread_hdl: JoinHandle<()>,
}
impl BroadcastStage {
fn run(
sock: &UdpSocket,
crdt: &Arc<RwLock<Crdt>>,
window: &SharedWindow,
entry_height: u64,
recycler: &BlobRecycler,
receiver: &Receiver<Vec<Entry>>,
) {
let mut transmit_index = WindowIndex {
data: entry_height,
coding: entry_height,
};
let mut receive_index = entry_height;
let me = crdt.read().unwrap().my_data().clone();
loop {
let broadcast_table = crdt.read().unwrap().compute_broadcast_table();
if let Err(e) = broadcast(
&me,
&broadcast_table,
&window,
&recycler,
&receiver,
&sock,
&mut transmit_index,
&mut receive_index,
) {
match e {
Error::RecvTimeoutError(RecvTimeoutError::Disconnected) => break,
Error::RecvTimeoutError(RecvTimeoutError::Timeout) => (),
Error::CrdtError(CrdtError::NoPeers) => (), // TODO: Why are the unit-tests throwing hundreds of these?
_ => {
inc_new_counter_info!("streamer-broadcaster-error", 1, 1);
error!("broadcaster error: {:?}", e);
}
}
}
}
}
/// Service to broadcast messages from the leader to layer 1 nodes.
/// See `crdt` for network layer definitions.
/// # Arguments
/// * `sock` - Socket to send from.
/// * `exit` - Boolean to signal system exit.
/// * `crdt` - CRDT structure
/// * `window` - Cache of blobs that we have broadcast
/// * `recycler` - Blob recycler.
/// * `receiver` - Receive channel for blobs to be retransmitted to all the layer 1 nodes.
pub fn new(
sock: UdpSocket,
crdt: Arc<RwLock<Crdt>>,
window: SharedWindow,
entry_height: u64,
recycler: BlobRecycler,
receiver: Receiver<Vec<Entry>>,
) -> Self {
let thread_hdl = Builder::new()
.name("solana-broadcaster".to_string())
.spawn(move || {
Self::run(&sock, &crdt, &window, entry_height, &recycler, &receiver);
}).unwrap();
BroadcastStage { thread_hdl }
}
}
impl Service for BroadcastStage {
fn thread_hdls(self) -> Vec<JoinHandle<()>> {
vec![self.thread_hdl]
}
fn join(self) -> thread::Result<()> {
self.thread_hdl.join()
}
}
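Blob placement in the broadcast window above is plain ring-buffer arithmetic; a standalone sketch of the index math (the real WINDOW_SIZE lives in the window module; 2048 is assumed here):

const WINDOW_SIZE: u64 = 2 * 1024; // assumed value, for illustration only

// The same ix % WINDOW_SIZE mapping used when caching broadcast blobs.
fn window_slot(blob_index: u64) -> usize {
    (blob_index % WINDOW_SIZE) as usize
}

fn main() {
    assert_eq!(window_slot(0), 0);
    assert_eq!(window_slot(2047), 2047);
    assert_eq!(window_slot(2048), 0); // wraps: a new blob replaces slot 0
}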

Some files were not shown because too many files have changed in this diff Show More