Cost model 1.7 (#20188)

* Cost Model to limit transactions which are not parallelizable (#16694)

* Add the following to banking_stage:
  1. CostModel as an immutable ref shared between threads, to provide estimated cost for transactions.
  2. CostTracker, shared between threads, which tracks transaction costs for each block.

* replace hard coded program ID with id() calls

* Add Account Access Cost as part of TransactionCost. Account access costs are weighted differently for reads and writes, and for signed and non-signed accounts.

* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.

* add test for cost_tracker's atomic try_add operation, which serves as a safety guard for future changes

* check cost against local copy of cost_tracker, return transactions that would exceed the limit as unprocessed transactions to be buffered; only apply the cost of bank-processed transactions to the tracker

* add bencher for the new banking_stage with a max cost limit, so that the cost model is hit consistently during bench iterations
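
The check-and-add flow described in this group can be sketched roughly as below. This is a minimal illustration, not the code from the commit: `CostTracker`, `would_fit`, `try_add`, `partition_by_cost`, the `u64` cost unit and the limit value are all assumed names chosen for the example. For brevity the sketch applies a cost as soon as it fits, whereas the commit applies only the cost of bank-processed transactions.

```rust
/// Hypothetical per-block cost tracker; names and limits are illustrative only.
#[derive(Debug)]
pub struct CostTracker {
    block_cost_limit: u64,
    block_cost: u64,
}

impl CostTracker {
    pub fn new(block_cost_limit: u64) -> Self {
        Self { block_cost_limit, block_cost: 0 }
    }

    /// True if adding `cost` keeps the block at or under its limit.
    pub fn would_fit(&self, cost: u64) -> bool {
        self.block_cost.saturating_add(cost) <= self.block_cost_limit
    }

    /// Check and update in one step: either the cost is added and the new
    /// block total is returned, or the tracker is left unchanged.
    pub fn try_add(&mut self, cost: u64) -> Result<u64, &'static str> {
        if !self.would_fit(cost) {
            return Err("would exceed block cost limit");
        }
        self.block_cost += cost;
        Ok(self.block_cost)
    }

    pub fn block_cost(&self) -> u64 {
        self.block_cost
    }
}

/// Splits a batch of transaction costs into the indexes that fit in the
/// current block and the indexes that should be buffered and retried later.
pub fn partition_by_cost(
    tracker: &mut CostTracker,
    costs: &[u64],
) -> (Vec<usize>, Vec<usize>) {
    let mut accepted = Vec::new();
    let mut retryable = Vec::new();
    for (index, &cost) in costs.iter().enumerate() {
        match tracker.try_add(cost) {
            Ok(_) => accepted.push(index),
            Err(_) => retryable.push(index),
        }
    }
    (accepted, retryable)
}
```

Because `try_add` takes `&mut self`, the check and the update cannot be interleaved with another writer, which is the property the try_add test in this group is meant to guard.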

* replay stage feed back program cost (#17731)

* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized as an empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security reasons; when its limit is reached, programs that are both old AND seen less often are pushed out to make room for new programs.
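
A fixed-capacity table with eviction, as described in the last item, might look like the sketch below. It is only an illustration under stated assumptions: the string program key, the logical `clock`, and the eviction rule (fewest occurrences first, ties broken by the oldest update) are hypothetical and may differ from what the commit actually implements.

```rust
use std::collections::HashMap;

/// Hypothetical entry: latest cost plus the bookkeeping used for eviction.
struct Entry {
    cost: u64,
    occurrences: u64,
    last_seen: u64,
}

/// Hypothetical bounded table of per-program execution costs.
pub struct ExecuteCostTable {
    capacity: usize,
    clock: u64, // logical time, bumped on every upsert
    entries: HashMap<String, Entry>,
}

impl ExecuteCostTable {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            clock: 0,
            entries: HashMap::with_capacity(capacity),
        }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn get_cost(&self, program: &str) -> Option<u64> {
        self.entries.get(program).map(|entry| entry.cost)
    }

    /// Updates an existing program's cost, or inserts a new program,
    /// evicting one entry first if the table is already full.
    pub fn upsert(&mut self, program: &str, cost: u64) {
        self.clock += 1;
        let now = self.clock;
        if let Some(entry) = self.entries.get_mut(program) {
            entry.cost = cost;
            entry.occurrences += 1;
            entry.last_seen = now;
            return;
        }
        if self.entries.len() >= self.capacity {
            self.evict_one();
        }
        self.entries.insert(
            program.to_string(),
            Entry { cost, occurrences: 1, last_seen: now },
        );
    }

    /// Removes the entry seen least often; ties go to the oldest entry.
    fn evict_one(&mut self) {
        let victim = self
            .entries
            .iter()
            .min_by_key(|(_, entry)| (entry.occurrences, entry.last_seen))
            .map(|(key, _)| key.clone());
        if let Some(key) = victim {
            self.entries.remove(&key);
        }
    }
}
```

Bounding the table keeps memory use fixed even if an attacker deploys and invokes a large number of distinct programs.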

* investigate system performance test degradation  (#17919)

* Add stats and counters around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table

* Change mutex on cost_tracker to RwLock

* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.

* acquire and hold locks for block of TXs, instead of acquire and release per transaction;

* remove redundant would_fit check from cost_tracker update execution path

* refactor cost checking with less frequent lock acquiring

* avoid many Transaction_cost heap allocations when calculating cost, which is in the hot path (executed per transaction).

* create hashmap with new_capacity to reduce runtime heap realloc.

* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default

* address potential deadlock by acquiring locks one at a time
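
The locking changes in this group (a RwLock instead of a mutex, and one lock acquisition per batch rather than per transaction) can be illustrated with the sketch below. `BlockCost`, `select_fitting` and `apply_costs` are hypothetical names for this example only, and the sketch assumes a poisoned lock is a fatal error.

```rust
use std::sync::RwLock;

/// Hypothetical shared state: the accumulated cost of the current block.
#[derive(Default)]
pub struct BlockCost {
    total: u64,
}

/// Read pass: takes the read lock once for the whole batch and returns the
/// indexes of the transactions that would still fit under `limit`.
pub fn select_fitting(tracker: &RwLock<BlockCost>, limit: u64, costs: &[u64]) -> Vec<usize> {
    let guard = tracker.read().unwrap();
    let mut running = guard.total;
    let mut selected = Vec::with_capacity(costs.len());
    for (index, &cost) in costs.iter().enumerate() {
        if running.saturating_add(cost) <= limit {
            running += cost;
            selected.push(index);
        }
    }
    selected
}

/// Write pass: takes the write lock once and applies every processed cost,
/// returning the new block total.
pub fn apply_costs(tracker: &RwLock<BlockCost>, costs: &[u64]) -> u64 {
    let mut guard = tracker.write().unwrap();
    for &cost in costs {
        guard.total = guard.total.saturating_add(cost);
    }
    guard.total
}
```

Each function holds exactly one lock and releases it on return, so no thread ever waits for a second lock while holding the first. Another thread may still update the tracker between the two passes, so the read pass is an estimate rather than a reservation.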

* Persist cost table to blockstore (#18123)

* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to the compaction exclusion list alongside TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Delete programs from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
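
Offloading cost updates to a dedicated thread fed by a channel could be sketched as follows. This is an assumption-laden illustration, not the commit's `cost_update_service`: `ExecuteTimings`, `CostTable` and `spawn_cost_update_service` are stand-in names, and the timings payload is simplified to a list of (program, cost) pairs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Hypothetical message type: per-program execution costs from one replay pass.
pub type ExecuteTimings = Vec<(String, u64)>;

/// Hypothetical shared cost table keyed by program name.
pub type CostTable = Arc<RwLock<HashMap<String, u64>>>;

/// Spawns a service thread that applies incoming timings to the cost table.
/// The sender side stays with the replay thread, which never touches the table.
pub fn spawn_cost_update_service(table: CostTable) -> (Sender<ExecuteTimings>, JoinHandle<()>) {
    let (sender, receiver) = channel::<ExecuteTimings>();
    let handle = thread::spawn(move || {
        // recv() returns Err once every Sender is dropped, which ends the loop.
        while let Ok(timings) = receiver.recv() {
            let mut guard = table.write().unwrap();
            for (program, cost) in timings {
                guard.insert(program, cost);
            }
        }
    });
    (sender, handle)
}
```

On the sending side, `Sender::send` returns an `Err` only when the receiver has been dropped, which is the case a caller would want to log rather than panic on.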

* log warning when channel send fails (#18391)

* Aggregate cost_model into cost_tracker (#18374)

* aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock.
* Simplified code, removed unused functions

* review fixes

* update ledger tool to restore cost table from blockstore (#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test

* manually fix merge conflicts

* Per-program id timings (#17554)

* more manual fixing

* solve a merge conflict

* featurize cost model

* more merge fix

* cost model uses compute_unit to replace microsecond as cost unit (#18934)

* Reject blocks for costs above the max block cost (#18994)

* Update block max cost limit to fix performance regression (#19276)

* replace function with const var for better readability (#19285)

* Add a few more metrics data points (#19624)

* periodically report sigverify_stage stats (#19674)

* manual merge

* cost model nits (#18528)

* Accumulate consumed units (#18714)

* tx wide compute budget (#18631)

* more manual merge

* ignore zeroize drop security

* update const cost values with data collected by #19627
* update cost calculation to closely match the proposed fee schedule #16984

* add transaction cost histogram metrics (#20350)

* rebase to 1.7.15

* add tx count and thread id to stats (#20451);
  each stat reports and resets when the slot changes

* remove cost_model feature_set

* ignore vote transactions from cost model

Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
Tao Zhu
2021-10-06 15:11:41 -05:00
committed by Trent Nelson
parent a4df784e82
commit db85d659b9
40 changed files with 3208 additions and 266 deletions


@@ -83,9 +83,9 @@ checksum = "eab1c04a571841102f5345a8fc0f6bb3d31c315dec879b5c6e42e40ce7ffa34e"
[[package]]
name = "assert_matches"
-version = "1.4.0"
+version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "695579f0f2520f3774bb40461e5adb066459d4e0af4d59d20175484fb8e9edf1"
+checksum = "9b34d609dfbaf33d6889b2b7106d3ca345eacad44200913df5ba02bfd31d2ba9"
[[package]]
name = "async-trait"
@@ -155,11 +155,10 @@ checksum = "904dfeac50f3cdaba28fc6f57fdcddb75f49ed61346676a78c4ffe55877802fd"
[[package]]
name = "bincode"
-version = "1.3.1"
+version = "1.3.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "f30d3a39baa26f9651f17b375061f3233dde33424a8b72b0dbe93a68a0bc896d"
+checksum = "b1f45e9417d87227c7a56d22e471c6206462cba514c7590c09aff4cf6d1ddcad"
dependencies = [
-"byteorder 1.3.4",
"serde",
]
@@ -272,6 +271,12 @@ version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "476e9cd489f9e121e02ffa6014a8ef220ecb15c05ed23fc34cca13925dc283fb"
+[[package]]
+name = "bs58"
+version = "0.4.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "771fe0050b883fcc3ea2359b1a96bcfbc090b7116eae7c3c512c7a083fdf23d3"
[[package]]
name = "bumpalo"
version = "3.3.0"
@@ -2597,7 +2602,7 @@ dependencies = [
"Inflector",
"base64 0.12.3",
"bincode",
-"bs58",
+"bs58 0.3.1",
"bv",
"lazy_static",
"serde",
@@ -3035,7 +3040,7 @@ version = "1.8.0"
dependencies = [
"base64 0.13.0",
"bincode",
-"bs58",
+"bs58 0.3.1",
"clap",
"indicatif",
"jsonrpc-core",
@@ -3061,6 +3066,13 @@ dependencies = [
"url",
]
+[[package]]
+name = "solana-compute-budget-program"
+version = "1.8.0"
+dependencies = [
+"solana-sdk",
+]
[[package]]
name = "solana-config-program"
version = "1.8.0"
@@ -3123,7 +3135,7 @@ version = "1.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b0b98d31e0662fedf3a1ee30919c655713874d578e19e65affe46109b1b927f9"
dependencies = [
-"bs58",
+"bs58 0.3.1",
"bv",
"generic-array 0.14.3",
"log",
@@ -3141,7 +3153,7 @@ dependencies = [
name = "solana-frozen-abi"
version = "1.8.0"
dependencies = [
-"bs58",
+"bs58 0.3.1",
"bv",
"generic-array 0.14.3",
"log",
@@ -3248,7 +3260,7 @@ dependencies = [
"blake3",
"borsh",
"borsh-derive",
-"bs58",
+"bs58 0.3.1",
"bv",
"curve25519-dalek 2.1.0",
"hex",
@@ -3281,7 +3293,7 @@ dependencies = [
"blake3",
"borsh",
"borsh-derive",
-"bs58",
+"bs58 0.3.1",
"bv",
"curve25519-dalek 2.1.0",
"hex",
@@ -3388,6 +3400,7 @@ dependencies = [
"rustc_version",
"serde",
"serde_derive",
+"solana-compute-budget-program",
"solana-config-program",
"solana-frozen-abi 1.8.0",
"solana-frozen-abi-macro 1.8.0",
@@ -3412,7 +3425,9 @@ version = "1.8.0"
dependencies = [
"assert_matches",
"bincode",
-"bs58",
+"borsh",
+"borsh-derive",
+"bs58 0.4.0",
"bv",
"byteorder 1.3.4",
"chrono",
@@ -3459,7 +3474,7 @@ version = "1.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "84710ce45a21cccd9f2b09d8e9aad529080bb2540f27b1253874b6e732b465b9"
dependencies = [
-"bs58",
+"bs58 0.3.1",
"proc-macro2 1.0.24",
"quote 1.0.6",
"rustversion",
@@ -3470,7 +3485,7 @@ dependencies = [
name = "solana-sdk-macro"
version = "1.8.0"
dependencies = [
-"bs58",
+"bs58 0.3.1",
"proc-macro2 1.0.24",
"quote 1.0.6",
"rustversion",
@@ -3511,7 +3526,7 @@ dependencies = [
"Inflector",
"base64 0.12.3",
"bincode",
-"bs58",
+"bs58 0.3.1",
"lazy_static",
"serde",
"serde_derive",


@@ -34,6 +34,7 @@ use solana_sdk::{
bpf_loader, bpf_loader_deprecated, bpf_loader_upgradeable,
client::SyncClient,
clock::MAX_PROCESSING_AGE,
+compute_budget,
entrypoint::{MAX_PERMITTED_DATA_INCREASE, SUCCESS},
instruction::{AccountMeta, CompiledInstruction, Instruction, InstructionError},
keyed_account::KeyedAccount,
@@ -1232,8 +1233,6 @@ fn test_program_bpf_call_depth() {
solana_logger::setup();
println!("Test program: solana_bpf_rust_call_depth");
let GenesisConfigInfo {
genesis_config,
mint_keypair,
@@ -1267,6 +1266,40 @@ fn test_program_bpf_call_depth() {
assert!(result.is_err());
}
+#[cfg(feature = "bpf_rust")]
+#[test]
+fn test_program_bpf_compute_budget() {
+    solana_logger::setup();
+    let GenesisConfigInfo {
+        genesis_config,
+        mint_keypair,
+        ..
+    } = create_genesis_config(50);
+    let mut bank = Bank::new(&genesis_config);
+    let (name, id, entrypoint) = solana_bpf_loader_program!();
+    bank.add_builtin(&name, id, entrypoint);
+    let bank_client = BankClient::new(bank);
+    let program_id = load_bpf_program(
+        &bank_client,
+        &bpf_loader::id(),
+        &mint_keypair,
+        "solana_bpf_rust_noop",
+    );
+    let message = Message::new(
+        &[
+            compute_budget::request_units(1),
+            Instruction::new_with_bincode(program_id, &0, vec![]),
+        ],
+        Some(&mint_keypair.pubkey()),
+    );
+    let result = bank_client.send_and_confirm_message(&[&mint_keypair], message);
+    assert_eq!(
+        result.unwrap_err().unwrap(),
+        TransactionError::InstructionError(1, InstructionError::ProgramFailedToComplete),
+    );
+}
#[test]
fn assert_instruction_count() {
solana_logger::setup();