* Cost Model to limit transactions which are not parallelizeable (#16694)
* * Add following to banking_stage:
1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
2. CostTracker which is shared between threads, tracks transaction costs for each block.
* replace hard coded program ID with id() calls
* Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed.
* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.
* add test for cost_tracker atomically try_add operation, serves as safety guard for future changes
* check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker;
* bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations
* replay stage feed back program cost (#17731)
* replay stage feeds back realtime per-program execution cost to cost model;
* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;
* changed cost unit to microsecond, using value collected from mainnet;
* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.
* investigate system performance test degradation (#17919)
* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table
* Change mutex on cost_tracker to RwLock
* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.
* acquire and hold locks for block of TXs, instead of acquire and release per transaction;
* remove redundant would_fit check from cost_tracker update execution path
* refactor cost checking with less frequent lock acquiring
* avoid many Transaction_cost heap allocation when calculate cost, which
is in the hot path - executed per transaction.
* create hashmap with new_capacity to reduce runtime heap realloc.
* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default
* address potential deadlock by acquiring locks one at time
* Persist cost table to blockstore (#18123)
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`
* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup
* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
* log warning when channel send fails (#18391)
* Aggregate cost_model into cost_tracker (#18374)
* * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions
* review fixes
* update ledger tool to restore cost table from blockstore (#18489)
* update ledger tool to restore cost model from blockstore when compute-slot-cost
* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool
* refactor and simplify a test
* manually fix merge conflicts
* Per-program id timings (#17554)
* more manual fixing
* solve a merge conflict
* featurize cost model
* more merge fix
* cost model uses compute_unit to replace microsecond as cost unit
(#18934)
* Reject blocks for costs above the max block cost (#18994)
* Update block max cost limit to fix performance regession (#19276)
* replace function with const var for better readability (#19285)
* Add few more metrics data points (#19624)
* periodically report sigverify_stage stats (#19674)
* manual merge
* cost model nits (#18528)
* Accumulate consumed units (#18714)
* tx wide compute budget (#18631)
* more manual merge
* ignore zerorize drop security
* - update const cost values with data collected by #19627
- update cost calculation to closely proposed fee schedule #16984
* add transaction cost histogram metrics (#20350)
* rebase to 1.7.15
* add tx count and thread id to stats (#20451)
each stat reports and resets when slot changes
* remove cost_model feature_set
* ignore vote transactions from cost model
Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
* removes packet-count metrics from retransmit stage
Working towards sending shreds (instead of packets) to retransmit stage
so that shreds recovered from erasure codes are as well retransmitted.
Following commit will add these metrics back to window-service, earlier
in the pipeline.
(cherry picked from commit bf437b0336)
# Conflicts:
# core/src/retransmit_stage.rs
* adds packet/shred count stats to window-service
Adding back these metrics from the earlier commit which removed them
from retransmit stage.
(cherry picked from commit 8198a7eae1)
* removes erroneous uses of Arc<...> from retransmit stage
(cherry picked from commit 6e413331b5)
# Conflicts:
# core/src/retransmit_stage.rs
# core/src/tvu.rs
* sends shreds (instead of packets) to retransmit stage
Working towards channelling through shreds recovered from erasure codes
to retransmit stage.
(cherry picked from commit 3efccbffab)
# Conflicts:
# core/src/retransmit_stage.rs
* returns completed-data-set-info from insert_data_shred
instead of opaque (u32, u32) which are then converted to
CompletedDataSetInfo at the call-site.
(cherry picked from commit 3c71670bd9)
# Conflicts:
# ledger/src/blockstore.rs
* retransmits shreds recovered from erasure codes
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster which is slower.
This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
(cherry picked from commit 7a8807b8bb)
# Conflicts:
# core/src/retransmit_stage.rs
# core/src/window_service.rs
* removes backport merge conflicts
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
* stake: Remove v2 program references (#19308)
* stake: Remove v2 program references
* Remove stake v2 feature, along with stake rewrite
(cherry picked from commit 73aa004c59)
# Conflicts:
# runtime/src/bank.rs
* Fix merge conflict
* Fix uses of old `stake` function
Co-authored-by: Jon Cinque <jon.cinque@gmail.com>
* backport new column families from master to 1.6 (#18743)
* backporting bank_hash and program_costs column families from master to 1.6 for rocksdb backward compatibility
* missed a line to allow dead code
* include code for purge
* Exclude stubbed ProgramCosts column from compaction (#18840)
Co-authored-by: Tao Zhu <82401714+taozhu-chicago@users.noreply.github.com>
This feature requires clang to support -march=native on M1. Before this change,
cargo build would output a build error when building for aarch64.
(cherry picked from commit 8a9b7f5ef2)
# Conflicts:
# ledger/Cargo.toml
* Make account shrink configurable #17544 (#17778)
1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true
Fixes#17544
(cherry picked from commit 269d995832)
# Conflicts:
# core/tests/snapshots.rs
# ledger/src/bank_forks_utils.rs
# runtime/src/accounts_db.rs
# runtime/src/snapshot_utils.rs
* fix some merge errors
Co-authored-by: Lijun Wang <83639177+lijunwangs@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
* verify bank hash on startup with ledger tool option (#17939)
(cherry picked from commit f558b9b6bf)
# Conflicts:
# core/tests/snapshots.rs
# ledger/src/bank_forks_utils.rs
# runtime/src/snapshot_utils.rs
* fix merge errors
Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>
Co-authored-by: Jeff Washington (jwash) <wash678@gmail.com>
* add metrics for startup
* roll timings up higher
* fix test
* fix duplicate
(cherry picked from commit 471b34132e)
# Conflicts:
# ledger/src/bank_forks_utils.rs
# runtime/src/snapshot_utils.rs
conflicts because #17778 is not present.
Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>
Refactor a few functions that are on the load-from-snapshot path, to facilitate
adding in incremental snapshots more easily.
Additionally, add some tests and doc comments.
(cherry picked from commit 1953543274)
Co-authored-by: Brooks Prumo <brooks@solana.com>
* Refactor stake program into solana_program (#17906)
* Move stake state / instructions into solana_program
* Update account-decoder
* Update cli and runtime
* Update all other parts
* Commit Cargo.lock changes in programs/bpf
* Update cli stake instruction import
* Allow integer arithmetic
* Update ABI digest
* Bump rust mem instruction count
* Remove useless structs
* Move stake::id() -> stake::program::id()
* Re-export from solana_sdk and mark deprecated
* Address feedback
* Run cargo fmt
* Run cargo fmt post cherry-pick
* calculate_capitalization uses hash calculation
* feedback
* remove debugging code, clean up slot math
Co-authored-by: Jeff Washington (jwash) <75863576+jeffwashington@users.noreply.github.com>