//! The `tpu` module implements the Transaction Processing Unit, a
//! multi-stage transaction processing pipeline in software.
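/*!
A minimal sketch of what "multi-stage pipeline" means here, not the actual `Tpu`
wiring: each stage runs on its own thread and hands its output to the next stage
over a channel. `spawn_stage` and `run_pipeline` are hypothetical names used only
for illustration, and plain integers stand in for packets and transactions.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical helper: spawn one stage that applies `f` to every item it
// receives and forwards the items for which `f` returns `Some`.
fn spawn_stage<I, O, F>(input: Receiver<I>, output: Sender<O>, f: F) -> JoinHandle<()>
where
    I: Send + 'static,
    O: Send + 'static,
    F: Fn(I) -> Option<O> + Send + 'static,
{
    thread::spawn(move || {
        // The loop ends once every upstream sender has been dropped.
        for item in input {
            if let Some(out) = f(item) {
                // Stop early if the downstream stage has gone away.
                if output.send(out).is_err() {
                    break;
                }
            }
        }
    })
}

// Push `packets` through a "verify" stage and a "process" stage, then
// collect whatever comes out of the far end.
fn run_pipeline(packets: Vec<i64>) -> Vec<i64> {
    let (packet_sender, packet_receiver) = channel();
    let (verified_sender, verified_receiver) = channel();
    let (processed_sender, processed_receiver) = channel();

    // Stand-in for signature verification: drop negative values.
    let verify = spawn_stage(packet_receiver, verified_sender, |p: i64| {
        if p >= 0 {
            Some(p)
        } else {
            None
        }
    });
    // Stand-in for transaction processing: double each value.
    let process = spawn_stage(verified_receiver, processed_sender, |p: i64| Some(p * 2));

    for p in packets {
        packet_sender.send(p).expect("verify stage is running");
    }
    // Dropping the first sender lets each stage drain and exit in turn.
    drop(packet_sender);

    let processed: Vec<i64> = processed_receiver.iter().collect();
    verify.join().expect("verify stage panicked");
    process.join().expect("process stage panicked");
    processed
}

fn main() {
    println!("{:?}", run_pipeline(vec![1, -2, 3]));
}
```

In this sketch, shutdown is driven by channel closure: once the first sender is
dropped, each stage finishes its input, drops its own sender, and exits, so the
join calls return in order.
*/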

use crate::{
    banking_stage::BankingStage,
    broadcast_stage::{BroadcastStage, BroadcastStageType, RetransmitSlotsReceiver},
    cluster_info_vote_listener::{
        ClusterInfoVoteListener, GossipDuplicateConfirmedSlotsSender, GossipVerifiedVoteHashSender,
        VerifiedVoteSender, VoteTracker,
    },
Cost model 1.7 (#20188)
* Cost Model to limit transactions which are not parallelizeable (#16694)
* * Add following to banking_stage:
1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
2. CostTracker which is shared between threads, tracks transaction costs for each block.
* replace hard coded program ID with id() calls
* Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed.
* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.
* add test for cost_tracker atomically try_add operation, serves as safety guard for future changes
* check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker;
* bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations
* replay stage feed back program cost (#17731)
* replay stage feeds back realtime per-program execution cost to cost model;
* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;
* changed cost unit to microsecond, using value collected from mainnet;
* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.
* investigate system performance test degradation (#17919)
* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table
* Change mutex on cost_tracker to RwLock
* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.
* acquire and hold locks for block of TXs, instead of acquire and release per transaction;
* remove redundant would_fit check from cost_tracker update execution path
* refactor cost checking with less frequent lock acquiring
* avoid many Transaction_cost heap allocation when calculate cost, which
is in the hot path - executed per transaction.
* create hashmap with new_capacity to reduce runtime heap realloc.
* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default
* address potential deadlock by acquiring locks one at time
* Persist cost table to blockstore (#18123)
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`
* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup
* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
* log warning when channel send fails (#18391)
* Aggregate cost_model into cost_tracker (#18374)
* * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions
* review fixes
* update ledger tool to restore cost table from blockstore (#18489)
* update ledger tool to restore cost model from blockstore when compute-slot-cost
* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool
* refactor and simplify a test
* manually fix merge conflicts
* Per-program id timings (#17554)
* more manual fixing
* solve a merge conflict
* featurize cost model
* more merge fix
* cost model uses compute_unit to replace microsecond as cost unit
(#18934)
* Reject blocks for costs above the max block cost (#18994)
* Update block max cost limit to fix performance regession (#19276)
* replace function with const var for better readability (#19285)
* Add few more metrics data points (#19624)
* periodically report sigverify_stage stats (#19674)
* manual merge
* cost model nits (#18528)
* Accumulate consumed units (#18714)
* tx wide compute budget (#18631)
* more manual merge
* ignore zerorize drop security
* - update const cost values with data collected by #19627
- update cost calculation to closely proposed fee schedule #16984
* add transaction cost histogram metrics (#20350)
* rebase to 1.7.15
* add tx count and thread id to stats (#20451)
each stat reports and resets when slot changes
* remove cost_model feature_set
* ignore vote transactions from cost model
Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
2021-10-06 15:11:41 -05:00
    cost_model::CostModel,
    cost_tracker::CostTracker,
    fetch_stage::FetchStage,
    sigverify::TransactionSigVerifier,
    sigverify_stage::SigVerifyStage,
};
use crossbeam_channel::unbounded;
use solana_gossip::cluster_info::ClusterInfo;
use solana_ledger::{blockstore::Blockstore, blockstore_processor::TransactionStatusSender};
use solana_poh::poh_recorder::{PohRecorder, WorkingBankEntry};
use solana_rpc::{
    optimistically_confirmed_bank_tracker::BankNotificationSender,
    rpc_subscriptions::RpcSubscriptions,
};
use solana_runtime::{
    bank_forks::BankForks,
    vote_sender_types::{ReplayVoteReceiver, ReplayVoteSender},
};
use std::{
    net::UdpSocket,
    sync::{
        atomic::AtomicBool,
        mpsc::{channel, Receiver},
        Arc, Mutex, RwLock,
    },
    thread,
};

pub const DEFAULT_TPU_COALESCE_MS: u64 = 5;

pub struct Tpu {
    fetch_stage: FetchStage,
    sigverify_stage: SigVerifyStage,
    vote_sigverify_stage: SigVerifyStage,
    banking_stage: BankingStage,
    cluster_info_vote_listener: ClusterInfoVoteListener,
    broadcast_stage: BroadcastStage,
}

impl Tpu {
    #[allow(clippy::too_many_arguments)]
    pub fn new(
        cluster_info: &Arc<ClusterInfo>,
        poh_recorder: &Arc<Mutex<PohRecorder>>,
        entry_receiver: Receiver<WorkingBankEntry>,
        retransmit_slots_receiver: RetransmitSlotsReceiver,
        transactions_sockets: Vec<UdpSocket>,
        tpu_forwards_sockets: Vec<UdpSocket>,
        tpu_vote_sockets: Vec<UdpSocket>,
        broadcast_sockets: Vec<UdpSocket>,
        subscriptions: &Arc<RpcSubscriptions>,
        transaction_status_sender: Option<TransactionStatusSender>,
        blockstore: &Arc<Blockstore>,
        broadcast_type: &BroadcastStageType,
        exit: &Arc<AtomicBool>,
        shred_version: u16,
        vote_tracker: Arc<VoteTracker>,
        bank_forks: Arc<RwLock<BankForks>>,
        verified_vote_sender: VerifiedVoteSender,
        gossip_verified_vote_hash_sender: GossipVerifiedVoteHashSender,
        replay_vote_receiver: ReplayVoteReceiver,
        replay_vote_sender: ReplayVoteSender,
        bank_notification_sender: Option<BankNotificationSender>,
        tpu_coalesce_ms: u64,
        cluster_confirmed_slot_sender: GossipDuplicateConfirmedSlotsSender,
        cost_model: &Arc<RwLock<CostModel>>,
    ) -> Self {
        let (packet_sender, packet_receiver) = channel();
        let (vote_packet_sender, vote_packet_receiver) = channel();
        let fetch_stage = FetchStage::new_with_sender(
            transactions_sockets,
            tpu_forwards_sockets,
            tpu_vote_sockets,
            exit,
            &packet_sender,
            &vote_packet_sender,
            poh_recorder,
            tpu_coalesce_ms,
        );
        let (verified_sender, verified_receiver) = unbounded();

        let sigverify_stage = {
            let verifier = TransactionSigVerifier::default();
            SigVerifyStage::new(packet_receiver, verified_sender, verifier)
        };

        let (verified_tpu_vote_packets_sender, verified_tpu_vote_packets_receiver) = unbounded();

        let vote_sigverify_stage = {
            let verifier = TransactionSigVerifier::new_reject_non_vote();
            SigVerifyStage::new(
                vote_packet_receiver,
                verified_tpu_vote_packets_sender,
                verifier,
            )
        };

        let (verified_gossip_vote_packets_sender, verified_gossip_vote_packets_receiver) =
            unbounded();
        let cluster_info_vote_listener = ClusterInfoVoteListener::new(
            exit,
            cluster_info.clone(),
            verified_gossip_vote_packets_sender,
            poh_recorder,
            vote_tracker,
            bank_forks.clone(),
            subscriptions.clone(),
            verified_vote_sender,
            gossip_verified_vote_hash_sender,
            replay_vote_receiver,
            blockstore.clone(),
            bank_notification_sender,
            cluster_confirmed_slot_sender,
        );

        let cost_tracker = Arc::new(RwLock::new(CostTracker::new(cost_model.clone())));
        let banking_stage = BankingStage::new(
            cluster_info,
            poh_recorder,
            verified_receiver,
            verified_tpu_vote_packets_receiver,
            verified_gossip_vote_packets_receiver,
            transaction_status_sender,
            replay_vote_sender,
            cost_tracker,
        );

        let broadcast_stage = broadcast_type.new_broadcast_stage(
            broadcast_sockets,
            cluster_info.clone(),
            entry_receiver,
            retransmit_slots_receiver,
            exit,
            blockstore,
            &bank_forks,
            shred_version,
        );

        Self {
            fetch_stage,
            sigverify_stage,
            vote_sigverify_stage,
            banking_stage,
            cluster_info_vote_listener,
            broadcast_stage,
        }
    }

    /// Joins every stage, then returns the first error encountered, if any.
    pub fn join(self) -> thread::Result<()> {
        // Join all stages before propagating any error, so no stage is left unjoined.
        let results = vec![
            self.fetch_stage.join(),
            self.sigverify_stage.join(),
            self.vote_sigverify_stage.join(),
            self.cluster_info_vote_listener.join(),
            self.banking_stage.join(),
        ];
        let broadcast_result = self.broadcast_stage.join();
        for result in results {
            result?;
        }
        let _ = broadcast_result?;
        Ok(())
    }
}