Cost model 1.7 (#20188)

* Cost Model to limit transactions which are not parallelizable (#16694)
  * Add following to banking_stage:
    1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
    2. CostTracker which is shared between threads, tracks transaction costs for each block.
  * replace hard coded program ID with id() calls
  * Add Account Access Cost as part of TransactionCost. Account access costs are weighted differently between read and write, signed and non-signed.
  * Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.
  * add test for cost_tracker's atomic try_add operation, serves as safety guard for future changes
  * check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transactions to be buffered; only apply bank processed transactions cost to tracker
  * bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations
* replay stage feed back program cost (#17731)
  * replay stage feeds back realtime per-program execution cost to cost model
  * program cost execution table is initialized into empty table, no longer populated with hardcoded numbers
  * changed cost unit to microsecond, using value collected from mainnet
  * add ExecuteCostTable with fixed capacity for security concern; when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs
* investigate system performance test degradation (#17919)
  * Add stats and counter around cost model ops, mainly:
    - calculate transaction cost
    - check transaction can fit in a block
    - update block cost tracker after transactions are added to block
    - replay_stage to update/insert execution cost to table
  * Change mutex on cost_tracker to RwLock
  * removed cloning cost_tracker for local use, as the metrics show clone is very expensive
  * acquire and hold locks for block of TXs, instead of acquire and release per transaction
  * remove redundant would_fit check from cost_tracker update execution path
  * refactor cost checking with less frequent lock acquiring
  * avoid many Transaction_cost heap allocations when calculating cost, which is in the hot path, executed per transaction
  * create hashmap with new_capacity to reduce runtime heap realloc
  * code review changes: categorize stats, replace explicit drop calls, concisely initiate to default
  * address potential deadlock by acquiring locks one at a time
* Persist cost table to blockstore (#18123)
  * Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
  * Add ProgramCosts to the compaction exclusion list alongside TransactionStatusIndex in one place: `excludes_from_compaction()`
  * Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
  * Delete programs from `ProgramCosts` in blockstore when they are removed from cost_table in memory
  * Only try to persist to blockstore when cost_table is changed
  * Restore cost table during validator startup
  * Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads
  * Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model
* log warning when channel send fails (#18391)
* Aggregate cost_model into cost_tracker (#18374)
  * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock
  * Simplified code, removed unused functions
  * review fixes
* update ledger tool to restore cost table from blockstore (#18489)
  * update ledger tool to restore cost model from blockstore when compute-slot-cost
  * Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool
  * refactor and simplify a test
* manually fix merge conflicts
* Per-program id timings (#17554)
* more manual fixing
* solve a merge conflict
* featurize cost model
* more merge fix
* cost model uses compute_unit to replace microsecond as cost unit (#18934)
* Reject blocks for costs above the max block cost (#18994)
* Update block max cost limit to fix performance regression (#19276)
* replace function with const var for better readability (#19285)
* Add a few more metrics data points (#19624)
* periodically report sigverify_stage stats (#19674)
* manual merge
* cost model nits (#18528)
* Accumulate consumed units (#18714)
* tx wide compute budget (#18631)
* more manual merge
* ignore zeroize drop security
* const cost updates:
  - update const cost values with data collected by #19627
  - update cost calculation to closely match the proposed fee schedule #16984
* add transaction cost histogram metrics (#20350)
* rebase to 1.7.15
* add tx count and thread id to stats (#20451); each stat reports and resets when slot changes
* remove cost_model feature_set
* ignore vote transactions from cost model

Co-authored-by: sakridge <sakridge@gmail.com>
Co-authored-by: Jeff Biseda <jbiseda@gmail.com>
Co-authored-by: Jack May <jack@solana.com>
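The check-then-apply flow described in the message (check candidate transactions against the tracker, buffer those that would exceed the limit, and apply only bank-processed costs) can be sketched roughly as follows. This is a minimal illustration and not the code in this commit: `BlockCostTracker`, `filter_by_cost`, `apply_processed`, and the use of plain `u64` costs are all hypothetical simplifications, and only the `Arc<RwLock<...>>` sharing pattern and the slot-based reset are taken from the diff.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical, simplified stand-in for a per-block cost tracker.
/// Costs are plain u64 values here (an assumption for illustration).
#[derive(Debug)]
pub struct BlockCostTracker {
    slot: u64,
    block_cost: u64,
    block_cost_limit: u64,
}

impl BlockCostTracker {
    pub fn new(block_cost_limit: u64) -> Self {
        Self { slot: 0, block_cost: 0, block_cost_limit }
    }

    /// Clears the accumulated cost when the slot changes.
    /// Returns true only if a reset actually happened.
    pub fn reset_if_new_bank(&mut self, slot: u64) -> bool {
        if slot == self.slot {
            return false;
        }
        self.slot = slot;
        self.block_cost = 0;
        true
    }

    /// Read-only check: would adding `cost` exceed the block limit?
    pub fn would_fit(&self, cost: u64) -> Result<(), &'static str> {
        if self.block_cost.saturating_add(cost) > self.block_cost_limit {
            Err("would exceed block cost limit")
        } else {
            Ok(())
        }
    }

    /// Records the cost of a transaction that was actually processed.
    pub fn add_cost(&mut self, cost: u64) {
        self.block_cost = self.block_cost.saturating_add(cost);
    }

    pub fn block_cost(&self) -> u64 {
        self.block_cost
    }
}

/// Splits candidate indexes into (accepted, retryable) while holding one
/// read lock for the whole batch. Each candidate is checked against the
/// cost already recorded in the tracker, not against other candidates.
pub fn filter_by_cost(
    tracker: &Arc<RwLock<BlockCostTracker>>,
    costs: &[u64],
) -> (Vec<usize>, Vec<usize>) {
    let guard = tracker.read().unwrap();
    let mut accepted = Vec::new();
    let mut retryable = Vec::new();
    for (index, &cost) in costs.iter().enumerate() {
        if guard.would_fit(cost).is_ok() {
            accepted.push(index);
        } else {
            retryable.push(index);
        }
    }
    (accepted, retryable)
}

/// Applies the cost of processed transactions under one write lock.
pub fn apply_processed(
    tracker: &Arc<RwLock<BlockCostTracker>>,
    costs: &[u64],
    processed: &[usize],
) {
    let mut guard = tracker.write().unwrap();
    for &index in processed {
        guard.add_cost(costs[index]);
    }
}
```

Holding one lock per batch rather than per transaction mirrors the "acquire and hold locks for block of TXs" item above; whether that trade-off suits a given workload depends on contention between threads.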
2021-10-06 15:11:41 -05:00
parent a4df784e82
commit db85d659b9
40 changed files with 3208 additions and 266 deletions
--- a/core/src/banking_stage.rs
+++ b/core/src/banking_stage.rs
@@ -1,7 +1,9 @@
 //! The `banking_stage` processes Transaction messages. It is intended to be used
 //! to construct a software pipeline. The stage uses all available CPU cores and
 //! can do its processing in parallel with signature verification on the GPU.
-use crate::packet_hasher::PacketHasher;
+use crate::{
+    cost_tracker::CostTracker, cost_tracker_stats::CostTrackerStats, packet_hasher::PacketHasher,
+};
 use crossbeam_channel::{Receiver as CrossbeamReceiver, RecvTimeoutError};
 use itertools::Itertools;
 use lru::LruCache;
@@ -52,7 +54,7 @@ use std::{
    net::{SocketAddr, UdpSocket},
    ops::DerefMut,
    sync::atomic::{AtomicU64, AtomicUsize, Ordering},
-    sync::{Arc, Mutex},
+    sync::{Arc, Mutex, RwLock},
    thread::{self, Builder, JoinHandle},
    time::Duration,
    time::Instant,
@@ -93,6 +95,9 @@ pub struct BankingStageStats {
    current_buffered_packet_batches_count: AtomicUsize,
    rebuffered_packets_count: AtomicUsize,
    consumed_buffered_packets_count: AtomicUsize,
+    reset_cost_tracker_count: AtomicUsize,
+    cost_tracker_check_count: AtomicUsize,
+    cost_forced_retry_transactions_count: AtomicUsize,

    // Timing
    consume_buffered_packets_elapsed: AtomicU64,
@@ -101,7 +106,11 @@ pub struct BankingStageStats {
    filter_pending_packets_elapsed: AtomicU64,
    packet_duplicate_check_elapsed: AtomicU64,
    packet_conversion_elapsed: AtomicU64,
+    unprocessed_packet_conversion_elapsed: AtomicU64,
    transaction_processing_elapsed: AtomicU64,
+    cost_tracker_update_elapsed: AtomicU64,
+    cost_tracker_clone_elapsed: AtomicU64,
+    cost_tracker_check_elapsed: AtomicU64,
 }

 impl BankingStageStats {
@@ -165,6 +174,22 @@ impl BankingStageStats {
                        .swap(0, Ordering::Relaxed) as i64,
                    i64
                ),
+                (
+                    "reset_cost_tracker_count",
+                    self.reset_cost_tracker_count.swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
+                (
+                    "cost_tracker_check_count",
+                    self.cost_tracker_check_count.swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
+                (
+                    "cost_forced_retry_transactions_count",
+                    self.cost_forced_retry_transactions_count
+                        .swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
                (
                    "consume_buffered_packets_elapsed",
                    self.consume_buffered_packets_elapsed
@@ -199,12 +224,33 @@ impl BankingStageStats {
                    self.packet_conversion_elapsed.swap(0, Ordering::Relaxed) as i64,
                    i64
                ),
+                (
+                    "unprocessed_packet_conversion_elapsed",
+                    self.unprocessed_packet_conversion_elapsed
+                        .swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
                (
                    "transaction_processing_elapsed",
                    self.transaction_processing_elapsed
                        .swap(0, Ordering::Relaxed) as i64,
                    i64
                ),
+                (
+                    "cost_tracker_update_elapsed",
+                    self.cost_tracker_update_elapsed.swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
+                (
+                    "cost_tracker_clone_elapsed",
+                    self.cost_tracker_clone_elapsed.swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
+                (
+                    "cost_tracker_check_elapsed",
+                    self.cost_tracker_check_elapsed.swap(0, Ordering::Relaxed) as i64,
+                    i64
+                ),
            );
        }
    }
@@ -241,6 +287,7 @@ impl BankingStage {
        verified_vote_receiver: CrossbeamReceiver<Vec<Packets>>,
        transaction_status_sender: Option<TransactionStatusSender>,
        gossip_vote_sender: ReplayVoteSender,
+        cost_tracker: Arc<RwLock<CostTracker>>,
    ) -> Self {
        Self::new_num_threads(
            cluster_info,
@@ -251,6 +298,7 @@ impl BankingStage {
            Self::num_threads(),
            transaction_status_sender,
            gossip_vote_sender,
+            cost_tracker,
        )
    }

@@ -263,6 +311,7 @@ impl BankingStage {
        num_threads: u32,
        transaction_status_sender: Option<TransactionStatusSender>,
        gossip_vote_sender: ReplayVoteSender,
+        cost_tracker: Arc<RwLock<CostTracker>>,
    ) -> Self {
        let batch_limit = TOTAL_BUFFERED_PACKETS / ((num_threads - 1) as usize * PACKETS_PER_BATCH);
        // Single thread to generate entries from many banks.
@@ -298,6 +347,7 @@ impl BankingStage {
                let gossip_vote_sender = gossip_vote_sender.clone();
                let duplicates = duplicates.clone();
                let data_budget = data_budget.clone();
+                let cost_tracker = cost_tracker.clone();
                Builder::new()
                    .name("solana-banking-stage-tx".to_string())
                    .spawn(move || {
@@ -314,6 +364,7 @@ impl BankingStage {
                            gossip_vote_sender,
                            &duplicates,
                            &data_budget,
+                            &cost_tracker,
                        );
                    })
                    .unwrap()
@@ -371,6 +422,25 @@ impl BankingStage {
        has_more_unprocessed_transactions
    }

+    fn reset_cost_tracker_if_new_bank(
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        bank: Arc<Bank>,
+        banking_stage_stats: &BankingStageStats,
+        cost_tracker_stats: &mut CostTrackerStats,
+    ) {
+        if cost_tracker
+            .write()
+            .unwrap()
+            .reset_if_new_bank(bank.slot(), cost_tracker_stats)
+        {
+            // only increase counter when bank changed
+            banking_stage_stats
+                .reset_cost_tracker_count
+                .fetch_add(1, Ordering::Relaxed);
+        }
+    }
+
+    #[allow(clippy::too_many_arguments)]
    pub fn consume_buffered_packets(
        my_pubkey: &Pubkey,
        max_tx_ingestion_ns: u128,
@@ -381,6 +451,8 @@ impl BankingStage {
        test_fn: Option<impl Fn()>,
        banking_stage_stats: &BankingStageStats,
        recorder: &TransactionRecorder,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        cost_tracker_stats: &mut CostTrackerStats,
    ) {
        let mut rebuffered_packets_len = 0;
        let mut new_tx_count = 0;
@@ -398,6 +470,9 @@ impl BankingStage {
                    original_unprocessed_indexes,
                    my_pubkey,
                    *next_leader,
+                    cost_tracker,
+                    banking_stage_stats,
+                    cost_tracker_stats,
                );
                Self::update_buffered_packets_with_new_unprocessed(
                    original_unprocessed_indexes,
@@ -406,6 +481,12 @@ impl BankingStage {
            } else {
                let bank_start = poh_recorder.lock().unwrap().bank_start();
                if let Some((bank, bank_creation_time)) = bank_start {
+                    Self::reset_cost_tracker_if_new_bank(
+                        cost_tracker,
+                        bank.clone(),
+                        banking_stage_stats,
+                        cost_tracker_stats,
+                    );
                    let (processed, verified_txs_len, new_unprocessed_indexes) =
                        Self::process_packets_transactions(
                            &bank,
@@ -416,6 +497,8 @@ impl BankingStage {
                            transaction_status_sender.clone(),
                            gossip_vote_sender,
                            banking_stage_stats,
+                            cost_tracker,
+                            cost_tracker_stats,
                        );
                    if processed < verified_txs_len
                        || !Bank::should_bank_still_be_processing_txs(
@@ -519,6 +602,8 @@ impl BankingStage {
        banking_stage_stats: &BankingStageStats,
        recorder: &TransactionRecorder,
        data_budget: &DataBudget,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        cost_tracker_stats: &mut CostTrackerStats,
    ) -> BufferedPacketsDecision {
        let bank_start;
        let (
@@ -529,6 +614,14 @@ impl BankingStage {
        ) = {
            let poh = poh_recorder.lock().unwrap();
            bank_start = poh.bank_start();
+            if let Some((ref bank, _)) = bank_start {
+                Self::reset_cost_tracker_if_new_bank(
+                    cost_tracker,
+                    bank.clone(),
+                    banking_stage_stats,
+                    cost_tracker_stats,
+                );
+            };
            (
                poh.leader_after_n_slots(FORWARD_TRANSACTIONS_TO_LEADER_AT_SLOT_OFFSET),
                PohRecorder::get_bank_still_processing_txs(&bank_start),
@@ -559,6 +652,8 @@ impl BankingStage {
                    None::<Box<dyn Fn()>>,
                    banking_stage_stats,
                    recorder,
+                    cost_tracker,
+                    cost_tracker_stats,
                );
            }
            BufferedPacketsDecision::Forward => {
@@ -638,11 +733,13 @@ impl BankingStage {
        gossip_vote_sender: ReplayVoteSender,
        duplicates: &Arc<Mutex<(LruCache<u64, ()>, PacketHasher)>>,
        data_budget: &DataBudget,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
    ) {
        let recorder = poh_recorder.lock().unwrap().recorder();
        let socket = UdpSocket::bind("0.0.0.0:0").unwrap();
        let mut buffered_packets = VecDeque::with_capacity(batch_limit);
        let banking_stage_stats = BankingStageStats::new(id);
+        let mut cost_tracker_stats = CostTrackerStats::new(id, 0);
        loop {
            while !buffered_packets.is_empty() {
                let decision = Self::process_buffered_packets(
@@ -657,6 +754,8 @@ impl BankingStage {
                    &banking_stage_stats,
                    &recorder,
                    data_budget,
+                    cost_tracker,
+                    &mut cost_tracker_stats,
                );
                if matches!(decision, BufferedPacketsDecision::Hold)
                    || matches!(decision, BufferedPacketsDecision::ForwardAndHold)
@@ -691,6 +790,8 @@ impl BankingStage {
                &banking_stage_stats,
                duplicates,
                &recorder,
+                cost_tracker,
+                &mut cost_tracker_stats,
            ) {
                Ok(()) | Err(RecvTimeoutError::Timeout) => (),
                Err(RecvTimeoutError::Disconnected) => break,
@@ -935,12 +1036,12 @@ impl BankingStage {
    ) -> (usize, Vec<usize>) {
        let mut chunk_start = 0;
        let mut unprocessed_txs = vec![];
+
        while chunk_start != transactions.len() {
            let chunk_end = std::cmp::min(
                transactions.len(),
                chunk_start + MAX_NUM_TRANSACTIONS_PER_BATCH,
            );
-
            let (result, retryable_txs_in_chunk) = Self::process_and_record_transactions(
                bank,
                &transactions[chunk_start..chunk_end],
@@ -1023,13 +1124,21 @@ impl BankingStage {
    // This function deserializes packets into transactions, computes the blake3 hash of transaction messages,
    // and verifies secp256k1 instructions. A list of valid transactions are returned with their message hashes
    // and packet indexes.
+    // Also returned are the packet indexes of transactions that should be retried due to cost limits.
+    #[allow(clippy::needless_collect)]
    fn transactions_from_packets(
        msgs: &Packets,
        transaction_indexes: &[usize],
        libsecp256k1_0_5_upgrade_enabled: bool,
        votes_only: bool,
-    ) -> (Vec<HashedTransaction<'static>>, Vec<usize>) {
-        transaction_indexes
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        banking_stage_stats: &BankingStageStats,
+        demote_program_write_locks: bool,
+        cost_tracker_stats: &mut CostTrackerStats,
+    ) -> (Vec<HashedTransaction<'static>>, Vec<usize>, Vec<usize>) {
+        let mut retryable_transaction_packet_indexes: Vec<usize> = vec![];
+
+        let verified_transactions_with_packet_indexes: Vec<_> = transaction_indexes
            .iter()
            .filter_map(|tx_index| {
                let p = &msgs.packets[*tx_index];
@@ -1040,14 +1149,68 @@ impl BankingStage {
                let tx: Transaction = limited_deserialize(&p.data[0..p.meta.size]).ok()?;
                tx.verify_precompiles(libsecp256k1_0_5_upgrade_enabled)
                    .ok()?;
-                let message_bytes = Self::packet_message(p)?;
-                let message_hash = Message::hash_raw_message(message_bytes);
-                Some((
-                    HashedTransaction::new(Cow::Owned(tx), message_hash),
-                    tx_index,
-                ))
+
+                Some((tx, *tx_index))
            })
-            .unzip()
+            .collect();
+        banking_stage_stats.cost_tracker_check_count.fetch_add(
+            verified_transactions_with_packet_indexes.len(),
+            Ordering::Relaxed,
+        );
+
+        let mut cost_tracker_check_time = Measure::start("cost_tracker_check_time");
+        let filtered_transactions_with_packet_indexes: Vec<_> = {
+            let cost_tracker_readonly = cost_tracker.read().unwrap();
+            verified_transactions_with_packet_indexes
+                .into_iter()
+                .filter_map(|(tx, tx_index)| {
+                    // put transaction into retry queue if it wouldn't fit
+                    // into current bank
+                    let is_vote = &msgs.packets[tx_index].meta.is_simple_vote_tx;
+
+                    // excluding vote TX from cost_model, for now
+                    if !is_vote
+                        && cost_tracker_readonly
+                            .would_transaction_fit(
+                                &tx,
+                                demote_program_write_locks,
+                                cost_tracker_stats,
+                            )
+                            .is_err()
+                    {
+                        debug!("transaction {:?} would exceed limit", tx);
+                        retryable_transaction_packet_indexes.push(tx_index);
+                        return None;
+                    }
+                    Some((tx, tx_index))
+                })
+                .collect()
+        };
+        cost_tracker_check_time.stop();
+
+        let (filtered_transactions, filter_transaction_packet_indexes) =
+            filtered_transactions_with_packet_indexes
+                .into_iter()
+                .filter_map(|(tx, tx_index)| {
+                    let p = &msgs.packets[tx_index];
+                    let message_bytes = Self::packet_message(p)?;
+                    let message_hash = Message::hash_raw_message(message_bytes);
+                    Some((
+                        HashedTransaction::new(Cow::Owned(tx), message_hash),
+                        tx_index,
+                    ))
+                })
+                .unzip();
+
+        banking_stage_stats
+            .cost_tracker_check_elapsed
+            .fetch_add(cost_tracker_check_time.as_us(), Ordering::Relaxed);
+
+        (
+            filtered_transactions,
+            filter_transaction_packet_indexes,
+            retryable_transaction_packet_indexes,
+        )
    }

    /// This function filters pending packets that are still valid
@@ -1089,6 +1252,7 @@ impl BankingStage {
        Self::filter_valid_transaction_indexes(&results, transaction_to_packet_indexes)
    }

+    #[allow(clippy::too_many_arguments)]
    fn process_packets_transactions(
        bank: &Arc<Bank>,
        bank_creation_time: &Instant,
@@ -1098,20 +1262,32 @@ impl BankingStage {
        transaction_status_sender: Option<TransactionStatusSender>,
        gossip_vote_sender: &ReplayVoteSender,
        banking_stage_stats: &BankingStageStats,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        cost_tracker_stats: &mut CostTrackerStats,
    ) -> (usize, usize, Vec<usize>) {
        let mut packet_conversion_time = Measure::start("packet_conversion");
-        let (transactions, transaction_to_packet_indexes) = Self::transactions_from_packets(
-            msgs,
-            &packet_indexes,
-            bank.libsecp256k1_0_5_upgrade_enabled(),
-            bank.vote_only_bank(),
-        );
+        let (transactions, transaction_to_packet_indexes, retryable_packet_indexes) =
+            Self::transactions_from_packets(
+                msgs,
+                &packet_indexes,
+                bank.libsecp256k1_0_5_upgrade_enabled(),
+                bank.vote_only_bank(),
+                cost_tracker,
+                banking_stage_stats,
+                bank.demote_program_write_locks(),
+                cost_tracker_stats,
+            );
        packet_conversion_time.stop();
+        inc_new_counter_info!("banking_stage-packet_conversion", 1);

+        banking_stage_stats
+            .cost_forced_retry_transactions_count
+            .fetch_add(retryable_packet_indexes.len(), Ordering::Relaxed);
        debug!(
-            "bank: {} filtered transactions {}",
+            "bank: {} filtered transactions {} cost limited transactions {}",
            bank.slot(),
-            transactions.len()
+            transactions.len(),
+            retryable_packet_indexes.len()
        );

        let tx_len = transactions.len();
@@ -1126,11 +1302,27 @@ impl BankingStage {
            gossip_vote_sender,
        );
        process_tx_time.stop();
-
        let unprocessed_tx_count = unprocessed_tx_indexes.len();
+        inc_new_counter_info!(
+            "banking_stage-unprocessed_transactions",
+            unprocessed_tx_count
+        );
+
+        // applying cost of processed transactions to shared cost_tracker
+        let mut cost_tracking_time = Measure::start("cost_tracking_time");
+        transactions.iter().enumerate().for_each(|(index, tx)| {
+            if unprocessed_tx_indexes.iter().all(|&i| i != index) {
+                cost_tracker.write().unwrap().add_transaction_cost(
+                    tx.transaction(),
+                    bank.demote_program_write_locks(),
+                    cost_tracker_stats,
+                );
+            }
+        });
+        cost_tracking_time.stop();

        let mut filter_pending_packets_time = Measure::start("filter_pending_packets_time");
-        let filtered_unprocessed_packet_indexes = Self::filter_pending_packets_from_pending_txs(
+        let mut filtered_unprocessed_packet_indexes = Self::filter_pending_packets_from_pending_txs(
            bank,
            &transactions,
            &transaction_to_packet_indexes,
@@ -1143,12 +1335,19 @@ impl BankingStage {
            unprocessed_tx_count.saturating_sub(filtered_unprocessed_packet_indexes.len())
        );

+        // combine cost-related unprocessed transactions with bank determined unprocessed for
+        // buffering
+        filtered_unprocessed_packet_indexes.extend(retryable_packet_indexes);
+
        banking_stage_stats
            .packet_conversion_elapsed
            .fetch_add(packet_conversion_time.as_us(), Ordering::Relaxed);
        banking_stage_stats
            .transaction_processing_elapsed
            .fetch_add(process_tx_time.as_us(), Ordering::Relaxed);
+        banking_stage_stats
+            .cost_tracker_update_elapsed
+            .fetch_add(cost_tracking_time.as_us(), Ordering::Relaxed);
        banking_stage_stats
            .filter_pending_packets_elapsed
            .fetch_add(filter_pending_packets_time.as_us(), Ordering::Relaxed);
@@ -1162,6 +1361,9 @@ impl BankingStage {
        transaction_indexes: &[usize],
        my_pubkey: &Pubkey,
        next_leader: Option<Pubkey>,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        banking_stage_stats: &BankingStageStats,
+        cost_tracker_stats: &mut CostTrackerStats,
    ) -> Vec<usize> {
        // Check if we are the next leader. If so, let's not filter the packets
        // as we'll filter it again while processing the packets.
@@ -1172,27 +1374,43 @@ impl BankingStage {
            }
        }

-        let (transactions, transaction_to_packet_indexes) = Self::transactions_from_packets(
-            msgs,
-            &transaction_indexes,
-            bank.libsecp256k1_0_5_upgrade_enabled(),
-            bank.vote_only_bank(),
-        );
+        let mut unprocessed_packet_conversion_time =
+            Measure::start("unprocessed_packet_conversion");
+        let (transactions, transaction_to_packet_indexes, retry_packet_indexes) =
+            Self::transactions_from_packets(
+                msgs,
+                &transaction_indexes,
+                bank.libsecp256k1_0_5_upgrade_enabled(),
+                bank.vote_only_bank(),
+                cost_tracker,
+                banking_stage_stats,
+                bank.demote_program_write_locks(),
+                cost_tracker_stats,
+            );
+        unprocessed_packet_conversion_time.stop();

        let tx_count = transaction_to_packet_indexes.len();

        let unprocessed_tx_indexes = (0..transactions.len()).collect_vec();
-        let filtered_unprocessed_packet_indexes = Self::filter_pending_packets_from_pending_txs(
+        let mut filtered_unprocessed_packet_indexes = Self::filter_pending_packets_from_pending_txs(
            bank,
            &transactions,
            &transaction_to_packet_indexes,
            &unprocessed_tx_indexes,
        );

+        filtered_unprocessed_packet_indexes.extend(retry_packet_indexes);
+
        inc_new_counter_info!(
            "banking_stage-dropped_tx_before_forwarding",
            tx_count.saturating_sub(filtered_unprocessed_packet_indexes.len())
        );
+        banking_stage_stats
+            .unprocessed_packet_conversion_elapsed
+            .fetch_add(
+                unprocessed_packet_conversion_time.as_us(),
+                Ordering::Relaxed,
+            );

        filtered_unprocessed_packet_indexes
    }
@@ -1228,6 +1446,8 @@ impl BankingStage {
        banking_stage_stats: &BankingStageStats,
        duplicates: &Arc<Mutex<(LruCache<u64, ()>, PacketHasher)>>,
        recorder: &TransactionRecorder,
+        cost_tracker: &Arc<RwLock<CostTracker>>,
+        cost_tracker_stats: &mut CostTrackerStats,
    ) -> Result<(), RecvTimeoutError> {
        let mut recv_time = Measure::start("process_packets_recv");
        let mms = verified_receiver.recv_timeout(recv_timeout)?;
@@ -1268,6 +1488,12 @@ impl BankingStage {
                continue;
            }
            let (bank, bank_creation_time) = bank_start.unwrap();
+            Self::reset_cost_tracker_if_new_bank(
+                cost_tracker,
+                bank.clone(),
+                banking_stage_stats,
+                cost_tracker_stats,
+            );

            let (processed, verified_txs_len, unprocessed_indexes) =
                Self::process_packets_transactions(
@@ -1279,6 +1505,8 @@ impl BankingStage {
                    transaction_status_sender.clone(),
                    gossip_vote_sender,
                    banking_stage_stats,
+                    cost_tracker,
+                    cost_tracker_stats,
                );

            new_tx_count += processed;
@@ -1310,6 +1538,9 @@ impl BankingStage {
                        &packet_indexes,
                        my_pubkey,
                        next_leader,
+                        cost_tracker,
+                        banking_stage_stats,
+                        cost_tracker_stats,
                    );
                    Self::push_unprocessed(
                        buffered_packets,
@@ -1464,6 +1695,7 @@ where
 #[cfg(test)]
 mod tests {
    use super::*;
+    use crate::cost_model::CostModel;
    use crossbeam_channel::unbounded;
    use itertools::Itertools;
    use solana_gossip::{cluster_info::Node, contact_info::ContactInfo};
@@ -1536,6 +1768,9 @@ mod tests {
                gossip_verified_vote_receiver,
                None,
                vote_forward_sender,
+                Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
            );
            drop(verified_sender);
            drop(gossip_verified_vote_sender);
@@ -1584,6 +1819,9 @@ mod tests {
                verified_gossip_vote_receiver,
                None,
                vote_forward_sender,
+                Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
            );
            trace!("sending bank");
            drop(verified_sender);
@@ -1656,6 +1894,9 @@ mod tests {
                gossip_verified_vote_receiver,
                None,
                gossip_vote_sender,
+                Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
            );

            // fund another account so we can send 2 good transactions in a single batch.
@@ -1806,6 +2047,9 @@ mod tests {
                    3,
                    None,
                    gossip_vote_sender,
+                    Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                        CostModel::default(),
+                    ))))),
                );

                // wait for banking_stage to eat the packets
@@ -2627,6 +2871,10 @@ mod tests {
                None::<Box<dyn Fn()>>,
                &BankingStageStats::default(),
                &recorder,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(buffered_packets[0].1.len(), num_conflicting_transactions);
            // When the poh recorder has a bank, should process all non conflicting buffered packets.
@@ -2643,6 +2891,10 @@ mod tests {
                    None::<Box<dyn Fn()>>,
                    &BankingStageStats::default(),
                    &recorder,
+                    &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                        CostModel::default(),
+                    ))))),
+                    &mut CostTrackerStats::default(),
                );
                if num_expected_unprocessed == 0 {
                    assert!(buffered_packets.is_empty())
@@ -2708,6 +2960,10 @@ mod tests {
                        test_fn,
                        &BankingStageStats::default(),
                        &recorder,
+                        &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                            CostModel::default(),
+                        ))))),
+                        &mut CostTrackerStats::default(),
                    );

                    // Check everything is correct. All indexes after `interrupted_iteration`
@@ -2956,21 +3212,33 @@ mod tests {
                make_test_packets(vec![transfer_tx.clone(), transfer_tx.clone()], vote_indexes);

            let mut votes_only = false;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(2, txs.len());
            assert_eq!(vec![0, 1], tx_packet_index);

            votes_only = true;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(0, txs.len());
            assert_eq!(0, tx_packet_index.len());
@@ -2985,21 +3253,33 @@ mod tests {
            );

            let mut votes_only = false;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(3, txs.len());
            assert_eq!(vec![0, 1, 2], tx_packet_index);

            votes_only = true;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(2, txs.len());
            assert_eq!(vec![0, 2], tx_packet_index);
@@ -3014,21 +3294,33 @@ mod tests {
            );

            let mut votes_only = false;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(3, txs.len());
            assert_eq!(vec![0, 1, 2], tx_packet_index);

            votes_only = true;
-            let (txs, tx_packet_index) = BankingStage::transactions_from_packets(
+            let (txs, tx_packet_index, _) = BankingStage::transactions_from_packets(
                &packets,
                &packet_indexes,
                false,
                votes_only,
+                &Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
+                    CostModel::default(),
+                ))))),
+                &BankingStageStats::default(),
+                false,
+                &mut CostTrackerStats::default(),
            );
            assert_eq!(3, txs.len());
            assert_eq!(vec![0, 1, 2], tx_packet_index);
--- a/core/src/cost_model.rs
+++ b/core/src/cost_model.rs
@@ -0,0 +1,519 @@
+//! `cost_model` provides a service to estimate a transaction's cost
+//! following the proposed fee schedule #16984; relevant cluster cost
+//! measuring is described by #19627.
+//!
+//! The main function is `calculate_cost`, which returns `&TransactionCost`.
+//!
+use crate::execute_cost_table::ExecuteCostTable;
+use log::*;
+use solana_ledger::block_cost_limits::*;
+use solana_sdk::{pubkey::Pubkey, transaction::Transaction};
+use std::collections::HashMap;
+
+const MAX_WRITABLE_ACCOUNTS: usize = 256;
+
+#[derive(Debug, Clone)]
+pub enum CostModelError {
+    /// a transaction that would fail sanitization; the cost model is not able
+    /// to process such a transaction.
+    InvalidTransaction,
+
+    /// would exceed block max limit
+    WouldExceedBlockMaxLimit,
+
+    /// would exceed account max limit
+    WouldExceedAccountMaxLimit,
+}
+
+#[derive(Default, Debug)]
+pub struct TransactionCost {
+    pub writable_accounts: Vec<Pubkey>,
+    pub signature_cost: u64,
+    pub write_lock_cost: u64,
+    pub data_bytes_cost: u64,
+    pub execution_cost: u64,
+}
+
+impl TransactionCost {
+    pub fn new_with_capacity(capacity: usize) -> Self {
+        Self {
+            writable_accounts: Vec::with_capacity(capacity),
+            ..Self::default()
+        }
+    }
+
+    pub fn reset(&mut self) {
+        self.writable_accounts.clear();
+        self.signature_cost = 0;
+        self.write_lock_cost = 0;
+        self.data_bytes_cost = 0;
+        self.execution_cost = 0;
+    }
+
+    pub fn sum(&self) -> u64 {
+        self.signature_cost + self.write_lock_cost + self.data_bytes_cost + self.execution_cost
+    }
+}
+
+#[derive(Debug)]
+pub struct CostModel {
+    account_cost_limit: u64,
+    block_cost_limit: u64,
+    instruction_execution_cost_table: ExecuteCostTable,
+
+    // reusable variables
+    transaction_cost: TransactionCost,
+}
+
+impl Default for CostModel {
+    fn default() -> Self {
+        CostModel::new(MAX_WRITABLE_ACCOUNT_UNITS, MAX_BLOCK_UNITS)
+    }
+}
+
+impl CostModel {
+    pub fn new(chain_max: u64, block_max: u64) -> Self {
+        Self {
+            account_cost_limit: chain_max,
+            block_cost_limit: block_max,
+            instruction_execution_cost_table: ExecuteCostTable::default(),
+            transaction_cost: TransactionCost::new_with_capacity(MAX_WRITABLE_ACCOUNTS),
+        }
+    }
+
+    pub fn get_account_cost_limit(&self) -> u64 {
+        self.account_cost_limit
+    }
+
+    pub fn get_block_cost_limit(&self) -> u64 {
+        self.block_cost_limit
+    }
+
+    pub fn initialize_cost_table(&mut self, cost_table: &[(Pubkey, u64)]) {
+        cost_table
+            .iter()
+            .map(|(key, cost)| (key, cost))
+            .chain(BUILT_IN_INSTRUCTION_COSTS.iter())
+            .for_each(|(program_id, cost)| {
+                match self
+                    .instruction_execution_cost_table
+                    .upsert(program_id, *cost)
+                {
+                    Some(c) => {
+                        debug!(
+                            "initiating cost table, instruction {:?} has cost {}",
+                            program_id, c
+                        );
+                    }
+                    None => {
+                        debug!(
+                            "initiating cost table, failed for instruction {:?}",
+                            program_id
+                        );
+                    }
+                }
+            });
+        debug!(
+            "restored cost model instruction cost table from blockstore, current values: {:?}",
+            self.get_instruction_cost_table()
+        );
+    }
+
+    pub fn calculate_cost(
+        &mut self,
+        transaction: &Transaction,
+        demote_program_write_locks: bool,
+    ) -> &TransactionCost {
+        self.transaction_cost.reset();
+
+        self.transaction_cost.signature_cost = self.get_signature_cost(transaction);
+        self.get_write_lock_cost(transaction, demote_program_write_locks);
+        self.transaction_cost.data_bytes_cost = self.get_data_bytes_cost(transaction);
+        self.transaction_cost.execution_cost = self.get_transaction_cost(transaction);
+
+        debug!(
+            "transaction {:?} has cost {:?}",
+            transaction, self.transaction_cost
+        );
+        &self.transaction_cost
+    }
+
+    pub fn upsert_instruction_cost(
+        &mut self,
+        program_key: &Pubkey,
+        cost: u64,
+    ) -> Result<u64, &'static str> {
+        self.instruction_execution_cost_table
+            .upsert(program_key, cost);
+        match self.instruction_execution_cost_table.get_cost(program_key) {
+            Some(cost) => Ok(*cost),
+            None => Err("failed to upsert to ExecuteCostTable"),
+        }
+    }
+
+    pub fn get_instruction_cost_table(&self) -> &HashMap<Pubkey, u64> {
+        self.instruction_execution_cost_table.get_cost_table()
+    }
+
+    fn get_signature_cost(&self, transaction: &Transaction) -> u64 {
+        transaction.signatures.len() as u64 * SIGNATURE_COST
+    }
+
+    fn get_write_lock_cost(&mut self, transaction: &Transaction, demote_program_write_locks: bool) {
+        let message = transaction.message();
+        message.account_keys.iter().enumerate().for_each(|(i, k)| {
+            let is_writable = message.is_writable(i, demote_program_write_locks);
+
+            if is_writable {
+                self.transaction_cost.writable_accounts.push(*k);
+                self.transaction_cost.write_lock_cost += WRITE_LOCK_UNITS;
+            }
+        });
+    }
+
+    fn get_data_bytes_cost(&self, transaction: &Transaction) -> u64 {
+        let mut data_bytes_cost: u64 = 0;
+        transaction.message().instructions.iter().for_each(|ix| {
+            data_bytes_cost += ix.data.len() as u64 / DATA_BYTES_UNITS;
+        });
+        data_bytes_cost
+    }
+
+    fn get_transaction_cost(&self, transaction: &Transaction) -> u64 {
+        let mut cost: u64 = 0;
+
+        for instruction in &transaction.message().instructions {
+            let program_id =
+                transaction.message().account_keys[instruction.program_id_index as usize];
+            let instruction_cost = self.find_instruction_cost(&program_id);
+            trace!(
+                "instruction {:?} has cost of {}",
+                instruction,
+                instruction_cost
+            );
+            cost = cost.saturating_add(instruction_cost);
+        }
+        cost
+    }
+
+    fn find_instruction_cost(&self, program_key: &Pubkey) -> u64 {
+        match self.instruction_execution_cost_table.get_cost(program_key) {
+            Some(cost) => *cost,
+            None => {
+                let default_value = self.instruction_execution_cost_table.get_mode();
+                debug!(
+                    "Program key {:?} does not have assigned cost, using mode {}",
+                    program_key, default_value
+                );
+                default_value
+            }
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use solana_runtime::{
+        bank::Bank,
+        genesis_utils::{create_genesis_config, GenesisConfigInfo},
+    };
+    use solana_sdk::{
+        bpf_loader,
+        hash::Hash,
+        instruction::CompiledInstruction,
+        message::Message,
+        signature::{Keypair, Signer},
+        system_instruction::{self},
+        system_program, system_transaction,
+    };
+    use std::{
+        str::FromStr,
+        sync::{Arc, RwLock},
+        thread::{self, JoinHandle},
+    };
+
+    fn test_setup() -> (Keypair, Hash) {
+        solana_logger::setup();
+        let GenesisConfigInfo {
+            genesis_config,
+            mint_keypair,
+            ..
+        } = create_genesis_config(10);
+        let bank = Arc::new(Bank::new_no_wallclock_throttle(&genesis_config));
+        let start_hash = bank.last_blockhash();
+        (mint_keypair, start_hash)
+    }
+
+    #[test]
+    fn test_cost_model_instruction_cost() {
+        let mut testee = CostModel::default();
+
+        let known_key = Pubkey::from_str("known11111111111111111111111111111111111111").unwrap();
+        testee.upsert_instruction_cost(&known_key, 100).unwrap();
+        // find cost for known programs
+        assert_eq!(100, testee.find_instruction_cost(&known_key));
+
+        testee
+            .upsert_instruction_cost(&bpf_loader::id(), 1999)
+            .unwrap();
+        assert_eq!(1999, testee.find_instruction_cost(&bpf_loader::id()));
+
+        // unknown program is assigned with default cost
+        assert_eq!(
+            testee.instruction_execution_cost_table.get_mode(),
+            testee.find_instruction_cost(
+                &Pubkey::from_str("unknown111111111111111111111111111111111111").unwrap()
+            )
+        );
+    }
+
+    #[test]
+    fn test_cost_model_simple_transaction() {
+        let (mint_keypair, start_hash) = test_setup();
+
+        let keypair = Keypair::new();
+        let simple_transaction =
+            system_transaction::transfer(&mint_keypair, &keypair.pubkey(), 2, start_hash);
+        debug!(
+            "system_transaction simple_transaction {:?}",
+            simple_transaction
+        );
+
+        // expected cost for one system transfer instruction
+        let expected_cost = 8;
+
+        let mut testee = CostModel::default();
+        testee
+            .upsert_instruction_cost(&system_program::id(), expected_cost)
+            .unwrap();
+        assert_eq!(
+            expected_cost,
+            testee.get_transaction_cost(&simple_transaction)
+        );
+    }
+
+    #[test]
+    fn test_cost_model_transaction_many_transfer_instructions() {
+        let (mint_keypair, start_hash) = test_setup();
+
+        let key1 = solana_sdk::pubkey::new_rand();
+        let key2 = solana_sdk::pubkey::new_rand();
+        let instructions =
+            system_instruction::transfer_many(&mint_keypair.pubkey(), &[(key1, 1), (key2, 1)]);
+        let message = Message::new(&instructions, Some(&mint_keypair.pubkey()));
+        let tx = Transaction::new(&[&mint_keypair], message, start_hash);
+        debug!("many transfer transaction {:?}", tx);
+
+        // expected cost for two system transfer instructions
+        let program_cost = 8;
+        let expected_cost = program_cost * 2;
+
+        let mut testee = CostModel::default();
+        testee
+            .upsert_instruction_cost(&system_program::id(), program_cost)
+            .unwrap();
+        assert_eq!(expected_cost, testee.get_transaction_cost(&tx));
+    }
+
+    #[test]
+    fn test_cost_model_message_many_different_instructions() {
+        let (mint_keypair, start_hash) = test_setup();
+
+        // construct a transaction with multiple random instructions
+        let key1 = solana_sdk::pubkey::new_rand();
+        let key2 = solana_sdk::pubkey::new_rand();
+        let prog1 = solana_sdk::pubkey::new_rand();
+        let prog2 = solana_sdk::pubkey::new_rand();
+        let instructions = vec![
+            CompiledInstruction::new(3, &(), vec![0, 1]),
+            CompiledInstruction::new(4, &(), vec![0, 2]),
+        ];
+        let tx = Transaction::new_with_compiled_instructions(
+            &[&mint_keypair],
+            &[key1, key2],
+            start_hash,
+            vec![prog1, prog2],
+            instructions,
+        );
+        debug!("many random transaction {:?}", tx);
+
+        let testee = CostModel::default();
+        let result = testee.get_transaction_cost(&tx);
+
+        // expected cost for two random/unknown programs is twice the table's mode
+        let expected_cost = testee.instruction_execution_cost_table.get_mode() * 2;
+        assert_eq!(expected_cost, result);
+    }
+
+    #[test]
+    fn test_cost_model_sort_message_accounts_by_type() {
+        // construct a transaction with two random instructions and two signers
+        let signer1 = Keypair::new();
+        let signer2 = Keypair::new();
+        let key1 = Pubkey::new_unique();
+        let key2 = Pubkey::new_unique();
+        let prog1 = Pubkey::new_unique();
+        let prog2 = Pubkey::new_unique();
+        let instructions = vec![
+            CompiledInstruction::new(4, &(), vec![0, 2]),
+            CompiledInstruction::new(5, &(), vec![1, 3]),
+        ];
+        let tx = Transaction::new_with_compiled_instructions(
+            &[&signer1, &signer2],
+            &[key1, key2],
+            Hash::new_unique(),
+            vec![prog1, prog2],
+            instructions,
+        );
+
+        let mut cost_model = CostModel::default();
+        let tx_cost = cost_model.calculate_cost(&tx, /*demote_program_write_locks=*/ true);
+        assert_eq!(2 + 2, tx_cost.writable_accounts.len());
+        assert_eq!(signer1.pubkey(), tx_cost.writable_accounts[0]);
+        assert_eq!(signer2.pubkey(), tx_cost.writable_accounts[1]);
+        assert_eq!(key1, tx_cost.writable_accounts[2]);
+        assert_eq!(key2, tx_cost.writable_accounts[3]);
+    }
+
+    #[test]
+    fn test_cost_model_insert_instruction_cost() {
+        let key1 = Pubkey::new_unique();
+        let cost1 = 100;
+
+        let mut cost_model = CostModel::default();
+        // Using default cost for unknown instruction
+        assert_eq!(
+            cost_model.instruction_execution_cost_table.get_mode(),
+            cost_model.find_instruction_cost(&key1)
+        );
+
+        // insert instruction cost to table
+        assert!(cost_model.upsert_instruction_cost(&key1, cost1).is_ok());
+
+        // now it is a known instruction with a known cost
+        assert_eq!(cost1, cost_model.find_instruction_cost(&key1));
+    }
+
+    #[test]
+    fn test_cost_model_calculate_cost() {
+        let (mint_keypair, start_hash) = test_setup();
+        let tx =
+            system_transaction::transfer(&mint_keypair, &Keypair::new().pubkey(), 2, start_hash);
+
+        let expected_account_cost = WRITE_LOCK_UNITS * 2;
+        let expected_execution_cost = 8;
+
+        let mut cost_model = CostModel::default();
+        cost_model
+            .upsert_instruction_cost(&system_program::id(), expected_execution_cost)
+            .unwrap();
+        let tx_cost = cost_model.calculate_cost(&tx, /*demote_program_write_locks=*/ true);
+        assert_eq!(expected_account_cost, tx_cost.write_lock_cost);
+        assert_eq!(expected_execution_cost, tx_cost.execution_cost);
+        assert_eq!(2, tx_cost.writable_accounts.len());
+    }
+
+    #[test]
+    fn test_cost_model_update_instruction_cost() {
+        let key1 = Pubkey::new_unique();
+        let cost1 = 100;
+        let cost2 = 200;
+        let updated_cost = (cost1 + cost2) / 2;
+
+        let mut cost_model = CostModel::default();
+
+        // insert instruction cost to table
+        assert!(cost_model.upsert_instruction_cost(&key1, cost1).is_ok());
+        assert_eq!(cost1, cost_model.find_instruction_cost(&key1));
+
+        // update instruction cost
+        assert!(cost_model.upsert_instruction_cost(&key1, cost2).is_ok());
+        assert_eq!(updated_cost, cost_model.find_instruction_cost(&key1));
+    }
+
+    #[test]
+    fn test_cost_model_can_be_shared_concurrently_with_rwlock() {
+        let (mint_keypair, start_hash) = test_setup();
+        // construct a transaction with multiple random instructions
+        let key1 = solana_sdk::pubkey::new_rand();
+        let key2 = solana_sdk::pubkey::new_rand();
+        let prog1 = solana_sdk::pubkey::new_rand();
+        let prog2 = solana_sdk::pubkey::new_rand();
+        let instructions = vec![
+            CompiledInstruction::new(3, &(), vec![0, 1]),
+            CompiledInstruction::new(4, &(), vec![0, 2]),
+        ];
+        let tx = Arc::new(Transaction::new_with_compiled_instructions(
+            &[&mint_keypair],
+            &[key1, key2],
+            start_hash,
+            vec![prog1, prog2],
+            instructions,
+        ));
+
+        let number_threads = 10;
+        let expected_account_cost = WRITE_LOCK_UNITS * 3;
+        let cost1 = 100;
+        let cost2 = 200;
+        // execution cost can be either 2 * Default (before write) or cost1+cost2 (after write)
+
+        let cost_model: Arc<RwLock<CostModel>> = Arc::new(RwLock::new(CostModel::default()));
+
+        let thread_handlers: Vec<JoinHandle<()>> = (0..number_threads)
+            .map(|i| {
+                let cost_model = cost_model.clone();
+                let tx = tx.clone();
+
+                if i == 5 {
+                    thread::spawn(move || {
+                        let mut cost_model = cost_model.write().unwrap();
+                        assert!(cost_model.upsert_instruction_cost(&prog1, cost1).is_ok());
+                        assert!(cost_model.upsert_instruction_cost(&prog2, cost2).is_ok());
+                    })
+                } else {
+                    thread::spawn(move || {
+                        let mut cost_model = cost_model.write().unwrap();
+                        let tx_cost = cost_model
+                            .calculate_cost(&tx, /*demote_program_write_locks=*/ true);
+                        assert_eq!(3, tx_cost.writable_accounts.len());
+                        assert_eq!(expected_account_cost, tx_cost.write_lock_cost);
+                    })
+                }
+            })
+            .collect();
+
+        for th in thread_handlers {
+            th.join().unwrap();
+        }
+    }
+
+    #[test]
+    fn test_initialize_cost_table() {
+        // build cost table
+        let cost_table = vec![
+            (Pubkey::new_unique(), 10),
+            (Pubkey::new_unique(), 20),
+            (Pubkey::new_unique(), 30),
+        ];
+
+        // init cost model
+        let mut cost_model = CostModel::default();
+        cost_model.initialize_cost_table(&cost_table);
+
+        // verify
+        for (id, cost) in cost_table.iter() {
+            assert_eq!(*cost, cost_model.find_instruction_cost(id));
+        }
+
+        // verify built-in programs
+        assert!(cost_model
+            .instruction_execution_cost_table
+            .get_cost(&system_program::id())
+            .is_some());
+        assert!(cost_model
+            .instruction_execution_cost_table
+            .get_cost(&solana_vote_program::id())
+            .is_some());
+    }
+}
--- a/core/src/cost_tracker.rs
+++ b/core/src/cost_tracker.rs
@@ -0,0 +1,482 @@
+//! `cost_tracker` tracks transaction cost per chained account as well as for the entire block.
+//! It aggregates `cost_model`, which provides the service of calculating transaction cost.
+//! The main functions are:
+//! - would_transaction_fit(&tx), immutable function to test if `tx` would fit into the current block
+//! - add_transaction_cost(&tx), mutable function to accumulate `tx` cost to the tracker.
+//!
+use crate::cost_model::{CostModel, TransactionCost};
+use crate::cost_tracker_stats::CostTrackerStats;
+use solana_sdk::{clock::Slot, pubkey::Pubkey, transaction::Transaction};
+use std::{
+    collections::HashMap,
+    sync::{Arc, RwLock},
+};
+
+const WRITABLE_ACCOUNTS_PER_BLOCK: usize = 512;
+
+#[derive(Debug)]
+pub struct CostTracker {
+    cost_model: Arc<RwLock<CostModel>>,
+    account_cost_limit: u64,
+    block_cost_limit: u64,
+    current_bank_slot: Slot,
+    cost_by_writable_accounts: HashMap<Pubkey, u64>,
+    block_cost: u64,
+}
+
+impl CostTracker {
+    pub fn new(cost_model: Arc<RwLock<CostModel>>) -> Self {
+        let (account_cost_limit, block_cost_limit) = {
+            let cost_model = cost_model.read().unwrap();
+            (
+                cost_model.get_account_cost_limit(),
+                cost_model.get_block_cost_limit(),
+            )
+        };
+        assert!(account_cost_limit <= block_cost_limit);
+        Self {
+            cost_model,
+            account_cost_limit,
+            block_cost_limit,
+            current_bank_slot: 0,
+            cost_by_writable_accounts: HashMap::with_capacity(WRITABLE_ACCOUNTS_PER_BLOCK),
+            block_cost: 0,
+        }
+    }
+
+    pub fn would_transaction_fit(
+        &self,
+        transaction: &Transaction,
+        demote_program_write_locks: bool,
+        stats: &mut CostTrackerStats,
+    ) -> Result<(), &'static str> {
+        let mut cost_model = self.cost_model.write().unwrap();
+        let tx_cost = cost_model.calculate_cost(transaction, demote_program_write_locks);
+        self.would_fit(&tx_cost.writable_accounts, &tx_cost.sum(), stats)
+    }
+
+    pub fn add_transaction_cost(
+        &mut self,
+        transaction: &Transaction,
+        demote_program_write_locks: bool,
+        stats: &mut CostTrackerStats,
+    ) {
+        let mut cost_model = self.cost_model.write().unwrap();
+        let tx_cost = cost_model.calculate_cost(transaction, demote_program_write_locks);
+        let cost = tx_cost.sum();
+        for account_key in tx_cost.writable_accounts.iter() {
+            *self
+                .cost_by_writable_accounts
+                .entry(*account_key)
+                .or_insert(0) += cost;
+        }
+        self.block_cost += cost;
+
+        stats.transaction_count += 1;
+        stats.block_cost += cost;
+    }
+
+    pub fn reset_if_new_bank(&mut self, slot: Slot, stats: &mut CostTrackerStats) -> bool {
+        // report stats when slot changes
+        if slot != stats.bank_slot {
+            stats.report();
+            *stats = CostTrackerStats::new(stats.id, slot);
+        }
+
+        if slot != self.current_bank_slot {
+            self.current_bank_slot = slot;
+            self.cost_by_writable_accounts.clear();
+            self.block_cost = 0;
+
+            true
+        } else {
+            false
+        }
+    }
+
+    pub fn try_add(
+        &mut self,
+        transaction_cost: &TransactionCost,
+        stats: &mut CostTrackerStats,
+    ) -> Result<u64, &'static str> {
+        let cost = transaction_cost.sum();
+        self.would_fit(&transaction_cost.writable_accounts, &cost, stats)?;
+
+        self.add_transaction(&transaction_cost.writable_accounts, &cost);
+        Ok(self.block_cost)
+    }
+
+    fn would_fit(
+        &self,
+        keys: &[Pubkey],
+        cost: &u64,
+        stats: &mut CostTrackerStats,
+    ) -> Result<(), &'static str> {
+        stats.transaction_cost_histogram.increment(*cost).unwrap();
+
+        // check against the total block cost
+        if self.block_cost + cost > self.block_cost_limit {
+            return Err("would exceed block cost limit");
+        }
+
+        // check if the transaction itself is more costly than the account_cost_limit
+        if *cost > self.account_cost_limit {
+            return Err("Transaction is too expensive, exceeds account cost limit");
+        }
+
+        // check each account against account_cost_limit,
+        for account_key in keys.iter() {
+            match self.cost_by_writable_accounts.get(&account_key) {
+                Some(chained_cost) => {
+                    stats
+                        .writable_accounts_cost_histogram
+                        .increment(*chained_cost)
+                        .unwrap();
+
+                    if chained_cost + cost > self.account_cost_limit {
+                        return Err("would exceed account cost limit");
+                    } else {
+                        continue;
+                    }
+                }
+                None => continue,
+            }
+        }
+
+        Ok(())
+    }
+
+    fn add_transaction(&mut self, keys: &[Pubkey], cost: &u64) {
+        for account_key in keys.iter() {
+            *self
+                .cost_by_writable_accounts
+                .entry(*account_key)
+                .or_insert(0) += cost;
+        }
+        self.block_cost += cost;
+    }
+}
+
+// CostStats can be collected by util, such as ledger_tool
+#[derive(Default, Debug)]
+pub struct CostStats {
+    pub bank_slot: Slot,
+    pub total_cost: u64,
+    pub number_of_accounts: usize,
+    pub costliest_account: Pubkey,
+    pub costliest_account_cost: u64,
+}
+
+impl CostTracker {
+    pub fn get_stats(&self) -> CostStats {
+        let mut stats = CostStats {
+            bank_slot: self.current_bank_slot,
+            total_cost: self.block_cost,
+            number_of_accounts: self.cost_by_writable_accounts.len(),
+            costliest_account: Pubkey::default(),
+            costliest_account_cost: 0,
+        };
+
+        for (key, cost) in self.cost_by_writable_accounts.iter() {
+            if cost > &stats.costliest_account_cost {
+                stats.costliest_account = *key;
+                stats.costliest_account_cost = *cost;
+            }
+        }
+
+        stats
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use solana_runtime::{
+        bank::Bank,
+        genesis_utils::{create_genesis_config, GenesisConfigInfo},
+    };
+    use solana_sdk::{
+        hash::Hash,
+        signature::{Keypair, Signer},
+        system_transaction,
+        transaction::Transaction,
+    };
+    use std::{cmp, sync::Arc};
+
+    fn test_setup() -> (Keypair, Hash) {
+        solana_logger::setup();
+        let GenesisConfigInfo {
+            genesis_config,
+            mint_keypair,
+            ..
+        } = create_genesis_config(10);
+        let bank = Arc::new(Bank::new_no_wallclock_throttle(&genesis_config));
+        let start_hash = bank.last_blockhash();
+        (mint_keypair, start_hash)
+    }
+
+    fn build_simple_transaction(
+        mint_keypair: &Keypair,
+        start_hash: &Hash,
+    ) -> (Transaction, Vec<Pubkey>, u64) {
+        let keypair = Keypair::new();
+        let simple_transaction =
+            system_transaction::transfer(&mint_keypair, &keypair.pubkey(), 2, *start_hash);
+
+        (simple_transaction, vec![mint_keypair.pubkey()], 5)
+    }
+
+    #[test]
+    fn test_cost_tracker_initialization() {
+        let testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(10, 11))));
+        assert_eq!(10, testee.account_cost_limit);
+        assert_eq!(11, testee.block_cost_limit);
+        assert_eq!(0, testee.cost_by_writable_accounts.len());
+        assert_eq!(0, testee.block_cost);
+    }
+
+    #[test]
+    fn test_cost_tracker_ok_add_one() {
+        let (mint_keypair, start_hash) = test_setup();
+        let (_tx, keys, cost) = build_simple_transaction(&mint_keypair, &start_hash);
+
+        // build testee to have capacity for one simple transaction
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(cost, cost))));
+        assert!(testee
+            .would_fit(&keys, &cost, &mut CostTrackerStats::default())
+            .is_ok());
+        testee.add_transaction(&keys, &cost);
+        assert_eq!(cost, testee.block_cost);
+    }
+
+    #[test]
+    fn test_cost_tracker_ok_add_two_same_accounts() {
+        let (mint_keypair, start_hash) = test_setup();
+        // build two transactions with same signed account
+        let (_tx1, keys1, cost1) = build_simple_transaction(&mint_keypair, &start_hash);
+        let (_tx2, keys2, cost2) = build_simple_transaction(&mint_keypair, &start_hash);
+
+        // build testee to have capacity for two simple transactions, with same accounts
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            cost1 + cost2,
+            cost1 + cost2,
+        ))));
+        {
+            assert!(testee
+                .would_fit(&keys1, &cost1, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys1, &cost1);
+        }
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys2, &cost2);
+        }
+        assert_eq!(cost1 + cost2, testee.block_cost);
+        assert_eq!(1, testee.cost_by_writable_accounts.len());
+    }
+
+    #[test]
+    fn test_cost_tracker_ok_add_two_diff_accounts() {
+        let (mint_keypair, start_hash) = test_setup();
+        // build two transactions with diff accounts
+        let (_tx1, keys1, cost1) = build_simple_transaction(&mint_keypair, &start_hash);
+        let second_account = Keypair::new();
+        let (_tx2, keys2, cost2) = build_simple_transaction(&second_account, &start_hash);
+
+        // build testee to have capacity for two simple transactions, with diff accounts
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            cmp::max(cost1, cost2),
+            cost1 + cost2,
+        ))));
+        {
+            assert!(testee
+                .would_fit(&keys1, &cost1, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys1, &cost1);
+        }
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys2, &cost2);
+        }
+        assert_eq!(cost1 + cost2, testee.block_cost);
+        assert_eq!(2, testee.cost_by_writable_accounts.len());
+    }
+
+    #[test]
+    fn test_cost_tracker_chain_reach_limit() {
+        let (mint_keypair, start_hash) = test_setup();
+        // build two transactions with same signed account
+        let (_tx1, keys1, cost1) = build_simple_transaction(&mint_keypair, &start_hash);
+        let (_tx2, keys2, cost2) = build_simple_transaction(&mint_keypair, &start_hash);
+
+        // build testee to have capacity for two simple transactions, but not for same accounts
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            cmp::min(cost1, cost2),
+            cost1 + cost2,
+        ))));
+        // should have room for first transaction
+        {
+            assert!(testee
+                .would_fit(&keys1, &cost1, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys1, &cost1);
+        }
+        // but no more space on the same chain (same signer account)
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_err());
+        }
+    }
+
+    #[test]
+    fn test_cost_tracker_reach_limit() {
+        let (mint_keypair, start_hash) = test_setup();
+        // build two transactions with diff accounts
+        let (_tx1, keys1, cost1) = build_simple_transaction(&mint_keypair, &start_hash);
+        let second_account = Keypair::new();
+        let (_tx2, keys2, cost2) = build_simple_transaction(&second_account, &start_hash);
+
+        // build testee to have capacity for each chain, but not enough room for both transactions
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            cmp::max(cost1, cost2),
+            cost1 + cost2 - 1,
+        ))));
+        // should have room for first transaction
+        {
+            assert!(testee
+                .would_fit(&keys1, &cost1, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys1, &cost1);
+        }
+        // but no more room in the block as a whole
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_err());
+        }
+    }
+
+    #[test]
+    fn test_cost_tracker_reset() {
+        let (mint_keypair, start_hash) = test_setup();
+        // build two transactions with same signed account
+        let (_tx1, keys1, cost1) = build_simple_transaction(&mint_keypair, &start_hash);
+        let (_tx2, keys2, cost2) = build_simple_transaction(&mint_keypair, &start_hash);
+
+        // build testee to have capacity for two simple transactions, but not for same accounts
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            cmp::min(cost1, cost2),
+            cost1 + cost2,
+        ))));
+        // should have room for first transaction
+        {
+            assert!(testee
+                .would_fit(&keys1, &cost1, &mut CostTrackerStats::default())
+                .is_ok());
+            testee.add_transaction(&keys1, &cost1);
+            assert_eq!(1, testee.cost_by_writable_accounts.len());
+            assert_eq!(cost1, testee.block_cost);
+        }
+        // but no more space on the same chain (same signer account)
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_err());
+        }
+        // reset the tracker
+        {
+            testee.reset_if_new_bank(100, &mut CostTrackerStats::default());
+            assert_eq!(0, testee.cost_by_writable_accounts.len());
+            assert_eq!(0, testee.block_cost);
+        }
+        // now the second transaction can be added
+        {
+            assert!(testee
+                .would_fit(&keys2, &cost2, &mut CostTrackerStats::default())
+                .is_ok());
+        }
+    }
+
+    #[test]
+    fn test_cost_tracker_try_add_is_atomic() {
+        let acct1 = Pubkey::new_unique();
+        let acct2 = Pubkey::new_unique();
+        let acct3 = Pubkey::new_unique();
+        let cost = 100;
+        let account_max = cost * 2;
+        let block_max = account_max * 3; // for three accts
+
+        let mut testee = CostTracker::new(Arc::new(RwLock::new(CostModel::new(
+            account_max,
+            block_max,
+        ))));
+
+        // case 1: a tx writes to 3 accounts, should succeed, we will have:
+        // | acct1 | $cost |
+        // | acct2 | $cost |
+        // | acct3 | $cost |
+        // and block_cost = $cost
+        {
+            let tx_cost = TransactionCost {
+                writable_accounts: vec![acct1, acct2, acct3],
+                execution_cost: cost,
+                ..TransactionCost::default()
+            };
+            assert!(testee
+                .try_add(&tx_cost, &mut CostTrackerStats::default())
+                .is_ok());
+            let stat = testee.get_stats();
+            assert_eq!(cost, stat.total_cost);
+            assert_eq!(3, stat.number_of_accounts);
+            assert_eq!(cost, stat.costliest_account_cost);
+        }
+
+        // case 2: add a tx that writes to acct2 with $cost, should succeed, resulting in:
+        // | acct1 | $cost |
+        // | acct2 | $cost * 2 |
+        // | acct3 | $cost |
+        // and block_cost = $cost * 2
+        {
+            let tx_cost = TransactionCost {
+                writable_accounts: vec![acct2],
+                execution_cost: cost,
+                ..TransactionCost::default()
+            };
+            assert!(testee
+                .try_add(&tx_cost, &mut CostTrackerStats::default())
+                .is_ok());
+            let stat = testee.get_stats();
+            assert_eq!(cost * 2, stat.total_cost);
+            assert_eq!(3, stat.number_of_accounts);
+            assert_eq!(cost * 2, stat.costliest_account_cost);
+            assert_eq!(acct2, stat.costliest_account);
+        }
+
+        // case 3: add a tx that writes to [acct1, acct2]; acct2 would exceed its limit, so the
+        // add should fail atomically and we should still have:
+        // | acct1 | $cost |
+        // | acct2 | $cost * 2 |
+        // | acct3 | $cost |
+        // and block_cost = $cost * 2
+        {
+            let tx_cost = TransactionCost {
+                writable_accounts: vec![acct1, acct2],
+                execution_cost: cost,
+                ..TransactionCost::default()
+            };
+            assert!(testee
+                .try_add(&tx_cost, &mut CostTrackerStats::default())
+                .is_err());
+            let stat = testee.get_stats();
+            assert_eq!(cost * 2, stat.total_cost);
+            assert_eq!(3, stat.number_of_accounts);
+            assert_eq!(cost * 2, stat.costliest_account_cost);
+            assert_eq!(acct2, stat.costliest_account);
+        }
+    }
+}
--- a/core/src/cost_tracker_stats.rs
+++ b/core/src/cost_tracker_stats.rs
@@ -0,0 +1,75 @@
+//! CostTrackerStats is not thread safe; each thread should own its own
+//! instance, identified by `id`. Stats are reported and reset for each slot.
+#[derive(Debug, Default)]
+pub struct CostTrackerStats {
+    pub id: u32,
+    pub transaction_cost_histogram: histogram::Histogram,
+    pub writable_accounts_cost_histogram: histogram::Histogram,
+    pub transaction_count: u64,
+    pub block_cost: u64,
+    pub bank_slot: u64,
+}
+
+impl CostTrackerStats {
+    pub fn new(id: u32, bank_slot: u64) -> Self {
+        CostTrackerStats {
+            id,
+            bank_slot,
+            ..CostTrackerStats::default()
+        }
+    }
+
+    pub fn report(&self) {
+        datapoint_info!(
+            "cost_tracker_stats",
+            ("id", self.id as i64, i64),
+            (
+                "transaction_cost_unit_min",
+                self.transaction_cost_histogram.minimum().unwrap_or(0),
+                i64
+            ),
+            (
+                "transaction_cost_unit_max",
+                self.transaction_cost_histogram.maximum().unwrap_or(0),
+                i64
+            ),
+            (
+                "transaction_cost_unit_mean",
+                self.transaction_cost_histogram.mean().unwrap_or(0),
+                i64
+            ),
+            (
+                "transaction_cost_unit_2nd_std",
+                self.transaction_cost_histogram
+                    .percentile(95.0)
+                    .unwrap_or(0),
+                i64
+            ),
+            (
+                "writable_accounts_cost_min",
+                self.writable_accounts_cost_histogram.minimum().unwrap_or(0),
+                i64
+            ),
+            (
+                "writable_accounts_cost_max",
+                self.writable_accounts_cost_histogram.maximum().unwrap_or(0),
+                i64
+            ),
+            (
+                "writable_accounts_cost_mean",
+                self.writable_accounts_cost_histogram.mean().unwrap_or(0),
+                i64
+            ),
+            (
+                "writable_accounts_cost_2nd_std",
+                self.writable_accounts_cost_histogram
+                    .percentile(95.0)
+                    .unwrap_or(0),
+                i64
+            ),
+            ("transaction_count", self.transaction_count as i64, i64),
+            ("block_cost", self.block_cost as i64, i64),
+            ("bank_slot", self.bank_slot as i64, i64),
+        );
+    }
+}
--- a/core/src/cost_update_service.rs
+++ b/core/src/cost_update_service.rs
@@ -0,0 +1,292 @@
+//! This service receives instruction ExecuteTimings from replay_stage and
+//! updates the cost_model, which is shared with banking_stage to optimize
+//! packing transactions into blocks; it also triggers persisting the cost
+//! table to blockstore.
+
+use crate::cost_model::CostModel;
+use solana_ledger::blockstore::Blockstore;
+use solana_measure::measure::Measure;
+use solana_runtime::bank::ExecuteTimings;
+use solana_sdk::timing::timestamp;
+use std::{
+    sync::{
+        atomic::{AtomicBool, Ordering},
+        mpsc::Receiver,
+        Arc, RwLock,
+    },
+    thread::{self, Builder, JoinHandle},
+    time::Duration,
+};
+
+#[derive(Default)]
+pub struct CostUpdateServiceTiming {
+    last_print: u64,
+    update_cost_model_count: u64,
+    update_cost_model_elapsed: u64,
+    persist_cost_table_elapsed: u64,
+}
+
+impl CostUpdateServiceTiming {
+    fn update(
+        &mut self,
+        update_cost_model_count: u64,
+        update_cost_model_elapsed: u64,
+        persist_cost_table_elapsed: u64,
+    ) {
+        self.update_cost_model_count += update_cost_model_count;
+        self.update_cost_model_elapsed += update_cost_model_elapsed;
+        self.persist_cost_table_elapsed += persist_cost_table_elapsed;
+
+        let now = timestamp();
+        let elapsed_ms = now - self.last_print;
+        if elapsed_ms > 1000 {
+            datapoint_info!(
+                "cost-update-service-stats",
+                ("total_elapsed_us", elapsed_ms * 1000, i64),
+                (
+                    "update_cost_model_count",
+                    self.update_cost_model_count as i64,
+                    i64
+                ),
+                (
+                    "update_cost_model_elapsed",
+                    self.update_cost_model_elapsed as i64,
+                    i64
+                ),
+                (
+                    "persist_cost_table_elapsed",
+                    self.persist_cost_table_elapsed as i64,
+                    i64
+                ),
+            );
+
+            *self = CostUpdateServiceTiming::default();
+            self.last_print = now;
+        }
+    }
+}
+
+pub type CostUpdateReceiver = Receiver<ExecuteTimings>;
+
+pub struct CostUpdateService {
+    thread_hdl: JoinHandle<()>,
+}
+
+impl CostUpdateService {
+    #[allow(clippy::new_ret_no_self)]
+    pub fn new(
+        exit: Arc<AtomicBool>,
+        blockstore: Arc<Blockstore>,
+        cost_model: Arc<RwLock<CostModel>>,
+        cost_update_receiver: CostUpdateReceiver,
+    ) -> Self {
+        let thread_hdl = Builder::new()
+            .name("solana-cost-update-service".to_string())
+            .spawn(move || {
+                Self::service_loop(exit, blockstore, cost_model, cost_update_receiver);
+            })
+            .unwrap();
+
+        Self { thread_hdl }
+    }
+
+    pub fn join(self) -> thread::Result<()> {
+        self.thread_hdl.join()
+    }
+
+    fn service_loop(
+        exit: Arc<AtomicBool>,
+        blockstore: Arc<Blockstore>,
+        cost_model: Arc<RwLock<CostModel>>,
+        cost_update_receiver: CostUpdateReceiver,
+    ) {
+        let mut cost_update_service_timing = CostUpdateServiceTiming::default();
+        let mut dirty: bool;
+        let mut update_count: u64;
+        let wait_timer = Duration::from_millis(100);
+
+        loop {
+            if exit.load(Ordering::Relaxed) {
+                break;
+            }
+
+            dirty = false;
+            update_count = 0_u64;
+            let mut update_cost_model_time = Measure::start("update_cost_model_time");
+            for cost_update in cost_update_receiver.try_iter() {
+                dirty |= Self::update_cost_model(&cost_model, &cost_update);
+                update_count += 1;
+            }
+            update_cost_model_time.stop();
+
+            let mut persist_cost_table_time = Measure::start("persist_cost_table_time");
+            if dirty {
+                Self::persist_cost_table(&blockstore, &cost_model);
+            }
+            persist_cost_table_time.stop();
+
+            cost_update_service_timing.update(
+                update_count,
+                update_cost_model_time.as_us(),
+                persist_cost_table_time.as_us(),
+            );
+
+            thread::sleep(wait_timer);
+        }
+    }
+
+    fn update_cost_model(cost_model: &RwLock<CostModel>, execute_timings: &ExecuteTimings) -> bool {
+        let mut dirty = false;
+        {
+            let mut cost_model_mutable = cost_model.write().unwrap();
+            for (program_id, timing) in &execute_timings.details.per_program_timings {
+                if timing.count < 1 {
+                    continue;
+                }
+                let units = timing.accumulated_units / timing.count as u64;
+                match cost_model_mutable.upsert_instruction_cost(program_id, units) {
+                    Ok(c) => {
+                        debug!(
+                            "after replayed into bank, instruction {:?} has averaged cost {}",
+                            program_id, c
+                        );
+                        dirty = true;
+                    }
+                    Err(err) => {
+                        debug!(
+                            "after replayed into bank, instruction {:?} failed to update cost, err: {}",
+                            program_id, err
+                        );
+                    }
+                }
+            }
+        }
+        debug!(
+            "after replayed into bank, updated cost model instruction cost table, current values: {:?}",
+            cost_model.read().unwrap().get_instruction_cost_table()
+        );
+        dirty
+    }
+
+    fn persist_cost_table(blockstore: &Blockstore, cost_model: &RwLock<CostModel>) {
+        let cost_model_read = cost_model.read().unwrap();
+        let cost_table = cost_model_read.get_instruction_cost_table();
+        let db_records = blockstore.read_program_costs().expect("read programs");
+
+        // delete records from blockstore if they are no longer in cost_table
+        db_records.iter().for_each(|(pubkey, _)| {
+            if cost_table.get(pubkey).is_none() {
+                blockstore
+                    .delete_program_cost(pubkey)
+                    .expect("delete old program");
+            }
+        });
+
+        for (key, cost) in cost_table.iter() {
+            blockstore
+                .write_program_cost(key, cost)
+                .expect("persist program costs to blockstore");
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use solana_runtime::message_processor::ProgramTiming;
+    use solana_sdk::pubkey::Pubkey;
+
+    #[test]
+    fn test_update_cost_model_with_empty_execute_timings() {
+        let cost_model = Arc::new(RwLock::new(CostModel::default()));
+        let empty_execute_timings = ExecuteTimings::default();
+        CostUpdateService::update_cost_model(&cost_model, &empty_execute_timings);
+
+        assert_eq!(
+            0,
+            cost_model
+                .read()
+                .unwrap()
+                .get_instruction_cost_table()
+                .len()
+        );
+    }
+
+    #[test]
+    fn test_update_cost_model_with_execute_timings() {
+        let cost_model = Arc::new(RwLock::new(CostModel::default()));
+        let mut execute_timings = ExecuteTimings::default();
+
+        let program_key_1 = Pubkey::new_unique();
+        let mut expected_cost: u64;
+
+        // add new program
+        {
+            let accumulated_us: u64 = 1000;
+            let accumulated_units: u64 = 100;
+            let count: u32 = 10;
+            expected_cost = accumulated_units / count as u64;
+
+            execute_timings.details.per_program_timings.insert(
+                program_key_1,
+                ProgramTiming {
+                    accumulated_us,
+                    accumulated_units,
+                    count,
+                },
+            );
+            CostUpdateService::update_cost_model(&cost_model, &execute_timings);
+            assert_eq!(
+                1,
+                cost_model
+                    .read()
+                    .unwrap()
+                    .get_instruction_cost_table()
+                    .len()
+            );
+            assert_eq!(
+                Some(&expected_cost),
+                cost_model
+                    .read()
+                    .unwrap()
+                    .get_instruction_cost_table()
+                    .get(&program_key_1)
+            );
+        }
+
+        // update program
+        {
+            let accumulated_us: u64 = 2000;
+            let accumulated_units: u64 = 200;
+            let count: u32 = 10;
+            // expect the new cost to be the average of the new and existing values
+            expected_cost = ((accumulated_units / count as u64) + expected_cost) / 2;
+
+            execute_timings.details.per_program_timings.insert(
+                program_key_1,
+                ProgramTiming {
+                    accumulated_us,
+                    accumulated_units,
+                    count,
+                },
+            );
+            CostUpdateService::update_cost_model(&cost_model, &execute_timings);
+            assert_eq!(
+                1,
+                cost_model
+                    .read()
+                    .unwrap()
+                    .get_instruction_cost_table()
+                    .len()
+            );
+            assert_eq!(
+                Some(&expected_cost),
+                cost_model
+                    .read()
+                    .unwrap()
+                    .get_instruction_cost_table()
+                    .get(&program_key_1)
+            );
+        }
+    }
+}
--- a/core/src/execute_cost_table.rs
+++ b/core/src/execute_cost_table.rs
@@ -0,0 +1,279 @@
+//! ExecuteCostTable is aggregated by the Cost Model; it keeps each program's
+//! average cost in a HashMap with a fixed capacity, so the table cannot grow
+//! unchecked.
+//! When its capacity limit is reached, it prunes old and less-used programs
+//! to make room for new ones.
+use log::*;
+use solana_sdk::pubkey::Pubkey;
+use std::{collections::HashMap, time::SystemTime};
+
+// Pruning is a rather expensive operation, so freeing space in bulk on each
+// prune is more efficient. PRUNE_RATIO defines the table size after pruning:
+// original_size * PRUNE_RATIO.
+const PRUNE_RATIO: f64 = 0.75;
+// with 50_000 TPS as the norm, each occurrence offsets 100 microseconds of age
+const OCCURRENCES_WEIGHT: i64 = 100;
+
+const DEFAULT_CAPACITY: usize = 1024;
+
+#[derive(Debug)]
+pub struct ExecuteCostTable {
+    capacity: usize,
+    table: HashMap<Pubkey, u64>,
+    occurrences: HashMap<Pubkey, (usize, SystemTime)>,
+}
+
+impl Default for ExecuteCostTable {
+    fn default() -> Self {
+        ExecuteCostTable::new(DEFAULT_CAPACITY)
+    }
+}
+
+impl ExecuteCostTable {
+    pub fn new(cap: usize) -> Self {
+        Self {
+            capacity: cap,
+            table: HashMap::with_capacity(cap),
+            occurrences: HashMap::with_capacity(cap),
+        }
+    }
+
+    pub fn get_cost_table(&self) -> &HashMap<Pubkey, u64> {
+        &self.table
+    }
+
+    pub fn get_count(&self) -> usize {
+        self.table.len()
+    }
+
+    // instead of assigning an unknown program a configured/hard-coded cost,
+    // use the average or mode function to make an educated guess.
+    pub fn get_average(&self) -> u64 {
+        if self.table.is_empty() {
+            0
+        } else {
+            self.table.iter().map(|(_, value)| value).sum::<u64>() / self.get_count() as u64
+        }
+    }
+
+    pub fn get_mode(&self) -> u64 {
+        if self.occurrences.is_empty() {
+            0
+        } else {
+            let key = self
+                .occurrences
+                .iter()
+                .max_by_key(|&(_, count)| count)
+                .map(|(key, _)| key)
+                .expect("cannot find mode from cost table");
+
+            *self.table.get(key).unwrap()
+        }
+    }
+
+    // returns None if program doesn't exist in table. In this case,
+    // client is advised to call `get_average()` or `get_mode()` to
+    // assign a 'default' value for new program.
+    pub fn get_cost(&self, key: &Pubkey) -> Option<&u64> {
+        self.table.get(key)
+    }
+
+    pub fn upsert(&mut self, key: &Pubkey, value: u64) -> Option<u64> {
+        let need_to_add = self.table.get(key).is_none();
+        let current_size = self.get_count();
+        if current_size == self.capacity && need_to_add {
+            self.prune_to(&((current_size as f64 * PRUNE_RATIO) as usize));
+        }
+
+        let program_cost = self.table.entry(*key).or_insert(value);
+        *program_cost = (*program_cost + value) / 2;
+
+        let (count, timestamp) = self
+            .occurrences
+            .entry(*key)
+            .or_insert((0, SystemTime::now()));
+        *count += 1;
+        *timestamp = SystemTime::now();
+
+        Some(*program_cost)
+    }
+
+    // prune the oldest programs so the table contains `new_size` records,
+    // where `old` is defined by weighted age: a score that is negatively
+    // correlated with the time since the program was last updated and
+    // positively correlated with how frequently the program is executed
+    // (i.e. its occurrence count).
+    fn prune_to(&mut self, new_size: &usize) {
+        debug!(
+            "prune cost table, current size {}, new size {}",
+            self.get_count(),
+            new_size
+        );
+
+        if *new_size == self.get_count() {
+            return;
+        }
+
+        if *new_size == 0 {
+            self.table.clear();
+            self.occurrences.clear();
+            return;
+        }
+
+        let now = SystemTime::now();
+        let mut sorted_by_weighted_age: Vec<_> = self
+            .occurrences
+            .iter()
+            .map(|(key, (count, timestamp))| {
+                let age = now.duration_since(*timestamp).unwrap().as_micros();
+                let weighted_age = *count as i64 * OCCURRENCES_WEIGHT + -(age as i64);
+                (weighted_age, *key)
+            })
+            .collect();
+        sorted_by_weighted_age.sort_by(|x, y| x.0.cmp(&y.0));
+
+        for i in sorted_by_weighted_age.iter() {
+            self.table.remove(&i.1);
+            self.occurrences.remove(&i.1);
+            if *new_size == self.get_count() {
+                break;
+            }
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_execute_cost_table_prune_simple_table() {
+        solana_logger::setup();
+        let capacity: usize = 3;
+        let mut testee = ExecuteCostTable::new(capacity);
+
+        let key1 = Pubkey::new_unique();
+        let key2 = Pubkey::new_unique();
+        let key3 = Pubkey::new_unique();
+
+        testee.upsert(&key1, 1);
+        testee.upsert(&key2, 2);
+        testee.upsert(&key3, 3);
+
+        testee.prune_to(&(capacity - 1));
+
+        // the oldest, key1, should be pruned
+        assert!(testee.get_cost(&key1).is_none());
+        assert!(testee.get_cost(&key2).is_some());
+        assert!(testee.get_cost(&key3).is_some());
+    }
+
+    #[test]
+    fn test_execute_cost_table_prune_weighted_table() {
+        solana_logger::setup();
+        let capacity: usize = 3;
+        let mut testee = ExecuteCostTable::new(capacity);
+
+        let key1 = Pubkey::new_unique();
+        let key2 = Pubkey::new_unique();
+        let key3 = Pubkey::new_unique();
+
+        testee.upsert(&key1, 1);
+        testee.upsert(&key1, 1);
+        testee.upsert(&key2, 2);
+        testee.upsert(&key3, 3);
+
+        testee.prune_to(&(capacity - 1));
+
+        // the oldest, key1, has 2 counts; 2nd oldest, key2, has 1 count;
+        // expect key2 to be pruned.
+        assert!(testee.get_cost(&key1).is_some());
+        assert!(testee.get_cost(&key2).is_none());
+        assert!(testee.get_cost(&key3).is_some());
+    }
+
+    #[test]
+    fn test_execute_cost_table_upsert_within_capacity() {
+        solana_logger::setup();
+        let mut testee = ExecuteCostTable::default();
+
+        let key1 = Pubkey::new_unique();
+        let key2 = Pubkey::new_unique();
+        let cost1: u64 = 100;
+        let cost2: u64 = 110;
+
+        // query empty table
+        assert!(testee.get_cost(&key1).is_none());
+
+        // insert one record
+        testee.upsert(&key1, cost1);
+        assert_eq!(1, testee.get_count());
+        assert_eq!(cost1, testee.get_average());
+        assert_eq!(cost1, testee.get_mode());
+        assert_eq!(&cost1, testee.get_cost(&key1).unwrap());
+
+        // insert 2nd record
+        testee.upsert(&key2, cost2);
+        assert_eq!(2, testee.get_count());
+        assert_eq!((cost1 + cost2) / 2_u64, testee.get_average());
+        assert_eq!(cost2, testee.get_mode());
+        assert_eq!(&cost1, testee.get_cost(&key1).unwrap());
+        assert_eq!(&cost2, testee.get_cost(&key2).unwrap());
+
+        // update 1st record
+        testee.upsert(&key1, cost2);
+        assert_eq!(2, testee.get_count());
+        assert_eq!(((cost1 + cost2) / 2 + cost2) / 2, testee.get_average());
+        assert_eq!((cost1 + cost2) / 2, testee.get_mode());
+        assert_eq!(&((cost1 + cost2) / 2), testee.get_cost(&key1).unwrap());
+        assert_eq!(&cost2, testee.get_cost(&key2).unwrap());
+    }
+
+    #[test]
+    fn test_execute_cost_table_upsert_exceeds_capacity() {
+        solana_logger::setup();
+        let capacity: usize = 2;
+        let mut testee = ExecuteCostTable::new(capacity);
+
+        let key1 = Pubkey::new_unique();
+        let key2 = Pubkey::new_unique();
+        let key3 = Pubkey::new_unique();
+        let key4 = Pubkey::new_unique();
+        let cost1: u64 = 100;
+        let cost2: u64 = 110;
+        let cost3: u64 = 120;
+        let cost4: u64 = 130;
+
+        // insert one record
+        testee.upsert(&key1, cost1);
+        assert_eq!(1, testee.get_count());
+        assert_eq!(&cost1, testee.get_cost(&key1).unwrap());
+
+        // insert 2nd record
+        testee.upsert(&key2, cost2);
+        assert_eq!(2, testee.get_count());
+        assert_eq!(&cost1, testee.get_cost(&key1).unwrap());
+        assert_eq!(&cost2, testee.get_cost(&key2).unwrap());
+
+        // insert 3rd record, pushes out the oldest (i.e. 1st) record
+        testee.upsert(&key3, cost3);
+        assert_eq!(2, testee.get_count());
+        assert_eq!((cost2 + cost3) / 2_u64, testee.get_average());
+        assert_eq!(cost3, testee.get_mode());
+        assert!(testee.get_cost(&key1).is_none());
+        assert_eq!(&cost2, testee.get_cost(&key2).unwrap());
+        assert_eq!(&cost3, testee.get_cost(&key3).unwrap());
+
+        // update 2nd record, so the 3rd becomes the oldest
+        // add 4th record, pushes out 3rd key
+        testee.upsert(&key2, cost1);
+        testee.upsert(&key4, cost4);
+        assert_eq!(((cost1 + cost2) / 2 + cost4) / 2_u64, testee.get_average());
+        assert_eq!((cost1 + cost2) / 2, testee.get_mode());
+        assert_eq!(2, testee.get_count());
+        assert!(testee.get_cost(&key1).is_none());
+        assert_eq!(&((cost1 + cost2) / 2), testee.get_cost(&key2).unwrap());
+        assert!(testee.get_cost(&key3).is_none());
+        assert_eq!(&cost4, testee.get_cost(&key4).unwrap());
+    }
+}
--- a/core/src/lib.rs
+++ b/core/src/lib.rs
@@ -19,6 +19,11 @@ pub mod cluster_slots_service;
 pub mod commitment_service;
 pub mod completed_data_sets_service;
 pub mod consensus;
+pub mod cost_model;
+pub mod cost_tracker;
+pub mod cost_tracker_stats;
+pub mod cost_update_service;
+pub mod execute_cost_table;
 pub mod fetch_stage;
 pub mod fork_choice;
 pub mod gen_keys;
--- a/core/src/progress_map.rs
+++ b/core/src/progress_map.rs
@@ -114,6 +114,43 @@ impl ReplaySlotStats {
                i64
            ),
        );
+
+        let mut per_pubkey_timings: Vec<_> = self
+            .execute_timings
+            .details
+            .per_program_timings
+            .iter()
+            .collect();
+        per_pubkey_timings.sort_by(|a, b| b.1.accumulated_us.cmp(&a.1.accumulated_us));
+        let (total_us, total_units, total_count) =
+            per_pubkey_timings
+                .iter()
+                .fold((0, 0, 0), |(sum_us, sum_units, sum_count), a| {
+                    (
+                        sum_us + a.1.accumulated_us,
+                        sum_units + a.1.accumulated_units,
+                        sum_count + a.1.count,
+                    )
+                });
+
+        for (pubkey, time) in per_pubkey_timings.iter().take(5) {
+            datapoint_info!(
+                "per_program_timings",
+                ("slot", slot as i64, i64),
+                ("pubkey", pubkey.to_string(), String),
+                ("execute_us", time.accumulated_us, i64),
+                ("accumulated_units", time.accumulated_units, i64),
+                ("count", time.count, i64)
+            );
+        }
+        datapoint_info!(
+            "per_program_timings",
+            ("slot", slot as i64, i64),
+            ("pubkey", "all", String),
+            ("execute_us", total_us, i64),
+            ("accumulated_units", total_units, i64),
+            ("count", total_count, i64)
+        );
    }
 }

--- a/core/src/replay_stage.rs
+++ b/core/src/replay_stage.rs
@@ -18,7 +18,6 @@ use crate::{
    latest_validator_votes_for_frozen_banks::LatestValidatorVotesForFrozenBanks,
    progress_map::{ForkProgress, ProgressMap, PropagatedStats},
    repair_service::DuplicateSlotsResetReceiver,
-    result::Result,
    rewards_recorder_service::RewardsRecorderSender,
    unfrozen_gossip_verified_vote_hashes::UnfrozenGossipVerifiedVoteHashes,
    voting_service::VoteOp,
@@ -42,7 +41,7 @@ use solana_rpc::{
 };
 use solana_runtime::{
    accounts_background_service::AbsRequestSender,
-    bank::{Bank, NewBankOptions},
+    bank::{Bank, ExecuteTimings, NewBankOptions},
    bank_forks::BankForks,
    commitment::BlockCommitmentCache,
    vote_sender_types::ReplayVoteSender,
@@ -281,7 +280,7 @@ impl ReplayTiming {
                    "process_duplicate_slots_elapsed",
                    self.process_duplicate_slots_elapsed as i64,
                    i64
-                )
+                ),
            );

            *self = ReplayTiming::default();
@@ -291,7 +290,7 @@ impl ReplayTiming {
 }

 pub struct ReplayStage {
-    t_replay: JoinHandle<Result<()>>,
+    t_replay: JoinHandle<()>,
    commitment_service: AggregateCommitmentService,
 }

@@ -315,6 +314,7 @@ impl ReplayStage {
        gossip_verified_vote_hash_receiver: GossipVerifiedVoteHashReceiver,
        cluster_slots_update_sender: ClusterSlotsUpdateSender,
        voting_sender: Sender<VoteOp>,
+        cost_update_sender: Sender<ExecuteTimings>,
    ) -> Self {
        let ReplayStageConfig {
            my_pubkey,
@@ -412,6 +412,7 @@ impl ReplayStage {
                        &mut unfrozen_gossip_verified_vote_hashes,
                        &mut latest_validator_votes_for_frozen_banks,
                        &cluster_slots_update_sender,
+                        &cost_update_sender,
                    );
                    replay_active_banks_time.stop();

@@ -742,7 +743,6 @@ impl ReplayStage {
                        process_duplicate_slots_time.as_us(),
                    );
                }
-                Ok(())
            })
            .unwrap();

@@ -1690,9 +1690,11 @@ impl ReplayStage {
        unfrozen_gossip_verified_vote_hashes: &mut UnfrozenGossipVerifiedVoteHashes,
        latest_validator_votes_for_frozen_banks: &mut LatestValidatorVotesForFrozenBanks,
        cluster_slots_update_sender: &ClusterSlotsUpdateSender,
+        cost_update_sender: &Sender<ExecuteTimings>,
    ) -> bool {
        let mut did_complete_bank = false;
        let mut tx_count = 0;
+        let mut execute_timings = ExecuteTimings::default();
        let active_banks = bank_forks.read().unwrap().active_banks();
        trace!("active banks {:?}", active_banks);

@@ -1763,6 +1765,12 @@ impl ReplayStage {
            }
            assert_eq!(*bank_slot, bank.slot());
            if bank.is_complete() {
+                execute_timings.accumulate(&bank_progress.replay_stats.execute_timings);
+                debug!("bank {} has completed replay from blockstore, contributing to cost update with {:?}",
+                       bank.slot(),
+                       bank_progress.replay_stats.execute_timings
+                       );
+
                bank_progress.replay_stats.report_stats(
                    bank.slot(),
                    bank_progress.replay_progress.num_entries,
@@ -1824,6 +1832,14 @@ impl ReplayStage {
                );
            }
        }
+
+        // send accumulated execute timings to cost_update_service
+        if !execute_timings.details.per_program_timings.is_empty() {
+            cost_update_sender
+                .send(execute_timings)
+                .unwrap_or_else(|err| warn!("cost_update_sender failed: {:?}", err));
+        }
+
        inc_new_counter_info!("replay_stage-replay_transactions", tx_count);
        did_complete_bank
    }
@@ -4929,7 +4945,6 @@ mod tests {
        );
        assert_eq!(tower.last_voted_slot().unwrap(), 1);
    }
-
    fn run_compute_and_select_forks(
        bank_forks: &RwLock<BankForks>,
        progress: &mut ProgressMap,
--- a/core/src/sigverify_stage.rs
+++ b/core/src/sigverify_stage.rs
@@ -8,13 +8,13 @@
 use crate::sigverify;
 use crossbeam_channel::{SendError, Sender as CrossbeamSender};
 use solana_measure::measure::Measure;
-use solana_metrics::datapoint_debug;
 use solana_perf::packet::Packets;
 use solana_sdk::timing;
 use solana_streamer::streamer::{self, PacketReceiver, StreamerError};
 use std::collections::HashMap;
 use std::sync::mpsc::{Receiver, RecvTimeoutError};
 use std::thread::{self, Builder, JoinHandle};
+use std::time::Instant;
 use thiserror::Error;

 const MAX_SIGVERIFY_BATCH: usize = 10_000;
@@ -41,6 +41,82 @@ pub trait SigVerifier {
 #[derive(Default, Clone)]
 pub struct DisabledSigVerifier {}

+#[derive(Default)]
+struct SigVerifierStats {
+    recv_batches_us_hist: histogram::Histogram, // time to call recv_batch
+    verify_batches_pp_us_hist: histogram::Histogram, // per-packet time to call verify_batch
+    batches_hist: histogram::Histogram,         // number of Packets structures per verify call
+    packets_hist: histogram::Histogram,         // number of packets per verify call
+    total_batches: usize,
+    total_packets: usize,
+}
+
+impl SigVerifierStats {
+    fn report(&self) {
+        datapoint_info!(
+            "sigverify_stage-total_verify_time",
+            (
+                "recv_batches_us_90pct",
+                self.recv_batches_us_hist.percentile(90.0).unwrap_or(0),
+                i64
+            ),
+            (
+                "recv_batches_us_min",
+                self.recv_batches_us_hist.minimum().unwrap_or(0),
+                i64
+            ),
+            (
+                "recv_batches_us_max",
+                self.recv_batches_us_hist.maximum().unwrap_or(0),
+                i64
+            ),
+            (
+                "recv_batches_us_mean",
+                self.recv_batches_us_hist.mean().unwrap_or(0),
+                i64
+            ),
+            (
+                "verify_batches_pp_us_90pct",
+                self.verify_batches_pp_us_hist.percentile(90.0).unwrap_or(0),
+                i64
+            ),
+            (
+                "verify_batches_pp_us_min",
+                self.verify_batches_pp_us_hist.minimum().unwrap_or(0),
+                i64
+            ),
+            (
+                "verify_batches_pp_us_max",
+                self.verify_batches_pp_us_hist.maximum().unwrap_or(0),
+                i64
+            ),
+            (
+                "verify_batches_pp_us_mean",
+                self.verify_batches_pp_us_hist.mean().unwrap_or(0),
+                i64
+            ),
+            (
+                "batches_90pct",
+                self.batches_hist.percentile(90.0).unwrap_or(0),
+                i64
+            ),
+            ("batches_min", self.batches_hist.minimum().unwrap_or(0), i64),
+            ("batches_max", self.batches_hist.maximum().unwrap_or(0), i64),
+            ("batches_mean", self.batches_hist.mean().unwrap_or(0), i64),
+            (
+                "packets_90pct",
+                self.packets_hist.percentile(90.0).unwrap_or(0),
+                i64
+            ),
+            ("packets_min", self.packets_hist.minimum().unwrap_or(0), i64),
+            ("packets_max", self.packets_hist.maximum().unwrap_or(0), i64),
+            ("packets_mean", self.packets_hist.mean().unwrap_or(0), i64),
+            ("total_batches", self.total_batches, i64),
+            ("total_packets", self.total_packets, i64),
+        );
+    }
+}
+
 impl SigVerifier for DisabledSigVerifier {
    fn verify_batch(&self, mut batch: Vec<Packets>) -> Vec<Packets> {
        sigverify::ed25519_verify_disabled(&mut batch);
@@ -92,6 +168,7 @@ impl SigVerifyStage {
        recvr: &PacketReceiver,
        sendr: &CrossbeamSender<Vec<Packets>>,
        verifier: &T,
+        stats: &mut SigVerifierStats,
    ) -> Result<()> {
        let (mut batches, len, recv_time) = streamer::recv_batch(recvr)?;

@@ -121,6 +198,19 @@ impl SigVerifyStage {
            ("recv_time", recv_time, i64),
        );

+        stats
+            .recv_batches_us_hist
+            .increment(recv_time as u64)
+            .unwrap();
+        stats
+            .verify_batches_pp_us_hist
+            .increment(verify_batch_time.as_us() / (len as u64).max(1))
+            .unwrap();
+        stats.batches_hist.increment(batches_len as u64).unwrap();
+        stats.packets_hist.increment(len as u64).unwrap();
+        stats.total_batches += batches_len;
+        stats.total_packets += len;
+
        Ok(())
    }

@@ -130,10 +220,14 @@ impl SigVerifyStage {
        verifier: &T,
    ) -> JoinHandle<()> {
        let verifier = verifier.clone();
+        let mut stats = SigVerifierStats::default();
+        let mut last_print = Instant::now();
        Builder::new()
            .name("solana-verifier".to_string())
            .spawn(move || loop {
-                if let Err(e) = Self::verifier(&packet_receiver, &verified_sender, &verifier) {
+                if let Err(e) =
+                    Self::verifier(&packet_receiver, &verified_sender, &verifier, &mut stats)
+                {
                    match e {
                        SigVerifyServiceError::Streamer(StreamerError::RecvTimeout(
                            RecvTimeoutError::Disconnected,
@@ -147,6 +241,11 @@ impl SigVerifyStage {
                        _ => error!("{:?}", e),
                    }
                }
+                if last_print.elapsed().as_secs() > 2 {
+                    stats.report();
+                    stats = SigVerifierStats::default();
+                    last_print = Instant::now();
+                }
            })
            .unwrap()
    }
--- a/core/src/tpu.rs
+++ b/core/src/tpu.rs
@@ -8,6 +8,8 @@ use crate::{
        ClusterInfoVoteListener, GossipDuplicateConfirmedSlotsSender, GossipVerifiedVoteHashSender,
        VerifiedVoteSender, VoteTracker,
    },
+    cost_model::CostModel,
+    cost_tracker::CostTracker,
    fetch_stage::FetchStage,
    sigverify::TransactionSigVerifier,
    sigverify_stage::SigVerifyStage,
@@ -71,6 +73,7 @@ impl Tpu {
        bank_notification_sender: Option<BankNotificationSender>,
        tpu_coalesce_ms: u64,
        cluster_confirmed_slot_sender: GossipDuplicateConfirmedSlotsSender,
+        cost_model: &Arc<RwLock<CostModel>>,
    ) -> Self {
        let (packet_sender, packet_receiver) = channel();
        let (vote_packet_sender, vote_packet_receiver) = channel();
@@ -120,6 +123,7 @@ impl Tpu {
            cluster_confirmed_slot_sender,
        );

+        let cost_tracker = Arc::new(RwLock::new(CostTracker::new(cost_model.clone())));
        let banking_stage = BankingStage::new(
            cluster_info,
            poh_recorder,
@@ -128,6 +132,7 @@ impl Tpu {
            verified_gossip_vote_packets_receiver,
            transaction_status_sender,
            replay_vote_sender,
+            cost_tracker,
        );

        let broadcast_stage = broadcast_type.new_broadcast_stage(
--- a/core/src/tvu.rs
+++ b/core/src/tvu.rs
@@ -12,6 +12,8 @@ use crate::{
    cluster_slots::ClusterSlots,
    completed_data_sets_service::CompletedDataSetsSender,
    consensus::Tower,
+    cost_model::CostModel,
+    cost_update_service::CostUpdateService,
    ledger_cleanup_service::LedgerCleanupService,
    replay_stage::{ReplayStage, ReplayStageConfig},
    retransmit_stage::RetransmitStage,
@@ -38,6 +40,7 @@ use solana_runtime::{
        AbsRequestHandler, AbsRequestSender, AccountsBackgroundService, SnapshotRequestHandler,
    },
    accounts_db::AccountShrinkThreshold,
+    bank::ExecuteTimings,
    bank_forks::{BankForks, SnapshotConfig},
    commitment::BlockCommitmentCache,
    vote_sender_types::ReplayVoteSender,
@@ -52,7 +55,7 @@ use std::{
    net::UdpSocket,
    sync::{
        atomic::AtomicBool,
-        mpsc::{channel, Receiver},
+        mpsc::{channel, Receiver, Sender},
        Arc, Mutex, RwLock,
    },
    thread,
@@ -67,6 +70,7 @@ pub struct Tvu {
    accounts_background_service: AccountsBackgroundService,
    accounts_hash_verifier: AccountsHashVerifier,
    voting_service: VotingService,
+    cost_update_service: CostUpdateService,
 }

 pub struct Sockets {
@@ -131,6 +135,7 @@ impl Tvu {
        gossip_confirmed_slots_receiver: GossipDuplicateConfirmedSlotsReceiver,
        tvu_config: TvuConfig,
        max_slots: &Arc<MaxSlots>,
+        cost_model: &Arc<RwLock<CostModel>>,
    ) -> Self {
        let keypair: Arc<Keypair> = cluster_info.keypair.clone();

@@ -285,6 +290,17 @@ impl Tvu {
            bank_forks.clone(),
        );

+        let (cost_update_sender, cost_update_receiver): (
+            Sender<ExecuteTimings>,
+            Receiver<ExecuteTimings>,
+        ) = channel();
+        let cost_update_service = CostUpdateService::new(
+            exit.clone(),
+            blockstore.clone(),
+            cost_model.clone(),
+            cost_update_receiver,
+        );
+
        let replay_stage = ReplayStage::new(
            replay_stage_config,
            blockstore.clone(),
@@ -303,6 +319,7 @@ impl Tvu {
            gossip_verified_vote_hash_receiver,
            cluster_slots_update_sender,
            voting_sender,
+            cost_update_sender,
        );

        let ledger_cleanup_service = tvu_config.max_ledger_shreds.map(|max_ledger_shreds| {
@@ -334,6 +351,7 @@ impl Tvu {
            accounts_background_service,
            accounts_hash_verifier,
            voting_service,
+            cost_update_service,
        }
    }

@@ -348,6 +366,7 @@ impl Tvu {
        self.replay_stage.join()?;
        self.accounts_hash_verifier.join()?;
        self.voting_service.join()?;
+        self.cost_update_service.join()?;
        Ok(())
    }
 }
@@ -455,6 +474,7 @@ pub mod tests {
            gossip_confirmed_slots_receiver,
            TvuConfig::default(),
            &Arc::new(MaxSlots::default()),
+            &Arc::new(RwLock::new(CostModel::default())),
        );
        exit.store(true, Ordering::Relaxed);
        tvu.join().unwrap();
--- a/core/src/validator.rs
+++ b/core/src/validator.rs
@@ -7,6 +7,7 @@ use {
        cluster_info_vote_listener::VoteTracker,
        completed_data_sets_service::CompletedDataSetsService,
        consensus::{reconcile_blockstore_roots_with_tower, Tower},
+        cost_model::CostModel,
        rewards_recorder_service::{RewardsRecorderSender, RewardsRecorderService},
        sample_performance_service::SamplePerformanceService,
        serve_repair::ServeRepair,
@@ -681,6 +682,10 @@ impl Validator {
            bank_forks.read().unwrap().root_bank().deref(),
        ));

+        let mut cost_model = CostModel::default();
+        cost_model.initialize_cost_table(&blockstore.read_program_costs().unwrap());
+        let cost_model = Arc::new(RwLock::new(cost_model));
+
        let (retransmit_slots_sender, retransmit_slots_receiver) = unbounded();
        let (verified_vote_sender, verified_vote_receiver) = unbounded();
        let (gossip_verified_vote_hash_sender, gossip_verified_vote_hash_receiver) = unbounded();
@@ -758,6 +763,7 @@ impl Validator {
                disable_epoch_boundary_optimization: config.disable_epoch_boundary_optimization,
            },
            &max_slots,
+            &cost_model,
        );

        let tpu = Tpu::new(
@@ -784,6 +790,7 @@ impl Validator {
            bank_notification_sender,
            config.tpu_coalesce_ms,
            cluster_confirmed_slot_sender,
+            &cost_model,
        );

        datapoint_info!("validator-new", ("id", id.to_string(), String));