* Add proposal for supporting big transactions
* account index program
* fix formatting
* review feedback
* Add cost changes section
* Add cost section and more attack details
* fix lint
* document metadata changes
* nit
* rpc details
* add index meta struct
* add additional proposal and change title
* rename proposal file
* rename to address map and rewrite tx format
* no more appends, limit mapping size to 256
* update dos section
* add note about readonly
* restructure message to use enum
* cleanup
(cherry picked from commit 191519188d)
Co-authored-by: Justin Starry <justin@solana.com>
302
docs/src/proposals/transactions-v2.md
Normal file
@@ -0,0 +1,302 @@
# Transactions v2 - Address maps

## Problem

Messages transmitted to Solana validators must not exceed the IPv6 MTU size to
ensure fast and reliable network transmission of cluster info over UDP.
Solana's networking stack uses a conservative MTU size of 1280 bytes which,
after accounting for headers, leaves 1232 bytes for packet data like serialized
transactions.

Developers building applications on Solana must design their on-chain program
interfaces within the above transaction size limit constraint. One common
workaround is to store state temporarily on-chain and consume that state in
later transactions. This is the approach used by the BPF loader program for
deploying Solana programs.

However, this workaround doesn't work well when developers compose many on-chain
programs in a single atomic transaction. With more composition comes more
account inputs, each of which takes up 32 bytes. There is currently no available
workaround for increasing the number of accounts used in a single transaction
since each transaction must list all accounts that it needs to properly lock
accounts for parallel execution. Therefore, the current cap is about 35 accounts
after accounting for signatures and other transaction metadata.

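
The "about 35 accounts" figure can be reproduced with a little arithmetic. The
sketch below is illustrative only: the constant and function names are
hypothetical, the 48 bytes of headers are assumed to be a 40-byte IPv6 header
plus an 8-byte header, and every length prefix is assumed to fit in one byte.

```rust
/// Usable bytes per packet: 1280-byte MTU minus 48 bytes of headers
/// (assumed split: 40-byte IPv6 header plus an 8-byte header).
const PACKET_DATA_SIZE: usize = 1280 - 40 - 8;
const SIGNATURE_BYTES: usize = 64;
const ADDRESS_BYTES: usize = 32;
const BLOCKHASH_BYTES: usize = 32;
/// The message header holds three `u8` counters.
const HEADER_BYTES: usize = 3;
/// Assumption: each vector length prefix fits in a single byte.
const LEN_PREFIX_BYTES: usize = 1;

/// Upper bound on the number of 32-byte addresses that fit in one packet,
/// given the signature count and the bytes used by serialized instructions.
fn max_account_keys(num_signatures: usize, instruction_bytes: usize) -> usize {
    let fixed = LEN_PREFIX_BYTES + num_signatures * SIGNATURE_BYTES // signatures
        + HEADER_BYTES
        + LEN_PREFIX_BYTES // account_keys length
        + BLOCKHASH_BYTES
        + LEN_PREFIX_BYTES // instructions length
        + instruction_bytes;
    PACKET_DATA_SIZE.saturating_sub(fixed) / ADDRESS_BYTES
}
```

With one signature and no instruction data, `max_account_keys(1, 0)` yields 35.
Any real instruction data lowers that bound further.
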
## Proposed Solution

Introduce a new on-chain program which stores account address maps and add a new
transaction format which supports concise account references through the
on-chain address maps.

### Address Map Program

Here we describe a program-based solution to the problem, whereby a protocol
developer or end-user can create collections of related addresses on-chain for
concise use in a transaction's account inputs. This approach is similar to page
tables used in operating systems to succinctly map virtual addresses to physical
memory.

After addresses are stored on-chain in an address map account, they may be
succinctly referenced in a transaction using a 1-byte u8 index rather than a
full 32-byte address. This will require a new transaction format to make use of
these succinct references as well as runtime handling for looking up and loading
accounts from the on-chain mappings.

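
As a rough illustration of the lookup described above, the following sketch
resolves 1-byte indexes against the entries of a single address map. The
`Address` alias and the `resolve_entries` function are hypothetical names used
only for this example, not part of the proposal.

```rust
/// Stand-in for a 32-byte account address.
type Address = [u8; 32];

/// Resolves 1-byte indexes against the entries stored in one address map.
/// Returns `None` if any index points past the end of the map.
fn resolve_entries(map_entries: &[Address], indexes: &[u8]) -> Option<Vec<Address>> {
    indexes
        .iter()
        .map(|&i| map_entries.get(usize::from(i)).copied())
        .collect()
}
```

Each resolved address costs one byte in the transaction instead of 32, at the
price of an extra on-chain lookup at load time.
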
#### State

Address map accounts must be rent-exempt but may be closed with a one epoch
deactivation period. Address maps must be activated before use.

Since transactions use a u8 offset to look up mapped addresses, accounts can
store up to 2^8 addresses each. Anyone may create an address map account of any
size as long as it's big enough to store the necessary metadata. In addition to
stored addresses, address map accounts must also track the latest count of
stored addresses and an authority which must be a present signer for all
appended map entries.

Map additions require one slot to activate so each map should track how many
addresses are still pending activation in its on-chain state:

```rust
struct AddressMap {
    // authority must sign for each addition and to close the map account
    authority: Pubkey,
    // record a deactivation epoch to help validators know when to remove
    // the map from their caches.
    deactivation_epoch: Epoch,
    // entries may not be modified once activated
    activated: bool,
    // list of entries, max capacity of 2^8 (every entry is indexable by a u8)
    entries: Vec<Pubkey>,
}
```
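
The rules attached to this state (authority signature for additions, no changes
after activation, at most 2^8 entries) could be enforced along the following
lines. This is a minimal sketch, not the program's actual logic: `MapError`,
`add_entries`, and `MAX_ENTRIES` are hypothetical, and `Pubkey` and `Epoch` are
plain stand-ins for the SDK types.

```rust
type Pubkey = [u8; 32]; // stand-in for the SDK address type
type Epoch = u64; // stand-in for the SDK epoch type

/// Entries are indexed by a u8, so at most 2^8 fit in one map.
const MAX_ENTRIES: usize = 256;

struct AddressMap {
    authority: Pubkey,
    deactivation_epoch: Epoch,
    activated: bool,
    entries: Vec<Pubkey>,
}

#[derive(Debug, PartialEq)]
enum MapError {
    MissingAuthoritySignature,
    AlreadyActivated,
    TooManyEntries,
}

impl AddressMap {
    /// Adds entries if the signer is the authority, the map has not been
    /// activated yet, and the capacity limit would not be exceeded.
    fn add_entries(&mut self, signer: &Pubkey, new_entries: &[Pubkey]) -> Result<(), MapError> {
        if *signer != self.authority {
            return Err(MapError::MissingAuthoritySignature);
        }
        if self.activated {
            return Err(MapError::AlreadyActivated);
        }
        if self.entries.len() + new_entries.len() > MAX_ENTRIES {
            return Err(MapError::TooManyEntries);
        }
        self.entries.extend_from_slice(new_entries);
        Ok(())
    }
}
```
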
#### Cleanup

Once an address map becomes stale and is no longer used, the authority can
reclaim it by withdrawing lamports, but the remaining balance must be greater
than two epochs of rent. This ensures that it takes at least one full epoch to
deactivate a map.

Maps may not be recreated because each new map must be created at a derived
address using a monotonically increasing counter as a derivation seed.

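
The withdrawal rule can be expressed as a simple balance check. The sketch
below assumes a flat, hypothetical `rent_per_epoch` value in lamports; the
function name is likewise only for illustration.

```rust
/// Returns true if withdrawing `amount` lamports leaves a remaining balance
/// strictly greater than two epochs of rent.
fn withdrawal_allowed(balance: u64, amount: u64, rent_per_epoch: u64) -> bool {
    match balance.checked_sub(amount) {
        Some(remaining) => remaining > rent_per_epoch.saturating_mul(2),
        // Cannot withdraw more than the account holds.
        None => false,
    }
}
```
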
#### Cost

Since address map accounts require caching and special handling in the runtime,
they should incur higher costs for storage. Cost structure design will be added
later.

### Versioned Transactions

In order to allow accounts to be referenced more succinctly, the structure of
serialized transactions must be modified. The new transaction format should not
affect transaction processing in the Solana VM beyond the increased capacity for
accounts and program invocations. Invoked programs will be unaware of which
transaction format was used.

The new transaction format must be distinguished from the current transaction
format. Current transactions can fit at most 19 signatures (64 bytes each) but
the message header encodes `num_required_signatures` as a `u8`. Since the upper
bit of the `u8` will never be set for a valid transaction, we can set it to
denote whether a transaction should be decoded with the versioned format or not.

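
A decoder could tell the two formats apart by inspecting that upper bit. The
following is a minimal sketch; `VERSION_FLAG`, `MessageKind`, and
`classify_first_byte` are hypothetical names chosen for this example.

```rust
/// Upper bit of the first message byte (hypothetical constant name).
const VERSION_FLAG: u8 = 0x80;

#[derive(Debug, PartialEq)]
enum MessageKind {
    /// The byte is the `num_required_signatures` field of the current format.
    Unversioned { num_required_signatures: u8 },
    /// The upper bit is set, so a versioned message follows.
    Versioned,
}

/// Classifies a message by the first byte of its serialized form.
fn classify_first_byte(first: u8) -> MessageKind {
    if first & VERSION_FLAG != 0 {
        MessageKind::Versioned
    } else {
        MessageKind::Unversioned { num_required_signatures: first }
    }
}
```

Because at most 19 signatures fit in a packet, a valid current-format message
never has a first byte of 128 or more, so the two cases cannot collide.
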
#### New Transaction Format

```rust
#[derive(Serialize, Deserialize)]
pub struct Transaction {
    #[serde(with = "short_vec")]
    pub signatures: Vec<Signature>,
    /// The message to sign.
    pub message: Message,
}

// Uses custom serialization. If the first bit is set, a versioned message is
// encoded starting from the next byte. If the first bit is not set, all bytes
// are used to encode the original unversioned `Message` format.
pub enum Message {
    Unversioned(UnversionedMessage),
    Versioned(VersionedMessage),
}

// use bincode varint encoding to use u8 instead of u32 for enum tags
#[derive(Serialize, Deserialize)]
pub enum VersionedMessage {
    Current(Box<MessageV2>),
}

#[derive(Serialize, Deserialize)]
pub struct MessageV2 {
    // unchanged
    pub header: MessageHeader,

    // unchanged
    #[serde(with = "short_vec")]
    pub account_keys: Vec<Pubkey>,

    /// The last `address_maps.len()` number of readonly unsigned account_keys
    /// should be loaded as address maps
    #[serde(with = "short_vec")]
    pub address_maps: Vec<AddressMap>,

    // unchanged
    pub recent_blockhash: Hash,

    // unchanged. Account indices are still `u8` encoded so the max number of accounts
    // in account_keys + address_maps is limited to 256.
    #[serde(with = "short_vec")]
    pub instructions: Vec<CompiledInstruction>,
}

#[derive(Serialize, Deserialize)]
pub struct AddressMap {
    /// The last num_readonly_entries of entries are read-only
    pub num_readonly_entries: u8,

    /// List of map entries to load
    #[serde(with = "short_vec")]
    pub entries: Vec<u8>,
}
```
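
To make the relationship between `account_keys` and `address_maps` concrete,
the sketch below separates the trailing map accounts from the directly listed
accounts and counts the accounts a message references. The helper names are
hypothetical, `Pubkey` is a stand-in type, and the reading that the 256 limit
applies to `account_keys.len()` plus all map entries is an assumption.

```rust
type Pubkey = [u8; 32]; // stand-in for the SDK address type

struct AddressMap {
    num_readonly_entries: u8,
    entries: Vec<u8>,
}

impl AddressMap {
    /// Entries before the trailing read-only ones are loaded as writable.
    fn num_writable_entries(&self) -> Option<usize> {
        self.entries.len().checked_sub(usize::from(self.num_readonly_entries))
    }
}

/// Splits `account_keys` into (directly listed accounts, address map accounts).
/// The last `address_maps.len()` keys are the map accounts.
fn split_map_accounts<'a>(
    account_keys: &'a [Pubkey],
    address_maps: &[AddressMap],
) -> Option<(&'a [Pubkey], &'a [Pubkey])> {
    let direct_len = account_keys.len().checked_sub(address_maps.len())?;
    Some(account_keys.split_at(direct_len))
}

/// Number of accounts counted against the `u8` index limit (assumed reading).
fn total_account_count(account_keys: &[Pubkey], address_maps: &[AddressMap]) -> usize {
    account_keys.len() + address_maps.iter().map(|m| m.entries.len()).sum::<usize>()
}
```
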
#### Size changes

- 1 byte for `prefix` field
- 1 byte for version enum discriminant
- 1 byte for `address_maps` length
- Each map requires 2 bytes for `entries` length and `num_readonly_entries`
- Each map entry is 1 byte (u8)

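
The byte counts above can be combined into a rough break-even estimate. The
sketch below uses hypothetical function names and assumes every length prefix
fits in one byte. It also charges 32 bytes per map for the map account's own
address, since map accounts are listed in `account_keys`.

```rust
const ADDRESS_BYTES: usize = 32;

/// Extra bytes the new format adds for the given number of entries per map.
fn v2_overhead_bytes(entries_per_map: &[usize]) -> usize {
    let fixed = 1 + 1 + 1; // prefix, version discriminant, address_maps length
    let per_map: usize = entries_per_map.iter().map(|&n| 2 + n).sum();
    fixed + per_map
}

/// Net bytes saved compared with listing every mapped address in full.
/// Negative values mean the map costs more than it saves.
fn net_savings_bytes(entries_per_map: &[usize]) -> isize {
    let mapped: usize = entries_per_map.iter().sum();
    let full = mapped * ADDRESS_BYTES;
    let v2 = v2_overhead_bytes(entries_per_map) + entries_per_map.len() * ADDRESS_BYTES;
    full as isize - v2 as isize
}
```

Under these assumptions, one map with 10 entries saves 273 bytes, while a map
used for a single entry costs 6 bytes more than listing the address directly.
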
#### Cost changes

Using an address map in a transaction should incur an extra cost due to
the extra work validators need to do to load and cache them.

#### Metadata changes

Each account accessed via an address map should be stored in the transaction
metadata for quick reference. This will avoid the need for clients to make
multiple RPC round trips to fetch all accounts referenced in a v2 transaction.
It will also make it easier to use the ledger tool to analyze account access
patterns.

#### RPC changes

Fetched transaction responses will likely require a new version field to
indicate to clients which transaction structure to use for deserialization.
Clients using pre-existing RPC methods will receive error responses when
attempting to fetch a versioned transaction, which will indicate that they
must upgrade.

The RPC API should also support an option for returning fully expanded
transactions to abstract away the address map details from downstream clients.

### Limitations

- Max of 256 accounts may be specified in a transaction because `u8` is used by compiled
  instructions to index into transaction message account keys.
- Address maps can hold up to 256 addresses because references to map entries
  are encoded as `u8` in transactions.
- Transaction signers may not be referenced with an address map; the full
  address of each signer must be serialized in the transaction. This ensures that
  the performance of transaction signature checks is not affected.
- Hardware wallets will probably not be able to display details about accounts
  referenced through address maps due to inability to verify on-chain data.
- Only single-level address maps can be used. Recursive maps will not be supported.

## Security Concerns

### Resource consumption

Enabling more account inputs in a transaction allows for more program
invocations, write-locks, and data reads / writes. Before address maps are
enabled, transaction-wide compute limits and increased costs for write locks and
data reads are required.

### Front running

If the addresses listed within an address map account are modifiable, front
running attacks could modify which mapped accounts are resolved for a later
transaction. For this reason, we propose that any stored address is immutable
and that address map accounts themselves may not be recreated.

Additionally, a malicious actor could try to fork the chain immediately after a
new address map account is added to a block. If successful, they could add a
different unexpected map entry in the fork. In order to deter this attack,
clients should wait for address maps to be finalized before using them in a
transaction. Clients may also append integrity check instructions to the
transaction which verify that the correct accounts are used.

### Denial of service

Address map accounts will be read very frequently and will therefore be a
higher-profile target for denial of service attacks through write locks,
similar to sysvar accounts.

For this reason, special handling should be given to address map lookups.
Address map lookups should not be affected by account read/write locks.

### Duplicate accounts

Transactions may not load an account more than once whether directly through
`account_keys` or indirectly through `address_maps`.

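
One way to check this rule is to scan the directly listed addresses together
with the addresses resolved from address maps. This is only a sketch:
`find_duplicate` is a hypothetical helper and `Pubkey` is a stand-in type.

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32]; // stand-in for the SDK address type

/// Returns the first address that appears more than once across the directly
/// listed keys and the addresses resolved through address maps.
fn find_duplicate(account_keys: &[Pubkey], mapped: &[Pubkey]) -> Option<Pubkey> {
    let mut seen = HashSet::new();
    account_keys
        .iter()
        .chain(mapped.iter())
        // `insert` returns false when the key was already present.
        .find(|key| !seen.insert(**key))
        .copied()
}
```

A transaction would be rejected whenever this returns `Some`.
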
## Other Proposals

1) Account prefixes

Needing to pre-register accounts in an on-chain address map is cumbersome
because it adds an extra step for transaction processing. Instead, Solana
transactions could use variable length address prefixes to specify accounts.
These prefix shortcuts can save on data usage without needing to set up on-chain
state.

However, this model requires nodes to keep a mapping of prefixes to active account
addresses. Attackers can create accounts with the same prefix as a popular account
to disrupt transactions.

2) Transaction builder program

Solana can provide a new on-chain program which allows "Big" transactions to be
constructed on-chain by normal transactions. Once the transaction is
constructed, a final "Execute" transaction can trigger a node to process the big
transaction as a normal transaction without needing to fit it into an MTU sized
packet.

The UX of this approach is tricky. A user could in theory sign a big transaction
but it wouldn't be great if they had to use their wallet to sign multiple
transactions to build the transaction that they already signed and approved. This
could be a use-case for transaction relay services, though. A user could pay a
relayer to construct the large pre-signed transaction on-chain for them.

In order to prevent the large transaction from being reconstructed and replayed,
its message hash will need to be added to the status cache when executed.

3) Epoch account indexes

Similarly to leader schedule calculation, validators could create a global index
of the most accessed accounts in the previous epoch and make that index
available to transactions in the following epoch.

This approach has a downside of only updating the index at epoch boundaries
which means there would be a delay of a few days before popular new accounts could be
referenced. It also needs to be consistently generated by all validators by
using some criteria like adding accounts in order by access count.

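
For every validator to produce the same index, the ordering must be fully
deterministic. The sketch below orders accounts by access count and breaks
ties by address. The tie-break rule, the `capacity` parameter, and the function
name are assumptions made for this illustration and are not part of the proposal.

```rust
type Pubkey = [u8; 32]; // stand-in for the SDK address type

/// Builds an index of the most accessed accounts, highest count first.
/// Ties are broken by address so every validator derives the same order.
fn build_epoch_index(mut access_counts: Vec<(Pubkey, u64)>, capacity: usize) -> Vec<Pubkey> {
    access_counts.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    access_counts
        .into_iter()
        .take(capacity)
        .map(|(key, _)| key)
        .collect()
}
```
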
4) Address lists

Extend the transaction structure to support addresses that, when loaded, expand
to a list of addresses. After expansion, all account inputs are concatenated to
form a single list of account keys which can be indexed into by instructions.
Address lists would likely need to be immutable to prevent attacks. They would
also need to be limited in length to limit resource consumption.

This proposal can be thought of as a special case of the proposed address map
approach. Since the full account list would be expanded, there's no need to add
additional offsets that use up the limited space in a serialized transaction.
However, the expected size of an address list may need to be encoded into the
transaction to aid the sanitization of account indexes. We would also need to
encode how many addresses in the list should be loaded as readonly vs
read-write. Lastly, special attention must be given to watch out for addresses
that exist in multiple account lists.