Flesh out development docs (#13318)

* flesh out development docs

* nits
This commit is contained in:
Jack May
2020-11-03 12:53:17 -08:00
committed by GitHub
parent 546915ee12
commit 3d5e778d5d
35 changed files with 1757 additions and 540 deletions

View File

@@ -1,98 +0,0 @@
---
title: Cross-Program Invocation
---
## Problem
In today's implementation, a client can create a transaction that modifies two accounts, each owned by a separate on-chain program:
```rust,ignore
let message = Message::new(vec![
token_instruction::pay(&alice_pubkey),
acme_instruction::launch_missiles(&bob_pubkey),
]);
client.send_and_confirm_message(&[&alice_keypair, &bob_keypair], &message);
```
However, the current implementation does not allow the `acme` program to conveniently invoke `token` instructions on the client's behalf:
```rust,ignore
let message = Message::new(vec![
acme_instruction::pay_and_launch_missiles(&alice_pubkey, &bob_pubkey),
]);
client.send_and_confirm_message(&[&alice_keypair, &bob_keypair], &message);
```
Currently, there is no way to create instruction `pay_and_launch_missiles` that executes `token_instruction::pay` from the `acme` program. A possible workaround is to extend the `acme` program with the implementation of the `token` program and create `token` accounts with `ACME_PROGRAM_ID`, which the `acme` program is permitted to modify. With that workaround, `acme` can modify token-like accounts created by the `acme` program, but not token accounts created by the `token` program.
## Proposed Solution
The goal of this design is to modify Solana's runtime such that an on-chain program can invoke an instruction from another program.
Given two on-chain programs `token` and `acme`, each implementing instructions `pay()` and `launch_missiles()` respectively, we would ideally like to implement the `acme` module with a call to a function defined in the `token` module:
```rust,ignore
mod acme {
use token;
fn launch_missiles(keyed_accounts: &[KeyedAccount]) -> Result<()> {
...
}
fn pay_and_launch_missiles(keyed_accounts: &[KeyedAccount]) -> Result<()> {
token::pay(&keyed_accounts[1..])?;
launch_missiles(keyed_accounts)?;
}
```
The above code would require that the `token` crate be dynamically linked so that a custom linker could intercept calls and validate accesses to `keyed_accounts`. Even though the client intends to modify both `token` and `acme` accounts, only `token` program is permitted to modify the `token` account, and only the `acme` program is allowed to modify the `acme` account.
Backing off from that ideal direct cross-program call, a slightly more verbose solution is to allow `acme` to invoke `token` by issuing a token instruction via the runtime.
```rust,ignore
mod acme {
use token_instruction;
fn launch_missiles(keyed_accounts: &[KeyedAccount]) -> Result<()> {
...
}
fn pay_and_launch_missiles(keyed_accounts: &[KeyedAccount]) -> Result<()> {
let alice_pubkey = keyed_accounts[1].key;
let instruction = token_instruction::pay(&alice_pubkey);
invoke(&instruction, accounts)?;
launch_missiles(keyed_accounts)?;
}
```
`invoke()` is built into Solana's runtime and is responsible for routing the given instruction to the `token` program via the instruction's `program_id` field.
Before invoking `pay()`, the runtime must ensure that `acme` didn't modify any accounts owned by `token`. It does this by applying the runtime's policy to the current state of the accounts at the time `acme` calls `invoke` vs. the initial state of the accounts at the beginning of the `acme`'s instruction. After `pay()` completes, the runtime must again ensure that `token` didn't modify any accounts owned by `acme` by again applying the runtime's policy, but this time with the `token` program ID. Lastly, after `pay_and_launch_missiles()` completes, the runtime must apply the runtime policy one more time, where it normally would, but using all updated `pre_*` variables. If executing `pay_and_launch_missiles()` up to `pay()` made no invalid account changes, `pay()` made no invalid changes, and executing from `pay()` until `pay_and_launch_missiles()` returns made no invalid changes, then the runtime can transitively assume `pay_and_launch_missiles()` as whole made no invalid account changes, and therefore commit all these account modifications.
### Instructions that require privileges
The runtime uses the privileges granted to the caller program to determine what privileges can be extended to the callee. Privileges in this context refer to signers and writable accounts. For example, if the instruction the caller is processing contains a signer or writable account, then the caller can invoke an instruction that also contains that signer and/or writable account.
This privilege extension relies on the fact that programs are immutable. In the case of the `acme` program, the runtime can safely treat the transaction's signature as a signature of a `token` instruction. When the runtime sees the `token` instruction references `alice_pubkey`, it looks up the key in the `acme` instruction to see if that key corresponds to a signed account. In this case, it does and thereby authorizes the `token` program to modify Alice's account.
### Program signed accounts
Programs can issue instructions that contain signed accounts that were not signed in the original transaction by
using [Program derived addresses](program-derived-addresses.md).
To sign an account with program derived addresses, a program may `invoke_signed()`.
```rust,ignore
invoke_signed(
&instruction,
accounts,
&[&["First addresses seed"],
&["Second addresses first seed", "Second addresses second seed"]],
)?;
```
### Reentrancy
Reentrancy is currently limited to direct self recursion capped at a fixed depth. This restriction prevents situations where a program might invoke another from an intermediary state without the knowledge that it might later be called back into. Direct recursion gives the program full control of its state at the point that it gets called back.

View File

@@ -1,159 +0,0 @@
---
title: Program Derived Addresses
---
## Problem
Programs cannot generate signatures when issuing instructions to other programs
as defined in the [Cross-Program Invocations](cross-program-invocation.md)
design.
The lack of programmatic signature generation limits the kinds of programs that
can be implemented in Solana. A program may be given the authority over an
account and later want to transfer that authority to another. This is impossible
today because the program cannot act as the signer in the transaction that gives
authority.
For example, if two users want to make a wager on the outcome of a game in
Solana, they must each transfer their wager's assets to some intermediary that
will honor their agreement. Currently, there is no way to implement this
intermediary as a program in Solana because the intermediary program cannot
transfer the assets to the winner.
This capability is necessary for many DeFi applications since they require
assets to be transferred to an escrow agent until some event occurs that
determines the new owner.
- Decentralized Exchanges that transfer assets between matching bid and ask
orders.
- Auctions that transfer assets to the winner.
- Games or prediction markets that collect and redistribute prizes to the
winners.
## Proposed Solution
The key to the design is two-fold:
1. Allow programs to control specific addresses, called program addresses, in
such a way that no external user can generate valid transactions with
signatures for those addresses.
2. Allow programs to programmatically sign for programa addresses that are
present in instructions invoked via [Cross-Program
Invocations](cross-program-invocation.md).
Given the two conditions, users can securely transfer or assign the authority of
on-chain assets to program addresses and the program can then assign that
authority elsewhere at its discretion.
### Private keys for program addresses
A Program address does not lie on the ed25519 curve and therefore has no valid
private key associated with it, and thus generating a signature for it is
impossible. While it has no private key of its own, it can be used by a program
to issue an instruction that includes the Program address as a signer.
### Hash-based generated program addresses
Program addresses are deterministically derived from a collection of seeds and a
program id using a 256-bit pre-image resistant hash function. Program address
must not lie on the ed25519 curve to ensure there is no associated private key.
During generation an error will be returned if the address is found to lie on
the curve. There is about a 50/50 change of this happening for a given
collection of seeds and program id. If this occurs a different set of seeds or
a seed bump (additional 8 bit seed) can be used to find a valid program address
off the curve.
Deterministic program addresses for programs follow a similar derivation path as
Accounts created with `SystemInstruction::CreateAccountWithSeed` which is
implemented with `system_instruction::create_address_with_seed`.
For reference that implementation is as follows:
```rust,ignore
pub fn create_address_with_seed(
base: &Pubkey,
seed: &str,
program_id: &Pubkey,
) -> Result<Pubkey, SystemError> {
if seed.len() > MAX_ADDRESS_SEED_LEN {
return Err(SystemError::MaxSeedLengthExceeded);
}
Ok(Pubkey::new(
hashv(&[base.as_ref(), seed.as_ref(), program_id.as_ref()]).as_ref(),
))
}
```
Programs can deterministically derive any number of addresses by using seeds.
These seeds can symbolically identify how the addresses are used.
From `Pubkey`::
```rust,ignore
/// Generate a derived program address
/// * seeds, symbolic keywords used to derive the key
/// * program_id, program that the address is derived for
pub fn create_program_address(
seeds: &[&[u8]],
program_id: &Pubkey,
) -> Result<Pubkey, PubkeyError>
```
### Using program addresses
Clients can use the `create_program_address` function to generate a destination
address.
```rust,ignore
// deterministically derive the escrow key
let escrow_pubkey = create_program_address(&[&["escrow"]], &escrow_program_id);
// construct a transfer message using that key
let message = Message::new(vec![
token_instruction::transfer(&alice_pubkey, &escrow_pubkey, 1),
]);
// process the message which transfer one 1 token to the escrow
client.send_and_confirm_message(&[&alice_keypair], &message);
```
Programs can use the same function to generate the same address. In the function
below the program issues a `token_instruction::transfer` from a program address
as if it had the private key to sign the transaction.
```rust,ignore
fn transfer_one_token_from_escrow(
program_id: &Pubkey,
keyed_accounts: &[KeyedAccount]
) -> Result<()> {
// User supplies the destination
let alice_pubkey = keyed_accounts[1].unsigned_key();
// Deterministically derive the escrow pubkey.
let escrow_pubkey = create_program_address(&[&["escrow"]], program_id);
// Create the transfer instruction
let instruction = token_instruction::transfer(&escrow_pubkey, &alice_pubkey, 1);
// The runtime deterministically derives the key from the currently
// executing program ID and the supplied keywords.
// If the derived address matches a key marked as signed in the instruction
// then that key is accepted as signed.
invoke_signed(&instruction, &[&["escrow"]])?
}
```
### Instructions that require signers
The addresses generated with `create_program_address` are indistinguishable from
any other public key. The only way for the runtime to verify that the address
belongs to a program is for the program to supply the seeds used to generate the
address.
The runtime will internally call `create_program_address`, and compare the
result against the addresses supplied in the instruction.

View File

@@ -1,81 +0,0 @@
---
title: secp256k1 builtin instruction
---
## Problem
Performing multiple secp256k1 pubkey recovery operations (ecrecover) in BPF would exceed the transction bpf instruction
limit and even if the limit is increased it would take a long time to process.
ecrecover is an ethereum instruction which takes a signature and message and recovers a publickey, a comparison
to that public key can thus verify that the signature is valid.
Since there needs to be 10-20 signatures in the transaction as well as the signing data which is on the
order of 500 bytes, transaction space is a concern. But also having more concentrated similar work should
provide for easier optimization.
## Solution
Add a new builtin instruction which takes in as the first byte a count of the following struct serialized in the instruction
data:
```
struct Secp256k1SignatureOffsets {
secp_signature_key_offset: u16, // offset to [signature,recovery_id,etherum_address] of 64+1+20 bytes
secp_signature_instruction_index: u8, // instruction index to find data
secp_pubkey_offset: u16, // offset to [signature,recovery_id] of 64+1 bytes
secp_signature_instruction_index: u8, // instruction index to find data
secp_message_data_offset: u16, // offset to start of message data
secp_message_data_size: u16, // size of message data
secp_message_instruction_index: u8, // index of instruction data to get message data
}
```
Pseudo code of the operation:
```
process_instruction() {
for i in 0..count {
// i'th index values referenced:
instructions = &transaction.message().instructions
signature = instructions[secp_signature_instruction_index].data[secp_signature_offset..secp_signature_offset + 64]
recovery_id = instructions[secp_signature_instruction_index].data[secp_signature_offset + 64]
ref_eth_pubkey = instructions[secp_pubkey_instruction_index].data[secp_pubkey_offset..secp_pubkey_offset + 32]
message_hash = keccak256(instructions[secp_message_instruction_index].data[secp_message_data_offset..secp_message_data_offset + secp_message_data_size])
pubkey = ecrecover(signature, recovery_id, message_hash)
eth_pubkey = keccak256(pubkey[1..])[12..]
if eth_pubkey != ref_eth_pubkey {
return Error
}
}
return Success
}
```
This allows the user to specify any instruction data in the transaction for signature and message data.
By specifying a special instructions sysvar, one can also receive data from the transaction itself.
Cost of the transaction will count the number of signatures to verify multiplied by the signature cost verify multiplier.
## Optimization notes
The operation will have to take place after (at least partial) deserialization, but all inputs come
from the transaction data itself, this allows it to be relatively easy to execute in parallel to
transaction processing and PoH verification.
## Other solutions
* Instruction available as CPI such that the program can call as desired or a syscall which can operate on the instruction inline.
- Could be harder to optimize given that it generally either requires bpf program scan to determine the inputs to the operation,
or the implementation needs to just wait until the program hits the operation in bpf processing to evaluate it.
- Vector version of the operation could allow for somewhat efficient simd/gpu execution. For most efficient though,
batching with other instructions in the pipeline would be ideal.
- Pros - Nicer interface for the user.
* Async execution environment inside bpf
- Might be hard to optimize for devices like gpus which cannot queue work for itself easily
- Might be easier to optimize on cpu since ordering can be more explicit
* All inputs have to come from the instruction
- Pros - easier to optimize, data is already sent to the GPU for instance for regular sigverify. Probably still need to
wait for deserialize though.
- Cons - ask for pubkeys outside the transaction data itself since they would not be stored on the transaction sending client,
and larger transaction size.