This reverts commit 51cab5d16d
.
This commit is contained in:
@@ -91,7 +91,6 @@ module.exports = {
|
||||
},
|
||||
"developing/test-validator",
|
||||
"developing/backwards-compatibility",
|
||||
"developing/plugins/accountsdb_plugin"
|
||||
],
|
||||
Integrating: ["integrations/exchange"],
|
||||
Validating: [
|
||||
|
@@ -1,363 +0,0 @@
|
||||
---
|
||||
title: AccountsDb Plugins
|
||||
---
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
Validators under heavy RPC loads, such as when serving getProgramAccounts calls,
|
||||
can fall behind the network. To solve this problem, the validator has been
|
||||
enhanced to support a plugin mechanism through which the information about
|
||||
accounts and slots can be transmitted to external data stores such as relational
|
||||
databases, NoSQL databases or Kafka. RPC services then can be developed to
|
||||
consume data from these external data stores with the possibility of more
|
||||
flexible and targeted optimizations such as caching and indexing. This allows
|
||||
the validator to focus on processing transactions without being slowed down by
|
||||
busy RPC requests.
|
||||
|
||||
This document describes the interfaces of the plugin and the referential plugin
|
||||
implementation for the PostgreSQL database.
|
||||
|
||||
[crates.io]: https://crates.io/search?q=solana-
|
||||
[docs.rs]: https://docs.rs/releases/search?query=solana-
|
||||
|
||||
### Important Crates:
|
||||
|
||||
- [`solana-accountsdb-plugin-interface`] — This crate defines the plugin
|
||||
interfaces.
|
||||
|
||||
- [`solana-accountsdb-plugin-postgres`] — The crate for the referential
|
||||
plugin implementation for the PostgreSQL database.
|
||||
|
||||
[`solana-accountsdb-plugin-interface`]: https://docs.rs/solana-accountsdb-plugin-interface
|
||||
[`solana-accountsdb-plugin-postgres`]: https://docs.rs/solana-accountsdb-plugin-postgres
|
||||
|
||||
|
||||
The Plugin Interface
|
||||
====================
|
||||
|
||||
The Plugin interface is declared in [`solana-accountsdb-plugin-interface`]. It
|
||||
is defined by the trait `AccountsDbPlugin`. The plugin should implement the
|
||||
trait and expose a "C" function `_create_plugin` to return the pointer to this
|
||||
trait. For example, in the referential implementation, the following code
|
||||
instantiates the PostgreSQL plugin `AccountsDbPluginPostgres ` and returns its
|
||||
pointer.
|
||||
|
||||
```
|
||||
#[no_mangle]
|
||||
#[allow(improper_ctypes_definitions)]
|
||||
/// # Safety
|
||||
///
|
||||
/// This function returns the AccountsDbPluginPostgres pointer as trait AccountsDbPlugin.
|
||||
pub unsafe extern "C" fn _create_plugin() -> *mut dyn AccountsDbPlugin {
|
||||
let plugin = AccountsDbPluginPostgres::new();
|
||||
let plugin: Box<dyn AccountsDbPlugin> = Box::new(plugin);
|
||||
Box::into_raw(plugin)
|
||||
}
|
||||
```
|
||||
|
||||
A plugin implementation can implement the `on_load` method to initialize itself.
|
||||
This function is invoked after a plugin is dynamically loaded into the validator
|
||||
when it starts. The configuration of the plugin is controlled by a configuration
|
||||
file in JSON format. The JSON file must have a field `libpath` that points
|
||||
to the full path name of the shared library implementing the plugin, and may
|
||||
have other configuration information, like connection parameters for the external
|
||||
database. The plugin configuration file is specified by the validator's CLI
|
||||
parameter `--accountsdb-plugin-config` and the file must be readable to the
|
||||
validator process.
|
||||
|
||||
Please see the [config file](#config) for the referential
|
||||
PostgreSQL plugin below for an example.
|
||||
|
||||
The plugin can implement the `on_unload` method to do any cleanup before the
|
||||
plugin is unloaded when the validator is gracefully shutdown.
|
||||
|
||||
The following method is used for notifying on an account update:
|
||||
|
||||
```
|
||||
fn update_account(
|
||||
&mut self,
|
||||
account: ReplicaAccountInfoVersions,
|
||||
slot: u64,
|
||||
is_startup: bool,
|
||||
) -> Result<()>
|
||||
```
|
||||
|
||||
The `ReplicaAccountInfoVersions` struct contains the metadata and data of the account
|
||||
streamed. The `slot` points to the slot the account is being updated at. When
|
||||
`is_startup` is true, it indicates the account is loaded from snapshots when
|
||||
the validator starts up. When `is_startup` is false, the account is updated
|
||||
when processing a transaction.
|
||||
|
||||
|
||||
The following method is called when all accounts have been notified when the
|
||||
validator restores the AccountsDb from snapshots at startup.
|
||||
|
||||
```
|
||||
fn notify_end_of_startup(&mut self) -> Result<()>
|
||||
```
|
||||
|
||||
When `update_account` is called during processing transactions, the plugin
|
||||
should process the notification as fast as possible because any delay may
|
||||
cause the validator to fall behind the network. Persistence to external data
|
||||
store is best to be done asynchronously.
|
||||
|
||||
The following method is used for notifying slot status changes:
|
||||
|
||||
```
|
||||
fn update_slot_status(
|
||||
&mut self,
|
||||
slot: u64,
|
||||
parent: Option<u64>,
|
||||
status: SlotStatus,
|
||||
) -> Result<()>
|
||||
```
|
||||
|
||||
To ensure data consistency, the plugin implementation can choose to abort
|
||||
the validator in case of error persisting to external stores. When the
|
||||
validator restarts the account data will be re-transmitted.
|
||||
|
||||
For more details, please refer to the Rust documentation in
|
||||
[`solana-accountsdb-plugin-interface`].
|
||||
|
||||
Example PostgreSQL Plugin
|
||||
=========================
|
||||
|
||||
The [`solana-accountsdb-plugin-postgres`] crate implements a plugin storing
|
||||
account data to a PostgreSQL database to illustrate how a plugin can be
|
||||
developed.
|
||||
|
||||
<a name="config">
|
||||
## Configuration File Format
|
||||
</a>
|
||||
|
||||
The plugin is configured using the input configuration file. An example
|
||||
configuration file looks like the following:
|
||||
|
||||
|
||||
```
|
||||
{
|
||||
"libpath": "/solana/target/release/libsolana_accountsdb_plugin_postgres.so",
|
||||
"host": "postgres-server",
|
||||
"user": "solana",
|
||||
"port": 5433,
|
||||
"threads": 20,
|
||||
"batch_size": 20,
|
||||
"panic_on_db_errors": true,
|
||||
"accounts_selector" : {
|
||||
"accounts" : ["*"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `host`, `user`, and `port` control the PostgreSQL configuration
|
||||
information. For more advanced connection options, please use the
|
||||
`connection_str` field. Please see [Rust postgres configuration]
|
||||
(https://docs.rs/postgres/0.19.2/postgres/config/struct.Config.html).
|
||||
|
||||
To improve the throughput to the database, the plugin supports connection pooling
|
||||
using multiple threads, each maintaining a connection to the PostgreSQL database.
|
||||
The count of the threads is controlled by the `threads` field. A higher thread
|
||||
count usually offers better performance.
|
||||
|
||||
To further improve performance when saving large numbers of accounts at
|
||||
startup, the plugin uses bulk inserts. The batch size is controlled by the
|
||||
`batch_size` parameter. This can help reduce the round trips to the database.
|
||||
|
||||
The `panic_on_db_errors` can be used to panic the validator in case of database
|
||||
errors to ensure data consistency.
|
||||
|
||||
## Account Selection
|
||||
|
||||
The `accounts_selector` can be used to filter the accounts that should be persisted.
|
||||
|
||||
For example, one can use the following to persist only the accounts with particular
|
||||
Base58-encoded Pubkeys,
|
||||
|
||||
```
|
||||
"accounts_selector" : {
|
||||
"accounts" : ["pubkey-1", "pubkey-2", ..., "pubkey-n"],
|
||||
}
|
||||
```
|
||||
|
||||
Or use the following to select accounts with certain program owners:
|
||||
|
||||
```
|
||||
"accounts_selector" : {
|
||||
"owners" : ["pubkey-owner-1", "pubkey-owner-2", ..., "pubkey-owner-m"],
|
||||
}
|
||||
```
|
||||
|
||||
To select all accounts, use the wildcard character (*):
|
||||
|
||||
```
|
||||
"accounts_selector" : {
|
||||
"accounts" : ["*"],
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
## Database Setup
|
||||
|
||||
### Install PostgreSQL Server
|
||||
|
||||
Please follow [PostgreSQL Ubuntu Installation](https://www.postgresql.org/download/linux/ubuntu/)
|
||||
on instructions to install the PostgreSQL database server. For example, to
|
||||
install postgresql-14,
|
||||
|
||||
```
|
||||
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
|
||||
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
|
||||
sudo apt-get update
|
||||
sudo apt-get -y install postgresql-14
|
||||
```
|
||||
### Control the Database Access
|
||||
|
||||
Modify the pg_hba.conf as necessary to grant the plugin to access the database.
|
||||
For example, in /etc/postgresql/14/main/pg_hba.conf, the following entry allows
|
||||
nodes with IPs in the CIDR 10.138.0.0/24 to access all databases. The validator
|
||||
runs in a node with an ip in the specified range.
|
||||
|
||||
```
|
||||
host all all 10.138.0.0/24 trust
|
||||
```
|
||||
|
||||
It is recommended to run the database server on a separate node from the validator for
|
||||
better performance.
|
||||
|
||||
### Configure the Database Performance Parameters
|
||||
|
||||
Please refer to the [PostgreSQL Server Configuration](https://www.postgresql.org/docs/14/runtime-config.html)
|
||||
for configuration details. The referential implementation uses the following
|
||||
configurations for better database performance in the /etc/postgresql/14/main/postgresql.conf
|
||||
which are different from the default postgresql-14 installation.
|
||||
|
||||
```
|
||||
max_connections = 200 # (change requires restart)
|
||||
shared_buffers = 1GB # min 128kB
|
||||
effective_io_concurrency = 1000 # 1-1000; 0 disables prefetching
|
||||
wal_level = minimal # minimal, replica, or logical
|
||||
fsync = off # flush data to disk for crash safety
|
||||
synchronous_commit = off # synchronization level;
|
||||
full_page_writes = off # recover from partial page writes
|
||||
max_wal_senders = 0 # max number of walsender processes
|
||||
```
|
||||
|
||||
The sample [postgresql.conf](https://github.com/solana-labs/solana/blob/7ac43b16d2c766df61ae0a06d7aaf14ba61996ac/accountsdb-plugin-postgres/scripts/postgresql.conf)
|
||||
can be used for reference.
|
||||
|
||||
### Create the Database Instance and the Role
|
||||
|
||||
Start the server:
|
||||
|
||||
```
|
||||
sudo systemctl start postgresql@14-main
|
||||
```
|
||||
|
||||
Create the database. For example, the following creates a database named 'solana':
|
||||
|
||||
```
|
||||
sudo -u postgres createdb solana -p 5433
|
||||
```
|
||||
|
||||
Create the database user. For example, the following creates a regular user named 'solana':
|
||||
|
||||
```
|
||||
sudo -u postgres createuser -p 5433 solana
|
||||
```
|
||||
|
||||
Verify the database is working using psql. For example, assuming the node running
|
||||
PostgreSQL has the ip 10.138.0.9, the following command will land in a shell where
|
||||
SQL commands can be entered:
|
||||
|
||||
```
|
||||
psql -U solana -p 5433 -h 10.138.0.9 -w -d solana
|
||||
```
|
||||
|
||||
### Create the Schema Objects
|
||||
|
||||
Use the [create_schema.sql](https://github.com/solana-labs/solana/blob/7ac43b16d2c766df61ae0a06d7aaf14ba61996ac/accountsdb-plugin-postgres/scripts/create_schema.sql)
|
||||
to create the objects for storing accounts and slots.
|
||||
|
||||
Download the script from github:
|
||||
|
||||
```
|
||||
wget https://raw.githubusercontent.com/solana-labs/solana/7ac43b16d2c766df61ae0a06d7aaf14ba61996ac/accountsdb-plugin-postgres/scripts/create_schema.sql
|
||||
```
|
||||
|
||||
Then run the script:
|
||||
|
||||
```
|
||||
psql -U solana -p 5433 -h 10.138.0.9 -w -d solana -f create_schema.sql
|
||||
```
|
||||
|
||||
After this, start the validator with the plugin by using the `--accountsdb-plugin-config`
|
||||
argument mentioned above.
|
||||
|
||||
### Destroy the Schema Objects
|
||||
|
||||
To destroy the database objects, created by `create_schema.sql`, use
|
||||
[drop_schema.sql](https://github.com/solana-labs/solana/blob/7ac43b16d2c766df61ae0a06d7aaf14ba61996ac/accountsdb-plugin-postgres/scripts/drop_schema.sql).
|
||||
For example,
|
||||
|
||||
```
|
||||
psql -U solana -p 5433 -h 10.138.0.9 -w -d solana -f drop_schema.sql
|
||||
```
|
||||
|
||||
## Capture Historical Account Data
|
||||
|
||||
The account historical data is captured using a database trigger as shown in
|
||||
`create_schema.sql`,
|
||||
|
||||
```
|
||||
CREATE FUNCTION audit_account_update() RETURNS trigger AS $audit_account_update$
|
||||
BEGIN
|
||||
INSERT INTO account_audit (pubkey, owner, lamports, slot, executable, rent_epoch, data, write_version, updated_on)
|
||||
VALUES (OLD.pubkey, OLD.owner, OLD.lamports, OLD.slot,
|
||||
OLD.executable, OLD.rent_epoch, OLD.data, OLD.write_version, OLD.updated_on);
|
||||
RETURN NEW;
|
||||
END;
|
||||
|
||||
$audit_account_update$ LANGUAGE plpgsql;
|
||||
|
||||
CREATE TRIGGER account_update_trigger AFTER UPDATE OR DELETE ON account
|
||||
FOR EACH ROW EXECUTE PROCEDURE audit_account_update();
|
||||
```
|
||||
|
||||
The historical data is stored in the account_audit table.
|
||||
|
||||
The trigger can be dropped to disable this feature, for example,
|
||||
|
||||
|
||||
```
|
||||
DROP TRIGGER account_update_trigger ON account;
|
||||
```
|
||||
|
||||
Over time, the account_audit can accumulate large amount of data. You may choose to
|
||||
limit that by deleting older historical data.
|
||||
|
||||
|
||||
For example, the following SQL statement can be used to keep up to 1000 of the most
|
||||
recent records for an account:
|
||||
|
||||
```
|
||||
delete from account_audit a2 where (pubkey, write_version) in
|
||||
(select pubkey, write_version from
|
||||
(select a.pubkey, a.updated_on, a.slot, a.write_version, a.lamports,
|
||||
rank() OVER ( partition by pubkey order by write_version desc) as rnk
|
||||
from account_audit a) ranked
|
||||
where ranked.rnk > 1000)
|
||||
```
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
When a validator lacks sufficient compute power, the overhead of saving the
|
||||
account data can cause it to fall behind the network especially when all
|
||||
accounts or a large number of accounts are selected. The node hosting the
|
||||
PostgreSQL database need to be powerful enough to handle the database loads
|
||||
as well. It has been found using GCP n2-standard-64 machine type for the
|
||||
validator and n2-highmem-32 for the PostgreSQL node is adequate for handling
|
||||
transmiting all accounts while keeping up with the network. In addition, it is
|
||||
best to keep the validator and the PostgreSQL in the same local network to
|
||||
reduce latency. You may need to size the validator and database nodes
|
||||
differently if serving other loads.
|
Reference in New Issue
Block a user