Rework cluster metrics dashboard to support the modern clusters

Michael Vines
2020-03-11 10:21:53 -07:00
parent 0ef9d79056
commit 5f5824d78d
9 changed files with 58 additions and 55 deletions


@ -24,8 +24,8 @@ solana transaction-count
Inspect the network explorer at Inspect the network explorer at
[https://explorer.solana.com/](https://explorer.solana.com/) for activity. [https://explorer.solana.com/](https://explorer.solana.com/) for activity.
View the [metrics dashboard](https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta?var-testnet=testnet) View the [metrics dashboard](https://metrics.solana.com:3000/d/monitor/cluster-telemetry) for more
for more detail on cluster activity. detail on cluster activity.
## Confirm your Installation ## Confirm your Installation


@ -5,7 +5,7 @@ testnet participants, [https://discord.gg/pquxPsq](https://discord.gg/pquxPsq).
## Useful Links & Discussion ## Useful Links & Discussion
* [Network Explorer](http://explorer.solana.com/) * [Network Explorer](http://explorer.solana.com/)
* [Testnet Metrics Dashboard](https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?refresh=60s&orgId=2) * [Testnet Metrics Dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=60s&orgId=2)
* Validator chat channels * Validator chat channels
* [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries. * [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries.
* [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants ([What is Tour de SOL?](https://solana.com/tds/)). * [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants ([What is Tour de SOL?](https://solana.com/tds/)).
@ -14,6 +14,6 @@ testnet participants, [https://discord.gg/pquxPsq](https://discord.gg/pquxPsq).
* [Core software repo](https://github.com/solana-labs/solana) * [Core software repo](https://github.com/solana-labs/solana)
* [Tour de SOL Docs](https://docs.solana.com/tour-de-sol) * [Tour de SOL Docs](https://docs.solana.com/tour-de-sol)
* [TdS repo](https://github.com/solana-labs/tour-de-sol) * [TdS repo](https://github.com/solana-labs/tour-de-sol)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds&orgId=2&var-datasource=TdS%20Metrics%20%28read-only%29) * [TdS metrics dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds)
Can't find what you're looking for? Send an email to ryan@solana.com or reach out to @rshea\#2622 on Discord. Can't find what you're looking for? Send an email to ryan@solana.com or reach out to @rshea\#2622 on Discord.


@ -6,7 +6,7 @@ description: Where to go after you've read this guide
* [Solana Docs](https://docs.solana.com/) * [Solana Docs](https://docs.solana.com/)
* [Network Explorer](http://explorer.solana.com/) * [Network Explorer](http://explorer.solana.com/)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/testnet/testnet-monitor?refresh=1m&from=now-15m&to=now&orgId=2&var-datasource=Solana%20Metrics%20(read-only)&var-testnet=tds&var-hostid=All9) * [TdS metrics dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds)
* Validator chat channels * Validator chat channels
* [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries that dont fall under Tour de SOL. * [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries that dont fall under Tour de SOL.
* [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants. * [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants.


@ -4,13 +4,14 @@
There are three versions of the testnet dashboard, corresponding to the three There are three versions of the testnet dashboard, corresponding to the three
release channels: release channels:
* https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge * https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge
* https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta * https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta
* https://metrics.solana.com:3000/d/testnet/testnet-monitor * https://metrics.solana.com:3000/d/monitor/cluster-telemetry
The dashboard for each channel is defined from the The dashboard for each channel is defined from the
`metrics/testnet-monitor.json` source file in the git branch associated with `metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json` source
that channel, and deployed by automation running `ci/publish-metrics-dashboard.sh`. file in the git branch associated with that channel, and deployed by automation
running `ci/publish-metrics-dashboard.sh`.
A deploy can be triggered at any time via the `New Build` button of A deploy can be triggered at any time via the `New Build` button of
https://buildkite.com/solana-labs/publish-metrics-dashboard. https://buildkite.com/solana-labs/publish-metrics-dashboard.
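
The channel-to-URL mapping listed above can be sketched in a few lines of Python. This is only an illustration of the new naming scheme, assuming the `monitor` / `monitor-<channel>` uids and the `cluster-telemetry` slugs shown in this commit; the helper name `dashboard_url` is hypothetical and does not exist in the repository.

```python
# Base of the Grafana dashboard URLs used throughout this commit.
BASE_URL = "https://metrics.solana.com:3000/d"


def dashboard_url(channel: str) -> str:
    """Return the dashboard URL for a release channel (hypothetical helper)."""
    if channel == "stable":
        # The stable dashboard carries no channel suffix.
        return f"{BASE_URL}/monitor/cluster-telemetry"
    if channel in ("edge", "beta"):
        # Non-stable dashboards append the channel to both uid and slug.
        return f"{BASE_URL}/monitor-{channel}/cluster-telemetry-{channel}"
    raise ValueError(f"Invalid channel: {channel}")


if __name__ == "__main__":
    for name in ("edge", "beta", "stable"):
        print(name, dashboard_url(name))
```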
@ -18,7 +19,7 @@ https://buildkite.com/solana-labs/publish-metrics-dashboard.
### Modifying a Dashboard ### Modifying a Dashboard
Dashboard updates are accomplished by modifying Dashboard updates are accomplished by modifying
`metrics/scripts/grafana-provisioning/dashboards/testnet-monitor.json`, `metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json`,
**manual edits made directly in Grafana will be overwritten**. **manual edits made directly in Grafana will be overwritten**.
* Check out metrics to add at https://metrics.solana.com:8888/ in the data explorer. * Check out metrics to add at https://metrics.solana.com:8888/ in the data explorer.
@ -32,13 +33,13 @@ Dashboard updates are accomplished by modifying
`Settings` menu for the dashboard `Settings` menu for the dashboard
3. Edit dashboard as desired 3. Edit dashboard as desired
4. Extract the JSON Model by selecting `JSON Model` in the `Settings` menu. Copy the JSON to the clipboard 4. Extract the JSON Model by selecting `JSON Model` in the `Settings` menu. Copy the JSON to the clipboard
and paste into `metrics/scripts/grafana-provisioning/dashboards/testnet-monitor.json`, and paste into `metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json`,
5. Delete your development dashboard: `Settings` => `Delete` 5. Delete your development dashboard: `Settings` => `Delete`
### Deploying a Dashboard Manually ### Deploying a Dashboard Manually
If you need to immediately deploy a dashboard using the contents of If you need to immediately deploy a dashboard using the contents of
`testnet-monitor.json` in your local workspace, `cluster-monitor.json` in your local workspace,
``` ```
$ export GRAFANA_API_TOKEN="an API key from https://metrics.solana.com:3000/org/apikeys" $ export GRAFANA_API_TOKEN="an API key from https://metrics.solana.com:3000/org/apikeys"
$ metrics/publish-metrics-dashboard.sh (edge|beta|stable) $ metrics/publish-metrics-dashboard.sh (edge|beta|stable)


@ -11,13 +11,13 @@ fi
case $CHANNEL in case $CHANNEL in
edge) edge)
DASHBOARD=testnet-monitor-edge DASHBOARD=cluster-telemetry-edge
;; ;;
beta) beta)
DASHBOARD=testnet-monitor-beta DASHBOARD=cluster-telemetry-beta
;; ;;
stable) stable)
DASHBOARD=testnet-monitor DASHBOARD=cluster-telemetry
;; ;;
*) *)
echo "Error: Invalid CHANNEL=$CHANNEL" echo "Error: Invalid CHANNEL=$CHANNEL"
@ -31,7 +31,7 @@ if [[ -z $GRAFANA_API_TOKEN ]]; then
exit 1 exit 1
fi fi
DASHBOARD_JSON=scripts/grafana-provisioning/dashboards/testnet-monitor.json DASHBOARD_JSON=scripts/grafana-provisioning/dashboards/cluster-monitor.json
if [[ ! -r $DASHBOARD_JSON ]]; then if [[ ! -r $DASHBOARD_JSON ]]; then
echo Error: $DASHBOARD_JSON not found echo Error: $DASHBOARD_JSON not found
fi fi


@ -21,7 +21,7 @@ with open(dashboard_json, 'r') as read_file:
data = json.load(read_file) data = json.load(read_file)
if channel == 'local': if channel == 'local':
data['title'] = 'Local Testnet Monitor' data['title'] = 'Local Cluster Monitor'
data['uid'] = 'local' data['uid'] = 'local'
data['links'] = [] data['links'] = []
data['templating']['list'] = [{'current': {'text': '$datasource', data['templating']['list'] = [{'current': {'text': '$datasource',
@ -66,10 +66,9 @@ if channel == 'local':
'useTags': False}] 'useTags': False}]
elif channel == 'stable': elif channel == 'stable':
# Stable dashboard only allows the user to select between the stable # Stable dashboard only allows the user to select between public clusters
# testnet databases data['title'] = 'Cluster Telemetry'
data['title'] = 'Testnet Monitor' data['uid'] = 'monitor'
data['uid'] = 'testnet'
data['templating']['list'] = [{'current': {'text': '$datasource', data['templating']['list'] = [{'current': {'text': '$datasource',
'value': '$datasource'}, 'value': '$datasource'},
'hide': 1, 'hide': 1,
@ -81,20 +80,26 @@ elif channel == 'stable':
'regex': '', 'regex': '',
'type': 'datasource'}, 'type': 'datasource'},
{'allValue': None, {'allValue': None,
'current': {'text': 'testnet', 'current': {'text': 'Developer Testnet',
'value': 'testnet'}, 'value': 'devnet'},
'hide': 1, 'hide': 1,
'includeAll': False, 'includeAll': False,
'label': 'Testnet', 'label': 'Testnet',
'multi': False, 'multi': False,
'name': 'testnet', 'name': 'testnet',
'options': [{'selected': False, 'options': [{'selected': True,
'text': 'testnet', 'text': 'Developer Testnet',
'value': 'testnet'}, 'value': 'devnet'},
{'selected': True, {'selected': False,
'text': 'testnet-perf', 'text': 'Mainnet Beta',
'value': 'testnet-perf'}], 'value': 'mainnet-beta'},
'query': 'testnet,testnet-perf', {'selected': False,
'text': 'Tour de SOL Testnet',
'value': 'tds'},
{'selected': False,
'text': 'Soft Launch Testnet',
'value': 'cluster'}],
'query': 'devnet,mainnet-beta,tds,cluster',
'type': 'custom'}, 'type': 'custom'},
{'allValue': ".*", {'allValue': ".*",
'datasource': '$datasource', 'datasource': '$datasource',
@ -114,10 +119,9 @@ elif channel == 'stable':
'type': 'query', 'type': 'query',
'useTags': False}] 'useTags': False}]
else: else:
# Non-stable dashboard only allows the user to select between all testnet # Non-stable dashboard includes all the dev clusters
# databases data['title'] = 'Cluster Telemetry ({})'.format(channel)
data['title'] = 'Testnet Monitor ({})'.format(channel) data['uid'] = 'monitor-' + channel
data['uid'] = 'testnet-' + channel
data['templating']['list'] = [{'current': {'text': '$datasource', data['templating']['list'] = [{'current': {'text': '$datasource',
'value': '$datasource'}, 'value': '$datasource'},
'hide': 1, 'hide': 1,
@ -129,8 +133,8 @@ else:
'regex': '', 'regex': '',
'type': 'datasource'}, 'type': 'datasource'},
{'allValue': ".*", {'allValue': ".*",
'current': {'text': 'testnet', 'current': {'text': 'Developer Testnet',
'value': 'testnet'}, 'value': 'devnet'},
'datasource': '$datasource', 'datasource': '$datasource',
'hide': 1, 'hide': 1,
'includeAll': False, 'includeAll': False,
@ -140,7 +144,7 @@ else:
'options': [], 'options': [],
'query': 'show databases', 'query': 'show databases',
'refresh': 1, 'refresh': 1,
'regex': 'testnet.*', 'regex': '(devnet|cluster|tds|mainnet-beta|testnet.*)',
'sort': 1, 'sort': 1,
'tagValuesQuery': '', 'tagValuesQuery': '',
'tags': [], 'tags': [],
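
To see what the widened `regex` lets through, here is a rough sketch that applies the new pattern to a list of database names. It assumes an unanchored match, using Python's `re.search` as a stand-in; Grafana's own matching rules may differ in detail. The database names and the helper `selectable_databases` are made up for illustration.

```python
import re

# The pattern introduced by this commit for the non-stable dashboards.
DATABASE_REGEX = re.compile(r"(devnet|cluster|tds|mainnet-beta|testnet.*)")


def selectable_databases(names):
    """Return the names the pattern accepts, sorted alphabetically."""
    # re.search is an assumption: it accepts a match anywhere in the name.
    return sorted(name for name in names if DATABASE_REGEX.search(name))


if __name__ == "__main__":
    # Hypothetical result of `show databases`.
    databases = ["_internal", "devnet", "tds", "mainnet-beta",
                 "testnet-edge", "cluster", "scratch"]
    print(selectable_databases(databases))
```

Under the old pattern, `testnet.*`, only `testnet-edge` would have been offered from this hypothetical list.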


@ -27,21 +27,21 @@
"title": "Stable", "title": "Stable",
"tooltip": "", "tooltip": "",
"type": "link", "type": "link",
"url": "https://metrics.solana.com:3000/d/testnet/testnet-monitor" "url": "https://metrics.solana.com:3000/d/monitor/cluster-telemetry"
}, },
{ {
"icon": "dashboard", "icon": "dashboard",
"tags": [], "tags": [],
"title": "Beta", "title": "Beta",
"type": "link", "type": "link",
"url": "https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta" "url": "https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta"
}, },
{ {
"icon": "dashboard", "icon": "dashboard",
"tags": [], "tags": [],
"title": "Edge", "title": "Edge",
"type": "link", "type": "link",
"url": "https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge" "url": "https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge"
} }
], ],
"panels": [ "panels": [
@ -4618,7 +4618,7 @@
}, },
"yaxes": [ "yaxes": [
{ {
"format": "µs", "format": "\u00b5s",
"label": null, "label": null,
"logBase": 1, "logBase": 1,
"max": null, "max": null,
@ -5385,7 +5385,7 @@
}, },
"yaxes": [ "yaxes": [
{ {
"format": "µs", "format": "\u00b5s",
"label": null, "label": null,
"logBase": 1, "logBase": 1,
"max": null, "max": null,
@ -5752,7 +5752,7 @@
}, },
"yaxes": [ "yaxes": [
{ {
"format": "µs", "format": "\u00b5s",
"label": null, "label": null,
"logBase": 1, "logBase": 1,
"max": null, "max": null,
@ -6727,7 +6727,7 @@
}, },
"yaxes": [ "yaxes": [
{ {
"format": "µs", "format": "\u00b5s",
"label": null, "label": null,
"logBase": 1, "logBase": 1,
"max": null, "max": null,
@ -10181,7 +10181,6 @@
"list": [ "list": [
{ {
"current": { "current": {
"selected": true,
"text": "$datasource", "text": "$datasource",
"value": "$datasource" "value": "$datasource"
}, },
@ -10197,9 +10196,8 @@
{ {
"allValue": ".*", "allValue": ".*",
"current": { "current": {
"selected": false, "text": "Developer Testnet",
"text": "testnet", "value": "devnet"
"value": "testnet"
}, },
"datasource": "$datasource", "datasource": "$datasource",
"hide": 1, "hide": 1,
@ -10210,7 +10208,7 @@
"options": [], "options": [],
"query": "show databases", "query": "show databases",
"refresh": 1, "refresh": 1,
"regex": "testnet.*", "regex": "(devnet|cluster|tds|mainnet-beta|testnet.*)",
"sort": 1, "sort": 1,
"tagValuesQuery": "", "tagValuesQuery": "",
"tags": [], "tags": [],
@ -10269,7 +10267,7 @@
] ]
}, },
"timezone": "", "timezone": "",
"title": "Testnet Monitor (edge)", "title": "Cluster Telemetry (edge)",
"uid": "testnet-edge", "uid": "monitor-edge",
"version": 2 "version": 2
} }
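
The four `"format": "µs"` to `"format": "\u00b5s"` hunks change only how the string is written, not its value. The short check below illustrates this with Python's standard `json` module. That the file was re-serialized by a tool that escapes non-ASCII characters (as `json.dumps` does by default) is a guess, not something the commit states.

```python
import json

# Both spellings of the axis format as they appear in the diff.
literal = json.loads('{"format": "µs"}')
escaped = json.loads(r'{"format": "\u00b5s"}')

# The parsed values are identical, so the dashboard behaves the same.
same_value = literal == escaped

# By default json.dumps escapes non-ASCII characters, which would
# produce the escaped spelling seen on the new side of the diff.
reserialized = json.dumps(literal)

if __name__ == "__main__":
    print(same_value)
    print(reserialized)
```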


@ -34,7 +34,7 @@ source lib/config.sh
if [[ ! -f lib/grafana-provisioning ]]; then if [[ ! -f lib/grafana-provisioning ]]; then
cp -va grafana-provisioning lib cp -va grafana-provisioning lib
./adjust-dashboard-for-channel.py \ ./adjust-dashboard-for-channel.py \
lib/grafana-provisioning/dashboards/testnet-monitor.json local lib/grafana-provisioning/dashboards/cluster-monitor.json local
mkdir -p lib/grafana-provisioning/datasources mkdir -p lib/grafana-provisioning/datasources
cat > lib/grafana-provisioning/datasources/datasource.yml <<EOF cat > lib/grafana-provisioning/datasources/datasource.yml <<EOF

View File

@ -106,7 +106,7 @@ function upload_results_to_slack() {
BUILDKITE_BUILD_URL="https://buildkite.com/solana-labs/" BUILDKITE_BUILD_URL="https://buildkite.com/solana-labs/"
fi fi
GRAFANA_URL="https://metrics.solana.com:3000/d/testnet-${CHANNEL:-edge}/testnet-monitor-${CHANNEL:-edge}?var-testnet=${TESTNET_TAG:-testnet-automation}&from=${TESTNET_START_UNIX_MSECS:-0}&to=${TESTNET_FINISH_UNIX_MSECS:-0}" GRAFANA_URL="https://metrics.solana.com:3000/d/monitor-${CHANNEL:-edge}/cluster-telemetry-${CHANNEL:-edge}?var-testnet=${TESTNET_TAG:-testnet-automation}&from=${TESTNET_START_UNIX_MSECS:-0}&to=${TESTNET_FINISH_UNIX_MSECS:-0}"
[[ -n $RESULT_DETAILS ]] || RESULT_DETAILS="Undefined" [[ -n $RESULT_DETAILS ]] || RESULT_DETAILS="Undefined"
[[ -n $TEST_CONFIGURATION ]] || TEST_CONFIGURATION="Undefined" [[ -n $TEST_CONFIGURATION ]] || TEST_CONFIGURATION="Undefined"