Osmosis is an automated market maker for interchain assets. Over the past 7 months, adoption has continued to accelerate, reaching nearly $1.5B in TVL as of the time of writing. The AMM supports 33 unique assets and continues to add more as new chains join IBC.
Osmosis differs from other Cosmos chains in its use of an epochs module. The epochs module hooks into the incentives and mint keepers to distribute various rewards once a day. As the network and the number of incentivized pools have grown, the time to compute the epoch block and produce a NewHeight has increased to roughly 20 minutes.
New users come to Osmosis every day and stay for its ease of use, access to many new assets, and incredible speed. The epoch block takes new users by surprise and can be a negative experience. With more AMMs arriving in the IBC ecosystem and giving users a wider range of choices, the need to reduce the impact of the daily epoch is growing.
The impact of the epoch block has undergone a round of intense discussion and has been well analyzed in the past. The goal of this document is to provide additional data.
In the following sections, I'll describe the procedure for measuring the epoch, analyze the data, and provide some thoughts on where we can go next.
To measure the timing of a block's creation, we need to understand a bit about how blocks are created. All Tendermint-based chains rely on a Byzantine consensus algorithm, run by peers operating as validators, to determine the next block. The round-based protocol follows these state transitions to produce a NewHeight:
                             +-------------------------------------+
                             v                                     |(Wait til `CommitTime+timeoutCommit`)
                       +-----------+                         +-----+-----+
          +----------> |  Propose  +--------------+          | NewHeight |
          |            +-----------+              |          +-----------+
          |                                       |                ^
          |(Else, after timeoutPrecommit)         v                |
    +-----+-----+                           +-----------+          |
    | Precommit |  <------------------------+  Prevote  |          |
    +-----+-----+                           +-----------+          |
          |(When +2/3 Precommits for block found)                  |
          v                                                        |
    +--------------------------------------------------------------------+
    |  Commit                                                            |
    |                                                                    |
    |  * Set CommitTime = now;                                           |
    |  * Wait for block, then stage/save/commit block;                   |
    +--------------------------------------------------------------------+
Prevotes include a timestamp, which can be used to measure the arrival of votes. Let's take a look at a snippet of the prevotes from the round state for block 2834022. To do this, we'll need to capture the output of consensus_state over time. I'll be using the following script to capture unique steps per block. The script runs once every 0.1 seconds, which is fine enough granularity for 6-second block arrival.
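#!/bin/bash
# Snapshot the node's consensus state into a timestamped temporary file.
NOW_JSON="$(date +%s.%N).json"
curl -s localhost:26657/consensus_state > "${NOW_JSON}"
# Rename the file after its height/round/step, e.g. "2834022/0/6" becomes 2834022_0_6.json.
# A later snapshot of the same step overwrites the earlier one.
HRS="$(jq -r '.result.round_state["height/round/step"]' "${NOW_JSON}" | tr '/' '_')"
mv "${NOW_JSON}" "${HRS}.json"
echo "${HRS}"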
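The polling behaviour described above (sample every 0.1 seconds, keep one snapshot per height/round/step) can also be sketched in Python. This is an illustration rather than the tooling used for the measurements: the names `step_key` and `capture` are hypothetical, and `fetch` is any caller-supplied function that returns the decoded consensus_state response.

```python
import time
from typing import Callable, Dict, Optional


def step_key(snapshot: dict) -> str:
    """Turn "2834022/0/6" into "2834022_0_6", as the shell script does with tr."""
    hrs = snapshot["result"]["round_state"]["height/round/step"]
    return hrs.replace("/", "_")


def capture(fetch: Callable[[], dict], interval: float = 0.1,
            iterations: Optional[int] = None) -> Dict[str, dict]:
    """Poll `fetch` and keep the latest snapshot seen for each height/round/step.

    `iterations=None` polls forever; a finite value is handy for experiments.
    """
    latest: Dict[str, dict] = {}
    count = 0
    while iterations is None or count < iterations:
        snapshot = fetch()
        # A later poll of the same step overwrites the earlier one, so the
        # stored snapshot is the most complete view of that step.
        latest[step_key(snapshot)] = snapshot
        count += 1
        time.sleep(interval)
    return latest
```

Against a local node, `fetch` could be something like `lambda: json.load(urllib.request.urlopen("http://localhost:26657/consensus_state"))`, using the same endpoint as the script above.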
Next, to extract the prevotes we'll need to traverse the JSON document. For [block 2834022](https://www.mintscan.io/osmosis/blocks/2834022), I'll use the last step, which should have the most complete information and is stored in 2834022_0_6.json. Using jq, we can easily extract the prevotes:
jq -r '.result.round_state.height_vote_set[0].prevotes[]' 2834022_0_6.json
Here's a snippet of that data.
Vote{0:CB5A63B91E8F 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 E919E6BD75BD @ 2022-01-17T11:55:03.532273386Z}
Vote{1:16A169951A87 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 FF18B06CC26E @ 2022-01-17T11:55:03.544551492Z}
Vote{2:9D0281786872 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 B70890682857 @ 2022-01-17T11:55:03.00362945Z}
Vote{3:66B69666EBF7 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 8F66A22BAA82 @ 2022-01-17T11:55:03.385849207Z}
Vote{4:76F706AE73A8 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 6DA36F8194AD @ 2022-01-17T11:55:03.352144029Z}
Vote{5:03C016AB7EC3 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 E5364A2E784C @ 2022-01-17T11:55:03.321100618Z}
Vote{6:6239A498C22D 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 8156037496DC @ 2022-01-17T11:55:03.316470356Z}
Vote{7:844290531EE5 2834022/00/SIGNED_MSG_TYPE_PREVOTE(Prevote) AD67900B1097 5C3E87DFC638 @ 2022-01-17T11:55:03.392949049Z}
Each of these lines is generated by the Vote String() method. Let's break one down:
Field | Value |
---|---|
ValidatorIndex | 0 |
FingerPrint of Validator Address | CB5A63B91E8F |
Height | 2834022 |
Round | 00 |
Type | SIGNED_MSG_TYPE_PREVOTE |
Type String | Prevote |
FingerPrint of Blockhash | AD67900B1097 |
FingerPrint of Signature | E919E6BD75BD |
Canonical Time of Vote Timestamp | 2022-01-17T11:55:03.532273386Z |
We now have the ability to associate a validator's address with the timestamp of their prevote.
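As a minimal sketch of that association, the vote strings can be split into the fields from the table with a regular expression. The `Prevote` class and `parse_vote` function are hypothetical names introduced for illustration; the pattern is derived only from the sample lines above, and entries that do not match it (assumed here to be placeholders for validators whose vote has not arrived) are returned as `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the sample output above, not from a formal grammar.
VOTE_RE = re.compile(
    r"Vote\{(?P<index>\d+):(?P<validator>[0-9A-F]+) "
    r"(?P<height>\d+)/(?P<round>\d+)/(?P<type>\w+)\((?P<type_str>\w+)\) "
    r"(?P<block>[0-9A-F]+) (?P<signature>[0-9A-F]+) @ (?P<time>\S+)\}"
)


@dataclass(frozen=True)
class Prevote:
    validator_index: int
    validator: str   # fingerprint of the validator address
    height: int
    vote_round: int
    block: str       # fingerprint of the block hash
    time: str        # timestamp exactly as printed in the vote string


def parse_vote(line: str) -> Optional[Prevote]:
    """Parse one prevote string; return None for anything that does not match."""
    m = VOTE_RE.fullmatch(line.strip())
    if m is None:
        return None
    return Prevote(
        validator_index=int(m["index"]),
        validator=m["validator"],
        height=int(m["height"]),
        vote_round=int(m["round"]),
        block=m["block"],
        time=m["time"],
    )
```

Feeding each line of the jq output through `parse_vote` and discarding the `None` results gives one record per validator that has prevoted.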
We can then chart this data as a histogram. Below is a histogram of each validator's time from block proposal to prevote, with the 5% outliers filtered out:
Let's compare that histogram with an epoch block:
The primary thing to note in that graph is the dramatic change in X-axis range: prevotes arrive in under a second in the first graph, while in the second graph they range from 50 seconds to 20 minutes.
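For readers who want to reproduce numbers like these, here is a rough sketch of how prevote delays and histogram bins could be computed from the vote timestamps. Several things are assumptions on my part: the proposal time is passed in as a reference timestamp (how it was obtained is not covered here), short fractional seconds such as `.00362945` are treated as having trailing zeros trimmed, and "filtering out 5% outliers" is interpreted as dropping the slowest 5% of votes. All function names are hypothetical.

```python
import re
from datetime import datetime, timezone
from typing import Dict, List

_TS_RE = re.compile(r"(\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d)(?:\.(\d{1,9}))?Z")


def to_nanos(ts: str) -> int:
    """Convert a UTC timestamp with up to 9 fractional digits to integer nanoseconds."""
    m = _TS_RE.fullmatch(ts)
    if m is None:
        raise ValueError(f"unexpected timestamp: {ts!r}")
    whole = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Assumption: trailing zeros were trimmed, so pad the fraction on the right.
    frac = (m.group(2) or "").ljust(9, "0")
    return int(whole.timestamp()) * 1_000_000_000 + int(frac)


def delays_seconds(vote_times: List[str], proposal_time: str) -> List[float]:
    """Seconds elapsed between the reference proposal time and each prevote."""
    start = to_nanos(proposal_time)
    return [(to_nanos(t) - start) / 1e9 for t in vote_times]


def trim_slowest(delays: List[float], fraction: float = 0.05) -> List[float]:
    """Drop the slowest `fraction` of delays (one reading of 'filtering outliers')."""
    keep = len(delays) - int(len(delays) * fraction)
    return sorted(delays)[:keep]


def histogram(delays: List[float], bin_width: float) -> Dict[int, int]:
    """Count delays per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    counts: Dict[int, int] = {}
    for d in delays:
        k = int(d // bin_width)
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))
```

A sub-second bin width suits a regular block, while an epoch block spanning up to 20 minutes is easier to read with bins of a minute or more.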
The histogram helps us understand the average arrival rate. We can see that 36 validators are able to complete the epoch block in under 5 minutes. However, the histogram doesn't let us easily visualize the prevote arrival rate against a validator's delegations.
Let's use a bubble chart for this. We'll keep the X-axis representing prevote arrival time. Since we can use other dimensions of data, let's use the validator's power to represent both size and Y-axis position.
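The data behind such a bubble chart could be assembled roughly as follows. This is a sketch under assumptions: the validator set is taken to be a list of records with a full hex `address` and a `voting_power` string, and a vote's validator fingerprint is taken to be the first 12 hex characters of that address, as suggested by the sample vote strings. `Bubble` and `bubble_points` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class Bubble(NamedTuple):
    validator: str  # address fingerprint, as printed in the vote string
    delay: float    # X-axis: seconds from proposal to prevote
    power: int      # Y-axis position and bubble size


def bubble_points(delay_by_fingerprint: Dict[str, float],
                  validators: List[dict]) -> List[Bubble]:
    """Join prevote delays with voting power, ordered by arrival time."""
    points = []
    for v in validators:
        # Assumption: the fingerprint is the first 6 bytes (12 hex chars) of the address.
        fingerprint = v["address"][:12].upper()
        if fingerprint in delay_by_fingerprint:
            points.append(
                Bubble(fingerprint, delay_by_fingerprint[fingerprint], int(v["voting_power"]))
            )
    # Validators with no recorded prevote are skipped rather than plotted.
    return sorted(points, key=lambda p: p.delay)
```

Each resulting point maps directly onto one bubble, so any plotting library that accepts x, y, and size arrays can draw the chart.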
This section describes the steps followed for measuring the epoch. Additionally, care will be taken to ensure measurements are made on a node that is not a validator.
The validators and sentries were rapidly synchronized using bootstrap.sh, a script which sets up the required tooling and uses ./scripts/statesync.sh.
After each node was synced, I modified each config.toml to match Tendermint's recommendations. Additionally, I disabled tx_index by setting its indexer to "null".
I've also firewalled off the mainnet validator from the internet so that it's only accessible via SSH (and peering to the sentries).
Below is a table of the nodes, types, specifications, and configurations. Normally one should not expose their validator configuration, but for the purpose of independent verification, I've included it in the table below:
Node | Type | Spec | Location | Config |
---|---|---|---|---|
osmosis-mainnet-validator | validator | Contabo VPS XL 800GB NVME | St. Louis | here |
osmosis-mainnet-sentry-001 | sentry | Contabo VPS XL 800GB NVME | St. Louis | here |
osmosis-mainnet-sentry-002 | sentry | Contabo VPS XL 800GB NVME | St. Louis | here |
osmosis-mainnet-sentry-003 | sentry | Contabo VPS XL 800GB NVME | St. Louis | here |
osmosis-mainnet-node | node | Hetzner AX51-NVME | Helsinki | here |
osmosis-mainnet-node-2 | node | i3en.2xlarge | "Ohio" (us-east-2) | here |
We'll run this script in a tight loop on osmosis-mainnet-node and osmosis-mainnet-node-2 :
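#!/bin/bash
# Snapshot the node's consensus state into a timestamped temporary file.
NOW_JSON="$(date +%s.%N).json"
curl -s localhost:26657/consensus_state > "${NOW_JSON}"
# Rename the file after its height/round/step, e.g. "2834022/0/6" becomes 2834022_0_6.json.
# A later snapshot of the same step overwrites the earlier one.
HRS="$(jq -r '.result.round_state["height/round/step"]' "${NOW_JSON}" | tr '/' '_')"
mv "${NOW_JSON}" "${HRS}.json"
echo "${HRS}"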
Using Contabo VPS:
Node | Type | Spec | Location |
---|---|---|---|
osmosis-mainnet-validator | validator | Contabo VPS XL 800GB NVME | St. Louis |
osmosis-mainnet-sentry-001 | sentry | Contabo VPS XL 800GB NVME | St. Louis |
osmosis-mainnet-sentry-002 | sentry | Contabo VPS XL 800GB NVME | St. Louis |
osmosis-mainnet-sentry-003 | sentry | Contabo VPS XL 800GB NVME | St. Louis |
osmosis-mainnet-node | node | Hetzner AX51-NVME | Helsinki |
osmosis-mainnet-node-2 | node | i3en.2xlarge | "Ohio" (us-east-2) |
Using Hetzner Dedicated Servers:
Node | Type | Spec | Location | Config |
---|---|---|---|---|
osmosis-mainnet-validator | validator | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-001 | sentry | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-002 | sentry | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-003 | sentry | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-node-2 | node | i3en.2xlarge | "Ohio" (us-east-2) | |
Using Hetzner Dedicated Servers:
Node | Type | Spec | Location | Config |
---|---|---|---|---|
osmosis-mainnet-validator | validator | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-001 | sentry | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-002 | sentry | Hetzner AX51-NVME | Helsinki | |
osmosis-mainnet-sentry-003 | sentry | Hetzner AX51-NVME | Helsinki | |
backup-osmosis-mainnet-sentry-001 | sentry | Contabo VPS XL 800GB NVME | St. Louis | |
backup-osmosis-mainnet-sentry-002 | sentry | Contabo VPS XL 800GB NVME | St. Louis | |
backup-osmosis-mainnet-sentry-003 | sentry | Contabo VPS XL 800GB NVME | St. Louis | |
backup-osmosis-mainnet-validator | node | Contabo VPS XL 800GB NVME | St. Louis | |
On osmosis-mainnet-sentry-003, I've configured all available peers from the #peers-list channel as persistent_peers. I've also made a few additional tweaks to the config:
- Attempt to have many connections
# Maximum number of outbound peers to connect to, excluding persistent peers
max_num_outbound_peers = 320
- Attempt to avoid exponential backoff
# Maximum pause when redialing a persistent peer (if zero, exponential backoff is used)
persistent_peers_max_dial_period = "1s"
- Make node very impatient.
# Peer connection configuration.
handshake_timeout = "5s"
dial_timeout = "1s"
My theory is that if the Osmosis chain has a few nodes like this, they can help rapidly rebuild the p2p network.
Neither positive nor negative:
Observation: Very few IPv6 peers are online.
Unchanged setup from 1/19/2022
The 900-1200 range appears a bit more sparsely populated!
Yesterday @valardragon asked:
do we know for the epoch, how much of the time is spent in commit vs execution?
I've set logs to ERROR per the set of optimizations recommended, so I'll reset that back to INFO so I can get that data today.
I've instrumented ApplyBlock within Tendermint in a small branch off of osmosis to better see what is going on internally. Here's a sample of block 2888662
As of this writing, Contabo does not have IPv6 support.
Lower the precedence of IPv6 address resolution in /etc/gai.conf:
# For sites which prefer IPv4 connections change the last line to
#
precedence ::ffff:0:0/96 100