@webmaster128
Last active November 5, 2023 21:35
Reducing size of CosmWasm storage key encoding

Length-prefixed key namespacing, version 2

In https://github.com/CosmWasm/cosmwasm/blob/main/docs/STORAGE_KEYS.md you can see the standard storage key encoding used in CosmWasm from 2020 to 2023. In the meantime we figured out that using short keys is important for the performance of the underlying database. How much it affects performance, I don't know. But the more I learn about how those databases and their caching work, the more I'm convinced this is the right direction.

CosmWasm developers love performance. It allows them to build things that are hard or impossible to build in other systems. We want to make it easy for our developers to reach peak performance.

The encoding

Version 1 of length-prefixed key namespacing, as initially discussed in webmaster128/key-namespacing, has a few inefficiencies:

  1. Each key component supports a length of up to 65535 bytes (64 KiB), which is way too much to be efficient at scale. We should discourage such large keys.
  2. The 2-byte encoding of the length means that every component takes component.len() + 2 bytes, which is a lot of overhead for short components.

This is particularly wasteful if you try to build a storage-efficient data model. See e.g.

// Map from ID to address
const FASTMAP: Map<u16, Addr> = Map::new("f");

in which an example key looks like this

  b"\x00\x01f\x00\x02\xA1\xB2"
//  ^^^^^^^^                  two byte length
//          ^                 one byte "f"
//           ^^^^^^^^         two byte length
//                   ^^^^^^^^ two bytes 41394u16

making it 7 bytes to encode "f"/41394. This overhead will be present in every single database entry.
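
For illustration, here is a minimal sketch of how such a key could be assembled under the current scheme. The function name encode_v1 and its signature are mine, not the cw-storage-plus API, but it mirrors the example above by prefixing every component with its length as a 2-byte big-endian integer.

// Sketch of the version 1 encoding: every component gets a 2-byte
// big-endian length prefix, matching the byte layout shown above.
fn encode_v1(components: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for &component in components {
        let len = u16::try_from(component.len()).expect("component exceeds 65535 bytes");
        out.extend_from_slice(&len.to_be_bytes()); // 2-byte length
        out.extend_from_slice(component);
    }
    out
}

// For the example above, encoding the components b"f" and b"\xA1\xB2"
// produces b"\x00\x01f\x00\x02\xA1\xB2" (7 bytes).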

The first obvious improvement would be reducing the length encoding from two bytes to one, which limits the max length of a component to 255 bytes. This would bring our 7-byte key down to 5 bytes. But usually we don't need keys that long. A varint could be used to get a short encoding for the most common small values. But why not go further and write 1-byte components without any length prefix at all? The first ~half of the one-byte range can be reserved for those 1-byte values, and the other half can hold component lengths.

To accommodate 1-byte keys as well as a length value for larger keys, we use the following schema (decoding):

Next byte b (u8)   Interpretation
0 - 126            len: 1; value: set to b
127 - 249          len: b - 127; value: read len more bytes
250                len: next byte + 127; value: read len more bytes
251                len: next byte + 127 + 1 * 256; value: read len more bytes
252                len: next byte + 127 + 2 * 256; value: read len more bytes
253                len: next byte + 127 + 3 * 256; value: read len more bytes
254                len: next byte + 127 + 4 * 256; value: read len more bytes
255                len: next byte + 127 + 5 * 256; value: read len more bytes

This allows key components from 0 bytes to 1662 bytes but favours the encoding of 1-byte components with values from NUL (0x00) to "~" (0x7E), including all printable ASCII characters. This allows enumerated top-level keys of up to 127 elements to be stored in the most efficient way.
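
To make the table concrete, here is a minimal decoder sketch for the scheme above; the function name and signature are illustrative, not an existing API.

// Decode one key component according to the table above.
// Returns the component and the remaining input, or None on truncated input.
fn decode_component(input: &[u8]) -> Option<(Vec<u8>, &[u8])> {
    let (&b, rest) = input.split_first()?;
    // 0-126: the byte is itself a complete 1-byte component
    if b <= 126 {
        return Some((vec![b], rest));
    }
    let (len, rest) = if b <= 249 {
        // 127-249: single length byte, length is b - 127
        ((b - 127) as usize, rest)
    } else {
        // 250-255: length is next byte + 127 + (b - 250) * 256
        let (&next, rest) = rest.split_first()?;
        (next as usize + 127 + (b as usize - 250) * 256, rest)
    };
    if rest.len() < len {
        return None;
    }
    Some((rest[..len].to_vec(), &rest[len..]))
}

// Example: decoding b"f\x81\xA1\xB2" first yields the 1-byte component b"f"
// (byte 102 <= 126), then the length byte 0x81 (= 129 = 2 + 127) announces a
// 2-byte component, yielding b"\xA1\xB2".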

The above example would then be encoded as 4 bytes only:

   b"f\x02\xA1\xB2"
//  ^             one byte "f"
//   ^^^^         one byte length
//       ^^^^^^^^ two bytes 41394u16

The length can be encoded as 0, 1 or 2 bytes:

Key component                  Length encoding
length 1 and value <= 126u8    0 bytes
length <= 122                  1 byte
length <= 1662                 2 bytes
length > 1662                  unsupported
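
The matching encoder side can be sketched like this (illustrative names, not an existing API), emitting 0, 1, or 2 length bytes as summarized above.

// Append one key component to `out` using the proposed encoding.
fn encode_component(out: &mut Vec<u8>, component: &[u8]) {
    match component.len() {
        // single byte with value 0-126: written as-is, no length prefix
        1 if component[0] <= 126 => out.push(component[0]),
        // lengths 0-122: one length byte in the range 127-249
        len if len <= 122 => {
            out.push(127 + len as u8);
            out.extend_from_slice(component);
        }
        // lengths 127-1662: two length bytes, the first in the range 250-255
        len if (127..=1662).contains(&len) => {
            let extra = len - 127;
            out.push(250 + (extra / 256) as u8);
            out.push((extra % 256) as u8);
            out.extend_from_slice(component);
        }
        // lengths 123-126 have no representation in the table above, and
        // anything longer than 1662 bytes is unsupported
        _ => panic!("unsupported key component length"),
    }
}

// Encoding the example: "f" needs no prefix, the 2-byte u16 gets length byte
// 127 + 2 = 129 (0x81), resulting in b"f\x81\xA1\xB2" (4 bytes).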

Benefits

  • Slightly shorter keys on average
  • Much shorter keys in optimized scenarios

Drawbacks

  • Slightly more complicated design, as one byte can be either a value or a length

Neutral

  • Reduces the max component length from 64 KiB to 1662 bytes. This can be limiting, but if you hit this limit you are probably doing something wrong anyway. Longer keys are possible if spread across multiple key components.

Further remarks

  1. The above encoding is never longer than the previous one; there is no case in which a key becomes larger. This is primarily due to the length restriction.
  2. The encoding can be implemented using the same API as today, leading to no inconvenience for users.
  3. Top-key enum solutions as shown in https://github.com/noislabs/nois-contracts/pull/268/files can be streamlined by functions or macros (see the sketch below). This gives you a convenient way to get 1-byte key components without the risk of collisions.
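
A hedged sketch of what such a helper could look like, assuming a central enum whose variants are single printable ASCII bytes; the names and the as_str helper are hypothetical, not the API from the linked PR.

// Hypothetical central registry of 1-byte top-level key components.
// Each variant is a value in the 0-126 range, so it needs no length prefix,
// and keeping them in one enum avoids accidental namespace collisions.
#[repr(u8)]
pub enum TopKey {
    Config = b'c',
    FastMap = b'f',
    Participants = b'p',
}

impl TopKey {
    pub const fn as_str(&self) -> &'static str {
        match self {
            TopKey::Config => "c",
            TopKey::FastMap => "f",
            TopKey::Participants => "p",
        }
    }
}

// Usage with a cw-storage-plus style constructor taking a &str namespace:
// const FASTMAP: Map<u16, Addr> = Map::new(TopKey::FastMap.as_str());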