@tevador
Last active December 10, 2024 20:03

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for a general audience.

Chapters 3-7 contain technical specifications.

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. RID consists of 25 alphanumeric characters that are separated by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RID. RIDs are also suitable to be communicated via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below view-only wallet. One of the new wallet tiers called "FindReceived" is intended for wallet-scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

The Janus attack is a targeted attack that aims to determine whether two addresses A and B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that the recipient owns both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serializes a 256-bit integer to a 32-byte little-endian output.
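
The two serialization functions can be sketched in Python as follows. This is only an illustration; the snake_case names and the length check are assumptions, not part of the specification.

```python
def bytes_to_int256(x: bytes) -> int:
    """Deserialize a 256-bit little-endian integer from a 32-byte input."""
    if len(x) != 32:
        raise ValueError("expected exactly 32 bytes")
    return int.from_bytes(x, "little")


def int256_to_bytes(n: int) -> bytes:
    """Serialize a 256-bit integer to a 32-byte little-endian output."""
    # to_bytes raises OverflowError if n does not fit into 32 bytes
    return n.to_bytes(32, "little")


nine = bytes([9]) + bytes(31)          # 0x09 followed by 31 zero bytes
print(bytes_to_int256(nine))           # 9
```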

3.2 Hash function

The function Hb(k, x), where the subscript b is the output length in bytes and k is an optional key, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)
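
A minimal sketch of Hb and SecretDerive using Python's standard `hashlib.blake2b`, which exposes the output length, key and personalization parameters directly. The function names `H` and `secret_derive` are illustrative, and treating an empty key as "null" is an assumption.

```python
import hashlib


def H(b: int, k: bytes, x: bytes) -> bytes:
    """Blake2b with b-byte output, optional key k and personalization "Monero"."""
    return hashlib.blake2b(
        x,
        digest_size=b,          # output length in bytes
        key=k or b"",           # empty key means unkeyed hashing (assumption)
        person=b"Monero",       # hashlib zero-pads this to 16 bytes
    ).digest()


def secret_derive(k: bytes, x: bytes) -> bytes:
    """SecretDerive(k, x) = H32(k, x)."""
    return H(32, k, x)


print(secret_derive(b"\x01" * 32, b"example").hex())
```

Sequential mode is the default in `hashlib.blake2b`, so no tree-hashing parameters need to be set.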

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2^252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

| Point | Derivation | Serialized (hex) |
|---|---|---|
| B | generator of 𝔾1 | 0900000000000000000000000000000000000000000000000000000000000000 |

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo ℓ is calculated as d^-1 = 8*(8*d)^-1 to preserve the aforementioned property.
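
The derivation steps and the inverse formula can be sketched as below. This is an illustration under stated assumptions: the hash is Python's `hashlib.blake2b` configured as in § 3.2, the key is read as a little-endian integer, and the function names are hypothetical.

```python
import hashlib

# prime order of the subgroups (§ 3.3)
L = 2**252 + 27742317777372353535851937790883648493


def key_derive1(k: bytes, x: bytes) -> bytes:
    """KeyDerive1: H32(k, x) with the top bit and the low 3 bits cleared."""
    d = bytearray(hashlib.blake2b(x, digest_size=32, key=k, person=b"Monero").digest())
    d[31] &= 0x7F   # clear the most significant bit
    d[0] &= 0xF8    # clear the least significant 3 bits
    return bytes(d)


def inverse_mod_l(d: int) -> int:
    """d^-1 = 8 * (8*d)^-1 mod l, left unreduced so it stays a multiple of 8."""
    return 8 * pow(8 * d, -1, L)


d = int.from_bytes(key_derive1(b"\x01" * 32, b"example"), "little")
d_inv = inverse_mod_l(d)
print(d % 8, d_inv % 8, (d * d_inv) % L)   # 0 0 1
```

The product d * d_inv reduces to (8d) * (8d)^-1 = 1 modulo ℓ, while the explicit factor of 8 keeps the inverse a multiple of the cofactor.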

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

| Point | Derivation | Serialized (hex) |
|---|---|---|
| G | generator of 𝔾2 | 5866666666666666666666666666666666666666666666666666666666666666 |
| U | Hp("seraphis U") | 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730 |
| X | Hp("seraphis X") | 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613 |

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ
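
A minimal sketch of KeyDerive2, assuming the hash of § 3.2 is realized with `hashlib.blake2b` and that the 64-byte digest is interpreted as a little-endian integer before reduction (the text does not state the byte order; little-endian is chosen here for consistency with § 3.1).

```python
import hashlib

# prime order of the subgroups (§ 3.3)
L = 2**252 + 27742317777372353535851937790883648493


def key_derive2(k: bytes, x: bytes) -> int:
    """KeyDerive2(k, x) = H64(k, x) mod l, returned as an integer scalar."""
    h = hashlib.blake2b(x, digest_size=64, key=k, person=b"Monero").digest()
    # assumption: the digest is read as a little-endian integer
    return int.from_bytes(h, "little") % L


k = key_derive2(b"\x01" * 32, b"example")
print(0 <= k < L)   # True
```

Reducing a 512-bit value modulo the roughly 253-bit ℓ keeps the result close to uniformly distributed, which is the usual reason for hashing to 64 bytes first.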

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.
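
As an illustration of this serialization, the sketch below splits a 32-byte encoding into the y-coordinate and the x-parity bit and applies it to the generator G from the table above. The helper name is hypothetical; the check relies on the well-known fact that the standard Ed25519 base point has y = 4/5 modulo p = 2^255 - 19.

```python
P = 2**255 - 19   # field prime of Curve25519 / Ed25519


def decode_ed25519(hex_str: str) -> tuple[int, int]:
    """Split a serialized point into (y, parity of x)."""
    n = int.from_bytes(bytes.fromhex(hex_str), "little")
    y = n & ((1 << 255) - 1)   # lower 255 bits
    x_parity = n >> 255        # most significant bit
    return y, x_parity


G_HEX = "58" + "66" * 31       # serialized generator G
y, x_parity = decode_ed25519(G_HEX)
print((5 * y) % P, x_parity)   # 4 0
```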

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification refers to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. The alphabet excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].
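
The stated properties of the alphabet can be checked mechanically. The sketch below also maps between characters and 5-bit symbol values, in the same way the reference code in Appendix A does with `CHARSET.find`; the helper names are illustrative only.

```python
import string

ALPHABET = "xmrbase32cdfghijknpqtuwy01456789"

has_all_digits = set(string.digits) <= set(ALPHABET)
missing_letters = sorted(set(string.ascii_lowercase) - set(ALPHABET))


def to_symbols(text: str) -> list[int]:
    """Map each character to its 5-bit value (raises ValueError if invalid)."""
    return [ALPHABET.index(c) for c in text]


def from_symbols(values: list[int]) -> str:
    """Map 5-bit values back to characters."""
    return "".join(ALPHABET[v] for v in values)


print(len(set(ALPHABET)), has_all_digits, missing_letters)
# 32 True ['l', 'o', 'v', 'z']
```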

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

| Field | Type | Description |
|---|---|---|
| km | private key | wallet master key |
| kvb | private key | view-balance key |
| birthday | timestamp | date when the wallet was created |

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

| Field | Derivation |
|---|---|
| km | km = BytesToInt256(polyseed_key) mod ℓ |
| kvb | kvb = KeyDerive2(km, "jamtis_view_balance_key") |
| birthday | from Polyseed |

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

| Field | Derivation |
|---|---|
| km | setup ceremony |
| kvb | setup ceremony |
| birthday | setup ceremony |

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

| Field | Derivation |
|---|---|
| km | km = ks |
| kvb | kvb = KeyDerive2(km, "jamtis_view_balance_key") |
| birthday | entered manually |

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

| Field | Derivation |
|---|---|
| km | km = ks |
| kvb | kvb = kv |
| birthday | entered manually |

4.4 Additional keys

There are additional keys derived from kvb:

| Key | Name | Derivation | Used to |
|---|---|---|---|
| dfr | find-received key | dfr = KeyDerive1(kvb, "jamtis_find_received_key") | scan for received outputs |
| dua | unlock-amounts key | dua = KeyDerive1(kvb, "jamtis_unlock_amounts_key") | decrypt output amounts |
| sga | generate-address secret | sga = SecretDerive(kvb, "jamtis_generate_address_secret") | generate addresses |
| sct | cipher-tag secret | sct = SecretDerive(sga, "jamtis_cipher_tag_secret") | encrypt address tags |

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.
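
The derivation chain from kvb down to sct can be sketched as follows. This is an illustration, not a reference implementation: it assumes the hash of § 3.2 is `hashlib.blake2b`, that kvb is passed as the hash key in its Int256ToBytes form, and the function and dictionary names are hypothetical.

```python
import hashlib


def H(b: int, k: bytes, x: bytes) -> bytes:
    return hashlib.blake2b(x, digest_size=b, key=k, person=b"Monero").digest()


def key_derive1(k: bytes, x: bytes) -> bytes:
    d = bytearray(H(32, k, x))
    d[31] &= 0x7F   # clear the most significant bit
    d[0] &= 0xF8    # clear the least significant 3 bits
    return bytes(d)


def derive_additional_keys(k_vb: int) -> dict[str, bytes]:
    """Derive dfr, dua, sga and sct from the view-balance key."""
    kvb_bytes = k_vb.to_bytes(32, "little")   # assumption: Int256ToBytes(kvb)
    s_ga = H(32, kvb_bytes, b"jamtis_generate_address_secret")
    return {
        "d_fr": key_derive1(kvb_bytes, b"jamtis_find_received_key"),
        "d_ua": key_derive1(kvb_bytes, b"jamtis_unlock_amounts_key"),
        "s_ga": s_ga,
        "s_ct": H(32, s_ga, b"jamtis_cipher_tag_secret"),   # child of sga
    }


keys = derive_additional_keys(12345)   # toy value, not a real key
for name, value in keys.items():
    print(name, value.hex())
```

Because sct is derived from sga rather than from kvb, a holder of sga (the AddrGen tier) can compute sct, but not the other way around.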

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

[Figure: key hierarchy]

4.6 Wallet access tiers

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
|---|---|---|---|
| AddrGen | sga | generate public addresses | none |
| FindReceived | dfr | recognize all public wallet addresses | eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except for change and self-spends) |
| ViewReceived | dfr, dua, sga | all | view all received except for change and self-spends (up to § 5.3.14) |
| ViewAll | kvb | all | view all |
| Master | km | all | all |

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.
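
The 99.6% and 256x figures are consistent with a view tag that matches a non-owned output by chance once in 256 tries. The short calculation below assumes an 8-bit view tag, which is what those figures suggest but is not stated explicitly in this section.

```python
VIEW_TAG_BITS = 8   # assumption: one-byte view tag

false_positive_rate = 1 / 2**VIEW_TAG_BITS   # chance a non-owned output passes
eliminated = 1 - false_positive_rate         # share of non-owned outputs filtered out
reduction_factor = 2**VIEW_TAG_BITS          # fewer outputs left to check or download

print(f"{eliminated:.1%}")    # 99.6%
print(reduction_factor)       # 256
```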

4.6.3 Payment validator (ViewReceived)

This tier combines the AddrGen and FindReceived tiers and can see all incoming payments to the wallet, but it cannot see outgoing payments or change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet that can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

| Key | Name | Value |
|---|---|---|
| Ks | wallet spend key | Ks = kvb X + km U |
| Dua | unlock-amounts key | Dua = dua B |
| Dfr | find-received key | Dfr = dfr Dua |

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2^128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].
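
As an illustration of stateless index generation, the sketch below draws j from a random (version 4) UUID using Python's standard `uuid` module. The function name is hypothetical. Note that a version 4 UUID carries 122 random bits, since 6 bits are fixed by the UUID format.

```python
import uuid


def new_address_index() -> int:
    """Return a fresh 128-bit address index j built from a random UUID."""
    return uuid.uuid4().int


j = new_address_index()
print(f"{j:032x}")          # 128-bit index as 32 hex digits
print(j < 2**128)           # True
```

With 122 random bits, the chance that two independently generated indices collide stays negligible even for a merchant that creates billions of addresses.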

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

| Keys | Name | Derivation |
|---|---|---|
| kuj | spend key extensions | kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" \|\| j) |
| kxj | spend key extensions | kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" \|\| j) |
| kgj | spend key extensions | kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" \|\| j) |
| daj | address keys | daj = KeyDerive1(sga, "jamtis_address_privkey" \|\| j) |
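
The per-address private keys can be sketched as below. Several details are assumptions made for the sake of a runnable example: the hash is `hashlib.blake2b` as in § 3.2, "||" is plain byte concatenation, j is appended as 16 little-endian bytes, and the 64-byte digest is read as a little-endian integer. The function names are hypothetical.

```python
import hashlib

L = 2**252 + 27742317777372353535851937790883648493


def H(b: int, k: bytes, x: bytes) -> bytes:
    return hashlib.blake2b(x, digest_size=b, key=k, person=b"Monero").digest()


def key_derive1(k: bytes, x: bytes) -> bytes:
    d = bytearray(H(32, k, x))
    d[31] &= 0x7F
    d[0] &= 0xF8
    return bytes(d)


def key_derive2(k: bytes, x: bytes) -> int:
    return int.from_bytes(H(64, k, x), "little") % L


def derive_address_privkeys(s_ga: bytes, j: int) -> dict:
    """Derive kuj, kxj, kgj (Ed25519 scalars) and daj (Curve25519 key) for index j."""
    j_bytes = j.to_bytes(16, "little")   # assumption: encoding of the 128-bit index
    return {
        "k_u": key_derive2(s_ga, b"jamtis_spendkey_extension_u" + j_bytes),
        "k_x": key_derive2(s_ga, b"jamtis_spendkey_extension_x" + j_bytes),
        "k_g": key_derive2(s_ga, b"jamtis_spendkey_extension_g" + j_bytes),
        "d_a": key_derive1(s_ga, b"jamtis_address_privkey" + j_bytes),
    }


keys = derive_address_privkeys(b"\x07" * 32, j=1)   # toy secret, not a real key
print(sorted(keys))   # ['d_a', 'k_g', 'k_u', 'k_x']
```

Only sga is needed as input, which is why the AddrGen tier can produce addresses without any ability to scan the blockchain.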

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')
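
The tag layout and the hint check can be sketched as follows. Python's standard library has no Twofish implementation, so the encrypted index j' is taken as an opaque 16-byte input here; the function names are hypothetical and the hash is assumed to be `hashlib.blake2b` configured as in § 3.2.

```python
import hashlib


def tag_hint(s_ct: bytes, j_enc: bytes) -> bytes:
    """hj' = H2(sct, "jamtis_address_tag_hint" || j')."""
    return hashlib.blake2b(
        b"jamtis_address_tag_hint" + j_enc,
        digest_size=2, key=s_ct, person=b"Monero",
    ).digest()


def make_address_tag(s_ct: bytes, j_enc: bytes) -> bytes:
    """Build the 18-byte tag tj = (j', hj') from an already encrypted index j'."""
    if len(j_enc) != 16:
        raise ValueError("j' must be one 16-byte cipher block")
    return j_enc + tag_hint(s_ct, j_enc)


def tag_hint_matches(s_ct: bytes, tag: bytes) -> bool:
    """Cheap ownership pre-check: recompute the hint before decrypting j'."""
    return len(tag) == 18 and tag[16:] == tag_hint(s_ct, tag[:16])


s_ct = b"\x05" * 32          # toy secret, not a real key
j_enc = bytes(range(16))     # placeholder for BlockEnc(sct, j)
tag = make_address_tag(s_ct, j_enc)
print(len(tag), tag_hint_matches(s_ct, tag))   # 18 True
```

A tag that belongs to another wallet passes this check with a probability of about 1 in 65536, so the block cipher only has to be run on likely matches.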

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.
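
The byte counts in this section can be reproduced with simple arithmetic. The calculation below is one way to arrive at the stated figures; it assumes the removed encrypted payment ID accounts for 8 bytes and that the encrypted address tag has the same 18-byte size as the tag in § 5.1.2.

```python
TAG_BYTES = 18          # address tag: 16-byte encrypted index + 2-byte hint
PAYMENT_ID_BYTES = 8    # assumption: size of the removed encrypted payment ID
PUBKEY_BYTES = 32       # serialized Curve25519 public key De

# 2-output transactions: two tags added, one payment ID removed
delta_two_outputs = 2 * TAG_BYTES - PAYMENT_ID_BYTES

# 3 or more outputs: one (De, tag) tuple per output
tuple_bytes = PUBKEY_BYTES + TAG_BYTES

print(delta_two_outputs)   # 28
print(tuple_bytes)         # 50
```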

6. Address encoding

6.1 Address structure

An address has the following overall structure:

| Field | Size (bits) | Description |
|---|---|---|
| Header | 30* | human-readable address header (§ 6.2) |
| K1 | 256 | address key 1 |
| D2 | 255 | address key 2 |
| D3 | 255 | address key 3 |
| t | 144 | address tag |
| Checksum | 40* | (§ 6.3) |

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

| network char | network type |
|---|---|
| "t" | testnet |
| "s" | stagenet |
| "m" | mainnet |

The software decoding an address shall abort if an invalid network character is encountered.
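
The header checks from § 6.2 can be sketched as a small validation function. The function name and the use of `ValueError` to signal "abort" are illustrative choices, not part of the specification.

```python
NETWORKS = {"t": "testnet", "s": "stagenet", "m": "mainnet"}


def parse_header(address: str) -> str:
    """Validate the 6-character header and return the network type."""
    if len(address) < 6:
        raise ValueError("address too short")
    if address[:4] != "xmra":
        raise ValueError("missing 'xmra' prefix")
    if address[4] != "1":
        raise ValueError("unsupported version character")
    network = NETWORKS.get(address[5])
    if network is None:
        raise ValueError("invalid network character")
    return network


print(parse_header("xmra1mj0b1977bw3"))   # mainnet
```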

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x^8 + 3x^7 + 11x^6 + 18x^5 + 5x^4 + 25x^3 + 21x^2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 10^12 of not being detected. Reference code for calculating the checksum is in Appendix A.
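
The "1 in 10^12" bound follows from the size of the checksum space. The calculation below assumes that an arbitrarily corrupted address matches its checksum with the same probability as a uniformly random 8-character checksum would.

```python
CHECKSUM_CHARS = 8
ALPHABET_SIZE = 32

checksum_values = ALPHABET_SIZE ** CHECKSUM_CHARS   # 32^8 = 2^40
miss_probability = 1 / checksum_values

print(checksum_values)              # 1099511627776
print(checksum_values > 10**12)     # True
```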

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address is 196 characters (= 6 + 182 + 8).
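
The lengths quoted above follow directly from the field sizes in § 6.1, at 5 bits per base32 character:

```python
BITS_PER_CHAR = 5
FIELD_BITS = {"K1": 256, "D2": 255, "D3": 255, "t": 144}
HEADER_CHARS = 6
CHECKSUM_CHARS = 8

data_bits = sum(FIELD_BITS.values())                    # 910
data_chars = data_bits // BITS_PER_CHAR                 # 182
total_chars = HEADER_CHARS + data_chars + CHECKSUM_CHARS

print(data_bits, data_chars, total_chars)   # 910 182 196
```

The 910 data bits divide evenly into 5-bit groups, so no padding bits are needed.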

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.
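
A sketch of the case conversion around QR encoding. The QR alphanumeric mode covers digits, upper-case letters and a few symbols, and packs two characters into 11 bits, compared to 8 bits per character in byte mode. The function names are hypothetical, and generating the QR image itself is outside the scope of this example.

```python
# character set of the QR alphanumeric mode (45 characters)
QR_ALNUM = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def to_qr_text(address: str) -> str:
    """Upper-case an address so that it fits the QR alphanumeric mode."""
    text = address.upper()
    if not set(text) <= QR_ALNUM:
        raise ValueError("address contains characters outside the alphanumeric set")
    return text


def from_qr_text(text: str) -> str:
    """Restore the canonical lower-case form after scanning."""
    return text.lower()


ADDRESS_CHARS = 196
alnum_bits = (ADDRESS_CHARS // 2) * 11   # payload bits in alphanumeric mode
byte_bits = ADDRESS_CHARS * 8            # payload bits in byte mode

print(to_qr_text("xmra1mj0b1977bw3"))    # XMRA1MJ0B1977BW3
print(alnum_bits, byte_bits)             # 1078 1568
```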

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

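# Jamtis address checksum algorithm

# Cyclic code over GF(32) based on the generator 3BI5PLC1.
# It can detect 5 errors up to the length of 994 characters.
GEN = [0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

# A valid address (data followed by its checksum) has polymod == M
M = 0xffffffffff

def jamtis_polymod(data):
    # data is a list of 5-bit symbol values; the state c holds 40 bits
    c = 1
    for v in data:
        b = (c >> 35)                        # top 5 bits of the state
        c = ((c & 0x07ffffffff) << 5) ^ v    # shift in the next symbol
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    # data includes the 8 checksum symbols at the end
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    # returns the 8 checksum symbols for data (header + encoded payload)
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]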

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@hyc

hyc commented Dec 14, 2022

Right, ok. So for 181 symbols of base32 it will use 996 bits.
For 150 symbols of base64 it will use 1200 bits. Still easily manageable for a QR code.

Is there any reason we can't just use raw 8-bit bytes for the QR code?

@hbs

hbs commented Dec 14, 2022

Should base64 be seriously considered, please stick to base64url otherwise the / and + will make it harder to generate URLs containing addresses.

@tevador
Author

tevador commented Dec 14, 2022

> It means that for i != 0 there will be a term depending on G for the one-time address? If so it means that the enote-image creation will have to take into account this number when generating the blinding factor

There will be a G-component regardless of the value of i (check the formula carefully). This is done to provide perfect forward secrecy against quantum attackers. Explained in this comment.

> why not use base64 instead of base32

With base64, we'd lose the ability to double-click-select an address, which is bad for UX. As for the length, there is not much qualitative difference between 164 and 196 characters, both are way too long to type but easy to copy-paste.

It would be possible to use something like base59, which would save roughly 28 characters, but that would be highly non-standard and require more complex checksum code (mod 59 math instead of simple XOR).

> Is there any reason we can't just use raw 8-bit bytes for the QR code?

There are interoperability issues with binary QR codes. Most readers expect a string formatted as an URI. For example, Android apps can define custom handlers for specific URI schemes: https://developer.android.com/training/app-links/deep-linking

@DangerousFreedom1984

Oh, thanks. It is a bit confusing looking at the specs in the main post, the seraphis pdf and the comments at the same time. Maybe we could have a place where we find the updated notations then? Or could you update the specs?

@tevador
Author

tevador commented Dec 14, 2022

Yes, the specs here are somewhat outdated. I'm planning to update it.

@hyc

hyc commented Dec 14, 2022

> With base64, we'd lose the ability to double-click-select an address,

Why is that?

@SChernykh

Because of / and + characters, they act as delimiters when you double click on a text.

@hyc

hyc commented Dec 14, 2022

@tevador
Author

tevador commented Dec 14, 2022

Base64url also gets delimited on the - character.

@hbs

hbs commented Dec 14, 2022

Unfortunately I don't think this is the case, base64url replaces / and + with - and _ but most devices will consider - as a separator and will not extend past it when double clicking text.

@hyc

hyc commented Dec 15, 2022

Not to sidetrack this too much, but you can easily use _ and ~ for this and no problem.

@tevador
Author

tevador commented Dec 15, 2022

I tested a few web browsers and text editors and all of them end selection on ~.

I think most software uses the definition of the regex word character placeholder \w, which expands to [a-zA-Z_0-9], to determine the selection span. There are just 63 characters in that set, so it seems to be impossible to have a base64 alphabet that has the double-click select feature.

@SChernykh

I don't want base64 because of i1lI (these are 4 different characters), and probably a few other ambiguous sets of symbols.

@tevador
Author

tevador commented Dec 30, 2022

FYI, certified addresses have been removed from this specification. Instead, I added similar functionality to the new payment URI proposal.

@UkoeHB

UkoeHB commented Dec 31, 2022

Comments on latest updates:

  • 2.2: "If provided with a list of wallet addresses, this tier can also link [non-self-send] outputs to those addresses."
  • 2.2.1: Address tags mean the total speedup is ~2^24 for unowned enotes.
  • 3.2: I updated the implementation to include your 'personalization string'. The transcript prefix is coded as monero. I put domain-separation stuff in mostly lower-case for consistency, and only use upper case for external concepts like CLSAG. The KDF transcript builder is supposed to be as efficient as possible, so prefixes and domain separators don't have padding (that way all current uses of the KDF builder can fit in one blake2b block).
  • 3.2: For simplicity I mandate 32-byte keys for key derivation. There shouldn't be a case where we need 64-byte keys (even though blake2b permits it).
  • 3.2: For clarity I really recommend the notation H_x[k](data) for secret derivation. The comma is easily confused with || (I use them interchangeably), and it is less obvious when you are doing secret derivation vs a plain hash.
  • 3.3: I have been using xk for X25519 privkeys, xK for pubkeys, and xG for the generator. It is slightly verbose, but adds conceptual clarity and memorability that d, D, B do not have. 'Use whatever letter is unused' isn't always the best approach IMO - and note that B overloads the cryptonote spend key notation while D is confusingly closely related to {A, B} cryptonote address notation.
  • 3.3: I guarantee that I (and probably everyone else too) will continuously forget which is which between KeyDerive1() and KeyDerive2(). In code these are sp_derive_x25519_key() and sp_derive_key() respectively.
  • 3.3.1: You may want to clarify that x25519 scalars are not x*8 mod l, but instead are x mod l << 3. This way the mul8 automatically 'travels with' scalars.
  • 3.3.2: I decided to eliminate spaces from all domain separators. The U and X are now hashed from seraphis_U and seraphis_X. U = 10948b00d2de50b576998c11e83c59a79684d25c9f8a0dc6864570d797b9c16e, X = a4fb43ca695e12998802a20a158f12ea79474fb9012116956a69767c4d41110f
  • 4.2.1: KeyDerive1() should be KeyDerive2() here (good example how the names are confusing).
  • 4.3.1: Make a note that k_v is not migrated to k_vb because k_vb has more authority than k_v. In multisig it's migrated because otherwise you need a new setup ceremony (you do need a migration ceremony in order to get the new base spend key k_m U, but it doesn't require all group members to participate in case one of them became unavailable after initial account setup - you'd need all group members to make a new k_vb key).
  • 4.4: There are several typos in the key names in the Derivation column.
  • 4.6: The table here is inconsistent/incomplete compared to section 2.2 in your URI proposal (that proposal's table is correct/complete). Also, the master tier should include k_vb, since it isn't always derived from k_m.
  • 4.6.2: A remote scanner doesn't need to compute the nominal spend key, only check view tags and decipher address tags. The downstream client then checks the address tags and recomputes the sender-receiver secret if needed. Owned enotes will have some duplicate work done (just the DH exchange), but otherwise the work done and data transmitted by a remote scanner are minimized.
  • 4.7: FYI I have been calling K_s = k_vb X + k_m U the 'jamtis spend key', and k_m U the 'seraphis core spend key'.
  • 4.7: The Name column collides with section 4.4 naming. In general I try to use 'privkey' and 'pubkey' to disambiguate them.
  • 5.1.2: The hash string for encrypting address tags is "monero" || "jamtis_address_tag_hint" || k || cipher[k](j), since we don't want to use blake2b's keyed hash mode which adds an extra compression round, and we don't want to use the seraphis transcript builders which have some allocation overhead.
  • 5.5.2: Coinbase transactions don't have this 2-output optimization.
  • 6.3: I recommend citing your research and here too.
  • 7: It may be a while before test vectors can be fully locked down. The call stack for addresses needs to be fully reviewed, including transcript and config stuff - e.g. I just added transcript prefixing based on your comments. Vectors will all break on a single character change in any of the transcripts.

@tevador
Author

tevador commented Dec 31, 2022

> • 3.2: I updated the implementation to include your 'personalization string'. The transcript prefix is coded as monero. I put domain-separation stuff in mostly lower-case for consistency, and only use upper case for external concepts like CLSAG. The KDF transcript builder is supposed to be as efficient as possible, so prefixes and domain separators don't have padding (that way all current uses of the KDF builder can fit in one blake2b block).

The "personalization string" is not a prefix in the transcript. It is a field in the Blake2b parameter block. Refer to section 2.8 of the Blake2 paper. It essentially domain separates the whole hash function. It might not be strictly needed, but it has zero cost.

@UkoeHB

UkoeHB commented Jan 1, 2023

I see, the personal parameter. I'd rather not bake a customization like this so deep into the hash function wrappers, better to keep it simple and just use a domain separator. Adding a customization means all downstream projects have to implement that customization correctly.

@tevador
Author

tevador commented Jan 1, 2023

> I'd rather not bake a customization like this so deep into the hash function wrappers

It's supported by the C API of the blake2 library. It's not like we'd be changing the hash function internals.

> better to keep it simple and just use a domain separator

Using the personalization is actually the simpler solution because you wouldn't need to have a separate implementation for view tags. It could also prevent future inconsistencies when someone could use the unkeyed hash and forget to add the "monero" prefix.

> 3.3: I have been using xk for X25519 privkeys, xK for pubkeys, and xG for the generator. It is slightly verbose, but adds conceptual clarity and memorability that d, D, B do not have.

That makes sense when writing code, but I've never seen a math notation that uses multi-letter variables. Someone could mistake xK for scalar multiplication of x and K. Both subscript and superscript indices are already in use in some places, so something like xk / xK can't be used (and are very hard to distinguish). Also the letter X is already used as one of the generators.

> note that B overloads the cryptonote spend key notation while D is confusingly closely related to {A, B} cryptonote address notation.

Notations don't need to be globally unique, just unambiguous in the scope of this document (which doesn't use the CryptoNote address notation).

I'll try to implement your remaining comments in the next revision.

@UkoeHB

UkoeHB commented Jan 1, 2023

It's supported by the C API of the blake2 library. It's not like we'd be changing the hash function internals.

What I mean is you can no longer just call blake2b() to hash, you need a custom sequence of API calls.

Using the personalization is actually the simpler solution because you wouldn't need to have a separate implementation for view tags. It could also prevent future inconsistencies when someone could use the unkeyed hash and forget to add the "monero" prefix.

I like having it be more explicit, which makes it more visible. Also, I updated the transcripts so the prefix is a constructor parameter that just defaults to the config "monero". This way the parameter is injected and not mandated. You could inject it to the hash functions too, but that would be more messy I think.

I've never seen a math notation that uses multi-letter variables

I was thinking of a left superscript: xk, xK, xG

Notations don't need to be globally unique, just unambiguous in the scope of this document

Sure, but this is implemented in the Monero codebase, and cryptonote will always be with us.

@UkoeHB

UkoeHB commented Jan 10, 2023

I am working on seraphis knowledge/audit proofs with @DangerousFreedom1984 and ran into some issues with enote ownership proofs and address index proofs.

  • enote ownership proof: prove that an enote is owned by a specific user address (transitively, the owner of that address owns the enote)

Any proof method you come up with (A. subtract K_1 from Ko and make a composition proof on the remainder; B. expose the sender-receiver secret q and allow the verifier to recompute Ko from K_1 [only works for non-selfsends]) can be spoofed by the prover if they know the private keys of the real address, since the K_1 used in the proof can be freely defined. Spoofing means making a proof that an enote was sent to a particular address when the original sender sent it to a different address.

To get around that problem, I propose updating the sender extension to include the key K_1 that is being extended (e.g. k_{g, sender} = H_n("..g..", K_1, q, C)). Then you can expose q and K_1 and the verifier can recreate Ko and be confident that K_1 owns the enote. Note that this proof doesn't provide a way for you to prove an address doesn't own an enote; all it says is 'if you make a valid proof, then the K_1 in that proof is accurate'.

Another issue is you can't use that approach to make selfsend enote ownership proofs, because q is used without a secondary secret (the baked key) when constructing amount commitments and encoded amounts (meaning you can't make a selfsend enote ownership proof without exposing the amount). Moreover, any such proof would have to reveal that an enote is a selfsend type (no type-agnostic proof).

To solve that I propose updating selfsend enote construction so it mimics normal enotes more closely. The only changes needed are adding a selfsend baked key to amounts (baked_key_selfsend = H_32[k_vb](q); for consistency, update the normal one to baked_key_plain = H_32(xr xG) so that both baked keys will have the same serialization pattern [random 32 bytes]), and encrypting address tags the same way as normal enotes (instead of encrypting the raw index). Changing those things actually simplifies the protocol a little by isolating per-type customization to just the construction of secrets q and the baked key.

  • address index proof: prove that an address was generated from a particular index

There is currently no way to prove an address was constructed from a particular index without exposing s_ga. I propose changing the address extensions to H_n(K_s, j, H_32[s_ga](j)) where K_s = k_vb X + k_m U (and H_n_x25519(K_s, j, H_32[s_ga](j)) for the xK_2 and xK_3 modifiers). Then an address index proof for {K}_j will expose K_s, j, and secret H_32[s_ga](j). The user can then do another proof on K_s to show the private keys are known, or do a composition proof with the address {K}_j.
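A minimal sketch of how a verifier could recompute the proposed address extension from the exposed values, without learning s_ga. BLAKE2b stands in for both H_32 (keyed) and H_n (hash to scalar), and the 16-byte little-endian index encoding and reduction modulo the ed25519 group order are assumptions made for illustration.

```python
import hashlib

# Order of the prime-order ed25519 subgroup; H_n is assumed to reduce into it.
ELL = 2**252 + 27742317777372353535851937790883648493


def h32_keyed(key: bytes, j: int) -> bytes:
    """Stand-in for H_32[key](j): a 32-byte keyed hash of the address index."""
    return hashlib.blake2b(j.to_bytes(16, "little"), digest_size=32, key=key).digest()


def address_extension(K_s: bytes, j: int, index_secret: bytes) -> int:
    """Stand-in for H_n(K_s, j, H_32[s_ga](j)): hash the inputs to a scalar."""
    data = K_s + j.to_bytes(16, "little") + index_secret
    digest = hashlib.blake2b(data, digest_size=64).digest()
    return int.from_bytes(digest, "little") % ELL


# Prover side: knows s_ga (placeholder bytes here) and derives the per-index secret.
s_ga = bytes(range(32))
K_s = bytes(32)  # placeholder encoding of K_s = k_vb X + k_m U
j = 7
index_secret = h32_keyed(s_ga, j)
prover_ext = address_extension(K_s, j, index_secret)

# Verifier side: receives only (K_s, j, index_secret) and recomputes the extension.
verifier_ext = address_extension(K_s, j, index_secret)
```

The verifier never sees s_ga, only the keyed hash output for the single index j, which matches the set of values the proof is said to expose.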

EDIT: These changes have been implemented.

@jeffro256

I have a concern with mixing the "find-received" tier (k_fr) and "generate-address" tier (s_ga & K_s). Having access to both these tiers allows more than the sum of these tiers, namely the ability to 100% recognize owned incoming enotes (basically the "payment validator" tier w/o knowing the amounts). If the shared secret used to encrypt the address tag is a function of s_sr2, then the nominal one-time address K'_o would only be calculable on the "payment validator" tier.

I suggest using the following method for creating the encrypted address tag in an enote: addr_tag_enc = addr_tag XOR H_ate(s_sr1 || s_sr2 || Ko). Under this scheme, the "generate-address" tier can still generate any public address with the same information, but can't decrypt the encrypted address tags.
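A minimal sketch of the suggested masking, with BLAKE2b standing in for H_ate and an 18-byte address tag (16-byte index plus 2-byte hint) assumed for illustration. All input values are placeholders.

```python
import hashlib


def h_ate(s_sr1: bytes, s_sr2: bytes, Ko: bytes, out_len: int) -> bytes:
    """Stand-in for H_ate(s_sr1 || s_sr2 || Ko), truncated to the tag length."""
    return hashlib.blake2b(s_sr1 + s_sr2 + Ko, digest_size=out_len).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def mask_addr_tag(addr_tag: bytes, s_sr1: bytes, s_sr2: bytes, Ko: bytes) -> bytes:
    """addr_tag_enc = addr_tag XOR H_ate(...). Applying it twice recovers addr_tag."""
    return xor_bytes(addr_tag, h_ate(s_sr1, s_sr2, Ko, len(addr_tag)))


addr_tag = bytes(range(18))   # placeholder tag
s_sr1 = b"\x01" * 32          # placeholder secrets and one-time address
s_sr2 = b"\x02" * 32
Ko = b"\x03" * 32

addr_tag_enc = mask_addr_tag(addr_tag, s_sr1, s_sr2, Ko)
recovered = mask_addr_tag(addr_tag_enc, s_sr1, s_sr2, Ko)
# A party that lacks s_sr2 (here: a wrong guess) does not recover the tag.
wrong = mask_addr_tag(addr_tag_enc, s_sr1, b"\x00" * 32, Ko)
```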

There are two real-life issues that I can imagine that this change would fix. Let's say that you wanted to create a social payment app, like Venmo, in which the backend both calculates and filters view tags to speed up scanning, as well as generates new receive addresses for people who want to send money to their users. Without changing the address tiers, this service would be able to identify all owned enotes of their users w/ ease. Another scenario in which this change would increase security is a merchant server system where the find-receive keys and generate address keys are spread across user-facing servers for quick & responsive invoice generation. If a malicious actor gains access to both key tiers, then they can generate addresses and see all incoming transactions whereas under the modified scheme, they can only generate addresses and calculate view tags.

@UkoeHB

UkoeHB commented Aug 14, 2023

@jeffro256

  1. If decrypting the address tag requires s_sr2, that invalidates the performance benefit of k_fr scanning, because clients of a remote scanner now have to compute the baked key (1/(k^j_a ∗ k_ua)) ∗ K_e.
  2. The baked key actually depends on the address index j, so what you describe is logically impossible.

@jeffro256

jeffro256 commented Aug 19, 2023

Okay, I've looked deeper into the 3 main privacy issues I've had with Jamtis, and have a proposal. Thanks to @UkoeHB for the guidance thus far! I modified the Jamtis section of Ukoe's "Implementing Seraphis" paper with the details and uploaded it to Ufile since it's a little more fleshed out than this doc. See below for a high-level view of the proposal.

Jamtis Change: Fix F-R Privacy Issues and New View Tag Tier

Pros

  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes to known public addresses.
  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes sent to a public address that is used more than once.
  • Third-parties can now compute view tags and generate public addresses on behalf of users without the ability to learn any additional balance recovery information.
  • There are now two tiers of view tag wallets that users can pick between depending on their desire of balance/privacy: dense (1 byte) and sparse (2 bytes).

Cons

  • Public address raw size is increased by 30 bytes (48 characters if encoded using base32) (Additional +32 bytes for new public key, -2 bytes to remove decipher hint). Transactions remain the same size.
  • Light wallet scanning is slower on the client side (each deciphering op is replaced with DH op)
  • Additional spec complexity
  • Some other things I'm not currently seeing

Description of changes

Account secrets

Instead of one find-receive key k_fr, there are now two keys: the dense view key k_dv and the sparse view key k_sv. Instead of a base pubkey K_fr, there is now the dense view pubkey K_dv = k_dv * K_ua and sparse view pubkey K_sv = k_sv * K_ua.

Public address

The new public address now contains 4 pubkeys instead of 3 and does away with the decipher hint. The four pubkeys are labeled K^j_s, D^j_ua, D^j_dv, and D^j_sv. K^j_s is the same as K^j_1 in the old address scheme, while D^j_ua, D^j_dv, and D^j_sv are equal to their respective base pubkeys multiplied by the address private key. The ciphered address index c^j stays the same and D^j_ua is the same as the old K^j_3. To summarize, the new address tuple is [K^j_s, D^j_ua, D^j_dv, D^j_sv, c^j].

DH exchanges

The ephemeral pubkey K_e is calculated K_e = r * D^j_ua by the sender like normal (remember that D^j_ua is functionally identical to the old K^j_3), but now there are two DH keys: the dense DH key K^dv_d = r * D^j_dv = k_dv * K_e and the sparse DH key K^sv_d = r * D^j_sv = k_sv * K_e.
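The key relations above can be checked in a toy group. The sketch below models "scalar multiplication" as modular exponentiation in the integers modulo a prime instead of X25519, so it only demonstrates that the sender and recipient arrive at the same two DH keys. The group, the base element, and the name `k_a_j` for the address private key are assumptions for illustration.

```python
import secrets

P = 2**255 - 19  # a convenient prime; the toy group is the integers mod P under multiplication
G = 2            # toy base element


def smul(scalar: int, point: int) -> int:
    """Toy 'scalar multiplication': exponentiation in the multiplicative group mod P."""
    return pow(point, scalar, P)


def rand_scalar() -> int:
    """Random toy scalar in [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


# Account secrets and base pubkeys.
k_ua, k_dv, k_sv = rand_scalar(), rand_scalar(), rand_scalar()
K_ua = smul(k_ua, G)
K_dv = smul(k_dv, K_ua)
K_sv = smul(k_sv, K_ua)

# Address j: each base pubkey multiplied by the address private key.
k_a_j = rand_scalar()
D_ua_j, D_dv_j, D_sv_j = smul(k_a_j, K_ua), smul(k_a_j, K_dv), smul(k_a_j, K_sv)

# Sender: ephemeral pubkey and both DH keys from the public address only.
r = rand_scalar()
K_e = smul(r, D_ua_j)
dense_dh_sender = smul(r, D_dv_j)
sparse_dh_sender = smul(r, D_sv_j)

# Recipient (or a delegated scanner holding one key): DH keys from K_e.
dense_dh_receiver = smul(k_dv, K_e)
sparse_dh_receiver = smul(k_sv, K_e)
```

A party holding only k_dv can reproduce the dense DH key but not the sparse one, and vice versa, which is what the later sections rely on.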

View tags

There are now two view tags (dense_view_tag and sparse_view_tag) per enote which are functions of their respective DH keys. In keeping with the old scheme, the dense view tag replaces the old view tag at 1 byte in size, and the sparse view tag replaces the decipher hint at 2 bytes in size. These view tags are completely independent of each other, so combining the checks multiplies the amount of filtering done to 1:16777216. A user can choose to reveal either k_dv or k_sv (but not both, explained later) to a light wallet server to pre-scan the enotes for them. Unlike the old k_fr, knowing only one of k_dv or k_sv does not allow a third-party to perform any step of the balance recovery process except for recomputing view tags, which is good for privacy (also explained later).

Sender-Receiver Secret s_sr1

The sender-receiver secret s_sr1 is now computed as s_sr1 = H^sr1_n(K^dv_d || K^sv_d || K_e || input_context). Notice that both DH keys are needed to compute s_sr1. This point is crucial to the privacy properties of the new scheme.
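A minimal sketch of the derivations described above, using BLAKE2b with a prefixed domain separator as a stand-in for the actual hash functions. The domain separator strings, the 32-byte output for s_sr1, and the placeholder inputs are assumptions for illustration; only the tag sizes (1 and 2 bytes) and the concatenation order come from the description.

```python
import hashlib


def _hash(domain: bytes, *parts: bytes, size: int) -> bytes:
    """Stand-in hash: BLAKE2b over a hypothetical domain separator and the concatenated inputs."""
    return hashlib.blake2b(domain + b"".join(parts), digest_size=size).digest()


def dense_view_tag(dense_dh_key: bytes) -> bytes:
    """1-byte view tag derived from the dense DH key only."""
    return _hash(b"dense_view_tag", dense_dh_key, size=1)


def sparse_view_tag(sparse_dh_key: bytes) -> bytes:
    """2-byte view tag derived from the sparse DH key only."""
    return _hash(b"sparse_view_tag", sparse_dh_key, size=2)


def sender_receiver_secret(dense_dh_key: bytes, sparse_dh_key: bytes,
                           K_e: bytes, input_context: bytes) -> bytes:
    """s_sr1 over both DH keys, the ephemeral pubkey, and the input context."""
    return _hash(b"s_sr1", dense_dh_key, sparse_dh_key, K_e, input_context, size=32)


# Placeholder 32-byte values standing in for real keys.
dense_key, sparse_key = b"\x11" * 32, b"\x22" * 32
K_e, input_context = b"\x33" * 32, b"\x44" * 32

tag_dense = dense_view_tag(dense_key)
tag_sparse = sparse_view_tag(sparse_key)
s_sr1 = sender_receiver_secret(dense_key, sparse_key, K_e, input_context)
```

Because s_sr1 takes both DH keys as input, a scanner that holds only one of k_dv or k_sv can recompute its own view tag but not s_sr1.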

Wallet Tiers

There are now two tiers for view tag computation: dense and sparse. They can do nothing else besides compute their respective view tags. The find-receive tier can identify all incoming enotes, but not view amounts. The payment-validator tier remains the same in terms of capabilities. There are 2 new "compound tiers" which are the combination of the dense/sparse view tag tier and the generate address tier, which do exactly as expected without additional privacy drawbacks.

How the New Changes Address Privacy Issues

The core of the first two privacy issues mentioned in the "pros" section stems from the fact that the ability to decrypt address tags was tied to the ability to perform view tag computation. Since address tags are 1) public and 2) constant for a given address, third-parties with knowledge of k_fr can make extremely strong guesses about users' ownership of enotes under loose conditions. This new scheme decouples those two things so that a third-party can compute view tags for a user but not learn any additional information about enotes. To decrypt "address tags" (it's now just the ciphered address index) under this new scheme, a third-party must know both k_dv and k_sv, since those are needed to compute s_sr1.

The third privacy issue (mixing the find-receive and generate-address tier) is fixed for the same reason: a third-party must now know both k_dv and k_sv to compute s_sr1. However, there was a deeper issue here with the old scheme since third-parties who knew k_fr, s_ga, and K_s (the combination of find-receive and generate-address tiers) could decipher the address indices and recompute the onetime address Ko, proving to themselves that a user owns this enote with 100% certainty (assuming the user did not lose their keys). Since this issue is addressed, this opens up the possibility for Venmo-like applications where s_ga, k_dv, and K_s are given to a single third-party so that the third-party can reduce users' refresh times by ~99.6% using view tags and generate receive addresses for their users while they are offline without compromising privacy.

How the New Changes Affect Scanning Speed

For regular full wallets where both k_dv and k_sv are known, the first view tag check can be against the 2-byte sparse view tag to initially filter out all but 1:65536 enotes with just one DH exchange, as compared to 1:256. After that, the dense view tag can be checked to further refine the enotes in a 1:256 ratio. For owned enotes, the balance recovery process is actually slower since 3 DH operations are needed instead of 2. As for 1-byte-view-tag light wallets (old "find-receive" tier and new "dense-view" tier), the server does the exact same amount of work (1 DH + view tag check), but the client will need to expend more CPU cycles, assuming that a DH exchange is more expensive than symmetrically deciphering the 16-byte address index.

Below, I have provided a quick and dirty comparison of the operations that must be done to scan enotes under different wallet types. I use the terms "Sparse-View Light Wallet" and "Dense-View Light Wallet" to refer to wallet schemes in which the sparse (2 byte) view tag key k_sv and dense (1 byte) view tag key k_dv are provided to third parties, respectively, to initially filter out enotes. The new "Dense-View" wallet tier is the most similar to the old "Find-Received" wallet tier in the respect that they can both calculate 1 byte view tags on behalf of users.

Normal Enote Operation Density

| Amortized Period | Old Full/Light Wallet | New Full/Sparse-View Light Wallet | New Dense-View Light Wallet |
| --- | --- | --- | --- |
| 1:1 enotes | DH + view tag check* | DH + view tag check* | DH + view tag check* |
| 1:2^8 enotes | 16 byte decipher + decipher hint check | | DH + view tag check |
| 1:2^16 enotes | | DH + view tag check | |
| 1:2^24 enotes | Ko recompute | 16 byte decipher + Ko recompute | 16 byte decipher + Ko recompute |

* = If applicable, a light wallet server would perform this operation on behalf of a user in the background. This is important when considering trade-offs because if you value client scanning time above all else, you can disregard the operations marked by an asterisk when considering light wallet schemes.

Total Notable Operations for an Owned Normal Enote

| Scheme | Total Notable Operations for an Owned Normal Enote |
| --- | --- |
| Old | DH + view tag check + 16 byte decipher + decipher hint check + Ko recompute |
| New | 2 * (DH + view tag check) + 16 byte decipher + Ko recompute |

There are obviously more operations in the balance recovery than are mentioned here, but these are likely the most expensive. The main performance difference between full wallets is that once every 256 enotes, the old scheme has to decrypt the address tag and decipher the address index j. The new scheme only does this once every 16777216 enotes, but must perform an extra Diffie-Hellman key exchange and view tag check once every 65536 enotes. According to @tevador, DH exchanges are ~100x more expensive than deciphering, so the scanning performance will likely remain more or less the same for full wallets.

On the other hand, the performance for light wallet clients is worse. The work for the server is exactly the same: 1 DH + view tag check, but the client must do 1 DH + view tag check instead of a 16 byte decipher every 1:256 enotes on-chain (every enote the client receives). At any rate, in both the new "dense-view" light wallet and old "find-receive" wallet tiers, the client must download ~65536x more information (less if the user owns a large fraction of on-chain enotes) than is actually needed for balance recovery past view tag/decipher hint checks, so the performance difference here is hard to quantify without real-world testing. It should be noted that the new "sparse-view" wallet tier follows the same recovery path as the full wallet, so it gets the performance benefit on both the server and client of first being able to check against the sparse view tag, filtering out all but 1:65536 enotes for a user as compared to the normal 1:256. This means that a "sparse-view" light wallet client has to download 256x less information than current Jamtis light wallets, obviously at the privacy cost of narrowing down owned enotes probabilistically to 1:65536.
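A rough cost model of the comparison above, counting only DH exchanges and 16-byte deciphers per on-chain enote. The unit costs (DH = 100, decipher = 1) are an assumption based on the ~100x figure quoted in this thread, and view tag comparisons and Ko recomputation are ignored, so this is a back-of-the-envelope sketch and not a benchmark.

```python
DH_COST = 100.0       # assumed relative cost of one DH exchange
DECIPHER_COST = 1.0   # assumed relative cost of one 16-byte decipher


def old_full_wallet() -> float:
    """Old scheme, full wallet: DH on every enote, decipher on 1 in 2^8."""
    return DH_COST + DECIPHER_COST / 2**8


def new_full_wallet() -> float:
    """New scheme, full wallet: DH on every enote, second DH on 1 in 2^16, decipher on 1 in 2^24."""
    return DH_COST + DH_COST / 2**16 + DECIPHER_COST / 2**24


def old_light_client() -> float:
    """Old find-receive client: the server does the first DH, the client deciphers 1 in 2^8."""
    return DECIPHER_COST / 2**8


def new_dense_light_client() -> float:
    """New dense-view client: DH on 1 in 2^8, decipher on 1 in 2^24."""
    return DH_COST / 2**8 + DECIPHER_COST / 2**24


full_wallet_difference = abs(old_full_wallet() - new_full_wallet())
light_client_ratio = new_dense_light_client() / old_light_client()
```

Under these assumptions the two full-wallet costs differ by a tiny fraction of one DH exchange per enote, while the dense-view client does roughly 100x the client-side work of the old find-receive client.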

Additional Opinions on Why I Think the Trade-off is Worth it

Gathering from years of forum discussions and IRC/Matrix chats, one of the biggest UX complaints (arguably only beaten by the 10-block-lock) against Monero is the frustratingly long refresh times. This is such an issue that light wallet ecosystems evolved in the very early days of Monero to tackle this problem. The users of these light wallets were willing to completely sacrifice their incoming enote privacy (by revealing private view keys) just to bring refresh times down. There are innumerable posts online about potential users who left Monero completely because of refresh bugs and the corresponding wait times. This is why privacy-preserving light wallet servers are the future for most casual users, and will capture many on-the-fence people who want a better privacy/UX balance. Creating accessible, un-foot-gun-able digital cash is the core value proposition of Monero for me.

The new light wallet scheme under Jamtis is exciting and brings a lot of possibilities. However, the privacy issues inherent to it would make it hard for me to recommend to anyone except the least privacy-minded people. There are simply too many ways to footgun, the main concerns being over the passive address tag decrypting issues, which means you can't receive to the same address more than once or let your light wallet server know your public addresses.

Addressing the main downside of this change, the address size, I say: it's not that big of a deal to me. Jamtis addresses are already >3x the size of BTC addresses, so increasing the size by ~25% doesn't matter. The new addresses would still be easily copy-and-paste-able and fit on a medium QR code. I don't know anyone who is typing the addresses out by hand or reading them aloud, even with legacy CryptoNote addresses which are >2x the length of a BTC address, so I don't believe that use case is affected. For those that have read this far, thank you for your time and consideration. ;)

@tevador
Author

tevador commented Aug 19, 2023

Public address size is increased by 30 bytes

The actual address length would increase from 196 to 244 characters.
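The character count follows from the 30-byte increase if the address is encoded in base32 at 5 bits per character. A small sketch of that arithmetic, taking the 196-character starting length from the comment above:

```python
def base32_chars(n_bytes: int) -> int:
    """Number of base32 characters (5 bits each) needed for n_bytes, rounded up."""
    return -(-n_bytes * 8 // 5)


OLD_ADDRESS_CHARS = 196               # current length quoted above
extra_chars = base32_chars(30)        # +32 bytes for the new pubkey, -2 bytes for the decipher hint
new_address_chars = OLD_ADDRESS_CHARS + extra_chars
relative_growth = extra_chars / OLD_ADDRESS_CHARS
```

This gives 48 extra characters and a 244-character address, an increase of about a quarter.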

As you can see, unless deciphering is more than 256x faster than a DH operation

DH is about 100x slower. The performance impact of this change is likely negligible (slightly slower overall).

Nevertheless, the privacy benefits might still be worth it.

@One-horse-wagon

Enhancing security by accommodating your new protocol in a 244-character address is a no-brainer to me. Address length would become an issue only if it limited what you can do, such as making QR codes unusable.

@jeffro256

DH is about 100x slower.

I assume you're talking about X25519 and Twofish here, is that correct? If we move to a curve cycle to prepare for FCMPs, how fast can DH/variable-base-scalar-multiplication be made using your curve cycle? I would assume that it would be slower, so the full wallet scanning performance changes would likely wash out completely.

@tevador
Author

tevador commented Aug 20, 2023

I assume you're talking about X25519 and Twofish here, is that correct?

Correct.

how fast can DH/variable-base-scalar-multiplication be made using your curve cycle?

X25519 has many optimized implementations that would be very hard to beat with a custom curve.

If we switch to the curve cycle, this only affects the "proof" keys (denoted with the letter K in this specification). I strongly recommend to keep X25519 for the key exchange keys (denoted with the letter D in this specification). This should be easy to do because Jamtis never needs any interop between the key exchange keys and the "proof" keys, so these can be completely unrelated elliptic curve groups.

@j-berman

I lean yes on the idea to add an additional pub key to the address for the privacy gain to light wallet users, however, I'm not a hard yes and I think it's an acceptable decision to proceed without it. I'm going to steel man an opposing argument: even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis. As such, the addition offers a benefit that light wallet users using 3rd party servers shouldn't consider in their threat model, and therefore is not worth the added UX and complexity burden.

I lean yes (and do not agree with the steel man) because the address length would still fall within an "acceptable" size, and the proposal offers a tangible privacy benefit to light wallet users (and therefore benefits the anonymity set): the server cannot definitively identify a user's received enotes even if the user receives to the same address twice or if the server knows the user's address, which is a strict improvement to a light wallet user's privacy even if there are still potential statistical leaks under certain conditions.

I'll explain why I think a 3rd party server may still be able to trace transactions using statistical analysis under certain conditions.

Assume this proposal is accepted alongside full chain membership proofs. After some discussion with @kayabaNerve, here is what I understand the theoretically optimal privacy profile for light wallets could look like when constructing a tx:

  1. The user opens their light wallet client and requests their wallet's view-tag-matched enotes from the server.
  2. In order to construct a tx, a light wallet client fetches paths in the merkle tree to a set of enotes (1 real path + N decoy paths so whoever is serving the paths does not know which enote the user is spending; see footnote 1 below).
    • Each single path would be on the order of kilobytes, thus the light wallet client would fetch a subset of paths similar to fetching decoys today (using a decoy selection algo).
    • The light wallet client should request these paths from a 3rd party daemon whose operator is ideally not colluding with the light wallet server. This way the user avoids revealing to the light wallet server that the user is trying to construct a tx.
      • The light wallet client could request paths to view-tag-matched enotes only, just in case the 3rd party daemon is colluding with the light wallet server.
    • The light wallet client should also request fees from and submit the final tx to 1 or more 3rd party daemons ideally not colluding with the light wallet server to avoid revealing the tx was constructed by the user to the server.
  3. Finally, the user's tx will include a view tag match on chain.

If you assume the 3rd party daemon is not colluding with the light wallet server, then the statistical footprint is: user opened their light wallet, shortly thereafter there's a view tag match on chain. This footprint's impact on a user's privacy depends entirely on tx volume. With low volume, the server is able to tell the user likely spent an enote in the tx since the view tag match is likely change. If the server collects these footprints for every tx the user constructs, with low volume, the server can perhaps start to build a user's plausible tx graph.

If you assume the 3rd party daemon is colluding with the light wallet server, which I think should be every user's default assumption (trusted 3rd parties are security holes), then the statistical footprint naturally can have a worse impact on a user's privacy. The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends. Example:

  • The user receives an enoteA in txA, then spends that enoteA in txB and has a change enoteB in txB. The light wallet server knows the user constructed txB and therefore knows the view tag matched enoteB in txB is likely the user's.
  • When spending enoteB in txC, the user requests a set of merkle paths where enoteB is 1 of N path requests.
  • The light wallet server knows the user constructed txC and can make an educated guess that enoteB was spent in txC.

The light wallet server has thus built up evidence the user received enoteB in txB and spent enoteB in txC.

This statistical leak should be considered unavoidable for the light wallet tier imo; this leak can only be mitigated in some capacity. Which is why I would hope that light wallets don't replace full wallets for privacy-conscious users unless they're running their own light wallet servers. I would still argue the single additional key "find-received" tier as currently spec'd is valuable and worth implementing because 1) amounts are unknown to the server (significant privacy benefit), and 2) it offers a tangible privacy benefit since the light wallet server cannot definitively identify all of a user's received enotes under all conditions. But I can understand the argument why two additional keys for the tier is excessive considering the above argument.

Reiterating: I'm still for the proposal to add an additional pub key to the address. I think the tangible privacy benefit the additional key brings to light wallet users is worth ~25% larger addresses and more complexity. But I don't hold a strong yes considering I think the argument against is a strong argument.


I haven't dug deeper into the sparse/dense view side of the proposal yet and will comment on that later.


1: Requesting paths in the merkle tree would be unnecessary if the client downloaded the entire merkle tree when scanning. However, this download could be on the order of gigabytes, which would defeat the core benefit of a light wallet: instant wallet open.

@kayabaNerve

kayabaNerve commented Aug 22, 2023

The Merkle tree leaves would be 32 bytes per output, or a few GB @ 100m outputs. If we have branches with no view-tag-matched outputs, they can be dropped for one 32 byte value. If the view tag hit rate is 1/256, I believe more than half of the branches will have at least one leaf. If the view tag hit rate is 1/65536, most won't.

(branch length is currently configured to 167)

If we have a 1:65536 hit rate, only ~1/400 branches will be hit? That means the 3.2 GB leaf set at 100m outputs becomes 10 MB? It seems much saner to just download the tree in this case.
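These estimates can be reproduced with a short calculation. The sketch assumes view tag matches are independent across outputs, counts only the leaves of branches that contain at least one match, and ignores the 32-byte placeholders left behind for dropped branches.

```python
OUTPUTS = 100_000_000   # outputs on chain, as in the comment
LEAF_BYTES = 32         # bytes per Merkle tree leaf
BRANCH_LEN = 167        # currently configured branch length


def branch_hit_probability(hit_rate: float, branch_len: int = BRANCH_LEN) -> float:
    """Probability that a branch contains at least one view-tag-matched leaf."""
    return 1.0 - (1.0 - hit_rate) ** branch_len


full_leaf_set_bytes = OUTPUTS * LEAF_BYTES

p_dense = branch_hit_probability(1 / 256)
p_sparse = branch_hit_probability(1 / 65536)

# Bytes of leaves that remain if only branches with a match are kept.
kept_bytes_sparse = full_leaf_set_bytes * p_sparse
```

Under this model the full leaf set is 3.2 GB, about 1 in 390 branches is hit at a 1:65536 rate (leaving roughly 8 MB of leaves), and close to half of the branches are hit at a 1:256 rate.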

@j-berman

j-berman commented Aug 22, 2023

Tx volume is hovering around ~20k txs per day these days, which is a floor of ~40k outputs per day. Let's assume ~65k outputs per day, which is an expected ~1 view tag match per day at a 1:65,536 hit rate. At that rate, any view-tag-matched enotes the server identifies around the time a user opens their wallet would almost certainly be the user's enotes. Further, any clusters of enotes the user spends/receives in a single day would stick out like a sore thumb to the server.

Seems at that hit rate and today's volume, the privacy gain of view tags is close to nil.
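The expected number of view tag matches per day is simply the output count multiplied by the hit rate. A small sketch using the ~65k outputs per day assumed above:

```python
def expected_matches_per_day(outputs_per_day: int, hit_rate: float) -> float:
    """Expected number of on-chain outputs per day that match a given user's view tag."""
    return outputs_per_day * hit_rate


OUTPUTS_PER_DAY = 65_000  # assumption taken from the estimate above

sparse_matches = expected_matches_per_day(OUTPUTS_PER_DAY, 1 / 65536)
dense_matches = expected_matches_per_day(OUTPUTS_PER_DAY, 1 / 256)
```

At the 1:65536 rate this is about one match per day, compared with roughly 250 per day at the 1:256 rate, so a match near the time the wallet is opened hides among far fewer candidates.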
