Huffman-Encoded Taproot Output Descriptors

Motivation

Taproot outputs can feature complex merkle branches involving multiple participating wallets. We prepose a huffman-encoded taproot output descriptor which allows all participating wallets to solve for all merkle branch spending paths, without imposing any restrictions to possible tree structure and ensuring a unique descriptor-to-output mapping.

The design of the proposed output descriptor is also influenced by the desire to enable higher-level descriptor expressions which may compile to multiple tapscripts of potentially different execution probability.

Basic Design

In essence, the taproot descriptor expresses a sequence of tapscript descriptors, each which compile to the output script of the respective tapleaf. Each descriptor also is followed by a tuple consisting of a leaf-position and huffman-weight. The descriptors are also sorted by leaf position, but this does not make the leaf position redundant as we will see later with higher-level taproot-specific descriptor multi.

tap(descriptor(keys)(leaf position & weight), ...)

For example:

tap(descriptor0(keys)(pos0,weight0), descriptor1(keys)(pos1,weight1), descriptor2(keys)(pos2,weight2), ...)

Depending on the huffman weights, the taproot descriptor could represent the following taptree. The taptree structure is solved for via optimal merging.

---------------------------------------
                   4
                  / \
                 2   2 
                / \ / \
leaf weight:    1 1 1 1
leaf position:  0 1 2 3
---------------------------------------

Let us consider another example with different weights.

---------------------------------------
                     5
                    / \ 
                   3  | 
                  / \ | 
                 2  | | 
                / \ | | 
leaf weight:    1 1 2 2
leaf position:  0 1 2 3
---------------------------------------

To ensure a unique set of leaf weights for a given tree structure, the lowest possible leaf weights for a given structure must be selected. Furthermore, if the the creation of the next node is not clear given multiple merge-pair candidates of equal weightsums, the pair nearest to the left is selected.

The leaf positions of each sibling-pair at the lowest level can seemingly be switched since the parent node is derived from the lexicographic ordered taphash of its children. To avoid this ambiguity, we require that the descriptor literals to also be ordered lexicographically.

Higher-Level Multi Output Descriptors

Having established the basic encoding principles of the proposed taproot descriptor, we now introduce the multi output descriptor which has been reinterpreted for tapscripts.

multi_non_interactive(n, m pubkeys)(pos_0, weight_0)(pos_1, weight_1)...(pos_s, weight_s)

* Where s is the number of possible n-sized subsets of m.

multi_interactive(n, m pubkeys)(pos_0, weight_0)(pos_1, weight_1)...(pos_s, weight_s)

* Where s is the number of possible n-sized subsets of m.

Notice the sequence of tuples following the multisig tapscript descriptors. These describe position and weight for each of the n-of-n tapscripts these multi descriptor expressions compile down to.

Non n-of-m interactive multisig descriptor:

multi_non_interactive(n, pubkey_0, pubkey_1, pubkey_2 ... pubkey_m)(pos0, weight0)(pos1, weight1)...

expands to:

  * multi_add(pubkey_0, pubkey_1)
  * multi_add(pubkey_0, pubkey_2)
  * multi_add(pubkey_1, pubkey_2)

or

  * <pubkey_0> checksigadd <pubkey_1> checksig 2 equal 
  * <pubkey_0> checksigadd <pubkey_2> checksig 2 equal
  * <pubkey_1> checksigadd <pubkey_2> checksig 2 equal

Interactive n-of-m musig descriptor:

multi_interactive(2, pubkey_0, pubkey_1, pubkey_2)(pos0, weight0)(pos1, weight1)...

expands to:

  * p2k(pubkey_0 + pubkey_1)
  * p2k(pubkey_0 + pubkey_2)
  * p2k(pubkey_1 + pubkey_2)

or

  * <pubkey_0 + pubkey_1> checksig
  * <pubkey_0 + pubkey_2> checksig
  * <pubkey_1 + pubkey_2> checksig

The ordering of the compiled tapscripts dependent on the lexicographic ordering of the pubkeys in the multi descriptor expression, and therefore correspond to the same position and weight tuple order.

Appendix: Encoding a Taproot Descriptor

Let us consider the following list of tapscripts.

* P2PKH(key)
* P2WSH(key)
* multi_non_interactive(2, 3 keys)

Note that the multi_non_interactive descriptor actually expands to multiple 2-of-2 scripts..

* P2PKH(key)
* P2WSH(key)
* multi_add(pubkey_0, pubkey_1)
* multi_add(pubkey_0, pubkey_2)
* multi_add(pubkey_1, pubkey_2)

We would like to have the leafscripts above in the following tree structure:

---------------------------------------
                           *
                          / \  
                         *  | 
                        / \ |
                       *  | |
                      / \ | |
                     *  | | |
                    / \ | | |
multi_add(pk0, pk2) | | | | |
                      | | | |
           P2WSH(key) | | | |
                        | | |
    multi_add(pk0, pk1) | | |
                          | |
               P2PKH(key) | |
                            |
        multi_add(pk0, pk1) |
---------------------------------------

Since multi_add(pk0, pk2) and P2WSH(key) are siblings at the same depth, their descripters are ordered lexicographically. Next we need to determine the minimal weights necessary to support imply the tree structure above.

---------------------------------------
                     2
                    / \ 
                    1 1
multi_add(pk0, pk2) | | 
                      |
           P2WSH(key) | 
---------------------------------------

The lowest leafs require the lowest possible weights, namely 1. The resulting parent has a merged weight of 2. The next parent node is created from nodes with the lowest weights, so the 3rd leaf from left can continue to have a weight of 1. However, the 4th node must have a weight of 2, so it will not be merged with the 3rd node (if multiple merge-pair-candidates are available, the pair to the left is selected). We can also conclude that the 5th node must have a weight of 3, so it will not be merged with the fourth node.

---------------------------------------
                       3  
                      / \
                     2  |
                    / \ |
                    1 1 1 2 3 
multi_add(pk0, pk2) | | | | |
                      | | | |
           P2WSH(key) | | | |
                        | | |
    multi_add(pk0, pk1) | | |
                          | |
               P2PKH(key) | |
                            |
        multi_add(pk0, pk1) |

---------------------------------------

The final taproot tree:

---------------------------------------
                           8
                          / \
                         5  | 
                        / \ |
                       3  | |
                      / \ | |
                     2  | | |
                    / \ | | |
                    1 1 1 2 3 
multi_add(pk0, pk2) | | | | |
                      | | | |
           P2WSH(key) | | | |
                        | | |
    multi_add(pk0, pk1) | | |
                          | |
               P2PKH(key) | |
                            |
        multi_add(pk0, pk1) |
---------------------------------------

jachiang/TaprootDescriptor.md

Huffman-Encoded Taproot Output Descriptors

Motivation

Basic Design

Higher-Level Multi Output Descriptors

Appendix: Encoding a Taproot Descriptor