Stopping Arbitrary Data

Disclaimer: This is not an endorsement of any of the ideas presented in this document.

These are notes from the 2025-09-15 livestream: https://www.twitch.tv/videos/2567397716

Disallow From Output Scripts

First, output scripts would have to be validated at all.

Every opcode must be a defined opcode, and must not be disabled or an immediate failure (e.g. OP_CAT or OP_RETURN)
An output script must fail with invalid stack operation when executed by itself
- Bypass: Start script with OP_DROP, then push all the data
Or, the script must fail with invalid stack operation, but the interpreter ignores that failure and the script must end cleanstack
- Bypass: push data, then OP_DROP everything, add an extra OP_DROP
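The two rules and their bypasses can be sketched with a toy interpreter. This is a minimal illustration, not consensus code: it models only data pushes and OP_DROP, the names `execute`, `passes_first_rule`, and `passes_second_rule` are hypothetical, and "cleanstack" is assumed here to mean an empty final stack.

```python
class InvalidStackOperation(Exception):
    """An opcode needed more stack items than were available."""


def execute(script, ignore_underflow=False):
    """Run a toy script and return (final_stack, underflowed).

    Tokens are either bytes (a data push) or the string "OP_DROP".
    No other opcode is modelled in this sketch.
    """
    stack, underflowed = [], False
    for token in script:
        if isinstance(token, bytes):
            stack.append(token)
        elif token == "OP_DROP":
            if stack:
                stack.pop()
            elif ignore_underflow:
                underflowed = True  # note the failure and keep going
            else:
                raise InvalidStackOperation(token)
        else:
            raise ValueError(f"opcode not modelled: {token!r}")
    return stack, underflowed


def passes_first_rule(script):
    """Rule 1: the script must fail with an invalid stack operation."""
    try:
        execute(script)
    except InvalidStackOperation:
        return True
    return False


def passes_second_rule(script):
    """Rule 2: an invalid stack operation occurs, is ignored, and the stack ends empty."""
    stack, underflowed = execute(script, ignore_underflow=True)
    return underflowed and not stack


def payload(script):
    """Arbitrary data carried by the script's pushes."""
    return b"".join(t for t in script if isinstance(t, bytes))


# Bypass for rule 1: fail immediately, then carry the data.
bypass_1 = ["OP_DROP", b"arbitrary ", b"data"]
# Bypass for rule 2: push, drop everything, then one extra OP_DROP.
bypass_2 = [b"arbitrary ", b"data", "OP_DROP", "OP_DROP", "OP_DROP"]
```

In this sketch `bypass_1` satisfies the first rule but not the second, while `bypass_2` satisfies the second; both still carry the same payload.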

Basically intractable to do analysis on complex output scripts, since an output script is only half of the script that eventually runs; the other half comes from the spending input.

The only allowed output scripts are P2PKH, P2SH, P2WPKH, P2WSH, and P2TR
- P2PKH, P2SH, and P2WPKH would still allow 20 bytes of data per output
- P2WSH and P2TR would still allow 32 bytes of data per output
The other currently standard types (i.e. P2PK, bare multisig, OP_RETURN) are no longer valid
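A template whitelist like the one above could be sketched as byte-pattern matching on the output script. The function name `classify_output_script` is hypothetical; the byte patterns are the usual encodings of these five templates. The second return value is the number of bytes the creator of the output chooses freely, which is the data capacity noted in the bullets.

```python
def classify_output_script(spk: bytes):
    """Return (template, free_bytes) for an allowed template, else None."""
    # P2PKH: OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if len(spk) == 25 and spk[:3] == b"\x76\xa9\x14" and spk[23:] == b"\x88\xac":
        return "P2PKH", 20
    # P2SH: OP_HASH160 <20 bytes> OP_EQUAL
    if len(spk) == 23 and spk[:2] == b"\xa9\x14" and spk[22:] == b"\x87":
        return "P2SH", 20
    # P2WPKH: OP_0 <20 bytes>
    if len(spk) == 22 and spk[:2] == b"\x00\x14":
        return "P2WPKH", 20
    # P2WSH: OP_0 <32 bytes>
    if len(spk) == 34 and spk[:2] == b"\x00\x20":
        return "P2WSH", 32
    # P2TR: OP_1 <32 bytes>
    if len(spk) == 34 and spk[:2] == b"\x51\x20":
        return "P2TR", 32
    return None  # P2PK, bare multisig, OP_RETURN, and anything else


# Nothing checks that the 20 bytes are really a hash of a key:
fake_p2wpkh = b"\x00\x14" + b"twenty bytes of data"
op_return = b"\x6a\x04" + b"data"
```

Here `fake_p2wpkh` is accepted as P2WPKH even though its "hash" is plain text, while `op_return` is rejected.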

Disallow From Input Scripts

Make the inscription envelope (OP_FALSE OP_IF) invalid
- Bypass: Move OP_FALSE to the stack
- Bypass: OP_1 OP_1 OP_SUB OP_IF
Scripts with unreachable branches are invalid
For tapscript, OP_IF as the first opcode is disallowed
- Bypass: OP_CHECKSIG then the OP_IF ...
Disallow OP_IF
- Massive drawbacks: kills HTLCs and other useful things, e.g. OP_IF is used in Miniscript
Scripts must be valid Miniscript, and analysis on all branches must be satisfiable
- Bypass: big multisig
Scripts must be valid Miniscript, except no multi() or multi_a()
- Multisigs need to use Taproot and MuSig or FROST
- Bypass: or_i() a bunch of hash fragments
- Removes all upgrade paths, because there are no OP_NOPs or OP_SUCCESS opcodes
  - Keeping OP_NOP or OP_SUCCESS breaks the analysis
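The first rule in this list, and why pattern matching on it fails, can be sketched as follows. Scripts are modelled as lists of opcode names with `"<data>"` standing in for the pushed payload; `has_literal_envelope` and `if_condition` are hypothetical helpers that model only the few opcodes named in the bullets above.

```python
def has_literal_envelope(ops):
    """True if OP_FALSE (also written OP_0) is immediately followed by OP_IF."""
    return any(
        a in ("OP_FALSE", "OP_0") and b == "OP_IF"
        for a, b in zip(ops, ops[1:])
    )


def if_condition(ops):
    """Evaluate the opcodes before the first OP_IF and return its condition."""
    stack = []
    for op in ops:
        if op == "OP_IF":
            return stack.pop() != 0
        if op in ("OP_FALSE", "OP_0"):
            stack.append(0)
        elif op == "OP_1":
            stack.append(1)
        elif op == "OP_SUB":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        else:
            raise ValueError(f"opcode not modelled: {op}")
    raise ValueError("no OP_IF found")


envelope = ["OP_FALSE", "OP_IF", "<data>", "OP_ENDIF"]
bypass = ["OP_1", "OP_1", "OP_SUB", "OP_IF", "<data>", "OP_ENDIF"]
```

The literal check flags `envelope` but not `bypass`, even though `if_condition` is false for both, so the data branch is skipped in either script.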

Every Pubkey Needs To Be Valid

Creating an output requires revealing the redeemScript, witnessScript, or tapscripts for that output, then each pubkey must come with a signature to prove that a private key exists
- To preserve non-interactivity, the signature is over some other fixed message defined by consensus
- New sidecar data structure for pubkey signatures
- Maybe easier to also just make output scripts tapscript, instead of hiding behind hashes since the script needs to be revealed anyways
  - Loses benefits of taproot
- Probably no pubkey hashing
- Scripts must be Miniscript so that pubkeys can be identified
Still possible to encode some data with grinding, or with the private key encoding that BitMEX described
- thresh(1, pk_k(k1), pk_k(k2)...) as the script contains the pubkeys from the private key encoding thing with the fixed signatures using known k values.
Somehow all of the scripts and pubkeys and signatures need to be communicated to senders in order to even be put into outputs
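To see why the notes ask for a signature rather than a plain validity check, consider the weaker rule "every x-only pubkey must be a point on the curve". The sketch below is not the signature scheme described above; it implements only the on-curve test for secp256k1 (y^2 = x^3 + 7 over the field prime), and the names `is_valid_xonly_pubkey` and `valid_fraction` are hypothetical.

```python
import hashlib

P = 2**256 - 2**32 - 977  # secp256k1 field prime


def is_valid_xonly_pubkey(x_bytes: bytes) -> bool:
    """True if the 32 bytes are the x-coordinate of a point on secp256k1."""
    if len(x_bytes) != 32:
        return False
    x = int.from_bytes(x_bytes, "big")
    if x >= P:
        return False
    c = (pow(x, 3, P) + 7) % P
    # P % 4 == 3, so a square root of c (if one exists) is c^((P+1)/4) mod P.
    y = pow(c, (P + 1) // 4, P)
    return y * y % P == c


def valid_fraction(samples: int = 1000) -> float:
    """Fraction of pseudo-random 32-byte strings that pass the on-curve test."""
    hits = sum(
        is_valid_xonly_pubkey(hashlib.sha256(i.to_bytes(4, "big")).digest())
        for i in range(samples)
    )
    return hits / samples


# x-coordinate of the secp256k1 generator, a known valid point.
GENERATOR_X = bytes.fromhex(
    "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
)
```

Roughly half of all 32-byte strings pass this test, so grinding a byte or two of padding is enough to make arbitrary data look like a valid key. Requiring a signature per pubkey closes that gap, leaving the grinding and private key encoding routes mentioned above.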

Don't Forget The Control Block and Annex

Reduce the scriptpath merkle tree depth, but there will always be room for more data
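As a rough illustration of the trade-off, the control block defined in BIP 341 is 33 bytes plus one 32-byte hash per level of the merkle path, with a maximum depth of 128. The helper names below are hypothetical.

```python
MAX_DEPTH = 128  # current consensus limit on the script path merkle depth


def control_block_size(depth: int) -> int:
    """Size in bytes of a Taproot control block for a leaf at the given depth."""
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError("depth out of range")
    # 1 byte leaf version and parity + 32 byte internal key + 32 bytes per level
    return 33 + 32 * depth


def merkle_path_bytes(depth: int) -> int:
    """Bytes of sibling hashes in the control block at the given depth."""
    return control_block_size(depth) - 33


sizes = {depth: merkle_path_bytes(depth) for depth in (0, 1, 8, 32, 128)}
```

Each level that remains contributes another 32 bytes of sibling hash, so lowering the limit shrinks the space without removing it unless script paths are dropped entirely.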

Upgrade Paths Allow Data

No Annex: Annex allows for an arbitrary amount of data to be pushed to the stack
No unknown witness versions: they can carry up to 40 bytes of data in the output script
- Unknown witness versions can have anything in their witness stack which means that there can be maximal data included

All upgrades must be via hard fork.... eww.
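The 40-byte figure for unknown witness versions comes from the witness program format in BIP 141: a version opcode (OP_0 or OP_1 through OP_16) followed by a single direct push of 2 to 40 bytes. A minimal parser sketch, with the hypothetical name `parse_witness_program`:

```python
def parse_witness_program(spk: bytes):
    """Return (version, program) if spk is a witness program, else None."""
    if not 4 <= len(spk) <= 42:
        return None
    # First byte: OP_0 (0x00) or OP_1..OP_16 (0x51..0x60)
    if spk[0] != 0x00 and not 0x51 <= spk[0] <= 0x60:
        return None
    # Second byte: a direct push covering the rest of the script (2 to 40 bytes)
    if spk[1] != len(spk) - 2:
        return None
    version = 0 if spk[0] == 0x00 else spk[0] - 0x50
    return version, spk[2:]


# A version 16 output whose 40-byte "program" is just data.
data = b"forty bytes of arbitrary data go here..."
future_version_output = bytes([0x60, len(data)]) + data
```

Nothing constrains the program bytes of a version that has no defined meaning yet, which is why `future_version_output` parses as a witness program carrying all 40 bytes of `data`.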

No Expressivity, Keys Only

Every output is a P2TR, no scriptpath, with a signature over a fixed message
Still possible to MuSig and FROST for multisigs
Can't do anything else interesting, including HTLCs, or covenants, or whatever other neat script thing that a bunch of people want to do.

Other Random Places in Txs to Put Data

Always enforce locktime regardless of input sequence to prevent data in locktime
- As long as the locktime is less than current timestamp or block height, data can be stored
Every transaction must be >= v2 to prevent data in input sequences
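The locktime bullet can be quantified with a small calculation. Locktime values below 500,000,000 are interpreted as block heights and values at or above it as Unix timestamps; any value already in the past is satisfied. The chain height and time used below are illustrative assumptions, not measurements, and the function names are hypothetical.

```python
import math

LOCKTIME_THRESHOLD = 500_000_000  # below: block height, at or above: Unix time


def satisfiable_locktimes(current_height: int, current_time: int) -> int:
    """Count locktime values that are already in the past."""
    heights = min(current_height, LOCKTIME_THRESHOLD)   # 0 .. current_height - 1
    times = max(0, current_time - LOCKTIME_THRESHOLD)   # threshold .. current_time - 1
    return heights + times


def data_bits(current_height: int, current_time: int) -> float:
    """Bits of arbitrary data that fit in an always-enforced locktime."""
    return math.log2(satisfiable_locktimes(current_height, current_time))


# Illustrative values only: height 900,000 and a timestamp in late 2025.
bits = data_bits(900_000, 1_757_000_000)
```

With these assumed values, just over 30 of the field's 32 bits remain usable for data even when locktime is always enforced.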

Instead of Stopping, Make Arbitrary Data More Expensive

Increase the weight cost of outputs: 20 weight per byte instead of 4 weight per byte
Unexecuted branches in inputs don't get witness discount
- Bypass: Push then OP_DROP
Original stack items that are consumed but not processed by an opcode (e.g. OP_DROP) don't get witness discount
- Bypass: OP_EQUAL OP_NOT
Only stack items consumed by OP_CHECK(MULTI)SIG(ADD) or the Taproot keypath get the witness discount
- Bypass: CHECKMULTISIG or CHECKSIGADD with fake pubkeys
Any pubkey which wasn't used in the input (no/invalid sig) doesn't get the discount
Altstack must be empty at the end of the script

Unexecuted branches in inputs don't get witness discount
- Bypass: Push then OP_DROP
Original stack items that are consumed but not processed by an opcode (e.g. OP_DROP) don't get witness discount
- Bypass: OP_EQUAL OP_NOT
Only stack items consumed by OP_CHECK(MULTI)SIG(ADD) or the Taproot keypath get the witness discount
- Bypass: CHECKMULTISIG or CHECKSIGADD with fake pubkeys
Any pubkey which wasn't used in the input (no/invalid sig) doesn't get the discount
Altstack must be empty at the end of the script

These sound like good ideas worth thinking about.

But I don't like this:

Increase the weight cost of outputs: 20 weight per byte instead of 4 weight per byte

This would make monetary transactions considerably more expensive while barely impacting large inscriptions.
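A back-of-the-envelope calculation illustrates the point. Under the current rules, non-witness bytes cost 4 weight units each and witness bytes cost 1. The sketch below changes only the output factor to 20. The transaction sizes are rough assumptions chosen for illustration (a 1-input, 2-output P2WPKH payment and a 100,000-byte witness inscription with a single P2TR output), and `tx_weight` is a hypothetical helper.

```python
def tx_weight(other_base_bytes, output_bytes, witness_bytes, output_factor=4):
    """Weight with a configurable per-byte factor for output bytes."""
    return 4 * other_base_bytes + output_factor * output_bytes + witness_bytes


def cost_increase(other_base_bytes, output_bytes, witness_bytes):
    """Ratio of weight at 20 per output byte to weight at 4 per output byte."""
    before = tx_weight(other_base_bytes, output_bytes, witness_bytes, 4)
    after = tx_weight(other_base_bytes, output_bytes, witness_bytes, 20)
    return after / before


# Assumed sizes in bytes. Non-output base data: version, counts, one input, locktime.
OTHER_BASE = 4 + 1 + 41 + 1 + 4

# Payment: two 31-byte P2WPKH outputs, about 110 bytes of witness data.
payment = cost_increase(OTHER_BASE, 2 * 31, 110)

# Inscription: one 43-byte P2TR output, about 100,000 bytes of witness data.
inscription = cost_increase(OTHER_BASE, 43, 100_000)
```

Under these assumptions the payment becomes well over twice as heavy, while the inscription's weight grows by less than one percent, because almost all of its bytes sit in the witness rather than in outputs.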

achow101/stopping-arbitrary-data-notes.md