Restoring Bitcoin's Full Script Power
In my previous posts I’ve been carefully considering what bitcoin Script improvements we might want if we had introspection. Script was hobbled back in v0.3.1 due to denial-of-service issues: this has been a long-standing source of regret, but functions like OP_TXHASH bring Script limitations into clear focus.
Ye Olde Bitcoin Script
Most people know that Satoshi disabled OP_CAT and a few other opcodes in v0.3.1, but Anthony Towns pointed out that until v0.3 bitcoin also allowed arbitrary size numbers using the OpenSSL BIGNUM type.
This was early in the project, and I completely understand the desire to avoid DoS immediately and clearly, and restore functionality later once the issues were carefully considered. Unfortunately, the difficult nature of Script enhancements was not deeply appreciated until years later, so here we are!
A Varops Budget: Full Script Restoration Without Denial of Service
BIP-342 replaced the global signature limit with a sigops budget based on weight, designed to be ample for any reasonable signature validation (such as might be produced by miniscript), yet limited enough to avoid denial of service.
We can use the approach for other operations whose expense is related to their operand size, and similarly remove existing arbitrary limits in script. I call this a “varops” budget, as it applies to operations on variable-length operands.
My draft proposal sets the varops budget simply:
- The transaction weight multiplied by 520.
This ensures that even if the budget were enforced on existing scripts, no script could conceivably fall short (e.g. each OP_SHA256 can always operate on the maximal-size stack object, with its own opcode weight supporting that budget).
Note: the budget is for the entire transaction, not per input: this is in anticipation of introspection opcodes which mean that a fairly short script may nonetheless want to examine other inputs which may be much larger.
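A minimal sketch of how such a transaction-wide budget could be tracked. The 520 multiplier comes from the proposal above; the `VaropsMeter` class and its method names are hypothetical, chosen only for illustration.

```python
VAROPS_PER_WEIGHT_UNIT = 520  # multiplier from the draft proposal


def varops_budget(tx_weight: int) -> int:
    """Budget for the whole transaction, not per input."""
    return tx_weight * VAROPS_PER_WEIGHT_UNIT


class VaropsMeter:
    """Hypothetical helper: one meter shared by every input's script."""

    def __init__(self, tx_weight: int):
        self.remaining = varops_budget(tx_weight)

    def charge(self, cost: int) -> bool:
        """Deduct cost; returning False means the budget is exhausted."""
        if cost > self.remaining:
            return False
        self.remaining -= cost
        return True
```

Because the meter is created once per transaction, a short script that inspects large data elsewhere in the transaction draws on the weight that data already paid for.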
The consumption of the various opcodes is as follows (anything not listed doesn’t have a cost):
Opcode | Varops Budget Cost |
OP_CAT | 0 |
OP_SUBSTR | 0 |
OP_LEFT | 0 |
OP_RIGHT | 0 |
OP_INVERT | 1 + len(a) / 8 |
OP_AND | 1 + MAX(len(a), len(b)) / 8 |
OP_OR | 1 + MAX(len(a), len(b)) / 8 |
OP_XOR | 1 + MAX(len(a), len(b)) / 8 |
OP_2MUL | 1 + len(a) / 8 |
OP_2DIV | 1 + len(a) / 8 |
OP_ADD | 1 + MAX(len(a), len(b)) / 8 |
OP_SUB | 1 + MAX(len(a), len(b)) / 8 |
OP_MUL | (1 + len(a) / 8) * (1 + len(b) / 8) |
OP_DIV | (1 + len(a) / 8) * (1 + len(b) / 8) |
OP_MOD | (1 + len(a) / 8) * (1 + len(b) / 8) |
OP_LSHIFT | 1 + len(a) / 8 |
OP_RSHIFT | 1 + len(a) / 8 |
OP_EQUAL | 1 + MAX(len(a), len(b)) / 8 |
OP_NOTEQUAL | 1 + MAX(len(a), len(b)) / 8 |
OP_SHA256 | 1 + len(a) |
OP_RIPEMD160 | 0 (fails if len(a) > 520 bytes) |
OP_SHA1 | 0 (fails if len(a) > 520 bytes) |
OP_HASH160 | 1 + len(a) |
OP_HASH256 | 1 + len(a) |
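The table can be read as a small cost function. The sketch below is one interpretation, not the draft implementation: it assumes `/ 8` means integer division, that `a` and `b` are the operand lengths in bytes, and that unlisted opcodes cost nothing. The function name `varops_cost` and the grouping into sets are illustrative only.

```python
LINEAR_ONE = {"OP_INVERT", "OP_2MUL", "OP_2DIV", "OP_LSHIFT", "OP_RSHIFT"}
LINEAR_MAX = {"OP_AND", "OP_OR", "OP_XOR", "OP_ADD", "OP_SUB",
              "OP_EQUAL", "OP_NOTEQUAL"}
QUADRATIC = {"OP_MUL", "OP_DIV", "OP_MOD"}
PER_BYTE = {"OP_SHA256", "OP_HASH160", "OP_HASH256"}
CAPPED_FREE = {"OP_RIPEMD160", "OP_SHA1"}  # free, but operand is capped


def varops_cost(opcode: str, len_a: int, len_b: int = 0) -> int:
    """Cost of one opcode given operand lengths in bytes."""
    if opcode in LINEAR_ONE:
        return 1 + len_a // 8          # assumption: integer division
    if opcode in LINEAR_MAX:
        return 1 + max(len_a, len_b) // 8
    if opcode in QUADRATIC:
        return (1 + len_a // 8) * (1 + len_b // 8)
    if opcode in PER_BYTE:
        return 1 + len_a
    if opcode in CAPPED_FREE:
        if len_a > 520:
            raise ValueError("operand exceeds 520 bytes: script fails")
        return 0
    return 0  # OP_CAT, OP_SUBSTR, OP_LEFT, OP_RIGHT and anything unlisted
```

For example, hashing a maximal 520-byte element with OP_SHA256 costs 521 under this reading, while multiplying a 16-byte value by an 8-byte value costs 3 * 2 = 6.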
Removal Of Other Limits
Ethan Heilman’s proposal for restoring OP_CAT maintained a limit of 520 bytes for the result. This can now be removed, in favor of a total stack limit already valid for taproot v1 (1000 elements and 520,000 bytes).
Further, if we were to introduce a new segwit version (such as Anthony Towns’ generalized taproot, or just to allow keyless entry), we can lift these limits to reasonable blocksize maxima (perhaps 10,000 elements totalling 4M bytes).
Minor Changes to Semantics
Values are still little-endian, but unsigned. This simplifies implementation and makes the interaction of bit operations and arithmetic operations far simpler. It allows existing positive numbers to use these opcodes without modification, not requiring conversion.
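To illustrate the unsigned little-endian reading, here is a small sketch. The helper names are hypothetical, and the choice to encode with no trailing zero bytes is an assumption, not something the proposal specifies.

```python
def decode_unsigned(elem: bytes) -> int:
    """Interpret a stack element as an unsigned little-endian integer."""
    return int.from_bytes(elem, "little")


def encode_unsigned(value: int) -> bytes:
    """Encode with no trailing zero bytes (assumption: minimal form)."""
    if value < 0:
        raise ValueError("values are unsigned")
    return value.to_bytes((value.bit_length() + 7) // 8, "little")
```

An existing positive number such as 128, which today's signed encoding pads to the two bytes `80 00`, still decodes to 128 here, which is why no conversion is needed. The single byte `80` also reads as 128, since the top bit is no longer a sign bit.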
If a new segwit version were used, existing opcodes could be replaced; otherwise, new opcodes (e.g. OP_ADDV) would be added.
Implementation Details
The v0.3.0 implementation used a simple class wrapper of OpenSSL’s BIGNUM type, but for maximum clarity and simplicity I reimplemented each operation without external dependencies.
Except for OP_EQUAL/OP_EQUALVERIFY, each one converts to and from a little-wordian vector of uint64_t. This could be optimized by doing conversion on demand.
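The conversion to 64-bit words, least significant word first, might look roughly like the following. This is a Python stand-in for the idea, not the actual code: the function names are hypothetical, and trimming trailing zero bytes on the way back is an assumption.

```python
MASK64 = (1 << 64) - 1


def to_words(elem: bytes) -> list:
    """Split a little-endian byte string into 64-bit words, low word first."""
    return [int.from_bytes(elem[i:i + 8], "little")
            for i in range(0, len(elem), 8)]


def from_words(words: list) -> bytes:
    """Join words back into bytes (assumption: trailing zeros are trimmed)."""
    out = b"".join(w.to_bytes(8, "little") for w in words)
    return out.rstrip(b"\x00")


def add_words(a: list, b: list) -> list:
    """Word-by-word addition with carry, as OP_ADD might do it."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        total = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(total & MASK64)
        carry = total >> 64
    if carry:
        out.append(carry)
    return out
```

Each addition touches one word per 8 bytes of the longer operand, which lines up with a cost of the form 1 + MAX(len(a), len(b)) / 8.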
OP_DIV
, OP_MOD
and OP_MUL
are implemented naively (comparison with libgmp’s big number operations shows more sophisticated approaches are astronomically faster).
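For reference, a naive multiplication over 64-bit words is the usual schoolbook algorithm. The sketch below is a Python illustration of that technique and makes no claim to match the real code; `mul_words` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def mul_words(a: list, b: list) -> list:
    """Schoolbook multiply of two little-endian lists of 64-bit words."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # Accumulate the partial product plus any carry so far.
            total = out[i + j] + x * y + carry
            out[i + j] = total & MASK64
            carry = total >> 64
        # This slot has not been written yet, so assignment is enough.
        out[i + len(b)] = carry
    return out
```

The inner loop body runs once per pair of words, so the work grows with the product of the operand lengths, the same shape as the (1 + len(a) / 8) * (1 + len(b) / 8) cost.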
Benchmarks: Are Limits Low Enough To Prevent DoS?
Are Limits High Enough to Be Ignored?
We can remove the 520 byte limit.
We still require a limit on total stack size: with a new segwit version this could be raised to 4,000,000 bytes, or left at 520,000 bytes as per the current limit.
I’ve had a series of posts looking at Script improvements.
In my previous post on Examining scriptpubkeys in Script I pointed out that there are cases where we want to require a certain script condition, but not an exact script: an example would be a vault-like covenant which requires a delay, but doesn’t care what else is in the script.
The problem with this is that in Taproot scripts, any unknown opcode (OP_SUCCESSx) will cause the entire script to succeed without being executed, so we need to hobble this slightly. My previous proposal of some kind of separator was awkward, so I’ve developed a new idea which is simpler.
Introducing OP_SEGMENT
Currently, the entire tapscript is scanned for the OP_SUCCESS opcodes, and succeeds immediately if one is found. This would be modified:
- The tapscript is scanned for either OP_SEGMENT or OP_SUCCESSx.
- If OP_SEGMENT is found, the script up to that point is executed. If the script does not fail, scanning continues from that point.
- If OP_SUCCESSx is found, the script succeeds.
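The scanning rules above can be sketched as follows. This is an untested-against-Core illustration only: OP_SEGMENT has no assigned opcode number, so `0xbb` is a placeholder assumption taken from the OP_SUCCESSx range listed in BIP-342, and `execute` is a hypothetical callback standing in for the interpreter, which would keep the stack between segments.

```python
OP_PUSHDATA1, OP_PUSHDATA4 = 0x4C, 0x4E
OP_SEGMENT = 0xBB  # placeholder value: no opcode number has been assigned

# OP_SUCCESSx opcodes as listed in BIP-342, minus the placeholder above.
_RANGES = [(80, 80), (98, 98), (126, 129), (131, 134), (137, 138),
           (141, 142), (149, 153), (187, 254)]
OP_SUCCESS = {op for lo, hi in _RANGES for op in range(lo, hi + 1)}
OP_SUCCESS.discard(OP_SEGMENT)


def next_op(script: bytes, pos: int):
    """Return (opcode, next_pos), or None if a push runs past the end."""
    op = script[pos]
    pos += 1
    size = 0
    if op < OP_PUSHDATA1:
        size = op  # direct push of 0..75 bytes
    elif op <= OP_PUSHDATA4:
        width = 1 << (op - OP_PUSHDATA1)  # 1, 2 or 4 length bytes
        if pos + width > len(script):
            return None
        size = int.from_bytes(script[pos:pos + width], "little")
        pos += width
    if pos + size > len(script):
        return None
    return op, pos + size


def run_tapscript(script: bytes, execute) -> bool:
    """Scan for OP_SEGMENT / OP_SUCCESSx, executing one segment at a time."""
    seg_start = pos = 0
    while pos < len(script):
        decoded = next_op(script, pos)
        if decoded is None:
            return False  # undecodable before any OP_SUCCESSx
        op, pos = decoded
        if op in OP_SUCCESS:
            return True
        if op == OP_SEGMENT:
            if not execute(script[seg_start:pos]):
                return False
            seg_start = pos
    return execute(script[seg_start:])
```

In this sketch an OP_SUCCESSx only rescues the script if every earlier segment has already run without failing, and bytes after an OP_SUCCESSx are never decoded.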
This basically divides the script into segments, each executed serially. It’s not quite as simple as “cut into pieces by OP_SEGMENT and examine one at a time” because the tapscript is allowed to contain things which would fail to decode altogether, after an OP_SUCCESSx, and we want to retain that property.
When OP_SEGMENT is executed, it does nothing: it simply limits the range of OP_SUCCESS opcodes.
Implementation
The ExecuteWitnessScript function would have to be refactored (probably as a separate ExecuteTapScript, since 21 of its 38 lines are an “if Tapscript” anyway), and it also implies that the stack limits for the current tapscript would be enforced upon encountering OP_SEGMENT, even if OP_SUCCESS were to follow after.
Interestingly, the core EvalScript function wouldn’t change except to ignore OP_SEGMENT, as it’s already fairly flexible.
Note that I haven’t implemented it yet, so there may yet be surprises, but I plan to prototype after the idea has received some review!
Enjoy!