Last night f2pool mined a 1MB block containing a single 1MB transaction.  This scooped up some of the spam which has been going to various weakly-passworded "brainwallets", gaining them 0.5569 bitcoins (on top of the normal 25 BTC subsidy).  You can see the megatransaction on blockchain.info.

It was widely reported to take about 25 seconds for bitcoin core to process this block: this is far worse than my "2 seconds per MB" result in my last post, which was considered a pretty bad case.  Let's look at why.

### How Signatures Are Verified

The algorithm to check a transaction input (of this form) looks like this:

1. Strip the other inputs from the transaction.
2. Replace the input script we're checking with the script of the output it's trying to spend.
3. Hash the resulting transaction with SHA256, then hash the result with SHA256 again.
4. Check that the signature correctly signs that hash result.
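
The first three steps above can be sketched in a few lines of Python. This is a toy model, not bitcoin core's code: the `Tx` and `TxIn` types and the `serialize` layout are invented for illustration and are not the real wire format, "stripping" is modelled as emptying the other inputs' scripts, and step 4 (the actual signature check) is left out because it needs an ECDSA library.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class TxIn:
    prev_txid: bytes   # id of the transaction whose output is being spent
    prev_index: int    # which output of that transaction
    script: bytes      # input script (carries the signature once signed)


@dataclass(frozen=True)
class Tx:
    inputs: List[TxIn]
    outputs: bytes     # serialized outputs, treated as an opaque blob here


def serialize(tx: Tx) -> bytes:
    """Toy serialization (hypothetical layout, NOT the Bitcoin wire format)."""
    out = len(tx.inputs).to_bytes(4, "little")
    for txin in tx.inputs:
        out += txin.prev_txid + txin.prev_index.to_bytes(4, "little")
        out += len(txin.script).to_bytes(4, "little") + txin.script
    return out + tx.outputs


def signature_hash(tx: Tx, index: int, spent_script: bytes) -> bytes:
    # Steps 1 and 2: empty every other input's script, and put the script of
    # the output being spent into the input we are checking.
    stripped = [
        replace(txin, script=spent_script if i == index else b"")
        for i, txin in enumerate(tx.inputs)
    ]
    tx_copy = Tx(inputs=stripped, outputs=tx.outputs)
    # Step 3: SHA256 of the SHA256 of the modified transaction.
    return hashlib.sha256(hashlib.sha256(serialize(tx_copy)).digest()).digest()
```

Note that `signature_hash` builds and serializes a fresh copy of the whole transaction for every input, which is the same shape of work the next paragraph describes.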

Now, for a transaction with 5570 inputs, we have to do this 5570 times.  And the bitcoin core code does this by making a copy of the transaction each time, and using the marshalling code to hash that; it's not a huge surprise that we end up spending 20 seconds on it.

### How Fast Could Bitcoin Core Be If Optimized?

Once we strip the inputs, the result is only about 6k long; hashing 6k 5570 times takes about 265 milliseconds (on my modern i3 laptop).  We have to do some work to change the transaction each time, but we should end up under half a second without any major backflips.
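
A rough way to get a feel for that number is to time the hashing alone. The sketch below assumes a flat 6000-byte buffer standing in for the stripped transaction (the "about 6k" above) and uses Python's `hashlib`; the timing it prints depends entirely on the machine and will not match the 265 milliseconds quoted here.

```python
import hashlib
import time

STRIPPED_TX_BYTES = 6_000   # "about 6k" from the text, taken as exactly 6000
NUM_INPUTS = 5_570


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def time_signature_hashing(tx_bytes: int, num_inputs: int) -> float:
    """Seconds spent double-hashing a tx_bytes buffer num_inputs times."""
    buf = bytearray(tx_bytes)
    start = time.perf_counter()
    for i in range(num_inputs):
        # Change a few bytes each pass, standing in for the per-input
        # script replacement, so every hash is over different data.
        buf[0:4] = i.to_bytes(4, "little")
        double_sha256(buf)
    return time.perf_counter() - start


total_bytes = STRIPPED_TX_BYTES * NUM_INPUTS
elapsed = time_signature_hashing(STRIPPED_TX_BYTES, NUM_INPUTS)
print(f"{total_bytes / 1e6:.1f} MB hashed in {elapsed * 1000:.0f} ms")
```

The total comes to roughly 33 MB of data, which is why the hashing itself is cheap compared with the copying and marshalling around it.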

Problem solved?  Not quite....

### This Block Isn't The Worst Case (For An Optimized Implementation)

As I said above, the amount we have to hash is about 6k; if a transaction has larger outputs, that number grows, though we can then fit in fewer inputs.  A simple simulation shows the worst case for a 1MB transaction has 3300 inputs, and 406000 bytes of output(s): simply doing the hashing for input signatures takes about 10.9 seconds.  That's only about two or three times faster than the bitcoind naive implementation.

This problem would be far worse if blocks were 8MB: an 8MB transaction with 22,500 inputs and 3.95MB of outputs takes over 11 minutes to hash.  If you can mine one of those, you can keep competitors off your heels forever, and own the bitcoin network... Well, probably not.  But there'd be a lot of emergency patching, forking and screaming...
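
The quadratic growth is easy to see with a back-of-envelope model. The sketch below is not the simulation used for the figures above: it assumes every signed input costs 180 bytes of transaction space and 41 bytes once stripped, and it ignores the few extra bytes for the replaced script. Because of those assumptions its optimum lands near, but not exactly on, the input counts quoted above.

```python
SIGNED_INPUT_BYTES = 180    # assumption: space one signed input takes up
STRIPPED_INPUT_BYTES = 41   # assumption: what is left of an input once stripped


def bytes_hashed(max_tx_bytes: int, num_inputs: int) -> int:
    """Total bytes hashed to check every input of a full-size transaction."""
    output_bytes = max_tx_bytes - num_inputs * SIGNED_INPUT_BYTES
    if output_bytes < 0:
        raise ValueError("too many inputs for this transaction size")
    per_input = num_inputs * STRIPPED_INPUT_BYTES + output_bytes
    return num_inputs * per_input


def worst_case_inputs(max_tx_bytes: int, step: int = 100) -> int:
    """Input count (in multiples of step) that maximizes the hashing work."""
    max_inputs = max_tx_bytes // SIGNED_INPUT_BYTES
    return max(range(0, max_inputs + 1, step),
               key=lambda n: bytes_hashed(max_tx_bytes, n))


for size in (1_000_000, 8_000_000):
    n = worst_case_inputs(size)
    print(f"{size} byte tx: {n} inputs, "
          f"{bytes_hashed(size, n) / 1e9:.1f} GB hashed")
```

In this model, eight times the transaction size means about 64 times the hashing work; applied to the 10.9 seconds above, that lands in the same range as the "over 11 minutes" figure.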

### Short Term Steps

An optimized implementation in bitcoind is a good idea anyway, and there are three obvious paths:

1. Optimize the signature hash path to avoid the copy, and hash in place as much as possible.
2. Use the Intel and ARM optimized SHA256 routines, which increase SHA256 speed by about 80%.
3. Parallelize the input checking for large numbers of inputs.
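
As an illustration of the third path only, here is a minimal sketch that spreads per-input hashing across a thread pool. The `hash_for_input` helper is a hypothetical stand-in for "build the variant of the transaction for this input and hash it", not anything from bitcoind, and whether threads actually buy a speedup depends on the runtime and on how large each buffer is.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def hash_for_input(stripped_tx: bytes, index: int) -> bytes:
    # Hypothetical stand-in for the per-input transaction variant:
    # prefix the input index so each input hashes different data.
    return double_sha256(index.to_bytes(4, "little") + stripped_tx)


def hash_all_serial(stripped_tx: bytes, num_inputs: int) -> List[bytes]:
    return [hash_for_input(stripped_tx, i) for i in range(num_inputs)]


def hash_all_parallel(stripped_tx: bytes, num_inputs: int,
                      workers: int = 4) -> List[bytes]:
    # Each input's hash is independent of the others, so the work can be
    # handed out in any order; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: hash_for_input(stripped_tx, i),
                             range(num_inputs)))
```

The point of the sketch is only that the inputs are independent: both functions return the same list of hashes, so the parallel version can replace the serial one without changing results.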

### Longer Term Steps

A soft fork could introduce an OP_CHECKSIG2, which hashes the transaction in a different order.  In particular, it should hash the input script replacement at the end, so the "midstate" of the hash can be trivially reused.  This doesn't entirely eliminate the problem, since the sighash flags can require other permutations of the transaction; these would have to be carefully explored (or only allowed with OP_CHECKSIG).
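
To see why hashing the replaced script last helps, here is a minimal sketch of midstate reuse using `hashlib`'s `copy()`. The layout (one shared `common` part followed by a short per-input part) is purely hypothetical; it is not a proposed OP_CHECKSIG2 serialization.

```python
import hashlib
from typing import Iterable, List


def sighash_naive(common: bytes, per_input: bytes) -> bytes:
    """Rehash the shared part from scratch for every input."""
    return hashlib.sha256(hashlib.sha256(common + per_input).digest()).digest()


def sighashes_with_midstate(common: bytes,
                            per_input_parts: Iterable[bytes]) -> List[bytes]:
    """Hash the shared part once, then finish the hash separately per input."""
    midstate = hashlib.sha256(common)      # state after the shared prefix
    results = []
    for part in per_input_parts:
        h = midstate.copy()                # cheap copy of the hash state
        h.update(part)                     # only the per-input tail is new work
        results.append(hashlib.sha256(h.digest()).digest())
    return results
```

With this ordering the cost per input depends on the size of the tail rather than the size of the whole transaction, so the total work grows linearly with the number of inputs instead of quadratically.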

This soft fork could also place limits on how big an OP_CHECKSIG-using transaction could be.

Such a change will take a while: there are other things which would be nice to change for OP_CHECKSIG2, such as new sighash flags for the Lightning Network, and removing the silly DER encoding of signatures.