Quick Stats on zstandard (zstd) Performance

Was looking at using zstd for backup, and wanted to see the effect of different compression levels. I backed up my (built) bitcoin source, which is a decent representation of my home directory, but only weighs in 2.3GB. zstd -1 compressed it 71.3%, zstd -22 compressed it 78.6%, and here’s a graph showing runtime (on my laptop) and the resulting size:

zstandard compression (bitcoin source code, object files and binaries) times and sizes

For this corpus, sweet spots are 3 (the default), 6 (2.5x slower, 7% smaller), 14 (10x slower, 13% smaller) and 20 (46x slower, 22% smaller). Spreadsheet with results here.

Minor update on transaction fees: users still don’t care.

I ran some quick numbers on the last retargeting period (blocks 415296 through 416346 inclusive) which is roughly a week’s worth.

Blocks were full: median 998k mean 818k (some miners blind mining on top of unknown blocks). Yet of the 1,618,170 non-coinbase transactions, 48% were still paying dumb, round fees (like 5000 satoshis). Another 5% were paying dumbround-numbered per-byte fees (like 80 satoshi per byte).

The mean fee was 24051 satoshi (~16c), the mean fee rate 60 satoshi per byte. But if we look at the amount you needed to pay to get into a block (using the second cheapest tx which got in), the mean was 16.81 satoshis per byte, or about 5c.

tl;dr: It’s like a tollbridge charging vehicles 7c per ton, but half the drivers are just throwing a quarter as they drive past and hoping it’s enough. It really shows fees aren’t high enough to notice, and transactions don’t get stuck often enough to notice. That’s surprising; at what level will they notice? What wallets or services are they using?

Bitcoin Generic Address Format Proposal

I’ve been implementing segregated witness support for c-lightning; it’s interesting that there’s no address format for the new form of addresses.  There’s a segregated-witness-inside-p2sh which uses the existing p2sh format, but if you want raw segregated witness (which is simply a “0” followed by a 20-byte or 32-byte hash), the only proposal is BIP142 which has been deferred.

If we’re going to have a new address format, I’d like to make the case for shifting away from bitcoin’s base58 (eg. 1At1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2):

  1. base58 is not trivial to parse.  I used the bignum library to do it, though you can open-code it as bitcoin-core does.
  2. base58 addresses are variable-length.  That makes webforms and software mildly harder, but also eliminates a simple sanity check.
  3. base58 addresses are hard to read over the phone.  Greg Maxwell points out that the upper and lower case mix is particularly annoying.
  4. The 4-byte SHA check does not guarantee to catch the most common form of errors; transposed or single incorrect letters, though it’s pretty good (1 in 4 billion chance of random errors passing).
  5. At around 34 letters, it’s fairly compact (36 for the BIP141 P2WPKH).

This is my proposal for a generic replacement (thanks to CodeShark for generalizing my previous proposal) which covers all possible future address types (as well as being usable for current ones):

  1. Prefix for type, followed by colon.  Currently “btc:” or “testnet:“.
  2. The full scriptPubkey using base 32 encoding as per http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt.
  3. At least 30 bits for crc64-ecma, up to a multiple of 5 to reach a letter boundary.  This covers the prefix (as ascii), plus the scriptPubKey.
  4. The final letter is the Damm algorithm check digit of the entire previous string, using this 32-way quasigroup. This protects against single-letter errors as well as single transpositions.

These addresses look like btc:ybndrfg8ejkmcpqxot1uwisza345h769ybndrrfg (41 digits for a P2WPKH) or btc:yybndrfg8ejkmcpqxot1uwisza345h769ybndrfg8ejkmcpqxot1uwisza34 (60 digits for a P2WSH) (note: neither of these has the correct CRC or check letter, I just made them up).  A classic P2PKH would be 45 digits, like btc:ybndrfg8ejkmcpqxot1uwisza345h769wiszybndrrfg, and a P2SH would be 42 digits.

While manually copying addresses is something which should be avoided, it does happen, and the cost of making them robust against common typographic errors is small.  The CRC is a good idea even for machine-based systems: it will let through less than 1 in a billion mistakes.  Distinguishing which blockchain is a nice catchall for mistakes, too.

We can, of course, bikeshed this forever, but I wanted to anchor the discussion with something I consider fairly sane.

BIP9: versionbits In a Nutshell

Hi, I was one of the authors/bikeshedders of BIP9, which Pieter Wuille recently refined (and implemented) into its final form.  The bitcoin core plan is to use BIP9 for activations from now on, so let’s look at how it works!

Some background:

  • Blocks have a 32-bit “version” field.  If the top three bits are “001”, the other 29 bits represent possible soft forks.
  • BIP9 uses the same 2016-block periods (roughly 2 weeks) as the difficulty adjustment does.

So, let’s look at BIP68 & 112 (Sequence locks and OP_CHECKSEQUENCEVERIFY) which are being activated together:

  • Every soft fork chooses an unused bit: these are using bit 1 (not bit 0), so expect to see blocks with version 536870914.
  • Every soft fork chooses an start date: these use May 1st, 2016, and time out a year later if it fails.
  • Every period, we look back to see if 95% have a bit set (75% for testnet).
    • If so, and that bit is for a known soft fork, and we’re within its start time that soft fork is locked-in: it will activate after another 2016 blocks, giving the stragglers time to upgrade.

There are also two alerts in the bitcoin core implementation:

  • If at any stage 50 of the last 100 blocks have unexpected bits set, you get Warning: Unknown block versions being mined! It’s possible unknown rules are in effect.
  • If we see an unknown softfork bit activate: you get Warning: unknown new rules activated (versionbit X).

Now, when could the OP_CSV soft forks activate? bitcoin-core will only start setting the bit in the first period after the start date, so somewhere between 1st and 15th of May[1], then will take another period to lock-in (even if 95% of miners are already upgraded), then another period to activate.  So early June would be the earliest possible date, but we’ll get two weeks notice for sure.

The Old Algorithm

For historical purposes, I’ll describe how the old soft-fork code worked.  It used version as a simple counter, eg. 3 or above meant BIP66, 4 or above meant BIP65 support.  Every block, it examined the last 1000 blocks to see if more than 75% had the new version.  If so, then the new softfork rules were enforced on new version blocks: old version blocks would still be accepted, and use the old rules.  If more than 95% had the new version, old version blocks would be rejected outright.

I remember Gregory Maxwell and other core devs stayed up late several nights because BIP66 was almost activated, but not quite.  And as a miner there was no guarantee on how long before you had to upgrade: one smaller miner kept producing invalid blocks for weeks after the BIP66 soft fork.  Now you get two weeks’ notice (probably more if you’re watching the network).

Finally, this change allows for miners to reject a particular soft fork without rejecting them all.  If we’re going to see more contentious or competing proposals in the future, this kind of plumbing allows it.

Hope that answers all your questions!


[1] It would be legal for an implementation to start setting it on the very first block past the start date, though it’s easier to only think about version bits once every two weeks as bitcoin-core does.

Bitcoin And Stuck Transactions?

One problem of filling blocks is that transactions with too-low fees will get “stuck”; I’ve read about such things happening on Reddit.  Then one of my coworkers told me that those he looked at were simply never broadcast properly, and broadcasting them manually fixed it.  Which lead both of us to wonder how often it’s really happening…

My approach is to look at the last 2 years of block data, and make a simple model:

  1. I assume the tx is not a priority tx (some miners reserve space for these; default 50k).
  2. I judge the “minimum feerate to get into a block” as the smallest feerate for any transaction after the first 50k beyond the coinbase (this is an artifact of how bitcoin core builds transactions; priority area first).
  3. I assume the tx won’t be included in “empty” blocks with only a coinbase or a single non-coinbase tx (SPV mining); their feerate is “infinite”.

Now, what feerate do we assume?  The default “dumb wallet” fee is 10000 satoshi per kilobyte: bitcoin-core doesn’t do this pro-rata, so a median 300-byte transaction still pays 10000 satoshi by default (fee-per-byte 33.33).  The worse case is a transaction of exactly 1000 bytes (or, a wallet which does pro-rata fees), which would have a fee-per-byte of 10.

So let’s consider the last two years (since block 277918).  How many blocks in a row we see with a fee-per-byte > 33.33, and how many we see with a feerate of > 10:


In the last two years you would never have experienced a delay of more than 10 blocks for a median-size transaction with a 10,000 satoshi fee.

For a 1000-byte transaction paying the same fee, you would have experienced a 10 block delay 0.7% of the time, with a 20+ block delay on eight occasions: the worse being a 26 block delay at block 382918 (just under 5 hours).  But note that this fee is insufficient to be included in 40% of blocks during the last two years, too; if your wallet is generating such things without warning you, it’s time to switch wallets!

Stuck low-fee transactions are not a real user problem yet.  It’s good to see adoption of smarter wallets, though, because it’s expected that they will be in the near future…

Bitcoin: Mixed Signs of A Fee Market

Six months ago in a previous post I showed that 45% of transactions have an output of less that $1, and estimated that they would get squeezed out first as blocks filled.  It’s time to review that prediction, and also to see several things:

  1. Are fees rising?
  2. Are fees detached from magic (default) numbers of satoshi?
  3. Are low value transactions getting squeezed out?
  4. Are transactions starting to shrink in response to fee pressure?

Here are some scenarios: low-value transactions might be vanishing even if nothing else changes, because people’s expectations (“free global microtransactions!” are changing).  Fees might be rising but still on magic numbers, because miners and nodes increased their relayfee due to spam attacks (most commonly, the rate was increased from 1000 satoshi per kb to 5000 satoshi per kb).  Finally, we’d eventually expect wallets which produce large transactions (eg. using uncompressed signatures) to lose popularity, and wallets to get smarter about transaction generation (particularly once Segregated Witness makes it fairly easy).

Fees For The Last 2 Years

The full 4 year graph is very noisy, so I only plotted the mean txfee/kb for each day for the last two years, in Satoshi and USD (thanks to the Coindesk BPI data for the conversion):


Conclusion: Too noisy to be conclusive: they seem to be rising recently, but some of that reflects the exchange rate changes.

Are Fees on Magic Boundaries?

Wallets should be estimating fees: in a real fee market they’d need to.

Dumb wallets pay a fixed fee per kb: eg. the bitcoin-core wallet pays 1,000 (now 5,000) satoshi per kb by default; even if the transaction is 300 bytes, it will pay 5,000 satoshi.  Some wallets use (slightly more sensible) scaling-by-size, so they’d pay 1,500 satoshi.  So if a transaction fee ends in “000”, or the scaled transaction fee does (+/- 2) we can categorize them as “fixed fee”.  We assume others are using a variable fee (about 0.6% will be erroneously marked as fixed):

This graph is a bit dense, so we thin it by grouping into weeks:


Conclusion: Wallets are starting to adapt to fee pressure, though the majority are still using a fixed fee.

Low Value Transactions For Last 4 Years

We categorize 4 distinct types of transactions: ones which have an output below 25c, ones which have an output between 25c and $1, ones which have an output between $1 and $5, and ones which have no output below $5, and graph the trends for each for the last four years:

Conclusion: 25c transactions are flat (ignoring those spam attack spikes).  < $1 and <$5 are growing, but most growth is coming from transactions >= $5.

Transaction Size For Last 4 Years

Here are the transaction sizes for the last 4 years:

Conclusion: There seems to be a slight decline in transaction sizes, but it’s not clear the cause, and it might be just noise.


There are signs of a nascent fee market, but it’s still very early. I’d expect something conclusive in the next 6 months.

The majority of fees should be variable, and they’re not: wallets remain poor, but users will migrate as blocks fill and more transactions get stuck.

A fee rate of over 10c per kb (2.5c per median transaction) hasn’t suppressed 25c transactions: perhaps it’s not high enough yet, or perhaps wallets aren’t making the relative fees clear enough (eg. my Trezor gives fees in BTC, as well as  only offering fixed fee rates).

The slight dip in mean transaction sizes and lack of growth in 25c transactions to may point to early market pressure, however.

Six months ago I showed that 45% of transactions were less than a dollar.  In the last six months that has declined to 38%.  I previously estimated that we would want larger blocks within two years, and need them within three.  That still seems a reasonable estimate.


I used bitcoin-iterate and a really crappy Makefile to generate CSVs with the data.  You can see the result on github or go straight to downloading the Gnumeric spreadsheet with the graphs.

Disclaimer: I Work For Blockstream

On lightning.  Not on drawing pretty graphs.  But I wanted to see the data…


ccan/mem’s memeqzero iteration

On Thursday I was writing some code, and I wanted to test if an array was all zero.  First I checked if ccan/mem had anything, in case I missed it, then jumped on IRC to ask the author (and overall CCAN co-maintainer) David Gibson about it.

We bikeshedded around names: memallzero? memiszero? memeqz? memeqzero() won by analogy with the already-extant memeq and memeqstr. Then I asked:

rusty: dwg: now, how much time do I waste optimizing?
dwg: rusty, in the first commit, none

Exactly five minutes later I had it implemented and tested.

The Naive Approach: Times: 1/7/310/37064 Bytes: 50

bool memeqzero(const void *data, size_t length)
    const unsigned char *p = data;

    while (length) {
        if (*p)
            return false;
    return true;

As a summary, I’ve give the nanoseconds for searching through 1,8,512 and 65536 bytes only.

Another 20 minutes, and I had written that benchmark, and an optimized version.

128-byte Static Buffer: Times: 6/8/48/5872 Bytes: 108

Here’s my first attempt at optimization; using a static array of 128 bytes of zeroes and assuming memcmp is well-optimized for fixed-length comparisons.  Worse for small sizes, much better for big.

 const unsigned char *p = data;
 static unsigned long zeroes[16];

 while (length > sizeof(zeroes)) {
     if (memcmp(zeroes, p, sizeof(zeroes)))
         return false;
     p += sizeof(zeroes);
     length -= sizeof(zeroes);
 return memcmp(zeroes, p, length) == 0;

Using a 64-bit Constant: Times: 12/12/84/6418 Bytes: 169

dwg: but blowing a cacheline (more or less) on zeroes for comparison, which isn’t necessarily a win

Using a single zero uint64_t for comparison is pretty messy:

bool memeqzero(const void *data, size_t length)
    const unsigned char *p = data;
    const unsigned long zero = 0;
    size_t pre;
    pre = (size_t)p % sizeof(unsigned long);
    if (pre) {
        size_t n = sizeof(unsigned long) - pre;
        if (n > length)
            n = length;
        if (memcmp(p, &zero, n) != 0)
            return false;
        p += n;
        length -= n;
    while (length > sizeof(zero)) {
        if (*(unsigned long *)p != zero)
            return false;
        p += sizeof(zero);
        length -= sizeof(zero);
    return memcmp(&zero, p, length) == 0;

And, worse in every way!

Using a 64-bit Constant With Open-coded Ends: Times: 4/9/68/6444 Bytes: 165

dwg: rusty, what colour is the bikeshed if you have an explicit char * loop for the pre and post?

That’s slightly better, but memcmp still wins over large distances, perhaps due to prefetching or other tricks.

Epiphany #1: We Already Have Zeroes: Times 3/5/92/5801 Bytes: 422

Then I realized that we don’t need a static buffer: we know everything we’ve already tested is zero!  So I open coded the first 16 byte compare, then memcmp()ed against the previous bytes, doubling each time.  Then a final memcmp for the tail.  Clever huh?

But it no faster than the static buffer case on the high end, and much bigger.

dwg: rusty, that is brilliant. but being brilliant isn’t enough to make things work, necessarily :p

Epiphany #2: memcmp can overlap: Times 3/5/37/2823 Bytes: 307

My doubling logic above was because my brain wasn’t completely in phase: unlike memcpy, memcmp arguments can happily overlap!  It’s still worth doing an open-coded loop to start (gcc unrolls it here with -O3), but after 16 it’s worth memcmping with the previous 16 bytes.  This is as fast as naive with as little as 2 bytes, and the fastest solution by far with larger numbers:

 const unsigned char *p = data;
 size_t len;

 /* Check first 16 bytes manually */
 for (len = 0; len < 16; len++) {
     if (!length)
         return true;
     if (*p)
         return false;

 /* Now we know that's zero, memcmp with self. */
 return memcmp(data, p, length) == 0;

You can find the final code in CCAN (or on Github) including the benchmark code.

Finally, after about 4 hours of random yak shaving, it turns out lightning doesn’t even want to use memeqzero() any more!  Hopefully someone else will benefit.

Broadband Speeds, New Data

Thanks to edmundedgar on reddit I have some more accurate data to update my previous bandwidth growth estimation post: OFCOM UK, who released their November 2014 report on average broadband speeds.  Whereas Akamai numbers could be lowered by the increase in mobile connections, this directly measures actual broadband speeds.

Extracting the figures gives:

  1. Average download speed in November 2008 was 3.6Mbit
  2. Average download speed in November 2014 was 22.8Mbit
  3. Average upload speed in November 2014 was 2.9Mbit
  4. Average upload speed in November 2008 to April 2009 was 0.43Mbit/s

So in 6 years, downloads went up by 6.333 times, and uploads went up by 6.75 times.  That’s an annual increase of 36% for downloads and 37% for uploads; that’s good, as it implies we can use download speed factor increases as a proxy for upload speed increases (as upload speed is just as important for a peer-to-peer network).

This compares with my previous post’s Akamai’s UK numbers of 3.526Mbit in Q4 2008 and 10.874Mbit in Q4 2014: only a factor of 3.08 (26% per annum).  Given how close Akamai’s numbers were to OFCOM’s in November 2008 (a year after the iPhone UK release, but probably too early for mobile to have significant effect), it’s reasonable to assume that mobile plays a large part of this difference.

If we assume Akamai’s numbers reflected real broadband rates prior to November 2008, we can also use it to extend the OFCOM data back a year: this is important since there was almost no bandwidth growth according to Akamai from Q4 2007 to Q7 2008: ignoring that period gives a rosier picture than my last post, and smells of cherrypicking data.

So, let’s say the UK went from 3.265Mbit in Q4 2007 (Akamai numbers) to 22.8Mbit in Q4 2014 (OFCOM numbers).  That’s a factor of 6.98, or 32% increase per annum for the UK. If we assume that the US Akamai data is under-representing Q4 2014 speeds by the same factor (6.333 / 3.08 = 2.056) as the UK data, that implies the US went from 3.644Mbit in Q4 2007 to 11.061 * 2.056 = 22.74Mbit in Q4 2014, giving a factor of 6.24, or 30% increase per annum for the US.

As stated previously, China is now where the US and UK were 7 years ago, suggesting they’re a reasonable model for future growth for that region.  Thus I revise my bandwidth estimates; instead of 17% per annum this suggests 30% per annum as a reasonable growth rate.

The Bitcoin Blocksize: A Summary

There’s a significant debate going on at the moment in the Bitcoin world; there’s a great deal of information and misinformation, and it’s hard to find a cogent summary in one place.  This post is my attempt, though I already know that it will cause me even more trouble than that time I foolishly entitled a post “If you didn’t run code written by assholes, your machine wouldn’t boot”.

The Technical Background: 1MB Block Limit

The bitcoin protocol is powered by miners, who gather transactions into blocks, producing a block every 10 minutes (but it varies a lot).  They get a 25 bitcoin subsidy for this, plus whatever fees are paid by those transactions.  This subsidy halves every 4 years: in about 12 months it will drop to 12.5.

Full nodes on the network check transactions and blocks, and relay them to others.  There are also lightweight nodes which simply listen for transactions which affect them, and trust that blocks from miners are generally OK.

A normal transaction is 250 bytes, and there’s a hard-coded 1 megabyte limit on the block size.  This limit was introduced years ago as a quick way of avoiding a miner flooding the young network, though the original code could only produce 200kb blocks, and the default reference code still defaults to a 750kb limit.

In the last few months there have been increasing runs of full blocks, causing backlogs for a few hours.  More recently, someone deliberately flooded the network with normal-fee transactions for several days; any transactions paying less fees than those had to wait for hours to be processed.

There are 5 people who have commit access to the bitcoin reference implementation (aka. “bitcoin-core”), and they vary significantly in their concerns on the issue.

The Bitcoin Users’ Perspective

From the bitcoin users perspective, blocks should be infinite, and fees zero or minimal.  This is the basic position of respected (but non-bitcoin-core) developer Mike Hearn, and has support from bitcoin-core ex-lead Gavin Andresen.  They work on the wallet and end-user side of bitcoin, and they see the issue as the most urgent.  In an excellent post arguing why growth is so important, Mike raises the following points, which I’ve paraphrased:

  1. Currencies have network effects. A currency that has few users is simply not competitive with currencies that have many.
  2. A decentralised currency that the vast majority can’t use doesn’t change the amount of centralisation in the world. Most people will still end up using banks, with all the normal problems.
  3. Growth is a part of the social contract. It always has been.
  4. Businesses will only continue to invest in bitcoin and build infrastructure if they are assured that the market will grow significantly.
  5. Bitcoin needs users, lots of them, for its political survival. There are many people out there who would like to see digital cash disappear, or be regulated out of existence.

At this point, it’s worth mentioning another bitcoin-core developer: Jeff Garzik.  He believes that the bitcoin userbase has been promised that transactions will continue to be almost free.  When a request to change the default mining limit from 750kb to 1M was closed by the bitcoin lead developer Wladimir van der Laan as unimportant, Jeff saw this as a symbolic moment:

What Happens If We Don’t Increase Soon?

Mike Hearn has a fairly apocalyptic view of what would happen if blocks fill.  That was certainly looking likely when the post was written, but due to episodes where the blocks were full for days, wallet designers are (finally) starting to estimate fees for timely processing (miners process larger fee transactions first).  Some wallets and services didn’t even have a way to change the setting, leaving users stranded during high-volume events.

It now seems that the bursts of full blocks will arrive with increasing frequency; proposals are fairly mature now to allow users to post-increase fees if required, which (if all goes well) could make for a fairly smooth transition from the current “fees are tiny and optional” mode of operation to a “there will be a small fee”.

But even if this rosy scenario is true, this begsavoids the bigger question of how high fees can become before bitcoin becomes useless.  1c?  5c?  20c? $1?

So What Are The Problems With Increasing The Blocksize?

In a word, the problem is miners.  As mining has transitioned from a geek pastime, semi-hobbyist, then to large operations with cheap access to power, it has become more concentrated.

The only difference between bitcoin and previous cryptocurrencies is that instead of a centralized “broker” to ensure honesty, bitcoin uses an open competition of miners. Given bitcoin’s endurance, it’s fair to count this a vital property of bitcoin.  Mining centralization is the long-term concern of another bitcoin-core developer (and my coworker at Blockstream), Gregory Maxwell.

Control over half the block-producing power and you control who can use bitcoin and cheat anyone not using a full node themselves.  Control over 2/3, and you can force a rule change on the rest of the network by stalling it until enough people give in.  Central control is also a single point to shut the network down; that lets others apply legal or extra-legal pressure to restrict the network.

What Drives Centralization?

Bitcoin mining is more efficient at scale. That was to be expected[7]. However, the concentration has come much faster than expected because of the invention of mining pools.  These pools tell miners what to mine, in return for a small (or in some cases, zero) share of profits.  It saves setup costs, they’re easy to use, and miners get more regular payouts.  This has caused bitcoin to reel from one centralization crisis to another over the last few years; the decline in full nodes has been precipitous by some measures[5] and continues to decline[6].

Consider the plight of a miner whose network is further away from most other miners.  They find out about new blocks later, and their blocks get built on later.  Both these effects cause them to create blocks which the network ignores, called orphans.  Some orphans are the inevitable consequence of miners racing for the same prize, but the orphan problem is not symmetrical.  Being well connected to the other miners helps, but there’s a second effect: if you discover the previous block, you’ve a head-start on the next one.  This means a pool which has 20% of the hashing power doesn’t have to worry about delays at all 20% of the time.

If the orphan rate is very low (say, 0.1%), the effect can be ignored.  But as it climbs, the pressure to join a pool (the largest pool) becomes economically irresistible, until only one pool remains.

Larger Blocks Are Driving Up Orphan Rates

Large blocks take longer to propagate, increasing the rate of orphans.  This has been happening as blocks increase.  Blocks with no transactions at all are smallest, and so propagate fastest: they still get a 25 bitcoin subsidy, though they don’t help bitcoin users much.

Many people assumed that miners wouldn’t overly centralize, lest they cause a clear decentralization failure and drive the bitcoin price into the ground.  That assumption has proven weak in the face of climbing orphan rates.

And miners have been behaving very badly.  Mining pools orchestrate attacks on each other with surprising regularity; DDOS and block withholding attacks are both well documented[1][2].  A large mining pool used their power to double spend and steal thousands of bitcoin from a gambling service[3].  When it was noticed, they blamed a rogue employee.  No money was returned, nor any legal action taken.  It was hoped that miners would leave for another pool as they approached majority share, but that didn’t happen.

If large blocks can be used as a weapon by larger miners against small ones[8], it’s expected that they will be.

More recently (and quite by accident) it was discovered that over half the mining power aren’t verifying transactions in blocks they build upon[4].  They did this in order to reduce orphans, and one large pool is still doing so.  This is a problem because lightweight bitcoin clients work by assuming anything in the longest chain of blocks is good; this was how the original bitcoin paper anticipated that most users would interact with the system.

The Third Side Of The Debate: Long Term Network Funding

Before I summarize, it’s worth mentioning the debate beyond the current debate: long term network support.  The minting of new coins decreases with time; the plan of record (as suggested in the original paper) is that total transaction fees will rise to replace the current mining subsidy.  The schedule of this is unknown and generally this transition has not happened: free transactions still work.

The block subsidy as I write this is about $7000.  If nothing else changes, miners would want $3500 in fees in 12 months when the block subsidy halves, or about $2 per transaction.  That won’t happen; miners will simply lose half their income.  (Perhaps eventually they form a cartel to enforce a minimum fee, causing another centralization crisis? I don’t know.)

It’s natural for users to try to defer the transition as long as possible, and the practice in bitcoin-core has been to aggressively reduce the default fees as the bitcoin price rises.  Core developers Gregory Maxwell and Pieter Wuille feel that signal was a mistake; that fees will have to rise eventually and users should not be lulled into thinking otherwise.

Mike Hearn in particular has been holding out the promise that it may not be necessary.  On this he is not widely supported: that some users would offer to pay more so other users can continue to pay less.

It’s worth noting that some bitcoin businesses rely on the current very low fees and don’t want to change; I suspect this adds bitterness and vitriol to many online debates.


The bitcoin-core developers who deal with users most feel that bitcoin needs to expand quickly or die, that letting fees emerge now will kill expansion, and that the infrastructure will improve over time if it has to.

Other bitcoin-core developers feel that bitcoin’s infrastructure is dangerously creaking, that fees need to emerge anyway, and that if there is a real emergency a blocksize change could be rolled out within a few weeks.

At least until this is resolved, don’t count on future bitcoin fees being insignificant, nor promise others that bitcoin has “free transactions”.

[1] http://www.coindesk.com/bitcoin-mining-pools-ddos-attacks/ “Bitcoin Mining Pools Targeted in Wave of DDOS Attacks” Coinbase 2015

[2] http://blog.bettercrypto.com/?p=1131 “Block Withholding Attacks – Recent Research” N T Courtois 2014

[3] https://bitcointalk.org/index.php?topic=327767.0 “GHash.IO and double-spending against BetCoin Dice” mmtech et. al 2013

[4] https://www.reddit.com/r/Bitcoin/comments/3c8tq2/questions_about_the_july_4th_bip66_fork/cstgajp “Questions about the July 4th BIP66 fork”

[5] https://twitter.com/petertoddbtc/status/608475414449778688 “350,000 full nodes to 6,000 in two years…” P Todd 2015

[6] https://getaddr.bitnodes.io/dashboard/?days=365 “Reachable nodes during the last 365 days.” Bitnodes.io

[7] https://bitcointalk.org/index.php?topic=532.msg6306#msg6306 “Re: Scalability and transaction rate” Satoshi 2010

[8] http://www.mail-archive.com/bitcoin-development@lists.sourceforge.net/msg08161.html “[Bitcoin-development] Mining centralization pressure from non-uniform propagation speed” Pieter Wuille 2015

The Megatransaction: Why Does It Take 25 Seconds?

Last night f2pool mined a 1MB block containing a single 1MB transaction.  This scooped up some of the spam which has been going to various weakly-passworded “brainwallets”, gaining them 0.5569 bitcoins (on top of the normal 25 BTC subsidy).  You can see the megatransaction on blockchain.info.

It was widely reported to take about 25 seconds for bitcoin core to process this block: this is far worse than my “2 seconds per MB” result in my last post, which was considered a pretty bad case.  Let’s look at why.

How Signatures Are Verified

The algorithm to check a transaction input (of this form) looks like this:

  1. Strip the other inputs from the transaction.
  2. Replace the input script we’re checking with the script of the output it’s trying to spend.
  3. Hash the resulting transaction with SHA256, then hash the result with SHA256 again.
  4. Check the signature correctly signed that hash result.

Now, for a transaction with 5570 inputs, we have to do this 5570 times.  And the bitcoin core code does this by making a copy of the transaction each time, and using the marshalling code to hash that; it’s not a huge surprise that we end up spending 20 seconds on it.

How Fast Could Bitcoin Core Be If Optimized?

Once we strip the inputs, the result is only about 6k long; hashing 6k 5570 times takes about 265 milliseconds (on my modern i3 laptop).  We have to do some work to change the transaction each time, but we should end up under half a second without any major backflips.

Problem solved?  Not quite….

This Block Isn’t The Worst Case (For An Optimized Implementation)

As I said above, the amount we have to hash is about 6k; if a transaction has larger outputs, that number changes.  We can fit in fewer inputs though.  A simple simulation shows the worst case for 1MB transaction has 3300 inputs, and 406000 byte output(s): simply doing the hashing for input signatures takes about 10.9 seconds.  That’s only about two or three times faster than the bitcoind naive implementation.

This problem is far worse if blocks were 8MB: an 8MB transaction with 22,500 inputs and 3.95MB of outputs takes over 11 minutes to hash.  If you can mine one of those, you can keep competitors off your heels forever, and own the bitcoin network… Well, probably not.  But there’d be a lot of emergency patching, forking and screaming…

Short Term Steps

An optimized implementation in bitcoind is a good idea anyway, and there are three obvious paths:

  1. Optimize the signature hash path to avoid the copy, and hash in place as much as possible.
  2. Use the Intel and ARM optimized SHA256 routines, which increase SHA256 speed by about 80%.
  3. Parallelize the input checking for large numbers of inputs.

Longer Term Steps

A soft fork could introduce an OP_CHECKSIG2, which hashes the transaction in a different order.  In particular, it should hash the input script replacement at the end, so the “midstate” of the hash can be trivially reused.  This doesn’t entirely eliminate the problem, since the sighash flags can require other permutations of the transaction; these would have to be carefully explored (or only allowed with OP_CHECKSIG).

This soft fork could also place limits on how big an OP_CHECKSIG-using transaction could be.

Such a change will take a while: there are other things which would be nice to change for OP_CHECKSIG2, such as new sighash flags for the Lightning Network, and removing the silly DER encoding of signatures.