6 Technical Things I Learned About Bitcoin
I've been collecting these as I research the bitcoin protocol, so I thought they were worth posting. None of them is groundbreaking, but they are the things that surprised me as I deepened my understanding.
10 Minute Blocks. Currently 9 minutes. But usually 7 minutes.
Everyone talks about a block every 10 minutes, but that's the long-term mean. Spikes in exchange rates are followed fairly closely by spikes in network hashrate, and ASIC miners are ramping up to meet demand. As difficulty adjustment only happens every 2016 blocks (ideally 2 weeks), there's a lag. Over the life of bitcoin, and over the last year, the average is almost exactly 600 seconds, but over the last 3 months it's been 520 seconds. The last month is 542 seconds, so hashrate acceleration is slowing.
But a subtler effect is shown when we look at the median, rather than the mean: it's just under 7 minutes. This is because the time to hit the target hash is not a normal distribution at all. There's probably a fancy name for this spike with an exponential tail, but I've graphed here a recent set of 2016 blocks (fortnight 115) showing the distribution of block times in minute-wide buckets.
Now, these stats were using timestamps in the blocks, rather than the actual observed times, but I'm assuming on average that they're correct.
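The "spike with an exponential tail" is what you'd expect if block discovery is a memoryless process: at constant hashrate the gaps between blocks follow an exponential distribution, whose median is the mean times ln 2. Here's a minimal sketch of that, assuming a steady 600 second mean (the constant, the seed and the variable names are mine, not taken from the block data above):

```python
import math
import random

MEAN_BLOCK_TIME = 600.0  # seconds; assumed steady-state mean

# For an exponential distribution, median = mean * ln(2).
theoretical_median = MEAN_BLOCK_TIME * math.log(2)

# Simulate one difficulty period's worth of block intervals.
rng = random.Random(115)  # fixed seed so the run is repeatable
intervals = sorted(
    rng.expovariate(1.0 / MEAN_BLOCK_TIME) for _ in range(2016)
)
simulated_mean = sum(intervals) / len(intervals)
# 2016 is even, so the median is the average of the two middle values.
simulated_median = (intervals[1007] + intervals[1008]) / 2

print(f"theoretical median: {theoretical_median / 60:.2f} minutes")
print(f"simulated mean:     {simulated_mean / 60:.2f} minutes")
print(f"simulated median:   {simulated_median / 60:.2f} minutes")
```

The theoretical median comes out at about 6.93 minutes, which lines up with the "just under 7 minutes" figure. Real intervals won't match exactly, since hashrate drifts during a period and block timestamps are only approximate.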
Actually, 10.005 Minute Blocks
The bitcoin client calculates how long an interval took by subtracting the timestamp of the first block in the 2016-block interval from the timestamp of the last. There are only 2015 spaces between 2016 blocks, but the code divides by 2016, so difficulty settles where each block takes slightly longer than 600 seconds. But I'm sure no one else cares about that 0.3 second mistake, since block times are never that precise anyway.
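The arithmetic behind that 0.3 seconds can be checked in a few lines. This is just the sum worked out, not the client's code, and the constant names are my own:

```python
TARGET_SPACING = 600   # seconds per block the retarget aims for
INTERVAL = 2016        # blocks per difficulty period
TARGET_TIMESPAN = TARGET_SPACING * INTERVAL  # two weeks, in seconds

# First-to-last timestamp difference only spans 2015 gaps.
gaps = INTERVAL - 1

# Difficulty stays put when the measured span equals TARGET_TIMESPAN,
# so each of the 2015 gaps ends up slightly longer than 600 seconds.
effective_spacing = TARGET_TIMESPAN / gaps

print(f"{effective_spacing:.3f} seconds per block")      # 600.298
print(f"{effective_spacing / 60:.4f} minutes per block")  # 10.0050
```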
Politics In The Genesis Block. Or Not.
It's common to point to the text in the very first block "The Times 03/Jan/2009 Chancellor on brink of second bailout for banks" as a political statement by Satoshi. While I'm sure the headline amused the author, we need look no further than the initial Bitcoin Paper, section 3:
A timestamp server works by taking a hash of a block of items to be timestamped and widely publishing the hash, such as in a newspaper or Usenet post [2-5]. The timestamp proves that the data must have existed at the time, obviously, in order to get into the hash.
In other words, it simply proves that there was no pre-mining going on. It would be interesting to get an accurate timestamp of the initial release of bitcoin and examine London Times headlines around that date to see if it was cherry-picked, or a happy coincidence.
Crazy Address Encoding
A bitcoin address is a 25-byte number. It's usually encoded using 58 characters (numbers and letters, omitting zero, capital I and O, and lower-case l to avoid confusion). Dividing by 58 is a bit of a pain, but doing crypto means we have big number libraries lying around which we can use.
But it's not the straight encoding one might expect, which would result in 37 character addresses. You might expect that leading zeroes can be omitted for compactness, but in fact, leading whole zero bytes are encoded separately. This gives variable-length addresses of between 27 and 34 characters and a second loop to encode and decode them. https://en.bitcoin.it/wiki/Base58Check_encoding
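Here's a rough sketch of that two-loop scheme, leaning on Python's built-in big integers instead of a separate bignum library. It covers only the base-58 step, taking the 25 bytes as given, and the function names are mine rather than anything from the reference client:

```python
# The 58-character alphabet: no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    # Leading zero bytes are handled separately: one '1' per byte.
    stripped = data.lstrip(b"\x00")
    zeros = len(data) - len(stripped)

    # The rest is treated as one big number and repeatedly divided by 58.
    n = int.from_bytes(stripped, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    return "1" * zeros + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    # Each leading '1' stands for a whole zero byte.
    stripped = text.lstrip("1")
    zeros = len(text) - len(stripped)

    n = 0
    for ch in stripped:
        n = n * 58 + ALPHABET.index(ch)  # ValueError on a bad character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + body


# A made-up 25-byte value with a zero first byte, for illustration only.
sample = bytes([0]) + bytes(range(1, 25))
encoded = base58_encode(sample)
print(encoded, len(encoded))
print(base58_decode(encoded) == sample)
```

Because the zero first byte becomes a literal '1' instead of vanishing into the big number, the round trip preserves the length. That's also why the encoded length varies with the number of leading zero bytes.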
Anonymity Off By Default
Anonymity is hard, but I was surprised to see blockchain.info's page about my donation to Unfilter correctly geolocated to my home town! Perhaps it's a fluke, but I was taken aback by how clear it was.
CVEs in Bitcoin
Like any software, there have been flaws in the bitcoin reference client: obviously there has been a great deal of scrutiny and concern. Unlike most projects, there is a superb wiki page which details each vulnerability, with consequences and deployment status across the network: https://en.bitcoin.it/wiki/Common_Vulnerabilities_and_Exposures.
Corrections welcome!
Rusty.