Bitcoin Generic Address Format Proposal
I've been implementing segregated witness support for c-lightning, and it's interesting that there's no address format for the new form of addresses. There's segregated-witness-inside-P2SH, which uses the existing P2SH format, but if you want raw segregated witness (which is simply a "0" followed by a 20-byte or 32-byte hash), the only proposal is BIP142, which has been deferred.
If we're going to have a new address format, I'd like to make the case for shifting away from bitcoin's base58 (e.g. 1At1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2):
- base58 is not trivial to parse. I used the bignum library to do it, though you can open-code it as bitcoin-core does.
- base58 addresses are variable-length. That makes web forms and software mildly harder to write, and also rules out a simple sanity check on the length.
- base58 addresses are hard to read over the phone. Greg Maxwell points out that the upper and lower case mix is particularly annoying.
- The 4-byte SHA check is not guaranteed to catch the most common forms of error (transposed letters or a single incorrect letter), though it's pretty good (1 in 4 billion chance of a random error passing).
- On the plus side, at around 34 letters, it's fairly compact (36 for the BIP141 P2WPKH).
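To make the parsing point concrete, here is a minimal sketch of base58check encoding and decoding in Python, whose built-in integers stand in for the bignum library. The function names are my own for illustration, and this is not how bitcoin-core structures its code; it only shows that the whole string has to be treated as one big number before the 4-byte double-SHA256 check can even be examined.

```python
import hashlib
from typing import Tuple

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(data: bytes) -> bytes:
    # First 4 bytes of SHA256(SHA256(data)).
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: int, payload: bytes) -> str:
    raw = bytes([version]) + payload
    raw += _checksum(raw)
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as a leading '1'.
    zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * zeros + out


def base58check_decode(addr: str) -> Tuple[int, bytes]:
    n = 0
    for ch in addr:
        digit = B58_ALPHABET.find(ch)
        if digit < 0:
            raise ValueError("invalid base58 character: %r" % ch)
        n = n * 58 + digit  # the whole address is one big number
    zeros = len(addr) - len(addr.lstrip("1"))
    raw = b"\x00" * zeros + n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) < 5:
        raise ValueError("too short")
    body, check = raw[:-4], raw[-4:]
    if _checksum(body) != check:
        raise ValueError("bad checksum")
    return body[0], body[1:]
```

Note that nothing can be validated until the entire string has been converted, and the output length depends on the numeric value, which is where the variable-length behaviour comes from.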
This is my proposal for a generic replacement (thanks to CodeShark for generalizing my previous proposal) which covers all possible future address types (as well as being usable for current ones):
- Prefix for type, followed by colon. Currently "btc:" or "testnet:".
- The full scriptPubKey, base-32 encoded as per http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt.
- At least 30 bits of crc64-ecma, extended by as many bits as needed to make the total a multiple of 5 and so reach a letter boundary. The CRC covers the prefix (as ASCII) plus the scriptPubKey.
- The final letter is the Damm algorithm check digit of the entire previous string, using this 32-way quasigroup. This protects against single-letter errors as well as single transpositions.
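The first three steps above can be sketched roughly as follows in Python. Several details are my assumptions rather than part of the proposal: the CRC is computed over the prefix without the colon, the low-order bits of the CRC are the ones kept, and the scriptPubKey bits come first with the CRC bits appended. The names `crc64_ecma` and `encode_body` are purely illustrative. The final Damm check letter is left out, because the 32-way quasigroup table is not reproduced here.

```python
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"  # alphabet from the linked document
CRC64_ECMA_POLY = 0x42F0E1EBA9EA3693
MASK64 = (1 << 64) - 1


def crc64_ecma(data: bytes) -> int:
    # Bitwise CRC-64/ECMA-182: zero initial value, MSB first, no final XOR.
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ CRC64_ECMA_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def encode_body(prefix: str, script_pubkey: bytes, min_crc_bits: int = 30) -> str:
    payload_bits = len(script_pubkey) * 8
    # At least min_crc_bits of CRC, extended until the total is a multiple of 5.
    crc_bits = min_crc_bits + (-(payload_bits + min_crc_bits)) % 5
    # ASSUMPTION: CRC input is the prefix (no colon) followed by the scriptPubKey.
    crc = crc64_ecma(prefix.encode("ascii") + script_pubkey)
    # ASSUMPTION: keep the low-order crc_bits of the 64-bit CRC.
    crc &= (1 << crc_bits) - 1
    value = (int.from_bytes(script_pubkey, "big") << crc_bits) | crc
    n_letters = (payload_bits + crc_bits) // 5
    letters = [ZBASE32[(value >> (5 * i)) & 31] for i in reversed(range(n_letters))]
    # The Damm check letter would be appended here.
    return prefix + ":" + "".join(letters)
```

Calling `encode_body("btc", script)` with a script that starts with a zero byte yields a string beginning `btc:y`, since the first five bits are all zero and `y` is the zero letter of this alphabet.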
These addresses look like btc:ybndrfg8ejkmcpqxot1uwisza345h769ybndrrfg (41 digits for a P2WPKH) or btc:yybndrfg8ejkmcpqxot1uwisza345h769ybndrfg8ejkmcpqxot1uwisza34 (60 digits for a P2WSH) (note: neither of these has the correct CRC or check letter, I just made them up). A classic P2PKH would be 45 digits, like btc:ybndrfg8ejkmcpqxot1uwisza345h769wiszybndrrfg, and a P2SH would be 42 digits.
While manually copying addresses is something which should be avoided, it does happen, and the cost of making them robust against common typographic errors is small. The CRC is a good idea even for machine-based systems: with at least 30 bits, it will let through fewer than 1 in a billion random mistakes. Distinguishing which blockchain an address belongs to is a nice catchall for mistakes, too.
We can, of course, bikeshed this forever, but I wanted to anchor the discussion with something I consider fairly sane.