Bitcoin Generic Address Format Proposal

I’ve been implementing segregated witness support for c-lightning; it’s interesting that there’s no address format for the new form of addresses.  There’s a segregated-witness-inside-p2sh which uses the existing p2sh format, but if you want raw segregated witness (which is simply a “0” followed by a 20-byte or 32-byte hash), the only proposal is BIP142 which has been deferred.

If we’re going to have a new address format, I’d like to make the case for shifting away from bitcoin’s base58 (e.g. 1At1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2):

  1. base58 is not trivial to parse.  I used the bignum library to do it, though you can open-code it as bitcoin-core does.
  2. base58 addresses are variable-length.  That makes webforms and software mildly harder to write, and it also eliminates a simple sanity check.
  3. base58 addresses are hard to read over the phone.  Greg Maxwell points out that the upper and lower case mix is particularly annoying.
  4. The 4-byte SHA check is not guaranteed to catch the most common forms of error (transposed or single incorrect letters), though it’s pretty good (a 1 in 4 billion chance of a random error passing).
  5. At around 34 letters, it’s fairly compact (36 for the BIP141 P2WPKH).
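
To make points 1, 2 and 4 concrete, here is a minimal sketch of base58check in Python, where built-in big integers stand in for the bignum library.  The function names are my own for illustration, not taken from bitcoin-core or libbase58.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(payload: bytes) -> bytes:
    # First 4 bytes of double SHA-256: the "4-byte SHA check".
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def base58check_encode(payload: bytes) -> str:
    data = payload + _checksum(payload)
    n = int.from_bytes(data, "big")  # the whole string is one big number
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Leading zero bytes vanish in the number, so each becomes a literal '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def base58check_decode(addr: str) -> bytes:
    n = 0
    for ch in addr:
        n = n * 58 + B58_ALPHABET.index(ch)  # ValueError on a bad character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(addr) - len(addr.lstrip("1"))
    data = b"\x00" * pad + body
    if len(data) < 4 or _checksum(data[:-4]) != data[-4:]:
        raise ValueError("bad base58check checksum")
    return data[:-4]
```

Because the encoding is a radix conversion of one large number, the output length depends on the value being encoded: the same 21-byte payload size can produce strings of different lengths.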

This is my proposal for a generic replacement (thanks to CodeShark for generalizing my previous proposal) which covers all possible future address types (as well as being usable for current ones):

  1. Prefix for type, followed by colon.  Currently “btc:” or “testnet:”.
  2. The full scriptPubKey using base 32 encoding as per http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt.
  3. At least 30 bits for crc64-ecma, up to a multiple of 5 to reach a letter boundary.  This covers the prefix (as ascii), plus the scriptPubKey.
  4. The final letter is the Damm algorithm check digit of the entire previous string, using this 32-way quasigroup. This protects against single-letter errors as well as single transpositions.
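
Steps 1 to 3 can be sketched roughly as follows.  Several details are my assumptions rather than part of the proposal: the CRC is computed over the prefix without its colon, the low bits of the 64-bit CRC are the ones kept, and the scriptPubKey and CRC bits are packed into letters as one continuous bit string.  The final Damm check letter (step 4) is left out because the quasigroup table isn't reproduced here.

```python
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"  # alphabet from the linked spec
CRC64_ECMA_POLY = 0x42F0E1EBA9EA3693
MASK64 = (1 << 64) - 1


def crc64_ecma(data: bytes) -> int:
    """Bitwise CRC-64/ECMA-182: MSB-first, zero initial value, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ CRC64_ECMA_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def crc_width(script_len: int) -> int:
    """Smallest width >= 30 bits that brings script + CRC to a multiple of 5 bits."""
    body_bits = 8 * script_len
    return 30 + (-(body_bits + 30)) % 5


def encode_without_check_letter(prefix: str, script: bytes) -> str:
    width = crc_width(len(script))
    # ASSUMPTION: CRC covers ASCII prefix (no colon) + script; keep the low bits.
    crc = crc64_ecma(prefix.encode("ascii") + script) & ((1 << width) - 1)
    # ASSUMPTION: script bits and CRC bits form one bit string, split into 5s.
    value = (int.from_bytes(script, "big") << width) | crc
    total_bits = 8 * len(script) + width
    letters = "".join(
        ZBASE32[(value >> shift) & 31]
        for shift in range(total_bits - 5, -1, -5)
    )
    return f"{prefix}:{letters}"
```

Because the prefix feeds into the CRC, the same script encoded under “btc” and “testnet” gets different trailing letters, which is how an address pasted onto the wrong chain would be caught.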

These addresses look like btc:ybndrfg8ejkmcpqxot1uwisza345h769ybndrrfg (41 digits for a P2WPKH) or btc:yybndrfg8ejkmcpqxot1uwisza345h769ybndrfg8ejkmcpqxot1uwisza34 (60 digits for a P2WSH) (note: neither of these has the correct CRC or check letter, I just made them up).  A classic P2PKH would be 45 digits, like btc:ybndrfg8ejkmcpqxot1uwisza345h769wiszybndrrfg, and a P2SH would be 42 digits.

While manually copying addresses is something which should be avoided, it does happen, and the cost of making them robust against common typographic errors is small.  The CRC is a good idea even for machine-based systems: with at least 30 bits, it will let through fewer than 1 in a billion random mistakes.  Distinguishing which blockchain an address belongs to is a nice catch-all for mistakes, too.
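
The guarantee against single-letter errors and adjacent transpositions comes from the Damm check letter.  As an illustration only, here is the Damm algorithm using the commonly published order-10 table for decimal digits; the proposal would run the same loop over a 32x32 quasigroup indexed by base-32 letters, which is not shown here.

```python
# Order-10 totally anti-symmetric quasigroup commonly used for the decimal
# Damm check digit. This is NOT the 32-way table the proposal refers to.
DAMM_TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2),
    (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9),
    (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8),
    (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4),
    (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5),
    (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def damm_check_digit(digits: str) -> int:
    """Fold the digits through the table; the final state is the check digit."""
    interim = 0
    for ch in digits:
        interim = DAMM_TABLE[interim][int(ch)]
    return interim


def damm_is_valid(digits_with_check: str) -> bool:
    """A string with its check digit appended always folds to 0."""
    return damm_check_digit(digits_with_check) == 0
```

For example, `damm_check_digit("572")` gives 4, so “5724” validates, while a single wrong digit (“5734”) or a swapped adjacent pair (“5274”) does not.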

We can, of course, bikeshed this forever, but I wanted to anchor the discussion with something I consider fairly sane.

4 thoughts on “Bitcoin Generic Address Format Proposal”

  1. Why is this posted on social media (your blog) rather than the Bitcoin-dev ML? :/

    > If we’re going to have a new address format, I’d like to make the case for shifting away from bitcoin’s base58

    Great, I proposed that in 2013 (https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2013-May/002588.html )… But a more important issue today is that people want to reuse addresses. As-is, your proposal fails to support that. I suggest starting with BIP 47, making it sane (remove its required address reuse), and combining it with BIP 124 and your suggestions.

    > base58 is not trivial to parse. I used the bignum library to do it, though you can open-code it as bitcoin-core does.

    There’s libbase58. Why didn’t you use that?

    > base58 addresses are variable-length. That makes webforms and software mildly harder, but also eliminates a simple sanity check.

    This seems unavoidable. (Your own proposal in particular is also variable-length.)

    > Prefix for type, followed by colon. Currently “btc:” or “testnet:”.

    This improperly conflates BTC with Bitcoin. BTC is the unit, not the system. If something shorter than “bitcoin” is desired, “bc” is the historically correct abbreviation.

    1. > Why is this posted on social media (your blog) rather than the Bitcoin-dev ML? :/

      Mainly because I knew it was half-baked. This was a quick way to get feedback (and hey, it worked!)

      > But a more important issue today is that people want to reuse addresses. As-is, your proposal fails to support that. I suggest starting with BIP 47, making it sane (remove its required address reuse), and combining it with BIP 124 and your suggestions.

      Address reuse is an interesting problem, but I don’t think there’s a solution which doesn’t require metadata in the blockchain; at least an OP_RETURN. Given your stance on blockchain spam, I thought you’d avoid this?

      > There’s libbase58. Why didn’t you use that?

      Because I didn’t know it existed. I’ve put it on my TODO list, thanks!

      > This seems unavoidable. (Your own proposal in particular is also variable-length.)

      My original wasn’t, but then it got more ambitious. It’s still a bad property, even if I decided not to solve it.

      > Prefix for type, followed by colon. Currently “btc:” or “testnet:”.

      > This improperly conflates BTC with Bitcoin. BTC is the unit, not the system. If something shorter than “bitcoin” is desired, “bc” is the historically correct abbreviation.

      Good point. Let’s make it “bc” and “bctest”.

  2. > Mainly because I knew it was half-baked. This was a quick way to get feedback (and hey, it worked!)

    Only barely. It took someone linking it on /r/Bitcoin, me happening to notice it there, and then idly pondering if you had replied to come back here and check manually…

    > Address reuse is an interesting problem, but I don’t think there’s a solution which doesn’t require metadata in the blockchain; at least an OP_RETURN. Given your stance on blockchain spam, I thought you’d avoid this?

    There are other ways to convey metadata. The BIP 75 stuff seems usefully interesting, but the BIP 47 solution (except using OP_RETURN instead of address reuse) seems reasonable as well – it’s arguably part of the transaction, so not really spam (at the very least, I think it’s a grey area like Counterparty).

    > My original wasn’t, but then it got more ambitious. It’s still a bad property, even if I decided not to solve it.

    Yes. If everything was P2SH-style in Bitcoin (e.g., a segwit-only softfork that makes any other sPK invalid) it /might/ be possible. But probably not, once multisig is considered.

  3. I’ve investigated this a bit more, and found a very concise way to implement a 30-bit Reed-Solomon code (more compact than 30-bit CRC + 10-bit Damm, and with much stronger detection guarantees). I’ve implemented a prototype here: https://github.com/sipa/ezbase32

    @Luke: I’d very much like to work on better payment protocols or other solutions, but we’ll need some standard way to convey a segwit scriptPubKey, or we’ll be stuck with just P2SH for a long time, I’m afraid.
