Rusty Russell's Coding Blog | Stealing From Smart People

My previous discovery that poll() indicating an fd was writable didn’t mean write() wouldn’t block lead to some interesting discussion on Google+.

It became clear that there is much confusion over read and write; eg. Linus thought read() was like write() whereas I thought (prior to my last post) that write() was like read(). Both wrong…

Both Linux and v6 UNIX always returned from read() once data was available (v6 didn’t have sockets, but they had pipes). POSIX even suggests this:

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

But write() is different. Presumably so simple UNIX filters didn’t have to check the return and loop (they’d just die with EPIPE anyway), write() tries hard to write all the data before returning. And that leads to a simple rule.  Quoting Linus:

Sure, you can try to play games by knowing socket buffer sizes and look at pending buffers with SIOCOUTQ etc, and say “ok, I can probably do a write of size X without blocking” even on a blocking file descriptor, but it’s hacky, fragile and wrong.

I’m travelling, so I built an Ubuntu-compatible kernel with a printk() into select() and poll() to see who else was making this mistake on my laptop:

cups-browsed: (1262): fd 5 poll() for write without nonblock
cups-browsed: (1262): fd 6 poll() for write without nonblock
Xorg: (1377): fd 1 select() for write without nonblock
Xorg: (1377): fd 3 select() for write without nonblock
Xorg: (1377): fd 11 select() for write without nonblock

This first one is actually OK; fd 5 is an eventfd (which should never block). But the rest seem to be sockets, and thus probably bugs.

What’s worse, are the Linux select() man page:

       A file descriptor is considered ready if it is possible to
       perform the corresponding I/O operation (e.g., read(2)) without
       blocking.
       ... those in writefds will be watched to see if a write will
       not block...

And poll():

	POLLOUT
		Writing now will not block.

Man page patches have been submitted…

No tags Hide

Aug/14

2

ccan/io: revisited

There are numerous C async I/O libraries; tevent being the one I’m most familiar with.  Yet, tevent has a very wide API, and programs using it inevitably descend into “callback hell”.  So I wrote ccan/io.

The idea is that each I/O callback returns a “struct io_plan” which says what I/O to do next, and what callback to call.  Examples are “io_read(buf, len, next, next_arg)” to read a fixed number of bytes, and “io_read_partial(buf, lenp, next, next_arg)” to perform a single read.  You could also write your own, such as pettycoin’s “io_read_packet()” which read a length then allocated and read in the rest of the packet.

This should enable a convenient debug mode: you turn each io_read() etc. into synchronous operations and now you have a nice callchain showing what happened to a file descriptor.  In practice, however, debug was painful to use and a frequent source of bugs inside ccan/io, so I never used it for debugging.

And I became less happy when I used it in anger for pettycoin, but at some point you’ve got to stop procrastinating and start producing code, so I left it alone.

Now I’ve revisited it.   820 insertions(+), 1042 deletions(-) and the code is significantly less hairy, and the API a little simpler.  In particular, writing the normal “read-then-write” loops is still very nice, while doing full duplex I/O is possible, but more complex.  Let’s see if I’m still happy once I’ve merged it into pettycoin…

No tags Hide

Jul/14

29

Pettycoin Alpha01 Tagged

As all software, it took longer than I expected, but today I tagged the first version of pettycoin.  Now, lots more polish and features, but at least there’s something more than the git repo for others to look at!

No tags Hide

A “non-blocking” IPv6 connect() call was in fact, blocking.  Tracking that down made me realize the IPv6 address was mostly random garbage, which was caused by this function:

bool get_fd_addr(int fd, struct protocol_net_address *addr)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(len);
   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   ...
}

The bug: “sizeof(len)” should be “sizeof(u)”.  But when presented with a too-short length, getsockname() truncates, and otherwise “succeeds”; you have to check the resulting len value to see what you should have passed.

Obviously an error return would be better here, but the writable len arg is pretty useless: I don’t know of any callers who check the length return and do anything useful with it.  Provide getsocklen() for those who do care, and have getsockname() take a size_t as its third arg.

Oh, and the blocking?  That was because I was calling “fcntl(fd, F_SETFD, …)” instead of “F_SETFL”!

No tags Hide

I decided to use github for pettycoin, and tested out their blogging integration (summary: it’s not very integrated, but once set up, Jekyll is nice).  I’m keeping a blow-by-blow development blog over there.

No tags Hide

At linux.conf.au I spoke about my pre-alpha implementation of Pettycoin, but progress since then has been slow.  That’s partially due to yak shaving (like rewriting ccan/io library), partially reimplementation of parts I didn’t like, and partially due to the birth of my son, but mainly because I have a day job which involves working on Power 8 KVM issues for IBM.  So Alex convinced me to take 6 months off from the day job, and work 4 days a week on pettycoin.

I’m going to be blogging my progress, so expect several updates a week.  The first few alpha releases will be useless for doing any actual transactions, but by the first beta the major pieces should be in place…

No tags Hide

Chris Fisher’s Jupiter Broadcasting pod/vodcasting started 8 years ago with the Linux Action Show: still their flagship show, and how I discovered them 3 years ago.  Shows like this give access to FOSS to those outside the LWN-reading crowd; community building can be a thankless task, and as a small shop Chris has had ups and downs along the way.  After listening to them for a few years, I feel a weird bond with this bunch of people I’ve never met.

I regularly listen to Techsnap for security news, Scibyte for science with my daughter, and Unfilter to get an insight into the NSA and what the US looks like from the inside.  I bugged Chris a while back to accept bitcoin donations, and when they did I subscribed to Unfilter for a year at 2 BTC.  To congratulate them on reaching the 100th Unfilter episode, I repeated that donation.

They’ve started doing new and ambitious things, like Linux HOWTO, so I know they’ll put the funds to good use!

No tags Hide

I was thinking about peer-to-peer networking (in the context of Pettycoin, of course) and I wondered if sending ~1420 bytes of data is really any slower than sending 1 byte on real networks.  Similarly, is it worth going to extremes to avoid crossing over into two TCP packets?

So I wrote a simple Linux TCP ping pong client and server: the client connects to the server then loops: reads until it gets a ‘1’ byte, then it responds with a single byte ack.  The server sends data ending in a 1 byte, then reads the response byte, printing out how long it took.  First 1 byte of data, then 101 bytes, all the way to 9901 bytes.  It does this 20 times, then closes the socket.

Here are the results on various networks (or download the source and result files for your own analysis):

On Our Gigabit Lan

Interestingly, we do win for tiny packets, but there’s no real penalty once we’re over a packet (until we get to three packets worth):

Over the Gigabit Lan

Over the Gigabit Lan

gigabit-lan-closeup

Over Gigabit LAN (closeup)

On Our Wireless Lan

Here we do see a significant decline as we enter the second packet, though extra bytes in the first packet aren’t completely free:

Wireless LAN (all results)

Wireless LAN (all results)

Wireless LAN (closeup)

Wireless LAN (closeup)

Via ADSL2 Over The Internet (Same Country)

Ignoring the occasional congestion from other uses of my home net connection, we see a big jump after the first packet, then another as we go from 3 to 4 packets:

ADSL over internet in same country

ADSL over internet in same country

ADSL over internet in same country (closeup)

ADSL over internet in same country (closeup)

Via ADSL2 Over The Internet (Australia <-> USA)

Here, packet size is completely lost in the noise; the carrier pidgins don’t even notice the extra weight:

Wifi + ADSL2 from Adelaide to US

Wifi + ADSL2 from Adelaide to US

Wifi + ADSL2 from Adelaide to US (closeup)

Wifi + ADSL2 from Adelaide to US (closeup)

Via 3G Cellular Network (HSPA)

I initially did this with Wifi tethering, but the results were weird enough that Joel wrote a little Java wrapper so I could run the test natively on the phone.  It didn’t change the resulting pattern much, but I don’t know if this regularity of delay is a 3G or an Android thing.  Here every packet costs, but you don’t win a prize for having a short packet:

3G network

3G network

3G network (closeup)

3G network (closeup)

Via 2G Network (EDGE)

This one actually gives you a penalty for short packets!  800 bytes to 2100 bytes is the sweet-spot:

2G (EDGE) network

2G (EDGE) network

2G (EDGE) network (closeup)

2G (EDGE) network (closeup)

Summary

So if you’re going to send one byte, what’s the penalty for sending more?  Eyeballing the minimum times from the graphs above:

Wired LAN Wireless ADSL 3G 2G
Penalty for filling packet 30%  15%  5%  0%  0%*
Penalty for second packet 30%  40%  15%  20%  0%
Penalty for fourth packet 60%  80%  25%  40%  25%

* Average for EDGE actually improves by about 35% if you fill packet

No tags Hide

I tested out livingroomofsatoshi.com, which lets you pay any BPAY bill (see explanation from reddit).  Since I’d never heard of the developer, I wasn’t going to send anything large through it, but it worked flawlessly.  At least the exposure is limited to the time between sending the BTC and seeing the BPAY receipt.  Exchange rate was fair, and it was a simple process.

Now I need to convince my wife we should buy some BTC for paying bills…

No tags Hide

As my previous post documented, I’ve experimented with localbitcoins.com.  Following the arrest of two Miami men for trading on localbitcoins, I decided to seek legal advice on the sitation in Australia.

Online research led me to Nick Karagiannis of Kelly and Co, who was already familiar with Bitcoin: I guess it’s a rare opportunity for excitement in financial regulatory circles!  This set me back several thousand dollars (in fiat, unfortunately), but the result was reassuring.

They’ve released an excellent summary of the situation, derived from their research.  I hope that helps other bitcoin users in Australia, and I’ll post more in future should the legal situation change.

No tags Hide

« Previous Entries

Next Page »

Find it!

Theme Design by devolux.org

Tag Cloud