Summary of "Advanced C Coding For Fun!"
Perhaps there was too much fun, and not enough advanced C coding, as one attendee implied. My original intent was to walk through a real implementation in the order I coded it, warts and all, but over 50% got cut for time. After all, it took me 15 minutes in my BoF session just to run through the implementation of ccan/foreach. (Hi to the three people who attended!)
So I ended up doing a fair bit of waving at other code (yes, mainly in CCAN: if I have a useful trick, I tend to put it there). Here's the bullet-point version of my talk with links:
- CCAN is a CPAN-wannabe project for snippets of C code.
- Your headers should be a readable and complete reference on your API.
- Code documentation should be human readable and machine processable (eg. kerneldoc), but extracting it is a waste of time. See above.
- Your headers should contain example code, and this should be compile tested and even executed (ccanlint does this).
- Perl's TAP (Test Anything Protocol) has a C implementation which is easy to use.
- You can write a better ARRAY_SIZE(arr) macro than "sizeof(arr)/sizeof((arr)[0])", using gcc extensions to warn if the argument is actually a pointer, not an array.
- I got bitten by strcmp()'s usually-wrong return value after coding in C for ten years. I suggest defining a streq() macro.
- It is possible, though quite difficult, to implement a fixed-values iterator macro, aka. foreach. It's even efficient if you have C99.
- Making functions return false rather than exit, even if the caller can't really handle the failure, makes for easier testing.
- Making your functions use errno is a bonus, though its semantic limitations are definitely a two-edged sword.
- A common mistake is to call close, fclose, unlink or free in error paths, not realizing that they can alter errno even if they succeed.
- Never think you've written malloc-fail-proof code unless you've tested it thoroughly; otherwise you haven't written malloc-fail-proof code.
- You can test such "never-happen" failure paths automatically by forking; make sure you give a nice way to get a debugger to the fail point though, and terminate failing tests as early as possible.
- There are libraries to make option parsing easier than getopt; popt and ccan/opt are two.
- You can use macros to provide typesafe callbacks rather than forcing callbacks to take void * and cast internally; the compiler will warn you if you change the type of the callback or callback parameter so they no longer match.
- Do not rely on the user to provide zero'd terminators to tables: use a non-zero value so you're much more likely to catch a missing terminator.
- Use talloc for allocation.
- Don't return a void * as a handle, even if you have to make up a type. Your callers' code will be more typesafe that way.
- Don't use global variables in routines unless it's clearly a global requirement: keep everything in the handle pointer.
- Valgrind is awesome. Valgrind with failtesting is invaluable for finding use-after-free and similar exit-path bugs.
- Fixing a test doesn't mean your program doesn't suck. I "fixed" a one-client-dies-while-another-is-talking-to-it bug by grabbing another client; that's stupid, though my test now passes.
- Don't do anything in a signal handler; write to a nonblocking pipe and handle it in your event loop.
- The best way to see why your program is getting larger over time is to use talloc_report() and see your allocation tree (you can use gdb if you need to, à la Carl Worth).
- You might want to do something time-consuming like that in a child; remember to use _exit() in the child to avoid side-effects.
- There are at least two tools which help you dump and restore C structures: genstruct and cdump (coming soon, it's in the talk's git tree for the moment). Both are very limited, though cdump is still being developed.
- You can use a dump/exec/restore pattern to live-upgrade processes; forking a child to test dump and restore is recommended here!
- If your restore code is well-defined for restoring fields that weren't dumped, you can make significant code modifications using this pattern.
- You can use C as a scripting language with a little boilerplate. Use "#if 0" as the first line, followed by the code to recompile and exec, then "#else" followed by the actual code. Make it executable, and the shell will do the right thing.
- You can use gdb to do just about anything to a running program; script it if you can't afford to have it stopped for long.
- The best hash algorithm to use is the Jenkins lookup3 hash (there's a convenient ccan/hash wrapper too).
- The best map/variable array algorithm to use is Judy arrays (much nicer with the ccan/jmap wrapper).
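To make the ARRAY_SIZE and streq points above concrete, here is a minimal sketch. It is not the CCAN code: must_be_array, colours and colour_index are names made up for illustration, and the check relies on the gcc/clang extensions __typeof__ and __builtin_types_compatible_p. Note that this version refuses to compile when handed a pointer, rather than merely warning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strcmp() returns 0 on a match, which reads backwards inside an if(). */
#define streq(a, b) (strcmp((a), (b)) == 0)

/* Evaluates to 0 for a real array.  For a pointer, the type of arr and
 * the type of &arr[0] are compatible, so the array size below becomes
 * negative and compilation fails.  Needs gcc or clang extensions. */
#define must_be_array(arr) \
	(sizeof(char[1 - 2 * __builtin_types_compatible_p( \
		__typeof__(arr), __typeof__(&(arr)[0]))]) - 1)

#define ARRAY_SIZE(arr) \
	(sizeof(arr) / sizeof((arr)[0]) + must_be_array(arr))

/* Hypothetical table, only here to exercise the macros. */
static const char *const colours[] = { "red", "green", "blue" };

/* Returns the index of name in colours[], or -1 if it is not there. */
static int colour_index(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < ARRAY_SIZE(colours); i++)
		if (streq(colours[i], name))
			return (int)i;
	return -1;
}
```

If colours were ever changed to a plain pointer, the loop bound would stop compiling instead of silently becoming sizeof(pointer)/sizeof(element).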
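The points above about returning failure, using errno, and cleanup calls clobbering errno fit together in one small pattern. The sketch below assumes a POSIX system; read_prefix is a hypothetical helper invented for this example, not something from the talk. It returns NULL instead of exiting, and saves errno around free() and close() so the caller sees the original error.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from path into a freshly
 * allocated, NUL-terminated buffer.  Returns NULL on failure, with
 * errno describing the first thing that went wrong. */
static char *read_prefix(const char *path, size_t len)
{
	char *buf;
	int fd, saved_errno;
	ssize_t r;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;	/* errno set by open(), nothing to clean up */

	buf = malloc(len + 1);
	if (!buf) {
		saved_errno = errno;
		close(fd);	/* may change errno, even on success */
		errno = saved_errno;
		return NULL;
	}

	r = read(fd, buf, len);
	if (r < 0) {
		saved_errno = errno;
		free(buf);
		close(fd);
		errno = saved_errno;
		return NULL;
	}

	buf[r] = '\0';
	close(fd);
	return buf;
}
```

Because the function reports failure rather than calling exit(), a test can provoke each error path and check both the return value and errno.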
That was all I had room for; there was none for questions, and even the last two points were squished onto the final "Questions?" slide.
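As a postscript, the signal handler advice in the list can be sketched as the usual "self-pipe" arrangement. This is an illustration under assumptions, not the code from the talk: it assumes POSIX, and sig_pipe, on_signal, signal_pipe_init and signal_pipe_next are names invented here. The handler only writes one byte to a nonblocking pipe; the event loop would poll the read end and call signal_pipe_next() when it is readable.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for sigaction() under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };

/* The handler does nothing but write the signal number to the pipe. */
static void on_signal(int signum)
{
	int saved_errno = errno;
	unsigned char c = (unsigned char)signum;

	if (write(sig_pipe[1], &c, 1) < 0) {
		/* Pipe full: earlier bytes will still wake the event loop. */
	}
	errno = saved_errno;
}

static int set_nonblock(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe and install the handler.  Returns 0, or -1 with errno set. */
static int signal_pipe_init(int signum)
{
	struct sigaction sa;
	int saved_errno;

	if (pipe(sig_pipe) != 0)
		return -1;
	if (set_nonblock(sig_pipe[0]) < 0 || set_nonblock(sig_pipe[1]) < 0) {
		saved_errno = errno;
		close(sig_pipe[0]);
		close(sig_pipe[1]);
		sig_pipe[0] = sig_pipe[1] = -1;
		errno = saved_errno;
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction(signum, &sa, NULL);
}

/* Call from the event loop when sig_pipe[0] is readable.
 * Returns the next pending signal number, or 0 if there is none. */
static int signal_pipe_next(void)
{
	unsigned char c;

	assert(sig_pipe[0] >= 0);
	if (read(sig_pipe[0], &c, 1) == 1)
		return c;
	return 0;
}
```

All the real work then happens in ordinary code, where malloc, stdio and locks are safe to use.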