Speeding CCAN Testing (By Not Optimizing)

So, ccanlint has accreted into a vital tool for me when writing standalone bits of code; it does various sanity checks (licensing, documentation, dependencies) and then runs the tests, offers to run the first failure under gdb, etc. With the TDB2 work, I just folded in the whole TDB1 code and hence its testsuite, which made it blow out from 46 to 71 tests. At this point, ccanlint takes over ten minutes!

This is for two reasons: firstly because ccanlint runs everything serially, and secondly because ccanlint runs each test four times: once to see if it passes, once to get coverage, once under valgrind, and once with all the platform features it tests turned off (eg. HAVE_MMAP). I balked at running the reduced-feature variant under valgrind, though ideally I'd do that too.

Before going parallel, I thought I should cut down the compile/run cycles. A bit of measurement gives some interesting results (on the initial TDB2 with 46 tests):

Compiling the tests takes 24 seconds.
Running the tests takes 12 seconds.
Compiling the tests with coverage support takes 32 seconds.
Running the tests with coverage support takes 32 seconds.
Running the tests under valgrind takes 204 seconds (17x slowdown).
Running the tests with coverage under valgrind takes 326 seconds.

It's no surprise that valgrind is the slowest step, but I was surprised that compiling is slower than running the tests. This is because CCAN "run" tests actually #include the entire module source so they can do invasive testing.

So the simple approach of compiling up once, with -fprofile-arcs -ftest-coverage, and running that under valgrind to get everything in one go is much slower (from 325 up to 407 seconds!). The only win is to skip running the tests without valgrind, shaving 11 seconds off (about 2%).
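Part of the reason shows up in the itemized run times alone. The sketch below is just arithmetic on the measurements listed earlier; the names are mine, nothing here is part of ccanlint, and it does not try to reproduce the 325 and 407 second totals.

```python
# Timings in seconds, copied from the measurements listed earlier
# (initial TDB2, 46 tests). Keys are illustrative names only.
timings = {
    "compile": 24,
    "run": 12,
    "compile_coverage": 32,
    "run_coverage": 32,
    "run_valgrind": 204,
    "run_coverage_valgrind": 326,
}

def ratio(slow, fast):
    """Return how many times longer `slow` takes than `fast`."""
    return timings[slow] / timings[fast]

# Valgrind versus a plain run: the 17x slowdown quoted in the list.
valgrind_slowdown = ratio("run_valgrind", "run")

# Compiling the plain tests versus running them.
compile_vs_run = ratio("compile", "run")

# One combined coverage+valgrind run, versus doing a coverage run
# and a valgrind run separately.
separate_runs = timings["run_coverage"] + timings["run_valgrind"]
combined_penalty = timings["run_coverage_valgrind"] - separate_runs

print(f"valgrind slowdown: {valgrind_slowdown:.0f}x")
print(f"compile/run ratio: {compile_vs_run:.0f}x")
print(f"combined run costs {combined_penalty} seconds more than separate runs")
```

On these figures the single coverage-under-valgrind run takes 326 seconds against 32 + 204 = 236 for the two separate runs, so the instrumented binary costs far more under valgrind than it saves by dropping a run.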

One easy thing to do would be to compile with optimization to speed the tests up. Valgrind documentation (and my testing) confirms that using "-O" doesn't affect the results on any CCAN module, so that should make it run faster, for very little effort. When I actually measured, total test time increased from 407 seconds to 495, because compiling with optimization is so slow. Here are the numbers:

Compiling the tests with optimization (-O/-O2/-O3) takes 54/77/130 seconds.
Running the tests with optimization takes 11/11/11 seconds.
Running the tests under valgrind with optimization takes 201/208/208 seconds.
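A rough way to see why optimization loses is to set the extra compile time against the time saved under valgrind. This sketch assumes the unoptimized baseline is the 24-second compile and the 204-second valgrind run measured earlier; I haven't stated whether the optimized builds also had coverage enabled, so treat the deltas as indicative only.

```python
# Unoptimized baseline in seconds, from the earlier measurements.
# Assumption: these are the right figures to compare against.
BASE_COMPILE = 24
BASE_VALGRIND = 204

# (compile seconds, valgrind run seconds) per optimization level,
# copied from the numbers above.
optimized = {
    "-O": (54, 201),
    "-O2": (77, 208),
    "-O3": (130, 208),
}

def net_change(level):
    """Net seconds added at this level; positive means slower overall."""
    compile_secs, valgrind_secs = optimized[level]
    extra_compile = compile_secs - BASE_COMPILE
    extra_run = valgrind_secs - BASE_VALGRIND
    return extra_compile + extra_run

for level in optimized:
    print(f"{level}: {net_change(level):+d} seconds")
```

Under that assumption every level comes out slower overall: the compile gets 30 to 106 seconds longer, while the valgrind run changes by only a few seconds either way.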

So no joy there. Time to go and fix up my tests to run faster, and make ccanlint run (and compile!) them in parallel...