Rusty Russell's Coding Blog | Stealing From Smart People



Speeding CCAN Testing (By Not Optimizing)

So, ccanlint has accreted into a vital tool for me when writing standalone bits of code; it does various sanity checks (licensing, documentation, dependencies) and then runs the tests, offers to run the first failure under gdb, etc.   With the TDB2 work, I just folded in the whole TDB1 code and hence its testsuite, which made it blow out from 46 to 71 tests.  At this point, ccanlint takes over ten minutes!

This is for two reasons: firstly because ccanlint runs everything serially, and secondly because ccanlint runs each test four times: once to see if it passes, once to get coverage, once under valgrind, and once with all the platform features it tests turned off (e.g. HAVE_MMAP).  I balked at running the reduced-feature variant under valgrind, though ideally I’d do that too.
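Roughly, the four passes amount to compile/run cycles like these (command lines are illustrative, not ccanlint’s actual invocations, and the file names are made up; -fprofile-arcs/-ftest-coverage are gcc’s real gcov flags):

```sh
# 1. Plain pass: does the test pass at all?
cc -o run-foo run-foo.c && ./run-foo

# 2. Coverage pass: rebuild with gcov instrumentation and run again.
cc -fprofile-arcs -ftest-coverage -o run-foo-cov run-foo.c && ./run-foo-cov

# 3. Valgrind pass: re-run the plain binary under valgrind.
valgrind -q --error-exitcode=1 ./run-foo

# 4. Reduced-feature pass: rebuild with platform features forced off.
cc -DHAVE_MMAP=0 -o run-foo-nofeat run-foo.c && ./run-foo-nofeat
```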

Before going parallel, I thought I should cut down the compile/run cycles.  A bit of measurement gives some interesting results (on the initial TDB2 with 46 tests):

  1. Compiling the tests takes 24 seconds.
  2. Running the tests takes 12 seconds.
  3. Compiling the tests with coverage support takes 32 seconds.
  4. Running the tests with coverage support takes 32 seconds.
  5. Running the tests under valgrind takes 204 seconds (a 17x slowdown).
  6. Running the tests with coverage under valgrind takes 326 seconds.

It’s no surprise that valgrind is the slowest step, but I was surprised that compiling is slower than running the tests.  This is because CCAN “run” tests actually #include the entire module source so they can do invasive testing.

So the simple approach of compiling up once, with -fprofile-arcs -ftest-coverage, and running that under valgrind to get everything in one go is much slower (from 325 up to 407 seconds!).  The only win is to skip running the tests without valgrind, shaving 11 seconds off (about 2%).
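That all-in-one experiment amounts to something like this (again illustrative command lines with made-up file names):

```sh
# Build once with coverage instrumentation...
cc -fprofile-arcs -ftest-coverage -o run-foo run-foo.c
# ...then run only under valgrind, hoping to get pass/fail, coverage
# data and memory checking from a single execution.
valgrind -q --error-exitcode=1 ./run-foo
gcov run-foo.c
```

The catch, as the numbers show, is that valgrind then has to emulate the slower instrumented binary for every test.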

One easy thing to do would be to compile with optimization to speed the tests up.  Valgrind documentation (and my testing) confirms that using “-O” doesn’t affect the results on any CCAN module, so that should make it run faster, for very little effort.  When I actually measured, total test time increased from 407 seconds to 495, because compiling with optimization is so slow.  Here are the numbers:

  1. Compiling the tests with optimization (-O/-O2/-O3) takes 54/77/130 seconds.
  2. Running the tests with optimization takes 11/11/11 seconds.
  3. Running the tests under valgrind with optimization takes 201/208/208 seconds.

So no joy there. Time to go and fix up my tests to run faster, and make ccanlint run (and compile!) them in parallel…


4 Comments for Speeding CCAN Testing (By Not Optimizing)

Roger | August 25, 2011 at 7:04 pm

I use -Os for valgrind testing (optimize for size). The less code it has to emulate the better.

Brad Hards | August 26, 2011 at 11:01 am

Parallel testing will (obviously) help, but perhaps you can drop valgrind with coverage. Just run the coverage tests without valgrind. The coverage options are distorting the code so it isn’t fully representative of real runtime behaviour.

You wouldn’t want to lose the full capability, but perhaps it isn’t necessary to do it with every run.

Or perhaps you can optimise this a bit more selectively (e.g. what tests are costing you the most time, and what are the risks with those tests)?

Author comment by rusty | August 27, 2011 at 11:14 am

Yes, I now run valgrind on the non-cov versions, and recompile for coverage; doing it all-in-one was just a (failed) experiment.

ccanlint has various options to avoid steps, but it’s nice to just say “ccanlint -v” and have it Do Everything. Only faster :)

Author comment by rusty | August 29, 2011 at 11:12 am

Interesting idea re: -Os, but doesn’t help here: compilation took 78 seconds, and valgrind took 204 seconds.

Mind you, most of our valgrind time is from all the forking we do; that hurts.



