Code Reviewing: popt

I decided that every day I would review some code in ctdb.  I never spend enough time reading code, and yet it’s the only way to really get to know a project.  And I decided to start easy: with the popt library in ctdb.

popt is the command-line parsing library included with the RPM source, and used by SAMBA.  I vaguely recall Tridge telling me years ago how good it was.  I started with the man page, and it is an excellent and useful library: it does enough to make the simple cases less code that getopt_long, yet allows flexibility for complex cases.

The idea is simple: similar to getopt_long, you create an array describing the possible options.  Unlike getopt, however, simple integers and flags are handled automatically.  So you set up the context with the commandline, then hand that context to poptGetNextOpt() to do the parsing.  That keeps parsing args until it hits an error or you’ve marked an argument to return a value for special handling.  The manual page is excellent and made me feel really comfortable with using the code.

Now, the code itself.  Even before you notice the four-space tabs and bumpyCaps, you see the lint annotations and docbook-style comments cluttering the source.  It does make the code annoying to read, breaking CCAN’s Golden Rule.  Typedefs of structs, typedefs to pointers, and several #ifdef  __cplusplus complete my minor gripes.

The code is old and cross-platform: the header test for features we’d simply assume in a modern Linux.  It has a simple set of bitmap macros (taken from pbm, judging from the name), a helper routine to find an executable in $PATH (using alloca!) .  These are the kinds of things which would be in separate modules, were this in CCAN.

The definition of _free() to be a (redundantly-NULL-taking) free() which always returns NULL is used to effect throughout the code:

    defs = _free(defs);

Here is a trick I hadn’t seen before to zero an onstack var, and really don’t recommend:

poptDone done = memset(alloca(sizeof(*done)), 0, sizeof(*done));

The help-printing code is in a separate file, popthelp.c.  It’s actually about half 1/3 of the code!  That’s mainly because it takes pains to pretty-print, and it’s done by manually tracking the column number through the routines (aka ‘cursor’).  These days we’d asprintf into a temp buffer and strlen() to figure out if we should start a new line and indent before printing this.  Or even better, create a struct with the FILE * and the column number, and create a fprintf variant which updated the column number and figured out wrapping for us. Routines like maxArgWidth() which iterates the table(s) to figure how much to indent will still suck however: probably simplest to build all the strings in the help table in memory and then dump at the end.

I thought I found a buffer overrun, but a test program disproved it: the singleOptionHelp() uses its maxLeftCol (plus some extra stuff) to size a buffer.  This works because maxLeftCol is previously calculated as the worst-case size of the argument help.  Now, the code is unclear (AFAICT maxLeftCol must always be sufficient, so the extra bytes are wasted), but not buggy.

In summary, this is excellent code.  Talloc would help it, as would modern printf usage (%.*s for example), but generally the code is in really good shape.  I know that the popt code in RPM is a little updated, but I can’t imagine that much has changed in such an obviously-stable codebase. The temptation to rewrite it is thus very low: who knows what the testsuite would miss?

10 replies on “Code Reviewing: popt”

  1. Also used in openchange, and I find it quite easy to use. Unfortunately it is a bit easy to mis-use. I often find that valgrind tells me I’ve forgotten to clean up something that is deep in the popt stuff.

  2. could you explain the trick on how the ‘done’ struct is zeroing? My first confusion is what I get when you dereference ‘done’, and then, second, wouldn’t alloca() return a pointer to memory NOT the same as the ‘done’ struct (which would have been allocated automatically on the stack prior to the alloca() call right?)? You wouldn’t be zeroing out the right memory?


    1. alloca returns a chunk of memory of size “sizeof(*done)”; even though done is yet to be initialized, this declaration is valid C. memset’s (little-used) return value is always the same as its first argument, ie. the return of alloca() in this case. The vanilla alternative to “poptDone done = memset(alloca(sizeof(*done)), 0, sizeof(*done));” would use a real variable rather than alloca; which is certainly clearer to those who haven’t seen this before. But you can’t zero it generically in a one-liner.

  3. I suppose I should leave a pointer to current active development
    of POPT 2.0 that was largely initiated by this review and private
    “nuging” from Rusty:

    Feature requests and patches are most gratefully received at

    The coding is already underway, perhaps 20% of the way to what *I*
    want to see in POPT 2.0, will likely result in a release sometime this fall
    (depending on my time & energy).

  4. > The vanilla alternative to “poptDone done = memset(alloca(sizeof(*done)), 0, sizeof(*done));” would use a real variable rather than alloca; which is certainly clearer to those who haven’t seen this before. But you can’t zero it generically in a one-liner.

    If you want to zero-initialize members, then:

    poptDone done = { 0 };

    is sufficient and clean. It is slightly different from using memset in that it won’t zero non-member data (i.e., padding between members), and zero-initialization might not be the same as setting all-bits-zero. If you’re not using memcmp on the structure, then the padding bits won’t matter, and usually zero-initialization is preferable to setting all-bits-zero for portability.

    It is perhaps not completely generic if poptDone’s first member is another aggregate type; in that case, it’d need to be:

    poptDone done = { { 0 } };

    and so on.

Comments are closed.