Code review: libreplace
libreplace is the SAMBA library (also used in ctdb) to provide working implementations of various standard(ish) functions on platforms where they are missing or flawed. It was initially created in 1996 by Andrew Tridgell based on various existing replacement hacks in utils.c (see commit 3ee9d454).
The basic format of replace.h is:
#ifndef HAVE_STRDUP #define strdup rep_strdup char *rep_strdup(const char *s); #endif
If configure fails to identify the given function X, rep_X is used in its place. replace.h has some such declarations, but most have migrated to the system/ include directory which has loosely grouped functions by categories such as dir.h, select.h, time.h, etc. This works around the "which header(s) do I include" problem as well as guaranteeing specific functions.
Other than reading this code for a sense of Unix-like paleontology (and it's so hard to tell when to remove any of these helpers that cleanups are rare) we can group replacements into three categories:
- Helper functions or definitions which are missing, eg. strdup or S_IRWXU.
- "Works for me" hacks for platform limitations, which make things compile but are not general, and
- Outright extensions, such as #define ZERO_STRUCT(x) memset((char *)&(x), 0, sizeof(x)) or Linux kernel inspired likely()
Since it's autoconf-based, it uses the standard #ifdef instead of #if (a potential source of bugs, as I've mentioned before). I'll concentrate on the insufficiently-general issues which can bite users of the library, and a few random asides.
- #ifndef HAVE_VOLATILE ? I can't believe Samba still compiles on a compiler that doesn't support volatile (this just defines volatile away altogether) If it did no optimizations whatsoever, volatile might not matter, but I'm suspicious...
- typedef int bool; is a fairly common workaround for lack of bool, but since pointers implicitly cast to bool but can get truncated when passed as an int, it's a theoretical trap. ie. (bool)0x1234567800000000 == true, (int)0x1234567800000000 == 0.
- #if !defined(HAVE_VOLATILE) is the same test as above, repeated. It's still as bad an idea as it was 186 lines before :)
- ZERO_STRUCT, ZERO_ARRAY and ARRAY_SIZE are fairly sane, but could use gcc extensions to check their args where available. I implemented this for ARRAY_SIZE in the Linux kernel and in CCAN. Making sure an arg is a struct is harder, but we could figure something...
- #define PATH_MAX 1024 assumes that systems which don't define PATH_MAX probably have small path limits. If it's too short though, it opens up buffer overruns. Similarly for NGROUPS_MAX and PASSWORD_LENGTH.
- The dlopen replacement is cute: it uses shl_load where available (Google says HPUX), but dlerror simply looks like so:
#ifndef HAVE_DLERROR char *rep_dlerror(void) { return "dynamic loading of objects not supported on this platform"; } #endif
This cute message for runtime failure allows your code to compile, but isn't helpful if dlopen was a requirement. Also, this should use strerror for shl_load.
- havenone.h is (I assume) a useful header for testing all the replacements at once: it undefines all HAVE_ macros. Unfortunately it hasn't been updated, and so it isn't complete (unused code is buggy code).
- inet_pton is credited to Paul Vixie 1996. It's K&R-style non-prototype, returns an int instead of bool, and doesn't use strspn ((pch = strchr(digits, ch)) != NULL) or (better) atoi. But it checks for exactly 4 octets, numbers > 255, and carefully doesn't write to dst unless it succeeds. I would have used sscanf(), which wouldn't have caught too-long input like "1.2.3.4.5". OTOH, it would catch "127...1" which this would allow. But making input checks more strict is a bad way to be popular...
- Tridge's opendir/readdir/telldir/seekdir/closedir replacement in repdir_getdents.c is a replacement for broken telldir/seekdir in the presence of deletions, and a workaround for (older?) BSD's performance issues. It is in fact never used, because the configure test has had #error _donot_use_getdents_replacement_anymore in it since at least 2006 when the Samba4 changes were merged back into a common library!
- repdir_getdirents.c is the same thing, implemented in terms of getdirents rather than getdents; it's still used if the telldir/delete/seekdir test fails.
- replace.c shows some of the schizophrenia of approaches to replacement: rep_ftruncate #errors if there's no chsize or F_FREESP ioctl which can be used instead, but rep_initgroups returns -1/ENOSYS in the similar case. Best would be not to implement replacements if none can be implemented, so compile will fail if they're used.
- rep_pread and rep_pwrite are classic cases of the limitations of replacement libraries like this. As pread is not supposed to effect the file offset, and file offsets are shared with children or dup'd fds. There's no sane general way to implement this, and in fact tdb has to test this in tdb_reopen_internal. I would implement a read_seek/write_seek which are documented not to have these guarantees. I remember Tridge ranting about glibc doing the same kind of unsafe implementation of pselect :)
- snprintf only rivals qsort-with-damn-priv-pointer for pain of "if only they'd done the original function right, I wouldn't have to reimplement the entire thing". I'll avoid the glibc-extracted strptime as well.
I'm not sure Samba compiles on as many platforms as it used to; Perl is probably a better place for this kind of library to have maximum obscure-platform testing. But if I were to put this in CCAN, this would make an excellent start.