Archive for February 2010
Headed through Germany 26th through 3rd March or so, then Lithuania via Poland. Back via Singapore on 24/25 March.
My email will be intermittent (I hope!) but if you’re around and want to grab a meal or a beer with us, ping me!
Mikael noted in my previous post that Con Kolivas’s lrzip is another interesting compressor. In fact, Con has already done a simple 64-bit enhance of rzip for lrzip, and on our example file it gets 56M vs 55M for xz (lrzip in raw mode, followed by xz, gives 100k worse than just using lrzip: lrzip already uses lzma).
Assuming no bugs in rzip, the takeaway here is simple: rzip should not attempt to find matches within the range that the backend compressor (900k for bzip2 in rzip, 32k for gzip, megabytes for LZMA as used by lrzip). The backend compressor will do a better job (as shown by similar results with lrzip when I increase the hash array size so it finds more matches: the resulting file is larger).
The rzip algorithm is good at finding matches over huge distances, and that is what it should stick to. Huge here == size of file (rzip does not stream, for this reason). And this implies only worrying about large matches over huge distances (the current 32 byte minimum is probably too small). The current version of rzip uses an mmap window so it never has to seek, but this window is artificially limited to 900MB (or 60% of mem in lrzip). If we carefully limit the number of comparisons with previous parts of the file, we may be able to reduce them to the point where we don’t confuse the readahead algorithms and thus get nice performance (fadvise may help here too) whether we are mmaped or seeking.
I like the idea that rzip should scale with the size of the file being compressed, not make assumptions about today’s memory sizes. Though some kind of thrash detection using mincore might be necessary to avoid killing our dumb mm systems :(
As the kernel archive debates replacing .bz2 files with .xz, I took a brief glance at xz. My test was to take a tarball of the linux kernel source (made from a recent git tree, but excluding the .git directory):
linux.2.6.tar 395M
For a comparison, bzip2 -9, rzip -9 (which uses bzip2 after finding distant matches), and xz:
linux.2.6.tar.bz2 67M
linux.2.6.tar.rz 65M
linux.2.6.tar.xz 55M
So, I hacked rzip with a -R option to output non-bzip’d blocks:
linux.2.6.tar.rawrz 269M
Xz on this file simulates what would happen if rzip used xz instead of libbz2:
linux.2.5.tar.rawrz.xz 57M
Hmm, it makes xz worse! OK, what if we rev up the conservative rzip to use 1G of memory rather than 128M max? And the xz that?
linux.2.6.tar.rawrz 220M
linux.2.6.tar.rawrz.xz 58M
It actually gets worse as rzip does more work, implying xz is finding quite long-distance matches (bzip2 won’t find matches over more than 900k). So, rzip could only have benefit over xz on really huge files: but note that current rzip is limited on filesize to 4G so it’s a pretty small useful window.
libreplace is the SAMBA library (also used in ctdb) to provide working implementations of various standard(ish) functions on platforms where they are missing or flawed. It was initially created in 1996 by Andrew Tridgell based on various existing replacement hacks in utils.c (see commit 3ee9d454).
The basic format of replace.h is:
#ifndef HAVE_STRDUP
#define strdup rep_strdup
char *rep_strdup(const char *s);
#endif
If configure fails to identify the given function X, rep_X is used in its place. replace.h has some such declarations, but most have migrated to the system/ include directory which has loosely grouped functions by categories such as dir.h, select.h, time.h, etc. This works around the “which header(s) do I include” problem as well as guaranteeing specific functions.
Other than reading this code for a sense of Unix-like paleontology (and it’s so hard to tell when to remove any of these helpers that cleanups are rare) we can group replacements into three categories:
- Helper functions or definitions which are missing, eg. strdup or S_IRWXU.
- “Works for me” hacks for platform limitations, which make things compile but are not general, and
- Outright extensions, such as #define ZERO_STRUCT(x) memset((char *)&(x), 0, sizeof(x)) or Linux kernel inspired likely()
Since it’s autoconf-based, it uses the standard #ifdef instead of #if (a potential source of bugs, as I’ve mentioned before). I’ll concentrate on the insufficiently-general issues which can bite users of the library, and a few random asides.
- #ifndef HAVE_VOLATILE ? I can’t believe Samba still compiles on a compiler that doesn’t support volatile (this just defines volatile away altogether) If it did no optimizations whatsoever, volatile might not matter, but I’m suspicious…
- typedef int bool; is a fairly common workaround for lack of bool, but since pointers implicitly cast to bool but can get truncated when passed as an int, it’s a theoretical trap. ie. (bool)0×1234567800000000 == true, (int)0×1234567800000000 == 0.
- #if !defined(HAVE_VOLATILE) is the same test as above, repeated. It’s still as bad an idea as it was 186 lines before :)
- ZERO_STRUCT, ZERO_ARRAY and ARRAY_SIZE are fairly sane, but could use gcc extensions to check their args where available. I implemented this for ARRAY_SIZE in the Linux kernel and in CCAN. Making sure an arg is a struct is harder, but we could figure something…
- #define PATH_MAX 1024 assumes that systems which don’t define PATH_MAX probably have small path limits. If it’s too short though, it opens up buffer overruns. Similarly for NGROUPS_MAX and PASSWORD_LENGTH.
- The dlopen replacement is cute: it uses shl_load where available (Google says HPUX), but dlerror simply looks like so:
#ifndef HAVE_DLERROR char *rep_dlerror(void) { return "dynamic loading of objects not supported on this platform"; } #endifThis cute message for runtime failure allows your code to compile, but isn’t helpful if dlopen was a requirement. Also, this should use strerror for shl_load.
- havenone.h is (I assume) a useful header for testing all the replacements at once: it undefines all HAVE_ macros. Unfortunately it hasn’t been updated, and so it isn’t complete (unused code is buggy code).
- inet_pton is credited to Paul Vixie 1996. It’s K&R-style non-prototype, returns an int instead of bool, and doesn’t use strspn ((pch = strchr(digits, ch)) != NULL) or (better) atoi. But it checks for exactly 4 octets, numbers > 255, and carefully doesn’t write to dst unless it succeeds. I would have used sscanf(), which wouldn’t have caught too-long input like “1.2.3.4.5″. OTOH, it would catch “127…1″ which this would allow. But making input checks more strict is a bad way to be popular…
- Tridge’s opendir/readdir/telldir/seekdir/closedir replacement in repdir_getdents.c is a replacement for broken telldir/seekdir in the presence of deletions, and a workaround for (older?) BSD’s performance issues. It is in fact never used, because the configure test has had #error _donot_use_getdents_replacement_anymore in it since at least 2006 when the Samba4 changes were merged back into a common library!
- repdir_getdirents.c is the same thing, implemented in terms of getdirents rather than getdents; it’s still used if the telldir/delete/seekdir test fails.
- replace.c shows some of the schizophrenia of approaches to replacement: rep_ftruncate #errors if there’s no chsize or F_FREESP ioctl which can be used instead, but rep_initgroups returns -1/ENOSYS in the similar case. Best would be not to implement replacements if none can be implemented, so compile will fail if they’re used.
- rep_pread and rep_pwrite are classic cases of the limitations of replacement libraries like this. As pread is not supposed to effect the file offset, and file offsets are shared with children or dup’d fds. There’s no sane general way to implement this, and in fact tdb has to test this in tdb_reopen_internal. I would implement a read_seek/write_seek which are documented not to have these guarantees. I remember Tridge ranting about glibc doing the same kind of unsafe implementation of pselect :)
- snprintf only rivals qsort-with-damn-priv-pointer for pain of “if only they’d done the original function right, I wouldn’t have to reimplement the entire thing”. I’ll avoid the glibc-extracted strptime as well.
I’m not sure Samba compiles on as many platforms as it used to; Perl is probably a better place for this kind of library to have maximum obscure-platform testing. But if I were to put this in CCAN, this would make an excellent start.
