Rusty Russell's Coding Blog | Stealing From Smart People



xz vs rzip

As the kernel archive debates replacing .bz2 files with .xz, I took a brief glance at xz. My test was to take a tarball of the Linux kernel source (made from a recent git tree, but excluding the .git directory):

     linux.2.6.tar 395M

For comparison, here are the results of bzip2 -9, rzip -9 (which uses bzip2 after finding distant matches), and xz:

     linux.2.6.tar.bz2 67M
     linux.2.6.tar.rz 65M
     linux.2.6.tar.xz 55M
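The bzip2 and xz halves of this comparison can be roughly repeated with Python's standard `bz2` and `lzma` modules. This is only a sketch under assumptions, not the commands behind the numbers above: the helper names `compressed_size` and `compare` are made up for illustration, preset 6 is used because it is the xz default, and rzip is left out because the standard library has no binding for it.

```python
import bz2
import lzma

CHUNK = 1 << 20  # read 1 MiB at a time so a large tarball never sits fully in memory


def compressed_size(path, compressor):
    """Stream a file through a compressor object and count the output bytes."""
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            total += len(compressor.compress(block))
    # flush() emits whatever the compressor is still buffering
    total += len(compressor.flush())
    return total


def compare(path):
    """Return compressed sizes in bytes for bzip2 level 9 and xz preset 6."""
    return {
        "bz2 -9": compressed_size(path, bz2.BZ2Compressor(9)),
        "xz (preset 6)": compressed_size(path, lzma.LZMACompressor(preset=6)),
    }
```

Calling `compare("linux.2.6.tar")` on a tarball like the one above would give byte counts to set against the .bz2 and .xz sizes; exact figures will differ with library versions and tree contents.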

So, I hacked rzip with a -R option to output non-bzip’d blocks:

     linux.2.6.tar.rawrz 269M

Running xz on this file simulates what would happen if rzip used xz instead of libbz2:

     linux.2.6.tar.rawrz.xz 57M

Hmm, it makes xz worse!  OK, what if we rev up the conservative rzip to use 1G of memory rather than 128M max?  And then xz that?

     linux.2.6.tar.rawrz 220M
     linux.2.6.tar.rawrz.xz 58M

The result actually gets worse as rzip does more work, implying that xz is finding quite long-distance matches on its own (bzip2 won’t find matches more than 900k apart).  So rzip could only have a benefit over xz on really huge files; but note that current rzip is limited to 4G file sizes, so the useful window is pretty small.
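The match-distance point can be illustrated with a small synthetic experiment. The sketch below is an illustration under assumptions, not a measurement on the kernel tarball: it builds 2 MiB of random bytes followed by an exact copy, so the only redundancy sits 2 MiB back. That is beyond bzip2's 900k block, but inside the 8 MiB dictionary that xz uses at its default preset 6. The function names `distant_repeat` and `compressed_sizes` are hypothetical.

```python
import bz2
import lzma
import os


def distant_repeat(chunk_size=2 * 1024 * 1024):
    """Random chunk followed by an exact copy: the only match is chunk_size bytes away."""
    chunk = os.urandom(chunk_size)
    return chunk + chunk


def compressed_sizes(data):
    """Compressed sizes in bytes for bzip2 level 9 and xz preset 6."""
    return {
        "bz2": len(bz2.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=6)),
    }


data = distant_repeat()
sizes = compressed_sizes(data)
for name, size in sizes.items():
    # ratio near 1.0 means the repeat was missed, near 0.5 means it was found
    print(f"{name}: {size} bytes ({size / len(data):.2f} of input)")
```

bzip2 should come out at roughly the full input size, since each 900k block looks like pure noise, while xz should land near half, having encoded the second copy as references to the first. Real source trees are far less clear-cut, but it is the same effect that leaves rzip little to add in front of xz.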


2 Comments for xz vs rzip

Mikael | February 16, 2010 at 12:07 am

Did you try too? Con has done some benchmarks with linux already in
lrzip has a -n switch to not do any backend compression so you can try it with xz too, but I think xz and lzma are very similar.

Followup: lrzip - Rusty Russell's Coding Blog | February 16, 2010 at 11:52 am

[…] noted in my previous post that Con Kolivas’s lrzip is another interesting compressor.  In fact, […]


