Typesafe callbacks in C (and gcc)

A classic pattern in C is to hand a generic callback function around which takes a "void *priv" pointer so the function can take arbitrary state (side note: a classic anti-pattern is not to do this, resulting in qsort being reimplemented in Samba so one can be provided!).

The problem with this pattern is that it breaks type safety completely, such as in the following example:
    int register_callback(void (*callback)(void *priv), void *priv);

    static void my_callback(void *_obj)
    {
        struct obj *obj = _obj;
        ...
    }
    ...
    register_callback(my_callback, &my_obj);

If I change the type of my_obj, there's no compiler warning that I'm now handing my callback something it doesn't expect.
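To make the hazard concrete, here is a small sketch of the untyped pattern. The names struct other, demo_mismatch() and fire_callback() are hypothetical and exist only for illustration; the point is that the mismatched registration compiles without a diagnostic.

```c
#include <assert.h>

/* Storage for the single registered callback (illustrative only). */
static void (*saved_cb)(void *);
static void *saved_priv;

int register_callback(void (*callback)(void *priv), void *priv)
{
    saved_cb = callback;
    saved_priv = priv;
    return 0;
}

void fire_callback(void)
{
    saved_cb(saved_priv);
}

struct obj { int value; };
struct other { double weight; };   /* hypothetical "changed" type */

static int seen_value;

void my_callback(void *_obj)
{
    struct obj *obj = _obj;        /* silently trusts the caller */
    seen_value = obj->value;
}

/* Registers the wrong kind of object; void * accepts it without a warning.
 * The callback is deliberately never fired on this registration. */
int demo_mismatch(void)
{
    static struct other my_obj = { 1.5 };
    return register_callback(my_callback, &my_obj);
}
```

Registering a struct obj and calling fire_callback() behaves as expected; demo_mismatch() compiles just as quietly, and that silence is the problem.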

Some time ago, after such a change bit me, I proposed a patch to lkml to rectify this, using a typesafe callback mechanism. It was a significant change, and the kernel tends to be lukewarm on safety issues so it went nowhere. But these thoughts did evolve into CCAN's typesafe_cb module.

The tricksiness...

More recently, I tried to use it in libctdb (the new async ctdb library I've been toying with), and discovered a fatal flaw. To understand the problem, you have to dive into how I implemented typesafe_cb. At its base is a conditional cast macro: cast_if_type(desttype, expr, oktype). If expr is of type "oktype", cast it to "desttype". On compilers which don't support the fancy gcc builtins needed to do this, this just becomes an unconditional cast "(desttype)(expr)". This allows us to do the following:
    #define register_callback(func, priv) \
        _register_callback(cast_if_type(void (*)(void *), (func), \
                                        void (*)(typeof(priv))), (priv))

This says that we cast the func to the generic function type only if it exactly matches the private argument. The real typesafe_cb macro is more complex than this because it needs to ensure that priv is a pointer, but you get the idea.
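One way such a conditional cast could be built is with gcc's __builtin_choose_expr and __builtin_types_compatible_p. The following is a sketch under that assumption, not necessarily the exact CCAN code; struct counter, bump() and fire_callback() are hypothetical names.

```c
#include <assert.h>

#if defined(__GNUC__)
/* Cast expr to desttype only if its type is compatible with oktype.
 * "1 ? (expr) : 0" makes a function name decay to a function pointer.
 * On a mismatch expr is left uncast, so the compiler complains at the call. */
#define cast_if_type(desttype, expr, oktype)                               \
    __builtin_choose_expr(                                                 \
        __builtin_types_compatible_p(__typeof__(1 ? (expr) : 0), oktype),  \
        (desttype)(expr), (expr))
#else
/* Fallback: unconditional cast, no checking. */
#define cast_if_type(desttype, expr, oktype) ((desttype)(expr))
#endif

static void (*saved_cb)(void *);
static void *saved_priv;

int _register_callback(void (*callback)(void *priv), void *priv)
{
    saved_cb = callback;
    saved_priv = priv;
    return 0;
}

#define register_callback(func, priv)                                \
    _register_callback(cast_if_type(void (*)(void *), (func),        \
                                    void (*)(__typeof__(priv))),     \
                       (priv))

/* Calls through the cast pointer; this relies on all object pointers
 * sharing one calling convention, as on common ABIs. */
void fire_callback(void)
{
    saved_cb(saved_priv);
}

struct counter { int hits; };

void bump(struct counter *c)       /* typed callback, no void * needed */
{
    c->hits++;
}
```

With this, register_callback(bump, &some_counter) compiles cleanly, while passing a pointer to any other type should leave bump uncast and draw an incompatible-pointer diagnostic.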

Now, one great trick is that the callback function can take a "const" (or volatile) pointer of the priv type, and we let that work as well: we have a "cast_if_any" which extends "cast_if_type" to any of three types:
    #define typesafe_cb(rtype, fn, arg) \
        cast_if_any(rtype (*)(void *), (fn), \
                    rtype (*)(typeof(*arg)*), \
                    rtype (*)(const typeof(*arg)*), \
                    rtype (*)(volatile typeof(*arg)*))
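
As an illustration of the three-way check, cast_if_any could be sketched as below, assuming the same gcc builtins as a plain conditional cast would use. This is a guess at the shape, not the verbatim CCAN definition; struct stats, total() and the registration helpers are hypothetical.

```c
#include <assert.h>

/* Cast expr to desttype if its type matches any of ok1, ok2 or ok3. */
#define cast_if_any(desttype, expr, ok1, ok2, ok3)                          \
    __builtin_choose_expr(                                                  \
        __builtin_types_compatible_p(__typeof__(1 ? (expr) : 0), ok1) ||    \
        __builtin_types_compatible_p(__typeof__(1 ? (expr) : 0), ok2) ||    \
        __builtin_types_compatible_p(__typeof__(1 ? (expr) : 0), ok3),      \
        (desttype)(expr), (expr))

/* Accept callbacks taking a plain, const or volatile pointer to *arg. */
#define typesafe_cb(rtype, fn, arg)                          \
    cast_if_any(rtype (*)(void *), (fn),                     \
                rtype (*)(__typeof__(*(arg)) *),             \
                rtype (*)(const __typeof__(*(arg)) *),       \
                rtype (*)(volatile __typeof__(*(arg)) *))

static int (*saved_cb)(void *);
static void *saved_priv;

int _register_callback(int (*callback)(void *priv), void *priv)
{
    saved_cb = callback;
    saved_priv = priv;
    return 0;
}

#define register_callback(fn, arg) \
    _register_callback(typesafe_cb(int, (fn), (arg)), (arg))

int fire_callback(void)
{
    return saved_cb(saved_priv);
}

struct stats { int a, b; };

/* Takes a const pointer even though a non-const one is registered. */
int total(const struct stats *s)
{
    return s->a + s->b;
}
```

Here a struct stats * is registered together with a callback that takes const struct stats *, which matches the second of the three accepted types.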

The flaw...

If your private arg is a pointer to an undefined (incomplete) type, typeof(*arg) won't work, and you need this to declare a const pointer to the same type. I have just filed a bug report, but meanwhile, I need a solution.

The workarounds...

Rather than use cast_if_any, you can insert an explicit call to the callback to evoke a warning if the private arg doesn't match, then just cast the callback function. This is in fact what I now do, with an additional test that the return type of the function exactly matches the expected return type. cast_if_type() now takes an extra argument, which is the type to test:

    #define typesafe_cb(rtype, fn, arg) \
        cast_if_type(rtype (*)(void *), (fn), (fn)(arg), rtype)

cast_if_type does a typeof() on (fn)(arg), which will cause a warning if the arg doesn't match the function, and the cast_if_type will only cast (fn) if the return type matches rtype. You can't test the return type using a normal test (eg. "rtype _test; sizeof(_test = fn(arg));") because implicit integer conversion makes this compile without a warning even if fn() returns a different integer type.
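
A sketch of how the four-argument form might look, again assuming gcc's builtins rather than quoting the real module. The names struct conn, on_event(), demo_register() and fire_callback() are hypothetical. Note that struct conn is still incomplete at the point of registration, which is the case the call-based test is meant to survive.

```c
#include <assert.h>

/* Cast expr to desttype only if the type of the (unevaluated) test
 * expression is compatible with oktype. */
#define cast_if_type(desttype, expr, test, oktype)                  \
    __builtin_choose_expr(                                          \
        __builtin_types_compatible_p(__typeof__(test), oktype),     \
        (desttype)(expr), (expr))

/* typeof((fn)(arg)) type-checks the call without running it, and its
 * result type must be exactly rtype for the cast to happen. */
#define typesafe_cb(rtype, fn, arg) \
    cast_if_type(rtype (*)(void *), (fn), (fn)(arg), rtype)

static int (*saved_cb)(void *);
static void *saved_priv;

int _register_callback(int (*callback)(void *priv), void *priv)
{
    saved_cb = callback;
    saved_priv = priv;
    return 0;
}

#define register_callback(fn, arg) \
    _register_callback(typesafe_cb(int, (fn), (arg)), (arg))

struct conn;                        /* deliberately incomplete here */
int on_event(struct conn *c);

/* Registration never dereferences c, so the incomplete type is fine. */
int demo_register(struct conn *c)
{
    return register_callback(on_event, c);
}

struct conn { int events; };        /* completed only afterwards */

int on_event(struct conn *c)
{
    return ++c->events;
}

int fire_callback(void)
{
    return saved_cb(saved_priv);
}
```

Because the check goes through an ordinary call expression, a callback taking const struct conn * would also be accepted here, without ever needing typeof(*arg).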

Unfortunately, the more general typesafe_cb_preargs() and typesafe_cb_postargs() macros lose out. These are like typesafe_cb but for callbacks which take extra arguments (the more common case).

    /* This doesn't work: arg might be ptr to undefined struct. */
    #define typesafe_cb_preargs(rtype, fn, arg, ...) \
        cast_if_any(rtype (*)(__VA_ARGS__, void *), (fn), \
                    rtype (*)(__VA_ARGS__, typeof(arg)), \
                    rtype (*)(__VA_ARGS__, const typeof(*arg) *), \
                    rtype (*)(__VA_ARGS__, volatile typeof(*arg) *))

We can't rely on testing an indirect call: we'd need example parameters to pass, and because they'd be promoted, the direct call might work fine while an indirect call via a different function signature fails spectacularly. We're supposed to increase type safety, not reduce it!

We could force the caller to specify the type of the priv arg, eg. "register_callback(func, struct foo *, priv)". But this strikes me as placing the burden in the wrong place, for an issue I hope will be resolved soonish. So for the moment, you can't use const or volatile on callback functions:

    /* Only typeof(arg) is used, so arg may be a ptr to an undefined struct,
     * but const and volatile variants of the callback are not accepted. */
    #define typesafe_cb_preargs(rtype, fn, arg, ...) \
        cast_if_type(rtype (*)(__VA_ARGS__, void *), (fn), (fn), \
                     rtype (*)(__VA_ARGS__, typeof(arg)))
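
For a feel of how this restricted version behaves, here is a sketch with one extra int argument ahead of the private pointer. The cast_if_type definition is an assumption built on gcc builtins, and struct timer, on_readable(), add_handler(), demo_add() and fire() are hypothetical names.

```c
#include <assert.h>

/* "1 ? (test) : 0" turns a function name into a function pointer so that
 * it can be compared against the pointer type given as oktype. */
#define cast_if_type(desttype, expr, test, oktype)                          \
    __builtin_choose_expr(                                                  \
        __builtin_types_compatible_p(__typeof__(1 ? (test) : 0), oktype),   \
        (desttype)(expr), (expr))

/* Only typeof(arg) is needed, so arg may point to an incomplete type. */
#define typesafe_cb_preargs(rtype, fn, arg, ...)                    \
    cast_if_type(rtype (*)(__VA_ARGS__, void *), (fn), (fn),        \
                 rtype (*)(__VA_ARGS__, __typeof__(arg)))

static int (*saved_cb)(int, void *);
static void *saved_priv;

int _add_handler(int (*cb)(int fd, void *priv), void *priv)
{
    saved_cb = cb;
    saved_priv = priv;
    return 0;
}

#define add_handler(fn, arg) \
    _add_handler(typesafe_cb_preargs(int, (fn), (arg), int), (arg))

struct timer;                       /* incomplete at registration time */
int on_readable(int fd, struct timer *t);

int demo_add(struct timer *t)
{
    return add_handler(on_readable, t);
}

struct timer { int last_fd; };

int on_readable(int fd, struct timer *t)
{
    t->last_fd = fd;
    return fd + 1;
}

int fire(int fd)
{
    return saved_cb(fd, saved_priv);
}
```

A callback declared as int on_readable(int, const struct timer *) would not match the single accepted signature, so it would be passed through uncast and draw a diagnostic; that is the const/volatile limitation described above.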