It's useful to arm ourselves with a pithy phrase should we ever have to face an "it'll be easier to use!" argument. But once we've pointed to it, it's still not clear how to improve the difficulty of interface misuse.
So I've created a "best" to "worst" list: my hope is that by putting "hard to misuse" on one axis in our mental graphs, we can at least make informed decisions about tradeoffs like "hard to misuse" vs "optimal".
The Hard To Misuse Positive Score List
This ideal is represented by the dwim() (Do What I Mean) function, where misuse means the implementation has a bug.
In real life this goal is only achievable by greatly restricting your definition of misuse. Even the dwim() function can be abused by not calling it at all.
As a C person, I like that the compiler reads all my code before it even gives me a chance to run any of it. We're
so used to this we don't give it a second thought when the compiler barfs because we use the wrong type or don't provide
enough arguments to a function. But we can go out of our way to use this: various projects such as gcc and the Linux
kernel have macros like BUILD_BUG_ON(cond) which can be implanted strategically to evoke compile errors (it evaluates sizeof(char[1-2*!!(cond)]), which won't compile if cond is true).
I use this in the kernel's module_param(name, type, perm) macro to check that the read/write permissions for the module parameter are sane (a common mistake was to specify 644 instead of 0644).
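A minimal sketch of the trick, assuming a simplified BUILD_BUG_ON() built only from the sizeof expression quoted above. CHECKED_PERM() and default_perm() are hypothetical names invented for illustration; this is not the kernel's actual module_param() implementation.

```c
/* Simplified sketch: the array size is 1 when cond is false and -1
 * (which cannot compile) when cond is true. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical helper: yields perm unchanged, but refuses to compile
 * if perm is not a plausible octal file mode. */
#define CHECKED_PERM(perm) \
    (BUILD_BUG_ON((perm) < 0 || (perm) > 0777), (perm))

static int default_perm(void)
{
    /* 0644 is octal (420 decimal), so the check passes. */
    return CHECKED_PERM(0644);
}

/* Decimal 644 is larger than 0777 (511 decimal), so uncommenting the
 * function below breaks the build instead of silently setting odd bits:
 *
 * static int broken_perm(void) { return CHECKED_PERM(644); }
 */
```

The check costs nothing at runtime because sizeof never evaluates its operand; the only effect is on whether the file compiles.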
This is weaker than breaking the compile, but in many cases easier to achieve. The classic of this school is the Linux kernel min() and max() macros, which use two GCC extensions: a statement expression which allows the whole statement to be treated by the caller as a single expression, and typeof which lets us declare a temporary variable of the same type as another:
    /*
     * min()/max() macros that also do
     * strict type-checking.. See the
     * "unnecessary" pointer comparison.
     */
    #define min(x,y) ({ \
        typeof(x) _x = (x); \
        typeof(y) _y = (y); \
        (void) (&_x == &_y); \
        _x < _y ? _x : _y; })
Since a common error in C is to compare signed vs unsigned types and expect a signed result, this macro insists that both types be identical.
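To make the failure mode concrete, here is a small sketch that assumes GCC or Clang (it relies on the same two extensions). The names strict_min(), smaller_int() and less_after_conversion() are hypothetical, and __typeof__ is used instead of typeof only so the sketch also builds in strict ISO modes.

```c
/* Same shape as the macro above, renamed to avoid clashing with any
 * other min() in scope. */
#define strict_min(x, y) ({          \
    __typeof__(x) _x = (x);          \
    __typeof__(y) _y = (y);          \
    (void) (&_x == &_y);             \
    _x < _y ? _x : _y; })

/* Both arguments are int, so the pointer comparison is between two
 * int pointers and the compiler stays quiet. */
static int smaller_int(int a, int b)
{
    return strict_min(a, b);
}

/* What a plain "a < b" means when a is int and b is unsigned int: a is
 * converted to unsigned first. The cast is written out so the
 * conversion is visible. */
static int less_after_conversion(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* strict_min(-1, 1u) would compare an int pointer with an unsigned int
 * pointer, and the compiler complains about the distinct pointer
 * types, which is exactly the hint the macro is designed to give. */
```

With these definitions, less_after_conversion(-1, 1u) is false because -1 becomes a very large unsigned value, which is the surprise the type check guards against.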
Always make it easier to do the Right Thing than the Wrong Thing. So if you can't make the right thing easy, make the wrong thing hard! This is the "explicit args required for kmalloc" example again, but it usually means choosing defaults carefully and knowing the normal use for the function.
My example here is the standard Unix exit() and _exit(): the latter does not call any atexit() handlers and is usually not the right choice, so it's harder to find.
Everyone knows a good name is invaluable. In the _exit() example, the underscore punches far above its one-character weight as a warning sign.
My example here is the strange reference counting mechanism used by the Linux Kernel module code: getting a reference count can fail, unlike almost all the rest of the kernel reference counts. Hence, the "get a reference count" function is called try_module_get(): those first four characters reflect the importance of the return code. Note that these days, the GCC __attribute__((warn_unused_result)) can be used to promote this usage to a warning. I still like the name, though, because overuse of such things has led to some warning fatigue...
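A toy sketch of the same naming idea combined with the attribute, assuming GCC or Clang. struct toy_module and try_toy_get() are hypothetical stand-ins and do not reflect how the kernel's module reference counting is implemented.

```c
/* Hypothetical refcounted object: a reference can no longer be taken
 * once it has started unloading. */
struct toy_module {
    int refcount;
    int unloading;
};

/* The "try_" prefix tells the reader the call can fail; the attribute
 * makes the compiler warn if the return value is ignored. */
__attribute__((warn_unused_result))
static int try_toy_get(struct toy_module *m)
{
    if (m->unloading)
        return 0;       /* failed: caller must not use the module */
    m->refcount++;
    return 1;           /* success: caller now holds a reference */
}
```

A caller that writes `try_toy_get(&m);` as a bare statement gets a warning, while `if (!try_toy_get(&m)) return;` reads naturally and compiles cleanly.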
As soon as the misusing code is executed, it'll die horribly. Not all code paths are tested, but this will often
catch cases where someone is writing new code using your interface. It's hard for the compiler to ensure that the user
calls your "open" routine before your other routines, but an assert() can at least get you to this level.
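As a rough sketch of that "open before use" check, with a hypothetical struct counter interface invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical handle: counter_open() must be called before
 * counter_next(). */
struct counter {
    bool opened;
    int value;
};

static void counter_open(struct counter *c)
{
    c->opened = true;
    c->value = 0;
}

static int counter_next(struct counter *c)
{
    /* Misuse dies here the first time the code path runs, rather than
     * quietly returning garbage. */
    assert(c->opened && "counter_open() must be called first");
    return ++c->value;
}
```

The check only fires on paths that are actually executed, and it disappears when NDEBUG is defined, so it sits below the compile-time levels on this scale.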
This is a corollary of "the simplest use is the correct one", and a very useful handhold on the way up this scale. In particular, C convention for argument order seems to have evolved down to three ordered rules:
- Context argument(s) go first. A context is something the user will do a series of different things to; a handle.
- Associated arguments are adjacent. An array and its length go together, as does a timestamp and its granularity. If you could see yourself making a structure out of some of the args, they should go together.
- Details go as late as possible. Flags for the function go at the end. Pointer and length pairs are passed in that order.
I've never gotten the argument order of the standard write() wrong, even though the fd and count could be interchanged:

    ssize_t write(int fd, const void *buf, size_t count);
There are also minor (but important!) conventions, such as memcpy's "destination before source", which you should use for any memcpy-like routines.
Like all rules, this one exists to be violated; but know you're doing so.
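The three ordering rules can be seen together in one signature. This is only a sketch: struct ring_log, LOG_TRUNCATE and log_append() are hypothetical names, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size log used as the context (handle). */
struct ring_log {
    char data[64];
    size_t used;
};

#define LOG_TRUNCATE 0x1u   /* hypothetical flag: store what fits */

/* Context first, then the pointer and length pair side by side in
 * that order, then the flags last. Returns the bytes stored. */
static size_t log_append(struct ring_log *log,
                         const char *buf, size_t len,
                         unsigned int flags)
{
    size_t room = sizeof(log->data) - log->used;

    if (len > room) {
        if (!(flags & LOG_TRUNCATE))
            return 0;       /* does not fit and truncation not allowed */
        len = room;
    }
    /* memcpy itself follows "destination before source". */
    memcpy(log->data + log->used, buf, len);
    log->used += len;
    return len;
}
```

A caller who already knows write() can guess this order without looking it up, which is the point of following the convention.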
People only read instructions after they've already tied themselves into a knot. Then they skim them for keywords and don't read your warnings. I don't give an example of this; if this is the best an interface can do, it's in trouble.
We've all done this. Reading the implementation can work for the simple questions (what unit is this argument in?), but leads to trouble for the subtler issues. The concept of "the" implementation is always problematic, and when the implementation is tightened or fixed we discover we didn't actually get it right, we just got it working.
In some cases, the implementation is a noop, which doesn't help.
The reason some strange interface quirk exists might be for compatibility with some strange OS or compiler, a weird corner case or even older versions of this codebase. In other words, historical reasons ("see, on the VAX we only had 6 characters for..."). You sometimes only find this when you send a patch to fix it and the original author yells at you.
Sometimes they add it to the FAQ. That does not increase the interface's score very much: please try harder.
The misuse scale was written by Rusty Russell and is licensed under the Creative Commons Attribution 2.1 Australia License. The original can be accessed here. I have just reproduced the document as a Markdown document for archiving purposes.