Designing a good API is an art, but there are reasons behind the feelings and enumerable rules of thumb that can save costly missteps. Come on an adventure through compactness, orthogonality, consistency, safety, coupling, state handling, layering, and more, illustrated with shining examples (and gruesome mistakes) from popular Python libraries—and walk away with an actionable design plan.
Here's a rough outline that will certainly grow and develop before the actual talk.
- What is an API? Everything is.
- Even if you're not going to expose it publicly, every routine, private or public, affects the code of its callers.
- What are the actual steps I go through when I sit down to design an
API?
- 1: Do I consider it from the caller's point of view? From a close match with the machinery I'm building an interface to (when that's the case)? What?
- Well, for one thing, I often moosh the thing I'm giving access to
together with all the features of the Python language and find
the ones that line up best with it:
- informal protocols
- iterators (including consumers)
- mappings
- sequences
- context managers
- simple methods
- decorators
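As one illustration of lining a thing up with a language feature, here is a minimal sketch that exposes a small store through the mapping protocol. The `Settings` class and its keys are hypothetical, invented for this example.

```python
from collections.abc import Mapping


class Settings(Mapping):
    """Hypothetical read-only settings store exposed as a mapping."""

    def __init__(self, **values):
        self._values = dict(values)

    # Implementing these three methods is enough: the Mapping mixin
    # supplies `in`, .get(), .keys(), .items(), .values(), and ==.
    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


settings = Settings(color="blue", width=80)
fallback_height = settings.get("height", 24)  # .get() comes from the mixin
```

Callers who already know how dicts behave have nothing new to memorize here.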
- 2: Consider error handling. Should we return something or throw
an exception? The decision is based on…
- whether you want to not interrupt the flow of an internal iterator
- whether it's less harmful to ignore the error or to start peeling off stack frames
- Actually, I wish for resumable exceptions, like returning a continuation or just not peeling the frames off but rather invoking the exception handling in a new frame.
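Python has no resumable exceptions, but a rough approximation is to call an error handler in a new frame while the iterating frame is still alive. The sketch below assumes hypothetical names (`iter_ints`, `raise_error`, `skip_error`) and is only one way to let the caller choose between interrupting the flow and ignoring the error.

```python
def raise_error(raw, exc):
    """Default policy: propagate the error and unwind the stack."""
    raise exc


def skip_error(raw, exc):
    """Alternative policy: ignore the bad value and keep iterating."""
    return None


def iter_ints(raw_values, on_error=raise_error):
    """Yield ints parsed from raw_values.

    on_error runs in its own frame while this generator's frame is intact,
    so the handler decides whether iteration continues. If it returns a
    value other than None, that value is yielded in place of the bad input.
    """
    for raw in raw_values:
        try:
            yield int(raw)
        except ValueError as exc:
            replacement = on_error(raw, exc)
            if replacement is not None:
                yield replacement
```

With `skip_error` the loop is never interrupted; with the default, the first bad value raises `ValueError` to the caller.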
- More
- Don't be an architecture astronaut:
“It's hard to make predictions, especially about the future.”
- The best frameworks and libs are extracted, not invented.
- Looking ahead is fine, but the whole scientific method is based on the recognition that we need to accept constant correction from reality. “Reality-guided imagination” is key, both for science and design. (Otherwise, for example, how will you know what's frequently needed and what isn't?)
- Every lib should have a program grow alongside it—ideally, 2, as different as possible. (nose-progressive and conway, in the case of blessings)
- We can borrow a lot of principles from UI design and physical product design.
- Principle of least astonishment.
- Minimize “Oh, that's weird” unless it results in a major payoff.
- Make common stuff…
- Easy to remember
- Simple conceptually
- Short (on tokens)
- Make uncommon stuff…
- Possible
- First-class (supported, not painful)
- Resist expensive accidents
- Save and Delete buttons close together. "Reset" on web forms close to "Submit".
- Complicated wording in dialogs with "Yes" and "No" buttons
- Similarly, there are analogues in API design:
- UPDATE table SET thing='thong' --oops, forgot WHERE
- pyelastic.delete('index', 'doctype') # oops, deleted all documents
- rm
- Ask yourself, "How can I use this to hurt myself and others?"
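One way to resist this kind of accident is to give the destructive, wide-reaching operation its own name instead of making it the result of an omitted argument. The `DocumentStore` class below is a hypothetical in-memory stand-in, not pyelasticsearch's actual API.

```python
class DocumentStore:
    """Hypothetical in-memory store used only to illustrate the idea."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, doc):
        self._docs[doc_id] = doc

    def delete(self, doc_id):
        """Delete exactly one document; doc_id is required, so forgetting
        it is a TypeError rather than a mass deletion."""
        del self._docs[doc_id]

    def delete_all(self):
        """Deleting everything must be asked for by name."""
        self._docs.clear()

    def __len__(self):
        return len(self._docs)
```

Forgetting the ID now fails loudly, and a reader scanning the call site can see `delete_all` for what it is.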
- Design for forgetfulness
- Context managers are great for enforcing this.
- with closing() in stdlib
- with location() in blessings
- with self._locking() in pyelasticsearch
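A minimal sketch of a context manager that does the remembering for the caller. It is loosely inspired by the idea behind `location()` but is not blessings' implementation; the `Cursor` class and its `position` attribute are assumptions made for the example.

```python
from contextlib import contextmanager


class Cursor:
    """Hypothetical cursor that tracks an (x, y) position."""

    def __init__(self):
        self.position = (0, 0)


@contextmanager
def location(cursor, x, y):
    """Temporarily move the cursor; restore it on exit, even on error."""
    saved = cursor.position
    cursor.position = (x, y)
    try:
        yield cursor
    finally:
        # Runs whether the body finished normally or raised.
        cursor.position = saved
```

The caller cannot forget to restore the position, because there is no separate restore call to forget.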
- A good API is usable at multiple levels. Burrow down (only) as deeply as you need to to change the default behaviors. A good API has layers, multiple (but not too many) public ones.
- Don't let people represent what doesn't make sense:
“A problem well-represented is three quarters solved.”
- Keep state to a minimum. For example, having lots of instance vars but then methods that make sense only when they have certain values is a recipe for mental exhaustion on the part of your callers.
- Don't have big piles of kwargs that make sense only when used in
certain combinations. Break them up into multiple functions
instead. e.g., elasticutils' faceting stuff
- Example: the changes in pyelasticsearch between 0d93fc0cd325056a44e73bd054624c65f5277c86 and the following commit. search() used to take either a q kwarg or a body kwarg, and it wouldn't make sense to use both. What should it do in that case? Error? Have some precedence? Either way, it's unclear and requires either docs or code to clarify, either of which have to be maintained, comprehended, etc. The new design does a little introspection on the "query" arg (which there can be only one of) and does the right thing, dispatching on data type.
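The dispatch-on-type idea can be sketched roughly as follows. This is not the code from that commit; `build_search_request` and its return shape are hypothetical, chosen only to show that a single argument leaves no conflicting combination to document.

```python
from collections.abc import Mapping


def build_search_request(query):
    """Return (params, body) for a hypothetical search call.

    A string is treated as a query-string search and a mapping as a full
    query body. Because there is only one argument, "both at once" cannot
    be expressed.
    """
    if isinstance(query, str):
        return {"q": query}, None
    if isinstance(query, Mapping):
        return {}, dict(query)
    raise TypeError(
        f"query must be a str or a mapping, not {type(query).__name__}"
    )
```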
- If you have a bunch of bools, make sure they make sense in any combination. Otherwise, consider replacing some sets of them with single enums. Btw, this also goes for DB design.
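As a sketch of the bools-to-enum swap, suppose a hypothetical `Article` had `is_draft`, `is_published`, and `is_archived` flags, most combinations of which mean nothing. One enum makes the nonsensical combinations unrepresentable:

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    ARCHIVED = "archived"


class Article:
    """Hypothetical model with one status instead of three bools."""

    def __init__(self, title, status=Status.DRAFT):
        self.title = title
        # Accepts a Status member or its string value; anything else
        # raises ValueError.
        self.status = Status(status)

    @property
    def is_visible(self):
        return self.status is Status.PUBLISHED
```

The same move applies to a database schema: one status column in place of several boolean columns.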
- Acquisition is initialization. Don't let people get ahold of class instances that are in don't-make-sense states. See Resource Acquisition Is Initialization <http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization>
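In Python terms, this can be as simple as doing all validation in `__init__`, so that no half-configured instance ever escapes. The `Connection` class below is a hypothetical sketch that opens no real socket.

```python
class Connection:
    """Hypothetical connection that is valid from the moment it exists."""

    def __init__(self, host, port):
        if not host:
            raise ValueError("host must be a non-empty string")
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        self.host = host
        self.port = port

    @property
    def address(self):
        # No "is this configured yet?" check is needed here, because an
        # instance cannot exist without a valid host and port.
        return f"{self.host}:{self.port}"
```

Contrast this with a no-argument constructor followed by a separate `configure()` call, where every method has to cope with the unconfigured state.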
- Composition is better than inheritance. People always say this, but why? Because it decreases coupling.
- Memorability vs. accuracy (worth talking about?)
- Avoid doc lookups. If the user has to keep looking at the docs, you have failed compactness.
- Example: calling kwargs "indexes" when they can support multiple indexes and "index" when they can't. That makes it easier to glean that info from the docs, but it's not very useful info, since the overwhelmingly common case is to use a single index.
- Low impedance mismatch with the language.
- Every time you declare a function, name a var, or make a data structure, you're extending the language. You might be doing it within the bounds of a prescribed syntax, but you're coining terminology and introducing behavior. If you take off in a different direction from the language, be prepared to rewrite all support code (libs, glue) yourself, just as if you had written a new from-scratch language.
- Not only that, but consider the existing body of libraries, standard and not. If you run off and make your own incompatible mapping or list API, you don't get to use any of that stuff.
- Zen of Python: the reasons behind the proverbs
- Backward-compatibility
- How much to do? Well, it goes back to not being an astronaut: would you be glad to rework the 2 programs that are growing up alongside your API? If so, go for it. If not, find a compatible way.
- Along these lines, be careful about *args. It means you can never add any more required args ever again, unless you want to raise a bunch of ValueErrors and TypeErrors manually, which makes your function hard to reason about from just looking at its signature.
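A small sketch of the trade-off, using two hypothetical functions. Once `*parts` swallows every positional argument, there is no positional slot left to grow into; taking a single iterable keeps the signature open. (In Python 3, keyword-only parameters after `*args` soften this somewhat, though new positional parameters are still out.)

```python
def join_paths_varargs(*parts):
    """Every positional argument is already spoken for by *parts."""
    return "/".join(parts)


def join_paths(parts, separator="/"):
    """One iterable argument leaves room for parameters added later,
    such as `separator` here, and the signature tells the whole story."""
    return separator.join(parts)
```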
- Semantic versioning
- I typically make no guarantees from [0.1 to 1.0). Then I break stuff only in major releases.
- PyPI and its tools make it hard to release betas. Don't put betas on PyPI.
- Documentation
- Preflight checklist
I've long been fascinated with API design, and I've been doing a lot of it lately. For example...
- My blessings terminal library, which serves a pretty niche audience, has accumulated 10,000 downloads and been featured on Hacker News and Reddit. People said things like "This is amazing. Lovely API." and "I doubt @kennethreitz often writes console apps, but when he does, I bet he prefers blessings." (http://www.reddit.com/r/programming/comments/yeudv/blessings_15_a_python_replacement_for_curses/ and http://news.ycombinator.com/item?id=4399060, respectively)
- I just rewrote pyelasticsearch, partly to fix glaring API problems.
- I'm working on a new parsing library, parsimonious, which has as its twin goals speed and a beautiful API.
Some of my past talks include...
- Django's Nasal Passage at DjangoCon US 2012 (no video yet)
- Parsing Horrible Things at PyCon 2012
- Speedily Practical Large-Scale Tests at PyCon 2012
- A fun little lightning talk at PyCodeConf 2011 about stackful, my hack for doing dynamically scoped variables in Python