No two engineers can agree on what is or is not a unit test, an integration test, or an acceptance test, or on the differences between them. This is one of the immutable laws of the universe. So, in the grand tradition of my craft, I'm going to stir the pot further by throwing my own take into the ring.
The key to grokking unit and integration tests, in my opinion, is not to think of them individually, as distinct things with their own definitions that can be known on their own terms. Instead, I think of them as being related by a recursive definition, holistically. On their face, integration tests are, intrinsically, more complex and fragile than unit tests. They combine multiple moving pieces into one. Oftentimes, they need fixtures or some encompassing test harness, or are otherwise awkwardly contorted to fit environmental constraints. It stands to reason that anything that lets us reduce the need for integration tests without sacrificing coverage is a good thing.
The simplest possible explanation of a unit test is actually just that: it's a test that lets us write fewer integration tests. By striving to write tests that let us write fewer integration tests, we're driven to improve our code to make it easier to write those tests that let us write fewer integration tests. That's the essence of TDD.
Here's a hypothetical, possibly worst-case illustrative scenario:
You have two classes: `Foo` and `Bar`, handling two orthogonal responsibilities, each with multiple internal code paths accessed via a single public method. If your team lead thinks that comprehensive coverage means exhaustively testing all code paths at once, and each class has two code paths, you now have four combinations to test: two paths in `Foo` multiplied by two in `Bar`. If, instead, they each have four code paths, then you're up to 16, in total, for two methods. Add in a `Baz` class and now you're up to 64, for just three different methods you actually call in the tests. Testing all those paths quickly grows impossible, and, if you're dogged enough to write them all out, keeping track of what is going on, why different branches are different, and actually using those tests to find and fix bugs (as opposed to chasing the brittle high of confidence that you get from a passing suite) grows more and more difficult.
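To make that arithmetic concrete, here's a hypothetical sketch; the classes, methods, and inputs below are invented purely for illustration, not taken from any real codebase.

```typescript
// Hypothetical classes, invented purely to make the arithmetic visible.
class Foo {
  label(n: number): string {
    return n >= 0 ? "ok" : "error"; // two internal code paths
  }
}

class Bar {
  deliver(msg: string, urgent: boolean): string {
    return urgent ? `!! ${msg} !!` : msg; // two internal code paths
  }
}

// The behavior the team cares about uses both classes together.
function report(n: number, urgent: boolean): string {
  return new Bar().deliver(new Foo().label(n), urgent);
}

// "Exhaustively testing all code paths at once" means one case per pairing:
// 2 x 2 = 4. Four paths each would be 16; add a four-path Baz and it's 64.
const cases: Array<[number, boolean, string]> = [
  [1, false, "ok"],          // Foo path 1 x Bar path 1
  [1, true, "!! ok !!"],     // Foo path 1 x Bar path 2
  [-1, false, "error"],      // Foo path 2 x Bar path 1
  [-1, true, "!! error !!"], // Foo path 2 x Bar path 2
];

for (const [n, urgent, expected] of cases) {
  // Stand-in for whatever test framework you actually use.
  console.assert(report(n, urgent) === expected, `report(${n}, ${urgent})`);
}
```

Every new path in either class, and every new class, multiplies the size of that table rather than adding to it.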
Clearly what we would have are 64 integration tests, since they'd be the direct result of testing `Foo`, `Bar`, and `Baz` together and combining their code paths to test the total system. What if, however, we could just take as a given that `Foo`, `Bar`, and `Baz` faithfully implement their public methods, and not concern ourselves with the particulars of their implementations? That would boil away almost all of our tests, would it not? Testing `Foo` would not mean also testing `Bar`'s four code paths, and then doing it all over again for every one of `Foo`'s code paths, and then multiplying all of those tests by `Baz`'s code paths. Indeed, ideally, when using `Foo` we should not care about what is going on with `Bar` at all, and vice versa, so long as it is there when we need to use it.
This is the essence of encapsulation, or contracts, or interfaces, or any number of other ways to put it: one object can be used alongside another without knowing or caring about what is going on inside that other object, as long as it faithfully implements its interface.
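As a minimal sketch of what that boundary could look like (again with invented names), `Foo` can depend on `Bar`'s interface rather than its internals, so `Foo`'s tests only need something that honors the contract:

```typescript
// Bar's boundary: the only thing Foo is allowed to know about it.
interface Bar {
  deliver(msg: string, urgent: boolean): string;
}

class Foo {
  constructor(private readonly bar: Bar) {}

  report(n: number, urgent: boolean): string {
    const label = n >= 0 ? "ok" : "error";  // Foo's own two paths
    return this.bar.deliver(label, urgent); // Bar's paths are Bar's problem
  }
}

// In Foo's tests, any stand-in that honors the contract will do.
const echoBar: Bar = {
  deliver: (msg) => msg, // echoes the label so the test can inspect it
};

console.assert(new Foo(echoBar).report(1, false) === "ok");
console.assert(new Foo(echoBar).report(-1, true) === "error");
```

Nothing about `Foo`'s tests changes if `Bar`'s implementation grows a fifth or fiftieth internal code path, so long as the contract still holds.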
I can feel the anxiety ramping up, and I don't just mean sympathetically. The fear of hidden holes in test coverage is ever-present. "But what if `Bar` doesn't implement its method correctly? The whole point of the test is ensuring that it does. If `Foo` works and `Bar` doesn't, but appears to work in one very specific way, then my passing test combining `Foo` and `Bar` is meaningless!"
Here's the trick: what lets us boil away 90% (not all of them, please note) of our "integration" tests is something very specific: the requirement that each object's behavior is individually encapsulated behind a boundary, and thus that each can be treated as a black box for the purposes of higher-level tests, just like any other client should be able to treat it as a black box without caring about its implementation.
Think of it as a logical proposition: if `a` is `true` and `b` is `true`, by the rules of logic `a and b` must be `true`. Yes, `Foo` or `Bar` might still have undiscovered bugs, no matter what we do or how careful we are. They probably do. Most code does. No system is perfect, but we build rules on top of rules on the assumption (used here not as a loaded term but as a synonym for requirement, antecedent, or prerequisite) that the underlying rules hold, because that's the only way to have a functioning system of any kind. Just like division of labor means you don't worry about whether or not a specific baker was able to get into the bakery that morning, because there's going to be bread on that grocery shelf either way, and your job is to program, not worry about how the bread gets baked.
So, let's consider this proposition: if our requirement is met, then we could write far fewer integration tests. It's a big if, of course, but the promised boon is tantalizing. If you assume `Foo` and `Bar` are solid rubber balls, then you only have to bounce them together once to know that they bounce together, whether they're red or green or blue. The question now becomes: since we obviously can't just assume that `Foo` and `Bar` individually work as they should, how can we come to know, or at least reasonably believe, that they do?
Well, by writing tests, of course. Let's remember that a "unit test" is a test that lets us write fewer integration tests. Since this requirement is something that would let us write fewer integration tests, is it possible to write tests that verify that requirement, replacing many tests with far fewer? Yes, naturally enough, and this requirement actually translates pretty well into a useful first-order description of a unit: a unit is something which encapsulates behavior behind a boundary.
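Here's a hedged sketch of what that looks like, reusing the invented `Foo`/`Bar` shapes from the earlier snippets: each unit's paths are verified once behind its own boundary, and a single integration test confirms the wiring. The counts now grow additively rather than multiplicatively, which is a wash at two paths apiece but is the difference between roughly a dozen tests and 64 once `Foo`, `Bar`, and a `Baz` each have four.

```typescript
// Same invented shapes as before, kept self-contained here.
interface Bar {
  deliver(msg: string, urgent: boolean): string;
}

class ConcreteBar implements Bar {
  deliver(msg: string, urgent: boolean): string {
    return urgent ? `!! ${msg} !!` : msg;
  }
}

class Foo {
  constructor(private readonly bar: Bar) {}
  report(n: number, urgent: boolean): string {
    return this.bar.deliver(n >= 0 ? "ok" : "error", urgent);
  }
}

// Bar's unit tests: its two paths, checked against its own contract.
console.assert(new ConcreteBar().deliver("ok", false) === "ok");
console.assert(new ConcreteBar().deliver("ok", true) === "!! ok !!");

// Foo's unit tests: its two paths, with Bar treated as a black box.
const stubBar: Bar = { deliver: (msg) => msg };
console.assert(new Foo(stubBar).report(1, false) === "ok");
console.assert(new Foo(stubBar).report(-1, false) === "error");

// One integration test: the real pieces still bounce off each other.
console.assert(new Foo(new ConcreteBar()).report(-1, true) === "!! error !!");
```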
A unit, by this definition, isn't necessarily a single class, and its boundary can be a single interface or a set of interfaces, so long as they all belong to the unit. Unit doesn't mean intrinsically indivisible; it means you can pick it up and move it elsewhere (though not necessarily that you want to, or anticipate doing so), and that what constitutes the unit, its purpose and responsibilities, and how it behaves shouldn't change based on the other classes and units that surround it. Most units are, in fact, complex structures in themselves, with multiple collaborators.
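As an illustration (names invented, as before), a unit might be several collaborating classes hiding behind one exported boundary, with everything outside the unit testing and using only that boundary:

```typescript
// The unit's boundary: the only thing the rest of the codebase sees.
interface Notifier {
  notify(userId: string, message: string): string;
}

// Internal collaborators; nothing outside the unit depends on them.
class TemplateRenderer {
  render(message: string): string {
    return `[notice] ${message}`;
  }
}

class AddressBook {
  lookup(userId: string): string {
    return `${userId}@example.com`; // hypothetical lookup, hard-coded here
  }
}

class EmailNotifier implements Notifier {
  private renderer = new TemplateRenderer();
  private addresses = new AddressBook();

  notify(userId: string, message: string): string {
    return `to ${this.addresses.lookup(userId)}: ${this.renderer.render(message)}`;
  }
}

// The unit's tests exercise the boundary, not the collaborators one by one.
const notifier: Notifier = new EmailNotifier();
console.assert(
  notifier.notify("ada", "build failed") ===
    "to ada@example.com: [notice] build failed"
);
```

Moving `TemplateRenderer` or `AddressBook` around, or splitting them further, changes nothing for anyone outside the `Notifier` boundary.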
You ask: if they're complex structures, possibly with multiple interfaces, composed of multiple collaborators... doesn't that make anything testing them an integration test? Take a deep breath. Here's the big secret about unit testing and integration testing:
It absolutely does.
Integration and unit tests are, when you smash them into each other in the Large Hadron Collider to see what they're made of, ultimately the same thing. Mic drop.
That's the beauty of encapsulation, irrespective of whether you're writing tests or code: on the inside it's a complex system teeming with different collaborator objects that have to work together. From the outside, we get to treat it as a black box that just does what it says it does. What makes a test of that system a unit test is the same definition we started with: it's a test that lets us write fewer integration tests. Keep applying that definition recursively and you will eventually, inexorably, end up with many unit tests and few integration tests, "nested" (not actually nested) at the unit boundaries.
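Following the recursion one level up, still with the invented `Notifier` boundary from the previous sketch: from the inside, `EmailNotifier`'s test integrates its collaborators; from the outside, the next unit up treats the whole thing as one black box.

```typescript
// Same invented Notifier boundary as in the previous sketch.
interface Notifier {
  notify(userId: string, message: string): string;
}

// One level up, another unit depends on that boundary alone.
class SignupFlow {
  constructor(private readonly notifier: Notifier) {}

  signUp(userId: string): string {
    // ...persist the user, etc., then:
    return this.notifier.notify(userId, "welcome aboard");
  }
}

// SignupFlow's unit test stubs the whole Notifier unit, however many
// collaborators happen to live behind that boundary.
const fakeNotifier: Notifier = {
  notify: (id, msg) => `${id}:${msg}`,
};

console.assert(new SignupFlow(fakeNotifier).signUp("ada") === "ada:welcome aboard");
```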
This might not jibe with our intuitive sense of the different purposes to which unit and integration tests are applied, or maybe just our app's directory structure, for that matter. There aren't just two levels to software, however. `integration` and `unit` divisions are a useful directory structure, but they don't map to the architecture of your code. Software is fractal, and the only hard part of software engineering, other than deciding on names and job titles, is deciding where these boundaries belong, and in what direction they should point.