We began with a single question: "what next?" The one-word answer was obvious: "testing." From there, we discussed three topics: what kinds of tests to write; how to write and run them; and what refactorings might best support the tests.
Each topic led to less consensus than the preceding one, but I will attempt to summarize the competing views.
We all agreed: we need smoke tests. At the very least, we need to know if a given change breaks any of the applications -- design tool, iOS, reply, renderer -- in any of the target browsers. These do not need to be end-to-end acceptance tests. We're checking for gross breakage: blank screens, uncaught exceptions, total animation freakouts.
Since we need to run several applications in several browsers, and since we want to catch errors as quickly as possible, smoke tests must be automated. Fortunately, this problem class is widespread and familiar: there are third-party products that specialize in it. I think smoke tests will require effort to set up, but not some stroke of genius.
At the mention of unit tests, we immediately ran aground on our first question: what is a unit test? Common explanations fall along a spectrum, from testing a single behavior of an application as a whole, to testing a single function within a codebase. Server-side software tends to use the latter definition: each test covers a single function or behavior; each test suite covers a class, or a category of behavior.
I would prefer to use the same standard. Testing individual functions, with minimal setup requirements, is superior to testing behaviors which require the entire app to start up and reach a certain state. The approach is less fragile, and much more amenable to automation.
Testing the application as a whole is valuable, but I would prefer that this be handled by acceptance tests, which can treat the design tool largely as a black box. Or at least a dark gray box.
Our app, especially the more interactive or asynchronous portions, is difficult to unit test, for many reasons:
- Getting into a known correct state often involves a lot of setup.
- Individual functions might have many inputs, by which I mean the arguments and the other properties the function consults.
- Behaviors such as dragging might involve many function calls over time.
These problems can be overcome, in part through refactoring. This leads us to our third main topic of discussion.
First, about basic unit testing. The discussion so far implies a chicken-and-egg scenario:
- We cannot safely refactor without tests to warn us when things break.
- We cannot write unit tests without refactoring highly-coupled code.
(Obviously, the answer is "eggs wrapped in chicken")
This is not a deadlock. We can target some deep bedrock functions for unit testing, functions that have minimal dependency on outside state; we set up unit tests for them with the minimum of refactoring; we repeat, one nibble at a time, as long as it takes.
I would maintain that this is not a question of underlying fundamental engineering problems: this is a question of self-confidence and will.
We have several long-standing suggestions on the table.
Data-First Flow: I have always proposed that changes to events should be changes to the event data. The viewport should re-render based on the new state of the data. Behaviors such as text formatting reverse this flow: if you select and reformat text, we apply a TextFormat to the text and then TextLayer.save it to the data model.
We can reverse that without too much effort, but behaviors such as dragging are harder to get into an input -> data -> rendering flow. I don't have an immediate suggestion for that -- but the closer we can get, the easier the application code will be to understand, modify, and test.
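As a rough sketch of the input -> data -> rendering flow for the text formatting case: the model shape and the names `applyFormat` and `render` below are assumptions made up for illustration, and they are not our actual TextFormat or TextLayer APIs.

```javascript
// Hypothetical model: a plain object describing a text layer.
const textModel = { text: "Hello", format: { bold: false, size: 12 } };

// Step 1: the input changes the data, producing a new model object.
function applyFormat(model, changes) {
  return Object.assign({}, model, {
    format: Object.assign({}, model.format, changes),
  });
}

// Step 2: rendering is derived from the data alone.
function render(model) {
  const weight = model.format.bold ? "bold" : "normal";
  const style = `font-weight:${weight};font-size:${model.format.size}px`;
  return `<span style="${style}">${model.text}</span>`;
}

const updated = applyFormat(textModel, { bold: true });
console.log(render(updated));
```

Both steps are plain functions of their arguments, so each can be unit tested without a viewport.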
Fatter Event Data Model: Right now, model objects are JSON-friendly data structures. This means that they can have properties, but not methods. However, we frequently wish they had methods. This wish is embodied in the domain component, which does not need to exist, strictly speaking. That entire component is a holdover from the hybrid, and it was written in a time when there was a much stronger motivation to keep model objects free of functions.
[Image: A literal fat model]
Even then, it was not actually necessary. Today, it is not necessary at all. We should do now what we should have done then: promote model objects to classes/prototypes, give them useful methods, and provide a toJSON function which generates function-free data we can sync back to Rails.
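A minimal sketch of what a fatter model could look like. The class name, fields, and methods are invented for illustration; the one thing it relies on is standard behavior, namely that `JSON.stringify` calls an object's `toJSON` method when one exists.

```javascript
// Hypothetical model class; names and fields are illustrative only.
class EventModel {
  constructor(data) {
    this.title = data.title;
    this.guests = data.guests.slice();
    this.dirty = false; // transient state that should never be synced
  }

  addGuest(email) {
    if (!this.guests.includes(email)) {
      this.guests.push(email);
      this.dirty = true;
    }
  }

  guestCount() {
    return this.guests.length;
  }

  // Returns function-free data; transient fields are left out.
  toJSON() {
    return { title: this.title, guests: this.guests.slice() };
  }
}

const party = new EventModel({ title: "Launch party", guests: ["a@example.com"] });
party.addGuest("b@example.com");

// JSON.stringify picks up toJSON automatically.
const payload = JSON.stringify(party);
console.log(payload);
```

The methods live on the model, and the payload we send back stays plain data.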
Model Objects Manage Resource Load: If model classes get fatter, we can put resource loading behavior in them. This would make display layers even simpler: instead of waiting for their images to load, the display layers can simply wait for a LOADED event from their model object, and render on command.
[Image: Loading some fresh resources]
If we take this approach, we also gain new opportunities for load coordination among model objects, which takes some of this confusing logic out of the display list. The display list is free to act as a dumb renderer, and separation of concerns grows stronger.
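Here is one way the LOADED idea might be sketched, using Node's built-in `EventEmitter` as a stand-in for whatever event mechanism we would actually use. `ImageModel`, `DisplayLayer`, and the injected loader are all hypothetical names; the loader is passed in so that a test can swap in a fake.

```javascript
const { EventEmitter } = require("events");

const LOADED = "LOADED";

// Hypothetical model: owns its image load and announces completion.
class ImageModel extends EventEmitter {
  constructor(url, loadImage) {
    super();
    this.url = url;
    this.loadImage = loadImage; // injected loader: (url, callback) => void
    this.image = null;
  }

  load() {
    this.loadImage(this.url, (image) => {
      this.image = image;
      this.emit(LOADED, this);
    });
  }
}

// Hypothetical display layer: waits for LOADED, then renders on command.
class DisplayLayer {
  constructor(model) {
    this.rendered = false;
    this.lastImage = null;
    model.once(LOADED, (loadedModel) => this.render(loadedModel));
  }

  render(model) {
    this.rendered = true;
    this.lastImage = model.image;
  }
}

// A fake loader that completes immediately, standing in for a real image load.
const fakeLoader = (url, callback) => callback({ src: url });

const cardModel = new ImageModel("card.png", fakeLoader);
const cardLayer = new DisplayLayer(cardModel);
cardModel.load();
console.log(cardLayer.rendered);
```

The display layer never touches the loader, so it stays a dumb renderer, and the model can be tested with no network at all.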
[Image: Welcome aboard. Here is your standard-issue Paperless Post mouth knife.]
Our codebase has also grown large enough that it is difficult for new developers to get a grip on it. This has been true for years, #sincewebeinhonest, but our team was growing slowly. It was painful for each new developer to learn the ropes, but new developers were rare.
This is no longer true. Technical "onboarding" is still hard, and as new people join the team, we need to help them understand the code better than we have been.
Better documentation would help. Our options, in ascending order of manual effort:
- JSDoc: We already have a fair number of doc comments in JSDoc format. Let's make the effort necessary to run JSDoc and generate some actual documentation. If we can get this running on the command line, we can integrate it into the build process.
- Diagrams: Assuming we have valid JSDoc output, we can also try to use tools to automatically generate diagrams of our dependencies. I worry that JavaScript is not perfectly suited for this: as a multi-paradigm language, it provides a lot of flexibility, which means that perfectly decent code might still be able to confuse a diagramming tool. On the other hand, there's no reason we shouldn't at least try it and see what we get.
- Overview Docs: We should also write documentation explaining what our overall approach is, how the application is structured, what it should look like once you discount all the bugs and hacks and TODOs. Since this documentation would not be generated from the code, it has a chance of going stale; but since it is not built into the code, it can stand as a reminder of what our original intentions were.
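For anyone who has not written one, a JSDoc comment of the kind mentioned in the first item looks roughly like this. The function `fitWithin` is hypothetical and exists only to show the comment format; `@param` and `@returns` are standard JSDoc tags.

```javascript
/**
 * Scales a size to fit inside a bounding box, preserving aspect ratio.
 * (Hypothetical function, shown only to illustrate the comment format.)
 *
 * @param {{width: number, height: number}} size - Original dimensions.
 * @param {{width: number, height: number}} bounds - Maximum dimensions.
 * @returns {{width: number, height: number}} The scaled dimensions.
 */
function fitWithin(size, bounds) {
  const scale = Math.min(
    bounds.width / size.width,
    bounds.height / size.height
  );
  return { width: size.width * scale, height: size.height * scale };
}

console.log(fitWithin({ width: 200, height: 100 }, { width: 100, height: 100 }));
```

The comment documents the types and intent in a form a generator can read, which is what makes the first two items above possible.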
We've talked about adding a beautification stage to our Grunt compiles. Nobody disagrees, or at least nobody spoke up. It's time to just do it.
As long as we're refactoring, we need to decide what exactly we're refactoring toward. It is obvious that different parts of the codebase have different coding styles and standards, but it is not necessarily obvious why. I wrote a whole thing about this, and decided to delete it: this message is not for that. But once we have this train moving down the testing/refactoring track, we need to talk about where it's going.