2018-12-27
So I had a frustrating week, last week. I'd been spending much of December working on a long-standing feature for xi, line breaking / word wrap. That is: the ability to wrap lines of text to the width of the window, as needed. This is table stakes for a text editor (or a text view, or a web browser, or anything that displays text in a variable-width window), but it's something that until now we had only half-done.
This problem has some nuance. To back up: we have a string, and we have a width (of a window, say), and we need to determine how to break the string up so that it fits in the given window. To do this, we need to know two things at runtime: where in the string we are allowed to create a line break (the break opportunities), and the width required to draw various 'atoms' of the string, where an 'atom' is a substring that cannot contain a line break.
The basic algorithm, then, is to iterate through the break opportunities, measuring the width of the line so far at each one; if that width exceeds the width of the window, you insert a break and start again.
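To make the loop concrete, here's a minimal sketch in Rust. Everything interesting is stubbed out: break opportunities come from a naive break-after-spaces rule (standing in for a real line breaking algorithm like UAX #14, discussed below), and `measure` just counts characters (standing in for real frontend measurement).

```rust
/// Byte offsets of break opportunities. As a stand-in, we allow a break
/// only after each space, plus one at the end of the string.
fn break_opportunities(s: &str) -> Vec<usize> {
    let mut ops: Vec<usize> = s
        .char_indices()
        .filter(|&(_, c)| c == ' ')
        .map(|(i, _)| i + 1) // we may break *after* the space
        .collect();
    ops.push(s.len());
    ops
}

/// Stand-in for frontend measurement: width of a substring in "units".
fn measure(s: &str) -> usize {
    s.chars().count()
}

/// Greedy wrap: walk the break opportunities, starting a new line at the
/// last opportunity before the one that would overflow `max_width`.
fn wrap(s: &str, max_width: usize) -> Vec<&str> {
    let mut lines = Vec::new();
    let mut line_start = 0;
    let mut last_break = 0;
    for b in break_opportunities(s) {
        if measure(&s[line_start..b]) > max_width && last_break > line_start {
            lines.push(s[line_start..last_break].trim_end());
            line_start = last_break;
        }
        last_break = b;
    }
    lines.push(s[line_start..].trim_end());
    lines
}

fn main() {
    for line in wrap("the quick brown fox jumps over the lazy dog", 12) {
        println!("{}", line);
    }
}
```

This is the whole idea; the rest of the post is about why the two stubs are the hard part.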
The part where you determine where the break opportunities are is complicated, and you can get almost arbitrarily fancy. Fortunately the Unicode standard includes, as supplementary material, UAX #14, the Unicode Line Breaking Algorithm. This is a detailed description of how to determine break opportunities in Unicode text, and it is implemented as part of xi's unicode crate.
So this tells us where we're allowed to break; the next problem is to determine where we actually break. Doing this correctly requires that we be able to measure the actual width of a given substring, for a given font, for a given text stack; that is, the specific set of libraries and utilities that will be responsible for turning font and string into pixels on a display. Importantly, the width of a given string, for a given font, may be different on different platforms (or on the same platform using different font shaping or drawing libraries.)
There are some assumptions and details that are missing here that might be worth being explicit about.
Firstly: the width of a sequence of 'characters' is not the sum of the widths of each character in the sequence. (As an aside: the term 'character' is not really defined in Unicode, but we will use it here to mean what most people would assume when they hear the term; essentially a single drawn glyph, perhaps with accents, such as 'a' or 'é' or 'は'; in unicodese, a 'grapheme cluster'.) While summing individual widths often works for languages that use Latin-derived scripts, particular fonts will often provide ligatures: special ways of drawing certain sequences of glyphs; for instance, in many fonts the letters 'ff' in words like 'off' or 'difference' are drawn with a special glyph.
Secondly, in some scripts a given 'letter' can be drawn dramatically differently depending on where it is used. In Tamil, for instance, when 'ம' is followed by 'இ', the two glyphs are drawn together, as 'மி'.
This means that when we're measuring the width of text, we need to measure not individual characters but runs of characters, where a run is a sequence of characters terminated by some known boundary character.
Anyway: what this means is that we need to defer to a given frontend to do our measuring for us, and we need to measure each unique run of characters, for each font, at least once, assuming we cache everything. (There are optimizations available when the text is entirely 'latin' (i.e. at least up to Latin-1, and potentially further) and the font is 'fixed pitch', i.e. monospace, but that's really a detail; we want to think about the general case.) This is not cheap, and it means that calculating line breaks is going to be slower for us than it would be if core and frontend were completely integrated; in that case (such as in an editor like Sublime Text) the code calculating the line breaks could have perfect knowledge of the code responsible for laying out the text, which would let us make more assumptions and optimize more aggressively, as needed. (Incidentally, the line breaking behaviour of Sublime Text is excellent.)
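The "measure each unique run, for each font, at least once" caching can be sketched as a map from (font, run) to width, with the expensive measurement supplied as a callback. `FontId` and the shape of the callback here are illustrative stand-ins, not xi's actual types; in the real system the measurement would be an RPC to the frontend.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for however fonts are identified.
type FontId = u32;

/// Each unique (font, run) pair is measured at most once; the expensive
/// measurement is deferred to a callback only on a cache miss.
struct WidthCache {
    widths: HashMap<(FontId, String), f64>,
}

impl WidthCache {
    fn new() -> WidthCache {
        WidthCache { widths: HashMap::new() }
    }

    fn width_of<F>(&mut self, font: FontId, run: &str, measure: F) -> f64
    where
        F: FnOnce(&str) -> f64,
    {
        *self
            .widths
            .entry((font, run.to_owned()))
            .or_insert_with(|| measure(run))
    }
}

fn main() {
    let mut cache = WidthCache::new();
    let mut calls = 0;
    // Pretend every char is 8.0 units wide; a real measure fn would shape
    // the whole run, so that e.g. the 'ff' ligature is accounted for.
    let w = cache.width_of(0, "difference", |s| {
        calls += 1;
        s.chars().count() as f64 * 8.0
    });
    let w2 = cache.width_of(0, "difference", |s| {
        calls += 1;
        s.chars().count() as f64 * 8.0
    });
    assert_eq!(w, w2);
    println!("width = {}, measured {} time(s)", w, calls);
}
```

The key property is that the second lookup for the same (font, run) pair never invokes the callback at all.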
One conclusion of this might be that we are doomed to poor line breaking performance. One of the design principles of the xi-editor project, however, is that no single task should be able to slow down the editing experience, and one of the key ways we achieve this is through doing work incrementally, where possible.
The traditional way to do line breaking is all at once. When the width of the window changes, the application reflows the entire document immediately, and then renders the result. This works well for the majority of documents, but when documents become too large, it can cause noticeable lag. If I give Sublime a document in the 100MB range, for instance, and I grab the window frame and start to resize, there is a visible lag as the breaks are updated to fit the new window size. (That this lag is relatively minimal is a testament to the care put into this by the Sublime folks.)
The approach taken by xi-editor is to do all wrapping incrementally. This means that when you resize a window, we rewrap the visible portion immediately; we then update the view, before continuing to update the breaks in the rest of the document asynchronously. The user won't need to know that this is happening, and we can be comparatively slow; it might take us ten times longer to rewrap an entire document than it takes Sublime, but in a large document it will always seem instantaneous.
So that was the idea.
Here's the problem: the frontend communicates positions in a document (say, when the user clicks somewhere) as a line/column pair. The line is the visual line; that is, it includes any 'soft' breaks (breaks that aren't the result of the user actually inserting a newline, e.g. by pressing return; these are the breaks we've been talking about). Herein lies the problem: if we're rewrapping a chunk of lines above the visible region, we're changing the indexes of the lines in the visible region. This means that when the user clicks somewhere on the screen, they may think they're clicking on, say, line 50, but in the meantime core has gone and added or removed visual lines above the visible region, and what core thinks is line 50 is now a different line.
So: a pretty classic synchronization bug. And although this hasn't actually come up before, it isn't limited to rewrapping; it can occur anytime an edit changes the number of visual lines above the visible region. In theory it could even happen as a result of normal user behaviour, although the window during which the client's line cache was out of sync would only be a few ms, so actually scrolling and clicking in that time would be... tricky. Unfortunately an unlikely bug is still a bug. (And much of the time an unlikely bug is worse than a likely one.)
How to approach this problem was more ambiguous when it wasn't clear how fundamental it is. If this were something that could only occur during rewrap, we could get all kinds of fancy: we could just plain refuse to rewrap above the visible region, or we could do it atomically (although how this would work is unclear), or we could just focus on optimizing the wrapping in the traditional ways and still do it synchronously. The fact that this problem can happen in a variety of situations makes the likely solution clear: we need to pass a revision identifier with each update to the client, and the client needs to send us its current revision with each command. In core, we will keep track of past versions of the view's line breaks, and when the client references the view state we will know exactly what point in history the client is at, and we can reference that exact revision when handling the client's command.
This doesn't feel great, although it does feel necessary. The main problem is additional complexity for frontends, although that should actually be minimal; the frontend just needs to stash the revision id on each update, and then include that id in each view-related RPC.
In core, the complexity is greater; we will need to start versioning our breaks data structure (this isn't a huge deal) and we'll also have to add some mechanism for garbage collecting old revisions; basically anytime the frontend references a new revision we can get rid of all revisions older than that. This means that (assuming a well-behaved client) the worst case scenario is that we unnecessarily hold on to a complete duplicate set of breaks.
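One way this could look, as a sketch under assumed names (none of this is xi's actual API, and `Breaks` here is just a vector of offsets standing in for the real data structure): core keeps an ordered map from revision id to breaks, and resolving a client RPC against the revision the client has seen also garbage-collects everything older.

```rust
use std::collections::BTreeMap;

type RevId = u64;
type Breaks = Vec<usize>; // stand-in: offsets of soft breaks

struct RevisionedBreaks {
    revs: BTreeMap<RevId, Breaks>,
    next_rev: RevId,
}

impl RevisionedBreaks {
    fn new() -> RevisionedBreaks {
        RevisionedBreaks { revs: BTreeMap::new(), next_rev: 0 }
    }

    /// Record a new set of breaks, returning the revision id that the
    /// next update sent to the client would carry.
    fn commit(&mut self, breaks: Breaks) -> RevId {
        let rev = self.next_rev;
        self.next_rev += 1;
        self.revs.insert(rev, breaks);
        rev
    }

    /// Resolve a client RPC against the revision the client has seen,
    /// garbage-collecting everything older: a well-behaved client will
    /// never again reference a revision older than this one.
    fn resolve(&mut self, client_rev: RevId) -> Option<&Breaks> {
        self.revs = self.revs.split_off(&client_rev);
        self.revs.get(&client_rev)
    }
}

fn main() {
    let mut rb = RevisionedBreaks::new();
    let r0 = rb.commit(vec![10, 20]);
    let r1 = rb.commit(vec![10, 25]); // rewrap moved a break
    // A client RPC arrives tagged with r1: resolve against it, GC r0.
    assert!(rb.resolve(r1).is_some());
    assert!(!rb.revs.contains_key(&r0));
    println!("live revisions: {}", rb.revs.len());
}
```

With this shape, the worst case described above falls out naturally: until the client acknowledges a newer revision, the map holds at most one stale set of breaks alongside the current one.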
The trickier-feeling bit will be deciding how to store and retrieve these revisions, and how to assemble the various bits of state necessary to handle a given RPC, while keeping core as tidy as possible.