Reading lit-html

I've been interested in lit-html for a while now: it's an HTML templating solution that doesn't require a build step like JSX, uses standard JavaScript tagged template literals for its syntax, and claims to selectively update the DOM without a V-DOM. At 8KB minified, it's also reasonably sized--not quite as tiny as my go-to dot.js micromodule, but it does a lot more.

In the past, reading through library code has been one of the ways that I got better at development. I remember Paul Irish's classic "10 things I learned from reading the jQuery source" video, which inspired me to do my own spelunking back in the day. Figuring out how jQuery, D3, and Backbone translated high-level function calls into the low-level browser API taught me a lot, and I figured lit-html would be a similar chance to learn something new.

I was right! It turns out there's a lot going on under the hood of lit-html. Even better, much of it is built on top of platform features rather than pure JS abstractions (which seem common in the React community), so I can take these lessons and apply them to new projects in any framework.

Since writing and teaching are the last step for my retention, let's take a walk through rendering a template in lit-html. This won't cover all the details, but it's enough to understand conceptually what's happening when you use the library to render to a component, and to catch some edge cases that crop up.

Using lit-html: a review

Before we get started, let's go over how lit-html works from the dev side. Essentially, it's a pair of functions imported from the module: html() is used to tag template strings, and render() does the actual DOM updates.

import { html, render } from "./lit-html.js";

var templateFunction = data => html`
  <h1>Hello, ${data.name}!</h1>
  <input type="color" value=${data.color}>
`;

// this will fill the body with our completed template
render(templateFunction({ name: "world", color: "#880088" }), document.body);

// now we can update it selectively, just changing the color:
render(templateFunction({ name: "world", color: "#FF0000" }), document.body);

Pretty simple! You can use this standalone, but of course it's more typical for it to be integrated into a component framework, the way it is for lit-element.

`html()` for tagging templates

Let's start with the easy part: the html() function is used to tag a JS template literal and convert it into a structure that lit-html understands. The documentation says this is "fast and cheap," which it is, but mostly because it's basically just storing the arguments of the function in an object. Here's the actual code that powers html(), slightly condensed:

// Type ID for HTML templates; defined elsewhere in the lit-html source.
const HTML_RESULT = 1;

const tag = (type) => (strings, ...values) => {
  return {
      // This property needs to remain unminified.
      ['_$litType$']: type,
      strings,
      values,
  };
};

const html = tag(HTML_RESULT);

In addition to the HTML templates, lit-html can generate SVG templates (which play by slightly different rules), so there's a factory named tag() to generate the actual html() function. When you use a function to "tag" a JS template literal, it gets called with an array of constant strings (i.e., the stuff that isn't interpolated), plus a variable number of arguments for the interpolated values. lit-html just jams both of those into an object, adds an ID for the type of template, and calls it a day.

html`<h1>${"hello"}</h1>`;

returns:

{ 
  _$litType$: HTML_RESULT,
  strings: ["<h1>", "</h1>"],
  values: ["hello"]
}

So yeah, fast and cheap makes sense, considering it's not really doing anything much. The meat is all in the actual render() call.
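To make the shape of those arguments a little more concrete, here's a small sketch using a made-up tag function called inspect() (it's not part of lit-html) that, like html(), just hands back what it was given. With N interpolated values there are always N + 1 static strings, even if some of them are empty:

```javascript
// Hypothetical tag function for illustration; not part of lit-html.
const inspect = (strings, ...values) => ({ strings, values });

const name = "world";
const color = "#880088";
const result = inspect`<h1>Hello, ${name}!</h1><input value=${color}>`;

console.log(result.strings); // ["<h1>Hello, ", "!</h1><input value=", ">"]
console.log(result.values); // ["world", "#880088"]

// The strings array is frozen by the engine, and is always one entry
// longer than the values array.
console.log(Object.isFrozen(result.strings)); // true
console.log(result.strings.length === result.values.length + 1); // true
```

That frozen array turns out to matter quite a bit later on.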

`render()` - an overview

Rendering a template to a DOM node also looks pretty simple at first, at least conceptually. When you call render() and pass in a value and a container, the function does the following:

Checks to see if the container has been updated before.
If so, get the TemplatePart instance that was cached there.
If not, generate a TemplatePart and cache it on a _$litPart$ property on the Node.
Either way, call _$setValue() on the TemplatePart to actually update it.
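Those steps can be sketched in a few lines. This is not the library's code: SketchPart is a made-up stand-in for the real part classes, and a plain object plays the role of the container node. Only the _$litPart$ property name and the _$setValue() method name come from the description above.

```javascript
// Hypothetical stand-in for lit-html's part classes; it only records the value.
class SketchPart {
  constructor(container) {
    this.container = container;
    this.lastValue = undefined;
  }
  _$setValue(value) {
    this.lastValue = value;
  }
}

function render(value, container) {
  // Has this container been rendered to before?
  let part = container["_$litPart$"];
  if (part === undefined) {
    // If not, create a part and cache it on the container itself.
    part = new SketchPart(container);
    container["_$litPart$"] = part;
  }
  // Either way, hand the new value to the part.
  part._$setValue(value);
  return part;
}

// A plain object stands in for a DOM node here.
const container = {};
const first = render("hello", container);
const second = render("goodbye", container);
console.log(first === second); // true: the cached part was reused
console.log(second.lastValue); // "goodbye"
```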

A TemplatePart is how lit-html connects chunks of template to a DOM structure. It may be responsible for a large section (the one that's attached to the container essentially maps the template to the element's entire contents), or it might just control a subsection, like an attribute or a repeated list. The most common is ChildPart, which replaces chunks of DOM content. Others are specialized classes that implement the same _$setValue() interface but operate on different underlying APIs, like AttributePart (for attributes, obviously) and ElementPart (for directives bound to an element itself, rather than to one of its attributes).

We'll come back to _$setValue() in a minute. But first, we need to talk about how the TemplatePart takes that tagged TemplateResult from html() and turns it into an actual DOM structure.

Generating templates from strings

This is where things got complicated.

Let's assume this is our first time rendering to a container, so we need to generate a new TemplatePart to be the "root" of the template. As I mentioned, these objects serve as a way to map a template to a chunk of the document tree--remember, lit-html doesn't use a virtual DOM, so any updates are performed directly on the page itself. So the first cool thing that render() does is to create an inert marker that represents where the TemplatePart is hosted, but won't affect CSS or HTML structure.

It does this using an HTML Comment node, with at least one at the start and often (in the case of iterable lists or inline values in a larger text block) another marking the end of the TemplatePart. If the part needs to be re-rendered, everything between the two markers can be cleared. But since they're comments, they don't affect CSS that relies on element positioning/order, like :nth-child(), and they won't show up when you're querying for HTML elements. In the inspector, these show up as comment nodes, often empty ones like `<!---->`.

With the markers in place, the TemplatePart can do the actual template parsing, which it accomplishes using an actual <template> tag! These elements were created a few years back, when it was common for frameworks like Backbone to load HTML from <script> tags marked with type="text/html" or something similar. The idea was that you could switch to something more semantic, and it would provide extra capabilities built-in.

<template> tags were largely seen as a misfire by the time they went from concept to shipped standard. By that point, React's JSX had tackled the same problem--it required a build step, but the ergonomics and infrastructure were often better for client-side apps, since <template> needed to be injected into HTML, instead of simply going into the same JS bundle as the rest of your code.

But <template> turns out to be really useful if you want to turn a string into DOM and manipulate it because, unlike the contents of a <script> tag, its text is parsed into an inert DocumentFragment that can be cloned and queried, but won't load scripts, images, or other resources. That fragment can be manipulated using the same methods that we use elsewhere in the DOM, and it can hold the same marker comments that lit-html uses to demarcate TemplatePart sections.

So: the TemplatePart finally takes the result object from our html() call, and it runs through the non-interpolated strings, one character at a time, so that it knows whether the string ends in:

a section of text inside an element,
an attribute value,
an attribute name, or
an element name

The end of each string is the start of an interpolated value, so when it reaches that point, it injects a marker to represent that placeholder, either a comment node (for text values) or a special attribute name (for everything else). When all the strings are parsed, we have a big chunk of HTML that can go into our <template>, and will come out the other end as valid DOM.
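Here's a heavily simplified sketch of that marker injection, to show the general idea. The marker string is made up, and the scanner only tracks one thing (whether a chunk ends inside a tag or in text content), where the real library distinguishes several more states. It also drops the marker in as the attribute value, which is a shortcut compared to the special attribute name described above.

```javascript
// Made-up marker string, for illustration only.
const marker = "lit$sketch$";

function buildTemplateHTML(strings) {
  let html = "";
  let insideTag = false;
  strings.forEach((chunk, i) => {
    // Scan one character at a time to learn where this chunk leaves off.
    // (Naive: a ">" inside a quoted attribute value would confuse it.)
    for (const ch of chunk) {
      if (ch === "<") insideTag = true;
      else if (ch === ">") insideTag = false;
    }
    html += chunk;
    // Every chunk except the last is followed by an interpolated value.
    if (i < strings.length - 1) {
      html += insideTag ? marker : `<!--${marker}-->`;
    }
  });
  return html;
}

console.log(buildTemplateHTML(["<h1>Hello, ", "!</h1><input value=", ">"]));
// <h1>Hello, <!--lit$sketch$-->!</h1><input value=lit$sketch$>
```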

Once the <template> constructs that DOM tree, lit-html creates a TreeWalker, which is a classic DOM interface for stepping through the nodes (elements, text, and comments) in a section of the page. The walker steps through the template's DocumentFragment, checking the attributes of each element along the way, and lit-html creates a tree of TemplateParts from the markers it encounters. This sounds redundant--you already stepped through the template letter-by-letter, and now you're going to run through it again?--but it means that instead of writing a custom parser, lit-html just relies on the browser, so its behavior will match what developers would expect. Walking the template tree does take a little time, but it's only done once per TemplatePart assignment to find marker locations, and then all this is cached for future updates. The template tag is also cached, so that multiple page sections using the same template (think list items) don't need to go through the whole marker injection process again.
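The walk itself needs a browser, but its shape can be sketched with plain objects standing in for DOM nodes. Everything here is hypothetical (the marker string, the node objects, the findParts() name). The point is just that a depth-first pass can number each node and note which ones carry a marker, so that a later clone of the same tree can be matched up by position.

```javascript
// Made-up marker string, for illustration only.
const marker = "lit$sketch$";

// Plain objects standing in for the parsed template's DOM nodes.
const templateTree = {
  type: "element", name: "div", attributes: {},
  children: [
    { type: "element", name: "h1", attributes: {}, children: [
      { type: "text", data: "Hello, " },
      { type: "comment", data: marker },
    ] },
    { type: "element", name: "input", attributes: { value: marker }, children: [] },
  ],
};

// Depth-first walk that numbers every node and records where markers were found.
function findParts(root) {
  const parts = [];
  let index = -1;
  const visit = (node) => {
    index++;
    if (node.type === "comment" && node.data === marker) {
      parts.push({ kind: "child", index });
    } else if (node.type === "element") {
      for (const [name, value] of Object.entries(node.attributes)) {
        if (value === marker) parts.push({ kind: "attribute", index, name });
      }
      node.children.forEach(visit);
    }
  };
  visit(root);
  return parts;
}

console.log(findParts(templateTree));
// [{ kind: "child", index: 3 }, { kind: "attribute", index: 4, name: "value" }]
```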

Committing values

Now that we have our template DOM and a corresponding tree of TemplatePart objects, we can apply the actual values and render it out. This brings us back to _$setValue(). Our root TemplatePart has an array of "parts" inside of it, each of which is associated with a placeholder marker where a value should go. What it does on render depends on the type of the argument passed to _$setValue():

If it's a TemplateResult, we'll loop through the "values" array that html() generated for us and use _$setValue() on the corresponding TemplatePart placeholders generated above--essentially, we recurse for this case.
If it's a primitive value (a string or number), it'll just set the text (the data property) of a Text node after the marker comment.
If it's an iterable, it removes everything between the start and end markers, and replaces it with the output from the array, which is probably a collection of TemplateResult objects.
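If it's a DOM node, it'll replace that chunk of the DOM tree with the new node. Something similar happens when the template itself changes (say, if you wrote hasError ? errorTemplateResult : successTemplateResult): the old chunk is swapped out for the freshly rendered one.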

This is how lit-html creates selective updates without diffing the DOM: its inner data structures effectively encode a mapping of values to live DOM markers by index, and each mapping "knows" how to update itself. Some simple optimizations (only commit a value if it's different from the last commit, cache template sections aggressively) make this extremely fast.
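The "only commit if it changed" idea is simple enough to sketch. This is an illustration rather than the library's implementation: SketchTextPart is a made-up class, and plain objects with a data property stand in for Text nodes.

```javascript
// Hypothetical part that remembers its last committed value.
class SketchTextPart {
  constructor(node) {
    this.node = node;
    this.committed = undefined;
  }
  setValue(value) {
    if (value === this.committed) return false; // unchanged, so skip the write
    this.committed = value;
    this.node.data = String(value);
    return true;
  }
}

// Plain objects standing in for Text nodes in the live document.
const nodes = [{ data: "" }, { data: "" }];
const parts = nodes.map((node) => new SketchTextPart(node));

// Values map to parts by index, just like the "values" array from html().
const update = (values) => values.map((value, i) => parts[i].setValue(value));

console.log(update(["world", "#880088"])); // [true, true]: first render writes both
console.log(update(["world", "#FF0000"])); // [false, true]: only the color is written
console.log(nodes[1].data); // "#FF0000"
```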

An additional optimization is "directives," which is a common interface that can be used to minimize DOM changes. For example, you can use map() to generate an array of template results for rendering, but it'll replace the entire contents of the list if they change. To be more efficient, the repeat() directive will compare each value to the previous state and only make changes where needed. How this works is left as an exercise for the reader.

One last trick

I spent most of my time in this investigation just reading the source code for lit-html. I'm not great with TypeScript, but once I understood what each piece was doing, it was pretty easy to step through it mentally without actually needing to execute the code--render() is a mostly linear process, so I didn't need to keep too many things in my head at once.

But before writing this up, I decided to grab the unminified version of the code, add some comments, and step through it to make sure I understood everything. And that's when I hit some weird behavior. Consider the following test setup:

import { html, render } from "./lit-html.js";

render(html`<h1>${"hello, world"}</h1>`, document.body);

// get the h1 tag
var h1 = document.querySelector("h1");

// re-render
render(html`<h1>${"goodbye, world"}</h1>`, document.body);
var h1b = document.querySelector("h1");

console.log(h1 == h1b); // false?

lit-html is supposed to be selectively updating the DOM, and the structure hasn't changed, but that <h1> is brand-new after the second render. This also carried through to things like <input> tags, which would lose their value or their focus state if I re-rendered after typing into them. I knew this wasn't supposed to happen!

It got weirder, since if the template was wrapped in a function, everything worked just fine:

import { html, render } from "./lit-html.js";

var template = input => html`<h1>${input}</h1>`;

render(template("hello, world"), document.body);

// get the h1 tag
var h1 = document.querySelector("h1");

// re-render
render(template("goodbye, world"), document.body);
var h1b = document.querySelector("h1");

console.log(h1 == h1b); // true!?!?

What the hell is going on? Both methods are feeding identical TemplateResult objects to render(), so why does one correctly change the text only, and one completely re-renders the contents of the body?

This took me down a rabbit hole of tracing the code, and I soon realized that in the first case, lit-html saw these as completely different templates, whereas in the second it correctly identified that they were identical and re-used the cached root TemplatePart. I assumed that the cache was based on the static template content, but this turned out to be false: lit-html uses a WeakMap to look up templates in the cache by their "strings" property. Which (as far as I knew) shouldn't work, since "strings" would normally be a fresh array with a different identity after each call to template(). Clearly, something else was operating outside the normal JavaScript semantics I was used to.

The answer, as it turns out, is deep in the implementation details of tagged template literals. When we apply a tag function to a template literal, the string constant array that the function gets as its first argument is actually a special kind of array that the VM caches, so repeated invocations get a reference to the same frozen array object as their argument. But, crucially, the array isn't cached based on its contents, but on the literal's location in the source code.

So when we write a template as a function that then calls html() and returns the result, the resulting "strings" array will have the same reference identity each time, because that function only lives in one place in the source code. But if we invoke html() on identical literals in different locations, the browser will return references to new string arrays for each one, and lit-html's caching can't recognize them as the "same" template for updates:

// these will be seen as different
var a = html`hello`;
var b = html`hello`;
console.log(a.strings == b.strings); // false

// but a single function will be cacheable based on "strings" identity
var t = () => html`hello`;
var c = t();
var d = t();

// c and d are distinct objects
console.log(c == d); // false
// but they have the same "strings" identity for caching
console.log(c.strings == d.strings); // true!

// a new, identical function will have a new "strings" identity
var y = () => html`hello`;
var e = y();

console.log(c.strings == e.strings); // false
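To see how that identity feeds into caching, here's a sketch of a WeakMap keyed on the "strings" array. The tag function, the cache, and the stand-in "template" object are all made up for illustration. The only thing borrowed from lit-html is the idea of looking templates up by "strings" identity.

```javascript
// Hypothetical tag and cache, for illustration only.
const tag = (strings, ...values) => ({ strings, values });
const cache = new WeakMap();
let compileCount = 0;

function getTemplate(result) {
  let template = cache.get(result.strings);
  if (template === undefined) {
    // Cache miss: do the (pretend) expensive parsing work once.
    compileCount++;
    template = { html: result.strings.join("<!--marker-->") };
    cache.set(result.strings, template);
  }
  return template;
}

const greet = (name) => tag`<h1>${name}</h1>`;

// Same literal in the source, so same "strings" array: one compile.
getTemplate(greet("hello"));
getTemplate(greet("goodbye"));
console.log(compileCount); // 1

// Identical text, but a different literal in the source: cache miss.
getTemplate(tag`<h1>${"hello"}</h1>`);
console.log(compileCount); // 2
```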

So there you have it: a tiny detail in the specified behavior for tagged template literals, and a clever hack around the library's caching behavior, combine to create counterintuitive implications for how efficient your rendering will be under lit-html. The framework guide says to use functions for dynamic templating, but it never explains why. A cache based on concatenated template contents would use a little more memory, but probably not significantly more, and its behavior would be more predictable.

For me, a person who does enjoy learning about and knowing the weird corners of browser behavior, this serves as a cautionary tale. No matter how neat it is to find a fun trick like this, especially if it gives you a performance boost in the general case, it needs to be documented at least--and maybe even eschewed entirely, if the potential confusion outweighs what little performance gain you might make.

thomaswilburn/lit-html.md