This document proposes an alternate, human readable source map format, called the Simple format, that complements the current (version 3) format, called the Standard format in this document, and explains why it is needed.
Many people seem to think that source maps are difficult. They’re a black box you fiddle with when setting up a project. When you’ve finally made it so that you get the original source code file names as well as line and column numbers in the error console, you can relax and hope that you’ll never need to touch that again.
Most people also seem to have no idea how to debug their source maps. All they can do is fiddle around in the console/devtools of their browser to try to understand what maps where and what’s really going on. That’s tremendously difficult.
I base the above statements on my experience of following and helping out with various CSS and JavaScript compilers and minifiers for a few years, both in code and in issue trackers.
This is a shame, because source maps are, in concept, very simple. The reason is, in my opinion, misinformation and focusing on the wrong things.
First off, I’ve seen countless articles and blog posts linking to this article: http://www.html5rocks.com/en/tutorials/developertools/sourcemaps/
While the above article starts out great with simple step-by-step instructions on how to enable source maps in the browser, it then continues explaining the source map format. That’s where things start to go wrong.
At first, the source map format is very easy to explain. It’s simply a JSON object with clearly named, self-explanatory keys. All keys map to easily understandable values, except “mappings”, which maps to a long, cryptic string. All you know is that this string somehow maps line and column numbers from a generated file back to the source file. That’s a shame, because the data in this string is the gist of the source map, the interesting part. It’s here that things go wrong, and it’s here you need to look when debugging the source map setup of your build tool. It is very annoying to look at that string when your source map is broken, knowing that the problem is somewhere in there, but not being able to see it. You can’t even manually correct part of it, just to check that you’re on the right path.
The article actually goes on to explain how that cryptic string works, and it does so well. The point is that this step shouldn’t be needed. Nobody should need to know how that format works, except implementors and people who actually like that kind of thing. This is what scares people off.
The reason the “mappings” field consists of this cryptic string is size. In other words, what the HTML5 Rocks article ends up explaining, is more or less a compression algorithm—one specially made to compress source mappings.
Most people use gzip to compress their JavaScript and CSS before sending it over the Internet, but not many have a clue how gzip works. They don’t need to either. We only work with the uncompressed source code.
The problem with source maps is that the “mappings” field has no standard readable source format. There’s only the compressed format, which means that working with a source map is a bit like editing gzipped JavaScript and CSS files—nobody would do that!
The lack of a standard readable format makes it difficult to communicate about source maps. Imagine StackOverflow questions where people posted gzipped JavaScript and expected people to try to help.
Luckily, there are tools to help:
- http://sokra.github.io/source-map-visualization/
- https://gfx.github.io/source-map-inspector/
- http://sourcemapper.qfox.nl/ (old and defunct)
- http://murzwin.com/base64vlq.html
The first three of those are visual tools, which show the original and generated code side by side and visually show the mappings between them. That’s great and something that is really useful for both compiler writers and source map users trying to debug them.
The interesting thing, though, is that the first of the tools also has a readable text representation of the source map. So does the last one. Looking at those together with a visualization is a great relief. You actually understand what’s going on! This indicates that there is a common need for a human readable format. Why would people create these tools otherwise?
I wish that there was a standardized human readable format for source maps, so people could just open up their map.json file, have a look at it and actually understand something. I think that would greatly help reduce the common belief that source maps are difficult and black magic (I’ve actually seen people calling it black magic in an issue tracker).
Another benefit of a standard human readable format is unit testing of source maps. We’ll get back to that.
Yet another benefit is that a human readable format is actually useful to computers as well. It is easy to generate, and easy to consume. We’ll get back to that as well.
Using an uncompressed format can also bring performance gains. Often, the same source file is piped through several compilers/optimizers, each of which updates the source map. Not having to uncompress and re-compress between each step should improve performance. This is a real problem, as can be seen from how much work has been put into the performance of the mozilla/source-map module.
When designing the format described here, I just tried to do the simplest thing possible. When I was done, I realized that I had more or less re-invented the format used in the http://sokra.github.io/source-map-visualization/ visualization tool mentioned above. Seeing that somebody else (Tobias Koppers) had come up with the same format before felt good. It indicates that it is on the right track.
First off, the proposed source map format is identical to the current standard. It is a JSON object with the same keys as the standard. The only difference is that the value of the “mappings” field is not a string, but an object, described below. This means that the Simple format is 100% “backwards compatible”. All a source map consumer needs to do is to check the type of the “mappings” field. If it is a string—uncompress it. If it is an object—consume it.
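As a rough sketch, that type check could look like the following. The function name normalizeMappings and the decodeStandardMappings parameter are made up for illustration; any existing decoder for the Standard format could be passed in.

```javascript
// Return the mappings of `map` as a Simple format object.
// `decodeStandardMappings` is a hypothetical decoder supplied by the caller;
// it is only invoked when the map uses the Standard (string) format.
function normalizeMappings(map, decodeStandardMappings) {
  const mappings = map.mappings
  if (typeof mappings === 'string') {
    // Standard format: uncompress it.
    return decodeStandardMappings(mappings)
  }
  if (mappings !== null && typeof mappings === 'object' && !Array.isArray(mappings)) {
    // Simple format: consume it as is.
    return mappings
  }
  throw new TypeError('Unsupported "mappings" value')
}
```

A consumer written this way accepts both formats without needing any extra flag or version number.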
The keys of the “mappings” object are line numbers in the generated file. The value of each key is an object that holds all the mappings for that particular line.
The keys of those line objects are column numbers in the generated file. The value of each key is an array that tells where this particular (line,column) pair maps to.
Those arrays can be empty, which indicates that no original location is associated with that segment, or contain three or four integers:
- The source file index into the “sources” field.
- Source file line number.
- Source file column number.
- Optional: An index into the “names” field.
(Note how the above maps nicely to the fields of the segments in the Standard format.)
Example:
{
  "version": 3,
  "sources": ["foo.js"],
  "names": [],
  "mappings": {
    "1": {"0": [0, 5, 5]},
    "2": {"2": [0, 6, 6], "4": [], "8": [0, 6, 8]}
  }
}
(Note that this format by design does not allow duplicate mappings.)
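To illustrate how closely the two formats correspond, here is a minimal sketch of decoding a Standard “mappings” string into a Simple “mappings” object. The function names are made up for this example. It also makes one assumption that the description above does not settle: it keeps the zero-based line and column numbers of the Standard format as they are.

```javascript
const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

// Decode one base64 VLQ segment (such as "AACA") into an array of integers.
function decodeVlqSegment(segment) {
  const values = []
  let value = 0
  let shift = 0
  for (const char of segment) {
    const digit = BASE64.indexOf(char)
    if (digit === -1) throw new Error(`Invalid base64 digit: ${char}`)
    value += (digit & 31) << shift
    if (digit & 32) {
      // Continuation bit set: more digits belong to this number.
      shift += 5
    } else {
      // The lowest bit holds the sign.
      const negative = value & 1
      value >>>= 1
      values.push(negative ? -value : value)
      value = 0
      shift = 0
    }
  }
  return values
}

// Assumption: line and column numbers stay zero-based, as in the Standard format.
function standardToSimple(mappingsString) {
  const simple = {}
  // These fields are relative to the previous segment, even across lines.
  let source = 0
  let sourceLine = 0
  let sourceColumn = 0
  let name = 0
  mappingsString.split(';').forEach((lineString, lineIndex) => {
    if (lineString === '') return
    const lineMappings = {}
    let column = 0 // The generated column restarts on every line.
    for (const segment of lineString.split(',')) {
      const fields = decodeVlqSegment(segment)
      column += fields[0]
      if (fields.length === 1) {
        lineMappings[column] = [] // No original location.
        continue
      }
      source += fields[1]
      sourceLine += fields[2]
      sourceColumn += fields[3]
      const mapping = [source, sourceLine, sourceColumn]
      if (fields.length >= 5) {
        name += fields[4]
        mapping.push(name)
      }
      lineMappings[column] = mapping
    }
    simple[lineIndex] = lineMappings
  })
  return simple
}
```

Because a later segment for the same (line, column) pair simply overwrites an earlier one, duplicate mappings disappear in the conversion.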
The Standard format was invented because of size. The spec also allows gzipping source maps, which would save further space, but I’ve never seen anyone do this in practice. Instead, many choose to inline their source maps into the generated files as base64-encoded data URIs, which increases the size by about 33%. So it seems that size is important, but not that important.
I’ve done some real-world testing on the size impact of the source maps. The results of the tests as well as everything you need to re-run them are available here: https://github.com/lydell/ostio/tree/simple-maps
Here is a summary of the tests:
- The Simple format is 3–4 times larger than the Standard format.
- Gzipping the Simple format results in about the same size as the Standard format or less.
- Gzipping both, the Simple format is about 2–10 times larger. (This does not make the Simple format huge, it’s just that the Standard format seems to gzip better.)
This is, in my opinion, entirely reasonable. The largest uncompressed source map in the tests became 392K. The biggest use case for source maps is consuming them locally while developing, and then 392K is not a problem at all. If that source map needs to be served over the Internet, there’s nothing stopping you from compressing it to the Standard format and then gzipping it down to 20K.
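Compressing to the Standard format is a small job. The following is a minimal sketch, not a reference implementation; the function names are made up, and it assumes that the line keys of the Simple “mappings” object are zero-based like the lines of the Standard format.

```javascript
const BASE64_DIGITS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

// Encode one integer as a base64 VLQ string.
function encodeVlq(value) {
  // The sign is stored in the lowest bit.
  let vlq = value < 0 ? ((-value) << 1) | 1 : value << 1
  let result = ''
  do {
    let digit = vlq & 31
    vlq >>>= 5
    if (vlq > 0) digit |= 32 // Continuation bit: more digits follow.
    result += BASE64_DIGITS[digit]
  } while (vlq > 0)
  return result
}

// Assumption: line keys are zero-based generated line numbers.
function simpleToStandard(mappings) {
  const lastLine = Object.keys(mappings)
    .reduce((max, key) => Math.max(max, Number(key)), -1)
  // Previous source index, source line, source column and name index.
  // These are delta-encoded across the whole file.
  const previous = [0, 0, 0, 0]
  const lines = []
  for (let line = 0; line <= lastLine; line++) {
    const lineMappings = mappings[line] || {}
    let previousColumn = 0 // The generated column restarts on every line.
    const segments = Object.keys(lineMappings)
      .map(Number)
      .sort((a, b) => a - b)
      .map(column => {
        let segment = encodeVlq(column - previousColumn)
        previousColumn = column
        lineMappings[column].forEach((value, index) => {
          segment += encodeVlq(value - previous[index])
          previous[index] = value
        })
        return segment
      })
    lines.push(segments.join(','))
  }
  return lines.join(';')
}
```

Lines without any mappings become empty strings between the semicolons, and an empty mapping array becomes a segment with only the generated column.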
It would be interesting to see if there is any performance gain of using the Simple format. I’d guess there is.
Many tools, especially CSS compilers, have test fixtures consisting of expected CSS output. This makes sense, since many of those tools are supposed to have a specific output style that needs to be tested. CoffeeScript, on the other hand, does not have such fixtures; it only tests that CoffeeScript programs do what they should. The following applies only to compilers with expected output fixtures.
I’ve seen CSS compilers adding source map support by using one of the visualizers linked to above to verify the source map. When finishing the implementation, fixtures of the entire source maps were saved along with the expected output. For a tool that has expected output fixtures it makes sense to test the entire source map this way. The bad thing is that when a test fails, most likely the “mappings” strings did not match. It is very difficult to tell what went wrong: You know that something went wrong with the source map generation, but you have no clue why. You could use a visualizer in this case again, but then you’d have to manually spot the mistake visually, which can be quite difficult.
Using the Simple format, paired with an assertion library with object diffs, such as Unexpected, you’d instantly see that a column number there should be 5 instead of 6, that you’re missing a mapping there and have one too many there. This gist includes mocha test cases that compare failing test cases between the Simple format and the Standard format. It is very easy to run:
- Clone this gist with git.
- Run npm install.
- Run npm test.
I encourage you to spare two minutes doing the above! The difference is amazing!
(It might be said that doing deep object comparisons for source maps is a bad way to test, because two source maps can be equivalent while not deeply equal, thus making you test implementation details. For example, the “sources” and “names” arrays could have different orders (and each mapping then different source and name indexes). I consider this a theoretical problem, though, since I’ve never seen a compiler populating those arrays in any other order than the order they appear in the generated file, and I’ve never seen a source map test fail for this reason. Generated files are after all generated from start to end, populating the sources and names arrays along the way in the order they are found.)
In other words, the Simple format would allow a dead simple and good way to test source maps suitable for many tools.
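To give a feel for what such a failure report contains, here is a small hand-rolled sketch of a mappings diff. It is not how Unexpected or any other assertion library works internally; the function name diffMappings and the message wording are made up for this example.

```javascript
// Compare two Simple format "mappings" objects and describe every difference.
function diffMappings(expected, actual) {
  const problems = []
  const lines = new Set([...Object.keys(expected), ...Object.keys(actual)])
  for (const line of [...lines].sort((a, b) => a - b)) {
    const expectedLine = expected[line] || {}
    const actualLine = actual[line] || {}
    const columns = new Set([...Object.keys(expectedLine), ...Object.keys(actualLine)])
    for (const column of [...columns].sort((a, b) => a - b)) {
      const want = expectedLine[column]
      const got = actualLine[column]
      const where = `${line}:${column}`
      if (want === undefined) {
        problems.push(`${where}: unexpected mapping ${JSON.stringify(got)}`)
      } else if (got === undefined) {
        problems.push(`${where}: missing mapping ${JSON.stringify(want)}`)
      } else if (JSON.stringify(want) !== JSON.stringify(got)) {
        problems.push(`${where}: expected ${JSON.stringify(want)} but got ${JSON.stringify(got)}`)
      }
    }
  }
  return problems
}

const expected = {
  "1": {"0": [0, 5, 5]},
  "2": {"2": [0, 6, 6], "4": [], "8": [0, 6, 8]}
}
const actual = {
  "1": {"0": [0, 5, 5]},
  "2": {"2": [0, 6, 5], "8": [0, 6, 8], "9": [0, 6, 9]}
}
console.log(diffMappings(expected, actual).join('\n'))
// 2:2: expected [0,6,6] but got [0,6,5]
// 2:4: missing mapping []
// 2:9: unexpected mapping [0,6,9]
```

Each message points at an exact generated (line, column) pair, which is information a failed comparison of two Standard “mappings” strings cannot give.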
Consuming a source map in the human readable format is very simple. To find the original location for line and column pair (L,C) in a generated file:
function getOriginalLocation(map, line, column) {
  if (!(line in map.mappings)) return null
  const lineMappings = map.mappings[line]
  let closestColumn
  if (column in lineMappings) {
    closestColumn = column
  } else {
    // Find the closest mapped column to the left of `column`.
    closestColumn = Object.keys(lineMappings)
      .reduce((closest, key) => {
        key = Number(key)
        return key > closest && key < column ? key : closest
      }, -1)
    if (closestColumn === -1) return null
  }
  const mapping = lineMappings[closestColumn]
  // An empty array means that no original location is associated.
  if (mapping.length === 0) return null
  return mapping
  // Alternatively, resolve the indexes:
  // return {
  //   source: map.sources[mapping[0]],
  //   line: mapping[1],
  //   column: mapping[2],
  //   name: mapping.length === 4 ? map.names[mapping[3]] : null,
  // }
}
As an optimization, one could cache a sorted Object.keys(lineMappings), allowing for a binary search. Note that Node.js tools have an advantage here: Node.js is V8-based, and V8 (like other modern JavaScript engines) enumerates integer-looking keys in ascending numeric order!
Of course, the above should be in a library. I’m just pointing out that the Simple format is easy to work with in code, too.
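The caching idea could be sketched roughly as follows. The name createLookup is made up for this example, and the sketch sorts the keys explicitly instead of relying on the key order of the engine.

```javascript
// Create a lookup function for `map` that caches the sorted columns of each
// generated line the first time that line is queried.
function createLookup(map) {
  const sortedColumnsByLine = new Map()

  function getSortedColumns(line) {
    const key = String(line)
    if (!sortedColumnsByLine.has(key)) {
      const columns = Object.keys(map.mappings[line])
        .map(Number)
        .sort((a, b) => a - b)
      sortedColumnsByLine.set(key, columns)
    }
    return sortedColumnsByLine.get(key)
  }

  return function getOriginalLocation(line, column) {
    if (!(line in map.mappings)) return null
    const columns = getSortedColumns(line)
    // Binary search for the largest mapped column that is <= `column`.
    let low = 0
    let high = columns.length - 1
    let closest = -1
    while (low <= high) {
      const middle = (low + high) >> 1
      if (columns[middle] <= column) {
        closest = columns[middle]
        low = middle + 1
      } else {
        high = middle - 1
      }
    }
    if (closest === -1) return null
    const mapping = map.mappings[line][closest]
    // An empty array means that no original location is associated.
    return mapping.length === 0 ? null : mapping
  }
}

const lookup = createLookup({
  version: 3,
  sources: ["foo.js"],
  names: [],
  mappings: {
    "1": {"0": [0, 5, 5]},
    "2": {"2": [0, 6, 6], "4": [], "8": [0, 6, 8]}
  }
})
console.log(lookup(2, 3)) // [ 0, 6, 6 ]
console.log(lookup(2, 5)) // null, since column 4 has an empty mapping
```

The cache is only valid as long as the “mappings” object is not modified after the lookup function has been created.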
Iterating over all mappings:
Object.keys(map.mappings).forEach(line => {
  Object.keys(map.mappings[line]).forEach(column => {
    const mapping = map.mappings[line][column]
    doSomethingWith(line, column, mapping)
  })
})
The Simple format is very hackable. You can very easily tweak a line or column number, for the purposes of experimenting, testing and doing that one-off manual job.
You could, for example, manually concatenate two files (that have source maps) with cat, bump the line numbers of the latter file’s source map with map-obj and then merge the source maps more or less with Object.assign().
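A rough sketch of that one-off job, written with plain loops instead of map-obj, might look like this. The function name concatSimpleMaps and its firstLineCount parameter are made up for the example. It assumes that the first file ends with a newline, so that the second file starts on a line of its own, and it makes no attempt to remove duplicate entries in “sources” or “names”.

```javascript
// Merge the Simple format source maps of two files that were concatenated.
// `firstLineCount` is the number of lines in the first generated file.
function concatSimpleMaps(first, second, firstLineCount) {
  const mappings = Object.assign({}, first.mappings)
  for (const line of Object.keys(second.mappings)) {
    const shifted = {}
    for (const column of Object.keys(second.mappings[line])) {
      // The "sources" and "names" arrays are concatenated below, so the
      // indexes used by the second map need to be bumped as well.
      shifted[column] = second.mappings[line][column].map((value, index) => {
        if (index === 0) return value + first.sources.length
        if (index === 3) return value + first.names.length
        return value
      })
    }
    // Bump the generated line number past the end of the first file.
    mappings[Number(line) + firstLineCount] = shifted
  }
  return {
    version: 3,
    sources: first.sources.concat(second.sources),
    names: first.names.concat(second.names),
    mappings,
  }
}
```

Since the line keys of the two maps never collide after the bump, merging really is little more than copying keys from one object to another.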