@lynsei
Last active March 28, 2022 15:57
[es6-cjs] #murder #hornets because every project seems to suffer in the ecosystem of #nodejs. It is probably #ryan #dahls fault somehow.

"Murder Hornets attacked my code."

"This code never had errors before, but now through the magic of async functions, it has all sorts of problems. Why do Node CommonJS authors hate ESM authors so much?" --- some guy on the internet

a.k.a. What it feels like to work in TypeScript sometimes

Look folks, the Node and ECMAScript (ESM) developers don't hate each other (too much)! Neither do the users or developers of these two subtly different loading systems for Node packages, because we ALL use both of them, along with the npm features that go hand-in-hand with them: workspaces, modules, linking, etc. We don't hate anyone, because we aren't 12 years old!

The problem is that there are subtle differences between ECMAScript 6 (ES6) modules and CommonJS. These subtle differences make me curse Ryan Dahl, but that's okay. I mean, who doesn't? Imports are newer and much more complex, but they are built into the JavaScript language, after all. Somehow I still think Ryan is responsible for this. In 2022 it seems like every other package or codebase I touch has a CJS/ESM problem caused by these sorts of bugs. THANKS, RYAN! Nobody seems to have any idea how to stop projects across the ecosystem from suffering from import/require murder hornets. I'm supposed to be writing cool AI shit right now, but instead I'm fixing import statements!

I stole the rest of this article from Redfin's Principal Engineer, but like, if one architect steals another engineer's article to publish a gist in the woods, does it make a sound? ...and since this is all open source MIT/ISC banter, doesn't that just mean they wind up cancelling each other out? I hope he has a laugh about it, anyhow.

Background: What’s CJS? What’s ESM?

Since the dawn of Node, Node modules have been written as CommonJS modules. We use require() to import them. When implementing a module for other people to use, we can define exports, either “named exports” by setting module.exports.foo = 'bar' or a “default export” by setting module.exports = 'baz'.

Here’s a CJS example using named exports, where util.cjs has an export named sum.

// @filename: util.cjs

module.exports.sum = (x, y) => x + y;

// @filename: main.cjs

const {sum} = require('./util.cjs');

console.log(sum(2, 4));

Here’s a CJS example where util.cjs sets a default export. The default export has no name; modules using require() define their own name.

// @filename: util.cjs

module.exports = (x, y) => x + y;

// @filename: main.cjs

const whateverWeWant = require('./util.cjs');

console.log(whateverWeWant(2, 4));

In ESM scripts, import and export are part of the language; like CJS, they have two different syntaxes for named exports and the default export.

Here’s an ESM example with named exports, where util.mjs has an export named sum.

// @filename: util.mjs

export const sum = (x, y) => x + y;

// @filename: main.mjs

import {sum} from './util.mjs'

console.log(sum(2, 4));

Here’s an ESM example where util.mjs sets a default export. Just like in CJS, the default export has no name, but the module using import defines its own name.

// @filename: util.mjs

export default (x, y) => x + y;

// @filename: main.mjs

import whateverWeWant from './util.mjs'

console.log(whateverWeWant(2, 4));

ESM and CJS are completely different animals

In CommonJS, require() is synchronous; it doesn't return a promise or call a callback. require() reads from the disk (or perhaps even from the network), and then immediately runs the script, which may itself do I/O or other side effects, and then returns whatever values were set on module.exports.
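To make the "synchronous" part concrete, here is a toy model of a CJS-style loader. It is only a sketch: `sources` and `miniRequire` are made-up stand-ins for files on disk and for Node's real require(), and the real loader does far more (path resolution, wrappers, error handling).

```javascript
// Toy model of CommonJS loading. `sources` and `miniRequire` are hypothetical
// stand-ins for files on disk and for Node's real require().
const log = [];
const cache = {};

const sources = {
  './util.cjs': (module) => {
    log.push('util runs'); // a side effect of executing the module body
    module.exports.sum = (x, y) => x + y;
  },
};

function miniRequire(id) {
  // A module body runs at most once; later calls reuse the cached exports.
  if (cache[id]) return cache[id].exports;
  const module = { exports: {} };
  cache[id] = module;
  // The whole module body runs right here, before miniRequire returns.
  sources[id](module);
  return module.exports;
}

log.push('before require');
const { sum } = miniRequire('./util.cjs');
log.push('after require');
miniRequire('./util.cjs'); // cached: the body does not run a second time

console.log(log); // [ 'before require', 'util runs', 'after require' ]
console.log(sum(2, 4)); // 6
```

The point to notice is that the caller never gets a chance to do anything between asking for the module and the module's code running. There is no promise and no callback.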

In ESM, the module loader runs in asynchronous phases. In the first phase, it parses the script to detect import and export statements without running the imported script. In the parsing phase, the ESM loader can immediately detect a typo in named imports and throw an exception without ever actually running the dependency code.

The ESM module loader then asynchronously downloads and parses any scripts that you imported, and then scripts that your scripts imported, building out a “module graph” of dependencies, until eventually it finds a script that doesn’t import anything. Finally, that script is allowed to execute, and then scripts that depend on that are allowed to run, and so on.

All of the “sibling” scripts in the ES module graph download in parallel, but they execute in order, guaranteed by the loader specification.
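The execution order described above can be sketched as a depth-first walk over the module graph. This is a toy model under simplifying assumptions (no cycles, no top-level await, no actual downloading); the graph and the file names in it are invented for illustration.

```javascript
// Toy model: each key is a module, each value lists its imports in source order.
// The graph and module names are made up for illustration.
const graph = {
  'main.mjs': ['a.mjs', 'b.mjs'],
  'a.mjs': ['shared.mjs'],
  'b.mjs': ['shared.mjs'],
  'shared.mjs': [],
};

function executionOrder(entry) {
  const order = [];
  const visited = new Set();
  const visit = (id) => {
    if (visited.has(id)) return; // each module is evaluated only once
    visited.add(id);
    for (const dep of graph[id]) visit(dep); // dependencies first, in import order
    order.push(id); // then the module itself
  };
  visit(entry);
  return order;
}

const order = executionOrder('main.mjs');
console.log(order); // [ 'shared.mjs', 'a.mjs', 'b.mjs', 'main.mjs' ]
```

The script with no imports (shared.mjs) runs first, the siblings a.mjs and b.mjs run in the order they were imported, and the entry point runs last.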

CJS is the default because ESM changes a lot of stuff

ESM changes a bunch of stuff in JavaScript. ESM scripts use Strict Mode by default (as if they began with "use strict"), their this doesn't refer to the global object, scoping works differently, etc.
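Here is a small sketch of two of those Strict Mode differences, shown at function level so it behaves the same no matter how the file itself is loaded. The function names are made up for this example.

```javascript
// Functions built with the Function constructor are non-strict unless their
// body says otherwise, so a bare call receives the global object as `this`.
const sloppyThis = new Function('return this;');

function strictThis() {
  'use strict';
  return this; // a bare call leaves `this` undefined in strict code
}

function assignUndeclared() {
  'use strict';
  try {
    neverDeclaredAnywhere = 1; // non-strict code would silently create a global
    return 'assigned';
  } catch (err) {
    return err.name;
  }
}

console.log(sloppyThis() === globalThis); // true
console.log(strictThis()); // undefined
console.log(assignUndeclared()); // ReferenceError
```

Code that quietly relied on the non-strict behavior breaks when it is loaded as ESM, which is part of why flipping the default is such a big deal.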

This is why, even in browsers, <script> tags are non-ESM by default; you have to add a type="module" attribute to opt into ESM mode.

Switching the default from CJS to ESM would be a big break in backwards compatibility. (Deno, the hot new alternative to Node, makes ESM the default, but as a result, its ecosystem is starting from scratch.)

CJS can’t require() ESM because of top-level await

The simplest reason that CJS can’t require() ESM is that ESM can do top-level await, but CJS scripts can't.

Top-level await lets us use the await keyword outside of an async function, at the “top level.”

ESM’s multi-phase loader makes it possible for ESM to implement top-level await without making it a “footgun.” Quoting from the V8 team’s blog post:

Perhaps you have seen the infamous gist by Rich Harris which initially outlined a number of concerns about top-level await and urged the JavaScript language not to implement the feature. Some specific concerns were:

• Top-level await could block execution.

• Top-level await could block fetching resources.

• There would be no clear interop story for CommonJS modules.

The stage 3 version of the proposal directly addresses these issues:

• As siblings are able to execute, there is no definitive blocking.

• Top-level await occurs during the execution phase of the module graph. At this point all resources have already been fetched and linked. There is no risk of blocking fetching resources.

• Top-level await is limited to [ESM] modules. There is explicitly no support for scripts or for CommonJS modules.

(Rich now approves of the current top-level await implementation.)

Since CJS doesn’t support top-level await, it’s not even possible to transpile ESM top-level await into CJS. How would you rewrite this code in CJS?

export const foo = await fetch('./data.json');

It’s frustrating, because the vast majority of ESM scripts don’t use top-level await, but, as one commenter wrote in that thread, “I don’t think designing a system with the blanket assumption that some feature just won’t get used is a viable path.”

There’s an active debate on how to require() ESM in this thread. (Please read the whole thread and the linked discussions before commenting. If you dive in, you’ll find that top-level await isn’t even the only problematic case… what do you think happens if you synchronously require ESM which can asynchronously import some CJS which can synchronously require some ESM? What you get is a sync/async zebra stripe of death, that’s what! Top-level await is just the last nail in the coffin, and the easiest to explain.)

Reviewing that conversation, it doesn’t look like we’re going to be able to require() ESM any time soon!

CJS can import() ESM, but it’s not great

For now, if you’re writing CJS and you want to import an ESM script, you’ll have to use asynchronous dynamic import().

(async () => {
	const {foo} = await import('./foo.mjs');
})();

It’s… fine, I guess, as long as you don’t have any exports. If you do need to do some exports, you’ll have to export a Promise instead, which may be a huge inconvenience to your users:

module.exports.foo = (async () => {
	const {foo} = await import('./foo.mjs');
	return foo;

})();
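To see why that is inconvenient, here is a sketch of what consumers end up writing. Everything is squeezed into one file, so `importFoo` is a made-up stand-in for import('./foo.mjs') and `lib` is a stand-in for what require() would hand back from the library.

```javascript
// Stand-in for `import('./foo.mjs')`; the real call would load an ES module.
const importFoo = async () => ({ foo: 'value from foo.mjs' });

// What the CJS library exports: not `foo` itself, but a Promise for it.
const lib = {
  foo: (async () => {
    const { foo } = await importFoo();
    return foo;
  })(),
};

// What every consumer now has to write.
console.log(lib.foo instanceof Promise); // true
lib.foo.then((foo) => {
  console.log(foo); // value from foo.mjs
});
```

Every caller has to unwrap the Promise before touching the value, and that requirement spreads to anything that wants to re-export it.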

ESM can’t import named CJS exports unless CJS scripts execute out of order

You can do this:

import _ from './lodash.cjs'

But you can’t do this:

import {shuffle} from './lodash.cjs'

That’s because CJS scripts compute their named exports as they execute, whereas ESM’s named exports must be computed during the parsing phase.
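Here is a sketch of why the export names can't be known before execution. The module body is written as a plain function so it fits in one file; `lodashLikeModule` and the `FULL_BUILD` flag are invented for illustration.

```javascript
// A CJS-style module body, written as a function so it fits in one file.
// `names` is computed at run time, so no parser can list the exports up front.
function lodashLikeModule(module, env) {
  const names = env.FULL_BUILD ? ['shuffle', 'chunk'] : ['shuffle'];
  for (const name of names) {
    module.exports[name] = (list) => list.slice(); // placeholder implementation
  }
}

// Run the module body and report which names it ended up exporting.
function exportNames(env) {
  const module = { exports: {} };
  lodashLikeModule(module, env);
  return Object.keys(module.exports);
}

const slim = exportNames({});
const full = exportNames({ FULL_BUILD: true });
console.log(slim); // [ 'shuffle' ]
console.log(full); // [ 'shuffle', 'chunk' ]
```

The same source produces different named exports depending on what happens while it runs, which is exactly the information the ESM loader wants before anything runs.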

Fortunately for us, there’s a workaround! The workaround is annoying, but totally doable. We just have to import CJS scripts like this:

import _ from './lodash.cjs';

const {shuffle} = _;

There are no real downsides to this, and ESM-aware CJS libraries can even provide their own ESM wrappers that encapsulate this boilerplate for us.

This is totally fine! I just… wish it were better.

Out-of-order execution would work, but it might be even worse

A number of people have proposed executing CJS imports before ESM imports, out of order. That way, the CJS named exports could be computed at the same time as ESM named exports.

But that would create a new problem.

import {liquor} from 'liquor';

import {beer} from 'beer';

If liquor and beer are both initially CJS, changing liquor from CJS to ESM would change the ordering from liquor, beer to beer, liquor, which would be nauseatingly problematic if beer relied on something from liquor being executed first.
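The flip can be sketched with a toy scheduler. The "CJS first" rule below is my reading of the proposal as described here, not anything Node implements, and `outOfOrder` is a made-up helper.

```javascript
// Hypothetical "CJS first" rule: run all CJS imports, then all ESM imports,
// keeping source order within each group.
function outOfOrder(imports) {
  const cjs = imports.filter((m) => m.kind === 'cjs');
  const esm = imports.filter((m) => m.kind === 'esm');
  return [...cjs, ...esm].map((m) => m.name);
}

const before = outOfOrder([
  { name: 'liquor', kind: 'cjs' },
  { name: 'beer', kind: 'cjs' },
]);

const after = outOfOrder([
  { name: 'liquor', kind: 'esm' }, // liquor has migrated to ESM
  { name: 'beer', kind: 'cjs' },
]);

console.log(before); // [ 'liquor', 'beer' ]
console.log(after); // [ 'beer', 'liquor' ]
```

The importing file is identical in both cases. Only the format of a dependency changed, and the execution order changed with it.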

Out-of-order execution is still under debate, though the conversation seems to have mostly fizzled out a few weeks ago.

Dynamic Modules could save us, but their star is poisoned

There’s an alternative proposal that doesn’t require out-of-order execution or wrapper scripts, called Dynamic Modules.

In the ESM specification, the exporter statically defines all named exports. Under dynamic modules, the importer would define the export names in the import. The ESM loader would initially just trust that dynamic modules (CJS scripts) would provide all required named exports, and then throw an exception later if they didn’t satisfy the contract.

Unfortunately, dynamic modules would require some JavaScript language changes to be approved by the TC39 language committee, and they do not approve.

Specifically, ESM scripts can export * from './foo.cjs', which means to re-export all of the names that foo exports. (This is called a “star export.” 🤩)

Unfortunately, there’s no way for the loader to know what’s being exported when we star export from a dynamic module.

Dynamic-module star exports also create issues for spec compliance. For example, export * from 'omg'; export * from 'bbq'; is supposed to throw when both omg and bbq export the same named export wtf. Allowing the names to be user/consumer-defined means this validation phase needs to be post-handled / ignored somehow.

Proponents of dynamic modules proposed banning star exports from dynamic modules, but TC39 rejected that proposal. One TC39 member referred to this proposal as “syntax poisoning,” as star exports would be “poisoned” by dynamic modules.

This poison star is very angry with you. Credit: seekpng

(In my opinion, we’re already living in a world of syntax poison. In Node 14, named imports are poisoned, and under dynamic modules, star exports would be poisoned. Since named imports are extremely common and star exports are relatively rare, dynamic modules would reduce syntax poison in the ecosystem.)

This may not be the end of the road for dynamic modules. One proposal on the table is for all Node modules to become dynamic modules, even pure ESM modules, abandoning the ESM multi-phase loader in Node. Surprisingly, this would have no user-visible effect, except perhaps slightly worse startup performance; the ESM multi-phase loader was designed for loading scripts over a slow network.

But I don’t feel lucky. The GitHub issue for dynamic modules was recently closed, because there has been no discussion of dynamic modules in the last year.

One more idea is in the air, to make a best-effort attempt to parse CJS modules to detect their exports, but this approach can never work in 100% of cases. (The latest PR works on just 62% of the top 1,000 modules on npm.) Since the heuristics are so unreliable, some members of the Node modules working group are opposed to it.
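To get a feel for why a best-effort approach can't be complete, here is a deliberately naive sketch. It is not the algorithm from that PR, just a regex that I made up to spot the most common assignment patterns.

```javascript
// Toy heuristic: find `module.exports.name =` and `exports.name =` assignments.
// The lookahead skips comparisons such as `exports.name === x`.
function guessExports(source) {
  const pattern = /(?:module\.)?exports\.([A-Za-z_$][\w$]*)\s*=(?!=)/g;
  const names = new Set();
  for (const match of source.matchAll(pattern)) names.add(match[1]);
  return [...names];
}

const staticSource =
  'module.exports.sum = (x, y) => x + y;\nexports.diff = (x, y) => x - y;';
const dynamicSource =
  "for (const k of ['sum', 'diff']) module.exports[k] = impl[k];";

const detected = guessExports(staticSource);
const missed = guessExports(dynamicSource);
console.log(detected); // [ 'sum', 'diff' ]
console.log(missed); // []
```

Both sources export sum and diff when they run, but the second one gives a text scanner nothing to find. A smarter scanner would cover more patterns, but some module will always compute its exports in a way the scanner didn't anticipate.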

ESM can require(), but it’s probably not worth it

require() is not in scope by default in ESM scripts, but you can get it back very easily.

import { createRequire } from 'module';

const require = createRequire(import.meta.url);

const {foo} = require('./foo.cjs');

The problem with this approach is that it doesn’t really help; it’s actually more lines of code than just doing a default import and destructuring.

import cjsModule from './foo.cjs';

const {foo} = cjsModule;

Plus, bundlers like Webpack and Rollup have no idea what to do with this createRequire pattern. So what’s the point?

How to Create a Good “Dual Package” Containing Both CJS and ESM

If you maintain a library today that needs to support CJS and ESM, do your users a favor and follow these guidelines to create a “dual package” that works great in CJS and ESM.

1. Provide a CJS version of your library

This is for the convenience of your CJS users. This also ensures that your library can work in older versions of Node.

(If you’re writing in TypeScript or another language that transpiles to JS, transpile to CJS.)

If your library only provides a default (unnamed) export, you’re done. You don’t need to do anything further if you expect users to write import mylibrary from 'mylibrary'.

If your users expect to use named exports, e.g. import {foo} from 'mylibrary', read on.

2. Provide a thin ESM wrapper for your CJS named exports

(Note that it’s easy to write an ESM wrapper for CJS libraries, but it’s not possible to write a CJS wrapper for ESM libraries.)

import cjsModule from '../index.js';

export const foo = cjsModule.foo;

Put the ESM wrapper in an esm subdirectory, alongside a one-line package.json file that says {"type": "module"}. (You could rename your wrapper file to .mjs instead, and that will work fine in Node 14, but some tools don’t work well with .mjs files, so I prefer to use a subdirectory.)

Avoid double transpiling. If you’re transpiling from TypeScript, you could transpile to both CJS and ESM, but this introduces a hazard that users may accidentally both import your ESM scripts and require() your CJS separately. (For example, suppose one library omg.mjs depends on index.mjs, while another library bbq.cjs depends on index.cjs, and then you depend on both omg.mjs and bbq.cjs.)

Node normally dedupes modules, but Node doesn’t know that your CJS and your ESM are the “same” files, so your code will run twice, keeping two copies of your library’s state. That can cause all kinds of weird bugs.
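Here is a sketch of what "two copies of your library's state" looks like in practice. To keep it in one file, `evaluateLibrary` is a made-up stand-in for your library's top-level code, and each call models Node evaluating one of the two builds.

```javascript
// Stand-in for a library's top-level code; each call models one evaluation of it.
function evaluateLibrary() {
  class Token {}
  const registry = new Set();
  return {
    Token,
    register: (name) => registry.add(name),
    has: (name) => registry.has(name),
  };
}

const cjsCopy = evaluateLibrary(); // what require() of the CJS build would get
const esmCopy = evaluateLibrary(); // what import of the ESM build would get

cjsCopy.register('plugin');
const seenByEsm = esmCopy.has('plugin');
const crossInstance = new cjsCopy.Token() instanceof esmCopy.Token;

console.log(seenByEsm); // false
console.log(crossInstance); // false
```

A plugin registered through one copy is invisible to the other, and instanceof checks fail across the boundary even though the class has the same name and source. A thin ESM wrapper avoids this because it re-exports the one and only CJS instance.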

3. Add an exports map to your package.json

Like this:

"exports": {
    "require": "./index.js",
    "import": "./esm/wrapper.js"
}