Extensionless NodeJS Executable using non-default Node Options

The Issue

Commands installed into system folders should not have extensions. It's not relevant to the users of a command what language it is written in. Operating systems use the shebang line to determine which engine should process a text script file.

$ cat /usr/bin/mycmd
#!/usr/bin/env node
console.log('Hello World');

$ mycmd 
Hello World
$

However, if the JS code in the file requires node to be invoked with an option, there is a problem. Not every OS supported by node accepts options in the shebang line, so the script file has no portable way to tell node which options it needs in order to run as intended.
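To make the portability problem concrete, here is a minimal sketch of how the Linux kernel splits a shebang line: everything after the interpreter path is handed over as one single argument. The function name `linuxShebangArgv` is made up for this illustration, and other operating systems may split the line differently, which is exactly why options in the shebang line cannot be relied on.

```javascript
// Sketch only: models how Linux builds argv from a shebang line.
// linuxShebangArgv is a hypothetical helper, not part of node or any library.
function linuxShebangArgv(shebangLine, scriptPath) {
  if (!shebangLine.startsWith('#!')) throw new Error('not a shebang line');
  const body = shebangLine.slice(2).trim();
  // First token is the interpreter; the whole remainder is ONE optional argument.
  const match = /^(\S+)(?:\s+(.*))?$/.exec(body);
  if (!match) throw new Error('shebang line names no interpreter');
  const [, interpreter, rest] = match;
  return rest === undefined
    ? [interpreter, scriptPath]
    : [interpreter, rest, scriptPath];
}

// Works: env receives the single argument "node".
console.log(linuxShebangArgv('#!/usr/bin/env node', '/usr/bin/mycmd'));

// Fails in practice: env receives the single argument
// "node --input-type=module" and looks for a program with that literal name.
console.log(linuxShebangArgv('#!/usr/bin/env node --input-type=module', '/usr/bin/mycmd'));
```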

The Solution

This is a pattern for resolving this issue. We subclass the node command to make a variant that adopts a different set of defaults. This should not be used for options that a user would likely want to vary from run to run, but only for options that support a different class of script files. When developers run commands with special options, they can invoke node explicitly. Users of system commands, on the other hand, should not need to be aware of the language that the command is implemented in.

$ cat /usr/bin/node-esm
#!/usr/bin/env bash
# "-" makes node read the script from stdin so that the remaining arguments reach the script
node --input-type=module - "${@:2}" < "$1"
$
$ 
$ cat /usr/bin/mycmd
#!/usr/bin/env node-esm
import { adder } from './mylib.mjs'
console.log('now I know that 2+2 is '+adder(2,2));
$
$ cat mylib.mjs 
export function adder(a,b) {return a+b;}
$ 
$ mycmd 
now I know that 2+2 is 4
$

It's good practice to make the new command support the same syntax as the base node command.

Node does not support doing this correctly for modern JS standards in installed JS commands.

The Node team implemented a new option in 12.x to support modern syntax in 'input' source where extensions are not available to indicate the type of the source but, incredibly, they deliberately chose that it should not apply to extensionless input files. See #32316 (or go to the Wikipedia page for "design by committee", which I'm sure just redirects to that and its related issues).

node-esm is required in order to write system commands to the modern JavaScript standard. I personally believe that this could be a growth area for the node project, making node compete favorably with other local application languages that do not have easy access to convenient UI features.

This implementation of node-esm has two flaws that stem from the --input-type node option being implemented inconsistently. First, it does not support the same command line syntax as node. You cannot pass in other options that a user might want to specify orthogonally to the ESM module issue. To do that with the current --input-type behavior, the script would need to repeat node's options processing code in order to identify which token is the input file that should be redirected to node's stdin. That would be significant work, and brittle going forward as node's options change. I think it will be a source of confusion and bugs for people who want to write JS commands but do not want to be indoctrinated into the nodejs web development community lifestyle.
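The brittleness of the first flaw can be sketched in a few lines. To find the script token, a wrapper has to know which options consume the next token as their value. The helper `findScriptIndex` and the set `VALUE_OPTIONS` are hypothetical names for this illustration, and the set is a deliberately tiny subset (`-r` / `--require`); node's real list is much longer and changes between releases.

```javascript
// Illustrative subset of node options whose value is the NEXT token.
// Assumption for this sketch only; node's real list is far longer.
const VALUE_OPTIONS = new Set(['-r', '--require']);

// Returns the index of the script file in args, or -1 if there is none.
function findScriptIndex(args, optionsTakingValue) {
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === '--') return i + 1 < args.length ? i + 1 : -1; // end of node options
    if (!arg.startsWith('-')) return i;      // first non-option token is the script
    if (optionsTakingValue.has(arg)) i++;    // skip over this option's value
  }
  return -1;
}

const args = ['-r', './hook.js', 'mycmd', 'a'];

// With the right table, the script is found at index 2 ("mycmd").
console.log(findScriptIndex(args, VALUE_OPTIONS));

// With a stale or incomplete table, the option's value at index 1
// ("./hook.js") is mistaken for the script.
console.log(findScriptIndex(args, new Set()));
```

Any option missing from the table silently shifts the result, which is why duplicating node's option parsing in a wrapper is fragile.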

Second, the compiler, debuggers, and exception reporting will not report source references as coming from the command script file, because node considers that content an anonymous stdin stream.

If --input-type applied to extensionless input files, the script would simply pass its own command line on to node, prepending the --input-type=module option. That would solve both problems and make node-esm trivial, generic, and unlikely to ever have bug reports filed against it.
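A minimal sketch of that trivial wrapper logic follows. It is hypothetical: it only makes sense if node honored --input-type for an extensionless script file, which, as described above, it currently does not. The function name `nodeEsmArgs` is invented for the illustration.

```javascript
// Hypothetical: assumes node would accept --input-type for an extensionless
// script file. That is the behavior argued for here, not what node does today.
function nodeEsmArgs(userArgs) {
  // Prepend the new default; every other token is forwarded untouched, so
  // node itself keeps parsing its own options and locating the script file.
  return ['--input-type=module', ...userArgs];
}

// The shebang mechanism would run: node-esm /usr/bin/mycmd a b
const forwarded = nodeEsmArgs(['/usr/bin/mycmd', 'a', 'b']);
console.log(forwarded.join(' ')); // --input-type=module /usr/bin/mycmd a b
```

Because the wrapper never inspects the tokens, it needs no knowledge of node's option syntax, and the script file is passed by name so source references would point at it.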

Why?

There is a legit need for a mechanism that allows JS file authors to communicate the standard that their file adheres to.

* The new .mjs and .cjs extensions allow explicit declaration for library source files.
* .js needs to be treated as .cjs by default to support much existing code, but the node team supports the future by using the "type": "module" property in a package folder's package.json to change the default for .js files in that folder hierarchy. (Well done.)
* There is also string source content that does not have extension information:
** stdin stream to node
** the -e string option to node
** an extensionless input file (note that an input file could have an extension, and in that case it is not in this category)
** code eval statements within source files (note that this is never "input" from node's point of view; it's some unknown thing that only the source author knows)

In my opinion the node team got several things wrong on that last point because they are all inside the web development ecosystem, and therefore no one in the long and sordid conversation on the topic brought up the actual structure inherent in the problem domain. It was all anecdotal, about what they or user reports wanted to accomplish.

They apparently equated the long-established nodejs feature of allowing the author to leave out the file extension when importing (aka requiring) a file with actual extensionless input files.

That is just a mistake. They are not the same thing.

import { Foo } from 'foo'
and
$ node [options] foo

The extensionless import line is just a convenience to the author, but it has to resolve at runtime to a file or folder, and no use-case was voiced where that file or folder lacks unambiguous information. A file it resolves to will have an extension because libraries have extensions, and a folder will have a package.json or rely on the default commonjs behavior.

The node invocation line really refers to a file that does not have any extension information. There was no use-case voiced other than the case of commands in an OS not using extensions.

The difference is not ambiguous at all. If the input file has an extension, the author of that file has made their choice and it should be observed.

They grouped String Content sources incorrectly.

** stdin: stdin stream to node
** -eOpt: the -e string option to node
** cmdFile: an extensionless input file
** eval: code eval statements within source files

One of these things is not like the others but the node team chose the wrong one.

They grouped (stdin, -eOpt, eval) as common cases to be controlled by the new --input-type option. They excluded cmdFile.

The correct grouping is (stdin, -eOpt, cmdFile) and eval should be excluded.

(stdin, -eOpt, cmdFile) are fundamentally the same thing: they are ways to pass node the entry point source that it will execute. cmdFile is unique in that group in that the author may name the file with an extension, thus identifying its type.

eval is something else that only the source author knows. The source author should have predictability and control over how it is interpreted, but that is decidedly not the job of --input-type directly. I will refrain from going down that path here, but I will say that eval looks a lot like the situation of importing a .js file.

Is there ambiguity in the case where both --input-type and the cmdFile extension are specified? No. The file extension, if present, is direct information from the author of the file and should be honoured. --input-type specifies the default input type for when that information is not available from the source.

What if --input-type is present and the cmdFile extension is .js? The .js handling is already a contextual default algorithm where the most specific information (the nearest package.json) wins. It would be reasonable to insert the information from --input-type into the contextual chain only for the cmdFile (not other imports), but where in the chain is not that clear. This is such a narrow case that it does not matter unless someone voices a use-case. So let it do whatever the natural code change does and then document it in the --input-type description.
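The precedence argued for above can be written down as a small decision function. This is a sketch of the proposal, not of node's current behavior, and every name in it (`resolveEntryType` and its `ext`, `inputType`, `packageType` parameters) is hypothetical. Since the text leaves open where --input-type would sit in the .js chain, the sketch simply ignores it for .js.

```javascript
// Sketch of the PROPOSED precedence for the entry point source.
// Not node's current behavior; all names are hypothetical.
//   ext         - extension of the entry file, '' for stdin, -e, or no extension
//   inputType   - value of --input-type, or undefined if not given
//   packageType - "type" from the nearest package.json, or undefined
function resolveEntryType({ ext, inputType, packageType }) {
  if (ext === '.mjs') return 'module';    // explicit choice by the file's author
  if (ext === '.cjs') return 'commonjs';  // explicit choice by the file's author
  if (ext === '.js') {
    // Existing contextual default: nearest package.json, else commonjs.
    // Where --input-type would slot in is left open, so it is ignored here.
    return packageType ?? 'commonjs';
  }
  if (ext === '') {
    // Nothing in the source declares its type, so --input-type is the default.
    return inputType ?? 'commonjs';
  }
  throw new Error(`extension outside the scope of this sketch: ${ext}`);
}

console.log(resolveEntryType({ ext: '', inputType: 'module' }));     // module
console.log(resolveEntryType({ ext: '.cjs', inputType: 'module' })); // commonjs
console.log(resolveEntryType({ ext: '.js', packageType: 'module' })); // module
console.log(resolveEntryType({ ext: '' }));                           // commonjs
```

The second call shows the "no ambiguity" point: an explicit extension from the file's author always wins over the command line default.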

bobjunga/extensionless-nodejs-executable.md