Last active
April 19, 2022 20:57
{"name":"hello-world-bespoke-archive-format","version":"1.0.0","main":"lib/index.js"}
console.log('hello, world!')
# hello-world-bespoke-archive-format
An example of a "hello world" program, but instead of being a tarball, it's
shown in the bespoke package format that npm *SHOULD* have used, instead of
tar.
One can be forgiven for not wanting to reinvent the wheel, but let this be a
lesson that, in fact, some wheels _ought_ to be reinvented, when the
alternative does not roll as easily.
{"package.json":[0,86],"lib/index.js":[86,29],"README.md":[115,379]}
00000000000000000000000000000069
AND AnOThER ThiNG!! If you nest these files within one another, then the parent can re-index the files of the children. So bundleDependencies could have something like:
{"name":"module","version":"1.2.3","bundleDependencies":["dep"],"dependencies":{"dep":"1"}}
require('dep').doSomething() // this is module's index.js
{"name":"dep","version":"1.5.4","main":"index.js"}
exports.doSomething = () => console.log('this is dep')
{"package.json":[0,59],"index.js":[59,100]}
00000000000000000000000000000044
{"package.json":[0,99],"index.js":[99,123],"node_modules/dep.npm":[222,192,{"node_modules/dep/package.json":[222,59],"node_modules/dep/index.js":[281,100]}]}
00000000000000000000000000000159
So any `.npm` file that shows up in the archive has its entries automatically added to the index. Want to unpack the bundled deps? Easy peasy! They're right there! Want to just do 1 level? Or just 2 levels? Also easy!
Jesus, I might have to sit down and write this some day just to get it out of my system.
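Here is a rough sketch of how a reader could walk that nested index and stop at a chosen depth. The function name `flattenIndex` and the `levels` parameter are made up for illustration. It also assumes, as in the example index above, that a nested archive's entry carries its children as a third array element with offsets relative to the outermost file.

```javascript
// Hypothetical helper: expand nested .npm entries up to `levels` deep.
// levels = 0 leaves bundled archives as single opaque files;
// levels = 1 replaces each bundled archive with its own files; and so on.
function flattenIndex (index, levels = Infinity) {
  const out = {}
  for (const [file, [start, length, children]] of Object.entries(index)) {
    if (children && levels > 0) {
      // Descend one level and list the child's files instead of the archive
      Object.assign(out, flattenIndex(children, levels - 1))
    } else {
      out[file] = [start, length]
    }
  }
  return out
}
```

With `levels` set to 0 the bundled dep stays as `node_modules/dep.npm`. With 1 it is replaced by `node_modules/dep/package.json` and `node_modules/dep/index.js`.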
This is what I should have done instead of using tar for npm packages.
`package.json` is always first, `JSON.stringify()`-ed with no indentation, with `"name"` and `"version"` moved to the top of the object and a `\n` appended. No other files are mutated in any way.
After the files comes the index: a JSON object with each file path as the key and `[start, length]` as the value, appended with a trailing `\n`.
The last 32 bytes are the size of the index as a zero-padded decimal number. A package can't reach `10^32` bytes, or npm would have other problems anyway. A mere `10^20` puts it outside the range of 64 bit ints, so this is pretty future proof.
No index entries for directories, they're just created implicitly. (No empty directories.)
No inclusion of symlinks at all, so everything is either a file or an implicitly created directory.
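The rules above could be sketched as a packer along these lines. This is only an illustration, not an existing npm API: the `pack` function and its input shape (an object mapping paths to file contents) are assumptions, and it builds everything in memory.

```javascript
// Hypothetical packer for the format described above.
// files: { 'package.json': string|Buffer, 'lib/index.js': string|Buffer, ... }
function pack (files) {
  // package.json is the only file that gets rewritten: name and version
  // first, no indentation, trailing newline.
  const { name, version, ...rest } = JSON.parse(String(files['package.json']))
  const manifest = Buffer.from(JSON.stringify({ name, version, ...rest }) + '\n')

  const chunks = [manifest]
  const index = { 'package.json': [0, manifest.length] }
  let offset = manifest.length

  for (const [file, body] of Object.entries(files)) {
    if (file === 'package.json') continue
    const buf = Buffer.from(body) // stored byte for byte
    index[file] = [offset, buf.length]
    chunks.push(buf)
    offset += buf.length
  }

  // Index as JSON plus "\n", then its byte length as 32 zero-padded digits
  const indexBuf = Buffer.from(JSON.stringify(index) + '\n')
  const size = String(indexBuf.length).padStart(32, '0')
  return Buffer.concat([...chunks, indexBuf, Buffer.from(size)])
}
```

Directories never show up here because only file paths are keys, which is the "created implicitly" part.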
To unpack, read the size of the index by looking at the last 32 bytes of the file and casting the string as a number. If it doesn't match `/^[0-9]{32}$/`, the file is bad.
Then seek to that many bytes + 32 from the end, which is the start of the index, and read the index from there. Parse that as JSON. If it's not an object, then the file is bad.
If the file doesn't start with `{"name":"`, it's bad. If it doesn't match `{"name":"<valid name>","version":"<valid semver>"[,}]`, it's bad.
Check each entry to ensure that the start value is the `start+length` of some other value. If there are any gaps, or if the final `start+length` doesn't match the start of the index, or if the `package.json` doesn't have a start of 0, the file is bad.
Then just spit the files out to disk. No need for mode: if a file is a package.json `bin`, it gets 0o755, otherwise it's 0o644.
The index arrays for each file can be extended to provide a signature for each file, or a synthetic file could be created at the end with the signatures, and another synthetic file for a signature of the set of signatures. It's very extensible in the ways we would have wanted, and not at all extensible in the ways that have caused so many headaches.
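The unpack checks could look roughly like the sketch below. Everything named here (`unpack`, `bad`, `MANIFEST_START`) is hypothetical. The name and semver pattern is a deliberately loose stand-in for real validation, and instead of writing to disk it returns the files with the mode each one would get.

```javascript
const path = require('path')

const bad = why => { throw new Error('bad archive: ' + why) }

// Loose stand-in for "<valid name>" and "<valid semver>" (assumption).
const MANIFEST_START = /^\{"name":"[^"]+","version":"\d+\.\d+\.\d+[^"]*"[,}]/

function unpack (buf) {
  // Last 32 bytes: index size as a decimal string
  const sizeStart = buf.length - 32
  const tail = buf.subarray(Math.max(sizeStart, 0)).toString('latin1')
  if (!/^[0-9]{32}$/.test(tail)) bad('index size is not 32 digits')
  const indexStart = sizeStart - Number(tail)
  if (indexStart < 0) bad('index size is larger than the file')

  // The index must be a JSON object
  let index
  try {
    index = JSON.parse(buf.subarray(indexStart, sizeStart).toString())
  } catch (er) {
    bad('index is not JSON')
  }
  if (index === null || typeof index !== 'object' || Array.isArray(index)) {
    bad('index is not an object')
  }

  // The file must open with the normalized manifest (covers the {"name":" check)
  const nl = buf.indexOf(0x0a)
  const firstLine = buf.subarray(0, nl === -1 ? 0 : nl).toString()
  if (!MANIFEST_START.test(firstLine)) bad('manifest is not first')

  // Entries must tile the file: package.json at 0, no gaps, ending at the index
  const entries = Object.entries(index)
  for (const [file, loc] of entries) {
    if (!Array.isArray(loc) || !Number.isInteger(loc[0]) ||
        !Number.isInteger(loc[1]) || loc[1] < 0) bad('bad entry for ' + file)
  }
  if (index['package.json']?.[0] !== 0) bad('package.json must start at 0')
  entries.sort((a, b) => a[1][0] - b[1][0] || a[1][1] - b[1][1])
  let expected = 0
  for (const [file, [start, length]] of entries) {
    if (start !== expected) bad('gap or overlap before ' + file)
    expected = start + length
  }
  if (expected !== indexStart) bad('files do not end at the index')

  // Mode comes from package.json "bin" and nothing else
  const pkg = JSON.parse(buf.subarray(0, index['package.json'][1]).toString())
  const binPaths = typeof pkg.bin === 'string' ? [pkg.bin] : Object.values(pkg.bin || {})
  const bins = new Set(binPaths.filter(p => typeof p === 'string')
    .map(p => path.posix.normalize(p)))

  return entries.map(([file, [start, length]]) => ({
    file,
    mode: bins.has(file) ? 0o755 : 0o644,
    data: buf.subarray(start, start + length)
  }))
}
```

Writing the result out is then a loop over the returned entries, creating parent directories as needed.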
You can also do stuff like just read the first line of the file to get the package manifest, if that's all you care about. Or easily create a module system that can pull files out on demand, or not at all.
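For the manifest-only case, something like this sketch would do. `readManifest` and its `chunkSize` parameter are hypothetical names. It relies on the fact that `JSON.stringify()` output never contains a raw newline, so the first `\n` byte in the file marks the end of `package.json`.

```javascript
const fs = require('fs')

// Hypothetical helper: read only as much of the archive as the first line needs.
function readManifest (archivePath, chunkSize = 4096) {
  const fd = fs.openSync(archivePath, 'r')
  try {
    const chunks = []
    const chunk = Buffer.alloc(chunkSize)
    let position = 0
    while (true) {
      const n = fs.readSync(fd, chunk, 0, chunkSize, position)
      if (n === 0) throw new Error('no newline found, not a valid archive')
      const nl = chunk.subarray(0, n).indexOf(0x0a)
      if (nl !== -1) {
        // Found the end of the first line: everything before it is the manifest
        chunks.push(Buffer.from(chunk.subarray(0, nl)))
        return JSON.parse(Buffer.concat(chunks).toString())
      }
      // Copy, because `chunk` is reused on the next read
      chunks.push(Buffer.from(chunk.subarray(0, n)))
      position += n
    }
  } finally {
    fs.closeSync(fd)
  }
}
```

No index lookup and no seeking to the end of the file, which is the whole point of putting `package.json` first.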
Tar was a bad choice.