Hey guys, I've been thinking about this issue lately, so I thought I would share my thoughts with you in the form of a draft RFC. I'm not used to writing long and complex texts in English, so the wording might be off; if something is not clear, please say so and I'll gladly improve whatever is needed.
- Feature Name: cargo_artifacts_definition
Add a key to the Cargo.toml file to specify a tree of URLs that map to files associated with a release of a crate.
This structure can then be used by cargo install to know where to download prebuilt binaries as a first step, with possible later improvements such as shell completions, licenses, or man pages, while not being tied to cargo install.
This proposal only covers the change to the Cargo.toml format, not any change to the cargo install subcommand, although it is a prerequisite for some changes to that subcommand.
Facilitating the release of binaries, while allowing further improvements.
Facilitating the release of binaries, because for the time being, asking crates.io to run a CI infrastructure that builds the binaries of each crate on release is too much to ask.
Allowing further improvements, because the current situation is not settled at all.
I think that everybody here wants the ability to download prebuilt binaries for a lot of tools, or to offer that possibility for their own tools. To me, it feels particularly useful for cargo plugins designed to be used in CI (to avoid long compilation times), but it would also ease the installation of a lot of tools. I don't think that tools like ripgrep would benefit that much from this proposal, because when it comes to binary distribution, nothing beats "native" distribution (using the system package manager of each system), and ripgrep already has all that integrated. But it's up to its maintainer to weigh in on the matter, of course.
The key package.artifacts (to bikeshed) contains a tree structure, where leaves are URLs pointing to artifacts linked to this version of the crate, and where nodes are arbitrary labels.
For example, a key binaries (name to bikeshed, see note 1) may be interpreted by cargo install as a mapping between a target and a URL where an archive containing the binaries obtained by building this crate for that target can be found.
[package.artifacts.binaries]
x86_64-apple-darwin = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-pc-windows-msvc.zip'
x86_64-unknown-linux-gnu = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-pc-unknown-linux-gnu.tar.gz'
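Because nodes are arbitrary labels, the same tree could carry other kinds of artifacts next to the binaries table above. A minimal sketch, assuming hypothetical manpage, license and completions labels and placeholder example.com URLs; none of these names are defined by this proposal:

```toml
# Hypothetical leaves directly under package.artifacts
[package.artifacts]
manpage = 'https://example.com/some-cli/v0.1.0/some-cli.1'
license = 'https://example.com/some-cli/v0.1.0/LICENSE'

# Hypothetical node: one completion script per shell
[package.artifacts.completions]
bash = 'https://example.com/some-cli/v0.1.0/some-cli.bash'
zsh = 'https://example.com/some-cli/v0.1.0/_some-cli'
```

As with binaries, how a tool would interpret each label is outside the scope of this draft.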
And why not add some metadata while doing a cargo publish, like adding another file describing the artifacts?
Well, that's another manifest, and Cargo.toml already is one. Also, keeping the information in Cargo.toml means it is not tied to the release process, which allows cargo install --git GIT_URL to work.
A solution might be for crates.io to download the artifacts, host them itself, and expose them through its API. I'm not sure we can ask that much from crates.io in terms of storage capacity. To get an idea of the load it might represent, the cargo-edit .crate file (what's hosted on crates.io) weighs 27 KB, while the smallest of the three binaries it produces, cargo-rm, weighs 2.17 MB, about 80 times more (Windows machine; the binary must have been compiled a long time ago, so maybe it's better now, but the point still holds). The three binaries together weigh 11.5 MB.
I still think that the artifact information must be exposed by the crates.io API.
To me, the current feature system of Cargo is indeed powerful and useful, but it still has a lot of rough edges, and I think that some redesign is needed, so any design of this proposal around features must be extremely cautious. I see three solutions for handling them:
First solution: have a key in package.artifacts for each feature combination that will be exposed (you'll want to compile your binary for each combination anyway).
Have special keys:
- default-features, which maps to the default set of features and is chosen if no features are selected
- no-features, which maps to default-features = false
- any-features (name to bikeshed), which is the fallback. So, if the default config file is the same for all sets of features, you can put it in any-features, and it will be downloaded. But if one specific set of features changes the default config file, you can set that config file in the artifacts of that set of features, and it will override the fallback.
Here, the key any-features is strictly optional and aims to reduce the complexity of the resulting config. I don't think that default-features can be made the fallback, because I don't see how one could then specify that an artifact is relevant to the default-features set but not to another set of features.
[package.artifacts.default-features]
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[package.artifacts.'postgres,pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[package.artifacts.any-features]
manpage = '...'
Drawback: the string representing the features is hard to read, and we can't use the array type of TOML. Also, special keys have to be added, which could conflict with existing features.
Simpler variant: forbid keys that are combinations of features; each key must be exactly one feature. Don't allow features to override others, but keep the any-features key. So one needs to add a feature called pg-pretty-print = ['postgres', 'pretty-print'], and set the artifact key to pg-pretty-print: [package.artifacts.pg-pretty-print]
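To make the simpler variant concrete, here is a minimal sketch of a manifest fragment. The postgres and pretty-print feature definitions are assumptions for illustration, and the elided URLs are kept as placeholders:

```toml
[features]
postgres = []
pretty-print = []
# Umbrella feature whose only purpose is to name the combination
pg-pretty-print = ['postgres', 'pretty-print']

# Exactly one feature per artifact key
[package.artifacts.pg-pretty-print]
default-config = '...'

[package.artifacts.pg-pretty-print.binaries]
x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip'

# Fallback shared by every feature
[package.artifacts.any-features]
manpage = '...'
```

The cost is one extra entry in the features table for every combination that ships artifacts.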
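Second solution: make package.artifacts an array of tables, where each entry declares the features it applies to with a feature-set key. This is a bit easier on the eye, at least for finding the set of features, and there is no need for special keys. But the syntax can be more verbose: one needs to use a nested array of tables if one does not want to specify the whole artifact tree at once.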
[[package.artifacts]] # no feature-set key: the default feature set is inferred
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[[package.artifacts]]
feature-set = ['postgres', 'pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[[package.artifacts]]
any-features = true
manpage = '...'
# with nested array of tables, which is not ideal to me
[[package.artifacts]]
feature-set = ['postgres', 'pretty-print']
[package.artifacts.binaries]
x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip'
Third solution: this is a merge of the two previous solutions. Basically, you keep the same structure as in the first solution, but the keys of the package.artifacts table have no meaning (each is just a name), and you must specify the feature set just like in the second solution.
# no feature-set key: inferred as the default feature set
[package.artifacts.default]
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[package.artifacts.pg-pretty-print]
feature-set = ['postgres', 'pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'
[package.artifacts.any]
any-features = true
manpage = '...'
# alternative spelling of the pg-pretty-print entry, with a nested table
[package.artifacts.pg-pretty-print]
feature-set = ['postgres', 'pretty-print']
[package.artifacts.pg-pretty-print.binaries]
x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip'
Another possibility is to allow defining artifacts for each feature; if you select several features, their artifacts add up as long as they do not conflict. But you still need to define how conflicts are handled. In short, I think that the feature system needs a big overhaul so that its full power can be exposed without too much friction.
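A minimal sketch of what per-feature artifacts could look like, to show where a conflict would come from. The layout and the elided values are assumptions, not part of the proposal:

```toml
# Hypothetical: one table per single feature
[package.artifacts.postgres]
default-config = '...'   # config tuned for postgres

[package.artifacts.pretty-print]
manpage = '...'
default-config = '...'   # a different config: clashes with postgres when both are selected
```

Selecting only postgres or only pretty-print is unambiguous. Selecting both would add up to one manpage and two candidates for default-config, which is the case that still needs a rule.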
Personally, I would go for either the simpler variant of the first solution or the third solution. I think I prefer the third solution, maybe without any-features, because it builds a kind of abstraction over the feature system that can be surfaced without too much friction. For example, with cargo install: cargo install some-cli --feature-set pg-pretty-print, and it would allow exposing only cohesive feature sets. Well, that would still need some modifications to the feature system, but I think it's the most forward-compatible one.
TODO
Complexity of the resulting Cargo.toml. Not to be underestimated.
Absolutely no guarantee out of the box that the artifacts pointed to are indeed built from the crate. It might be problematic for crates.io, or any registry, to expose in some way files whose authenticity it can't verify or attest.
That's the biggest drawback I can think of for now; I think it's a serious one and it must be considered carefully.
1: This RFC shall not define that key; it's just an example for now.