@justinrlle
Last active March 7, 2019 17:33

Hey guys, I've been thinking about this issue lately, so I thought I would share my thoughts with you, in the form of a draft RFC. I'm not used to writing long and complex texts in English, so the wording might be off. If something is not clear, please say so and I'll gladly improve whatever is needed.


  • Feature Name: cargo_artifacts_definition

Summary

Add a key to the Cargo.toml file to specify a tree of URLs that map to files associated with the release of a crate.

This structure can then be used by cargo install to know where to download prebuilt binaries as a first step, with possible later improvements such as shell completions, licenses, or man pages, while not being tied to cargo install.

This proposal only covers the change to the Cargo.toml format, not any change to the cargo install subcommand, but it is a prerequisite for some modifications to the cargo install subcommand.

Motivation

Facilitating release of binaries, while allowing further improvements.

Facilitating release of binaries, because for the time being, asking crates.io to have a CI infra to build the binaries for each crate on release is too much to ask.

Allowing further improvements, because the current situation is not fixed at all.

I think that everybody here wants the ability to download prebuilt binaries for a lot of tools, or to offer that possibility for their own tools. To me, it feels particularly useful for cargo plugins designed to be used in CI (to avoid long compilation times), but also for easy installation of a lot of tools. I don't think that tools like ripgrep would benefit that much from this proposal, because when it comes to binary distribution, nothing beats "native" distribution (using the system package manager of each system), and ripgrep already has all that integrated. But it's up to its maintainer to express himself on the matter, of course.

Guide level explanation

The key package.artifacts (to bikeshed) contains a tree structure, where leaves are URLs pointing to artifacts linked to this version of the crate, and where nodes are arbitrary labels.

For example, a key binaries (name to bikeshed; see note 1) may be interpreted by cargo install as a mapping from a target to the URL of an archive containing the binaries built from this crate for that target.

[package.artifacts.binaries]
x86_64-apple-darwin = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-pc-windows-msvc.zip'
x86_64-unknown-linux-gnu = 'https://github.com/japaric/trust/releases/download/v0.1.2/trust-v0.1.2-x86_64-unknown-linux-gnu.tar.gz'
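To make the intended use of this table concrete, here is a minimal sketch of how a tool could consume it. This is not how cargo install works today: the names InstallPlan and plan_install are hypothetical, and falling back to a source build when no archive is listed for the target is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// What an installer could decide to do for one target (hypothetical type).
#[derive(Debug, PartialEq)]
enum InstallPlan<'a> {
    /// A prebuilt archive is listed for the target: download it.
    Download(&'a str),
    /// Nothing is listed for the target: assumed fallback is a normal build.
    BuildFromSource,
}

/// Look up `target` in the `binaries` table (target triple -> archive URL).
fn plan_install<'a>(binaries: &'a BTreeMap<String, String>, target: &str) -> InstallPlan<'a> {
    match binaries.get(target) {
        Some(url) => InstallPlan::Download(url.as_str()),
        None => InstallPlan::BuildFromSource,
    }
}
```

With the table above, a lookup for x86_64-apple-darwin would yield the darwin archive URL, while a target that is not listed would lead to a regular source build.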

Why in Cargo.toml?

And why not add some metadata while doing a cargo publish, like adding another file, describing the artifacts?

Well, that would be yet another manifest, and Cargo.toml already is one. Also, Cargo.toml is not tied to the release process, which allows cargo install --git GIT_URL to work.

What if the hosting service goes down?

A solution might be for crates.io to download the artifacts, host them itself, and expose them through its API. I'm not sure we can ask that much from crates.io in terms of storage capacity. To get an idea of the load it might represent, the cargo-edit .crate file (what's hosted on crates.io) weighs 27 KB, while the smallest of the three binaries it produces, cargo-rm, weighs 2.17 MB, roughly 80 times more (Windows machine; the binary must have been compiled a long time ago, so maybe it's better now, but the point still holds). The three binaries together weigh 11.5 MB, more than 400 times the size of the .crate file.

I still think that the artifacts information must be exposed by the crates.io API.

What about features?

To me, the current feature system of Cargo is indeed powerful and useful, but there are still a lot of rough edges, and I think that some redesign is needed, so any design of this current proposal around features must be extremely cautious. I see three solutions on how to manage them:

1) Have a key for each feature combination

Have a key in package.artifacts for each feature combination that will be exposed (you'll want to compile your binary for each combination anyway).

Have three special keys:

  • default-features, which maps to the default set of features, and will be chosen if no features are chosen
  • no-features, which maps to default-features = false
  • any-features (name to bikeshed), which is the fallback. So, if the default config file is the same for all sets of features, you can put it in any-features, and it will be downloaded. But if one specific set of features changes the default config file, you can set that config file in the artifacts of that set of features, and it will override the fallback.

Here, the key any-features is strictly optional, and aims to reduce the complexity of the resulting config. I don't think that default-features can be made the fallback, because I don't see how one could specify that an artifact is relevant to the default-features set, but not to another set of features.
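The fallback-and-override behaviour described above can be sketched as follows. This is only an illustration under simplifying assumptions: each feature key maps to a flat table of artifact name to URL (nested tables such as binaries are left out), and the names Artifacts and resolve_artifacts are hypothetical.

```rust
use std::collections::BTreeMap;

/// Flat view of one entry: artifact name -> URL (simplification of the tree).
type Artifacts = BTreeMap<String, String>;

/// Start from the optional `any-features` fallback, then let the entry for
/// the chosen feature key override any artifact it also defines.
fn resolve_artifacts(table: &BTreeMap<String, Artifacts>, feature_key: &str) -> Artifacts {
    // A missing `any-features` entry simply means an empty fallback.
    let mut resolved = table.get("any-features").cloned().unwrap_or_default();
    if let Some(specific) = table.get(feature_key) {
        for (name, url) in specific {
            // Same artifact name in both places: the specific entry wins.
            resolved.insert(name.clone(), url.clone());
        }
    }
    resolved
}
```

Under this reading, a man page declared only in any-features is kept for every feature key, while a default-config declared in both places comes from the more specific entry.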

Example:

# inline tables are kept on a single line, without trailing commas, to stay valid TOML
[package.artifacts.default-features]
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'


[package.artifacts.'postgres,pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'

[package.artifacts.any-features]
manpage = '...'

Drawback: the string representing the features is hard to read, whereas TOML has an array type that we could use instead. Also, special keys have to be added, which could conflict with existing features.

Simpler variant: forbid keys that are combinations of features, so each key names exactly one feature. Don't allow features to override others, but keep the any-features key. To cover a combination, one adds a feature called pg-pretty-print = ['postgres', 'pretty-print'] and uses it as the artifact key: [package.artifacts.pg-pretty-print]

2) Make package.artifacts an array, and ask for a key defining the set of features

This is a bit easier on the eye, at least for finding the set of features, and there is no need for special keys. But the syntax can be more verbose: one needs nested arrays of tables to avoid specifying the whole artifact tree at once.

Example

[[package.artifacts]] # no feature-set key: inferred as the default feature set
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'

[[package.artifacts]]
feature-set = ['postgres', 'pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'

[[package.artifacts]]
any-features = true
manpage = '...'

# alternative: with a table nested in the array of tables, which is not ideal to me
[[package.artifacts]]
feature-set = ['postgres', 'pretty-print']
[package.artifacts.binaries]
x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip'
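One way to read the array form is that each entry is tested against the features the user asked for. The sketch below is only a possible interpretation: ArtifactEntry and entry_applies are hypothetical names, feature sets are compared without regard to order (an assumption), and "no features requested" is modelled as None.

```rust
use std::collections::BTreeSet;

/// One `[[package.artifacts]]` entry, reduced to the keys used for matching.
struct ArtifactEntry {
    /// `None` when the entry has no `feature-set` key (default feature set).
    feature_set: Option<Vec<String>>,
    /// `true` when the entry sets `any-features = true`.
    any_features: bool,
}

/// Decide whether `entry` applies to the requested features.
/// `requested` is `None` when the user did not pick any feature.
fn entry_applies(entry: &ArtifactEntry, requested: Option<&[&str]>) -> bool {
    if entry.any_features {
        return true; // fallback entries always apply
    }
    match (&entry.feature_set, requested) {
        (None, None) => true, // default entry, default request
        (Some(declared), Some(req)) => {
            // Compare as sets so that the order of features does not matter.
            let declared: BTreeSet<&str> = declared.iter().map(String::as_str).collect();
            let req: BTreeSet<&str> = req.iter().copied().collect();
            declared == req
        }
        _ => false,
    }
}
```

With this reading, the entry declaring ['postgres', 'pretty-print'] would match a request for pretty-print and postgres in either order, and the any-features entry would be picked up alongside it.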

3) Name sets of features, and make each name specify its set of features

This is a merge of the two previous solutions. Basically, you keep the same structure as in the first solution, but the keys of the package.artifacts table have no meaning of their own: each is just a name, and you must specify the feature set just like in the second solution.

Example

# no feature-set key: inferred as the default feature set
[package.artifacts.default]
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'

[package.artifacts.pg-pretty-print]
feature-set = ['postgres', 'pretty-print']
binaries = { x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz', x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip' }
default-config = '...'

[package.artifacts.any]
any-features = true
manpage = '...'

# alternative: the pg-pretty-print entry written with a nested table
[package.artifacts.pg-pretty-print]
feature-set = ['postgres', 'pretty-print']

[package.artifacts.pg-pretty-print.binaries]
x86_64-apple-darwin = 'https://...-x86_64-apple-darwin.tar.gz'
x86_64-pc-windows-msvc = 'https://...-x86_64-pc-windows-msvc.zip'

Other possibilities

Another possibility is to allow defining artifacts for each feature; if you choose several features, their artifacts are combined as long as they do not conflict. But you still need to define how conflicts are handled. In short, I think that the feature system needs a big overhaul so that its full power can be exposed without too much friction.

Personally, I would go for either the simpler variant of the first solution or the third solution. I think I prefer the third solution, maybe without any-features, because it provides a kind of abstraction over the feature system that can be surfaced without too much friction. For example, with cargo install: cargo install some-cli --feature-set pg-pretty-print, which would make it possible to expose only cohesive feature sets. Well, that would still need some modifications to the feature system, but I think it's the most forward-compatible one.
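As a rough illustration of what resolving such a --feature-set name could look like with the third solution, here is a minimal sketch. Both the flag and the code are hypothetical: NamedArtifacts and resolve_named are made-up names, and an entry without a feature-set key is read as "build with the default features", returned here as an empty list.

```rust
use std::collections::BTreeMap;

/// One named entry under `package.artifacts`, as in the third solution.
struct NamedArtifacts {
    /// `None` when the entry has no `feature-set` key (default feature set).
    feature_set: Option<Vec<String>>,
    /// Target triple -> archive URL.
    binaries: BTreeMap<String, String>,
}

/// Resolve a feature set name and a target into the features it stands for
/// and the archive to download. Unknown names are rejected, so only the sets
/// the author chose to expose can be requested.
fn resolve_named<'a>(
    sets: &'a BTreeMap<String, NamedArtifacts>,
    name: &str,
    target: &str,
) -> Result<(&'a [String], &'a str), String> {
    let entry = sets
        .get(name)
        .ok_or_else(|| format!("unknown feature set `{}`", name))?;
    let url = entry
        .binaries
        .get(target)
        .ok_or_else(|| format!("no prebuilt binary for `{}` in feature set `{}`", target, name))?;
    // Assumption: an empty list stands for the default feature set.
    let features: &[String] = entry.feature_set.as_deref().unwrap_or(&[]);
    Ok((features, url.as_str()))
}
```

The point of the design is that the name is the only thing a user passes, so a combination of features that has no entry simply cannot be requested as a prebuilt binary.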

Reference-level explanation

TODO

Drawbacks

Complexity of the resulting Cargo.toml. Not to be underestimated.

There is absolutely no guarantee out of the box that the referenced artifacts were indeed built from the crate. It might be problematic for crates.io, or any registry, to expose in some way files whose authenticity it can't verify or attest.

For me, that's the biggest drawback I can think of for now. I think it's a serious one, and it must be considered carefully.

1: This RFC shall not define that key, it's just an example for now.

