Notes initially taken by Niko Matsakis and lightly edited by Ryan Levick
- Introductions
- Cargo inside large build systems
- FFI
- Foundations and financial support
- Joe, Microsoft, Seattle Rust Meetup
- Tom at Mozilla, using Rust for sync
- Lena at Mozilla, sync storage etc
- Jack Moffitt at FB, Libra team
- Brian Anderson at PingCAP
- acrichto
- erickt
- dtolnay, David Tolnay
- Raj Vengalil, Azure IoT
- cuviper, Red Hat
- Rain, FB
- Jeremy F, FB
- Manish
- Ben, Google
- Philip, Qumulo, Rust dev tools + infra
- Remi, Qumulo
- Sebastian, MS, pushing for Rust adoption from sec pov
- Thomas Ekerd, MS, site reliability engineer
- James, MS
- Brandon Williams, FB
- JR, Mozilla backend services
- Phil
- Will, crash ingestion mozilla
- Stjepan, Ferrous Systems
-
FB dev env -- backend services repo -- is mostly C++ and Java. Very polyglot environment. Glued together with Buck, FB's counterpart to Bazel.
- Buck: Language agnostic. Supports Rust.
- rustc drops in quite nicely, basically equivalent to C++ compiler.
- wanted to use cargo but it just does too much to fit in
- need to delineate parts of cargo that are desired from those that conflict with Buck
- ecosys is big advantage for Rust but hard to separate from cargo
- current scheme:
- big cargo.toml including all the things used in internal repo
- cargo builds artifacts that are presented to buck
- buck can link against those
- reasonably successful
- but approaching 700 crates in transitive dep graph, getting very cumbersome to rebuild etc
- plus pinned to a specific version of compiler (prebuilt artifacts)
- works ok but build.rs build scripts are a big complication
- specific cargo pain points:
- build scripts
- "features" feature
- a lot of crates don't use features the way they're intended -- they're used for exclusive A or B choices
- this creates the possibility to break the build
- need some sort of "cfg" feature that represents forks of a crate
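One way a crate can at least fail loudly when features are used as an exclusive A-or-B choice is sketched below. The feature names `backend_a` and `backend_b` are hypothetical and not taken from any crate discussed here; the sketch only illustrates the failure mode, since Cargo unifies features across the dependency graph and two dependents can end up enabling both.

```rust
// Hypothetical feature names; this is an illustration, not any real crate's setup.
// If two dependents enable different "exclusive" features, Cargo turns on both,
// so reject that combination with a clear message at compile time.
#[cfg(all(feature = "backend_a", feature = "backend_b"))]
compile_error!("features `backend_a` and `backend_b` are mutually exclusive");

#[cfg(feature = "backend_a")]
pub fn backend_name() -> &'static str {
    "a"
}

// Only selected when `backend_a` is off, so the conflict case produces
// exactly one error (the compile_error! above) instead of a duplicate definition.
#[cfg(all(feature = "backend_b", not(feature = "backend_a")))]
pub fn backend_name() -> &'static str {
    "b"
}

// Fallback when neither feature is enabled.
#[cfg(not(any(feature = "backend_a", feature = "backend_b")))]
pub fn backend_name() -> &'static str {
    "default"
}
```

This does not make the features additive; it only turns a confusing downstream breakage into an explicit error.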
-
Google does a similar thing for Fuchsia
- cargo builds 3rd party artifacts, normal build consumes those
- problems:
- handful of 3rd party artifacts depend on things built in tree
- want to be able to do partial builds, e.g. w/o a feature, or just for some targets
- developing for a new OS, so we compile some code for host, some for target
- presently do 2 full builds, but it's a pain
- don't have as much control over the flags getting passed to rustc as we'd like
- dep flags + linker flags aren't as specific as we need to distribute deps that are needed for indiv targets
- prototype using "cargo raze" (use "gn" (from Chrome) to generate ninja files)
- based on a modification of cargo raze that generates bazel build files
- has its own handling of build.rs stuff
- rather than outputting build files, it outputs a json format that could be the basis for the proposed "cargo build plans" feature
- would be good to know what inputs etc are needed, how this would fit for Buck
- can Buck consume internal files?
- gn is aware of the concept of a Rust target
-
Qumulo build system:
- doesn't use cargo, invokes rustc directly
- cargo just builds json
- build all deps as shared libraries, whether or not they want that
- `.so` libraries, `.rmeta` files
- hits a lot of problems
- ran into problems, notably lack of support for build.rs -- have to reimpl cargo
- building for 2 different targets
- have own platform
- linux target for procedural macros
- need sometimes to pass flags that are target specific, build a target config map
- would prefer to use cargo
-
does cargo raze support build.rs?
- has some builtin support for build.rs?
- not automatic: you declare purpose of build.rs
- things that do rustc version detection?
- sometimes you want to (e.g.) disable build.rs that supply native deps which come from bazel
-
why can't you run build.rs as part of the build tool?
- fundamental problems:
- no declared inputs, no declared outputs
- buck/bazel etc has to know what files the build script is consuming, producing etc
- also, they are arbitrary execution, which can be a security concern
- proc macros have some similar concerns.
- e.g., pest which looks at cargo source dir env variable and finds your grammar def'n file
- doesn't fit well
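For context, Cargo does offer a partial way for a build script to declare inputs: printing `cargo:rerun-if-changed=PATH` and `cargo:rerun-if-env-changed=VAR` lines to stdout. A minimal sketch of a helper that produces those lines follows; the helper name and the example paths are assumptions for illustration. These directives only control when Cargo reruns the script. They do not declare outputs and nothing enforces them, which is why they fall short of what buck/bazel need.

```rust
/// Builds the lines a build script would print to stdout to tell Cargo
/// which files and environment variables it depends on.
/// (Helper name is hypothetical; the directive strings are Cargo's.)
pub fn declared_input_directives(files: &[&str], env_vars: &[&str]) -> Vec<String> {
    let mut lines = Vec::new();
    for file in files {
        // Cargo reruns the build script when this path changes.
        lines.push(format!("cargo:rerun-if-changed={}", file));
    }
    for var in env_vars {
        // Cargo reruns the build script when this environment variable changes.
        lines.push(format!("cargo:rerun-if-env-changed={}", var));
    }
    lines
}
```

In a `build.rs`, each returned line would be passed to `println!` from `main`.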
-
one thing that was discussed years ago:
- capability system for build.rs that restrict what scripts can do
- e.g., read from this directory, write to that one
- cargo can then audit/sandbox to enforce said rules
- run build script in a sandbox
- e.g. crosvm has an impl of this inside of chrome; all crosvm devices run in their own jail
- nontrivial engineering effort
- could do at a higher level, sandbox
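To make the capability idea concrete, here is a rough sketch of what a declared capability set and an audit check could look like. Everything in it (the `BuildScriptCapabilities` type, its fields, the method names) is hypothetical; no such feature exists in cargo today. The check is purely lexical, so real enforcement would still need a sandbox.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical declaration of what a build script is allowed to touch.
pub struct BuildScriptCapabilities {
    pub read_dirs: Vec<PathBuf>,
    pub write_dirs: Vec<PathBuf>,
}

impl BuildScriptCapabilities {
    /// True if `path` lies under one of the declared readable directories.
    /// `Path::starts_with` compares whole components, so `grammars/` does
    /// not match a declared `grammar/`. It does not resolve `..` or symlinks.
    pub fn may_read(&self, path: &Path) -> bool {
        self.read_dirs.iter().any(|dir| path.starts_with(dir))
    }

    /// True if `path` lies under one of the declared writable directories.
    pub fn may_write(&self, path: &Path) -> bool {
        self.write_dirs.iter().any(|dir| path.starts_with(dir))
    }
}
```

A build tool could use such a declaration both to audit a script and to configure the sandbox it runs in.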
-
jeremy: build scripts classified into 3 or 4 distinct types, is this complete?
- doing codegen. read a file, bindgen, etc
- gateway to some other library: use pkg-config or something similar to find the library, or build it from source
- feature detection on rustc
- "scary ones" -- database reads, timestamps
-
plausibly could address those use cases in other ways
- feature detection is an obvious one, e.g. we had an rfc for compiler versions
-
version compat is a common thing
-
what version of rust are people using?
- stable
- "stableish" -- bootstrap
- nightly
-
who here is using toolchains distributed by rust?
- ms (partially), mozilla, libra
-
why a custom toolchain?
- config.toml tweaks
- use clang's version of some unwinding code
- custom linker
- panic=abort
- custom targets
- compliance reasons (wanting to build from source for security reasons)
-
bootstrapping + compliance
- where to get initial rust version?
- several attempts:
- most successful is using mrustc at version 1.22 and building from there
- ms, google did that
- is there a possibility of long term drift?
- builds are not quite reproducible at present, but almost
- was a point where build w/ mrustc + build with toolchain had non-matching hashes
- might have to tweak the paths
- in principle it can be done, should maybe prioritize it
-
maybe have an approved "how to bootstrap from C" documentation
-
specific reason fb builds from source:
- want to always have the option to apply a local patch
- don't want to get stuck with a "we must have this patch yesterday" scenario and have to figure out how to apply patch then
-
in most cases, also building llvm, want to share llvm for cross-lang LTO
- must have a newer LLVM than what rust ships with
-
some folks have cross-lang LTO working
- but rustc doesn't want to produce bitcode files
- pass the linker `/bin/echo`
-
pgo -- coming soon
-
fb uses after the fact binary rewriting
-
splitting out linker was a potential change to rustc or cargo that google wants
-
would be interesting to know "here is what must be passed to gcc to successfully link"
-
another option: give a python script as the linker
- turns out servo does it, too
-
show of hands survey:
- "who is interested in a common backend for 'those things'"
- nobody knows what that means
-
buck needs a "fully specified dep dag", seems like a common thing for other build systems
- seems like we have to do a few cases to work out the general rules first
-
rudimentary cargo build plan support:
- gives a dag of rustc executions
- but it's too low level for buck, also bazel
-
pressure: every once in a while people propose "rewriting cargo.toml" into the tree
- so far resisted that
- a possible outcome buck has thought of:
- buck support for cargo.toml
- ton of code that's open source, and people (natch) don't want to build w/ buck out of tree
- want ability to simultaneously maintain buck/cargo support
- currently done by hand and horrible
- internally even people want this for mac/win builds which buck doesn't support
- google w/ gn does something similar, keeps cargo.toml in order to upstream it
- in some cases can generate a cargo.toml file programmatically
- also imp't for IDE support
-
IDE support
- RLS kind of working with buck
- knowing laughter :)
- problematic assumptions: e.g., searching the filesystem for cargo.toml, but it's millions of files
- symptom of a larger thing
- cargo is designed for managing rust code
- assumes source tree is mostly rust code
- but often rust is embedded in a large source tree with tons of non-rust
- so having some "root for all rust code" where you search below is problematic
- top-level directory not gonna work
- always having to create artificial "root" directories
- rust-analyzer avoids this by not baking cargo in as deeply
- but still has this "top level directory" model that contains all the rust code which means a small amount of rust amongst everything else
-
generating a cargo.toml for 1 project works well, but when you have multiple targets that interact
-
Qumulo has a ton of C and Rust code that must be all combined into one big final artifact
- IDE support that avoids cargo is a must
- current state of the art: ctags
-
cramertj: cargo.toml is basically the intermediate repr for specifying deps
- are there other things one might want?
- build system has its own custom language to do that description
- can use that to generate cargo.toml files though for IDE etc
- what changes might one want in a "non-cargo IDE language"?
- maybe cargo would work fine
-
manish: does this also cause problems for clippy and rustfmt?
- cargo.toml is also useful for this
-
who uses clippy? most folks
-
rustfmt? most folks
- fb invokes it on individual files for that
-
libra uses cargo to build
- "cacheability" (sccache) has gotten worse over time
- procedural macros aren't getting cached (dylibs)
- are other people doing anything with this?
- ff has a distributed cache in the office
- (buck does caching of everything)
- native deps? also integrated into buck
- assume that if a C dep changes, rust must be rebuilt?
- `-lnative` is not very well-scoped (just to a directory, not specific libs)
- problem: can't cache link steps as a result
- maybe also part of the problem with sccache
- in buck, each lib gets its own directory, sidestepping this problem
-
linker want:
- ability to specify a specific mapping from link name to the native library
- option to ignore link directories or transform
- in buck case, if you have a dep on a native library, you get two options (`-lfoo` and full path to foo)
-
crate features, misuse thereof:
- people seem to want option to have mutually exclusive features
- want to have impls clone etc for testing but not in a release build
- hacked up something using cargo features but doesn't work all the time
- problems:
- dev dependency `foo` with feature "testing"
- sometimes testing gets turned on semi-randomly (???)
- but you can also accidentally use "testing" in a normal tree
- deps for build scripts leak through to the real graph, perhaps part of the "semi-random" behavior
-
designing from the wrong direction, perhaps?
- a lot of requirements coming up that are "above and beyond" existing cargo spec and design
- contra: goal is to have cargo co-exist with buck/bazel/etc, these are the features needed for that?
-
do we want to build another tool that is not cargo?
- but everybody already has a tool and wants to use it
- but how can we do minimal work so that integration of cargo + these other tools is smoother
- working with rest of rust ecosys
-
de facto standard that crates.io + cargo have created
- defined entirely by impl of cargo
- only access at present is through cargo's impl
- refactoring cargo into indep chunks with better interfaces might be the sol'n (and has been discussed)
- cargo build plans, but they're not there yet
- key thing: version resolution, very much in cargo's domain, would be good to specify
-
external dependencies + FFI?
- can we use FFI to talk to rust?
- want module boundary between rust things, using ffi
- today: build scripts in cargo exist, common thing is to build+link to native libraries
- one of the things that cargo raze does, you can describe the purpose of a build.rs (e.g., primarily to produce that 3rd party lib)
- but you can translate that to a dep for that native library in your build system
-
summarize + action items?
- cramertj wants to know what
- dtolnay is working on potential design ideas for a successor to build.rs
- cargo metadata description to specify what it is doing, maybe replace build.rs?
- just listing inputs would be a huge improvement
- yes but we want something that's easier than build.rs today, to incentivize it
- caching, can we improve it
- some of it may be low-hanging fruit, e.g. on mac `.a` file has timestamps
- but part of it is the growing popularity of procedural macros (`.so` are uncacheable by sccache)
- if linker were more predictable, sccache could handle it, but it's not
- might be able to handle by separating out linking
-
how to translate cargo.toml etc?
- buck today runs cargo, takes output with dep info + rlib files
- but new tool goal is to determine from cargo metadata
- no way of "definitively connecting" resolved deps with unresolved deps
-
cargo vendor tends to be a bit overaggressive
- lots of things people want, seems to vary between groups
-
when developing procedural macros, could do better job of noticing token stream output hasn't changed..
- incremental
- sccache sometimes handles that well (e.g. w/ build.rs)
-
related topic: distributed builds
- sccache has support for that
- but maybe sends whole dep folder, not always ok
- would need more precise dep information to handle that (passing precise info for transitive dependencies)
- `--extern` is precise, but transitive deps are still figured out by rustc
- related: would be nice if, for rustc, could pass all the sources explicitly
- in buck do you list all sources?
- yes but a lot of globs :)
-
would be nice to have a tool that handled all the easy cases, with room for "extra" cases here and there
-
alex: interested in solving a lot of these issues and have thoughts
- open to talking later about this stuff
- a lot of small details, bug fixes, etc -- long road, no silver bullet
-
some kind of "enterprise cargo" place to hold this discussion(s)
-
a lot of needs boil down to:
- quick fix combined with longer re-architecture
- two distinct languages invoking one another
- sometimes linked into one process, sometimes cross process (RPC)
- COM requires symbols to be ABI compatible
- inline assembly, direct syscalls
- "C parity"
- FFI with C and C++
- FB is doing C++ interop, as is Google
- FFI beyond C or C++?
- Java
- syscalls
- C# perhaps
- (Ruby, Python)
- Bindings to other languages are often mediated through a C layer
- Increasing number of users -- C and C++ wanting to consume Rust APIs
- Concerns:
- unwinding
- Qumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
- fairly large codebase in a dialect of C
- rules you can impose on C side which helps sometimes
- in one direction (Rust calling C) we have been able to use bindgen
- but in the other direction (C calling Rust) we wrote a compiler plugin (uh oh) to generate C headers
- Specification questions
- concerned about cross-lang lto revealing a lot of interactions
- Cross-lang thin lto
- Dynamic testing and static testing
- Have aliasing rules proven to be a problem?
- FB: not so much. Mostly mediating rules through bindgen and trying to set things up to get compilation failures
- Google: currently checking for changes
- Google: pursuing ways to annotate C and C++ headers so that safe Rust signatures can be generated from them
- might be an interesting thing to standardize on
- bindgen has a cumbersome mechanism for that (do)
- would be nice to include small shim layers e.g. to translate to `Result`
- FB:
- C++ codebase in FB uses exceptions, have wrappers that capture and convert exceptions, this becomes a `Result` on the Rust side
- manually annotating noexcept functions? basically all of them can
- C headers are manually created with a `try { } catch` block in C++
- the code being interop'd is mostly C++ but have to manually write C APIs for it
- build with panic=abort? no, unwind
- also catching Rust exceptions at boundary?
- C code doesn't call into Rust code that often
- happy to make it abort though
- but mozilla wants to handle panics, though it does it by translating it into a swift/java exception
- usually the purpose is wanting to capture the call stack and report it
- in theory could panic=abort if could capture java stack
- FB sets a custom panic handler to report errors, then exits (could use panic=abort)
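As an illustration of catching a panic at the FFI boundary, here is a minimal sketch using `std::panic::catch_unwind`. The function name `demo_div` and the status codes are assumptions made for this example, not anything a participant described. A real export would also need a no-mangle attribute, and under `panic=abort` the panic is never caught because the process aborts first.

```rust
use std::panic;

// Status codes are an assumption of this sketch.
pub const STATUS_OK: i32 = 0;
pub const STATUS_NULL_OUT: i32 = 1;
pub const STATUS_PANIC: i32 = 2;

/// C-callable wrapper: divides `a` by `b` and writes the result to `out`.
/// Integer division by zero panics in Rust; the panic is caught here and
/// turned into a status code instead of unwinding into the C caller.
pub extern "C" fn demo_div(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return STATUS_NULL_OUT;
    }
    // The closure captures only plain integers, so it is unwind safe.
    match panic::catch_unwind(|| a / b) {
        Ok(value) => {
            // SAFETY: `out` was checked for null above; the caller must
            // pass a pointer to writable, properly aligned memory.
            unsafe { *out = value };
            STATUS_OK
        }
        Err(_) => STATUS_PANIC,
    }
}
```

A caller that wants the call stack reported, as in the mozilla case above, would translate `STATUS_PANIC` into its own exception on the host side.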
- For COM FFI case? how handling virtual dispatch
- manual adaptation with vtables and things
- on Rust side, does that "look like" a trait?
- active area of investigation
- believe that (with proc macro support) can expose a trait that is actually a struct + vtable
- similar to what GNOME projects are doing for glib bindings
- mozilla does it for XPCOM, which is basically same thing
- various bits of existing crates, but it's mostly nasty
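The "struct + vtable" shape mentioned above can be sketched by hand as follows. This is a simplified illustration with hypothetical names (`CounterVtbl`, `CounterObject`, `Counter`); real COM interfaces also carry `QueryInterface`/`AddRef`/`Release` and use the platform's system calling convention, and a proc macro would be expected to generate this boilerplate.

```rust
use std::os::raw::c_void;

/// Table of function pointers, laid out as C expects.
#[repr(C)]
pub struct CounterVtbl {
    pub increment: unsafe extern "C" fn(this: *mut c_void) -> u32,
    pub release: unsafe extern "C" fn(this: *mut c_void),
}

/// Object header: the vtable pointer is the first field.
#[repr(C)]
pub struct CounterObject {
    pub vtbl: *const CounterVtbl,
}

/// Rust-side implementation; the header comes first so a pointer to
/// `RustCounter` is also a valid pointer to `CounterObject`.
#[repr(C)]
struct RustCounter {
    base: CounterObject,
    count: u32,
}

unsafe extern "C" fn rust_counter_increment(this: *mut c_void) -> u32 {
    // SAFETY: only installed in RUST_COUNTER_VTBL, so `this` is a RustCounter.
    let this = unsafe { &mut *(this as *mut RustCounter) };
    this.count += 1;
    this.count
}

unsafe extern "C" fn rust_counter_release(this: *mut c_void) {
    // SAFETY: the object was created by Box::into_raw in Counter::new.
    drop(unsafe { Box::from_raw(this as *mut RustCounter) });
}

static RUST_COUNTER_VTBL: CounterVtbl = CounterVtbl {
    increment: rust_counter_increment,
    release: rust_counter_release,
};

/// Safe wrapper that dispatches through the vtable, so on the Rust side
/// it reads much like calling a trait method.
pub struct Counter(*mut CounterObject);

impl Counter {
    pub fn new() -> Self {
        let boxed = Box::new(RustCounter {
            base: CounterObject { vtbl: &RUST_COUNTER_VTBL },
            count: 0,
        });
        Counter(Box::into_raw(boxed) as *mut CounterObject)
    }

    pub fn increment(&mut self) -> u32 {
        // SAFETY: self.0 is a live object whose vtable matches its type.
        unsafe { ((*(*self.0).vtbl).increment)(self.0 as *mut c_void) }
    }
}

impl Drop for Counter {
    fn drop(&mut self) {
        // SAFETY: release is called exactly once, when the wrapper is dropped.
        unsafe { ((*(*self.0).vtbl).release)(self.0 as *mut c_void) }
    }
}
```

The same `CounterObject` pointer could be handed to C or C++ code, which would call through the vtable without knowing the implementation is in Rust.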
- Jeremy: one thing I've been thinking about:
- standard set of library functions corresponding to C++ types
- e.g. some way to use std::string from within rust code
- good to have for templated types (unique_ptr, shared_ptr, and so on)
- all types that can be directly used from Rust in some way
- quite clunky today to have a C++ function that returns something Rust can use
- on C side, it'd use the plain C++ types
- but on Rust side, it'd invoke and do the right things
- one of the pieces needed for C++ interop
- instantiate the vec/string/other impls
- should this be part of bindgen?
- missing part: manually instantiating separate things for each specialization
- major topics of FFI
- being able to "use header files" and get a "reasonably safe" FFI in Rust
- what are building blocks we'd need to move things to user space?
- template instantiation list is one building block -- somebody has to write the tool, nothing needed from rustc
- expectation is that there is always some work to manually bind
- but what is minimal work we can do to make it easy to translate
- annotations might be company specific -- fb vs google?
- maybe? but can we collaborate?
- different C++ dialects and patterns in use
- what about from other languages, esp. around C++?
- closest inspiration might come from Swift
- rich bindings from Rust to C++ for hashmaps etc
- because FB uses thrift for RPC mechanism (and sometimes FFI)
- would be useful to be able to do tricks like that for hashmap and sets perhaps
- some kind of tool for consuming a C++ header file to automatically produce an interface in Rust
- complication in some environments: multiple allocators
- ms: would like to know how to control use of unsafe in codebase
- google: grep
- servo used the compiler directives to disallow unsafe where possible
- in some cases, allow unsafe within a specific file
- integrate with review tool to draw attention
- unsafe is really many things: sometimes simple, sometimes not
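One way such compiler directives can look is sketched below, using rustc's built-in `unsafe_code` lint: deny it for a whole module tree and allow it only in one clearly named module that reviewers can single out. The module and function names are hypothetical, and this is not a description of Servo's actual setup. The stricter `#![forbid(unsafe_code)]` at the crate root cannot be overridden by an inner `allow`.

```rust
// Deny `unsafe` everywhere in this module tree...
#[deny(unsafe_code)]
pub mod app {
    /// Ordinary safe code; an `unsafe` block here would fail to compile.
    pub fn sum(values: &[u32]) -> u32 {
        values.iter().sum()
    }

    // ...except in one small module that gets extra review attention.
    #[allow(unsafe_code)]
    pub mod raw {
        /// Returns the first element without a bounds check.
        pub fn first_unchecked(values: &[u32]) -> Option<u32> {
            if values.is_empty() {
                return None;
            }
            // SAFETY: the slice was checked to be non-empty above.
            Some(unsafe { *values.get_unchecked(0) })
        }
    }
}
```

Because the exceptions are spelled out as attributes, a review tool or a plain grep for `allow(unsafe_code)` finds every place where unsafe is permitted.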
- C++ code: all unsafe? not reviewed under the same standard?
- more interesting question is unsafe in dependencies
- auditing in crate graph in general is a problem
- geometric growth of deps
- how do you audit safe code?
- would be great if there were some central place doing auditing (and getting paid to do it)
- but we'd also need some mechanism to declare what's been audited etc
- blessed crates and versions
- let crates.io metadata include auditing
- presumably want to know also things like 2fa, review policy, etc
- attacks these days are very targeted in other ecosystems -- e.g., replacing specific versions of crates to attack specific targets
- number of deps are in the hundreds, ranging from a few hundred to ~800 depending on project
- in some cases, can pull in a frozen diff and not update
- but not all
- auditing of the compiler itself?
- would prefer to have two implementations maybe
- MS: do we know what's going into the compiler?
- do we know what changes are going in?
- FB: not been a big concern of ours
- in some cases, had issues where things got stabilized or bug fixes that broke code
- would like to be canarying the nightly compiler regularly
- but having more impl's would increase confidence
- ways to support?
- contracting
- full time hires
- how can we give $$ to rust org?
- need a foundation
- money/resources for Rust CI
- participating in crater?
- working on a way to run crater and send back pass/fail
- ecosystem support
- filling gaps in ecosystem
- supporting key crates
- helping to file GSoC proposals?
- don't need super frequent updates
- most helpful thing is to identify topics and spin off topics
- try to provide feedback for roadmap
- organize a regular meeting on zulip to talk about issues
- quarterly maybe
- we might want to consider f2f meetings in other conferences or at least in europe
- maybe rustfest
- key point:
- don't want to alienate and separate enterprise from the Rust community at large
- focusing on working groups and zulip for communication is a win