Trait system

Leads: nikomatsakis Chat: WG-compiler-traits Rust Contribution G uide README for the Rust Crate README for the Rust trait system implementation Chalk blog posts

Getting started

The best way to start is to read this document, then join the chat and say hello! We’ll help you from there.

Goal

To rework the trait system for improved correctness and efficiency — not to mention a bunch of new features! This working group is actually split between two different tasks:

Prototyping
- This takes place primarily in the chalk repository. The goal of chalk is to serve as a “reference implementation” for the Rust trait system, as well as a proving ground where we can try out ideas for extensions.
- Chalk already includes support for a number of features that we would like to move to rustc — for example, chalk supports generic associated types and lazy normalization.
Implementation in rustc
- Here we want to bring features from chalk into rustc.
- Since rustc works somewhat differently, some of this work is gated on restructuring rustc in a suitable way (notably, lazy normalization and query-ification)

Here is a list of some of the features and other goals we have in mind:

Faster performance
Lazy normalization — a change to how we normalize associated types
Implied bounds
Generic associated types (RFC)
Various extensions to impl Trait:
- Permit impl Trait in argument position
- Support type Foo = impl Trait (or abstract type Foo, depending on the syntax we choose)
- Permit impl Trait in traits

Work items

The work items for this working group are tracked via issues in the rust-lang/rust github repository. You can also find them on the Rust “find-work” website.

Want something to work on?
- Here is a list of issues that have mentors or mentoring instructions!
Want to help mentor others?
- Here is a list of issues that still need mentoring instructions.

Standing Meeting / Office Hours

To help and get coordinated, I would like to try and organize some design meetings for discussion. It’s unclear how best to schedule this, so I’ll just take a stab at a first meeting time:

Wednesday Sep 20 at 1pm EDT (Boston time)

For the first try, we’ll meet in the gitter channel. If this time doesn’t work for you, please message!

Moving parts

Let me try to sketch out the various ‘moving parts’ that are involved. These should evolve into concrete work items as plans shape up.

Chalk vs rustc

The model in rustc differs from chalk in some important ways:

How we handle higher-ranked things
- rustc uses a model based on the “leak check” described by SPJ in this paper.
  - This is somewhat inconsistent, in that trait matching etc generally doesn’t care at all what regions are — but it does care “where regions are bound”
  - This causes some complications when it comes to lazy normalization and also
  - In any case, current checking is based on three core algorithms, all known to have bugs (i.e., reach wrong decisions):
    - leak-check, which is used for subtyping (known to be too strict)
    - the LUB/GLB checks, which are kind of ad-hoc and probably flawed. In any case, LUB/GLB in Rust often don’t have unique solutions.
  - Including implications and other things makes things even more complicated
    - consider exists<``'``a> { for<``'``b> { if (``'``a: '``b) { '``a: '``b } } }
      - ok, a very simple case, but the point is that we have some relationship between '``a and '``b, even though there is a kind of universe violation here
- chalk uses the universe model developed for Lambda Prolog
  - in general, the universe model is rather more powerful and more consistent
    - makes higher-ranked things almost orthogonal to solving writ large
    - basically you can “bump” the universe to enter a subuniverse, and then have skolemized things in that sub-universe; unification knows to permit universe contamination
    - more details in Scoping Constructs in Logic Programming: Implementation Problems and their Solution — an excellent read =)
    - another resource: http://www.sciencedirect.com/science/article/pii/074771719290011R
  - this allows us to make trait solving totally orthogonal to regions, even where they are bound (as it is in chalk)
    - essentially you have a procedure Solve(Env |- Goal) which gives back a condition R such that Env, R |- Goal
    - then you have a separate solver for R
    - (we never directly try to show that Env, R |- Goal)
  - one nice thing here:
    - our solver for R can be conservative and sometimes fail to find solutions
      - (which is good, because it’s a very complex space)
    - under current model, we don’t have that luxury
      - if we decide that something is an error, then it permits multiple implementations through coherence, so we can’t alter those rules
      - e.g. fixing issue 33684 (a definite bug) may cause existing code to stop compiling
- Changing the model:
  - I plan to propose changing to the universe model
  - I have a branch that has implemented this
  - Not yet complete, in that I was shooting for the branch to include lazy normalization
Query model
- In Rustc, we do the solving in the context of a given inference context
  - we have fulfillment cx, that sort of does a “breadth-first” search
  - and evaluation, which does a “depth-first” search
- In Chalk, we first extract a “closed query” and solve that
  - the fulfillment cx is used as part of this query, but there is only one mode of operation
  - for nested things, we do sub-queries
  - being independent of the inference context means we can cache very easily
  - integrates very cleanly with incremental compilation if queries have actual names
- (However, Chalk’s model is not as complete as Prolog’s; there are times we yield an ambiguous result that we might have succeeded)
- We do have to handle cycles at least as well as chalk for implied bounds to work out

Higher-ranked type equality

Why would impl<'a, A> Foo for fn(&'a A) and impl<A> Foo for fn(A) not be both permitted, if fn(&u8) never matches the latter?

We were talking about some theoretical concerns, but here's the practical thing: we might want for<'b> fn(&'α &'b u32) (for some "free" lifetime 'α) to be equal to fn(&'α &'α u32). If we want to allow that, we obviously can't allow there to be 2 separate impls. Let's see why is that is sane:

Obviously, for<'b> fn(&'α &'b u32) is a subtype of fn(&'α &'α u32) (just substitute 'b ← 'α).
In the other direction: suppose we have a fn(&'α &'α u32), and we want to use it as an for<'b> fn(&'α &'b u32).
A for<'b> fn(&'α &'b u32) is actually a for<'b> WF(&'α &'b u32) ⇒ fn(&'α &'b u32) - the caller has to prove that it is well-formed before calling it (this is just how higher-ranked functions work in Rust). However, that means that the assigned lifetime for 'b must be longer than 'α, which means the argument is a valid &'α &'α u32. So calling the fn(&'α &'α u32) is sound.

Now, if we want, we can take account of that other subtyping as a part of our subtyping rules, and by antisymmetry of subtyping (which I think we'll like to maintain) add that equality to our type equality rules. However, I'm not sure we really want to implement it - it feels like a too confusing addition to an already-complicated thing - HRTs are already complicated enough on their own, and type equality is by itself complicated enough on its own, so doing a breaking change in preparation for a feature we don't actually want to implement feels silly.

Where Chalk is incomplete

Chalk is intended to eventually model the entire Rust trait system in every detail. It is not there yet. Here are some of the things that would be good to add to chalk:

There are some aspects of Rust that cannot be lowered to a finite set of Rust rules:
- Examples:
  - object types (dyn (Debug + Send): Debug)
  - function types (fn(&u32): Copy)
- Trait objects really need to be built-in, I think.
- fn(&u32): Copy — in particular, anything that is true of all function types (because they may have arbitrary binders and things)
- Design question: To what extent can these should be “built in” to Chalk?
We are not modeling impl Trait in any way
- We probably want to target the more advanced forms that have been specified in the RFC
- This will require higher-order pattern unification
We do not model the cases where Rust must pick between competing traits
- Examples:
  - method-calls (x.foo()), which would also require having some notion of inherent methods and a few other details
  - normalizing T::Item syntax into <T as Iterator>::Item syntax
- Currently this is handled in rustc by some hokey code during astconv.
- I would like to move this into chalk.
We do not model type aliases
- In the HIR today, they are handled during astconv
- I would like to move these into chalk as well
- This would allow better handling of things like type Foo<T: Bar> = .. (where we really ought to check that Foo<U> requires U: Bar

What on earth is on Niko’s branch?

So I’ve got this branch that is pushing towards lazy normalization. It does way too many things. I want to at least write them down and think about how they can be landed:

Move to the universe model
Move unification to use the code from the ena crate
Replace existing skolemization with universes
- replace leak-check with a universe-aware region solver
Refactor trait selection interface substantially
- introduce PredicateAtom which has no binders
- make Predicate a choice between Poly(Binder<PredicateAtom>) and Atom(PredicateAtom)
- rework trait selection, projection etc to operate only on atoms
- fulfillment “flattens” predicates into atoms, moving into new universes as needed
- this work could probably all be backported, using leak-check instead of atoms in fresh universes

Features we would like to implement

GATs
- Parser modifications: (@Sunjay V is working on this)
  - Modify the AST to include new syntax
  - Modify pretty-printer
  - Write some tests of parsing etc
  - Freak out when we get to HIR lowering
- Some work on HIR
- Have to modify the project code and a few other places to unify projection-ty directly rather than trait-ref
  - interestingly, in my chalkify-universe branch, I have done that
- If we limit ourselves to GATs with higher-ranked lifetimes, don’t need full support for other kinds of HRTB, but lacking implications, we may find a lot of practical use cases don’t work
impl Trait work
- Chalk work: implement higher-ranked pattern unification algorithm
- Pieces:
  - Support impl Trait in argument position
    - How are we even going to approach this? It’s going to require a stronger notion of desugaring than we currently have
    - What happens if you do foo::<i32> with fn foo(x: impl Iterator<Item=u32>)?
      - maybe we want a split between “visible generics” and “actual generics”?
      - or just track a prefix
Implied bounds
- The main thing here is adopting a chalk-like model around cycles etc
- After that it largely “falls out”

wfcheck in chalk

From the outside

If we have a trait like

trait Foo {
    type Assoc: Bar;
}

and a function like

fn foo<T: Foo>(t: T) {
}

Then within the function, we want to be able to assume T::Assoc: Bar.

That of course means that in order to call foo, the caller has to prove T::Assoc: Bar

The way we guarantee this is “wf checking”. rustc and chalk currently use 2 different strategies, and both have problems.

What rustc does - “RFC1214”:

This part of integration with the trait system wasn’t decided when we implemented RFC1214, so it’s somewhat informal.

rustc makes “unknowable associated types” (aka “normalized” <T as Foo>::Assoc) always implement their trait bounds, so you always know that <T as Foo>::Assoc: Bar without having to prove anything.

To make sure this hold, rustc has a wfcheck pass on every impl, which checks that all the where-clauses of the impl hold in the impl’s parameter environment.

For example, if we have an impl like

impl<T: Xyz> Foo for Xyz {
    type Assoc = Something<T>;
}

Then rustc would check whether

T: Xyz + Sized ⊢ <Foo as Xyz>::Assoc: Bar;

Now, that might seem to be obviously broken - T: Xyz + Sized implies that T: Foo, which seems to imply that <Foo as Xyz>::Assoc: Bar (without even looking at what it means). However, the inference rules in rustc are not “invertible” - we normalize <Foo as Xyz>::Assoc to Something<T>, and from then on we don’t look back at <Foo as Xyz>::Assoc. This appears to be sound (as in, we don’t know of soundness issues), but is restricting.

The place where this “didn’t scale” was trait object types. Suppose I have an object dyn Foo<Assoc=Xxx>, then:

If I let dyn Foo<Assoc=Xxx>: Foo, then I can pass it to a function, and from that it follows that <dyn Foo<Assoc=Xxx> as Foo>::Assoc: Bar
This means that I must check the where-clause by then - trivially, that ``

It might be possible to do some “shortcutting” - to directly convert the <T as Foo>::Assoc: Bar to a Xxx: Bar somehow - but you would have to think of a way to handle trait bounds such as

trait Bar2<T> {}
trait Foo2 {
     type Assoc: Bar2<<Self as Foo2>::Assoc>;
}

Maybe ^ is not that much of a problem.

What chalk does:

Chalk works entirely differently. The chalk inference engine is always transitive - it can follow rules both ways (that’s not technically correct, but in practice we always make our rules symmetric). This means that it can’t use the same wfcheck strategy as rustc.

Instead, chalk has completely different path for “impl matching” and “WF”, with the predicates being named T: Foo and WF(T: Foo).

Associated type normalization is keyed off T: Foo and does not care about WF. Therefore, Chalk can’t derive trait predicates from T: Foo predicates (e.g. it can’t derive <T as Foo>::Assoc: Bar from just T: Foo).

Instead, in order to have functions work, Chalk functions (Chalk doesn’t support dynamic types yet) require (and assume), in addition to every user-specified bound, an additional WF(BOUND) predicate.

WF(T: Foo) is an I think stronger (Chalk doesn’t have WF(T: Foo) imply T: Foo, but I think it could) predicate, which is basically equivalent to all of the predicates on the trait. e.g. with the old impl

trait Foo {
    type Assoc: Bar;
}

This is equivalent to this rule

∀T. WF(T: Foo) ↔ (WF(<T as Assoc>::Foo) ∧ WF(T) ∧ <T as Assoc>::Foo: Bar

And a function

fn foo<T: Foo>(t: T) { /* .. */ }

Is equivalent to

∀T. T: Foo, WF(T: Foo) ⇒ fn(T)

Because we know the WF predicate is satisfied, we can derive <T as Assoc>::Foo: Bar from it. Callers to foo have to prove WF(T: Foo), and therefor also <T as Assoc>::Foo: Bar unless they can find a pre-done WF(T: Foo) predicate. This is sound (WF is only ever used to functions/objects) and does not have the odd wfcheck-projection interactions of rustc’s scheme.

The place this gets stuck is when there’s an infinite chain of associated types. In that case, to prove that a type is well-formed you’ll have to prove that all the elements of the infinite chain are WF, which you can’t do, for example here where you’ll have to prove Unit: Foo, Unit::Assoc: Foo, Unit::Assoc::Assoc: Foo etc.

?- program
Enter a program; press Ctrl-D when finished
| struct Unit {}
| trait Foo3 where <Self as Foo>::Assoc: Foo3 { type Assoc; }
| 
?- WellFormed(Unit: Foo3)
No possible solution: no applicable candidates

?- forall<T> { if (T: Foo3) { WellFormed(<T as Foo3>::Assoc: Foo3) } }
**** <takes a lots of time tabling the world> ****
Unique; substitution [], lifetime constraints []

The tabling taking a ton of time case is clearly suboptimal, but we hope it is fixable. Maybe coinduction can help in some cases here? That might be worth investigating.

However, simple coinduction won’t give us “parity” with rustc - it has no problem handling even “unbounded” examples here as long as you don’t actually do non-standard recursion:

trait Trait {
    type Foo: Trait;
}

impl<T> Trait for T {
    type Foo = Box<T>;
}

fn x<X: Trait>(_x: X) {}

fn main() {
    x(());
}

What do we want to do

I’m still not quite sure.

An aside regarding existentials and WF

This is a different set of notes for another day, but let’s quote myself here for now:

The "this part is new" part is actually more subtle than I realized. Object types (aka dyn Trait<'λ₀...'λₙ, τ₀...τₙ>, to avoid confusion) are actually a funny way of writing exists<σ> (σ, for<...> WF(dyn Trait<'λ₀...'λₙ, τ₀...τₙ>: Trait<'λ₀...'λₙ, τ₀...τₙ>) → σ: Trait<'λ₀...'λₙ, τ₀...τₙ>)

@nikomatsakis - feel free to comment on my writeup

People interested

Here is a list of people who have expressed interest in “joining” the working group:

@nikomatsakis
@arielb1
@Sunjay V
@soltanmm

nikomatsakis/Trait-system.md