TODO
The role of garbage collection (GC) in Rust has been heavily revised during the language's design. It was originally deeply integrated into the language, complete with dedicated syntax; then over time we found ways to lessen the dependency on GC, and then finally remove it from the language entirely.
There remain reasons to support GC in Rust.
Libraries have largely replaced uses of GC with reference counting
(e.g. via smart pointers like Rc<T>
and Arc<T>
). But reference
counting is inherently prone to leaking of cyclic structure, and
incurs a ref-count maintenance burden. Regarding the latter:
-
there is runtime overhead maintaining the reference-counts, and
-
there is also an ergonomic burden for the programmer: the associated smart-pointers (e.g.
Rc<T>
) cannot implementCopy
, so uses ofRc<T>
in Rust is often peppered with calls to the.clone()
method, making such references in Rust less ergonomic than a well-integratedGc<T>
.
An important domain for Rust is integrating a Rust library or program with some other application framework. An obvious example of this is Servo's use of the SpiderMonkey Virtual Machine for its Javascript support.
Servo is relying on SpiderMonkey's garbage collection for memory management, not only for Javascript values, but even for native-code DOM objects.
As described in that post (in the section entitled "Custom static analysis"), one can copy an unrooted pointer onto a local variable on the stack, and at that point, the SpiderMonkey garbage collector no longer knows about that root.
However, proper GC support in Rust would include a stack tracing mechanism for discovering roots on the stack. This mechanism could be extended to allow such root-discovery to be communicated to other collectors such as SpiderMonkey's.
So, let us assume that we do indeed want garbage collector support in Rust.
I only recently have come to realize that the use cases outlined above come from two fundamentally different audiences.
The first audience is made up of Rust application and library
developers, who want garbage collection in Rust to expose them solely
to an ergonomic (and efficient) smart-pointer. This group wants to
call Gc::new(T)
(instead of Rc::new(T)
), and wants the use of
Gc<T>
to be essentially just as if they were using a special variant
of the reference type &T
whose lifetime was magically guaranteed to
last as long as necessary.
The second audience is made up of Rust integrators, who are seeking to deploy or embed Rust in contexts where memory is already being managed by some external library that is not attempting to integrate with the Rust language and runtime. The primary concern for this group is that their memory management system is supported at all.
The reason that these audiences are distinct comes down to the
requirement from the first audience that use of Gc<T>
be
ergonomic.
To provide programmer ergonomics, the main trick that smart pointers
like Rc<T>
use is an implementation of the Deref
trait:
impl<T> Deref for Rc<T> where T: ?Sized { type Target = T; fn deref<'a>(&'a self) -> &'a T { ... } }
With this implementation in place, the overloaded dereference operator
*expr
is extended with an implementation that, given an &Rc<T>
will yield a &T
. This may not sound like much on its own, but it
indirectly provides two programmer conveniences:
-
"dot-operations" (i.e. method calls
recv.m(..)
and field dereferencesrecv.f
) will automatically dereference therecv
expression as much as necessary to find the associated method or field. -
coercions from
&S
to&T
will automatically dereference theS
as much as necesasry to extract a&T
.
(For more discussion, see TRPL.)
Support for the above API is, as far as I can tell, a strict
requirement for an ergonomic Gc<T>
in Rust today.
- (In other words, if
Gc<T>
cannot implementDeref
, then only other option is to extend the Rust language with some alternative overloading of*expr
thatGc<T>
can support. This may be an interesting avenue to explore in the long-term, but is not a realistic short-term solution.)
But, supporting Deref
poses a problem for some memory management
strategies, and thus may exclude some Rust integrators.
One important goal for garbage collection in Rust is that it should support "copying" or "compacting" garbage collectors that move some objects in memory.
In a fully-copying GC, the collector may freely rearrange objects in memory. The address assigned to an object may change whenever control is yielded to the collector (e.g. via a GC allocation request or via all threads reaching a GC safepoint). Relocating an object from one location to another is often called "forwarding" that object. A collection that forwards all live objects is called a "full compaction."
In a partly-copying GC, some objects may be forwarded, but the collector is designed so that on every collection, there may exist some objects that cannot be forwarded. Some such systems support dynamically shifting objects to and from non-relocatable status; we refer to marking an object as non-relocatable for a limited dynamic extent as "pinning" the object.
There are a fair amount of details, but the most important thing is to allow scenarios like this:
before garbage collection:
STACK GC-HEAP (pages start at e.g. address 0x100000)
+------+
root A -------> A D ----+
| | |
| C -------+
| | | |
+------+ | |
| |
+------+ | |
B | | |
| | | |
|"datb"| | |
+------+ | |
| |
+------+ | |
+-> C |<----+
| | | |
| |"datc"| |
| +------+ |
| |
| +------+ |
| D |<-+
+----- C |
|"datd"|
+------+
after garbage collection:
STACK GC-HEAP (pages start at e.g. address 0x200000)
+------+
root A'-------> A' D'----+
| | |
| C'-------+
| | | |
+------+ | |
| |
+------+ | |
D' |<-+ |
+----- C' | |
| |"datd"| |
| +------+ |
| |
| +------+ |
+-> C' |<----+
| |
|"datc"|
+------+
In this picture, the collector has started from the root that points to the object "A", and then gone from there to discover the objects "C" and "D" that "A" points to. It copies the objects to new locations in memory (thus the picture added prime marks to all such relocated objects, "A'", "C'", and "D'"), and also reclaimed the memory associated with the object "B", which was unreachable.
(The picture notes that the GC-HEAP here may start at an entirely different address, since the act of copying the memory may have involved moving all of the reachable storage to new memory pages, and then freeing the old pages.)
The problem with a picture like the above is: how do we support
the Deref
trait here?
An implementation of the Deref
trait for Gc<T>
will require
that we temporarily acquire references &T
that point directly
into the GC heap.
So if we start with something like this:
STACK GC-HEAP (pages start at e.g. address 0x100000)
+---------+
Gc(A) -------> A Gc(D) ----+
| | |
| Gc(C) -------+
| | | |
+---------+ | |
| |
+---------+ | |
B | | |
| | | |
| "datb" | | |
+---------+ | |
| |
+---------+ | |
+-> C |<----+
| | | |
| | "datc" | |
| +---------+ |
| |
| +---------+ |
| D |<-+
+----- Gc(C) |
| "datd" |
+---------+
We might very well end up accessing any of the Gc<T>
pointers
in the above picture via Deref
, extracting a &T
. For example,
we might extract the second field of the "A" object and deref it
to a reference to the object "C":
STACK GC-HEAP (pages start at e.g. address 0x100000)
+---------+
Gc(A) -------> A Gc(D) ----+
| | |
&C ---+ | Gc(C) -------+
| | | | |
| +---------+ | |
| | |
| +---------+ | |
| B | | |
| | | | |
| | "datb" | | |
| +---------+ | |
| | |
| +---------+ | |
+-> +-> C |<----+
| | | |
| | "datc" | |
| +---------+ |
| |
| +---------+ |
| D |<-+
+----- Gc(C) |
| "datd" |
+---------+
The problem is that when the garbage collector collector now runs, it would not be safe to try to move the object "C" to a new location.
-
Keep in mind that the
&C
above may in turn be cast to a*C
or even ausize
. One fundamental principle that I want to preserve is what I call "The Frozen Reference Property": library code can safely assume that a reference&'a T
can be treated as accessible memory of typeT
, with a constant address for the extent of the lifetime'a
. -
We could also consider lifting the Frozen Reference Property. This would allow us to write pointers of the form
&T
. But as Niko noted during the whistler work week:-
Doing this would require that we tell LLVM to not store such pointers in registers during any extent where a GC might occur.
-
And it would make casting
&T
to*T
even more unsafe than it already is -
note that in Rust today we do not actually require an
unsafe
{.rust} block for such casts!
-
-
Note also that references migrates can be sent between threads (if their contents are
Sync
).
However, it would be sound, in the above scenario, to leave "C" where it was, but copy the other reachable objects to new locations
STACK GC-HEAP
+---------+
Gc(A') -------------------> A' Gc(D') ---+
| | |
&C ---+ | Gc(C) | |
| | | |
| +---------+ |
| |
| +---------+ |
| D' |<-+
| | Gc(C) ------+
| | "datd" | |
| +---------+ |
| |
| +---------+ |
+-----> C | <---------------+
| |
| "datc" |
+---------+
This act, of forcing the object "C" to keep its current location, is known as "pinning".
Pinning provides a convenient way for us to support the Deref
trait
on Gc<T>
in Rust: the deref()
method on Gc
would pin the
referenced object on the GC heap, and return a smart pointer
GcRef<T>
. GcRef<T>
would also implement Deref
(and would return
&T
), but GcRef<T>
would also carry a destructor that unpinned the
referenced object. The &T
provided by deref'ing GcRef<T>
would not
be allowed to outlive the GcRef<T>
itself, so this would ensure that
values remain pinned for the extent of the time that a &T
has been
lent out.
Support for pinning is provided by some but not all garbage collectors.
Explicit object pinning and unpinning is not the only way to solve the
Deref
problem. For example, one might consider instead adopting the
strategy of a "mostly-copying collector", which, at the start of each
collection, simply pins everything that is directly reachable from the
stack and processor registers. This would also achieve the effect above
of forcing the object "C" to stay in place. It would probably also force
"A" to stay in place as well, so that only "D" would be relocated:
STACK GC-HEAP
+---------+ +---------+
Gc(A) -------> A Gc(D') --------> D' |
| | | |
&C ---+ | Gc(C) ----+ +---- Gc(C) |
| | | | | | "datd" |
| +---------+ | | +---------+
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| +---------+ | |
+-----> C |<-+ <-+
| |
| "datc" |
+---------+
But, mostly-copying is simply a variation on a theme. It does not avoid the requisite support for pinning in the underlying GC.
In short, I do not see a way to support
impl<T> Deref for Gc<T>
{.rust} in a moving GC without pinning
(at least not without also breaking the "The Frozen Reference
Property"* defined above).
As discussed in the previous section, Deref
-support is a likely
requirement for ergonomic Gc<T>
.
But, some garbage collectors do not support pinning of arbitrary objects. For example, the SpiderMonkey garbage collector does not support pinning.
Some reasons pinning is sometimes unsupported:
-
It is useful in a generational collector to be able to assume on every collection that the nursery area (where most objects are initially allocated) will always be entirely evacuated.
-
A bump-pointer allocation scheme that assumes that the allocation buffer ends at a fixed location is incompatible with a system that allows arbitrary objects to be pinned to their current memory location. (Note that not all bump-pointer schemes need be so restrictive.)
So, does this rule out Rust support for SpiderMonkey and other GC's?
My answer is: No, it just rules out ergonomic support for such collectors.
We are not required to provide a single type that is meant to be all things to all GC's.
In particular, I recommend that we provide (at least) two types.
(These types may be structs that Rust provides in the standard library
or a gc
crate, or they may end up being traits that the GC
implementations are expected to implement and provide via some
associated type. Time will tell.)
For the audience desiring ergonomics, we provide Gc<T>
that
supports the Deref
trait as described above. It will only work
with some GC implementation strategies. We can provide a reference
implementation of one that supports the key feature (pinning) while
also exercising object-relocation as much as possible (in order to
try to flush out bugs where unsafe
-code is holding on to
references into the GC heap via unsafe pointers).
For the audience desiring interop, we provide VolatileGc<T>
that does not support Deref
, and in fact has a number of other
restrictions.
struct VolatileGc<T> { ... }
impl VolatileGc<T> {
/// Returns a pointer to the `T` that is allocated on the GC heap.
/// The caller *must* ensure that no garbage collector activity
/// occurs while the returned value is still in use.
///
/// In particular,
/// the caller must ensure that no GC-allocation takes place
/// (since such allocation could cause a garbage collection).
///
/// Depending on the GC details, the caller may also need to
/// ensure that no GC read- or write-barriers are crossed
/// while the returned value is still in use
/// (since in some GC's that may cause a garbage collection).
///
/// Finally, depending on the GC details, (namely whether it
/// is a GC managing memory for multiple threads), the caller may need
/// to ensure that no so-called "safepoints" are crossed
/// (because doing so may cause a garbage collection
/// to be initiated across the coordinating threads).
fn ptr(&self) -> *mut T
}
This type is clearly very restricted in utility.
It also may be difficult to enforce the described requirements
(e.g. higher-order procedures may cause an allocation while there
is still a *mut T
to the GC heap in use).
In the long term Rust may want to add an effect system, even just one that integrates with the lint system, to try to detect errors of this kind.
In the short term, calls to the ptr
method can be hidden behind
abstraction barriers that enforce the requirements, for example
as sketched below:
struct List<X> { car: X, cdr: Option<VolatileGc<List<X>>> }
impl<X> VolatileGc<List<X>> {
fn car(&self) -> X where X:Copy {
unsafe { (*self.ptr()).car }
}
fn cdr(&self) -> Option<Self> {
unsafe { (*self.ptr()).cdr.clone() } // (see below)
}
fn set_car(&self, x: X) {
unsafe { (*self.ptr()).x = x; }
}
fn set_cdr(&self, y: Option<Self>) {
unsafe { (*self.ptr()).y = y; }
}
}
Side Question: Should VolatileGc
implement Copy
? It would preclude
e.g. a backing collector that is implemented via a reference-counting
system. But I think we need to draw the line somewhere, and I am
willing for that line to include the condition "if you are a Gc
,
your references must themselves be copyable."
Perhaps the previous section jumped the gun a little bit, diving into the details of what kind of smart-pointer to provide without providing any perspective of how this all fits into Rust as a whole.
Let me try to correct that now.
GC-allocated memory is often described as an "object graph": GC-tracked references are edges, and the nodes are objects (values allocated on the GC heap) plus a special set of nodes (the "root set") representing the processor registers and stack(s).
The task of a garbage collector is to identify a connected component that includes the root set, and then discard the remaining heap-allocated objects (aka the "garbage"). The "reachable objects" make up the smallest such connected component; in the general case a given collection may retain a larger connected component because it is expensive to identify the smallest one. Any objects that are unreachable but are not discarded are "floating garbage" that has survived the collection attempt.
A garbage collector typically works by doing a graph traversal, starting from the root set, and then working its way across the reachable nodes on the GC heap. Each time it reaches a node, it needs to identify its outgoing edges: this means it needs to be able to extract the GC-tracked references for any object on the GC heap.
So, from the above description, we can identify two important steps:
-
Identifying the root set to start off any GC
-
Given a GC-heap allocated object, trace through its GC-tracked references
We will discuss the second step first, as the first step will end up having some relevant overlap.
The tracing process will be responsible for extracting the Gc
-refs
within an GC-heap allocated object (and potentially rewriting them to
reflect an object's new location due to a copy)
In the common case we would like for this code to be automatically generated by the compiler based on its type structure. I.e. given a struct definition:
struct List<X> { car: X, cdr: Option<Gc<List<X>>> }
when gc-support is enabled, the compiler should generate code to do
the following on an instance of List<X>
, where X
is traceable:
impl<X> Trace for List<X> {
fn trace(&mut self, gc_state: &mut GcState) {
self.car.trace(gc_state);
match *self {
None => {}
Some(ref mut g) => gc_state.enqueue(g),
}
}
}
(We can probably reuse much of the derive
machinery for this.)
To be clear, here is what I am envisioning for the gc_state
's
enqueue
method:
impl GcState {
/// Marks `gc_ref` as reachable; if this is the first time we have
/// encountered it, enqueues it for future tracing,
///
/// Note that "marking" `gc_ref` may actually imply forwarding its
/// associated heap-object to a new location, or, if it was already
/// forwarded, may imply rewriting it to point to its new home.
fn enqueue<T>(&mut self, gc_ref: &mut Gc<T>) { ... }
}
We should also allow a user to provide their own implementation (and
potentially override the automatically generated one) since some types
will use reprsentations like *const X
where we will not be able to
derive the tracing strucure properly.
Identifying the root set is an interesting problem, because
we should allow references like Gc<T>
to be held in boxed
objects, like Box<Gc<T>>
or Vec<Gc<T>>
.
This means that it would not suffice to simply scan the stack to find the root set, as one might think from skimming the pictures used above to illustrate copying gc.
A more realistic picture looks like this:
STACK RUST-HEAP GC-HEAP
+---------+
Gc(A) ----------------------------> A |
| Gc(C) -----+
| | |
+---------+ |
|
+-Box-----+ |
Box(B) --------> B | +---------+ |
| Gc(C) ---------> C | <-+
+---------+ | |
| Gc(X) -----+
+-------------------------- Box(O) | |
| | | |
| +---------+ |
| |
| +---------+ |
| +-Vec-----+ G | |
| V | <--------- Vec(V) | |
| | | +---------+ |
| +---------+ |
| |
| +-Box-----+ |
+--> O | +---------+ |
| Gc(X) ---------> X | <-+
+---------+ | "datae" |
+---------+
In the above picture, the root set needs to include both the Gc(A)
on the stack and the Gc(C)
held in the Box(B)
.
The garbage collector is not responsible for directly managing the memory in the column labelled "RUST-HEAP." Those boxes will be handled by the system allocator. So, regardless of whether the GC copies objects, those boxes should stay in the same place.
What the GC may need to do, however, is rewrite the Gc
-references in
the boxes (and the Gc(A)
stack as objects are copied.
(This seems reasonable, as long as none of the Gc<T>
's in question
have curently lent-out any &T
-references, as discussed
above.
- (In addition, the GC, or some other entity, will need to run
Drop
code when it collects objects. For example, after determining that the object "G" on the GC-heap above is garbage, we will also need to callVec::drop
to deallocate the object labelled a"V" that "G" owns on the Rust-heap.)
A further twist is added when we consider that in general,
the references on the stack will not have nice types like
Box<B>
. In general unsafe code may only be holding a
*mut B
or *const B
that it is planning to later convert back into a Box<B>
.
Therefore we must also handle cases like this:
STACK RUST-HEAP GC-HEAP
+-Box-----+
addr B --------> B | +---------+
| Gc(C) ---------> C |
+---------+ | ... |
| |
+---------+
In any case, if we assume that we can somehow trace values (on the
stack, on the Rust-heap, and on the GC-heap), and we impose a
restriction that Gc<T>
values are only allowed to be written into
memory that is managed by Rust, then we should be able to extract all
of the roots in a picture like the above.
We control the Rust memory allocator.
Some clients will want to have direct access to malloc
and free
(or rather, their jemalloc
-equivalents that we expose), and we can
provide that to those clients, assuming they are willing to forego
Gc<T>
support (or are willing to jump through other hoops to get it
back; see below)
But in general, most clients will be happy using higher-level API's,
i.e. something along the lines of the typed_alloc
module
of RFC PR 244.
If we start with the assumption that all data on the Rust-heap is going through a high-level API like that, then the runtime can store meta-data mapping every address to the allocated block and an associated root-extractor.
The root extractor is much like a tracer, but is solely in charge of
finding instances of Gc<T>
and enqueuing them in the GcState
as
outlined above.
This should suffice for cases such as outlined in the picture:
STACK RUST-HEAP GC-HEAP
+-Box-----+
addr B --------> B | +---------+
| Gc(C) ---------> C |
+---------+ | ... |
| |
+---------+
Perhaps we should just have one Trace
trait that serves for both
root-extraction (on the stack and Rust-heap) and for tracing the
GC-heap itself. I have not thought too carefully about that yet.
Internal Question: assuming we do indeed trace through raw addresses on the stack (a detail that in the past some people have claimed we can avoid, but I'm going to assume it for now), will we also trace through raw addresses that we encounter in objects on the Rust heap? In other words, are we planning to handle this case:
STACK RUST-HEAP GC-HEAP
+-Box-----+
addr A --------> A |
| addr B -----+
+---------+ |
|
+-Box-----+ |
B | <-+ +---------+
| Gc(C) ---------> C |
+---------+ | ... |
| |
+---------+
I currently assume the answer to this question is "yes".
But a reasonable person might argue that our reasons for tracing through raw pointers on the stack do not extend to arbitrary Rust heap-allocated values. And that answer is probably fine too.
- indeed, while tracing the stack might be reasonable fast, tracing the full tree structure implied by following raw-addresses on the Rust-heap may be unreasonably slow.
Anyway, we just need to make sure our story is consistent here.
One detail we have not yet addressed is how to handle trait-objects, whose underlying data-structure is not exposed in their static type.
Presumably Gc-allocated trait objects will have some type of the form
Gc<Trait>
. The GC allocator can ensure that whenever an object is
allocated on the GC heap, it includes a word for method dispatch on
the allocated value itself; this will handle tracing instances of
Gc<Trait>
.
Another problem is how to deal with trait objects on the Rust
heap. They too may contain Gc<_>
-references. However, as far as I
can tell, we should be able to use the same allocator meta-data that
is used to resolve arbitrary addresses, as shown in
Tracing through the Rust-heap.
-
During the whistler work week, there seemed to be some potential concern that "all objects must be treated as if they contain roots".
-
Presumably the issue here is that we do not want to waste time tracing through the Rust heap in a vain search for a
Gc<_>
-ref that does not exist on a chain of trait-objects. -
However, I do not see a way around this.
-
In any case, we will probably employ tricks in the allocator meta-data to ensure that values with types that cannot ever contain
Gc
-roots are tagged so that any attempt to trace them immediately returns; such tricks should work for trait objects, at least at the first level.- (It does not resolve the chain of trait-objects problem, though, since an inner reference to a trait-object will cause the container to be flagged as potentially holding a GC-root.)
There was the condition "where X
is traceable" on the above text.
I chose not to encode this via a bound like impl <X:Trace>
, because
my assumption is that if we are to have any hope of GC-support actually
being used by Rust code in the wild, then support-for-tracing needs
to be something that gets added to standard data-types without the
original designer having to account for it in their design.
In other words, Trace
could and should be a defaulted trait that is
automatically added to type parameters, analogous to the Sized
bound.
As a counterpart to this, we will also have a defaulted Opaque
trait bound
that enforces the contraint that if X: Opaque
, then X
cannot own any
Gc<_>
values. This means that such an X: Opaque
is trivially traceable
(you just skip it).
-
So, when you are implementing a generic container type
Container<X>
and you do not want to think about the case whereX = Gc<T>
, you can impose the bound:Container<X:Opaque>
. Then a trivialTrace
implementation suffices forContainer<X>
. -
We might also allow a library to generalize their type parameters (to allow types that do not implement trace-support) in the same manner that is done for the
Sized
bound:impl <X: ?Trace> Foo<X> { ... }
Of course we hope that libraries will not find this necessary.
I suspect that the is overlap between the two types outlined in
Smart Pointer: No (One) Silver Bullet. For example, I see no reason
that the Gc<T>
type (or trait) should not itself extend (or
implement) the VolatileGc<T>
type (or trait). Having some sort of
subtyping or coercion relationship between the two types may help
people who want to write maximally-generic code, even if doing so is
extremely unergonomic.
However, I also suspect that the main client of VolatileGc<T>
, for
the short term, is Servo and SpiderMonkey. That client has no need
to work with stable Rust. Therefore, the VolatileGc<T>
type could
probably remain unstable for a long time, which would give us room
to iterate and refine its design.
I do not know if the same story holds for Gc<T>
. I suspect if we do
not stabilize it as soon as we can, then we may have problems with
unsafe
native code making assumptions about their input types that
simply do not hold. (But then again, we may already partway down this
hole anyway, and thus it may not matter how quickly we attempt to
stabilize this type and provide a reference implementation.)
In any case, I recommend that we attempt to develop the reference
collector that integrates with Gc<T>
in parallel with the
VolatileGc<T>
support for SpiderMonkey, to guard against building in
assumptions about the tracing mechanism that are tied to one collector
or the other.
Mrrh, you're right, I got this bit wrong. For some reason I thought our
Deref
was more general than this. (I even thought I double-checked against the docs or code, but clearly if I did I was sloppy.)Extending
Deref
seems like a more realistic long-term option.But then again, if we are already considering extending
Deref
(or perhaps even changingDeref
or offering an alternative trait overload toDeref
), then I might prefer we design something that will actually "work" with non-pinning GC's, e.g. something that couples the pointer-dereference and the lookup of the associated value. (But then again, I only have an inkling of how to make the latter work for fields, and no idea about whether it can work for Rust methods.)