E0499 and function signatures extending mutable borrow lifetimes
// Some instances of E0499 ("cannot borrow X as mutable more than once
// at a time") are straightforward to understand, but some can be
// tricky. Among the latter ones (at least for me) are those related to
// the way a function potentially extends the lifetime of a borrow by
// tying it to another value.
//
// For instance, this function: https://github.com/benhoyt/countwords/blob/5318b1acdd5bd313039d480af535cf79565c2e62/rust/optimized-unsafe/main.rs#L72
//
// Try changing it so that it accepts a &'a mut Vec<u8> instead of a
// &'a Cell<Vec<u8>>. You'll get the following error:
//
// error[E0499]: cannot borrow `keys` as mutable more than once at a time
//   --> main.rs:32:27
//    |
// 32 |                 increment(&mut keys, &mut counts, &buf[..offset + 1]);
//    |                           ^^^^^^^^^  ----------- first borrow later used here
//    |                           |
//    |                           second mutable borrow occurs here
// ...
// 45 |                     increment(&mut keys, &mut counts, &buf[start..i]);
//    |                               --------- first mutable borrow occurs here
//
// error[E0499]: cannot borrow `keys` as mutable more than once at a time
//   --> main.rs:45:31
//    |
// 45 |                     increment(&mut keys, &mut counts, &buf[start..i]);
//    |                               ^^^^^^^^^ `keys` was mutably borrowed here in the previous iteration of the loop
//
// At first glance, you might be surprised: both the Vec and the HashMap
// are repeatedly mutably borrowed, each time the function is called. So
// why does only the Vec's borrow from the previous iteration of the
// loop mysteriously stick around and cause problems later on? In what
// sense does &mut counts on line 32 "use" the first mutable borrow of
// keys? Let's find out!
//
// Additional resources:
//
// - https://stackoverflow.com/a/49929322
// - https://stackoverflow.com/a/32300133
// - https://stackoverflow.com/a/31067272

use std::cell::{Cell, RefCell};

fn main() {
    // ----------------------------------------- bind_lifetimes_together {{{1

    // Let's create a situation analogous to that in the code from the
    // GitHub link above, but trimmed down to the essentials, so that
    // it's clearer what causes (or doesn't cause) the problem we're
    // seeing.
    struct X(u8);
    struct Y;

    // bind_lifetimes_together is our analog to increment in the
    // original code; X corresponds to Vec<u8>, and Option<&Y>
    // corresponds to HashMap<&[u8], u64>.
    fn bind_lifetimes_together<'a>(_x: &'a mut X, _y: &mut Option<&'a Y>) {}

    let mut x = X(0);
    let mut y = None;
    bind_lifetimes_together(&mut x, &mut y);
    // When we first call this function, we create two mutable borrows:
    // of both x and y. But while the borrow of y is free to end
    // whenever convenient (it's not constrained by any explicit
    // lifetime, as per the function definition above), the mutable
    // borrow of x is now tied to the lifetime of y (or more precisely,
    // the reference inside y), via the explicit lifetime 'a.
    //
    // This happens even though they're clearly unrelated otherwise --
    // the function has no body, so it can't intertwine the two values
    // in any way, and even if it had a body, it couldn't, because it
    // looks like there is no potential for overlap between the types
    // (x: X and y: Option<&Y>). Except for the lifetimes, of course --
    // and that caveat is the crux of the biscuit (I can personally
    // attest it's hard to fully grok that lifetimes are part of the
    // type). In practice though, this type of problem will often happen
    // in code which *does* somehow intertwine the values of x and y (as
    // in the introductory example), e.g. y could be an Option<&X> and
    // the function would store &x in there or something of the sort.
    //
    // But it's important to realize that while Rust wouldn't allow us
    // to do those things without specifying the appropriate
    // constraints, the reverse -- i.e. overspecifying the constraints
    // when it's not necessary -- is entirely possible, and violating
    // them will still trigger a compilation error, even though the code
    // does nothing that would create a problem in practice when the
    // constraints are broken.

    // To wit: a second call to the same function will fail to
    // compile...

    // bind_lifetimes_together(&mut x, &mut y);

    // ... with a message along the following lines:
    //
    //   |
    //   |     bind_lifetimes_together(&mut x, &mut y);
    //   |                             ------ first mutable borrow occurs here
    //   |     bind_lifetimes_together(&mut x, &mut y);
    //   |                             ^^^^^^  ------ first borrow later used here
    //   |                             |
    //   |                             second mutable borrow occurs here
    //
    // This can be confusing -- in what sense does the second &mut y
    // "use" the first &mut x? Well, again, in the sense that after the
    // first call, the lifetime of the mutable borrow of x is tied to
    // the lifetime of y. That means that when we call the function for
    // the second time, Rust can drop the first mutable borrow of y and
    // make a new one (there's no constraint there), but it *cannot* do
    // the same thing with the first mutable borrow of x -- y is still
    // alive at that point, so the first &mut x must be kept alive as
    // well.
    //
    // Still, let's acknowledge that the choice of word "use" to
    // describe this state of affairs can be somewhat misleading or
    // unintuitive, especially if you cause the error by accidentally
    // over-constraining the lifetimes of two otherwise unrelated
    // references. In our running example here, x and y don't (and
    // cannot) share access to any values, so it's hard for a budding
    // Rustacean to make sense of the claim that y somehow "uses" a
    // previous borrow of x.
    //
    // The same type of problem can easily happen when calling the
    // function in a loop, in which case you'll get an error message
    // stating that the previous mutable borrow happened in the previous
    // iteration of the loop.
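    //
    // A minimal sketch of the flip side (not taken from the original
    // code): if y is created afresh on every iteration, each call gets
    // its own 'a, so the borrow of x can end together with that
    // iteration's y and the loop compiles. The names fresh_y_demo and
    // bump are hypothetical; X and Y are redeclared locally only so
    // that the snippet stands on its own.
    #[allow(dead_code)]
    fn fresh_y_demo() -> u8 {
        struct X(u8);
        struct Y;

        // Same lifetime-binding signature as bind_lifetimes_together,
        // but with a body that mutates x so we can observe the calls.
        fn bump<'a>(x: &'a mut X, _y: &mut Option<&'a Y>) {
            x.0 += 1;
        }

        let mut x = X(0);
        for _ in 0..3 {
            // A new y per iteration: nothing tied to 'a outlives the
            // loop body, so the next iteration may borrow x again.
            let mut y = None;
            bump(&mut x, &mut y);
        }
        // No borrow of x is still alive here, so reading it is fine.
        x.0
    }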
    //
    // The moral of the story is: don't just automatically sprinkle 'a
    // on every reference. Wait for Rust to complain that you need to
    // specify lifetimes, and then try to use as many different ones as
    // possible, so as to have the most relaxed constraints possible.
    // You might still end up needing to bind lifetimes together, in
    // which case see the tips below on how to reconcile that with
    // mutability and repeated calls, but at least you'll know you're
    // not needlessly hamstringing yourself.
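    //
    // As a sketch of that advice, assuming the function never needs to
    // store anything derived from x inside y: giving the two references
    // distinct lifetimes ('a and 'b) lets the same x and y be passed
    // repeatedly. The names distinct_lifetimes_demo and bind_separately
    // are hypothetical, and X and Y are redeclared locally only so that
    // the snippet stands on its own.
    #[allow(dead_code)]
    fn distinct_lifetimes_demo() -> u8 {
        struct X(u8);
        struct Y;

        // 'a and 'b are independent, so the borrow of x is no longer
        // tied to the reference stored in y.
        fn bind_separately<'a, 'b>(x: &'a mut X, _y: &mut Option<&'b Y>) {
            x.0 += 1;
        }

        let mut x = X(0);
        let mut y = None;
        bind_separately(&mut x, &mut y);
        // With a shared 'a, this second call would be rejected (E0499).
        bind_separately(&mut x, &mut y);
        x.0
    }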

    // ---------------------------------------------------- bind_and_ret {{{1

    // If you end up in a situation where Rust requires x and y
    // lifetimes to be bound together in this way, one way to make it
    // possible to call the function repeatedly is to sort of
    // "roundtrip" the mutable borrow of x through the function.
    fn bind_and_ret<'a>(x: &'a mut X, _y: &mut Option<&'a Y>) -> &'a mut X {
        x
    }

    let mut x = X(0);
    let mut y = None;
    // Using distinct variable names for clarity (though you could keep
    // reassigning the same binding):
    let mut_x1 = &mut x;
    let mut_x2 = bind_and_ret(mut_x1, &mut y);
    let mut_x3 = bind_and_ret(mut_x2, &mut y);
    // What happens here (I think): when mut_x1 is passed into the
    // function, it's reborrowed. That reborrow's lifetime is then bound
    // to the lifetime of y, as previously. But unlike previously, we
    // then return the reborrow and store it in mut_x2. That means that
    // when we next need to call the function and provide a mutable
    // borrow of x, we don't need to borrow again (which would be an
    // error, since the first mutable borrow is still active): we can
    // just use the active borrow, which we happen to have a handle onto
    // (mut_x2), unlike previously.

    // This works in a loop as well, of course:
    let mut mut_x = mut_x3;
    for _ in 0..10 {
        mut_x = bind_and_ret(mut_x, &mut y);
    }

    // I'm not sure whether this would be considered idiomatic or hacky,
    // or what the performance characteristics are. The alternative I
    // saw in real-world code by experienced Rustaceans is to use a "get
    // out of immutability free card", like a shared ref to a (Ref)Cell
    // wrapping x (instead of a mutable ref to x; see below).
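    //
    // A small sketch to check that the roundtripped handle stays fully
    // usable for mutation, under the same signature as bind_and_ret.
    // The names roundtrip_demo and bump_and_ret are hypothetical, and X
    // and Y are redeclared locally only so that the snippet stands on
    // its own.
    #[allow(dead_code)]
    fn roundtrip_demo() -> u8 {
        struct X(u8);
        struct Y;

        // Like bind_and_ret, but it also mutates x before handing the
        // borrow back to the caller.
        fn bump_and_ret<'a>(x: &'a mut X, _y: &mut Option<&'a Y>) -> &'a mut X {
            x.0 += 1;
            x
        }

        let mut x = X(0);
        let mut y = None;
        // x is borrowed mutably only once; afterwards we keep threading
        // the returned handle through the function.
        let mut handle = &mut x;
        for _ in 0..3 {
            handle = bump_and_ret(handle, &mut y);
        }
        handle.0
    }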

    // -------------------------------------------- bind_shared_ref_cell {{{1

    // This is how you could achieve repeated calls to a function of the
    // same general shape and capabilities as those above via using a
    // RefCell.
    fn bind_shared_ref_cell<'a>(x: &'a RefCell<X>, _y: &mut Option<&'a Y>) {
        // proof that you can mutate x in this setup
        x.borrow_mut().0 = 42;
    }

    let x = RefCell::new(X(0));
    let mut y = None;
    for _ in 0..10 {
        bind_shared_ref_cell(&x, &mut y);
    }

    // ------------------------------------------------ bind_shared_cell {{{1

    // And this is how you could do it with a Cell.
    //
    // In both cases, the borrows of x are still very much tied to the
    // lifetime of y (via the lifetime annotations), but since they're
    // now shared (immutable) borrows, you can have as many of them as
    // you like.
    fn bind_shared_cell<'a>(x_outer: &'a Cell<X>, _y: &mut Option<&'a Y>) {
        // proof that you can mutate x in this setup; X(0) is a dummy
        // value which won't be used; if the type wrapped by the Cell
        // implements Default, you can also just use .take() here
        let mut x = x_outer.replace(X(0));
        x.0 = 42;
        x_outer.set(x);
    }

    let x = Cell::new(X(0));
    let mut y = None;
    for _ in 0..10 {
        bind_shared_cell(&x, &mut y);
    }
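
    // A sketch of the .take() variant mentioned above, assuming the
    // wrapped type implements Default (here Vec<u8>, as in the original
    // countwords code). The names cell_take_demo and push_byte are
    // hypothetical, and Y is redeclared locally only so that the
    // snippet stands on its own.
    #[allow(dead_code)]
    fn cell_take_demo() -> Vec<u8> {
        use std::cell::Cell;

        struct Y;

        fn push_byte<'a>(buf: &'a Cell<Vec<u8>>, _y: &mut Option<&'a Y>, byte: u8) {
            // take() leaves Vec::default() (an empty Vec) in the cell,
            // so no hand-written dummy value is needed.
            let mut v = buf.take();
            v.push(byte);
            buf.set(v);
        }

        let buf = Cell::new(Vec::new());
        let mut y = None;
        for byte in 0..3u8 {
            push_byte(&buf, &mut y, byte);
        }
        buf.into_inner()
    }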

    // ----------------------------------------------------- Performance {{{1

    // Between the three functions that allow being called repeatedly,
    // which would be the most idiomatic solution? And performance-wise?
    // Are there significant differences, or is it a wash? The overheads
    // that come to mind:
    //
    // - bind_and_ret: This is the only function that returns a value,
    //   which is not free in principle, although the value returned is
    //   just a reference. On the other hand, it doesn't have to do any
    //   additional bookkeeping, it just uses plain mutable references.
    // - bind_shared_ref_cell: Requires wrapping in a RefCell and
    //   dynamically checking borrow rules.
    // - bind_shared_cell: Requires wrapping in a Cell and fiddling with
    //   its contents, including instantiating a dummy value to swap
    //   into the cell so that we can take out the "real" one.
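    //
    // To make "dynamically checking borrow rules" concrete, here is a
    // minimal sketch using RefCell::try_borrow_mut, which reports a
    // conflicting borrow as an Err instead of panicking the way
    // borrow_mut would. The name refcell_runtime_check_demo is
    // hypothetical, and this says nothing about how large the cost of
    // the check is in practice.
    #[allow(dead_code)]
    fn refcell_runtime_check_demo() -> (bool, bool) {
        use std::cell::RefCell;

        let cell = RefCell::new(0u8);
        let first = cell.borrow_mut();
        // A second mutable borrow while `first` is alive is refused at
        // run time, not at compile time.
        let overlapping_ok = cell.try_borrow_mut().is_ok();
        drop(first);
        // Once the first borrow is released, borrowing succeeds again.
        let later_ok = cell.try_borrow_mut().is_ok();
        (overlapping_ok, later_ok)
    }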
}

// vi: set foldmethod=marker