This lesson aims to teach the reader about the inner workings of Rust's `*` (deref) operator
and the `Deref` trait, as well as how to take advantage of those features.
This lesson assumes that:
- You have a basic understanding of Rust's type system. Most importantly, you understand that references are types, meaning `&T`, `&mut T` and `T` are all different types.
A lot of terms used in this lesson include the word "reference". For disambiguation purposes, I'll explicitly list here how I'll refer to each term for the remainder of the lesson.
| Original Term | Description | How it may be referred to |
|---|---|---|
| trait Deref | The Deref trait (std::ops::Deref) | reference trait |
| Deref::Target | The associated type Target of the reference trait | Target, <T as Deref>::Target, <T as DerefMut>::Target |
| * | The deref operator | deref operator |
| &T | A shared reference to a variable of type T | &T, shared reference or borrowed value |
| &mut T | A unique reference to a variable of type T | &mut, mutable reference or mutably borrowed value |
| & | The borrow operator, used to create shared references | &, borrow or & operator |
| &mut | The mutable borrow operator, used to create mutable references | &mut, mutably borrow or &mut operator |
pub trait Deref {
    /// The resulting type after dereferencing.
    type Target: ?Sized;

    /// Dereferences the value.
    fn deref(&self) -> &Self::Target;
}

With the trait being called `Deref`, it might be confusing to see that the function `deref`
returns a reference, not a dereferenced value.
A different way of looking at it is: given a type `T`, implementing `Deref` for `T` is a way of telling the compiler that `&T` may also be referenced as `&Target`.
By itself, this trait doesn't do much, but the Rust compiler uses its implementations in multiple ways, which I'll explain later on.
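As a minimal sketch of that idea, the hypothetical wrapper `Meters` below (a name I made up for illustration, along with the helper `length_of`) implements `Deref` so that a `&Meters` can be viewed as a `&f64`. Calling `deref()` explicitly shows that the method only hands out a reference:

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only for illustration.
struct Meters(f64);

impl Deref for Meters {
    type Target = f64;

    // Note the return type: a reference to `Target`, not `Target` itself.
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Reads the wrapped value through an explicit `deref()` call.
fn length_of(distance: &Meters) -> f64 {
    let as_float: &f64 = distance.deref(); // `&Meters` viewed as `&f64`
    *as_float // `f64` is `Copy`, so this copies the number out
}

fn main() {
    let distance = Meters(12.5);
    println!("Length: {}", length_of(&distance));
}
```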
| Type | Deref::Target |
|---|---|
| &T | T |
| &mut T | T |
| Rc<T> | T |
| Box<T> | T |
Usage #1: Implicitly invoking `deref()` to access fields or invoke methods on `Target`.
Consider the following example:
struct Vector2 { x: f32, y: f32 }

fn main() {
    // `x` and `y` are `f32`, so the literals must be floats
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.x);
}

The type `Box<Vector2>` does not have a field named `x`, yet this example compiles. Why is that?
A: Given a type T:
When accessing a field or invoking a method on a variable of type `T`, the Rust compiler will
first check if `T` has a field/method with the provided name.
If `T` does not have any fields/methods with the provided name, the compiler will then check
if `<T as Deref>::Target` has a field/method with the provided name, using that instead if it exists.
With that in mind, the example above is equivalent to:
use std::ops::Deref; // needed to call `deref()` explicitly

struct Vector2 { x: f32, y: f32 }

fn main() {
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.deref().x);
}

The compiler allows us to omit the `deref()` invocation here.
The reference trait is quite useful for reducing boilerplate code. If it didn't exist, you would have
to somehow get a reference to `T` (`Vector2` in the example above) every time you access fields or invoke
methods on the value inside a `Box<T>`. The same applies to most wrapper types (`Rc`, `Arc`,
`MutexGuard`, etc.).
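The lookup described above is not limited to a single layer: the compiler keeps dereferencing until it finds the field or method. Here is a small sketch with two nested wrappers. The method `length_squared` and the helper `length_squared_of` are hypothetical names I added for the example:

```rust
use std::rc::Rc;

struct Vector2 { x: f32, y: f32 }

impl Vector2 {
    // Hypothetical method, only here so we have something to call.
    fn length_squared(&self) -> f32 {
        self.x * self.x + self.y * self.y
    }
}

// Hypothetical helper that reads through two wrapper layers.
fn length_squared_of(nested: &Rc<Box<Vector2>>) -> f32 {
    // Neither `Rc` nor `Box` has `length_squared`, so the compiler walks
    // `Rc<Box<Vector2>>` -> `Box<Vector2>` -> `Vector2` until it finds it.
    nested.length_squared()
}

fn main() {
    let nested = Rc::new(Box::new(Vector2 { x: 3.0, y: 4.0 }));
    // Field access goes through the same chain of `deref()` calls.
    println!("x = {}, length squared = {}", nested.x, length_squared_of(&nested));
}
```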
Usage #2: Implicitly invoking `Deref::deref()` when using the operators `&` (borrow) or `&mut` (mutably borrow).
Consider the following example:
fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(&boxed);
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}

The function `takes_int` requires an input of type `&i32`, but we are calling it with an argument
of type `&Box<i32>`, yet this example compiles. Why is that?
A: Given a type T:
When using the operators `&` (borrow) or `&mut` (mutably borrow), the compiler will check if the type
`&T` (or `&mut T` if using the `&mut` operator) satisfies the type "conditions" in that specific context.
If the type `&T` does not satisfy those conditions, the compiler will then check if `&Target` (or
`&mut Target` if using the `&mut` operator, which additionally requires `DerefMut`) satisfies the conditions, using that instead if it does. This behavior is known as deref coercion.
With that in mind, the example above is equivalent to:
use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(boxed.deref());
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}

The compiler replaces the `&` operator with an invocation of `<Box<i32> as Deref>::deref()`.
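The compiler can chain this step more than once when a single `deref()` is not enough. The sketch below goes from `&Box<String>` to `&str`, with the helper `shout` being a hypothetical function I added for the example:

```rust
use std::ops::Deref;

// Hypothetical helper that only accepts string slices.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let boxed: Box<String> = Box::new(String::from("hello"));

    // Implicit: `&Box<String>` -> `&String` -> `&str`.
    let implicit = shout(&boxed);

    // The same two steps written out by hand.
    let as_string: &String = boxed.deref();
    let as_str: &str = as_string.deref();
    let explicit = shout(as_str);

    assert_eq!(implicit, explicit);
    println!("{implicit}");
}
```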
Unlike with the & and &mut operators, *(deref) interacts directly with the reference trait.
As such, it can only be used on types that implement that trait.
In an oversimplified way, the `*` (deref) operator does the following, given a type `T`:
- Invokes `<T as Deref>::deref()`, which returns a variable of type `&Target`.
- Accesses the value that the returned reference points to, which will be of type `Target`.
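The two steps above can be written out by hand. In this sketch, `Celsius` is a hypothetical wrapper and both functions are names I made up. They should return the same value, one using the operator and one spelling out what it does:

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only for illustration.
struct Celsius(i32);

impl Deref for Celsius {
    type Target = i32;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// What we normally write.
fn with_operator(temp: Celsius) -> i32 {
    *temp
}

// Roughly what the operator does for us.
fn by_hand(temp: Celsius) -> i32 {
    // Step 1: invoke `deref()`, which returns `&Target` (here `&i32`).
    let reference: &i32 = <Celsius as Deref>::deref(&temp);
    // Step 2: access the value behind the returned reference.
    *reference
}

fn main() {
    println!("{} {}", with_operator(Celsius(21)), by_hand(Celsius(21)));
}
```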
Usage #3: Given a `Box<T>`, the `*` (deref) operator can be used to move `T` out of the box.
Consider the example:
fn main() {
    let boxed: Box<&str> = Box::new("boxed str");
    let deref_result: &str = *boxed;
}

Given any type `T`, `Box<T>` implements `Deref<Target = T>`. This means that using the `*` (deref)
operator on `Box<T>` will return a variable of type `T`, essentially opening the box.
Moving out like this is special to `Box<T>`, which the compiler handles natively. For every other type that
implements the reference trait (`&T`, `&mut T`, `Rc<T>`, etc.), `*` only goes through the reference returned by
`deref()`, and moving out of references is forbidden (in other words: you can only move out of variables you own).
With those types, `*` can only copy the value out, and only when `Target` implements `Copy` (see usage #5).
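One detail is worth checking for yourself: as far as I know, moving a non-`Copy` value out through `*` is built into `Box`, while types that only go through `deref()` (such as `Rc`) reject it with error E0507. The sketch below contrasts the two, with `unbox` and `from_rc` being hypothetical helper names:

```rust
use std::rc::Rc;

// Consumes the box and moves the `String` out of it.
fn unbox(boxed: Box<String>) -> String {
    *boxed
}

// `Rc` only offers `deref()`, so the best we can do is clone the `String`.
fn from_rc(shared: Rc<String>) -> String {
    // let moved: String = *shared; // rejected: cannot move out of an `Rc`
    (*shared).clone()
}

fn main() {
    let text = unbox(Box::new(String::from("owned")));
    let copy = from_rc(Rc::new(String::from("shared")));
    println!("{text}, {copy}");
}
```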
Usage #4: Given a type `T` that implements `DerefMut`, the `*` (deref) operator can be used to replace the
value `Target` inside a variable of type `T`.
The same applies to any of the assignment operators (`=`, `+=`, `-=`, etc.).
Consider the example:
fn increment(input: &mut i32) {
    *input += 1;
}

We know that for any given `T`, `&mut T` implements `Deref<Target = T>` and `DerefMut`.
In the example above, the `*` (deref) operator is used to replace the value of type `i32` that `input`
points to.
This isn't limited to the implementation of `&mut T`. Consider the example:
fn main() {
    let mut boxed: Box<Vec<i32>> = Box::new(vec![2, 5]);
    *boxed = vec![3, 4]; // valid
    // *boxed = Box::new(vec![3, 5]); // invalid! We can only replace `Target` (uncommenting this fails to compile)
}

Given any type `T`, `Box<T>` implements `DerefMut` (with `Target = T`). This means we can use the deref
operator here to replace the value `Target` inside a variable of type `Box<T>`.
Since we are mutating variables, usage #4 requires the outermost binding to be mutable. Mutable
references are inherently mutable, so there's no need for them to be preceded by the keyword
`mut` (as in `let mut mut_ref = &mut 5;`).
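The same works for your own types once they implement `DerefMut`. A minimal sketch, assuming a hypothetical wrapper named `Counter` and a made-up helper `bump`:

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical wrapper type, used only for illustration.
struct Counter {
    count: u32,
}

impl Deref for Counter {
    type Target = u32;

    fn deref(&self) -> &Self::Target {
        &self.count
    }
}

impl DerefMut for Counter {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.count
    }
}

// The binding must be `mut`, because we mutate through it.
fn bump(mut counter: Counter) -> u32 {
    *counter = 10; // replaces `Target` through `deref_mut()`
    *counter += 2; // compound assignment works the same way
    *counter // reading back only needs `deref()`
}

fn main() {
    println!("Count: {}", bump(Counter { count: 0 }));
}
```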
Usage #5: Given a type `T` that implements `Deref`, where `Deref::Target` implements `Copy`, the `*` (deref) operator
can be used on `T` to get a copy of `Target`.
Consider the example:
fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = *x;
    println!("X: {x}, Clone: {cloned_x}");
}

Since `&i32` implements `Deref<Target = i32>`, we can use the `*` (deref) operator to get a copy of
`Target` (`i32`), in this case the value 10.
With this in mind, the following example is equivalent to the one above (for `Copy` types, `clone()` is just a copy):
use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = x.deref().clone();
    println!("X: {x}, Clone: {cloned_x}");
}

Usage #5 is not exclusive to references. The following example is also valid:
fn main() {
    let boxed_int: Box<i32> = Box::new(7);
    let cloned_int: i32 = *boxed_int;
    println!("Boxed: {boxed_int}, Clone: {cloned_int}");
}

Just like `&i32`, `Box<i32>` implements `Deref<Target = i32>`, which allows us to use the `*` (deref)
operator to copy `Target`.
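To show that the only requirements are `Deref` on the outer type and `Copy` on `Target`, here is a sketch of a generic helper. `copy_out` is a hypothetical function I wrote for this example, not something from the standard library:

```rust
use std::ops::Deref;
use std::rc::Rc;

// Hypothetical helper: copies `Target` out of any type implementing `Deref`.
fn copy_out<P>(pointer: &P) -> P::Target
where
    P: Deref,
    P::Target: Copy,
{
    // First `*` reaches `P`, second `*` goes through `P`'s `Deref` impl.
    **pointer
}

fn main() {
    let from_ref = copy_out(&&10);
    let from_box = copy_out(&Box::new(7));
    let from_rc = copy_out(&Rc::new('x'));
    println!("{from_ref} {from_box} {from_rc}");
}
```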
Suppose we want to ensure an integer is always between 0 and 100. We can easily enforce that by wrapping the integer in a type with a private field:
pub struct Percent {
    inner: u8,
}

impl Percent {
    pub fn new(value: u8) -> Percent {
        // enforce that inner is always between 0 ~ 100
        let inner = u8::clamp(value, 0, 100);
        Percent { inner }
    }

    // We can provide a public `set` method to allow users to mutate inner, while still enforcing the bounds
    pub fn set(&mut self, value: u8) {
        *self = Self::new(value); // Quietly using #4 on `&mut Percent`
    }
}

But how would the user access `inner`?
A basic implementation would be adding a `get` method to provide read-only access to it:
impl Percent {
    pub fn get(&self) -> u8 {
        self.inner
    }
}

Although that is fine, if this type is frequently used, invoking `get()` can quickly get verbose and tiring.
We can solve this by implementing `Deref`, taking advantage of the compiler usages #1, #2 and #5:
use std::ops::Deref;

impl Deref for Percent {
    type Target = u8;

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

This gives users a read-only view of `Percent::inner`:
fn main() {
    let percent = Percent::new(200);

    // Usage #1: the compiler invokes `deref` to access `Target`, which we can invoke `u8::count_zeros()` on
    let zeros = percent.count_zeros();

    // Usage #2: the compiler invokes deref to borrow `Target` to the function `print_int()`
    print_int(&percent);

    // Usage #5: we copy `Target`, which returns a `u8` that we can compare to `100`
    if *percent == 100 {
        println!("Maximum percent!");
    }

    // Usage #5 again: `u8` is `Copy`, so this copies `Target` out and `percent` stays usable
    let inner: u8 = *percent;
}

fn print_int(int: &u8) {
    println!("Int: {int}");
}

In this example, it's important to note that we do not want to implement `DerefMut` for `Percent`,
as this would allow anyone to replace `Percent::inner`, bypassing the constraints enforced by `Percent::new()`.
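To make that risk concrete, here is a sketch of what could go wrong. `LeakyPercent` is a hypothetical variant of `Percent` that I wrote only to demonstrate the problem, and `break_invariant` is a made-up helper:

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical variant of `Percent` that (unwisely) implements `DerefMut`.
pub struct LeakyPercent {
    inner: u8,
}

impl LeakyPercent {
    pub fn new(value: u8) -> Self {
        // Same clamp as `Percent::new`.
        LeakyPercent { inner: u8::clamp(value, 0, 100) }
    }
}

impl Deref for LeakyPercent {
    type Target = u8;

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

impl DerefMut for LeakyPercent {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.inner
    }
}

fn break_invariant() -> u8 {
    let mut percent = LeakyPercent::new(50);
    *percent = 250; // usage #4 writes straight into `inner`, skipping the clamp
    *percent
}

fn main() {
    println!("Percent is now {}", break_invariant());
}
```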
If you're familiar with the type-state pattern, you might have come across a situation where you need to store the possible states somewhere, which can be done by using enum variants:
pub struct Npc<T> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

pub struct Idle;
pub struct Charging { time_remaining: f32 }

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

Imagine we have a variable of type `NpcEnum` and, in a specific context, we want to access its field
`name`, but we don't really care about `state`.
A "brute-force" implementation could be done by defining a getter method `name(&self) -> &str`
that matches on `NpcEnum` to return that field:
impl NpcEnum {
    pub fn name(&self) -> &str {
        match self {
            NpcEnum::Idle(this) => &this.name,
            NpcEnum::Charging(this) => &this.name,
        }
    }
}

Although that is fine, you'll have a lot of maintenance to do whenever you add new states or new fields.
This can be solved by implementing the reference trait for `NpcEnum`:
use std::ops::Deref;

impl Deref for NpcEnum {
    type Target = Npc<dyn std::any::Any>;

    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

// Note: since `dyn Any` does not implement `Sized`,
// the implementation above requires relaxing the bounds on `T`:
pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

With that implementation, we can access any fields/methods of `Npc` as long as those don't require `state`:
fn main() {
    let npc = Npc {
        name: String::from("Houtamelo"),
        max_health: 69, //nice
        health: 7,
        state: Idle,
    };

    let npc_enum = NpcEnum::Idle(npc);
    let name = &npc_enum.name;
    let max_health = npc_enum.max_health;
    let health = npc_enum.health;

    println!(
        "Npc stats:\n\
        \tName: {name}\n\
        \tMax Health: {max_health}\n\
        \tHealth: {health}\n"
    );
}

This also means that adding more fields does not require any additional maintenance, and adding a new state only requires one extra match arm in `deref`.
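The same goes for methods. As a sketch, the block below repeats the types from above and adds a hypothetical state-independent method `is_alive` plus a made-up helper `describe`. Both are reachable from `NpcEnum` without matching on the variants:

```rust
use std::any::Any;
use std::ops::Deref;

pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

pub struct Idle;
pub struct Charging {
    time_remaining: f32,
}

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

impl Deref for NpcEnum {
    type Target = Npc<dyn Any>;

    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

// Hypothetical state-independent method, available for every `T`.
impl<T: ?Sized> Npc<T> {
    pub fn is_alive(&self) -> bool {
        self.health > 0
    }
}

// Hypothetical helper that works for any state.
fn describe(npc: &NpcEnum) -> String {
    format!("{} ({}/{})", npc.name, npc.health, npc.max_health)
}

fn main() {
    let idle = NpcEnum::Idle(Npc {
        name: String::from("Guard"),
        max_health: 30,
        health: 0,
        state: Idle,
    });
    let charging = NpcEnum::Charging(Npc {
        name: String::from("Golem"),
        max_health: 50,
        health: 20,
        state: Charging { time_remaining: 1.5 },
    });
    for npc in [&idle, &charging] {
        println!("{} alive: {}", describe(npc), npc.is_alive());
    }
}
```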
Deref is a tool, and like any other tool, it doesn't fit all cases.
Given a type `T`, implementing `Deref` can cause problems if both `T` and `Target`
implement the same trait or have a method with the same name.
Consider the example:
use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let clone = rc.clone();
}

Both `Rc<Vec<i32>>` and `Vec<i32>` implement `Clone`. Since `Rc<Vec<i32>>` implements
`Deref<Target = Vec<i32>>`, which implementation is being called here? If you check usage #1,
you can deduce that `Rc::clone` is the one prioritized by the compiler, but that's still
implicit, and it may be confusing for other people reading your code (or your future self).
However, having a few overlapping implementations between `T` and `Target` is almost impossible
to avoid. In those cases, I recommend explicitly stating which implementation is being called:
use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let cloned_rc = Rc::clone(&rc);
}

Check this section of the official book for more "words" of caution.
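To make the difference between the two `clone` implementations concrete, here is a small sketch that calls each one explicitly and inspects the reference count. The helpers `share` and `duplicate` are hypothetical names I picked for the example:

```rust
use std::rc::Rc;

// Clones the pointer: both handles share one allocation.
fn share(rc: &Rc<Vec<i32>>) -> Rc<Vec<i32>> {
    Rc::clone(rc)
}

// Clones `Target`: a brand new, independent vector.
fn duplicate(rc: &Rc<Vec<i32>>) -> Vec<i32> {
    Vec::clone(rc) // `&Rc<Vec<i32>>` is coerced to `&Vec<i32>` (usage #2)
}

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);

    let shared = share(&rc);
    let copy = duplicate(&rc);

    // Only `Rc::clone` bumps the reference count.
    println!("strong count: {}", Rc::strong_count(&rc));
    println!("same allocation: {}", Rc::ptr_eq(&rc, &shared));
    println!("independent copy: {copy:?}");
}
```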
I hope you learned something. Any feedback is appreciated.