@malcolmgreaves
Created February 18, 2025 19:31
rustc ARM assembly code differences for &dyn Fn vs. &impl Fn: accepting a function parameter
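// Returns a closure whose concrete type is hidden behind `impl Fn`.
fn foo() -> impl Fn(i32) -> i32 {
    |x| x + 1
}
// Static dispatch: the concrete closure type is known at compile time.
fn bar(f: &impl Fn(i32) -> i32) -> i32 {
    f(10)
}
// Dynamic dispatch: the closure is reached through a trait object.
fn baz(f: &dyn Fn(i32) -> i32) -> i32 {
    f(10)
}
pub fn main() {
    let f = foo();
    println!("foo()(10): {}", f(10));
    println!("from bar: f(10): {}", bar(&f));
    println!("from baz: f(10): {}", baz(&f));
}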
// Compile and run:
// $ rustc rust_impl_vs_dyn_function_parameter.rs && ./rust_impl_vs_dyn_function_parameter
// Produces:
// foo()(10): 11
// from bar: f(10): 11
// from baz: f(10): 11
/*
# Assembly Code
Let's take a look at the compiled assembly code for `bar` vs. `baz`! (aka `&impl Fn` vs. `&dyn Fn` for a closure parameter!)
---------------------
- get assembly code using `objdump`: `objdump -D ./rust_impl_vs_dyn_function_parameter`
- grep for "bar" and "baz"
Here's the generated assembly when we use `&impl TRAIT`:
--------------------------------------------------------
000000010000446c <__ZN1y3bar17hd084ba6d3d1efec4E>:
10000446c: a9bf7bfd stp x29, x30, [sp, #-0x10]!
100004470: 910003fd mov x29, sp
100004474: 52800141 mov w1, #0xa ; =10
100004478: 97ffffee bl 0x100004430 <__ZN1y3foo28_$u7b$$u7b$closure$u7d$$u7d$17h54096ff91efe5082E>
10000447c: a8c17bfd ldp x29, x30, [sp], #0x10
100004480: d65f03c0 ret
--------------------------------------------------------
Notice that we have only 1 instruction for invoking our closure:
(1) call the closure generated by `foo` directly, at an address fixed at compile time (`bl`)
Here's the generated assembly when we use `&dyn TRAIT`:
--------------------------------------------------------
0000000100004484 <__ZN1y3baz17hc9afa0533ff6b436E>:
100004484: a9bf7bfd stp x29, x30, [sp, #-0x10]!
100004488: 910003fd mov x29, sp
10000448c: f9401428 ldr x8, [x1, #0x28]
100004490: 52800141 mov w1, #0xa ; =10
100004494: d63f0100 blr x8
100004498: a8c17bfd ldp x29, x30, [sp], #0x10
10000449c: d65f03c0 ret
--------------------------------------------------------
Notice that here we have 2 distinct instructions for invoking `f(10)`:
(1) load the address of the closure's code from the vtable that `x1` points to into a register (`ldr`)
(2) invoke the function whose address was loaded into that register (`blr`)
In both cases, we have the same number of instructions for setting up the argument to our closure.
We put the input value, 10, into a register (`mov w1, #0xa`). Note that it uses the literal 10 value, written out
here in hexadecimal.
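The `ldr` + `blr` pair in `baz` can be mimicked by hand. The sketch below is only an illustration: every name in it
(`VTable`, `FatRef`, `AddOne`, `baz_by_hand`) is hypothetical, and the one-entry table is an assumption made for
brevity, not rustc's real vtable layout.

```rust
// Hand-rolled stand-in for `&dyn Fn(i32) -> i32`. All names are hypothetical.
struct VTable {
    // Address of the code to run. A real vtable holds more entries than this.
    call: unsafe fn(*const (), i32) -> i32,
}

struct FatRef {
    data: *const (),         // where the closure's state lives
    vtable: &'static VTable, // where to look up the function address
}

struct AddOne; // plays the role of the closure `|x| x + 1`

unsafe fn add_one_call(data: *const (), x: i32) -> i32 {
    // SAFETY: this function is only ever paired with a pointer to an `AddOne`.
    let _this: &AddOne = unsafe { &*(data as *const AddOne) };
    x + 1
}

static ADD_ONE_VTABLE: VTable = VTable { call: add_one_call };

fn baz_by_hand(f: FatRef) -> i32 {
    let target = f.vtable.call;   // step (1): load the function address
    unsafe { target(f.data, 10) } // step (2): indirect call through it
}

fn main() {
    let closure = AddOne;
    let f = FatRef {
        data: &closure as *const AddOne as *const (),
        vtable: &ADD_ONE_VTABLE,
    };
    println!("baz_by_hand: {}", baz_by_hand(f));
}
```

The two commented steps in `baz_by_hand` correspond to the two instructions listed above.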
# Why?
Why is there a difference? Why does the `dyn` version take 2 instructions to call the closure when the `impl`
version only takes 1?
This is because `dyn` means dynamic dispatch. A `&dyn` reference must carry a pointer to the data as well as a
pointer to the virtual function table. Inside `baz`, the address of the closure's code **cannot** be known
statically. Thus, the compiler must always emit a **procedure** that looks up where the closure's code lives at
runtime so it can invoke it.
In contrast, when we use `impl`, we are **hiding** the concrete type behind the trait. However, **(1) this type
exists and (2) the compiler knows about it!** It is a **unique, unnamed type** from the compiler's perspective.
This means `rustc` can resolve the call to the closure returned by `foo` statically inside `bar`.
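A small sketch of the "unique, unnamed type" point: two closures with identical bodies still get distinct concrete
types, and `bar` accepts both with no vtable involved. The helper `type_id_of` is a hypothetical name introduced
only for this illustration.

```rust
use std::any::TypeId;

// `&impl Fn(i32) -> i32` in argument position is shorthand for an anonymous
// generic parameter, so the concrete closure type is known at each call site.
fn bar(f: &impl Fn(i32) -> i32) -> i32 {
    f(10)
}

// Hypothetical helper: reports the identity of the concrete type behind a reference.
fn type_id_of<T: 'static>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

fn main() {
    let add_one = |x: i32| x + 1;
    let also_add_one = |x: i32| x + 1;

    // Same body, but each closure expression gets its own unnamed type.
    assert_ne!(type_id_of(&add_one), type_id_of(&also_add_one));

    // Both calls are resolved at compile time.
    println!("{} {}", bar(&add_one), bar(&also_add_one)); // prints "11 11"
}
```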
# Tradeoffs?
Dynamic dispatch requires 2 pointers (location of data, location of virtual function lookup table) whereas static
dispatch only requires 1 (location of data; where the function starts in the binary is encoded in the `bl`
instruction itself). Dynamic dispatch also requires 1 additional instruction per call (the `ldr`).
Dynamic dispatch reduces binary bloat compared to static dispatch (i.e. `impl TRAIT`). For the latter, the
compiler generates *a new* copy of the function for each concrete type it's called with in the source code. In the
former's case, a single compiled function serves every type, leading the way for more reuse in binary form.
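The pointer-count difference can be checked directly. This is a minimal sketch; `thin_ref_size` and `apply_all` are
hypothetical helpers, not part of the program above.

```rust
use std::mem::size_of;

// Hypothetical helper: size of a reference to whatever concrete closure type `F` is.
fn thin_ref_size<F: Fn(i32) -> i32>(_: &F) -> usize {
    size_of::<&F>()
}

// Hypothetical helper: one compiled function handles every closure type
// that sits behind the trait object.
fn apply_all(fs: &[&dyn Fn(i32) -> i32]) -> Vec<i32> {
    fs.iter().map(|f| f(10)).collect()
}

fn main() {
    let add_one = |x: i32| x + 1;
    let double = |x: i32| x * 2;

    // Static dispatch: the reference is one pointer wide (data only).
    assert_eq!(thin_ref_size(&add_one), size_of::<usize>());
    // Dynamic dispatch: the reference is two pointers wide (data + vtable).
    assert_eq!(size_of::<&dyn Fn(i32) -> i32>(), 2 * size_of::<usize>());

    let fs: [&dyn Fn(i32) -> i32; 2] = [&add_one, &double];
    println!("{:?}", apply_all(&fs)); // prints "[11, 20]"
}
```

`apply_all` also shows the reuse side of the tradeoff: closures of different concrete types share one function.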
*/
malcolmgreaves commented Mar 11, 2025

Using a generic parameter that is constrained to a function trait with a where clause generates considerably more assembly code! (i.e. `<F> ... where F: Fn(i32) -> i32`)
https://gist.github.com/malcolmgreaves/a6d17be0d1fecafc9cb0421c7d78e6bf
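
For reference, the generic form described here presumably looks something like the sketch below. The name `qux` is hypothetical and not taken from the linked gist, which holds the actual code and assembly.

```rust
// Hypothetical name `qux`: same body as `bar`, with the generic parameter
// and its bound spelled out in a where clause.
fn qux<F>(f: &F) -> i32
where
    F: Fn(i32) -> i32,
{
    f(10)
}

fn main() {
    let add_one = |x: i32| x + 1;
    println!("from qux: f(10): {}", qux(&add_one));
}
```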
