Created
February 18, 2025 19:31
rustc arm assembly code differences for &dyn fn vs. &impl fn: accepting a function parameter
fn foo() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

fn bar(f: &impl Fn(i32) -> i32) -> i32 {
    f(10)
}

fn baz(f: &dyn Fn(i32) -> i32) -> i32 {
    f(10)
}

pub fn main() {
    let f = foo();
    println!("foo()(10): {}", f(10));
    println!("from bar: f(10): {}", bar(&f));
    println!("from baz: f(10): {}", baz(&f));
}

// Compile and run:
// $ rustc rust_impl_vs_dyn_function_parameter.rs && ./rust_impl_vs_dyn_function_parameter
// Produces:
// foo()(10): 11
// from bar: f(10): 11
// from baz: f(10): 11
/*
# Assembly Code

Let's take a look at the compiled assembly code for bar vs. baz! (aka impl vs dyn for a function parameter!)

---------------------
- get assembly code using `objdump`: `objdump -D ./rust_impl_vs_dyn_function_parameter`
- grep for "bar" and "baz"

Here's the generated assembly when we use `&impl TRAIT`:
--------------------------------------------------------
000000010000446c <__ZN1y3bar17hd084ba6d3d1efec4E>:
10000446c: a9bf7bfd stp x29, x30, [sp, #-0x10]!
100004470: 910003fd mov x29, sp
100004474: 52800141 mov w1, #0xa ; =10
100004478: 97ffffee bl 0x100004430 <__ZN1y3foo28_$u7b$$u7b$closure$u7d$$u7d$17h54096ff91efe5082E>
10000447c: a8c17bfd ldp x29, x30, [sp], #0x10
100004480: d65f03c0 ret
--------------------------------------------------------
Notice that we have only 1 instruction for invoking our closure:
(1) call the closure generated by `foo` directly at its known address (`bl`)

Here's the generated assembly when we use `&dyn TRAIT`:
--------------------------------------------------------
0000000100004484 <__ZN1y3baz17hc9afa0533ff6b436E>:
100004484: a9bf7bfd stp x29, x30, [sp, #-0x10]!
100004488: 910003fd mov x29, sp
10000448c: f9401428 ldr x8, [x1, #0x28]
100004490: 52800141 mov w1, #0xa ; =10
100004494: d63f0100 blr x8
100004498: a8c17bfd ldp x29, x30, [sp], #0x10
10000449c: d65f03c0 ret
--------------------------------------------------------
Notice that here we have 2 distinct instructions for invoking `f(10)`:
(1) load the address of the closure's call method from the vtable that `x1` points to (`ldr`)
(2) invoke the function whose address is now in the register (`blr`)

In both cases, we have the same number of instructions for setting up the argument to our closure.
We put the input value, 10, into a register (`mov w1, #0xa`). Note that it uses the literal 10 value, written out
here in hexadecimal.
# Why?

Why is there a difference? Why does the `dyn` version take 2 instructions to call the closure when the `impl`
version only takes 1?

This is because `dyn` means dynamic dispatch. A `&dyn` reference carries a pointer to the data as well as a pointer
to the virtual function table. Inside `baz`, neither address can be known statically. Thus the compiler must always
emit a **procedure** that looks up where the closure's code lives before it can invoke it.

In contrast, when we use `impl`, we are **hiding** the concrete type behind the trait. However, __(1) this type
exists and (2) the compiler knows about it!__ It is a **unique, unnamed type** from the compiler's perspective.
This means `rustc` can resolve the call in `bar` at compile time and branch directly to the closure returned by `foo`.
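
As a rough sketch of the same distinction: an `&impl Fn` parameter behaves much like a generic parameter, so each call
fixes one concrete closure type at compile time, while a `&dyn Fn` parameter lets one function accept any closure at
run time. The names `bar_generic` and `call_all` are hypothetical and only used for illustration.

```rust
// Hypothetical generic spelling of `bar`: `F` is a concrete closure type
// that the compiler knows at every call site.
fn bar_generic<F: Fn(i32) -> i32>(f: &F) -> i32 {
    f(10)
}

// Same shape as `baz` above: the closure is reached through a vtable.
fn baz(f: &dyn Fn(i32) -> i32) -> i32 {
    f(10)
}

// Hypothetical helper: closures of different types can share one slice
// only because they sit behind `&dyn`.
fn call_all(fs: &[&dyn Fn(i32) -> i32]) -> Vec<i32> {
    fs.iter().map(|f| baz(*f)).collect()
}

fn main() {
    let add_one = |x: i32| x + 1;
    let double = |x: i32| x * 2;

    // Static dispatch: one call per concrete closure type.
    println!("{} {}", bar_generic(&add_one), bar_generic(&double));

    // Dynamic dispatch: both closures go through the same `baz`.
    let fs: [&dyn Fn(i32) -> i32; 2] = [&add_one, &double];
    println!("{:?}", call_all(&fs));
}
```

The two closures have different types, so they could not be stored together without `dyn`.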
# Tradeoffs?

Dynamic dispatch requires 2 pointers (location of data, location of virtual function lookup table) whereas static
dispatch only requires 1 (where in the binary the function starts). It'll also require 1 additional instruction.

Dynamic dispatch reduces binary bloat compared to using existential types (i.e. `impl TRAIT`). For the latter, the
compiler generates *a new* copy of the function for each concrete type it's called with in the source code. In the
former's case, a single compiled function serves every type, leading the way for more reuse in binary form.
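
The pointer counts above can be checked with `std::mem::size_of`. This is a minimal sketch, assuming current `rustc`
behaviour where a `&dyn Trait` reference is two machine words wide; `thin_ref_size` and `fat_ref_size` are
hypothetical helper names.

```rust
use std::mem::size_of;

// Size of a reference to a concrete (sized) closure type: one pointer.
fn thin_ref_size<F: Fn(i32) -> i32>(_f: &F) -> usize {
    size_of::<&F>()
}

// Size of a trait-object reference: data pointer plus vtable pointer.
fn fat_ref_size() -> usize {
    size_of::<&dyn Fn(i32) -> i32>()
}

fn main() {
    let add_one = |x: i32| x + 1;
    println!("&F is {} bytes", thin_ref_size(&add_one));
    println!("&dyn Fn(i32) -> i32 is {} bytes", fat_ref_size());
}
```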
*/
Side note for you type theorists out there: `impl Trait` in a parameter isn't an existential; it's still a universal. In other words, `impl Trait` is universal in an input position, but existential in an output position.
Using a generic parameter that is constrained to a function type using a `where` clause generates considerably more assembly code! (i.e. `<F> ... where F: Fn(i32) -> i32`)
https://gist.github.com/malcolmgreaves/a6d17be0d1fecafc9cb0421c7d78e6bf