Last active
October 9, 2018 16:45
Custom calling convention for Rust for returning Result and Option on x86/x64:

foo() -> Result<T, E>

When returning an Ok, foo would clear CF via "clc", and then return
the value as if the return type were just a plain T.

When returning an Err, foo would set the CF flag via "stc", and then
return the value as if the return type were just a plain E.

On the caller side, immediately after a call to foo, the caller could
do a "jc error_label". Fallthrough would be the success path and it'd
behave just as if the function returned a normal result. `error_label`
would have the error path, which would behave as if the function
returned an error result. Tail calls would work too.

That way, you avoid the overhead of packing values into a Result struct
on the callee side, and unpacking them on the caller side in common
cases.
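
To make the callee/caller split concrete, here is a minimal Rust sketch that only simulates the convention; it is not how a compiler would implement it. `RawReturn`, `parse_digit_raw`, and `parse_digit` are hypothetical names invented for illustration, with T = u32 and E = u8 chosen arbitrarily. The `cf` field stands in for the carry flag and `reg` for the single return register.

```rust
/// Hypothetical model of the state visible right after `call foo`:
/// the carry flag plus the one register holding the return value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RawReturn {
    cf: bool, // CF: clear = Ok, set = Err
    reg: u64, // payload, laid out as a plain T or a plain E
}

/// Callee side: `clc` then return a plain T, or `stc` then return a plain E.
/// (Invented example function; T = u32, E = u8.)
fn parse_digit_raw(c: u8) -> RawReturn {
    if c.is_ascii_digit() {
        RawReturn { cf: false, reg: u64::from(c - b'0') } // clc; ret
    } else {
        RawReturn { cf: true, reg: u64::from(c) } // stc; ret
    }
}

/// Caller side: the `if raw.cf` models `jc error_label`.
fn parse_digit(c: u8) -> Result<u32, u8> {
    let raw = parse_digit_raw(c);
    if raw.cf {
        Err(raw.reg as u8) // error_label: treat the register as a plain E
    } else {
        Ok(raw.reg as u32) // fallthrough: treat the register as a plain T
    }
}
```

In the simulation the Result only exists on the caller side, after the branch; the callee never builds one, which is the saving the text describes.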

A more advanced form could even omit the `clc` in cases where the Ok
path knows that CF is already clear, since it's cleared by instructions
like `test`, `and`, `or`, and others.

The above idea assumes that most of the time, the caller will want to
branch to test for Ok or Err. However, if the condition needs to be
materialized into a GPR, there are a few ways to do this with CF:

 - setc/setnc work, though they only write an 8-bit register. Unfortunately
   the common idiom of zeroing a 32-bit register with xor beforehand
   isn't always feasible because an xor would clobber CF, but a movzbl
   afterward always works.
 - there's also the `sbb %reg, %reg` trick, which produces a full
   32-bit register with either 0 or -1. That's not always the needed
   value, but it can be converted fairly easily.
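
The arithmetic behind both options can be checked with a small Rust sketch. The function names are hypothetical, and the `bool` parameter merely stands in for CF; using `neg` to turn the 0 / -1 mask into 0 / 1 is one possible conversion, not necessarily the one intended above.

```rust
/// Models `setc %al` followed by `movzbl %al, %eax`: yields 0 or 1.
fn setc_movzbl(cf: bool) -> u32 {
    let al: u8 = cf as u8; // setc writes only the low 8 bits
    u32::from(al) // movzbl zero-extends and does not touch flags
}

/// Models `sbb %eax, %eax`: computes eax - eax - CF, i.e. 0 or -1,
/// regardless of what eax held before.
fn sbb_same_reg(eax: u32, cf: bool) -> u32 {
    eax.wrapping_sub(eax).wrapping_sub(cf as u32)
}

/// One way to convert the 0 / -1 mask into 0 / 1: negate it (`neg`).
fn mask_to_bool_int(mask: u32) -> u32 {
    mask.wrapping_neg()
}
```

Note that `sbb_same_reg` ignores the incoming register value, which is why the trick needs no prior zeroing.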

Another potential tricky area is that compilers often use instructions that
clobber CF in their epilogues, such as `add` to adjust the stack pointer.
If CF is set before the epilogue, the epilogue may have to use lea instead,
to avoid clobbering CF.
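
The epilogue hazard can be illustrated with a toy model of just the stack pointer and CF. `Cpu`, `add_rsp`, and `lea_rsp` are hypothetical names for this sketch; real `add` also updates other flags, which are ignored here.

```rust
/// Hypothetical, minimal model of the two pieces of state that matter here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Cpu {
    rsp: u64,
    cf: bool,
}

/// Models `add $imm, %rsp`: CF is overwritten with the carry-out of the sum.
fn add_rsp(cpu: Cpu, imm: u64) -> Cpu {
    let (rsp, carry) = cpu.rsp.overflowing_add(imm);
    Cpu { rsp, cf: carry }
}

/// Models `lea imm(%rsp), %rsp`: same arithmetic, flags left untouched.
fn lea_rsp(cpu: Cpu, imm: u64) -> Cpu {
    Cpu { rsp: cpu.rsp.wrapping_add(imm), cf: cpu.cf }
}
```

With CF set to signal Err before the epilogue, `add_rsp` on an ordinary stack address produces no carry-out and so clears CF, losing the Err signal, while `lea_rsp` yields the same stack pointer with CF preserved.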

This technique could be used for Option as well.
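
A sketch of how that might look for Option, again as a pure simulation. The polarity is an assumption (CF clear for Some, CF set for None, mirroring Ok/Err); the text above does not specify it. `find_raw` and `find` are invented example functions.

```rust
/// Callee side (invented example): index of the first `needle`.
/// Returns (CF, register). Assumption: CF clear = Some, CF set = None.
fn find_raw(haystack: &[u8], needle: u8) -> (bool, u64) {
    for (i, &b) in haystack.iter().enumerate() {
        if b == needle {
            return (false, i as u64); // clc; return the index as a plain usize
        }
    }
    (true, 0) // stc; the register contents are don't-care for None
}

/// Caller side: the `if cf` models a `jc none_label` right after the call.
fn find(haystack: &[u8], needle: u8) -> Option<usize> {
    let (cf, reg) = find_raw(haystack, needle);
    if cf {
        None
    } else {
        Some(reg as usize)
    }
}
```

For None there is no payload at all, so the callee only needs the `stc`.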

Other notes about CF:

 - `adc`, `rcl`, `rcr` are instructions that read `CF`, though they're not
   obviously useful here.
 - `cmc` is an instruction that inverts `CF`, which might occasionally be handy.
 - `inc` and `dec` leave `CF` unmodified (though beware of partial FLAGS updates).

This technique may be implemented on other architectures as well, using
whatever condition code features they provide.