I want to speed up integer overflow checking for hardening purposes by keeping a sticky overflow flag and only trapping when necessary. I want to keep it super simple while hopefully giving the optimizers room to do their thing.
In the codegen part of clang:
- each function gets an i1 for storing overflow information, initialized to 0
- each integer overflow check ORs its result into the overflow flag
- before each function call, return instruction, or other side-effecting operation, execute ud2 if the overflow flag is set
Reasonable?
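To make the three bullets concrete, here is a hand-written C sketch of roughly what the instrumented code would do. It is not actual clang output: `mul_add_sticky` is a made-up example function, and it uses the GCC/Clang builtins `__builtin_mul_overflow`, `__builtin_add_overflow`, and `__builtin_trap` (which usually lowers to ud2 on x86) as stand-ins for what the codegen would emit.

```c
#include <stdbool.h>

/* Hypothetical example function: computes a*b + c while accumulating
 * overflow into a single sticky flag instead of trapping per operation. */
static int mul_add_sticky(int a, int b, int c) {
    bool ovf = false;   /* per-function flag, initialized to 0 */
    int prod, sum;

    /* Each checked operation ORs its overflow bit into the flag.
     * On overflow the builtins store the wrapped result, so later
     * arithmetic stays well defined until the flag is tested. */
    ovf |= __builtin_mul_overflow(a, b, &prod);
    ovf |= __builtin_add_overflow(prod, c, &sum);

    /* One test before the return (the same test would go before any
     * call or other side-effecting operation). */
    if (ovf)
        __builtin_trap();
    return sum;
}
```

With no overflow, `mul_add_sticky(3, 4, 5)` returns 17; a call such as `mul_add_sticky(INT_MAX, 2, 0)` would reach the trap instead of returning.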
@regehr In theory there are limited resources in the branch prediction tables. However, my experience with Swift and JavaScript was that it was never a problem. I don't think that overflow branches that are never taken get inserted into the branch prediction tables.
There is also the cost of fetching, decoding, and executing the 'JO' instruction, but as @Chandler pointed out, the alternative is worse.
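For reference, the per-operation style I'm describing, one rarely-taken branch after every arithmetic operation, looks roughly like this when written by hand in C. `mul_add_checked` is a made-up example, and whether the compiler actually emits a JO after each instruction depends on the target and optimization level, so treat it as a sketch only.

```c
/* Hypothetical example function: computes a*b + c and traps
 * immediately at the first operation that overflows. */
static int mul_add_checked(int a, int b, int c) {
    int prod, sum;

    /* Each check is its own conditional branch to a trap; on x86 this
     * is typically the arithmetic instruction followed by a JO. */
    if (__builtin_mul_overflow(a, b, &prod))
        __builtin_trap();
    if (__builtin_add_overflow(prod, c, &sum))
        __builtin_trap();
    return sum;
}
```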