I want to speed up integer overflow checking for hardening purposes by keeping a sticky overflow flag and only trapping when necessary. I want to keep it super simple while hopefully giving the optimizers room to do their thing.
In the codegen part of clang:
- each function gets an i1 for storing overflow information, initialized to 0
- each integer overflow check ORs its result into the overflow flag
- before each function call, return instruction, or other side-effecting operation, execute ud2 if overflow is set
Reasonable?
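For concreteness, here is a rough source-level sketch of what I mean. It is only an analogy: the real change would live in clang's codegen and operate on an i1 in the IR, not in user code. The function names `sum_sticky` and `sum_or_trap` are made up for illustration. `__builtin_add_overflow` and `__builtin_trap` are the existing GCC/Clang builtins, and on x86 clang lowers `__builtin_trap` to ud2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums n values and records overflow in a sticky
   flag instead of branching to a trap after every addition. */
int32_t sum_sticky(const int32_t *v, size_t n, bool *ovf_out) {
    bool ovf = false;   /* the per-function flag, initialized to 0 */
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* The builtin stores the wrapped result in acc and returns true
           on overflow; OR that into the flag so it stays set. */
        ovf |= __builtin_add_overflow(acc, v[i], &acc);
    }
    *ovf_out = ovf;
    return acc;
}

/* Hypothetical wrapper: one check before the return, standing in for
   "trap before any call, return, or other side effect". */
int32_t sum_or_trap(const int32_t *v, size_t n) {
    bool ovf;
    int32_t acc = sum_sticky(v, n, &ovf);
    if (ovf)
        __builtin_trap();
    return acc;
}
```

The hope is that the loop body has no control flow for the check, so the optimizer is free to vectorize or reorder the arithmetic, and the only branch is the one before the return.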
@nadavrot I really liked a recent AMD branch predictor which they actually bothered to document -- from memory each branch would perfectly predict not-taken until taken, then predict taken until not-taken, and only then start using up the normal history tables. Really nice because branches like this (never taken) were reliably free in the prediction table.
But much like your experience with JS and Swift, I've never seen the prediction table's resources actually matter in practice. The closest I've come is collisions in the table due to branch density, and even that seems to happen much less these days. Yay for modern branch predictors, which make many things easier.