Last active
April 17, 2020 19:55
-
-
Save pervognsen/c7911fd29a7e8f68c25e93109a4fbecb to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Suppose you are writing a simple one-pass compiler in the Wirth style, and you want to | |
target non-RISC architectures like the x86 where conditional operations use shared condition state. | |
Case 1: | |
if (x <= y) z = w | |
CMP x, y | |
JGT skip | |
MOV z, w | |
skip: | |
Case 2: | |
z = (x <= y) | |
CMP x, y | |
SETLE z | |
Case 3: | |
z = w if x <= y | |
CMP x, y | |
CMOVLE z, w | |
How do you handle this kind of thing simply and efficiently? You don't want to materialize | |
logical values to registers if they're going to be immediately used for a conditional operation. | |
But you do need to materialize logical values to registers if there are instructions intervening | |
between the CMP and the use of the CMP-generated condition state that might clobber the state. E.g. | |
a lot of instructions, including non-logical operations like ADD, update the condition state on x86. | |
(You could also try to use code motion to move the CMP closer to its use, or to rematerialize the | |
logical value on demand to reduce register pressure, but that is beyond the scope of this note.) | |
When compiling a logical expression such as x <= y, rather than emitting its value to a general purpose | |
register, we emit its value to the implicit register that represents the condition state. We handle the | |
allocation, spilling and use of this implicit register as follows: | |
Operand *condition_operand; | |
ConditionCode condition_code; // Specifies how to interpret the condition state as a boolean. | |
void FreeOperand(Operand *operand) { | |
// ... | |
if (operand == condition_operand) | |
condition_operand = NULL; | |
} | |
void SetCondition(Operand *new_condition_operand, Condition new_condition_code) { | |
condition_operand = new_condition_operand; | |
condition_code = new_condition_code; | |
} | |
void SpillCondition() { | |
if (condition_operand && condition_operand->kind == OPERAND_LOGICAL) { | |
// An OPERAND_LOGICAL's home location is the condition state, so we have to spill it to a register. | |
AllocateOperandRegister(condition_operand); | |
EmitConditionalSet(condition_operand, condition_code); // SETcc | |
} | |
} | |
void EmitCondition(Operand *operand) { | |
if (operand != condition_operand) { | |
SpillCondition(); | |
EmitCmp0(operand); // CMP operand, 0 | |
SetCondition(operand, NONZERO); | |
} | |
} | |
void EmitComparison(Operand *destination, Operand *source, ConditionCode condition_code) { | |
SpillCondition(); | |
EmitCmp(destination, source); // CMP destination, source | |
destination->kind = OPERAND_LOGICAL; | |
SetCondition(destination, condition_code); | |
} | |
void EmitAddition(Operand *destination, Operand *source) { | |
SpillCondition(); | |
EmitAdd(destination, source); // ADD destination, source | |
SetCondition(destination, NONZERO); | |
} | |
void CompileIfStatement() { | |
// ... | |
Operand conditional_operand; | |
CompileExpression(&conditional_operand); | |
EmitCondition(&conditional_operand); | |
FreeOperand(&conditional_operand); | |
EmitConditionalBranch(NegateConditionCode(condition_code), skip_target); // Jcc skip_target | |
// ... | |
} | |
This approach also lends itself well to simple one-pass optimizations. For example, if you want to compile | |
!<expression> and <expression> resides in the condition operand, you just negate the condition code at compile | |
time, so the negation operator actually doesn't emit any code here! | |
void CompileNegationExpression(Operand *destination) { | |
CompileExpression(destination); | |
EmitCondition(destination); | |
SetCondition(destination, NegateConditionCode(condition_code)); | |
} | |
Let's see an example of how z = !!(x + y) would compile with this scheme: | |
MOV r, [RBP + x.offset] | |
ADD r, [RBP + y.offset] | |
SETNZ [RBP + z.offset] | |
The double negation idiom for boolean coercion is translated to the optimal x86 code sequence. There's | |
no CMP r, 0 after the ADD because ADD registers itself as the current condition. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment