@niden
Forked from nikic/php_evaluation_order.md
Created April 15, 2019 15:10
Analysis of some weird evaluation order in PHP

Order of evaluation in PHP

Yesterday I found some people on my favorite subreddit wondering about the output of the following code:

<?php

$a = 1;
$c = $a + $a++;
var_dump($c); // int(3)

$a = 1;
$c = $a + $a + $a++;
var_dump($c); // int(3)

As you can see, the expressions $a + $a++ and $a + $a + $a++ produce the same result, which is rather unexpected. What's happening here?

Operator precedence and associativity

At this point many people seem to think that the order in which an expression is evaluated is determined by operator precedence and associativity. But that's not true. Precedence and associativity only tell you how the expressions are grouped:

<?php

// in the first expression
$a + $a++;
// "++" has higher precedence than "+", so "$a++" is grouped:
$a + ($a++);

// in the second expression
$a + $a + $a++;
// "++" again has higher precedence than "+":
$a + $a + ($a++);
// and "+" is a left-associative operator, so the left "+" is grouped:
($a + $a) + ($a++);

What does this tell us about the order of evaluation? Nothing. Operator precedence and associativity specify grouping, but they do not specify in which order the groups are executed. In the last example either ($a + $a) or ($a++) could run first.
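To make the difference between grouping and evaluation order visible, here is a minimal sketch that uses function calls with a side effect as operands. The helper operand() and the $log array are hypothetical names introduced only for this illustration. The grouping fixes the arithmetic (1 + (2 * 3)), while the log shows the order in which the operands were actually evaluated.

```php
<?php

// Hypothetical helper: records its label in $log, then returns the given value.
function operand(string $label, int $value, array &$log): int
{
    $log[] = $label;
    return $value;
}

$log = [];

// "*" has higher precedence than "+", so this is grouped as
// operand('f') + (operand('g') * operand('h')).
$result = operand('f', 1, $log) + operand('g', 2, $log) * operand('h', 3, $log);

var_dump($result);              // int(7): the grouping decides the arithmetic
echo implode(', ', $log), "\n"; // the order in which the operands actually ran
```

On the PHP versions I am aware of, the log typically reads f, g, h: the operands of the multiplication are not evaluated first, even though "*" binds tighter than "+". That order is an observation, not a guarantee.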

PHP does not specify what will actually happen. One version of PHP can give you one result and a different version another. Don't write code that depends on some particular evaluation order.
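If the order matters, one way to avoid depending on it is to split the expression into separate statements, since statements do run one after another. The sketch below assumes the intent is "read $a, then post-increment it"; the variable names $left and $right are only illustrative.

```php
<?php

$a = 1;

// Read the left operand into its own variable before touching $a.
$left = $a;          // 1
$right = $a++;       // post-increment yields the old value 1; $a is now 2
$c = $left + $right;

var_dump($c); // int(2)
var_dump($a); // int(2)
```

Written this way, the result no longer depends on which operand the engine happens to evaluate first.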

CV optimization

Even though PHP does not define an order, it would still be interesting to know why you get that rather odd result in the first code sample (this result is consistent across all recent PHP versions).

The reason behind it is the "compiled variables" (CV) optimization that was introduced in PHP 5.1. This optimization basically comes down to allowing simple variables (like $a, but not $a->b or $a['b']) to directly act as operands of an opcode. (Opcodes are what PHP generates from your script and what the Zend VM executes. Every opcode has at most two operands and an optional result.)

Now, let's look at the opcodes generated by the two code snippets. We'll start with $a + $a + $a++:

// code:
$a = 1;
$c = ($a + $a) + ($a++);

// opcodes:
         ASSIGN   $a, 1
$tmp_1 = ADD      $a, $a
$tmp_2 = POST_INC $a
$tmp_3 = ADD      $tmp_1, $tmp_2
         ASSIGN   $c, $tmp_3

The generated opcodes should be rather intuitive: First assign $a = 1, add $a + $a and store the result in $tmp_1, then perform a post-increment on $a and store the result in $tmp_2, then add both temporary variables and assign the result to $c.

The evaluation here happened left-to-right (first $a + $a was run, then $a++) as you would probably expect. Now let's look at the $a + $a++ case:

// code:
$a = 1;
$c = $a + ($a++);

// opcodes:
         ASSIGN   $a, 1
$tmp_1 = POST_INC $a
$tmp_2 = ADD      $a, $tmp_1
         ASSIGN   $c, $tmp_2

As you can see, in this case the POST_INC ($a++) happens first and the value of $a is only read after that in the ADD opcode. By then $a is already 2, so the ADD computes 2 + 1 = 3. Why? Because reading the value of a variable does not require an extra opcode. Any opcode can handle reading the value of a simple variable. This is what the CV optimization does.
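The two opcode listings above can be replayed by hand in plain PHP, one statement per opcode. This is only a model of the sequences shown, not what the engine literally executes; the $tmp_N variables mirror the temporaries in the listings.

```php
<?php

// $a + $a + $a++: the first ADD runs before the POST_INC.
$a = 1;
$tmp_1 = $a + $a;         // ADD      $a, $a          -> 2
$tmp_2 = $a++;            // POST_INC $a              -> 1, $a becomes 2
$tmp_3 = $tmp_1 + $tmp_2; // ADD      $tmp_1, $tmp_2  -> 2 + 1
$three_operands = $tmp_3;
var_dump($three_operands); // int(3)

// $a + $a++: the POST_INC runs first, the ADD reads $a afterwards.
$a = 1;
$tmp_1 = $a++;            // POST_INC $a              -> 1, $a becomes 2
$tmp_2 = $a + $tmp_1;     // ADD      $a, $tmp_1      -> 2 + 1
$two_operands = $tmp_2;
var_dump($two_operands); // int(3)
```

Both sequences arrive at 3, but by different routes: in the first, $a is read twice while it is still 1; in the second, it is read once after it has already been incremented.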

Avoiding the CV optimization

There are some (rare) circumstances in which the CV optimization is not performed, e.g. when the @ error suppression operator is in use.

Let's try it out. We use the $a + $a++ expression again, but this time prefix the statement with @:

<?php

$a = 1;
@ $c = $a + $a++;
var_dump($c); // int(2)

With the error-suppression operator present, the result suddenly becomes 2 rather than 3. To figure out why, let's look at the opcodes once again:

         ASSIGN        $a, 1
$tmp_1 = BEGIN_SILENCE
$var_3 = FETCH_R       'a'
$tmp_4 = POST_INC      $a
$tmp_5 = ADD           $var_3, $tmp_4
$var_2 = FETCH_W       'c'
         ASSIGN        $var_2, $tmp_5
         END_SILENCE   $tmp_1

Several things changed here: Firstly, everything is now wrapped in BEGIN_SILENCE and END_SILENCE opcodes for handling of @. Those are of no interest to us. Secondly, $a and $c are now fetched using FETCH_R (fetch for read) and FETCH_W (fetch for write) rather than being used directly as operands.

Because the fetch of $a now has an actual opcode, the fetch happens before the increment: $a is still 1 when it is read, so the ADD computes 1 + 1 = 2 and the result changes.

Takeaway

If you take anything away from this, let it be these two things:

  • Don't rely on order of evaluation within an expression. It is undefined.
  • @ disables CV optimizations and as such hurts performance. @ also hurts performance in other ways.

~nikic
