@joliss
Last active August 29, 2015 13:56
#!/bin/bash
# Generate a JS file whose top-level closure holds 10k variables, each
# with a read/write helper, then time repeated calls under node.
(
  echo "(function() {"
  for i in {0..9999}; do
    echo "
      var var$i = 1
      var read$i = function() {
        // Prevent inlining by feigning recursion
        if (!var$i) read$i()
        return var$i
      }
      var write$i = function() {
        // Prevent inlining by feigning recursion
        if (!var$i) write$i()
        var$i = var$i
      }
    "
  done
  echo '
    console.time("test")
    // Performance is O(n**2) with n being the number of variables
    // in the closure (10k in the for loop *above*). It is not
    // related to the number of iterations in this loop, as long
    // as it is >=23557 (on my system). Below that magic number,
    // execution time drops to <1ms.
    for (var i = 0; i < 100000; i++) {
      write0()
      read0()
    }
    console.timeEnd("test")
  '
  echo "})()"
) > massive-closure.js
node massive-closure.js
mraleph commented Mar 4, 2014

I looked into this benchmark and here is what I found. I am going to document my investigation process.

tl;dr: The actual O(n²) behavior is in the optimizing compiler, not in the execution itself. You see it only if you hit optimization of the outer closure (the one that contains the loop).

The first hint comes from this comment:

// It is not related to the number of iterations in this loop, 
// as long as it is >=23557

There are no visibly growing data structures, so this is a strong indicator that the pathological behavior appears when V8 decides to perform OSR (on-stack replacement): replacing a hot function with an optimized version while it is still running.
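For intuition, here is a minimal sketch (my illustration, not from the gist) of the shape that triggers OSR: the function is entered only once, so the only way to switch it to optimized code is to replace its frame at a loop back-edge while it runs.

// Hedged sketch: a hot loop inside a function that is called only once.
// The loop's back-edges accumulate profiler ticks until V8 decides to
// OSR the still-running function into its optimized version.
(function calledOnlyOnce() {
  var sum = 0
  for (var i = 0; i < 1e8; i++) {
    sum += i
  }
  return sum
})()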

The first thing I tried was reproducing the problem on bleeding_edge V8. It does not reproduce there. So I got node v0.10.26, and the problem is there:

∮ bash massive-closure.sh
test: 7359ms

The immediate suspect is the optimizing compiler, which is written in C++ and thus easy to profile:

∮ perf record -g node massive-closure.js
∮ perf report --stdio
...
# Events: 7K cycles
#
# Overhead  Command        Shared Object                                                                                      
# ........  .......  ...................  ....................................................................................
#
    59.97%     node  node                 [.] v8::internal::RelocIterator::next()
               |
               --- v8::internal::RelocIterator::next()
                  |          
                  |--91.21%-- v8::internal::HGraphBuilder::VisitFunctionLiteral(v8::internal::FunctionLiteral*)
...

It's the optimizing compiler all right. A quick look at HGraphBuilder::VisitFunctionLiteral reveals an O(n²) algorithm right away: for each function literal the optimizing compiler encounters, it iterates the relocation table of the unoptimized code object from the very beginning. This makes graph building O(n²) with respect to the number of function literals inside the function we are trying to optimize.
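To make the quadratic shape concrete, here is a hedged JavaScript model of it (V8's actual code is C++; findInfo and relocEntries are invented names, not V8 API): every lookup restarts its scan at index 0, so n function literals cost n * O(n) work in total.

// Illustrative model only -- not V8's real data structures.
function findInfo(relocEntries, literal) {
  // The scan always restarts from the very beginning of the table.
  for (var i = 0; i < relocEntries.length; i++) {
    if (relocEntries[i].literal === literal) return relocEntries[i].info
  }
}
function buildGraph(literals, relocEntries) {
  for (var j = 0; j < literals.length; j++) {
    findInfo(relocEntries, literals[j]) // n literals * O(n) scans = O(n**2)
  }
}
// Tiny driver: 1000 literals already cost ~500k comparisons.
var entries = []
for (var k = 0; k < 1000; k++) entries.push({ literal: k, info: k })
buildGraph(entries.map(function(e) { return e.literal }), entries)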

The function is too big for the optimizing compiler to chew anyway, as --trace-opt reveals:

∮ node --trace-opt massive-closure.js
...
[disabled optimization for , reason: not enough virtual registers for values]
test: 8269ms

Now that we know where the problem lies, we can reduce the generation part of the reproduction to:

(
  echo "(function() {"
  for i in {0..9999}; do
    echo "
     (function() { });
     (function() { });
    "
  done
  echo '
    console.time("test")
    for (var i = 0; i < 100000; i++);
    console.timeEnd("test")
  '
  echo "})()"
) > massive-closure.js

This will repro on node v0.10.26 but not on bleeding_edge V8, because OSR heuristics are now structured in a way that prevents OSR of very large functions: our function never accumulates enough ticks to actually cause an OSR and hit the pathological case in the graph builder. However, the issue can still be reproduced even on bleeding_edge V8 if you rewrite the code like this:

(
  echo "var f = (function() {"
  for i in {0..9999}; do
    echo "
     (function() { });
     (function() { });
    "
  done
  echo '
    console.time("test")
    for (var i = 0; i < 100000; i++);
    console.timeEnd("test")
  '
  echo "});"
  # warm up
  echo "f();"
  # trigger normal (non OSR) optimization
  echo "f();"
) > massive-closure.js

But it is unclear whether code like this is related to what you were investigating in the first place.

Unrelated notes

Inlining is not involved

One can't prevent inlining by faking recursion: there is no such limitation in V8's optimizing compiler. The issue, however, stops reproducing if we rewrite the reproduction like this:

(
  echo "(function() {"
  for i in {0..9999}; do
    echo "
      var var$i = 1
      var read$i = function() {
        return var$i
      }
      var write$i = function() {
        var$i = var$i
      }
    "
  done
  echo '
    console.time("test")
    // Performance is O(n**2) with n being the number of variables
    // in the closure (10k in the for loop *above*). It is not
    // related to the number of iterations in this loop, as long
    // as it is >=23557 (on my system). Below that magic number,
    // execution time drops to <1ms.
    for (var i = 0; i < 100000; i++) {
      write0()
      read0()
    }
    console.timeEnd("test")
  '
  echo "})()"
) > massive-closure.js
∮ bash massive-closure.sh
test: 106ms

Why? The reason is that the optimizing compiler bails out very early in the compilation pipeline, before it ever reaches graph building:

∮ node --trace-opt massive-closure.js
...
[disabled optimization for , reason: too many parameters/locals]
test: 106ms

When we were calling read$i and write$i from within their own bodies (the fake recursion), we were causing their promotion into the heap-allocated context, so they were not counted towards locals.
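A self-contained sketch of that distinction (my illustration, following the gist's pattern): a self-referencing closure forces its own variable into the heap-allocated context, while a closure referenced only from the outer scope stays a stack local and counts towards the locals limit.

(function() {
  var var0 = 1
  // Self-reference: the variable read0 must live in the heap-allocated
  // context so the closure body can see it, so it is not a local.
  var read0 = function() {
    if (!var0) read0()
    return var0
  }
  // No self-reference: read0plain can stay a stack-allocated local and
  // counts towards the "too many parameters/locals" bailout limit.
  var read0plain = function() {
    return var0
  }
  read0(); read0plain()
})()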
