Skip to content

Instantly share code, notes, and snippets.

@dstokes
Forked from trevnorris/perf-crib-sheet.md
Created March 14, 2014 18:35
Show Gist options
  • Save dstokes/9553924 to your computer and use it in GitHub Desktop.
Save dstokes/9553924 to your computer and use it in GitHub Desktop.

build

Clone and build Node for analysis:

$ git clone https://github.com/joyent/node.git
$ cd node
$ export GYP_DEFINES="v8_enable_disassembler=1 v8_object_print=1"
$ ./configure
$ make -j4

Note: To run with IRHydra run like the following:

$ node --trace-hydrogen --trace-deopt --code-comments --print-opt-code --trace-phase=Z script.js > code.asm

Now build v8:

$ cd deps/v8
$ make dependencies
$ make -j4

Building modules with custom local build of Node:

$ /path/to/node/node $(which node-gyp) rebuild --release --nodedir=/path/to/node

(note: can also use --debug)

Install npm libraries using custom local build:

$ /path/to/node/node $(which npm) --nodedir=/path/to/node install

v8 flags

Print deoptimizations:

$ ./node --trace-deopt --code-comments ./script.js

Four types of deoptimizations:

  • eager: Occurs with unexpected element transitions.
  • lazy: Occurs when an assumption is made on a global which no longer holds true.
  • soft: As v8 collects type information on functions it will mark dead subgraphs of the AST (abstract syntax tree). If one of those marked graphs is called this will cause the function to deoptimize and tell v8 more type information needs to be collected.
  • debugger: Not really a bailout, but used by the debugger to deoptimize stack frames to allow inspection.

Print optimizations and inlining:

$ ./node --trace-opt --trace-inlining ./script.js

Check activity of the garbage collector:

$ ./node --trace_gc --print_cumulative_gc_stat ./script

Trace for allocations of external memory (e.g. Buffers/ArrayBuffers):

./node --trace_external_memory ./script

Grab v8 profile of script and use the tick processor:

$ ./node --prof ./script
$ D8_PATH=/path/to/deps/v8/out/native /path/to/deps/v8/tools/linux-tick-processor

Key things to look for from tick processor output:

  • Stub: Hand written assembly that follows the JS calling convention.
  • LazyCompile: Code was "lazily compiled", not that v8 spent that much time compiling code.
  • * prefix: The function has been run through the optimizing compiler (i.e. Crankshaft).
  • ~ prefix: The function was run through the full-codegen (or unoptimized) compiler, but v8 is letting us know the function could be optimized.
  • no prefix: Either that v8 doesn't have enough type info on the function to know whether it could be optimized, or the function has been marked as unoptimizable.

Find unnamed function in --trace-* output (using --trace-opt as example):

$ ./node --trace-opt --print-opt-code ./script > code.asm

The output will be a mix of the two flags. From --trace-opt the output will usually look like so:

[marking 0x1531090902b9 myFunctionName <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...

But in the case where the function has no name it will look like this:

[marking 0x1531090902b9 <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...

So grab the SharedFunctionInfo address and search for object: 0x3f91d3335139 <SharedFunctionInfo>. This will drop you into the execution body of the calling function. Scroll up and you'll see a body of text that contains the unnamed function.

perf

I suggest getting latest kernel and building perf (example from time of this writing):

$ wget https://www.kernel.org/pub/linux/kernel/v3.x/testing/linux-3.13-rc2.tar.xz
$ tar xvf linux-3.13-rc2.tar.xz
$ linux-3.13-rc2/tools/perf/
$ make -f Makefile.perf -j4

I installed it globally, so did the following:

$ sudo make -f Makefile.perf -j4 install prefix=/usr

Before getting perf output we'll need to prevent kernel pointers from being hidden:

$ sudo sysctl kernel/kptr_restrict=0

Now, let's start with some basic output:

$ perf stat -- ./node script.js

And some basic recording:

$ perf record -e cycles:u ./node script.js
$ perf report

Let's get an inverted call graph:

$ perf record --call-graph dwarf -e cycles -- ./node script.js
$ perf report

Get a list of all system tracepoints:

$ sudo perf list

Print how many times a specific syscall(s) is performed during execution:

$ sudo perf stat -e syscalls:sys_*_mmap -- ./node script.js

Get the stack traces for specific syscalls:

$ sudo perf record -e syscalls:sys_*_mmap --call-graph dwarf -- ./node script.js

List possible probes on Node

$ perf probe --no-demangle -F -x /path/to/node/node

Add probe for specific call in Node:

$ sudo perf probe --no-demangle -x /path/to/node/node -a uv_fs_open

Quick script to add probes for all Node (i.e. libuv) file system operations:

for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_fs_"`
do
  sudo perf probe --no-demangle -x /path/to/node/node -a $i
done

Or for everything:

for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_[^_]"`
do
  sudo perf probe --no-demangle -x /path/to/node/node -a $i
done

See how many times each are called:

$ sudo perf stat -e probe_node:uv_fs_* -- ./node script.js

And don't forget to remove them when you're all done :)

$ sudo perf probe -d uv_fs_*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment