I have been generating CPU Flame Graphs of Node.js applications with Linux perf
command, and noticed that some of the graphs have suspicious labels. For example, with a sandbox project and Node 10.8.0, I got LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35
where it should have been something related to streams.
For example, the following is a stack trace from perf script
that I used to genearte a CPU Flame Graph. 227bbdff955b
has a label LazyCompile:*a.renderDOM
, but it doesn't make much sense that React.renderDOM
was called from _stream_readable.js
and parsed JSON in it.
node 22640 56531.256247: 10101010 cpu-clock:
7fff9f6b72cb finish_task_switch ([kernel.kallsyms])
7fff9ff92f6f __schedule ([kernel.kallsyms])
7fff9ff935ac schedule ([kernel.kallsyms])
7fff9f6055c9 exit_to_usermode_loop ([kernel.kallsyms])
7fff9f605b85 do_syscall_64 ([kernel.kallsyms])
7fffa0002081 entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
c715 pthread_cond_signal@@GLIBC_2.3.2 (/lib/x86_64-linux-gnu/libpthread-2.19.so)
// ...
b79110 _ZN2v88internal10JsonParserILb1EE14ParseJsonValueEv (/usr/local/bin/node)
b79173 _ZN2v88internal10JsonParserILb1EE9ParseJsonEv (/usr/local/bin/node)
78f1a8 _ZN2v88internal17Builtin_JsonParseEiPPNS0_6ObjectEPNS0_7IsolateE (/usr/local/bin/node)
227bbdc8a1dd [unknown] (/tmp/perf-22640.map)
/* ??? */ 227bbdff955b LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35 (/tmp/perf-22640.map)
227bbdcfb0a1 LazyCompile:*endReadableNT _stream_readable.js:1074 (/tmp/perf-22640.map)
227bbdfd4f31 [unknown] (/tmp/perf-22640.map)
227bbdc90f95 Builtin:JSEntryTrampoline (/tmp/perf-22640.map)
227bbdc8a4a1 [unknown] (/tmp/perf-22640.map)
a597f3 _ZN2v88internal9Execution4CallEPNS0_7IsolateENS0_6HandleINS0_6ObjectEEES6_iPS6_ (/usr/local/bin/node)
6e84b3 _ZN2v88Function4CallENS_5LocalINS_7ContextEEENS1_INS_5ValueEEEiPS5_ (/usr/local/bin/node)
6e8671 _ZN2v88Function4CallENS_5LocalINS_5ValueEEEiPS3_ (/usr/local/bin/node)
// ...
I noticed that suspicious ones appeared more often in long-running processes than in fresh processes. So I guessed that locations of symbols in V8 changed over time.
So, I wrote a small script to find an address in /tmp/perf-${pid}.map
.
node symbols.js find /tmp/perf-${pid}.map ${address}
And when I ran it...
$ node symbols.js find /tmp/perf-22640.map 227bbdff955b
Total: 3285 entries
Fresh: 2063 entries
All Matches: 4 entries
227bbdff8940 1942 LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35
227bbdff8940 1942 LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35
227bbdff9420 1942 LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35
227bbdff9420 2a0 LazyCompile:*stream.on /opt/app/src/api.js:44
Fresh Matches: 1 entries
227bbdff9420 2a0 LazyCompile:*stream.on /opt/app/src/api.js:44
The output means that the address was used for a.renderDOM
for a while, but was reused for stream.on
later. Because of this, perf script
picked up the old label, LazyCompile:*a.renderDOM
instead of LazyCompile:*stream.on
.
node symbols.js tidy /tmp/perf-${pid}.map ${output_filename}
This command checks intersections of addresses and removes old entries that intersect with new entries. It should work as long as symbols don't move while perf record
and perf script
are running.
Sometimes, it's not ideal to update /tmp/perf-${pid}.map
. In that case, we can generate a stacktrace file with perf script
without modifying /tmp/perf-${pid}.map
and update the stacktrace file later.
for f in /tmp/perf-*.map
do
pid=${f//[^0-9]/}
node symbol.js update perf-${pid}.map stacks.${pid}.out stacks.${pid}.out.updated
done
The solutions above may still can have a few wrong symbols if functions move around during perf record
. If it's an issue, We can try merging mutiple symbols like "*stream.on /opt/app/src/api.js:44" or "LazyCompile:*a.renderDOM /opt/app/node_modules/react-dom/cjs/react-dom-server.node.production.min.js:35"
.
Not fully tested, but a faster version https://gist.github.com/shuhei/1b0ca2735298ae331ec6b402d3bb7e70