@khatwaniNikhil
Created February 14, 2021 07:19
Java Profiling Notes
******************************************************************************************************************************************
async profiler
1 installation
https://github.com/jvm-profiling-tools/async-profiler
2 for memory profiling, install java debug symbols
https://www.javaadvent.com/2014/12/recompiling-the-java-runtime-library-with-debug-symbols.html
https://gist.github.com/khatwaniNikhil/694e44e7c939d96848b23afd5693b175
notes
2.1 ensure JAVA_HOME points to the directory that contains the jre/lib subfolder
2.2 also download the JDK source (OpenJDK or Oracle) and update its path in the build script from the gist above, at the following line
<unzip src="${env.JAVA_HOME}/src.zip" dest="${project.src}"/>
wget http://hg.openjdk.java.net/jdk8/jdk8/archive/tip.zip
2.3
once the script from the gist above has run to completion (rt_debug.jar is generated), replace rt.jar in jre/lib with it
3
//configure the kernel so that all users can capture call stacks through perf_events
sudo sh -c 'echo 1 >/proc/sys/kernel/perf_event_paranoid'
//set the kptr_restrict to 0 to remove the restrictions on exposing kernel addresses:
sudo sh -c 'echo 0 >/proc/sys/kernel/kptr_restrict'
4
Add following flags -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints
5 restart the tomcat process and profile different events such as cpu, alloc, malloc, mprotect etc. (the jps argument tells profiler.sh to find the running Java process automatically)
./profiler.sh -d 5 -e mprotect -f cache.svg jps
6 study the flame graph in a browser
************************************************************************************
jemalloc and jeprof
1 installation (needs compilation with the --enable-prof flag)
sudo su
wget https://github.com/jemalloc/jemalloc/archive/5.2.0.zip
unzip 5.2.0.zip
cd jemalloc-5.2.0
./autogen.sh
./configure --enable-prof
make
make install
Note: ensure the environment variables exported in section 2 (Profiling) are set in the shell environment that tomcat runs in
2 Profiling
step 1
export LD_PRELOAD=/usr/local/lib/libjemalloc.so
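// dump a profile to disk roughly every 1 GiB of allocations (lg_prof_interval:30 = 2^30 bytes) and sample a stack trace roughly every 128 KiB of allocations (lg_prof_sample:17 = 2^17 bytes)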
export MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17
step 2
service tomcat start
step 3
jemalloc-5.2.0/bin/jeprof --show_bytes --svg `which java` jeprof*.heap > ~/test.svg
study the image file and check for major memory consumption areas like java.util.zip.Inflater.inflateBytes
what is lg_prof_sample
Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity. The default sample interval is 512 KiB
what is lg_prof_interval
Average interval (log base 2) between memory profile dumps, as measured in bytes of allocation activity.
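Since both options are log base 2 values, it helps to convert them to bytes. A minimal sketch of the arithmetic for the values used in MALLOC_CONF above; the class and method names are hypothetical and not part of jemalloc:

```java
public class JemallocIntervals {
    // Converts a jemalloc "lg_" option value (log base 2) into a byte count.
    static long bytesForLog2(int lg) {
        return 1L << lg;
    }

    public static void main(String[] args) {
        // lg_prof_interval:30 -> a profile dump about every 1 GiB allocated
        System.out.println("lg_prof_interval:30 -> " + bytesForLog2(30) + " bytes");
        // lg_prof_sample:17 -> one sampled allocation about every 128 KiB allocated
        System.out.println("lg_prof_sample:17 -> " + bytesForLog2(17) + " bytes");
        // lg_prof_sample:19 -> the 512 KiB default sample interval
        System.out.println("lg_prof_sample:19 -> " + bytesForLog2(19) + " bytes");
    }
}
```

A smaller lg_prof_sample gives finer-grained profiles at the cost of more sampling overhead.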
*****************************************************************************************************************************************
Real world leak examples
Native Leak example 1 - https://technology.blog.gov.uk/2015/12/11/using-jemalloc-to-get-to-the-bottom-of-a-memory-leak/
leak
1GB in java.util.zip.Inflater.inflateBytes -> inflate -> updatewindow.
solution
stopped the decompression of frontend assets by
1) Using an uncompressed jar (jars are effectively zip files and are compressed by default).
2) Serve assets from the nginx instance sitting in front of the frontend server.
Native leak example 2 https://www.evanjones.ca/java-native-leak-bug.html
an exception logger was compressing messages without closing the GZIPOutputStream
the issue shows up only when many exceptions occur between gc cycles, in which case an OOM occurs
otherwise, the finalizers eventually run and clean up the native allocations
solution
Adding a try/finally block to close the stream fixed the problem, and stabilized our service's memory usage under load.
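The linked article's logger code is not reproduced in these notes, so the following is only a sketch of the pattern, with hypothetical class and method names. It uses try-with-resources, which is the compact equivalent of a try/finally that calls close(). Closing the GZIPOutputStream releases the native memory held by its internal Deflater right away instead of waiting for a finalizer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipCloseExample {
    // Compresses a message; the stream is always closed, even if write() throws.
    static byte[] compress(String message) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(message.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the buffer only after close(), which also writes the gzip trailer.
        return buffer.toByteArray();
    }

    // Decompresses the bytes again; used here only to check the round trip.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] compressed = compress("java.lang.IllegalStateException: example message");
        System.out.println(decompress(compressed));
    }
}
```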
Native leak example 3
// symptom - Java process RSS stays constant, but the OS-level free command shows free memory continuously reducing and buffer/cache continuously reducing
// cause - the sample code logged a line in an infinite loop and log rotation was not set up; fix - set up log rotation
//async profiler
./profiler.sh -d 5 -e filemap:mm_filemap_add_to_page_cache -f cache.svg jps
Native leak example 4
// for a JVM native leak, profile mprotect (used to increase the committed size, so it points to growth in memory allocation) instead of malloc (malloc based allocations can be freed immediately but still show up in the flame graph)
// fix - close the OutputStream that was left open
./profiler.sh -d 5 -e mprotect -f cache.svg jps
******************************************************************************************************************************************
References
https://stackoverflow.com/questions/53451103/java-using-much-more-memory-than-heap-size-or-size-correctly-docker-memory-limi/53624438#53624438
https://vimeo.com/364039638
https://blogs.oracle.com/poonam/troubleshooting-native-memory-leaks-in-java-applications
https://medium.com/swlh/native-memory-the-silent-jvm-killer-595913cba8e7
https://github.com/jeffgriffith/native-jvm-leaks
https://sleeplessinslc.blogspot.com/2014/08/jvm-native-memory-leak.html
https://www.javaadvent.com/2014/12/recompiling-the-java-runtime-library-with-debug-symbols.html