- Do you run a JVM inside a container on Kubernetes (or maybe OpenShift)?
- Do you struggle with REQUEST and LIMIT parameters?
- Do you know the impact of those parameters on your JVM?
- Have you met OOM Killer?
I hope you will find answers to these questions in this example-based article.
There are actually three common ways to configure your heap size (a quick sketch follows the list):
- JVM Ergonomics - in general, 1/4 of the memory identified by the JVM (unless you run on a very constrained device - I think 256MB and less - where the ratio between the heap size and the memory provided by the OS becomes bigger, 1/2)
- MaxRAMPercentage - specifies the heap size as a percentage of the memory identified by the JVM
- Xmx - the heap size is provided directly, without any ergonomics
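As a minimal sketch (app.jar is just a placeholder for your application), the three options could look like this:
# 1) JVM ergonomics only - the heap defaults to ~1/4 of the memory the JVM identifies
java -jar app.jar
# 2) heap as a percentage of the identified memory
java -XX:MaxRAMPercentage=75.0 -jar app.jar
# 3) fixed heap size, no ergonomics involved
java -Xmx512m -jar app.jar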
Some time ago, the JVM became container-aware. That means that the JVM is able to recognize that it runs inside a container and that its memory and CPU are somehow limited. Based on these limitations, the JVM adjusts the maximum heap size to reflect the constraints.
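If you want to quickly check what the JVM detected, you can ask it to log its container support (we will use the trace variant of this logging later in this article):
java -Xlog:os+container=info -version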
Let's create a simple Java program and place it into a TEMP folder:
public class Blocker {
    public static void main(String[] args) throws InterruptedException {
        System.out.println("Running...");
        // block the main thread forever so the JVM keeps running
        Thread.currentThread().join();
    }
}
You can notice -XX:MaxHeapSize=5188354048 in the output below, and it corresponds to 1/4 of the total memory of my laptop (a sketch of the calculation follows the output).
$ docker run -it --rm --name test -v /tmp:/tmp adoptopenjdk java /tmp/Blocker.java
Running...
$ docker exec test jcmd 1 VM.flags
-XX:CICompilerCount=4 -XX:ConcGCThreads=2 -XX:G1ConcRefinementThreads=8 -XX:G1HeapRegionSize=4194304 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=327155712 -XX:MarkStackSize=4194304 -XX:MaxHeapSize=5188354048 -XX:MaxNewSize=3112173568 -XX:MinHeapDeltaBytes=4194304 -XX:MinHeapSize=8388608 -XX:NonNMethodCodeHeapSize=5839372 -XX:NonProfiledCodeHeapSize=122909434 -XX:ProfiledCodeHeapSize=122909434 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:SoftMaxHeapSize=5188354048 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseG1GC
$ grep MemTotal /proc/meminfo
MemTotal: 20258148 kB
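Just to sketch where that number most likely comes from (the exact rounding inside the JVM may differ slightly):
# 1/4 of the reported physical memory ...
echo $((20258148 * 1024 / 4))                      # 5186085888
# ... rounded up to the 4MB G1 region size (-XX:G1HeapRegionSize=4194304)
echo $(( (5186085888 / 4194304 + 1) * 4194304 ))   # 5188354048 == MaxHeapSize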
Let's limit the memory to 1GB -> -XX:MaxHeapSize=268435456 (256MB, i.e. 1/4 of 1GB)
- we can even see a different GC!!! (Is G1GC the default garbage collector? :) - see the note after the output)
$ docker run -it --rm -m 1g --name test -v /tmp:/tmp adoptopenjdk java /tmp/Blocker.java
Running...
$ docker exec test jcmd 1 VM.flags
1:
-XX:CICompilerCount=4 -XX:InitialHeapSize=16777216 -XX:MaxHeapSize=268435456 -XX:MaxNewSize=89456640 -XX:MinHeapDeltaBytes=196608 -XX:MinHeapSize=8388608 -XX:NewSize=5570560 -XX:NonNMethodCodeHeapSize=5839372 -XX:NonProfiledCodeHeapSize=122909434 -XX:OldSize=11206656 -XX:ProfiledCodeHeapSize=122909434 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:SoftMaxHeapSize=268435456 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseSerialGC
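To answer the question above: as far as I know, the JVM ergonomics picks G1 only on a "server-class" machine - at least 2 available CPUs and roughly 1792MB of memory - otherwise it falls back to the Serial GC. You can check it yourself, for example:
# with 2GB and 2 CPUs the ergonomics should pick G1 again
docker run -it --rm -m 2g --cpus 2 adoptopenjdk java -XX:+PrintFlagsFinal -version | grep -E 'UseG1GC|UseSerialGC'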
I'm using OpenShift; however, you can do the same thing (maybe with a slightly different syntax) on Kubernetes or other schedulers.
resources:
  limits:
    cpu: '1'
    memory: 2Gi
  requests:
    cpu: '1'
    memory: 2Gi
If we don't pass any memory limit:
docker run -it adoptopenjdk cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854771712
With 1GB memory limit:
docker run -it -m 1g adoptopenjdk cat /sys/fs/cgroup/memory/memory.limit_in_bytes
1073741824
Try the same with your favourite scheduler! Spin up a POD in Kubernetes and execute the command above to figure out the memory limit!
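For example with kubectl (my-app is just a placeholder for your POD name):
kubectl exec my-app -- cat /sys/fs/cgroup/memory/memory.limit_in_bytes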
- What if I configure different REQUEST and LIMIT sizes in Kubernetes?
- We need to somehow calculate the heap size of our JVM process if it is based on ergonomics or a percentage
- What is the memory "identified" by the JVM?
Try to set different resources for your POD:
resources:
  limits:
    cpu: '1'
    memory: 2Gi
  requests:
    cpu: '1'
    memory: 1Gi
- get the container limit, which is passed inside the container/POD using the configuration above
- execute this command inside your POD:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
2147483648
- you can notice that the memory limit is configured according to the LIMIT value!
- let's have a look at what the JVM says in its logs; spin up a new Java process inside the existing POD:
java -Xlog:os+container=trace -version
Picked up JAVA_TOOL_OPTIONS: -XX:InitialRAMPercentage=80 -XX:MaxRAMPercentage=80 -XX:+UseSerialGC
[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
[0.001s][trace][os,container] Path to /memory.use_hierarchy is /sys/fs/cgroup/memory/memory.use_hierarchy
[0.001s][trace][os,container] Use Hierarchy is: 1
[0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.001s][trace][os,container] Memory Limit is: 2147483648
[0.001s][info ][os,container] Memory Limit is: 2147483648
[0.001s][trace][os,container] Path to /cpu.cfs_quota_us is /sys/fs/cgroup/cpuacct,cpu/cpu.cfs_quota_us
[0.001s][trace][os,container] CPU Quota is: 100000
[0.001s][trace][os,container] Path to /cpu.cfs_period_us is /sys/fs/cgroup/cpuacct,cpu/cpu.cfs_period_us
[0.001s][trace][os,container] CPU Period is: 100000
[0.001s][trace][os,container] Path to /cpu.shares is /sys/fs/cgroup/cpuacct,cpu/cpu.shares
[0.001s][trace][os,container] CPU Shares is: 1024
[0.001s][trace][os,container] CPU Quota count based on quota/period: 1
[0.001s][trace][os,container] OSContainer::active_processor_count: 1
[0.003s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 1
[0.017s][trace][os,container] CgroupSubsystem::active_processor_count (cached): 1
[0.028s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.028s][trace][os,container] Memory Limit is: 2147483648
[0.028s][trace][os,container] Path to /memory.usage_in_bytes is /sys/fs/cgroup/memory/memory.usage_in_bytes
[0.028s][trace][os,container] Memory Usage is: 1194393600
We can see plenty of interesting information!
- This line is interesting for us right now:
[0.001s][info ][os,container] Memory Limit is: 2147483648
- That means that the JVM sees the value from memory.limit_in_bytes, and this value corresponds to the LIMIT value
- You can notice that I started the JVM with these flags:
Picked up JAVA_TOOL_OPTIONS: -XX:InitialRAMPercentage=80 -XX:MaxRAMPercentage=80 -XX:+UseSerialGC
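The "Picked up" line means that these flags come from the JAVA_TOOL_OPTIONS environment variable; one way to set it is an env entry in the POD spec, sketched here with the same values as above:
env:
  - name: JAVA_TOOL_OPTIONS
    value: '-XX:InitialRAMPercentage=80 -XX:MaxRAMPercentage=80 -XX:+UseSerialGC'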
- Let's get the information about the VM flags from the currently running JVM process:
jcmd 1 VM.flags
Picked up JAVA_TOOL_OPTIONS: -XX:InitialRAMPercentage=80 -XX:MaxRAMPercentage=80 -XX:+UseSerialGC
1:
-XX:CICompilerCount=2 -XX:InitialHeapSize=1719664640 -XX:InitialRAMPercentage=80.000000 -XX:MaxHeapSize=1719664640 -XX:MaxNewSize=573177856 -XX:MaxRAM=2147483648 -XX:MaxRAMPercentage=80.000000 -XX:MinHeapDeltaBytes=196608 -XX:MinHeapSize=8388608 -XX:NewSize=573177856 -XX:NonNMethodCodeHeapSize=5826188 -XX:NonProfiledCodeHeapSize=122916026 -XX:OldSize=1146486784 -XX:ProfiledCodeHeapSize=122916026 -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:SoftMaxHeapSize=1719664640 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseFastUnorderedTimeStamps -XX:+UseSerialGC
- The calculated heap size is -XX:MaxHeapSize=1719664640, and that corresponds to roughly 80% of the total memory provided to the container (2147483648 * 0.8 is ~1.7GB)!!
- Isn't it a bit weird?!
- My heap size is ~1.7GB and the REQUEST is 1GB !?!?
- REQUEST is only the guaranteed memory that is "reserved" to be given to our process.
You are standing on very thin ice. You can encounter 2 situations:
- You are lucky! Your environment is underutilized, your scheduler has plenty of resources and gives you more than you requested, and if you are really lucky you get the whole LIMIT size.
| ------------ (REQUEST) ------ (MAX HEAP) ------- (LIMIT - OS provided an entire LIMIT) |
- You are not so lucky! Your environment is a bit busy and your scheduler is not able to give you the entire LIMIT; it gives you only what you requested. You start your JVM process, you allocate new objects, and your heap fills up. The RSS metric constantly grows because minor page faults happen: the JVM touches so-far untouched pages and the OS needs to map pages of the process's virtual memory to physical memory (see the sketch after the diagram below for a way to watch it). The JVM keeps allocating new objects without doing a GC, and when your RSS goes over the requested size (1GB in our case) you get an OOM Kill and your container/process is shut down.
| ------------ (REQUEST - OS provided only REQUEST) ----- (MAX HEAP) -------- (LIMIT) |
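A simple way to watch this from inside the container is to check the RSS of the JVM process (usually PID 1 in a container):
# re-run it (or wrap it in watch -n1, if available) and see RSS growing as the heap fills up
grep VmRSS /proc/1/status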
Can you reproduce this situation yourself? Yes, there are two options:
- You will somehow achieve that your environment is busy, then start your application and keep your eyes on the RSS metric and the growing heap
- or you can cheat a bit and use -XX:+AlwaysPreTouch when you start your container. This option ensures that your RSS will not grow along with your heap; instead, all pages dedicated to the JVM heap are touched at the very beginning, which means that the RSS belonging to your heap is immediately fully mapped. AlwaysPreTouch is often used with very large heaps (databases, ...) at the cost of startup time (depending on the heap size it can take minutes - that's why ZGC implemented a parallel pretouch https://malloc.se/blog/zgc-jdk14)
- setup your POD with this resource configuration:
resources:
  limits:
    cpu: '1'
    memory: 1Gi
  requests:
    cpu: '1'
    memory: 1Gi
- run your JVM application with these options:
-Xmx1g -Xms1g -XX:+AlwaysPreTouch
- you shouldn't be able to start the process (you get OOM killed), because the JVM does not consist of heap memory only but of native memory as well, and both together really don't fit into the container :)
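If you want to see what the JVM needs besides the heap (with a configuration where the process actually survives, e.g. a smaller -Xmx), you can enable Native Memory Tracking and ask for a summary - a sketch, app.jar is a placeholder:
# start the JVM with NMT enabled (it adds a small overhead)
java -Xmx512m -XX:+AlwaysPreTouch -XX:NativeMemoryTracking=summary -jar app.jar
# then, from another shell inside the same container:
jcmd 1 VM.native_memory summary
The summary shows the heap, Metaspace, code cache, GC structures, thread stacks, etc. - that's why a 1GB heap cannot fit into a 1GB container.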