Created July 24, 2014 16:38
Memory usage on Linux systems
(TODO I'm going to edit and refactor this article a bit)
h2. tl;dr

* The Proportional Set Size (PSS) of a process is the count of pages it
has in memory, where each page is divided by the number of processes
sharing it.
* The Unique Set Size (USS) of a process is the count of unshared pages.
If you kill a process, the USS is the number of pages returned to the
operating system.
* Code means the actual machine-language instructions of the process.
* Stack means memory allocated for called functions' stack frames, local
variables, function arguments, and return values.
* Heap means dynamically allocated memory.
* Other writable means other kinds of writable memory private to the
process.
* Other readonly means other kinds of read-only memory private to a
process.
* Although they don't realise it, people are usually seeking an answer
to the question "if I add feature X, or the user does Y given state Z,
will it cause thrashing?"
h2. What is a process?

A process is an instance of a program, i.e. some blob of executable code
that sits on a disk. For interpreted languages like Python, a process is
e.g. the CPython interpreter executing some scripts.
Memory is allocated to a process in parts called segments: \[3\]

1. Text segment (aka code segment): the machine-language instructions of
the program.
2. Data segment: global and static variables, and read-only data
associated with shared libraries.
3. Stack: dynamically grows and shrinks. Contains a stack of frames for
each currently called function, holding its local variables, function
arguments, and return value.
4. Heap: memory dynamically allocated at run-time.
5. Shared libraries and memory. As discussed below, although each
process perceives this segment as unique to itself, under the covers
the operating system will share the underlying physical pages amongst
many processes.
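These segments show up as mappings in /proc/PID/maps. As a sketch, the awk script below labels a few sample maps lines by segment; the address ranges and inode numbers are made-up illustrations, but the permission flags and the \[heap\]/\[stack\] pseudo-paths are what Linux actually reports:

```shell
# Label sample /proc/PID/maps lines by segment, keyed on the
# permissions field ($2) and the pathname field ($NF).
labels=$(awk '
    $2 == "r-xp" && $NF ~ /\.so/ { print $1, "shared library code"; next }
    $2 == "r-xp" && $NF ~ /^\//  { print $1, "text (code) segment"; next }
    $NF == "[heap]"              { print $1, "heap"; next }
    $NF == "[stack]"             { print $1, "stack"; next }
' <<'EOF'
00400000-00421000 r-xp 00000000 fd:01 658107 /usr/bin/less
01f7c000-01f9d000 rw-p 00000000 00:00 0 [heap]
7f2b3c000000-7f2b3c1c0000 r-xp 00000000 fd:01 123456 /lib/x86_64-linux-gnu/libc-2.17.so
7ffc1a000000-7ffc1a021000 rw-p 00000000 00:00 0 [stack]
EOF
)
echo "$labels"
```

On a live system you would read /proc/self/maps or a real PID instead of the canned sample.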
h2. How do processes access memory?

Processes think they are addressing physical memory, but they are not.
Instead each process addresses a virtual address space, with addresses
from 0 to some maximum value, and the operating system and part of the
CPU translate these virtual addresses to real physical addresses. There
are many benefits to this scheme, only a few of which will be covered in
this article, and all modern operating systems use it. \[1\]

The virtual address space is composed of pages. Typical page sizes are
4KB or 8KB. A virtual page is either valid, meaning it maps to a
physical page or to some part of a hard drive such as a swap partition,
or invalid. Accessing an invalid page is an error, resulting in a
segmentation fault (SIGSEGV, signal 11). \[1\] \[3\]

Even if a page of virtual memory is valid, a process may only use it if
it is resident in physical memory. If it isn't, a page fault occurs and
the kernel steps in, loading the data from the swap partition into
physical memory. \[1\]
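The kernel counts these faults globally. Assuming a Linux /proc filesystem, you can compare total faults (pgfault, mostly cheap "minor" faults that need no disk I/O) against major faults (pgmajfault, those that had to read from disk):

```shell
# Cumulative page-fault counters since boot (Linux-specific).
# pgfault counts all faults; pgmajfault only those requiring disk I/O.
grep -E '^(pgfault|pgmajfault) ' /proc/vmstat
```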
vmstat is an easy way of monitoring "page in" and "page out", i.e. how
many pages the kernel is swapping in and out of physical memory:

{code}
$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  1 1507180 2532556 556472 4848080    0    0     4    17    0    1  2  3 94  0
{code}
Note that non-zero values for "si" (swap in) and "so" (swap out) are not
necessarily bad. The kernel is constantly planning ahead, trying to
optimise the resident set of pages, i.e. those pages loaded into
physical memory. However, a sustained increase in si and so usually
indicates thrashing: the kernel is struggling to fit the active set of
pages into the physical memory available, and you would expect a
corresponding increase in system CPU usage and a decrease in the
perceived responsiveness of processes.
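If you want to watch just si and so over time, a small awk filter over vmstat's output does it. The sample below is canned (it mirrors the output above); on a live system you would pipe `vmstat 1` in instead:

```shell
# Extract the si (swap in) and so (swap out) columns. In vmstat's
# default layout they are fields 7 and 8 of each data row, after
# two header lines.
siso=$(awk 'NR > 2 { print "si=" $7, "so=" $8 }' <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  1 1507180 2532556 556472 4848080    0    0     4    17    0    1  2  3 94  0
EOF
)
echo "$siso"
```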
h2. What are the consequences of virtual memory and on-demand paging?

This seems convoluted. Why not keep processes entirely and permanently
in physical memory? Surely all of a process is required all the time in
order for it to execute? Emphatically and empirically: no. Measurement
and experience show that two principles of locality are largely true:
\[2\] \[3\]

1. Spatial locality. Processes tend to reference memory addresses that
are near others that were recently accessed. For example, both
instructions and data structures tend to be processed sequentially.
2. Temporal locality. Processes tend to reference the same memory
addresses again in the near future. Think of loops, subroutines that
call other subroutines, hot code, etc.

By exploiting locality an operating system can overcommit its memory,
allocating more virtual memory than can be mapped onto physical memory
at once. This also means that a static snapshot of the memory occupancy
of a set of processes will not tell you whether those processes can
continue to execute without heavy swapping, or even without the system
running out of memory.
Another consequence is that the operating system is able to share
physical pages amongst many processes that use the same code, e.g. a
shared library. Each process perceives its virtual memory as unique
but, under the covers, the operating system and the hardware memory
management unit are free to map these distinct virtual pages to the
same physical pages.
h2. An example

Run the following once in a terminal window:

{code}
$ less /proc/self/smaps
00400000-00421000 r-xp 00000000 fd:01 658107 /usr/bin/less
Size:                132 kB
Rss:                 108 kB
Pss:                 108 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:       108 kB
Private_Dirty:         0 kB
Referenced:          108 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd ex mr mw me dw
00620000-00621000 r--p 00020000 fd:01 658107 /usr/bin/less
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
<snip>
{code}
For more detail about what the output means please see \[5\] \[6\] \[7\].
However, an instructive exercise is to open a new terminal window, run
the same command a second time, and compare the two outputs:

{code}
00400000-00421000 r-xp 00000000 fd:01 658107 /usr/bin/less
Size:                132 kB
Rss:                 108 kB
Pss:                  54 kB
Shared_Clean:        108 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:          108 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd ex mr mw me dw
00620000-00621000 r--p 00020000 fd:01 658107 /usr/bin/less
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
<snip>
{code}
Curious! Before, we only had one instance of 'less' running, and the
'r-xp' portion (the text segment) had an RSS of 108KB and a PSS of
108KB. However, the second time we run 'less', the text segment still
has an RSS of 108KB but now a PSS of 54KB.

This simple example contains the core truth that is most important to
take away.

The Resident Set Size (RSS) shown in top combines both the private and
shared pages of a process, which is why it is 108KB for both instances
of 'less'. The Proportional Set Size (PSS) shown in smaps is at first
108KB and then becomes 54KB, because PSS divides a process's pages by
the number of processes sharing them. And of course, by now you know
that the operating system shares the code segment between multiple
processes.
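The arithmetic is just the RSS divided by the number of sharing processes; a trivial sketch using the numbers from the smaps output above:

```shell
# PSS of the `less` text segment: RSS divided by how many processes
# share its pages (1 process at first, then 2).
rss_kb=108
for sharers in 1 2; do
    echo "sharers=${sharers} pss=$(( rss_kb / sharers ))KB"
done
```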
RSS should not be interpreted as "the memory a process is using". The
closest measurement available in Linux for that definition is PSS. As
\[5\] \[6\] put it:

"RSS means less than you think it does. PSS is where the money is."

Often the Unique Set Size (USS) is important too. For more information
on how to calculate it, again see \[5\] \[6\] \[7\]; for an example,
scroll down in your less output to '\[heap\]':
{code}
01f7c000-01f9d000 rw-p 00000000 00:00 0 [heap]
Size:                132 kB
Rss:                  40 kB
Pss:                  40 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        40 kB
Referenced:           40 kB
Anonymous:            40 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac
{code}
By definition the heap is private to a process and not shared. Notice
how, confirming our intuition:

* RSS = PSS. Since the number of processes sharing these pages is one,
PSS = RSS / 1.
* The pages are not Shared, but Private.
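These per-mapping fields roll up naturally: summing Pss over every mapping approximates the memory a process is really using, and summing Private_Clean plus Private_Dirty gives the USS. A sketch over canned field lines taken from the second-run outputs above (on a live system you would feed it /proc/PID/smaps instead):

```shell
# Sum PSS and USS (= Private_Clean + Private_Dirty) across mappings.
# The sample lines are the text-segment and heap fields from above.
totals=$(awk '
    /^Pss:/           { pss += $2 }
    /^Private_Clean:/ { uss += $2 }
    /^Private_Dirty:/ { uss += $2 }
    END { print "PSS=" pss "KB USS=" uss "KB" }
' <<'EOF'
Pss:                  54 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Pss:                  40 kB
Private_Clean:         0 kB
Private_Dirty:        40 kB
EOF
)
echo "$totals"
```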
h2. References

1. Linux System Programming, Chapter 6 (Memory Management)
2. Understanding the Linux Kernel, 3rd Edition
3. The Linux Programming Interface, Chapter 6 (Processes)
4. Understanding Memory (University of Alberta): http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html
5. ELC: How much memory are applications really using? http://lwn.net/Articles/230975/
6. Getting information about a process' memory usage from /proc/pid/smaps: http://unix.stackexchange.com/questions/33381/getting-information-about-a-process-memory-usage-from-proc-pid-smaps
7. smaps root README file in git-dev