Any running process has several memory regions: code, read-only data, read-write data, et cetera. Some regions, such as code and read-only data, are static and do not change over time. Other regions are dynamic: they can expand and shrink. Usually there are two such regions: dynamic read-write data region, called heap, and a region called stack. Heap holds dynamic memory allocations, and stack is mostly used for keeping function frames.
Both stack and heap can grow. An OS doesn't know in advance whether stack or heap will be used predominantly. Therefore, an OS must layout these two memory regions in a way to guarantee maximum space for both. And here is the solution:
- Layout static memory regions at the edges of process's virtual memory
- Put heap and stack on edges too, and let them grow towards each other: one grows up, one grows down
This solution is reflected in a hardware design: most CPUs have support for stack, and it grows down. For example, 32-bit Linux implements this layout in the following way:
- code, shared libraries and read-only data are mapped at the beginning of the memory
- then goes heap that grows up
- kernel is mapped at the last 1G
- stack starts at kernel's lower boundary and grows down
Here's the simplified illustration of the memory map (Windows uses similar layout):
|---| DLLs | code | data | heap |--> <---| stack | kernel |
0 3G 4G
(first page is never mapped to catch NULL dereferences)
That is fine for single-threaded programs. Multi-threading requires separate stack for each thread. Therefore, main thread's stack begins at it's usual place - at the kernel boundary. Next thread's stack starts from some offset that defines the maximum stack size of the main thread. Thread API allows to set stack size.
|---| DLLs | code | data | heap |--> <--| stack 2 | <--| stack 1 | kernel |
0 3G 4G
Many threads can consume a lot of virtual space, which can be a problem on 32-bit machine. E.g. a program with 2000 threads, each taking a default of 1M stack size, eats about 2G or virtual memory, leaving very little space for heap. In such case, thread stack size should be reduced.
In the majority of modern architectures, stack grows down. However, there are some processors (e.g. B5000) where stack grows up. Some architecrures (e.g. System Z, RCA1802A) allow to choose stack direction. SEAforth/GreenArrays, SPARC processors have cyclical stack.
Author: Sergey Lyubka, January 2014
Corrected by: Vsevolod Stakhov, January 2014
Corrected by: Peter Sovietov, January 2014