What is an "arena" in memory allocation?

I first leanred about "arena" when I was trying to understand the internal of glibc malloc around 2010, but I later realized the concept is narrowly defined in other context. This blog post explains the difference in definition and the limitations in the so-called "arena allocators" we use today.

Background: why is malloc needed?

A program can request memory from OS via system calls such as sbrk() or mmap() on Unix. These system calls are inefficient and inflexible. Deallocation with these low-level functions is particularly complicated. Few programs directly use them for routine memory allocations.

Malloc() solves this problem. It only calls sbrk and mmap to request large chunks of memory and often returns a memory block inside the chunks if the size requested by users is small. This greatly reduces expensive system calls and simplifies the usage.

On Linux, the glibc malloc is the most popular implementation. The FreeBSD malloc was derived from jemalloc which can be used as an independent library as well. There are many other malloc libraries such as mimalloc, rpmalloc and TCMalloc.

Arena in glibc and jemalloc

I could not find the precise definition of arena, but based on the glibc wiki and jemalloc manpage, I would define it as:

An arena is a collection of memory chunks that are managed together and typically share the same lifetime

In principle, malloc only needs one arena. However in this case, we will have to lock the arena when multiple threads access it. Modern malloc implementations all have multiple anenas instead. In the words of glibc manual, this "allows multiple threads to allocate memory simultaneously in separate arenas, thus improving performance". The exact mechanism varies between implementations and is beyond the scope of this blog post.

"The arena allocator" and its limitations

These days, however, I more often see an arena referring to a single memory chunk of fixed size and an "arena allocator" can't deallocate blocks inside the chunk. Such an "arena" is the most simplistic form of arena in comparison to modern malloc implementations. If you do a Google search or ask ChatGPT/DeepSeek, you will end up with this simplistic definition.

Such a simplistic arena allocator has severe limitations. Once the chunk is full, we can't grow it as a realloc call on non-Linux systems may relocate the chunk and invalidate all active pointers pointing to the old chunk. We may solve this by creating a singly linked list to chain multiple chunks, but such an implementation still can only grow and cannot reuse freed memory. It does not work in complex senarios.

More general arena allocators

As a matter of fact, many modern malloc libraries already provide much more sophisticated arena allocators. For example, Doug Lea's malloc, which glibc is indirectly derived from, allows to create a mspace (effectively an arena) which isloates allocations from other mspaces. Mimalloc/rpmalloc provides similar functionality. Their APIs look like those in simplistic arena allocators but they support dynamic growth and deallocation backed by efficient algorithms.

These malloc implementations have thousands of lines of code. I have my own arena allocator kalloc that extends the malloc implementation in the K&R book with additional functionality. It comes with the following core APIs:

void *km_init(void);                 // initialize an arena
void *kmalloc(void *km, size_t sz);  // allocate from arena km
void kfree(void *km, void *ptr);     // return memory to the arena (if needed)
void *km_destroy(void *km);          // free all memory allocated to the arena

Internally, kalloc uses a singly linked list to keep all memory chunks allocated by malloc and uses a circularly linked list to maintain all free blocks inside chunks. It is much smaller but much slower than full-pledge allocators when the arena is highly fragmented with many holes.

Conclusion

"The arena allocator" now widely refers to those using a single memory chunk. It is too late to argue over the definition now given that popular chat bots have followed the trend. With this blog post, I hope readers can learn "arena" has a more general meaning before. It is also worth noting that a simplistic arena allocator operating on a single chunk is not a silver bullet and should be used wisely. Finally, I hope someone could write a standalone arena allocator that is simpler than mimalloc/etc but faster than the K&R algorithm.

attractivechaos/2025-02-02_What-is-an-arena-in-memory-allocation.md

What is an "arena" in memory allocation?

Background: why is malloc needed?

Arena in glibc and jemalloc

"The arena allocator" and its limitations

More general arena allocators

Conclusion

g-berthiaume commented Feb 3, 2025

Uh oh!

attractivechaos commented Feb 4, 2025 •

edited

Loading

Uh oh!

g-berthiaume commented Feb 4, 2025

Uh oh!

attractivechaos/2025-02-02_What-is-an-arena-in-memory-allocation.md

What is an "arena" in memory allocation?

Background: why is malloc needed?

Arena in glibc and jemalloc

"The arena allocator" and its limitations

More general arena allocators

Conclusion

g-berthiaume commented Feb 3, 2025

Uh oh!

attractivechaos commented Feb 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

g-berthiaume commented Feb 4, 2025

Uh oh!

attractivechaos commented Feb 4, 2025 •

edited

Loading