Skip to content

Instantly share code, notes, and snippets.

@Z903
Last active April 9, 2025 14:23
Show Gist options
  • Save Z903/a6ba787f42dd07ad952095bc99087f09 to your computer and use it in GitHub Desktop.
Save Z903/a6ba787f42dd07ad952095bc99087f09 to your computer and use it in GitHub Desktop.
Physical Memory Manager (PMM) Frame Allocator
/*
* Physical Memory Manager (PMM) Frame Allocator by Z903
*
* This allocator manages physical memory frames in both 4KiB and 2MiB sizes using a two-layer
* hierarchical bitmap structure. It's optimized for efficient allocation and deallocation
* through two single-linked lists - one for 4KiB frames and one for 2MiB frames. The allocator
* can manage up to 64 GiB of physical memory.
*
* Public Functions:
* - void pmm_allocator_init(t_multiboot_info* mboot_info, t_u8 reserved_memory):
* Initializes the physical memory allocator with multiboot information.
* Parameters:
* - mboot_info: Pointer to multiboot information structure containing memory map
* - reserved_memory: Memory address below which memory is reserved and will be skipped
*
* - t_s8 pmm_allocator_allocate(bool large):
* Allocates a physical memory frame of specified size.
* Parameters:
* - large: If true, allocates a 2MiB frame; if false, allocates a 4KiB frame
* Return value:
* - Physical base address of the allocated frame
* - -1 if allocation failed (no frames available)
*
* - void pmm_allocator_release(t_u8 frame):
* Releases a previously allocated physical memory frame back to the allocator.
* Parameters:
* - frame: Physical address of the frame to release
* Note: Internal tracking is used to determine whether the frame is 4KiB or 2MiB
*
* - t_u4 pmm_allocator_get_total_available_frames():
* Returns the total ammount of available physical memory
* Return value:
* - Total number of available 4KiB frames
*
* Dual-Mode Operation:
* The allocator maintains two separate free lists to handle both 4KiB and 2MiB pages efficiently:
* 1. 4KiB Mode (Small Pages):
* - next_4k pointer links entries that have individual 4KiB pages available
* - When requested and empty, a 2MiB frame is fragmented into 4KiB pages
* - When freed, the frame is merged back into a 2MiB frame if possible
*
* 2. 2MiB Mode (Large Pages):
* - next_2m pointer links entries that are completely free (all 512 pages available)
* - Used for large memory allocations
*
* Note: An entry can exist in both the 4KiB and 2MiB lists simultaneously. The is_2m flag indicates
* which mode the entry is currently being used in. When an entry is encountered in the wrong list
* (is_2m flag doesn't match the list type), it is popped from that list and the next entry is tried.
* This lazy cleanup is an optimization so we dont have to iterate over all entries.
*
* Synchronization:
* - A global pmm_stack_lock protects the free lists (next_4k and next_2m chains)
* - Each entry has its own lock for modifying pagefield bits
* - Two-phase locking is used to minimize blocking:
* 1. Acquire stack lock, modify list pointers, then release stack lock
* 2. Acquire entry lock, modify entry contents, then release entry lock
*
* Bitmap Structure:
* - Each pmm_entry manages 2MiB of physical memory (512 * 4KiB pages)
* - The bitmap is organized as follows:
* - 16 pagefields (32 pages each = 512 pages total per entry)
* - has_pages: 16-bit field where each bit indicates if corresponding pagefield has any free pages
* - full_pages: 16-bit field where each bit indicates if corresponding pagefield is completely full
*
* Memory Layout:
* - Frame addresses are encoded as follows:
* - Bits 21+: Index into pmm_bitmap array
* - Bits 17-20: Entry index (0-15) within pagefield array
* - Bits 12-16: Page index (0-31) within selected pagefield
* - Bits 0-11: Offset within the page are ignored
*
* TODO:
* There are a number of todo's in the code. How you want to handle them is up to you.
* Dynamically allocate the bitfied and change pmm_entry.pagefield static array to be pointers into that table.
* pmm_index_low is unused and we should use it to store low memory for legacy DMA.
*/
#include "types.h"
#include <stdbool.h>
#ifdef _MSC_VER
#include <intrin.h>
#endif
typedef struct
{
// Emtries below must hold pmm_stack_lock
t_s2 next_4k; // Next entry in 4KiB list
t_s2 next_2m; // Next entry in 2MiB list
// Entries below must hold pmm_entry->lock
t_u4 lock; // Synchronization pmm_entry lock
bool is_2m; // Flag to indicate if entry is being used as 2MiB page
t_u2 has_pages; // Bits 0-15: has_pages flags (1 if pagefield[i] != 0)
t_u2 full_pages; // Bits 0-15: full_pages flags (1 if pagefield[i] == 0xFFFFFFFF)
t_u4 pagefield[16]; // Array of 16 page fields, each bit represents a 4KiB page
} __attribute__((__packed__)) pmm_entry;
// Bitmap for pages, 1 entry is 512 frames (4KiB * 512 frames/entry * 32768 entries = 64 GiB)
static pmm_entry pmm_bitmap[32768];
// Index of the first free frame (low memory < 16 MiB)
// Must hold pmm_stack_lock lock
// static t_s2 pmm_index_low = -1;
// Index of the first free frame (high memory >= 16 MiB)
// Must hold pmm_stack_lock lock
static t_s2 pmm_index_high_4k = -1;
static t_s2 pmm_index_high_2m = -1;
// Frame size 12 bits (4KiB)
const t_u8 pmm_size = 12;
// Add at the top with other globals
static t_u4 pmm_stack_lock = 0;
static t_s4 pmm_total_available_frames = 0;
void pmm_allocator_init()
{
for (t_s4 i = lenof(pmm_bitmap) - 1; i >= 0; i--)
{
for (int j = 0; j < 16; j++) {
pmm_bitmap[i].pagefield[j] = 0;
}
pmm_bitmap[i].has_pages = 0;
pmm_bitmap[i].full_pages = 0;
pmm_bitmap[i].lock = 0;
pmm_bitmap[i].next_4k = -1;
pmm_bitmap[i].next_2m = -1;
pmm_bitmap[i].is_2m = false;
}
}
//void pmm_allocator_init(t_multiboot_info* _mboot_info, t_u8 reserved_memory)
//{
// // Zero pmm_bitmap (should be clear anyway)
// for (t_s4 i = lenof(pmm_bitmap) - 1; i >= 0; i--)
// {
// for (int j = 0; j < 16; j++) {
// pmm_bitmap[i].pagefield[j] = 0;
// }
// pmm_bitmap[i].has_pages = 0;
// pmm_bitmap[i].full_pages = 0;
// pmm_bitmap[i].lock = 0;
// pmm_bitmap[i].next_4k = -1;
// pmm_bitmap[i].next_2m = -1;
// pmm_bitmap[i].is_2m = false;
// }
//
// t_u8 available_memory = 0;
// t_multiboot_info_mmap* mmap = (t_multiboot_info_mmap*)_mboot_info->mmap_address;
//
// // Align to 4KiB
// t_u8 mask = (1ULL << pmm_size) - 1;
// for (mmap = (t_multiboot_info_mmap*)_mboot_info->mmap_address; (t_u4)mmap < _mboot_info->mmap_address + _mboot_info->mmap_length; mmap++)
// {
// // Usable Ram
// if (mmap->type == 1)
// {
// // Mask off the lower bits and round to page size
// t_u8 addr_s = ((mmap->address > reserved_memory ? mmap->address : reserved_memory) + mask) & (~mask);
// t_u8 addr_e = ((mmap->address + mmap->length) & (~mask)) - 1ULL;
// if (addr_e + 1ULL <= addr_s) continue;
//
// console_write_string("\nMemory 0x"); console_write_hex64(addr_s);
// console_write_string(" - 0x"); console_write_hex64(addr_e);
// console_write_string(" ["); console_write_hex64(addr_e - addr_s);
// console_write_string("]");
//
// for (t_u8 addr_p = addr_s; addr_p < addr_e; addr_p += (mask + 1))
// {
// // Add free page to memory manager
// pmm_allocator_release(addr_p);
//
// // Add 4KiB
// available_memory += (1ULL << pmm_size);
// }
// }
// }
// console_write_string("\nFree Ram 0x"); console_write_hex64(available_memory);
//}
// Atomic and lock operations
#ifdef _MSC_VER
static inline void pmm_spinlock_acquire(volatile t_u4* lock) {
while (1) {
if (_InterlockedCompareExchange((volatile long*)lock, 1, 0) == 0)
{
break;
}
// Spin without the lock prefix
while (*lock != 0) {
// Add a small delay or yield instruction for better performance
_mm_pause();
}
}
}
static inline void pmm_spinlock_release(volatile t_u4* lock) {
_InterlockedExchange((volatile long*)lock, 0);
}
static inline t_s4 pmm_atomic_add(volatile t_s4* value, t_s4 amount) {
return _InterlockedExchangeAdd((volatile long*)value, amount);
}
#else
static inline void pmm_spinlock_acquire(volatile t_u4* lock) {
while (1) {
t_u4 expected = 0;
if (__atomic_compare_exchange_n(lock, &expected, 1, 1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
break;
}
// Spin without the lock prefix
while (*lock != 0) {
// Add a small delay or yield instruction for better performance
__builtin_ia32_pause();
}
}
}
static inline void pmm_spinlock_release(volatile t_u4* lock) {
__atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
static inline t_s4 pmm_atomic_add(volatile t_s4* value, t_s4 amount) {
return __atomic_fetch_add(value, amount, __ATOMIC_RELAXED);
}
#endif
// Count Trailing Zeros for 32-bit integers
static const t_u1 pmm_ctz_table[16] = { 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0 };
static t_s4 pmm_ctz32(t_u4 x)
{
if (x == 0) return -1;
t_s4 n = 0;
if ((x & 0x0000FFFF) == 0) { n += 16; x >>= 16; }
if ((x & 0x000000FF) == 0) { n += 8; x >>= 8; }
if ((x & 0x0000000F) == 0) { n += 4; x >>= 4; }
n += (t_s4)pmm_ctz_table[x & 0xF];
return n;
}
// Pop the stack and return the locked entry index else return -1
static t_s4 pmm_stack_pop_and_lock(bool large)
{
t_s2* tos = large ? &pmm_index_high_2m : &pmm_index_high_4k;
while (true) {
pmm_spinlock_acquire(&pmm_stack_lock);
t_s2 old_index = *tos;
if (old_index < 0) {
pmm_spinlock_release(&pmm_stack_lock);
return -1;
}
pmm_entry* entry = &pmm_bitmap[old_index];
t_s2* new_index = large ? &entry->next_2m : &entry->next_4k;
// Pop the frame from the stack
*tos = *new_index;
*new_index = -1;
pmm_spinlock_release(&pmm_stack_lock);
// Access the entry
pmm_spinlock_acquire(&entry->lock);
// Entry might be in the wrong list, check if it's in the correct list
if (entry->is_2m != large) {
pmm_spinlock_release(&entry->lock);
continue;
}
return old_index;
}
}
// Push the entry back onto the stack
// Must hold entry->lock
static void pmm_stack_push(t_s2 index)
{
if (index < 0 || index >= lenof(pmm_bitmap)) {
return;
}
pmm_spinlock_acquire(&pmm_stack_lock);
pmm_entry* entry = &pmm_bitmap[index];
if (entry->is_2m) {
// Entry next_2m = -1 here
entry->next_2m = pmm_index_high_2m;
pmm_index_high_2m = index;
} else {
// Entry next_4k = -1 here
entry->next_4k = pmm_index_high_4k;
pmm_index_high_4k = index;
}
pmm_spinlock_release(&pmm_stack_lock);
}
// Helper function to set or clear a bit for a specific frame
// Returns the previous value of the bit (true if set, false if cleared)
// Must hold entry->lock
static bool pmm_bit_operation(t_u8 frame, bool set)
{
// Extract the indices from the frame address
t_u4 page_index = (frame >> 12) & 0x1F; // 5 bits for page index (32 pages)
t_u4 entry_index = (frame >> 17) & 0xF; // 4 bits for entry index (16 entries)
t_u4 index = (t_u4)(frame >> 21); // Rest of bits for index
pmm_entry* entry = &pmm_bitmap[index];
// Get the current bit value
bool previous_value = (entry->pagefield[entry_index] & (1u << page_index)) != 0;
if (set) {
// Set the bit in pagefield
entry->pagefield[entry_index] |= 1u << page_index;
// Update entryfield - set has_pages and full bits based on pagefield state
entry->has_pages |= (entry->pagefield[entry_index] != 0x00000000) << entry_index;
entry->full_pages |= (entry->pagefield[entry_index] == 0xFFFFFFFF) << entry_index;
} else {
// Clear the bit for this page
entry->pagefield[entry_index] &= ~(1u << page_index);
// Update entryfield - clear has_pages if empty, always clear full bit
entry->has_pages &= ~((entry->pagefield[entry_index] == 0x00000000) << entry_index);
entry->full_pages &= ~(1u << entry_index);
}
return previous_value;
}
// Allocate a frame
// Returns the frame physical address if successful, -1 if failed
t_s8 pmm_allocator_allocate(bool large)
{
if (large) {
// Try to pop a 2MiB frame
t_s4 index = pmm_stack_pop_and_lock(true);
if (index == -1) {
return -1;
}
pmm_entry* entry = &pmm_bitmap[index];
// For 2MiB pages, we only need to clear entryfield
entry->full_pages = 0x0000;
// is_2m is already set since we popped from 2MiB list
pmm_spinlock_release(&entry->lock);
// Decrease available frames count by 512 (2MiB = 512 * 4KiB)
pmm_atomic_add(&pmm_total_available_frames, -512);
// Calculate the frame address (index * 2MiB)
return ((t_s8)index) << 21;
}
while (1) {
// Try to pop from 4k stack first then 2m stack if 4k stack is empty
t_s4 index = pmm_stack_pop_and_lock(false);
if (index == -1) {
index = pmm_stack_pop_and_lock(true);
if (index == -1) {
// We failed to allocate a frame
return -1;
}
}
pmm_entry* entry = &pmm_bitmap[index];
// Find first set bit in entryfield
t_s4 entry_index = pmm_ctz32(entry->has_pages);
if (entry_index == -1) {
// This should never be called, but if it happens; pop the next frame
pmm_spinlock_release(&entry->lock);
continue;
}
// Get the pagefield for this entry
t_u4 pagefield = entry->pagefield[entry_index];
if (pagefield == 0) {
// This should never be called, but if it happens; cleanup and then pop the next frame
// Clear the bit in entryfield since this pagefield is empty
entry->has_pages &= ~(1u << entry_index);
// Push back onto stack if there are other entries
if (entry->has_pages != 0x0000) {
pmm_stack_push(index);
}
pmm_spinlock_release(&entry->lock);
continue;
}
// Find first free page in pagefield
t_s4 page_index = pmm_ctz32(pagefield);
// Calculate the frame address (index * 2MiB + entry_index * 128KiB + page_index * 4KiB)
t_s8 frame = (((t_s8)index) << 21) | (((t_s8)entry_index) << 17) | (((t_s8)page_index) << 12);
// Clear the bit for this page
pmm_bit_operation(frame, false);
entry->is_2m = false;
// Push entry back if it still has free frames
if (entry->has_pages != 0x0000) {
pmm_stack_push(index);
}
pmm_spinlock_release(&entry->lock);
// Decrease available frames count by 1 (4KiB frame)
pmm_atomic_add(&pmm_total_available_frames, -1);
return frame;
}
}
// Release a frame by physical address
void pmm_allocator_release(t_u8 frame)
{
if (frame == 0 || frame >= 0x0000001000000000) return; // 64 GiB
// Extract the indices from the frame address
t_u4 page_index = (frame >> 12) & 0x1F; // 5 bits for page index (32 pages)
t_u4 entry_index = (frame >> 17) & 0xF; // 4 bits for entry index (16 entries)
t_u4 index = (t_u4)(frame >> 21); // Rest of bits for index
pmm_entry* entry = &pmm_bitmap[index];
pmm_spinlock_acquire(&entry->lock);
// Check if this is a 2MiB release
if (entry->is_2m) {
// For 2MiB pages, just set entryfield to full and mark as 2MiB
if (entry->full_pages != 0x0000) {
// Double free detected!
// TODO: Handle error - could panic, log, or return
pmm_spinlock_release(&entry->lock);
return;
}
entry->full_pages = 0xFFFF;
pmm_stack_push(index);
pmm_spinlock_release(&entry->lock);
// Increase available frames count by 512 (2MiB = 512 * 4KiB)
pmm_atomic_add(&pmm_total_available_frames, 512);
return;
}
// Regular 4KiB page release
// Get the previous state before we set the bit
bool was_empty = entry->has_pages == 0x0000;
bool was_full = entry->full_pages == 0xFFFF;
// Set the bit and check for double-free
if (pmm_bit_operation(frame, true) == true) {
// Double free detected!
// TODO: Handle error - could panic, log, or return
pmm_spinlock_release(&entry->lock);
return;
}
// Check if it's fully populated (all bits in all pagefields set)
bool is_full = entry->full_pages == 0xFFFF;
// Only push if we're transitioning between states
if (is_full && !was_full) {
// Transitioning to full - push to 2MiB list
entry->is_2m = true;
pmm_stack_push(index);
} else if (was_empty && !is_full) {
// Transitioning from empty to partial - push to 4KiB list
entry->is_2m = false;
pmm_stack_push(index);
}
pmm_spinlock_release(&entry->lock);
// Increase available frames count by 1 (4KiB frame)
pmm_atomic_add(&pmm_total_available_frames, 1);
}
// Get the total available 4KiB frames
t_u4 pmm_allocator_get_total_available_frames() {
return (t_u4)pmm_total_available_frames;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment