@dblalock
Created August 30, 2017 22:11
C / C++ portable aligned memory allocation
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate aligned memory in a portable way.
 *
 * Memory allocated with aligned_alloc *MUST* be freed using aligned_free.
 *
 * Note: C11 declares its own aligned_alloc(size_t, size_t) in <stdlib.h>;
 * rename this function if your toolchain exposes that declaration.
 *
 * @param alignment The number of bytes to which memory must be aligned. This
 *  value *must* be >= 1 and <= 255.
 * @param size The number of bytes to allocate.
 * @param zero If true, the returned memory will be zeroed. If false, the
 *  contents of the returned memory are undefined.
 * @returns A pointer to `size` bytes of memory, aligned to an `alignment`-byte
 *  boundary, or NULL if the underlying allocation fails.
 */
void *aligned_alloc(size_t alignment, size_t size, bool zero) {
    size_t request_size = size + alignment;
    char* buf = (char*)(zero ? calloc(1, request_size) : malloc(request_size));
    if (buf == NULL) {
        return NULL;
    }
    size_t remainder = ((uintptr_t)buf) % alignment;
    // offset is in [1, alignment], so the byte at ret - 1 is always inside buf
    size_t offset = alignment - remainder;
    char* ret = buf + offset;
    // store how many extra bytes we allocated in the byte just before the
    // pointer we return
    *(unsigned char*)(ret - 1) = (unsigned char)offset;
    return (void*)ret;
}

/* Free memory allocated with aligned_alloc */
void aligned_free(void* aligned_ptr) {
    if (aligned_ptr == NULL) {
        return;
    }
    // read the offset as unsigned char so values above 127 are not
    // sign-extended into a negative number
    size_t offset = *(((unsigned char*)aligned_ptr) - 1);
    free(((char*)aligned_ptr) - offset);
}
@dblalock
Author

Exactly. For anyone else reading this, the basic idea is that we add some number of bytes to the pointer returned by malloc/calloc to get an aligned pointer (which we return), and store how many bytes this was in the byte right before the returned address. When freeing, we read this number of bytes so we know the "real" address of the buffer and can pass that to free().

In mediocre ascii art, the layout is this:


pointer returned by malloc    byte where we store # of offset bytes, at address "a-1"
||                            ||
\/       offset bytes         \/                         size bytes
===============================----------------------------------------------------------------
                               ^ address "a" we return

@chaoticbob

Hi,

In the case where remainder works out to be 0 because the allocation happened to come back already aligned, wouldn't writing the offset be out of bounds?

Thanks

@karanbale

I think in that case we need to add a check that sets offset to the alignment, and that should fix it.

@dblalock
Author

In the case that remainder == 0, we get offset = alignment, not offset = 0. So it doesn't end up out of bounds.
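
To make the arithmetic in this reply concrete, here is a minimal sketch of the offset computation on its own. The helper name `header_offset` is hypothetical (it does not appear in the gist); it only mirrors the `alignment - remainder` formula from the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the gist: the number of bytes skipped
 * past a raw address `addr`, using the same formula as the gist. */
static size_t header_offset(uintptr_t addr, size_t alignment) {
    size_t remainder = (size_t)(addr % alignment);
    /* The result is always in [1, alignment] and never 0, so there is
     * always at least one byte before the returned pointer. */
    return alignment - remainder;
}
```

With a 16-byte alignment, an already aligned address such as 0x1000 gives an offset of 16, 0x1001 gives 15, and 0x100F gives 1. Because the request is `size + alignment` bytes, even the largest offset leaves room for all `size` bytes after the returned pointer.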

@kaan2463

kaan2463 commented Mar 7, 2024

Thank you for this implementation.
In C++, though, I stored the offset at the end of the array.
I changed the allocation to `char* buf = new char[alignment + type_size * size + sizeof(size_t)];` and stored the offset with `*((size_t*)((char*)buf+ type_size * size)) = (size_t)alignment - remainder;`.
To free the pointer:

size_t size = mSize;
size_t type_size = sizeof(double);
size_t offset = *((size_t*)((char*)data+ type_size * size));
delete[]((char*)mImagData - offset);
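
For comparison, the same "offset after the payload" idea could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names `trailer_alloc` and `trailer_free` are hypothetical, the trailer is placed relative to the aligned pointer (the snippet above does not make that fully clear), and the caller is assumed to pass the same `size` when freeing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: allocate `size` bytes aligned to `alignment` and
 * keep the offset in a size_t trailer placed right after the payload. */
static void *trailer_alloc(size_t alignment, size_t size) {
    /* worst-case padding + payload + trailer that holds the offset */
    char *buf = malloc(alignment + size + sizeof(size_t));
    if (buf == NULL) {
        return NULL;
    }
    size_t offset = alignment - (size_t)((uintptr_t)buf % alignment);
    char *data = buf + offset;
    /* memcpy avoids a misaligned size_t store when `size` is not a
     * multiple of sizeof(size_t) */
    memcpy(data + size, &offset, sizeof offset);
    return data;
}

/* Hypothetical sketch: `size` must match the value given to trailer_alloc,
 * because it is the only way to locate the trailer. */
static void trailer_free(void *data, size_t size) {
    if (data == NULL) {
        return;
    }
    size_t offset;
    memcpy(&offset, (char *)data + size, sizeof offset);
    free((char *)data - offset);
}
```

The trade-off compared with the one-byte header is that the alignment is no longer capped at 255, but the size has to be tracked separately and supplied again on free.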

@matu3ba

matu3ba commented Jun 23, 2024

Not really sure who still uses non-power of two aligned memory and why alignment is part of the memory (counter to what alignment typically is in C and C++).

With, for example, 16-byte alignment this simplifies to `void * ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);`, and in general a power-of-two alignment lets one use a bit mask instead of the more expensive modulo.

However, this one is much simpler. Feel free to reuse/adjust for storing the alignment:

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void memset_16aligned(void * ptr, char byte, size_t size_bytes, uint16_t alignment) {
    assert((size_bytes & (alignment-1)) == 0); // Size aligned
    assert(((uintptr_t)ptr & (alignment-1)) == 0); // Pointer aligned
    memset(ptr, byte, size_bytes);
}

// 1. Careful with segmented address spaces: look up uintptr_t semantics.
// 2. Careful with long-standing compiler bugs in optimizations of
//    pointer-to-integer-and-back conversions, for example in clang and gcc.
// 3. Careful with LTO, which can trigger problem 2.
// 4. Consider C11 aligned_alloc or posix_memalign.
void ptrtointtoptr(void) {
  const uint16_t alignment = 16;
  const uint16_t align_min_1 = alignment - 1;
  void * mem = malloc(1024+align_min_1);
  if (mem == NULL) return; // allocation failed, nothing to align
  // C89: void *ptr = (void *)(((INT_WITH_PTR_SIZE)mem+align_min_1) & ~(INT_WITH_PTR_SIZE)align_min_1);
  // ie void *ptr = (void *)(((uint64_t)mem+align_min_1) & ~(uint64_t)align_min_1);
  // offset ptr to next alignment byte boundary
  void * ptr = (void *)(((uintptr_t)mem+align_min_1) & ~(uintptr_t)align_min_1);
  printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
  memset_16aligned(ptr, 0, 1024, alignment);
  free(mem); // free the original pointer, not the aligned one
}
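
The mask trick in this comment only works for power-of-two alignments. A small sketch of a generic version, with hypothetical helper names `is_power_of_two` and `align_up` that are not part of the comment's code, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if x has exactly one bit set. */
static bool is_power_of_two(size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: round addr up to the next multiple of alignment.
 * Only valid when alignment is a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t alignment) {
    assert(is_power_of_two(alignment));
    uintptr_t mask = (uintptr_t)alignment - 1;
    /* adding the mask and clearing the low bits rounds up without a modulo */
    return (addr + mask) & ~mask;
}
```

Unlike the header-byte scheme in the gist, an address that is already aligned is returned unchanged here (0x1000 stays 0x1000 for a 16-byte alignment, while 0x1001 becomes 0x1010), so there is no guaranteed spare byte in front of the result.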

@matu3ba

matu3ba commented Jun 23, 2024

other nit: bool (via <stdbool.h>) requires C99; size_t itself is already available in C89 through <stddef.h>.
