#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate aligned memory in a portable way.
 *
 * Memory allocated with aligned_alloc *MUST* be freed using aligned_free.
 *
 * Note: C11's <stdlib.h> declares its own aligned_alloc(alignment, size);
 * rename this function if you compile as C11 or later.
 *
 * @param alignment The number of bytes to which memory must be aligned. This
 *  value *must* be >= 1 and <= 255.
 * @param size The number of bytes to allocate.
 * @param zero If true, the returned memory will be zeroed. If false, the
 *  contents of the returned memory are undefined.
 * @returns A pointer to `size` bytes of memory, aligned to an `alignment`-byte
 *  boundary, or NULL if the underlying allocation fails.
 */
void *aligned_alloc(size_t alignment, size_t size, bool zero) {
    size_t request_size = size + alignment;
    char* buf = (char*)(zero ? calloc(1, request_size) : malloc(request_size));
    if (buf == NULL) return NULL;

    size_t remainder = ((size_t)buf) % alignment;
    size_t offset = alignment - remainder; // always in [1, alignment]
    char* ret = buf + (unsigned char)offset;

    // store how many extra bytes we allocated in the byte just before the
    // pointer we return
    *(unsigned char*)(ret - 1) = (unsigned char)offset;

    return (void*)ret;
}

/* Free memory allocated with aligned_alloc */
void aligned_free(void* aligned_ptr) {
    if (aligned_ptr == NULL) return;
    // read the offset as unsigned so that values above 127 are not
    // misinterpreted as negative on platforms where char is signed
    int offset = *(((unsigned char*)aligned_ptr) - 1);
    free(((char*)aligned_ptr) - offset);
}
Hi,
In the case where remainder works out to be 0 because the allocation happened to come back already aligned, wouldn't writing the offset be out of bounds?
Thanks
I think in that case we need to add a check that sets offset to the alignment, and that should fix it.
In the case that remainder == 0, we get offset = alignment, not offset = 0. So it doesn't end up out of bounds.
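The arithmetic behind that answer can be checked in isolation. The sketch below is only an illustration: `padding_for` is a hypothetical helper (not part of the gist) that reproduces the two lines computing `remainder` and `offset`, so the result is always between 1 and `alignment`, and the byte at `ret - 1` therefore always lies inside the allocated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the number of bytes the gist's aligned_alloc would
 * add to a raw address `addr` for the given alignment. */
size_t padding_for(uintptr_t addr, size_t alignment) {
    assert(alignment >= 1 && alignment <= 255);
    size_t remainder = (size_t)(addr % alignment);
    /* remainder is in [0, alignment - 1], so the result is in
     * [1, alignment]; it is never 0, even for an aligned address. */
    return alignment - remainder;
}
```

For an already aligned address such as 0x1000 with 16-byte alignment, the helper returns 16, so a full alignment's worth of padding is skipped and the offset byte is stored at the last padding byte.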
Thank you for this implementation.
In C++, I stored the offset at the end of the array instead.
I changed the allocation to char* buf = new char[alignment + type_size * size + sizeof(size_t)];
and stored the offset with *((size_t*)((char*)buf+ type_size * size)) = (size_t)alignment - remainder;
To free the pointer:
size_t size = mSize;
size_t type_size = sizeof(double);
size_t offset = *((size_t*)((char*)data+ type_size * size));
delete[]((char*)mImagData - offset);
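A C sketch of this "offset after the data" variant might look as follows. The function names are hypothetical, and two choices here are assumptions rather than the commenter's code: the offset is written and read relative to the same aligned pointer (the fragment above writes relative to buf but reads relative to data), and memcpy is used because the trailer position is not guaranteed to be suitably aligned for a size_t. The caller is assumed to pass an alignment that is a multiple of the alignment required for double.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `count` doubles aligned to `alignment`
 * bytes and record the padding in a size_t placed right after the data. */
double *alloc_doubles_trailer(size_t alignment, size_t count) {
    assert(alignment >= 1);
    size_t payload = count * sizeof(double);
    char *buf = malloc(alignment + payload + sizeof(size_t));
    if (buf == NULL) return NULL;

    size_t offset = alignment - (size_t)((uintptr_t)buf % alignment);
    char *data = buf + offset;

    /* Trailer lives at data + payload; offset <= alignment keeps it
     * inside the allocation. */
    memcpy(data + payload, &offset, sizeof offset);
    return (double *)data;
}

/* Hypothetical helper: free a block from alloc_doubles_trailer. The caller
 * must pass the same `count` that was used for the allocation. */
void free_doubles_trailer(double *data, size_t count) {
    if (data == NULL) return;
    size_t offset;
    memcpy(&offset, (char *)data + count * sizeof(double), sizeof offset);
    free((char *)data - offset);
}
```

The trade-off compared with the one-byte header is that the element count must be known again at free time, but the alignment is no longer limited to 255.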
Not really sure who still uses non-power-of-two aligned memory, or why the alignment is stored as part of the memory (counter to how alignment typically works in C and C++).
With, for example, 16-byte alignment this simplifies to void * ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
and in general, for power-of-two alignments, one can use a bit mask instead of the more expensive modulo.
However, this one is much simpler. Feel free to reuse/adjust for storing the alignment:
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void memset_16aligned(void * ptr, char byte, size_t size_bytes, uint16_t alignment) {
    assert((size_bytes & (alignment-1)) == 0); // Size aligned
    assert(((uintptr_t)ptr & (alignment-1)) == 0); // Pointer aligned
    memset(ptr, byte, size_bytes);
}

// 1. Careful with segmented address spaces: look up uintptr_t semantics
// 2. Careful with long-standing compiler bugs in optimizations of
//    pointer-to-integer-and-back conversions, for example in clang and gcc
// 3. Careful with LTO potentially creating problem 2.
// 4. Consider C11 aligned_alloc or posix_memalign
void ptrtointtoptr(void) {
    const uint16_t alignment = 16;
    const uint16_t align_min_1 = alignment - 1;
    void * mem = malloc(1024+align_min_1);
    if (mem == NULL) return; // allocation failed
    // C89: void *ptr = (void *)(((INT_WITH_PTR_SIZE)mem+align_min_1) & ~(INT_WITH_PTR_SIZE)align_min_1);
    // i.e. void *ptr = (void *)(((uint64_t)mem+align_min_1) & ~(uint64_t)align_min_1);
    // offset ptr to next alignment byte boundary
    void * ptr = (void *)(((uintptr_t)mem+align_min_1) & ~(uintptr_t)align_min_1);
    printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
    memset_16aligned(ptr, 0, 1024, alignment);
    free(mem); // free the original pointer, not the aligned one
}
Another nit: uintptr_t requires C99 (size_t itself is already available in C89).
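The mask trick in the example above is hard-coded to 16 bytes. As a rough generalization, the sketch below rounds an address up to any power-of-two alignment. The names `is_power_of_two` and `align_up` are hypothetical, and the same caveats about converting between pointers and integers apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if x has exactly one bit set. */
int is_power_of_two(size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: round `addr` up to the next multiple of `alignment`.
 * Only valid for power-of-two alignments, where (alignment - 1) is a mask
 * of the low bits that must be zero in an aligned address. */
uintptr_t align_up(uintptr_t addr, size_t alignment) {
    assert(is_power_of_two(alignment));
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (addr + mask) & ~mask;
}
```

Unlike the gist's `alignment - remainder` computation, this leaves an already aligned address unchanged, so there is no guaranteed spare byte in front of the result for storing an offset.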
Exactly. For anyone else reading this, the basic idea is that we add some number of bytes to the pointer returned by malloc/calloc to get an aligned pointer (which we return), and store how many bytes this was in the byte right before the returned address. When freeing, we read this number of bytes so we know the "real" address of the buffer and can pass that to free().
In mediocre ASCII art, the layout is this: