/* Allocate aligned memory in a portable way.
 *
 * Memory allocated with aligned_alloc *MUST* be freed using aligned_free.
 * Requires <stdbool.h> and <stdlib.h>. Note that C11's <stdlib.h> declares its
 * own two-argument aligned_alloc, so rename this function when building as
 * C11 or later.
 *
 * @param alignment The number of bytes to which memory must be aligned. This
 *  value *must* be >= 1 and <= 255.
 * @param size The number of bytes to allocate.
 * @param zero If true, the returned memory will be zeroed. If false, the
 *  contents of the returned memory are undefined.
 * @returns A pointer to `size` bytes of memory, aligned to an `alignment`-byte
 *  boundary, or NULL if the underlying allocation fails.
 */
void *aligned_alloc(size_t alignment, size_t size, bool zero) {
    size_t request_size = size + alignment;
    char* buf = (char*)(zero ? calloc(1, request_size) : malloc(request_size));
    if (buf == NULL) return NULL;

    size_t remainder = ((size_t)buf) % alignment;
    size_t offset = alignment - remainder;
    char* ret = buf + (unsigned char)offset;

    // store how many extra bytes we allocated in the byte just before the
    // pointer we return
    *(unsigned char*)(ret - 1) = offset;
    return (void*)ret;
}

/* Free memory allocated with aligned_alloc */
void aligned_free(void* aligned_ptr) {
    // read the offset as unsigned so that values above 127 are not
    // sign-extended into a negative number
    int offset = *(((unsigned char*)aligned_ptr) - 1);
    free(((char*)aligned_ptr) - offset);
}
that's cool, thx :)
Hello, can you explain exactly what lines #19 and #23 are doing here?
char *ret = buf + (unsigned char)offset;
and
*(unsigned char*)(ret-1) = offset;
Why are we adding offset to the buffer in line #19 when it already holds requested_bytes + alignment bytes? What exactly is being done in those two lines, and why? Appreciate the answer :)
Answering my own question here:
I think I understand the code now.
char *ret = buf + (unsigned char)offset;
Here, we're setting a new pointer that is offset bytes ahead of the address in buf.
E.g. if we want to allocate 68 bytes of 16-byte-aligned memory, it would look something like this:
request_size = 68 + 16 = 84
and buf = 0x11223341
then
remainder = ((size_t)buf) % 16 = 0x11223341 % 16 = 1
offset = 16 - 1 = 15 (i.e. 0x0F)
ret = buf + offset = 0x11223341 + 0x0F = 0x11223350
Now we store the offset in the byte at address ret - 1 and read it back when freeing the memory.
When freeing, we take the returned address and subtract the offset from it to get back to the original buf address of 0x11223341,
and then free the entire allocation, ensuring there is no memory leak!
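The address arithmetic can be checked with a small sketch. The helper names below are hypothetical and the address 0x11223341 is only an illustrative value, not something malloc is guaranteed to return. The point is that the remainder comes from the address stored in buf, not from the requested size, so for this address the remainder is 1 and the offset is 15.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes to skip from `addr` to reach the next
 * `alignment`-byte boundary, using the same formula as the gist
 * (offset = alignment - addr % alignment). */
uintptr_t offset_for(uintptr_t addr, uintptr_t alignment) {
    uintptr_t remainder = addr % alignment;
    return alignment - remainder;
}

/* Hypothetical helper: the aligned address that would be returned. */
uintptr_t aligned_address(uintptr_t addr, uintptr_t alignment) {
    return addr + offset_for(addr, alignment);
}

/* Hypothetical helper: undo the adjustment, as the free routine does. */
uintptr_t original_address(uintptr_t aligned, uintptr_t offset) {
    return aligned - offset;
}
```

For buf = 0x11223341 and a 16-byte alignment, offset_for gives 15 and aligned_address gives 0x11223350; subtracting the stored 15 recovers the original address.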
Exactly. For anyone else reading this, the basic idea is that we add some number of bytes to the pointer returned by malloc/calloc to get an aligned pointer (which we return), and store how many bytes this was in the byte right before the returned address. When freeing, we read this number of bytes so we know the "real" address of the buffer and can pass that to free().
In mediocre ascii art, the layout is this:
pointer returned by malloc   byte where we store # of offset bytes, at address "a-1"
||                           ||
\/        offset bytes       \/          size bytes
===============================----------------------------------------------------------------
                               ^ address "a" we return
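A round trip through this layout might look like the sketch below. The functions are renamed copies of the gist's routines (the sketch_ prefix and round_trip_ok are hypothetical names, chosen so they do not clash with C11's own aligned_alloc), with a NULL check added. It assumes 1 <= alignment <= 255, as the gist requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Renamed copy of the gist's allocator (hypothetical name). */
void *sketch_aligned_alloc(size_t alignment, size_t size, bool zero) {
    size_t request_size = size + alignment;
    char *buf = (char *)(zero ? calloc(1, request_size) : malloc(request_size));
    if (buf == NULL) return NULL;
    size_t offset = alignment - ((uintptr_t)buf % alignment); /* 1..alignment */
    char *ret = buf + offset;
    *(unsigned char *)(ret - 1) = (unsigned char)offset; /* byte at "a-1" */
    return ret;
}

/* Renamed copy of the gist's free routine (hypothetical name). */
void sketch_aligned_free(void *aligned_ptr) {
    if (aligned_ptr == NULL) return;
    unsigned char offset = *((unsigned char *)aligned_ptr - 1);
    free((char *)aligned_ptr - offset);
}

/* Allocates, checks the layout, writes the whole payload, then frees. */
bool round_trip_ok(size_t alignment, size_t size) {
    unsigned char *a = sketch_aligned_alloc(alignment, size, true);
    if (a == NULL) return false;
    bool aligned = ((uintptr_t)a % alignment) == 0;
    unsigned char offset = a[-1];             /* stored just before "a" */
    bool in_range = offset >= 1 && offset <= alignment;
    memset(a, 0xAB, size);                    /* payload is fully writable */
    sketch_aligned_free(a);
    return aligned && in_range;
}
```

The payload write never touches the stored offset, because that byte lives in the padding before the returned address.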
Hi,
In the case where remainder works out to be 0, because the allocation happened to come back already aligned, wouldn't the write of the offset be out of bounds?
Thanks
I think in that case we need to add a check that sets offset to alignment, and that should fix it.
In the case that remainder == 0, we get offset = alignment, not offset = 0. So it doesn't end up out of bounds.
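This can be checked exhaustively for small inputs. The sketch below (offset_always_in_bounds is a hypothetical name) applies the gist's formula to a range of made-up addresses and every permitted alignment, and confirms that the offset is never 0 and never larger than the alignment, so the byte at ret - 1 always lies inside the allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if, for every tested address and alignment, the gist's formula
 * yields an offset in 1..alignment and an aligned result. The addresses are
 * plain integers used for illustration; nothing is dereferenced. */
bool offset_always_in_bounds(void) {
    for (uintptr_t alignment = 1; alignment <= 255; alignment++) {
        for (uintptr_t addr = 0; addr < 4096; addr++) {
            uintptr_t offset = alignment - (addr % alignment);
            if (offset < 1 || offset > alignment) return false;
            if ((addr + offset) % alignment != 0) return false;
        }
    }
    return true;
}
```

Because the offset is at least 1, an already-aligned buffer costs a full extra `alignment` bytes of padding; that is the price of always having room for the stored byte.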
Thank you for this implementation.
But in C++ I stored the offset at the end of the array instead.
I modified the allocation to char* buf = new char[alignment + type_size * size + sizeof(size_t)];
and stored the offset with *((size_t*)((char*)buf+ type_size * size)) = (size_t)alignment - remainder;
And to free the pointer:
size_t size = mSize;
size_t type_size = sizeof(double);
size_t offset = *((size_t*)((char*)data+ type_size * size));
delete[]((char*)mImagData - offset);
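For comparison, here is a minimal C sketch of the same "offset at the end" idea. It is not the commenter's code: the function names are hypothetical, the offset is written right after the payload measured from the aligned pointer, and memcpy is used because that position is not guaranteed to be suitably aligned for a size_t. The caller has to pass the payload size again when freeing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout: [padding: offset bytes][payload: size bytes][size_t offset]
 * Hypothetical name; assumes alignment >= 1. */
void *tail_aligned_alloc(size_t alignment, size_t size) {
    char *buf = malloc(alignment + size + sizeof(size_t));
    if (buf == NULL) return NULL;
    size_t offset = alignment - ((uintptr_t)buf % alignment);
    char *ret = buf + offset;
    /* memcpy avoids an unaligned size_t store at ret + size */
    memcpy(ret + size, &offset, sizeof offset);
    return ret;
}

/* `size` must be the same payload size that was passed to the allocator. */
void tail_aligned_free(void *aligned_ptr, size_t size) {
    if (aligned_ptr == NULL) return;
    size_t offset;
    memcpy(&offset, (char *)aligned_ptr + size, sizeof offset);
    free((char *)aligned_ptr - offset);
}
```

Storing a full size_t removes the 255-byte limit on the alignment, at the cost of sizeof(size_t) extra bytes and of having to remember the payload size.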
Not really sure who still uses non-power-of-two alignment, or why the alignment offset is stored as part of the memory (counter to how alignment is typically handled in C and C++).
With, for example, 16-byte alignment this simplifies to void * ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
and in general one can use a bit mask instead of the more expensive modulo.
However, this one is much simpler. Feel free to reuse/adjust for storing the alignment:
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void memset_16aligned(void * ptr, char byte, size_t size_bytes, uint16_t alignment) {
    assert((size_bytes & (alignment-1)) == 0); // Size aligned
    assert(((uintptr_t)ptr & (alignment-1)) == 0); // Pointer aligned
    memset(ptr, byte, size_bytes);
}

// 1. Careful with segmented address spaces: look up uintptr_t semantics.
// 2. Careful with long-standing compiler bugs in optimizations of
//    pointer-to-integer-and-back conversions, for example in clang and gcc.
// 3. Careful with LTO potentially creating problem 2.
// 4. Consider C11 aligned_alloc or posix_memalign.
void ptrtointtoptr(void) {
    const uint16_t alignment = 16;
    const uint16_t align_min_1 = alignment - 1;
    void * mem = malloc(1024+align_min_1);
    if (mem == NULL) return; // allocation failed, nothing to align
    // C89: void *ptr = (void *)(((INT_WITH_PTR_SIZE)mem+align_min_1) & ~(INT_WITH_PTR_SIZE)align_min_1);
    // i.e. void *ptr = (void *)(((uint64_t)mem+align_min_1) & ~(uint64_t)align_min_1);
    // offset ptr to next alignment byte boundary
    void * ptr = (void *)(((uintptr_t)mem+align_min_1) & ~(uintptr_t)align_min_1);
    printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
    memset_16aligned(ptr, 0, 1024, alignment);
    free(mem);
}
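The mask expression above is only valid when the alignment is a power of two. A small sketch of that rounding step as a reusable helper, with a guard for the power-of-two requirement (is_power_of_two and align_up are hypothetical names), could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, 8, ...; false for 0 and anything with more than one bit set. */
bool is_power_of_two(uintptr_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Rounds `addr` up to the next multiple of `alignment` using a mask instead
 * of a modulo. An address that is already aligned is returned unchanged.
 * `alignment` must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment) {
    assert(is_power_of_two(alignment));
    uintptr_t mask = alignment - 1;
    return (addr + mask) & ~mask;
}
```

Unlike the offset formula in the original gist, this leaves an already-aligned address where it is, so it moves the pointer by 0 to alignment - 1 bytes and leaves no guaranteed spare byte in front of the result for bookkeeping.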
Other nit: uintptr_t requires C99 (size_t itself is already available in C89 via <stddef.h>).
Yes, that's the idea. It just also allocates extra space before this pointer to guarantee the proper alignment.