Last active
January 2, 2021 04:16
Example of alignment of structures, and padding fields
#include <stdio.h>

struct Foo {
    char a;
    int b;
    char c;
};

// This little program shows how memory alignment affects the layout of a struct, and how the compiler
// inserts padding into your structures to satisfy it.
//
// The struct Foo holds 6 bytes worth of data (assuming a 4-byte int), so that is its naively expected size, but sizeof
// will most likely return a higher value (it depends on your system; it was 12 on my computer).
//
// This is because the compiler inserts padding bytes so that each field starts at an address that meets its
// alignment requirement. We can verify this by looking at the address of field 'b' in Foo.
//
// Without padding, we would expect the address of foo.b to be address(foo.a) + 1, because sizeof(char) is 1.
// But if we look at the actual address, it is most likely a few bytes further along, because there is
// padding between foo.a and foo.b.
//
// Recall that an unaligned access is when you try to access N bytes of data, but the data is located at an
// address such that address % N != 0 (this assumes the type's alignment equals its size, which is typical for int).
//
// If foo.a takes up 1 byte at address 0 and there were no padding, the integer foo.b (say 4 bytes) would occupy bytes
// 1, 2, 3, and 4. The address of foo.b would be 1, and 1 % 4 != 0, so foo.b would be unaligned. Accessing such an int
// has consequences that depend on the system (ranging from performance issues, to undefined behavior, to crashes).
//
// To align foo.b, it has to start at the next 4-byte boundary, which is address 4, so it occupies bytes 4-7 (recall that foo.a is at 0).
// Bytes 1, 2, and 3 therefore become padding.
// The struct therefore looks more like:
// struct Foo {
//     char a;           // address 0
//     char padding[3];  // address 1, 2, 3
//     int b;            // address 4, 5, 6, 7
//     char c;           // address 8
// };
//
// At this point, the size of the structure is 1 + 3 + 4 + 1 = 9.
//
// Now, consider that you have a bunch of Foo(s) laid out in memory (in an array, for instance). If each Foo occupies 9
// bytes of memory and the first Foo is at address 0 (let's suppose this is the case), then the second Foo begins at address 9,
// and foo.b of the second Foo is at address 13, so the int is misaligned again because 13 % 4 != 0.
//
// This is a roundabout way of showing that the struct _itself_ must also be aligned to some byte boundary, in this case 4,
// and that its size must be a multiple of that alignment. Therefore, the compiler adds 3 more bytes of padding at the end
// of the struct, which keeps every Foo in an array nicely aligned.
//
// The final struct is:
// struct Foo {
//     char a;                   // address 0
//     char padding_for_b[3];    // address 1, 2, 3
//     int b;                    // address 4, 5, 6, 7
//     char c;                   // address 8
//     char padding_for_Foo[3];  // address 9, 10, 11
// };
//
// which occupies 12 bytes.
//
// A caveat:
// In the examples above, foo.a is assumed to be at address 0, which necessitates exactly 3 bytes of padding between foo.a
// and foo.b.
// Now, consider a case where foo.a starts at address 1 (a char by itself can live at any address). In that case, the next
// 4-byte boundary is still address 4, which means we would only need 2 bytes of padding (bytes 2 and 3). HOWEVER, this cannot happen,
// because in C every object must sit at an address that satisfies its type's alignment, and a struct is at least as strictly
// aligned as its most strictly aligned member (here the int b). The address of foo.a is the same as the address of the
// struct itself, so foo.a always sits on a 4-byte boundary (or a stricter one, such as 8). A standalone char (not the first
// member of a struct) carries no such guarantee. So, in this case (given a 4-byte int), the padding between a and b will always be 3.
//
// Read section 5 of this book if it still doesn't make sense: http://www.catb.org/esr/structure-packing
int main(void) {
    struct Foo foo;
    foo.a = 'x';
    foo.b = 5;
    foo.c = 'y';

    // The struct contains these 3 fields
    size_t expected_size = sizeof(char) + sizeof(int) + sizeof(char);
    size_t actual_size = sizeof(struct Foo);
    printf("Size in bytes: expected=%zu, actual=%zu\n", expected_size, actual_size);

    printf("Address of a: %p\n", (void *)&foo.a);
    // Where b would start if there were no padding: one byte past a.
    // Kept as a char* because converting to a misaligned int* is undefined behavior.
    char *maybe = &foo.a + 1;
    printf("Expected address of b: %p\n", (void *)maybe);
    printf("Actual address of b: %p\n", (void *)&foo.b);
    return 0;
}
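
The layout described in the comments can also be checked without pointer arithmetic. The following is a minimal sketch, assuming a C11 compiler: it uses the standard offsetof macro and the _Alignof operator to report where each field of Foo lands. The names Reordered, foo_padding, and print_layout are hypothetical and not part of the original program. Reordered is included only to illustrate that placing the widest member first tends to reduce padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct Foo {
    char a;
    int b;
    char c;
};

/* Hypothetical variant: same fields as Foo, widest member first. */
struct Reordered {
    int b;
    char a;
    char c;
};

/* Padding bytes in Foo: total size minus the sizes of its three fields. */
size_t foo_padding(void) {
    return sizeof(struct Foo) - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Print the offset of every field, plus sizes and alignments. */
void print_layout(void) {
    /* b has to start on a boundary that suits an int. */
    assert(offsetof(struct Foo, b) % _Alignof(int) == 0);

    printf("offset of a: %zu\n", offsetof(struct Foo, a));
    printf("offset of b: %zu\n", offsetof(struct Foo, b));
    printf("offset of c: %zu\n", offsetof(struct Foo, c));
    printf("sizeof(struct Foo): %zu (padding: %zu)\n",
           sizeof(struct Foo), foo_padding());
    printf("alignment of int: %zu, of struct Foo: %zu\n",
           _Alignof(int), _Alignof(struct Foo));
    printf("sizeof(struct Reordered): %zu\n", sizeof(struct Reordered));
}
```

Calling print_layout() from main prints the numbers for your platform. On a system with a 4-byte, 4-byte-aligned int, the offsets should match the walkthrough above (0, 4, and 8, with a total size of 12), and Reordered will likely come out smaller because both chars share the tail of the struct.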