Object/Struct Alignment in C/C++ - c++

#include <iostream>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; //Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
//These two should be the same
cout << "start address of the object: " << &te << endl;
cout << "address of i member: " << &te.i << endl;
//These two should be the same
cout << "start address of the double field: " << &te.h << endl;
cout << "calculate the offset of the double field: " << (&te + sizeof(double)) << endl; //NOT THE SAME
return 0;
}
Output:
size of an int: 4
size of a double: 8
size of test: 24
start address of the object: 0x7fffb9fd44e0
address of i member: 0x7fffb9fd44e0
start address of the double field: 0x7fffb9fd44e8
calculate the offset of the double field: 0x7fffb9fd45a0
Why do the last two lines produce different values? Something I am doing wrong with pointer arithmetic?

(&te + sizeof(double))
This is the same as:
&((&te)[sizeof(double)])
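You should do the arithmetic in bytes instead, using the real offset of h:
(char*)(&te) + offsetof(test, h)
(Adding sizeof(int) to the char* would only be correct if there were no padding between i and h; in your output there are 4 bytes of padding.)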

You are correct -- the problem is with pointer arithmetic.
When you add to a pointer, you advance it by a multiple of the size of the type it points to.
Therefore, &te + 1 will be 24 bytes after &te.
Your code &te + sizeof(double) is &te + 8, so it adds 8 * sizeof(test) = 8 * 24 = 192 bytes.

Firstly, your code is wrong, you'd want to add the size of the fields before h (i.e. an int), there's no reason to assume double. Second, you need to normalise everything to char * first (pointer arithmetic is done in units of the thing being pointed to).
More generally, you can't rely on code like this to work. The compiler is free to insert padding between fields to align things to word boundaries and so on. If you really want to know the offset of a particular field, there's an offsetof macro that you can use. It's defined in <stddef.h> in C, <cstddef> in C++.
Most compilers offer an option to remove all padding (e.g. GCC's __attribute__ ((packed))).
I believe it's only well-defined to use offsetof on standard-layout types (POD types, in pre-C++11 terms).

struct test
{
int i;
int j;
double h;
};
Since your largest member is 8 bytes, the compiler adds padding around your ints to keep the double aligned. Either group the ints together as above (or put the largest member first), or account for the padding yourself. Hope this helps!

&te + sizeof(double) is equivalent to &te + 8, which is equivalent to &((&te)[8]). That is — since &te has type test *, &te + 8 adds eight times the size of a test.

You can see what's going on more clearly using the offsetof() macro:
#include <iostream>
#include <cstddef>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; // Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
cout << "i: size = " << sizeof te.i << ", offset = " << offsetof(test, i) << endl;
cout << "h: size = " << sizeof te.h << ", offset = " << offsetof(test, h) << endl;
cout << "j: size = " << sizeof te.j << ", offset = " << offsetof(test, j) << endl;
return 0;
}
On my system (x86), I get the following output:
size of an int: 4
size of a double: 8
size of test: 16
i: size = 4, offset = 0
h: size = 8, offset = 4
j: size = 4, offset = 12
On another system (SPARC), I get:
size of an int: 4
size of a double: 8
size of test: 24
i: size = 4, offset = 0
h: size = 8, offset = 8
j: size = 4, offset = 16
The compiler will insert padding bytes between struct members to ensure that each member is aligned properly. As you can see, alignment requirements vary from system to system; on one system (x86), double is 8 bytes but only requires 4-byte alignment, and on another system (SPARC), double is 8 bytes and requires 8-byte alignment.
Padding can also be added at the end of a struct to ensure that everything is aligned properly when you have an array of the struct type. On SPARC, for example, the compiler adds 4 bytes of padding at the end of the struct.
The language guarantees that the first declared member will be at an offset of 0, and that members are laid out in the order in which they're declared. (At least that's true for simple structs; C++ metadata might complicate things.)

Compilers are free to space out structs however they want past the first member, and usually use padding to align to word boundaries for speed.
See these:
C struct sizes inconsistence
Struct varies in memory size?
et al.

Related

Is there a way to access members of a struct

I want to be able to find the size of the individual members in a struct. For example
struct A {
int a0;
char a1;
};
Now sizeof(A) is 8, but let's assume I am writing a function that will print the alignment of A as shown below where "aa" represents the padding.
data A:
0x00: 00 00 00 00
0x04: 00 aa aa aa
*-------------------------
size: 8 padding: 3
To calculate the padding, I need to know the size of each individual member of the struct. So my question is: how can I access the individual members of a given struct?
Also, let me know if there is another way to find the amount of padding.
A simple approach would be to use sizeof operator (exploiting the fact that it does not evaluate its operand, only determines the size of the type that would result if it was evaluated) and the offsetof() macro (from <cstddef>).
For example;
#include <iostream>
#include <iomanip> // std::setw, std::setfill
#include <cstddef>
struct A
{
int a0;
char a1;
};
int main()
{
// first calculate sizes
size_t size_A = sizeof(A);
size_t size_a0 = sizeof(((A *)nullptr)->a0); // sizeof will not dereference null
size_t size_a1 = sizeof(((A *)nullptr)->a1);
// calculate positions
size_t pos_a0 = offsetof(A, a0); // will be zero, but calculate it anyway
size_t pos_a1 = offsetof(A, a1);
// now calculate padding amounts
size_t padding_a0 = pos_a1 - pos_a0 - size_a0; // padding between a0 and a1 members
size_t padding_a1 = size_A - pos_a1 - size_a1;
std::cout << "Data A:\n";
std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << pos_a0;
size_t i = pos_a0;
while (i < pos_a0 + size_a0) // print out zeros for bytes of a0 member
{
std::cout << " 00";
++i;
}
while (i < pos_a1) // print out aa for each padding byte after a_0
{
std::cout << " aa";
++i;
}
std::cout << std::endl;
std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << pos_a1;
while (i < pos_a1 + size_a1) // print out zeros for bytes of a1 member
{
std::cout << " 00";
++i;
}
while (i < size_A) // print out aa for each padding byte after a_1
{
std::cout << " aa";
++i;
}
std::cout << std::endl;
std::cout << "size: " << std::dec << size_A << " padding: " << padding_a0 + padding_a1 << std::endl; // back to decimal for the totals
}
You can work this out if you know the content of the struct. Padding usually works like this.
Assume that your struct is this.
struct A {
int a0; // 32 bits (4 bytes)
char a1; // 8 bits (1 byte)
};
But if these structs were packed back to back in memory, the int in the following element would be misaligned, which makes access slower (or outright invalid on some architectures). So the compiler pads the struct, and what it effectively sees is something like this.
struct A {
int a0;
char a1;
// padding
char __padd[3]; // 3 * 1 byte = 3 bytes.
/// Adding this padding makes it 32 bit aligned.
};
Now using this knowledge, you can see why it's not easy to get the padding of an object without knowing its content. And padding isn't always placed at the end of the object. For example,
struct Obj {
int a = 0;
char c = 'b';
// Padding is set here by the compiler. It may look something like this: char __padd[3];
int b = 10;
};
So how to get the padding of the struct?
In a language with reflection you could enumerate the members of the struct at run time. Standard C++ (through C++23) does not provide this, so you need a hand-written list of the members or a reflection library. Once you have each member's offset and size, the padding is the gap between the end of one member and the start of the next, plus whatever is left after the last member.
As other answers have said, the offsetof macro is clearly the best solution here, but just to demonstrate that you could find the positions of your members at run time by looking at the pointers:
#include <iostream>
struct A
{
char x;
int y;
char z;
};
template <typename T>
void PrintSize ()
{
std::cout << " size = " << sizeof(T) << std::endl;
}
void PrintPosition (char * ptr_mem, char * ptr_base)
{
std::cout << " position = " << ptr_mem - ptr_base << std::endl;
}
template <typename T>
void PrintDetails (char member, T * ptr_mem, A * ptr_base)
{
std::cout << member << ":" << std::endl;
PrintSize<T>();
PrintPosition((char*) ptr_mem, (char*) ptr_base);
std::cout << std::endl;
}
int main()
{
A a;
PrintDetails('x', &a.x, &a);
PrintDetails('y', &a.y, &a);
PrintDetails('z', &a.z, &a);
}
Output on my machine:
x:
size = 1
position = 0
y:
size = 4
position = 4
z:
size = 1
position = 8
(On my Intel machine, with gcc/clang, A is of size 12. I had expected the compiler to rearrange the elements, but it is not allowed to reorder members, so it has to pad after both x and z.)
To calculate the trailing padding of a structure, you need to know the offset and size of the last member, and the size of the structure:
Concisely, if type T has a member last which is of type U, the trailing padding is:
sizeof(T) - (offsetof(T, last) + sizeof(U))
To calculate the total amount of padding in a structure, if that is what this question is about, I would use a GCC extension: declare the same structure twice (perhaps with the help of a macro), once without the packed attribute and once with. Then subtract their sizes.
Here is a complete, working sample:
#include <stdio.h>
#define X struct { char a; int b; char c; }
int main(void)
{
printf("%zu\n", sizeof(X) - sizeof(X __attribute__((packed)))); /* %zu: the difference is a size_t */
return 0;
}
For the above structure, it outputs 6. This corresponds to the 3 bytes of padding after a, needed for the four-byte alignment of b, and the 3 bytes at the end of the structure, needed to keep b aligned when the structure is used as an array element.
The packed attribute defeats all padding, and so the difference between the packed and unpacked structure gives us the total amount of padding.

C++ Function Return Issue [duplicate]

I am trying to write a function that returns the size of an array whose elements can be of different data types. I believe the expression (sizeof(z)/sizeof(*z)) gives the memory allocated to z divided by the size of one element. The following code is my attempt to overload the function and return the size of the array as an integer. When I evaluate the expression in main it works, but when I pass the array to the function I do not get the correct values, and I am not sure what I am doing wrong. In main, 68 / 4 = 17, which is the correct size of the array.
(1) outputs sizeof(z) and sizeof(*z) in the size function
(2) expression in main function
(3) outputs sizeof(z) in the main function
(4) outputs sizeof(*z) in the main function
#include <iostream>
using namespace std;
//
int size(int *data){
cout << sizeof(data) << ", " << sizeof(*data) << ", ";
return((sizeof(data))/(sizeof(*data)));
}
int size(char *x){return(sizeof(x)/sizeof(*x));}
int size(float *x){return(sizeof(x)/sizeof(*x));}
int size(double *x){return(sizeof(x)/sizeof(*x));}
int size(short int *x){return(sizeof(x)/sizeof(*x));}
int size(long int *x){return(sizeof(x)/sizeof(*x));}
int main(){
double x[9];
int z[17];
char k[29];
cout << "(1) Size : " << size(z) << endl;
cout << "(2) Size : " << (sizeof(z)/sizeof(*z)) << endl;
cout << "(3) Size : " << sizeof(z) << endl;
cout << "(4) Size : " << sizeof(*z) << endl;
cout << "(5) Size : " << size(z) << endl;
cout << "(6) Size : " << size(k) << endl;
return 0;
}
Terminal Output:
(1) Size : 8, 4, 2
(2) Size : 17
(3) Size : 68
(4) Size : 4
(5) Size : 8, 4, 2
(6) Size : 8
An array decays to a pointer when you pass it to a function, so sizeof(arr) gives the actual size of the array only in the scope where arr was declared as an array. Inside the called function it gives the size of a pointer.
There's a common mistake:
char* a // is a pointer
char b[] // is an array
when using:
char b[5]; // is an array of 5 bytes
sizeof(b); // 5
// but
char* a = b;
sizeof(a); // 8 (x64)
The last sizeof(a) gives you the size of a char *, which is a pointer.
You can pass the name of an array to the size() function, but inside that function the argument is just a pointer, so sizeof no longer sees the array.

Placement new and aligning for possible offset memory

I've been reading up on placement new, and I'm not sure if I'm "getting" it fully or not when it comes to proper alignment.
I've written the following test program to attempt to allocate some memory to an aligned spot:
#include <iostream>
#include <cstdint>
#include <new> // placement new
using namespace std;
unsigned char* mem = nullptr;
struct A
{
double d;
char c[5];
};
struct B
{
float f;
int a;
char c[2];
double d;
};
void InitMemory()
{
mem = new unsigned char[1024];
}
int main() {
// your code goes here
InitMemory();
//512 byte blocks to write structs A and B to, purposefully misaligned
unsigned char* memoryBlockForStructA = mem + 1;
unsigned char* memoryBlockForStructB = mem + 512;
unsigned char* firstAInMemory = (unsigned char*)(uintptr_t(memoryBlockForStructA) + uintptr_t(alignof(A) - 1) & ~uintptr_t(alignof(A) - 1));
A* firstA = new(firstAInMemory) A();
A* secondA = new(firstA + 1) A();
A* thirdA = new(firstA + 2) A();
cout << "Alignment of A Block: " << endl;
cout << "Memory Start: " << (void*)&(*memoryBlockForStructA) << endl;
cout << "Starting Address of firstA: " << (void*)&(*firstA) << endl;
cout << "Starting Address of secondA: " << (void*)&(*secondA) << endl;
cout << "Starting Address of thirdA: " << (void*)&(*thirdA) << endl;
cout << "Sizeof(A): " << sizeof(A) << endl << "Alignof(A): " << alignof(A) << endl;
return 0;
}
Output:
Alignment of A Block:
Memory Start: 0x563fe1239c21
Starting Address of firstA: 0x563fe1239c28
Starting Address of secondA: 0x563fe1239c38
Starting Address of thirdA: 0x563fe1239c48
Sizeof(A): 16
Alignof(A): 8
The output appears to be valid, but I still have some questions about it.
Some questions I have are:
Will fourthA, fifthA, etc... all be aligned as well?
Is there a simpler way of finding a properly aligned memory location?
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
Will fourthA, fifthA, etc... all be aligned as well?
Yes. sizeof(T) is always a multiple of alignof(T), so once firstA is aligned, firstA + n is aligned for every n.
Is there a simpler way of finding a properly aligned memory location?
Yes, see alignas:
http://en.cppreference.com/w/cpp/language/alignas
or std::align:
http://en.cppreference.com/w/cpp/memory/align
as Dan M said.
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
The compiler pads automatically, so d will not be misaligned. Reorganizing the members is only worth it if you care about the space wasted on padding.
The compiler will not reorder the members of a struct for you.
That is because raw data (coming from a file, the network, ...) is often interpreted directly as a struct, and two compilers reordering members differently would break such code.
I hope my explanation is clear and that I did not make any mistakes.

How is an array aligned in C++ compared to a type contained? 2

Following the topic How is an array aligned in C++ compared to a type contained? I made an experiment.
Here is the code:
#include<iostream>
using namespace std;
int main()
{
const int N = 12;
{
float p1[N], p2[N];
cout << p1 << " " << p2 << " " << p2 - p1 << endl;
}
{
float* p1, *p2;
// allocate memory
p1 = new float[N];
p2 = new float[N];
cout << p1 << " " << p2 << " " << p2 - p1 << endl;
delete[] p1;
delete[] p2;
}
}
According to the cited question and wiki I would expect that p1 and p2 would be sizeof(float) == 4 bytes aligned. But the result is:
0x7fff4fd2b410 0x7fff4fd2b440 12
0x7f8cc9c03bd0 0x7f8cc9c03c00 12
The distance between the arrays is the same 12 floats for N = 9, 11 and 12, and the distance (p2 - p1) is 8 for N = 8.
So it looks like float arrays are 16-byte aligned. Why?
P.S.
My processor is Intel Core i7
Compiler - g++ 4.6.3
It appears that putting arrays in a structure one can get 10 floats distance:
const int N = 10;
struct PP{
float p1[N], p2[N];
};
int main() {
PP pp;
cout << pp.p1 << " " << pp.p2 << " " << pp.p2 - pp.p1 << endl;
}
0x7fff50139440 0x7fff50139468 10
Memory allocated by operator new always has suitable alignment for any object type, whatever the actual type being created. Additionally, it may also have a wider alignment (for example, alignment with a cache line) to improve performance.
If you replace your dynamic arrays with automatic ones (or better still, make them consecutive members of a class), then you may see narrower alignments. Or you may not; the exact details of how objects are aligned are up to the compiler, as long as it meets the minimum requirement for the type.

Object array alignment with __attribute__aligned() Or alignas()?

Quick question, guys: do these code snippets have the same alignment?
struct sse_t {
float sse_data[4];
};
// the array "cacheline" will be aligned to 64-byte boundary
struct sse_t alignas(64) cacheline[1000000];
Or
// every object of type sse_t will be aligned to 64-byte boundary
struct sse_t {
float sse_data[4];
} __attribute((aligned(64)));
struct sse_t cacheline[1000000];
Do these code snippets have the same alignment?
Not quite. Your two examples are actually very different.
In your first example, you will get an array of sse_t objects. An sse_t object is only guaranteed 4-byte alignment. But since the entire array is aligned to 64 bytes and each element is 16 bytes, each sse_t object will be properly aligned for SSE access.
In your second example, you are forcing each sse_t object to be aligned to 64 bytes. But each sse_t object holds only 16 bytes of data, so the array will be 4x larger (you will have 48 bytes of padding at the end of each sse_t object).
struct objA {
float sse_data[4];
};
struct objB {
float sse_data[4];
} __attribute((aligned(64)));
int main(){
cout << sizeof(objA) << endl;
cout << sizeof(objB) << endl;
}
Output:
16
64
I'm pretty sure that the second case is not what you want.
But why do you want to align to 64 bytes?
http://ideone.com/JNEIBR
#include <iostream>
using namespace std;
struct sse_t1 {
float sse_data[4];
};
// the array "cacheline" will be aligned to 64-byte boundary
struct sse_t1 alignas(64) cacheline1[1000000];
// every object of type sse_t will be aligned to 64-byte boundary
struct sse_t2 {
float sse_data[4];
} __attribute((aligned(64)));
struct sse_t2 cacheline2[1000000];
int main() {
cout << "sizeof(sse_t1) = " << sizeof(sse_t1) << endl;
cout << "sizeof(sse_t2) = " << sizeof(sse_t2) << endl;
cout << "array cacheline1 " << (((size_t)(cacheline1) % 64 == 0)?"aligned to 64":"not aligned to 64") << endl;
cout << "array cacheline2 " << (((size_t)(cacheline2) % 64 == 0)?"aligned to 64":"not aligned to 64") << endl;
cout << "cacheline1[0] - cacheline1[1] = " << (size_t)&(cacheline1[1]) - (size_t)&(cacheline1[0]) << endl;
cout << "cacheline2[0] - cacheline2[1] = " << (size_t)&(cacheline2[1]) - (size_t)&(cacheline2[0]) << endl;
return 0;
}
Output:
sizeof(sse_t1) = 16
sizeof(sse_t2) = 64
array cacheline1 aligned to 64
array cacheline2 aligned to 64
cacheline1[0] - cacheline1[1] = 16
cacheline2[0] - cacheline2[1] = 64