Is there a way to access members of a struct - c++

I want to be able to find the size of the individual members in a struct. For example
struct A {
int a0;
char a1;
};
Now sizeof(A) is 8, but let's assume I am writing a function that will print the alignment of A as shown below where "aa" represents the padding.
data A:
0x00: 00 00 00 00
0x04: 00 aa aa aa
*-------------------------
size: 8 padding: 3
In order to calculate the padding, I need to know the size of each individual member of the struct. So my question is: how can I access the individual members of a given struct?
Also, let me know if there is another way to find the amount of padding.

A simple approach would be to use the sizeof operator (exploiting the fact that it does not evaluate its operand, only determines the size of the type that would result if it was evaluated) and the offsetof() macro (from <cstddef>).
For example:
#include <iostream>
#include <iomanip> // std::setw, std::setfill
#include <cstddef>
struct A
{
int a0;
char a1;
};
int main()
{
// first calculate sizes
size_t size_A = sizeof(A);
size_t size_a0 = sizeof(((A *)nullptr)->a0); // sizeof will not dereference null
size_t size_a1 = sizeof(((A *)nullptr)->a1);
// calculate positions
size_t pos_a0 = offsetof(A, a0); // will be zero, but calculate it anyway
size_t pos_a1 = offsetof(A, a1);
// now calculate padding amounts
size_t padding_a0 = pos_a1 - pos_a0 - size_a0; // padding between a0 and a1 members
size_t padding_a1 = size_A - pos_a1 - size_a1;
std::cout << "Data A:\n";
std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << pos_a0;
size_t i = pos_a0;
while (i < pos_a0 + size_a0) // print out zeros for bytes of a0 member
{
std::cout << " 00";
++i;
}
while (i < pos_a1) // print out aa for each padding byte after a_0
{
std::cout << " aa";
++i;
}
std::cout << std::endl;
std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << pos_a1;
while (i < pos_a1 + size_a1) // print out zeros for bytes of a1 member
{
std::cout << " 00";
++i;
}
while (i < size_A) // print out aa for each padding byte after a_1
{
std::cout << " aa";
++i;
}
std::cout << std::endl;
std::cout << std::dec << "size: " << size_A << " padding: " << padding_a0 + padding_a1 << std::endl; // std::dec undoes the earlier std::hex
}

You can work this out if you know the content of the struct. Padding usually works like this.
Assume that your struct is this.
struct A {
int a0; // 32 bits (4 bytes)
char a1; // 8 bits (1 byte)
};
But if structs like this were packed tightly in memory (in an array, for example), the int member of most elements would end up misaligned, and misaligned accesses are slower or even unsupported on some architectures. So the compiler pads the struct, and the final struct effectively looks like this to the compiler.
struct A {
int a0;
char a1;
// padding
char __padd[3]; // 3 * 1 byte = 3 bytes.
/// Adding this padding makes it 32 bit aligned.
};
Now, using this knowledge, you can see why it's not easy to get the padding of an object without knowing its content. And padding isn't always placed at the end of the object. For example,
struct Obj {
int a = 0;
char c = 'b';
// Padding is set here by the compiler. It may look something like this: char __padd[3];
int b = 10;
};
So how to get the padding of the struct?
In languages with reflection you could get the content of the struct at runtime, but standard C++ (through C++23) has no built-in reflection, so you need a reflection library or a hand-written list of the members. Then work out the size and offset of each member; the padding after a member is the offset of the next member minus the end (offset plus size) of the previous one, which shows you how the padding is laid out.
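Since the member list has to be written out by hand, the bookkeeping could look something like the following. This is only a sketch for the Obj struct above: MemberInfo, total_padding and obj_padding are hypothetical names made up for illustration, not standard facilities.

```cpp
#include <cstddef>
#include <vector>

struct Obj {
    int a = 0;
    char c = 'b';
    int b = 10;
};

// One entry per member, written by hand (hypothetical helper type).
struct MemberInfo {
    std::size_t offset;
    std::size_t size;
};

// Hypothetical helper: sums the gaps between consecutive members plus the
// gap between the last member and the end of the struct.
// Entries must be listed in declaration order.
std::size_t total_padding(const std::vector<MemberInfo>& members, std::size_t struct_size) {
    std::size_t padding = 0;
    std::size_t cursor = 0;  // first byte not yet accounted for
    for (const MemberInfo& m : members) {
        padding += m.offset - cursor;  // gap before this member
        cursor = m.offset + m.size;    // end of this member
    }
    return padding + (struct_size - cursor);  // add trailing padding
}

// Builds the member table for Obj with offsetof and sizeof.
std::size_t obj_padding() {
    const std::vector<MemberInfo> members = {
        {offsetof(Obj, a), sizeof(Obj::a)},
        {offsetof(Obj, c), sizeof(Obj::c)},
        {offsetof(Obj, b), sizeof(Obj::b)},
    };
    return total_padding(members, sizeof(Obj));
}
```

Calling obj_padding() from main would be expected to give 3 on a platform with a 4-byte, 4-byte-aligned int, matching the char __padd[3] comment above, but the exact number depends on the compiler and target.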

As other answers have said, the offsetof macro is clearly the best solution here, but just to demonstrate that you could find the positions of your members at run time by looking at the pointers:
#include <iostream>
struct A
{
char x;
int y;
char z;
};
template <typename T>
void PrintSize ()
{
std::cout << " size = " << sizeof(T) << std::endl;
}
void PrintPosition (char * ptr_mem, char * ptr_base)
{
std::cout << " position = " << ptr_mem - ptr_base << std::endl;
}
template <typename T>
void PrintDetails (char member, T * ptr_mem, A * ptr_base)
{
std::cout << member << ":" << std::endl;
PrintSize<T>();
PrintPosition((char*) ptr_mem, (char*) ptr_base);
std::cout << std::endl;
}
int main()
{
A a;
PrintDetails('x', &a.x, &a);
PrintDetails('y', &a.y, &a);
PrintDetails('z', &a.z, &a);
}
Output on my machine:
x:
size = 1
position = 0
y:
size = 4
position = 4
z:
size = 1
position = 8
(On my Intel machine, with gcc/clang, A is of size 12. That surprised me, since I expected the compiler to rearrange the elements, but C++ requires these members to be laid out in declaration order, so the padding stays.)

To calculate the trailing padding of a structure, you need to know the offset and size of the last member, and the size of the structure.
Concisely, if type T has a member last which is of type U, the trailing padding size is:
sizeof(T) - (offsetof(T, last) + sizeof(U))
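As a concrete instance of that formula, here is a minimal sketch. The struct T and the function name trailing_padding_T are made up for illustration.

```cpp
#include <cstddef>

struct T {
    int first;
    char last;
};

// Trailing padding: bytes between the end of the last member and the end of T.
constexpr std::size_t trailing_padding_T() {
    return sizeof(T) - (offsetof(T, last) + sizeof(T::last));
}
```

With a 4-byte int this would typically evaluate to 3, but the value is implementation-dependent. Note that it only measures padding after the last member, not padding between members.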
To calculate the total amount of padding in a structure, if that is what this question is about, I would use a GCC extension: declare the same structure twice (perhaps with the help of a macro), once without the packed attribute and once with. Then subtract their sizes.
Here is a complete, working sample:
#include <stdio.h>
#define X struct { char a; int b; char c; }
int main(void)
{
printf("%zu\n", sizeof(X) - sizeof(X __attribute__((packed)))); /* %zu is the conversion for size_t */
return 0;
}
For the above structure, it outputs 6. This corresponds to the 3 bytes of padding after a, needed for the four-byte alignment of b, plus the 3 bytes after c at the end of the structure, needed to keep b aligned when the structure is used as an array element.
The packed attribute defeats all padding, and so the difference between the packed and unpacked structure gives us the total amount of padding.

Related

How to read memcpy struct result via a pointer

I want to copy a struct's content into memory via char* pc and then print it back, but here I get an exception (read access violation):
struct af {
bool a;
uint8_t b;
uint16_t c;
};
int main() {
af t;
t.a = true;
t.b = 3;
t.c = 20;
char* pc = nullptr;
memcpy(&pc, &t, sizeof(t));
std::cout << "msg is " << pc << std::endl; // here the exception
return 0;
}
then I want to recover data from memory to another structure of same type.
I did af* tt = (af*)(pc); then tried to access tt->a, but I always get an exception.
You need to allocate memory before you can copy something into it. Also, pc is already a pointer, so you need not take its address again. Moreover, the byte representation is very likely to contain non-printable characters. To see the actual effect, the following copies from the buffer back to an af and prints its members (note that a cast is needed to prevent std::cout from interpreting the uint8_t as a character):
#include <iostream>
#include <cstdint> // uint8_t, uint16_t
#include <cstring>
struct af {
bool a;
uint8_t b;
uint16_t c;
};
int main() {
af t;
t.a = true;
t.b = 3;
t.c = 20;
char pc[sizeof(af)];
std::memcpy(pc, &t, sizeof(t)); // array pc decays to pointer to first element
for (std::size_t i = 0; i < sizeof(af); ++i) { // size_t avoids a signed/unsigned comparison
std::cout << i << " " << pc[i] << "\n";
}
af t2;
std::memcpy(&t2, pc,sizeof(t));
std::cout << t2.a << " " << static_cast<unsigned>(t2.b) << " " << t2.c;
}
Output:
0
1
2
3
1 3 20
Note that I replaced the output of pc with a loop that prints individual characters, because the binary representation might contain null terminators and pc is not a null terminated string. If you want it to be a null-terminated string, it must be of size sizeof(af) +1 and have a terminating '\0'.

Placement new and aligning for possible offset memory

I've been reading up on placement new, and I'm not sure if I'm "getting" it fully or not when it comes to proper alignment.
I've written the following test program to attempt to allocate some memory to an aligned spot:
#include <iostream>
#include <cstdint>
using namespace std;
unsigned char* mem = nullptr;
struct A
{
double d;
char c[5];
};
struct B
{
float f;
int a;
char c[2];
double d;
};
void InitMemory()
{
mem = new unsigned char[1024];
}
int main() {
// your code goes here
InitMemory();
//512 byte blocks to write structs A and B to, purposefully misaligned
unsigned char* memoryBlockForStructA = mem + 1;
unsigned char* memoryBlockForStructB = mem + 512;
unsigned char* firstAInMemory = (unsigned char*)(uintptr_t(memoryBlockForStructA) + uintptr_t(alignof(A) - 1) & ~uintptr_t(alignof(A) - 1));
A* firstA = new(firstAInMemory) A();
A* secondA = new(firstA + 1) A();
A* thirdA = new(firstA + 2) A();
cout << "Alignment of A Block: " << endl;
cout << "Memory Start: " << (void*)&(*memoryBlockForStructA) << endl;
cout << "Starting Address of firstA: " << (void*)&(*firstA) << endl;
cout << "Starting Address of secondA: " << (void*)&(*secondA) << endl;
cout << "Starting Address of thirdA: " << (void*)&(*thirdA) << endl;
cout << "Sizeof(A): " << sizeof(A) << endl << "Alignof(A): " << alignof(A) << endl;
return 0;
}
Output:
Alignment of A Block:
Memory Start: 0x563fe1239c21
Starting Address of firstA: 0x563fe1239c28
Starting Address of secondA: 0x563fe1239c38
Starting Address of thirdA: 0x563fe1239c48
Sizeof(A): 16
Alignof(A): 8
The output appears to be valid, but I still have some questions about it.
Some questions I have are:
Will fourthA, fifthA, etc... all be aligned as well?
Is there a simpler way of finding a properly aligned memory location?
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
Will fourthA, fifthA, etc... all be aligned as well?
Yes. The size of a type is always a multiple of its alignment, so objects placed sizeof(A) bytes apart, starting from an aligned address, all stay aligned.
Is there a simpler way of finding a properly aligned memory location?
Yes, see alignas:
http://en.cppreference.com/w/cpp/language/alignas
or std::align:
http://en.cppreference.com/w/cpp/memory/align
as Dan M said.
In the case of struct B, it is set up to not be memory friendly. Do I need to reconstruct it so that the largest members are at the top of the struct, and the smallest members are at the bottom? Or will the compiler automatically pad everything so that its member d will not be misaligned?
You can reorder the members yourself if you care about the layout.
The compiler will not reorder the members of a struct for you, but it does insert padding so that d is properly aligned.
One reason it cannot reorder them is that raw data (coming from a file, the network, ...) is often interpreted directly as a struct, and two compilers reordering differently would break such code.
I hope my explanation is clear and that I did not make any mistakes.
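For the "simpler way" part, std::align (from <memory>) can replace the manual uintptr_t arithmetic in the question. A minimal sketch, assuming the struct A from the question; construct_aligned_A is a hypothetical helper name, not something from the original post:

```cpp
#include <cstddef>
#include <memory>
#include <new>

struct A {
    double d;
    char c[5];
};

// Hypothetical helper: constructs an A at the first suitably aligned address
// inside [buffer, buffer + space), or returns nullptr if it does not fit.
A* construct_aligned_A(unsigned char* buffer, std::size_t space) {
    void* p = buffer;
    // std::align adjusts p and space in place and returns nullptr on failure.
    if (std::align(alignof(A), sizeof(A), p, space) == nullptr) {
        return nullptr;
    }
    return new (p) A();  // placement new at the aligned address
}
```

With the question's buffer this could be called as construct_aligned_A(mem + 1, 511). Further objects can then be placed at firstA + 1, firstA + 2 and so on, as long as they stay inside the buffer.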

Sizeof of subclass not always equals to sum of sizeof of superclass. So what is with non used memory?

The code that I've written:
#include <iostream>
using std::cout;
using std::endl;
struct S
{
long l;
};
struct A
{
int a;
};
struct B : A, S{ };
int main()
{
cout << "sizeof(A) = " << sizeof(A) << endl; //4
cout << "sizeof(S) = " << sizeof(S) << endl; //8
cout << "sizeof(B) = " << sizeof(B) << endl; //16 != 4 + 8
}
Why do we have to allocate an additional 4 bytes for B? How is this additional memory used?
Memory image of a B instance:
int a; // bytes 0-3
long l; // bytes 4-11
Problem:
l is an 8-byte variable, but its address is not aligned to 8 bytes. As a result, unless the underlying HW architecture supports unaligned load/store operations, the compiler cannot generate correct assembly code.
Solution:
The compiler adds a 4-byte padding after variable a, in order to align variable l to an 8-byte address.
Memory image of a B instance:
int a; // bytes 0-3
int p; // bytes 4-7
long l; // bytes 8-15
For POD data types, there can be padding for alignment.
In B's memory layout, int comes first, then long. The compiler likely inserts 4 bytes of padding between them to 8-align the long assuming any instance of the structure will also have its starting address 8-aligned.
Similar things would happen if you put an A and an S instance on your stack: the compiler would likely leave 4 bytes empty in between.
See Data Structure Alignment.
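To check where the S part actually starts inside a B, one can compare addresses. offsetof is not a good fit here because B is not a standard-layout type, so this sketch converts pointers instead. The function name offset_of_S_in_B is made up for illustration.

```cpp
#include <cstddef>

struct S { long l; };
struct A { int a; };
struct B : A, S {};

// Byte offset of the S base subobject inside a B object.
std::ptrdiff_t offset_of_S_in_B() {
    B b{};
    const char* whole = reinterpret_cast<const char*>(&b);
    // static_cast adjusts the pointer to the start of the S subobject.
    const char* s_part = reinterpret_cast<const char*>(static_cast<S*>(&b));
    return s_part - whole;
}
```

On a platform like the asker's (4-byte int, 8-byte long) this would be expected to return 8 rather than 4, which accounts for the 4 extra bytes in sizeof(B).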

Object array alignment with __attribute__aligned() Or alignas()?

Quick question: do these code snippets have the same alignment?
struct sse_t {
float sse_data[4];
};
// the array "cacheline" will be aligned to 64-byte boundary
struct sse_t alignas(64) cacheline[1000000];
Or
// every object of type sse_t will be aligned to 64-byte boundary
struct sse_t {
float sse_data[4];
} __attribute((aligned(64)));
struct sse_t cacheline[1000000];
Do these code snippets have the same alignment?
Not quite. Your two examples are actually very different.
In your first example, you will get an array of sse_t objects. A sse_t object is only guaranteed 4-byte alignment. But since the entire array is aligned to 64-bytes, each sse_t object will be properly aligned for SSE access.
In your second example, you are forcing each sse_t object to be aligned to 64-bytes. But each sse_t object is only 16 bytes. So the array will be 4x larger. (You will have 48 bytes of padding at the end of each sse_t object).
#include <iostream>
using namespace std;
struct objA {
float sse_data[4];
};
struct objB {
float sse_data[4];
} __attribute((aligned(64)));
int main(){
cout << sizeof(objA) << endl;
cout << sizeof(objB) << endl;
}
Output:
16
64
I'm pretty sure that the second case is not what you want.
But why do you want to align to 64 bytes?
http://ideone.com/JNEIBR
#include <iostream>
using namespace std;
struct sse_t1 {
float sse_data[4];
};
// the array "cacheline" will be aligned to 64-byte boundary
struct sse_t1 alignas(64) cacheline1[1000000];
// every object of type sse_t will be aligned to 64-byte boundary
struct sse_t2 {
float sse_data[4];
} __attribute((aligned(64)));
struct sse_t2 cacheline2[1000000];
int main() {
cout << "sizeof(sse_t1) = " << sizeof(sse_t1) << endl;
cout << "sizeof(sse_t2) = " << sizeof(sse_t2) << endl;
cout << "array cacheline1 " << (((size_t)(cacheline1) % 64 == 0)?"aligned to 64":"not aligned to 64") << endl;
cout << "array cacheline2 " << (((size_t)(cacheline2) % 64 == 0)?"aligned to 64":"not aligned to 64") << endl;
cout << "cacheline1[0] - cacheline1[1] = " << (size_t)&(cacheline1[1]) - (size_t)&(cacheline1[0]) << endl;
cout << "cacheline2[0] - cacheline2[1] = " << (size_t)&(cacheline2[1]) - (size_t)&(cacheline2[0]) << endl;
return 0;
}
Output:
sizeof(sse_t1) = 16
sizeof(sse_t2) = 64
array cacheline1 aligned to 64
array cacheline2 aligned to 64
cacheline1[0] - cacheline1[1] = 16
cacheline2[0] - cacheline2[1] = 64
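The GCC-specific __attribute((aligned(64))) used above can also be written with the standard C++11 alignas specifier on the type. A small sketch of both variants side by side; the names sse_plain, sse_padded, cacheline_plain and cacheline_padded are made up here, and the arrays are kept tiny on purpose:

```cpp
#include <cstddef>

struct sse_plain {
    float sse_data[4];
};

// Standard spelling of the per-type alignment request.
struct alignas(64) sse_padded {
    float sse_data[4];
};

// Only the array object is aligned to 64 bytes; its elements stay
// sizeof(sse_plain) bytes apart.
alignas(64) sse_plain cacheline_plain[4];

// Every element is aligned to 64 bytes, so each one occupies 64 bytes.
sse_padded cacheline_padded[4];

static_assert(alignof(sse_padded) == 64, "alignas on the type raises its alignment");
static_assert(sizeof(sse_padded) == 64, "size is rounded up to a multiple of the alignment");
```

The static_asserts hold because the size of a type is always a multiple of its alignment, which is where the extra padding per element comes from.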

Object/Struct Alignment in C/C++

#include <iostream>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; //Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
//These two should be the same
cout << "start address of the object: " << &te << endl;
cout << "address of i member: " << &te.i << endl;
//These two should be the same
cout << "start address of the double field: " << &te.h << endl;
cout << "calculate the offset of the double field: " << (&te + sizeof(double)) << endl; //NOT THE SAME
return 0;
}
Output:
size of an int: 4
size of a double: 8
size of test: 24
start address of the object: 0x7fffb9fd44e0
address of i member: 0x7fffb9fd44e0
start address of the double field: 0x7fffb9fd44e8
calculate the offset of the double field: 0x7fffb9fd45a0
Why do the last two lines produce different values? Something I am doing wrong with pointer arithmetic?
(&te + sizeof(double))
This is the same as:
&((&te)[sizeof(double)])
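To do the arithmetic in bytes, cast to char* first:
(char*)(&te) + sizeof(int)
Note that this still ignores any padding before h. In the output above h actually sits 8 bytes into the object, which is what offsetof(test, h) would report.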
You are correct -- the problem is with pointer arithmetic.
When you add to a pointer, you move the pointer by a multiple of the size of the type it points to.
Therefore, &te + 1 will be 24 bytes after &te.
Your code &te + sizeof(double) will add 24 * sizeof(double) or 192 bytes.
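This scaling can be checked directly. A minimal sketch, using the test struct from the question; byte_distance is a hypothetical helper name:

```cpp
#include <cstddef>

struct test {
    int i;
    double h;
    int j;
};

// Distance in bytes between p and p + n, showing that pointer arithmetic
// on test* moves in steps of sizeof(test). p must point into an array with
// at least n elements so that p + n is a valid (possibly one-past-the-end)
// pointer.
std::ptrdiff_t byte_distance(test* p, std::size_t n) {
    return reinterpret_cast<char*>(p + n) - reinterpret_cast<char*>(p);
}
```

For an array test arr[8], byte_distance(arr, 1) equals sizeof(test) and byte_distance(arr, sizeof(double)) equals sizeof(double) * sizeof(test), which is the 192 bytes seen in the question when sizeof(test) is 24.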
Firstly, your code is wrong, you'd want to add the size of the fields before h (i.e. an int), there's no reason to assume double. Second, you need to normalise everything to char * first (pointer arithmetic is done in units of the thing being pointed to).
More generally, you can't rely on code like this to work. The compiler is free to insert padding between fields to align things to word boundaries and so on. If you really want to know the offset of a particular field, there's an offsetof macro that you can use. It's defined in <stddef.h> in C, <cstddef> in C++.
Most compilers offer an option to remove all padding (e.g. GCC's __attribute__ ((packed))).
I believe offsetof is only well-defined on standard-layout types (POD types, before C++11).
struct test
{
int i;
int j;
double h;
};
Since your largest data type is 8 bytes, the compiler adds padding around your ints. Either group the ints together as above, put the largest data type first, or account for the padding yourself. Hope this helps!
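The effect of the reordering can be measured by comparing the two layouts. A small sketch; the struct names and the payload, padding_question and padding_grouped constants are made up for illustration:

```cpp
#include <cstddef>

struct int_double_int { int i; double h; int j; };  // layout from the question
struct int_int_double { int i; int j; double h; };  // ints grouped together

// Bytes of padding = total size minus the bytes the members themselves need.
constexpr std::size_t payload = 2 * sizeof(int) + sizeof(double);
constexpr std::size_t padding_question = sizeof(int_double_int) - payload;
constexpr std::size_t padding_grouped = sizeof(int_int_double) - payload;

// Expected to hold on common ABIs; not something the standard promises.
static_assert(padding_grouped <= padding_question,
              "grouping the ints should not add padding");
```

Where double needs 8-byte alignment, the question's layout would typically carry 8 bytes of padding (size 24) and the grouped layout none (size 16). Where double only needs 4-byte alignment, both come out at 16.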
&te + sizeof(double) is equivalent to &te + 8, which is equivalent to &((&te)[8]). That is — since &te has type test *, &te + 8 adds eight times the size of a test.
You can see what's going on more clearly using the offsetof() macro:
#include <iostream>
#include <cstddef>
using namespace std;
struct test
{
int i;
double h;
int j;
};
int main()
{
test te;
te.i = 5;
te.h = 6.5;
te.j = 10;
cout << "size of an int: " << sizeof(int) << endl; // Should be 4
cout << "size of a double: " << sizeof(double) << endl; // Should be 8
cout << "size of test: " << sizeof(test) << endl; // Should be 24 (word size of 8 for double)
cout << "i: size = " << sizeof te.i << ", offset = " << offsetof(test, i) << endl;
cout << "h: size = " << sizeof te.h << ", offset = " << offsetof(test, h) << endl;
cout << "j: size = " << sizeof te.j << ", offset = " << offsetof(test, j) << endl;
return 0;
}
On my system (x86), I get the following output:
size of an int: 4
size of a double: 8
size of test: 16
i: size = 4, offset = 0
h: size = 8, offset = 4
j: size = 4, offset = 12
On another system (SPARC), I get:
size of an int: 4
size of a double: 8
size of test: 24
i: size = 4, offset = 0
h: size = 8, offset = 8
j: size = 4, offset = 16
The compiler will insert padding bytes between struct members to ensure that each member is aligned properly. As you can see, alignment requirements vary from system to system; on one system (x86), double is 8 bytes but only requires 4-byte alignment, and on another system (SPARC), double is 8 bytes and requires 8-byte alignment.
Padding can also be added at the end of a struct to ensure that everything is aligned properly when you have an array of the struct type. On SPARC, for example, the compiler adds 4 bytes of padding at the end of the struct.
The language guarantees that the first declared member will be at an offset of 0, and that members are laid out in the order in which they're declared. (At least that's true for simple structs; C++ metadata might complicate things.)
Compilers are free to space out structs however they want past the first member, and usually use padding to align to word boundaries for speed.
See these:
C struct sizes inconsistence
Struct varies in memory size?
et. al.