if i have a struct , say:
struct A {
int a,b,c,d,e;
}
A m;//struct if 5 ints
int n[5];//array of 5 ints.
i know that elements in the array are stored one after other so we can use *(n+i) or n[i]
But in case of struct ,is each element is stored next to each other (in the struct A)?
The compiler may insert padding as it wishes, except before the first item.
In C++03 you were guaranteed increasing addresses of items between access specifiers.
I'm not sure if the access specifier restriction is still there in C++11.
The only thing that is granted is that members are stored in the same order.
Between elements there can be some "padding" the compiler may insert so that each value is aligned with the processor word length.
Different compiler can make different choices also depending on the target platform and can be forced to keep a given alignment by option switches or pragma-s.
Your particular case is "luky" for the most of compiler since int is normally implemented as "the integral that better fits the integer arithmetic of the processor". With this idea, a sequence of int-s is aligned by definition. But that may not be the case, for example if you have
struct test
{
char a;
short b;
long c;
long long d;
};
You can dscovery that (&a)+1 != &b and (&b)+1 != &c or (&b)-1 != &a etc.
What is granted is the progression &a < &b; &b < &c; &c < &d;
Structs members in general are stored in increasing addresses but they are not guaranteed to be contiguous.so elements may not always be contiguous. In the example above, given $base is the base address of the struct the layout will be the following.
a will be stored at $base+0
b will be stored at $base+4
c will be stored at $base+8 ... etc
You can see the typical alignment values at http://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86
I have written simple program to show strut elements are next to each other
int main() {
struct S {
int a;
int b;
int c;
};
S s = {1,2,3};
int* p = reinterpret_cast <int *> (&s);
cout<<p[0]<<" "<<p[1]<<" "<<p[2];
return 0;
}
Output : 1,2,3
Remember, [] or *(i+1) are symantic construct, that suits with pointers, not with struct variables directly.
As suggested in Cheers and hth. - Alf's answer, there can be padding, before or after struct elements.
Related
Let's say I have the following
struct MyType { long a, b, c; char buffer[remainder] }
I wanted to do something like
char buffer[4096 - offsetof(MyType, buffer)]
But it appears that it's illegal
You can do:
struct ABC {long a,b,c; }
struct MyType : ABC {char buffer[4096-sizeof(ABC)];};
static_assert(sizeof(MyType)==4096,"!");
Your problem stems from trying to use the not-yet-fully-defined MyType type while defining it. You could do this with a union:
#include <iostream>
struct MyType {
union {
struct { long a, b, c; } data;
char buffer[4096];
};
};
static_assert(sizeof(MyType) == 4096, "MyType size should be exactly 4K");
int main() {
MyType x;
x.data.a = 42;
std::cout << sizeof(x) << " " << x.data.a << "\n";
return 0;
}
The output (on my system):
4096 42
Because it's a union, the type actually holds the a/b/c tuple and buffer area in an overlapped region of memory, big enough to hold the larger of the two. So, unless your long variable are really wide, that will be the 4K buffer area :-)
In any case, that size requirement is checked by the static_assert.
That may be less than ideal as buffer takes up the entire 4K. If instead you want to ensure that buffer is only the rest of the structure (after the long variables), you can use the following:
struct MyType {
long a, b, c;
char buffer[4096 - 3 * sizeof(long)];
};
and ensure that you use x.something rather than x.data.something when accessing the a, b, or c variables.
This solves your problem by using the size of three longs (these are fully defined) instead of the size of something not yet defined. It's still a good idea to keep the static_assert to ensure overall size is what you wanted.
Technically, the compiler has total control over padding and layout. A union/struct combo combined with a static_assert sanity check might be enough for government work, but std::aligned_storage is also there to give you memory blocks that are safe to put objects in.
struct MyType {
long a, b, c;
};
using MyTypeStorage = std::aligned_storage<4096, std::alignment_of<MyType>::value>::type;
/* ... */
MyTypeStorage myTypeStorage;
MyType* x = new (&myTypeStorage) MyType {};
https://godbolt.org/z/87e7Tc
So to this is part 2 of a question I asked and was answered yesterday. So today I am coming back with a part 2. I am not sure if this should be this somewhere else so if a moderator wants to move it feel free.
So I am not going to reintroduce my problem here, so please go read part 1
Iterating through a vector of stucts's members with pointers and offsets
So I have come up with a solution to the problem so let me post a modified snippet of code that represents the solution I am going for,
#include <iostream>
#include <vector>
// knows nothing about foo
class Readfoo
{
private:
int offSetSize;
char* pchar;
public:
void SetPoint(double* apstartDouble, int aoffSetSize)
{
offSetSize = aoffSetSize;
pchar = static_cast<char*> (static_cast<void*>(apstartDouble));
};
const double& printfoo(int aioffset) const
{
return *(static_cast<double*> (static_cast<void*>(pchar + aioffset*offSetSize)));
};
};
// knows nothing about readFoo
struct foo
{
int a[5];
double b[10];
};
int main()
{
// populate some data (choose b [2] or other random entry.).
std::vector<foo> bar(10);
for(int ii = 0; ii < bar.size(); ii++)
bar[ii].b[2] = ii;
// access b[2] for each foo using an offset.
Readfoo newReadfoo;
newReadfoo.SetPoint(&(bar[0].b[2]), sizeof(foo)/sizeof(char));
for(int ii = 0; ii < bar.size(); ii++)
std::cout<<"\n"<<newReadfoo.printfoo(ii);
return 0;
}
This, in my opinion, is legal, well I suppose that is what I am asking. Since now, in essence, I am converting my 'interpretation' of the struct foo and the vector bar (an array of foos) into a single array of bytes, or chars.
I.e. In this interpretation the data structure is a single array of chars, of foo size times bar size. When I iterate through this with an integral type I am in essence moving to some hypothetical char element (point 4.2 in the answer to part 1). The printfoo function then combines the next 8 bytes to form a double to return.
So is this legal and other than moving out of bounds of the bar vectors is there any reason why this will not work (I have tested it and it has yet to fail.)?
So is this legal ...
No it is not.
In the following expression:
pchar + aioffset*offSetSize
you manipulate pchar as if it were a pointer to a char array, which it is not (it's a pointer to a double array). And this is undefined behavior:
[expr.add]/6
For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T and the array element type are not similar, the behavior is undefined. [ Note: In particular, a pointer to a base class cannot be used for pointer arithmetic when the array contains objects of a derived class type. — end note ]
In your case P is pchar and has type pointer to char but the array element it points to is a double.
... and other than moving out of bounds of the bar vectors is there any reason why this will not work (I have tested it and it has yet to fail.)?
Yes: Does the C++ standard allow for an uninitialized bool to crash a program?
To go further: pointer manipulation in C++ is a red flag. C++ pointer manipulation is dark magic which will burn your soul and consume your dog. C++ offers a lot of tools to write generic code. My advice: Ask about what you're trying to achieve rather than about your attempted solution. You'll learn a lot.
I've come to work on an ongoing project where some unions are defined as follows:
/* header.h */
typedef union my_union_t {
float data[4];
struct {
float varA;
float varB;
float varC;
float varD;
};
} my_union;
If I understand well, unions are for saving space, so sizeof(my_union_t) = MAX of the variables in it. What are the advantages of using the statement above instead of this one:
typedef struct my_struct {
float varA;
float varB;
float varC;
float varD;
};
Won't be the space allocated for both of them the same?
And how can I initialize varA,varB... from my_union?
Unions are often used when implementing a variant like object (a type field and a union of data types), or in implementing serialisation.
The way you are using a union is a recipe for disaster.
You are assuming the the struct in the union is packing the floats with no gaps between then!
The standard guarantees that float data[4]; is contiguous, but not the structure elements. The only other thing you know is that the address of varA; is the same as the address of data[0].
Never use a union in this way.
As for your question: "And how can I initialize varA,varB... from my_union?". The answer is, access the structure members in the normal long-winded way not via the data[] array.
Union are not mostly for saving space, but to implement sum types (for that, you'll put the union in some struct or class having also a discriminating field which would keep the run-time tag). Also, I suggest you to use a recent standard of C++, at least C++11 since it has better support of unions (e.g. permits more easily union of objects and their construction or initialization).
The advantage of using your union is to be able to index the n-th floating point (with 0 <= n <= 3) as u.data[n]
To assign a union field in some variable declared my_union u; just code e.g. u.varB = 3.14; which in your case has the same effect as u.data[1] = 3.14;
A good example of well deserved union is a mutable object which can hold either an int or a string (you could not use derived classes in that case):
class IntOrString {
bool isint;
union {
int num; // when isint is true
str::string str; // when isint is false
};
public:
IntOrString(int n=0) : isint(true), num(n) {};
IntOrString(std::string s) : isint(false), str(s) {};
IntOrString(const IntOrString& o): isint(o.isint)
{ if (isint) num = o.num; else str = o.str); };
IntOrString(IntOrString&&p) : isint(p.isint)
{ if (isint) num = std::move (p.num);
else str = std::move (p.str); };
~IntOrString() { if (isint) num=0; else str->~std::string(); };
void set (int n)
{ if (!isint) str->~std::string(); isint=true; num=n; };
void set (std::string s) { str = s; isint=false; };
bool is_int() const { return isint; };
int as_int() const { return (isint?num:0; };
const std::string as_string() const { return (isint?"":str;};
};
Notice the explicit calls of destructor of str field. Notice also that you can safely use IntOrString in a standard container (std::vector<IntOrString>)
See also std::optional in future versions of C++ (which conceptually is a tagged union with void)
BTW, in Ocaml, you simply code:
type intorstring = Integer of int | String of string;;
and you'll use pattern matching. If you wanted to make that mutable, you'll need to make a record or a reference of it.
You'll better use union-s in a C++ idiomatic way (see this for general advices).
I think the best way to understand unions is to just to give 2 common practical examples.
The first example is working with images. Imagine you have and RGB image that is arranged in a long buffer.
What most people would do, is represent the buffer as a char* and then loop it by 3's to get the R,G,B.
What you could do instead, is make a little union, and use that to loop over the image buffer:
union RGB
{
char raw[3];
struct
{
char R;
char G;
char B;
} colors;
}
RGB* pixel = buffer[0];
///pixel.colors.R == The red color in the first pixel.
Another very useful use for unions is using registers and bitfields.
Lets say you have a 32 bit value, that represents some HW register, or something.
Sometimes, to save space, you can split the 32 bits into bit fields, but you also want the whole representation of that register as a 32 bit type.
This obviously saves bit shift calculation that a lot of programmers use for no reason at all.
union MySpecialRegister
{
uint32_t register;
struct
{
unsigned int firstField : 5;
unsigned int somethingInTheMiddle : 25;
unsigned int lastField : 6;
} data;
}
// Now you can read the raw register into the register field
// then you can read the fields using the inner data struct
The advantage is that with a union you can access the same memory in two different ways.
In your example the union contains four floats. You can access those floats as varA, varB... which might be more descriptive names or you can access the same variables as an array data[0], data[1]... which might be more useful in loops.
With a union you can also use the same memory for different kinds of data, you might find that useful for things like writing a function to tell you if you are on a big endian or little endian CPU.
No, it is not for saving space. It is for ability to represent some binary data as various data types.
for example
#include <iostream>
#include <stdint.h>
union Foo{
int x;
struct y
{
unsigned char b0, b1, b2, b3;
};
char z[sizeof(int)];
};
int main()
{
Foo bar;
bar.x = 100;
std::cout << std::hex; // to show number in hexadec repr;
for(size_t i = 0; i < sizeof(int); i++)
{
std::cout << "0x" << (int)bar.z[i] << " "; // int is just to show values as numbers, not a characters
}
return 0;
}
output: 0x64 0x0 0x0 0x0 The same values are stored in struct bar.y, but not in array but in sturcture members. Its because my machine have a little endiannes. If it were big, than the output would be reversed: 0x0 0x0 0x0 0x64
You can achieve the same using reinterpret_cast:
#include <iostream>
#include <stdint.h>
int main()
{
int x = 100;
char * xBytes = reinterpret_cast<char*>(&x);
std::cout << std::hex; // to show number in hexadec repr;
for (size_t i = 0; i < sizeof(int); i++)
{
std::cout << "0x" << (int)xBytes[i] << " "; // (int) is just to show values as numbers, not a characters
}
return 0;
}
its usefull, for example, when you need to read some binary file, that was written on a machine with different endianess than yours. You can just access values as bytearray and swap those bytes as you wish.
Also, it is usefull when you have to deal with bit fields, but its a whole different story :)
First of all: Avoid unions where the access goes to the same memory but to different types!
Unions did not save space at all. The only define multiple names on the same memory area! And you can only store one of the elements in one time in a union.
if you have
union X
{
int x;
char y[4];
};
you can store an int OR 4 chars but not both! The general problem is, that nobody knows which data is actually stored in a union. If you store a int and read the chars, the compiler will not check that and also there is no runtime check. A solution is often to provide an additional data element in a struct to a union which contains the actual stored data type as an enum.
struct Y
{
enum { IS_CHAR, IS_INT } tinfo;
union
{
int x;
char y[4];
};
}
But in c++ you always should use classes or structs which can derive from a maybe empty parent class like this:
class Base
{
};
class Int_Type: public Base
{
...
int x;
};
class Char_Type: public Base
{
...
char y[4];
};
So you can device pointers to base which actually can hold a Int or a Char Type for you. With virtual functions you can access the members in a object oriented way of programming.
As mentioned already from Basile's answer, a useful case can be the access via different names to the same type.
union X
{
struct data
{
float a;
float b;
};
float arr[2];
};
which allows different access ways to the same data with the same type. Using different types which are stored in the same memory should be avoided at all!
In the following code, can the value of int be predicted ( how ? ), or it is just the garbage ?
union a
{
int i;
char ch[2];
};
a u;
u.ch[0] = 0;
u.ch[1] = 0;
cout<<u.i;
}
I would say that depends on the size of int and char. A union contains the memory of the largest variable. If int is 4 bytes and char[2] represents 2 bytes, the int consumes more memory than the char-array, so you are not initialising the full int-memory to 0 by setting all char-variables. It depends on your memory initialization mechanisms but basically the value of the int will appear to be random as the extra 2 bytes are filled with unspecified values.
Besides, filling one variable of a union and reading another is exactly what makes unions unsafe in my oppinion.
If you are sure that int is the largest datatype, you can initialize the whole union by writing
union a
{
int i;
char ch[2];
};
void foo()
{
a u = { 0 }; // Initializes the first field in the union
cout << u.i;
}
Therefore it may be a good idea to place the largest type at the beginning of the union. Althugh that doesn't garantuee that all datatypes can be considered zero or empty when all bits are set to 0.
I would like to take the memory generated from my struct and push it into a byte array (char array) as well as the other way around (push the byte array back into a struct). It would be even better if I could skip the string generation step and go directly to writing memory into the EEPROM. (Do not worry about the eeprom bit, I can handle that by reading & writing individual bytes)
// These are just example structs (I will be using B)
typedef struct {int a,b,c;} A;
typedef struct {A q,w,e;} B;
#define OFFSET 0 // For now
void write(B input)
{
for (int i=0;i<sizeof(B);i++)
{
eepromWrite(i+OFFSET,memof(input,i));
}
}
B read()
{
B temp;
for (int i=0;i<sizeof(B);i++)
{
setmemof(temp,i,eepromRead(i+OFFSET));
}
return temp;
}
This example I wrote is not supposed to compile, it was meant to explain my ideas in a platform independent environment.
PLEASE NOTE: memof and setmemof do not exist. This is what I am asking for though my question. An alternative answer would be to use a char array as an intermediate step.
Assuming your structures contain objects and not pointers, you can do this with a simple cast:
save_b(B b) {
unsigned char b_data[sizeof(B)];
memcpy(b_data, (unsigned char *) &b, sizeof(B));
save_bytes(b_data, sizeof(B));
}
Actually, you shouldn't need to copy from the structure into a char array. I was just hoping to make the idea clear.
Be sure to look into #pragma pack, with determines how the elements in the stuctures are aligned. Any alignment greater than one byte may increase the size unnecessarily.