memory alignment issues with union

Is there a guarantee that the memory for an object of this type will be properly aligned if we create it on the stack?
union my_union
{
int value;
char bytes[4];
};
If we create char bytes[4] on the stack and then try to cast it to an integer, there might be an alignment problem. We can avoid that problem by creating it on the heap; however, is there such a guarantee for union objects? Logically there should be, but I would like to confirm.
Thanks.

Well, that depends on what you mean.
If you mean:
Will both the int and char[4] members of the union be properly aligned so that I may use them independently of each other?
Then yes. If you mean:
Will the int and char[4] members be guaranteed to be aligned to take up the same amount of space, so that I may access individual bytes of the int through the char[4]?
Then no. This is because sizeof(int) is not guaranteed to be 4. If ints are 2 bytes, then who knows which two char elements will correspond to the int in your union (the standard doesn't specify)?
If you want to use a union to access the individual bytes of an int, use this:
union {
int i;
char c[sizeof(int)];
};
Since each member is the same size, they're guaranteed to occupy the same space. This is what I believe you want to know about, and I hope I've answered it.
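For what it's worth, the first guarantee (the union is aligned well enough for every member) can be checked at compile time. This is only a sketch: is_int_aligned and stack_object_is_aligned are made-up helper names, and the run-time check assumes that converting a pointer to std::uintptr_t yields the numeric address, which is implementation-defined but holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

union my_union
{
    int value;
    char bytes[sizeof(int)];
};

// A union must be aligned strictly enough for its most demanding member
// and large enough to hold its largest member.
static_assert(alignof(my_union) >= alignof(int), "union must satisfy int alignment");
static_assert(sizeof(my_union) >= sizeof(int), "union must be able to hold an int");

// Hypothetical helper: true if the object's address is a multiple of alignof(int).
// Assumes the pointer-to-integer conversion yields the numeric address.
inline bool is_int_aligned(const my_union& u)
{
    return reinterpret_cast<std::uintptr_t>(&u) % alignof(int) == 0;
}

// Hypothetical helper: performs the check on an automatic (stack) object.
inline bool stack_object_is_aligned()
{
    my_union u;
    u.value = 0;
    return is_int_aligned(u);
}
```

Calling stack_object_is_aligned() should return true for any conforming compiler, since automatic objects get the alignment their type requires.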

Yeah, unions would be utterly useless otherwise.

Related

Is there any conversions when storing/reading integers of different size to/from union?

Reading Rapidjson code I found some interesting optimization with "type punning".
// By using proper binary layout, retrieval of different integer types do not need conversions.
union Number {
#if RAPIDJSON_ENDIAN == RAPIDJSON_LITTLEENDIAN
struct I {
int i;
char padding[4];
}i;
struct U {
unsigned u;
char padding2[4];
}u;
#else
struct I {
char padding[4];
int i;
}i;
struct U {
char padding2[4];
unsigned u;
}u;
#endif
int64_t i64;
uint64_t u64;
double d;
}; // 8 bytes
It looks like only BE CPUs are affected by this optimization. How does this increase performance? I'd like to test it but do not have a BE machine.
Wikipedia says:
While not allowed by C++, such type punning code is allowed as "implementation-defined" by the C11 standard and commonly used in code interacting with hardware.
So is it legal in C++? I believe it works fine in the vast majority of cases. But should I use it in new code?

So is it legal in C++?
No, it isn't legal in C++ (Wikipedia also already stated "While not allowed by C++ ...").
In C++ a union just reserves memory for its members, enough to fit the largest one. That memory is shared by all members.
Accessing a union member other than the one that was last written is undefined behavior. You need to decide beforehand which union member to work with; if the union is shared between functions, this is often done using a type discriminator.
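If the goal is just to look at the bytes of an integer, a common way to stay within defined behavior is to copy the object representation with std::memcpy instead of reading an inactive union member. A minimal sketch, not taken from RapidJSON; bytes_of and int_from are hypothetical helper names:

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the object representation of an int into a byte array.
inline std::array<unsigned char, sizeof(int)> bytes_of(int value)
{
    std::array<unsigned char, sizeof(int)> out{};
    std::memcpy(out.data(), &value, sizeof(int));
    return out;
}

// Hypothetical helper: rebuilds an int from bytes previously produced by bytes_of.
inline int int_from(const std::array<unsigned char, sizeof(int)>& bytes)
{
    int value;
    std::memcpy(&value, bytes.data(), sizeof(int));
    return value;
}
```

The order of the bytes returned by bytes_of still depends on the endianness of the machine, but the round trip through int_from gives back the original value either way.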

How does an union determine max size from a list of objects?

I am not sure the question is well put: I understand how to do it, but I don't know how to phrase the part I don't understand. Here it is:
I have some classes:
class Animal {};
class Rabbit : public Animal {};
class Horse : public Animal {};
class Mouse : public Animal {};
class Pony : public Horse {};
My goal was to find the maximum size among these object types in order to use it for memory allocation afterwards. I stored the sizeof of each type in an array and then took the max of the array. My superior (to whom I sent the code for review) suggested that I use a union in order to find the maximum size at compile time. The idea seemed very nice to me, so I did it like this:
typedef union
{
Rabbit rabbitObject;
Horse horseObject;
Mouse mouseObject;
Pony ponyObject;
} Size;
... because a union allocates memory according to its largest element.
The next suggestion was to do it like this:
typedef union
{
unsigned char RabbitObject[sizeof(Rabbit)];
unsigned char HorseObject[sizeof(Horse)];
unsigned char MouseObject[sizeof(Mouse)];
unsigned char PonyObject[sizeof(Pony)];
} Interesting;
My question is:
How does the Interesting union get the maximum object size? To me, it makes no sense to create arrays of type unsigned char with length sizeof(class) inside it. Why would the second option solve the problem when the first union doesn't?
What is happening behind the scenes that I am missing?
PS: Circumstances are such that I cannot ask the guy personally.
Thank you in advance

The assumptions are incorrect, and the question is moot. The standard does not require the union size to be equal to the size of the largest member. Instead, it requires the union size to be sufficient to hold the largest member, which is not the same at all. Both solutions are flawed if the size of the largest class needs to be known exactly.
Instead, something like this should be used:
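#include <algorithm>   // std::max (constexpr since C++14)
#include <cstddef>     // std::size_t
#include <type_traits> // std::integral_constant
template<class L, class Y, class... T> struct max_size
: std::integral_constant<std::size_t, std::max(sizeof (L), max_size<Y, T...>::value)> { };
template<class L, class Y> struct max_size<L, Y>
: std::integral_constant<std::size_t, std::max(sizeof (L), sizeof (Y))> { };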
As @Caleth suggested below, it could be shortened using the initializer-list version of std::max (and variable templates):
template<class... Ts>
constexpr size_t max_size_v = std::max({sizeof(Ts)...});

The two approaches provide a way to find a maximum size that all of the objects of the union will fit within. I would prefer the first as it is clearer as to what is being done and the second provides nothing that the first does not for your needs.
And the first, a union composed of the various classes, offers the ability to access a specific member of the union as well.
See also "Is a struct's address the same as its first member's address?", "sizeof a union in C/C++", and "Anonymous union and struct".
For some discussions on memory layout of classes see the following postings:
Structure of a C++ Object in Memory Vs a Struct
How is the memory layout of a class vs. a struct
memory layout C++ objects [closed]
C++ Class Memory Model And Alignment
What does an object look like in memory? [duplicate]
C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?
Since the compiler is free to add to the sizes of the various components in order to align variables on particular memory address boundaries, the size of the union may be larger than the actual size of the data. Some compilers offer a pragma or other type of directive to instruct the compiler as to whether packing of the class, struct, or union members should be done or not.
The size reported by sizeof() is the size of the variable or type specified; again, however, this may include additional unused memory that pads the variable out to the next desirable memory address alignment. See Why isn't sizeof for a struct equal to the sum of sizeof of each member?.
Typically a class, struct, or union is sized so that if an array of the type is created then each element of the array will begin on the most useful memory alignment such as a double word memory alignment for an Intel x86 architecture. This padding is typically on the end of the variable.
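To make the trailing padding concrete, here is a small sketch. The union name padded and the constant tail_padding are made up for illustration, and the actual amount of padding depends on the platform's alignment of double, so no particular value is asserted.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative union: the char array is one byte longer than a double.
union padded
{
    char arr[sizeof(double) + 1];
    double d;
};

// The union must be able to hold its largest member ...
static_assert(sizeof(padded) >= sizeof(double) + 1, "must hold the largest member");
// ... and its size must be a multiple of its alignment, so that every
// element of an array of padded lands on a properly aligned address.
static_assert(sizeof(padded) % alignof(padded) == 0, "size is a multiple of alignment");

// Bytes added after the largest member (platform dependent, possibly zero).
constexpr std::size_t tail_padding = sizeof(padded) - (sizeof(double) + 1);
```

On a platform where double needs more than 1-byte alignment, tail_padding will be non-zero, which is exactly why sizeof of the union may exceed the size of its largest member.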

Your superior suggested you use the array version because a union could have padding. For instance, if you have
union padding {
char arr[sizeof (double) + 1];
double d;
};
Then this could either be of size sizeof(double) + 1 or it could be sizeof(double) * 2, as the union could be padded to keep it aligned for doubles.
However if you have
union padding {
char arr[sizeof(double) + 1];
char d[sizeof(double)];
};
Then the union need not be double-aligned, and it most likely has a size of sizeof(double) + 1. This is not guaranteed, though, and the size can be greater than that of its largest element.
If you want to be sure to get the largest size, I would suggest using
auto max_size = std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});
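Since the stated goal was memory allocation, here is a sketch of how the std::max result could feed a raw buffer. It assumes C++14 or later (where std::max is constexpr); the data members of the classes and the AnimalStorage type are invented for illustration and are not from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Stand-in classes; the data members are made up so the sizes differ.
struct Animal { };
struct Rabbit : Animal { char ears[2]; };
struct Horse  : Animal { double weight; };
struct Mouse  : Animal { char tail; };
struct Pony   : Horse  { int height; };

// Exact size of the largest class, computed at compile time.
constexpr std::size_t max_size =
    std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});

// Alignment matters as much as size if objects will be constructed in the buffer.
constexpr std::size_t max_align =
    std::max({alignof(Rabbit), alignof(Horse), alignof(Mouse), alignof(Pony)});

// Hypothetical raw storage able to hold any one of the classes above.
struct AnimalStorage
{
    alignas(max_align) unsigned char bytes[max_size];
};

static_assert(sizeof(AnimalStorage) >= max_size, "storage must fit the largest class");
```

An object could then be created in the buffer with placement new, for example new (storage.bytes) Pony, as long as it is destroyed before the storage is reused.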

alignment when referring to structs vs variables

When we talk about alignment, we always refer to variables inside a struct and not to single variables.
Could you please tell me why that is?
When we refer to a single variable, would it take up a whole "word"?

Most likely because if you're messing with variable alignment you are either in the world of low-level optimisation (based on cache line estimations and whatnot) or doing some embedded programming.
Most developers don't do that, and the ones who do are more likely to go read their platform specs and rethink the alignment principles that they probably already know rather than discuss it on the internet (there are exceptions of course, it's just not the general tendency).
I have yet to see a variable alignment which is not a derivative of:
// the array "cacheline" will be aligned to 128-byte boundary
alignas(128) char cacheline[128];
On the other hand, you don't need very specific situations to see the impact of aggregate (struct) alignment on a program.
This is something a beginner will write and question at some point or another:
#include <iostream>
struct no_align
{
char c;
double d;
int i;
};
struct align
{
double d;
int i;
char c;
};
int main(void)
{
no_align no_align_array[100];
align align_array[100];
std::cout << sizeof(no_align_array) << std::endl;
std::cout << sizeof(align_array) << std::endl;
}
On my machine the result is:
2400
1600
And that's the point where you'll go around the internet asking why in the world one version makes you use 800 more bytes than the other, if no teacher ever explained that to you.
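To see where the extra bytes come from, one can compare the bytes actually used by the members with what sizeof reports. This is a rough sketch; payload, padding_no_align and padding_align are names invented here, and the concrete values depend on the ABI, so none are asserted.

```cpp
#include <cassert>
#include <cstddef>

struct no_align
{
    char c;   // usually followed by padding so that d is suitably aligned
    double d;
    int i;    // usually followed by tail padding
};

struct align
{
    double d; // most strictly aligned member first
    int i;
    char c;   // only tail padding remains
};

// Bytes occupied by the members themselves, identical for both structs.
constexpr std::size_t payload = sizeof(char) + sizeof(double) + sizeof(int);

// Bytes the compiler added as padding (platform dependent).
constexpr std::size_t padding_no_align = sizeof(no_align) - payload;
constexpr std::size_t padding_align    = sizeof(align) - payload;

// The first member of a standard-layout struct always sits at offset 0.
static_assert(offsetof(align, d) == 0, "no padding before the first member");
```

With the sizes from the output above (24 and 16 bytes per element), that would be 11 padding bytes per no_align against 3 per align, assuming a 4-byte int and an 8-byte double.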

Every type has a size, which is fixed, and an alignment requirement.
A struct has members with their own alignment requirements. As a logical consequence, a struct must have an alignment requirement at least as strong as those of all its members. A struct may have to add padding so that all its members meet their alignment requirements.
An array stores multiple array elements consecutively without any padding. As a logical consequence, the size of any type must be a multiple of its alignment requirement (so a struct containing an int and a char cannot have an alignment requirement of four bytes and a size of five bytes, because that wouldn't work for the second array element in an array of two such structs).
Variables must be placed at addresses that satisfy their alignment requirements, so your first sentence is wrong.
However, there is the "as-if" rule: Normally, the compiler has to do what the language tells it. But the "as-if" rule says that the compiler can do whatever it wants to do as long as a program cannot find the difference. So if storing an int on an unaligned address makes no difference (except maybe a tiny cost in time), the compiler is allowed to do this.
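The rule that a type's size is a multiple of its alignment can be checked directly. A small sketch using the int-plus-char struct mentioned above; int_and_char and element_stride are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// The struct from the example: an int and a char.
struct int_and_char
{
    int i;
    char c;
};

// The struct is at least as strictly aligned as its int member ...
static_assert(alignof(int_and_char) >= alignof(int), "alignment covers the int member");
// ... and its size is a multiple of that alignment, so it cannot simply be
// sizeof(int) + 1 when int needs more than 1-byte alignment.
static_assert(sizeof(int_and_char) % alignof(int_and_char) == 0, "size is a multiple of alignment");

// Hypothetical helper: distance in bytes between two consecutive array elements.
inline std::size_t element_stride()
{
    int_and_char pair[2] = {};
    const char* first  = reinterpret_cast<const char*>(&pair[0]);
    const char* second = reinterpret_cast<const char*>(&pair[1]);
    return static_cast<std::size_t>(second - first);
}
```

element_stride() always equals sizeof(int_and_char), because arrays have no padding between elements; any padding lives inside the element type itself.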

Two arrays in a union in C++

Is it possible to share two arrays in a union like this:
struct
{
union
{
float m_V[Height * Length];
float m_M[Height] [Length];
} m_U;
};
Do these two arrays share the same memory size or is one of them longer?

Both arrays are required to have the same size and layout. Of course,
if you initialize anything using m_V, then all accesses to m_M are
undefined behavior; a compiler might, for example, note that nothing in
m_V has changed, and return an earlier value, even though you've
modified the element through m_M. I've actually used a compiler which
did so, in the distant past. I would avoid accesses where the union
isn't visible, say by passing a reference to m_V and a reference to
m_M to the same function.

It is implicitly guaranteed that these will be the same size in memory. The compiler is not allowed to insert padding anywhere in either the 2D array or the 1D array, because everything must be compatible with sizeof.
[Of course, if you wrote to m_V and read from m_M (or vice versa), you'd still be type-punning, which technically invokes undefined behaviour. But that's a different matter.]
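The size claim can be verified with a static_assert, and the type-punning issue can be sidestepped by keeping only the 1D array and computing the index by hand. This is just a sketch: the values of Height and Length, the union name grid, and the matrix type with its at member are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed dimensions; the question does not give concrete values.
constexpr std::size_t Height = 3;
constexpr std::size_t Length = 4;

// The union from the question, given a name here.
union grid
{
    float m_V[Height * Length];
    float m_M[Height][Length];
};

// Arrays contain no padding, so both members have exactly the same size.
static_assert(sizeof(float[Height * Length]) == sizeof(float[Height][Length]),
              "1D and 2D arrays have the same size");

// Hypothetical alternative without a union: one 1D array plus 2D indexing.
struct matrix
{
    float m_V[Height * Length];

    // Row-major access, equivalent to m_M[row][col] in the union version.
    float& at(std::size_t row, std::size_t col)
    {
        return m_V[row * Length + col];
    }
};
```

With this layout both views refer to the same array member, so there is never an inactive union member being read.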

C/C++: Packing or padding data in a struct

I'm using the Code::Blocks IDE with the GNU GCC compiler.
struct test
{
char a;
char e;
char f;
char b;
char d;
};
sizeof(test) returns 5.
I read this answer:
Why isn't sizeof for a struct equal to the sum of sizeof of each member?
How come there is no padding after the last char, so that sizeof(test) returns 6 or 8? There are a ton more questions I could ask once I add short and int, etc. But I think this question is good for now. Would not padding make it easier for the processor to work with the struct?

The alignment of a char is only 1, so there is no need for the struct to be padded out to meet a larger alignment requirement.
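As a sketch of how this changes once a more strictly aligned member is added (the follow-up the question hints at): with_short is a made-up struct, and the exact sizes are left unasserted because they depend on the compiler and target.

```cpp
#include <cassert>
#include <cstddef>

// The struct from the question: five chars.
struct test
{
    char a;
    char e;
    char f;
    char b;
    char d;
};

// Hypothetical variant: the same chars plus one short.
struct with_short
{
    char a;
    char e;
    char f;
    char b;
    char d;
    short s; // raises the alignment requirement of the whole struct
};

// Every type's size is a multiple of its own alignment.
static_assert(sizeof(test) % alignof(test) == 0, "size is a multiple of alignment");
static_assert(sizeof(with_short) % alignof(with_short) == 0, "size is a multiple of alignment");

// The struct must be aligned at least as strictly as its short member.
static_assert(alignof(with_short) >= alignof(short), "alignment covers the short member");
```

If short is 2 bytes with 2-byte alignment, as with GCC on x86, with_short would need one padding byte before s, giving a size of 8 rather than 7.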

Since most of the time you work with one member at a time, or pass the address of the struct, the compiler doesn't bother to align the whole struct more than the alignment needed for its members. That means that if you assign this struct (or pass it to a function), the processor will have to read it member by member (and it will be a little slower).
