How to indicate absence of aliasing for a struct member? - c++

If you write statements like:
a[i] = b[i] + c[i];
...you might want to indicate to the compiler that a[i], b[i] and c[i] point to different places in memory, thus enabling various optimizations (e.g. vectorization). This can be done by adding a special keyword in their declaration:
float * __restrict__ a; // same for b and c
However, what do you do if instead of float you are using a more complex object, say:
struct item {
float foo, bar;
};
Consider the following code:
float *a, *b;
item *c;
// ...
for (int i = 0; i < 42; ++i) {
a[i] = b[i] * c[i].foo + c[i].bar;
}
In the loop, the compiler does not care about c[i] but rather about c[i].foo and c[i].bar. How can an equivalent assertion about pointer independence be declared in this case with respect to individual fields of the structure?

For now, I have managed to solve the problem using:
#pragma GCC ivdep
However, this turns off the dependency checking completely. I would still be interested in a more selective solution.

The Standard does not require compilers to accommodate the possibility that a[anything] and b[anything] might be used to access any part of a struct item. It lists the types of lvalues that implementations must always allow to be used to access a structure such as struct item, and non-character member types such as int are not among them. No special permission would be required to let a compiler assume that neither a nor b will alias a struct item.
Of course, it would be rather unhelpful for a compiler to allow code to take the address of a struct or union member of non-character type, but never actually allow the resulting pointer to be used to access the member even in cases that don't involve aliasing(*). The Standard makes no distinction between cases that involve aliasing and those that don't, however, and leaves the question of when to allow structures to be accessed using member-type lvalues entirely up to implementers' judgment. The authors of gcc and clang may have decided that the easiest way to support the use of pointers to struct members in non-aliasing scenarios was to support them in all scenarios, but the Standard hardly requires such treatment, and I don't think the authors wanted it to.
The Standard gives implementations room to benefit from an assumption that neither a nor b will alias c, and yet still support most programs that would need to work with pointers to structure or union members. While there are some cases where more qualifiers similar to __restrict would be helpful, I'm not sure what additional permission you would think an implementation would need in cases like the one you show.
(*) i.e. situations where no operation accesses a region of storage, or derives a pointer/reference that will be used to do so, at times when there exists a newer pointer/reference that will also be used to access or address the same storage in a conflicting fashion.

There's no need to do that.
Either a compiler is smart enough to know that these struct members cannot alias, or it's not smart enough to do anything useful with that information.
The reason you have the __restrict__ and C99 restrict keywords is because unlike structs, arrays might alias. Therefore the keywords provide real and useful information to the compiler.
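For reference, here is a rough sketch of the question's loop with the qualifier spelled out on every pointer, including the pointer to the struct. The function name scale_add and the parameter n are made up for illustration, and __restrict__ is a GCC/Clang extension rather than standard C++. Whether this changes the generated code compared with the unqualified version depends on the compiler and its flags.

```cpp
#include <cassert>
#include <vector>

struct item {
    float foo, bar;
};

// Hypothetical wrapper around the loop from the question.
// __restrict__ promises that, within this function, the storage written
// through a is not accessed through b or c.
void scale_add(float* __restrict__ a, const float* __restrict__ b,
               const item* __restrict__ c, int n) {
    for (int i = 0; i < n; ++i) {
        a[i] = b[i] * c[i].foo + c[i].bar;
    }
}
```

Qualifying the struct pointer covers both fields at once, since c[i].foo and c[i].bar are both reached through c.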

Related

Unions, aliasing and type-punning in practice: what works and what does not?

I have a problem understanding what can and cannot be done using unions with GCC. I read the questions (in particular here and here) about it but they focus on the C++ standard, and I feel there's a mismatch between the C++ standard and the practice (the commonly used compilers).
In particular, I recently found confusing information in the GCC online doc while reading about the compilation flag -fstrict-aliasing. It says:
-fstrict-aliasing
Allow the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same.
For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.
Pay special attention to code like this:
union a_union {
int i;
double d;
};
int f() {
union a_union t;
t.d = 3.0;
return t.i;
}
The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common.
Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
This is what I think I understood from this example and my doubts:
1) aliasing only works between similar types, or char
Consequence of 1): aliasing - as the word suggests - is when you have one value and two members to access it (i.e. the same bytes);
Doubt: are two types similar when they have the same size in bytes? If not, what are similar types?
Consequence of 1) for non similar types (whatever this means), aliasing does not work;
2) type punning is when we read a different member than the one we wrote to; it's common and it works as expected as long as the memory is accessed through the union type;
Doubt: is aliasing a specific case of type-punning where types are similar?
I get confused because it says unsigned int and double are not similar, so aliasing does not work; then in the example it's aliasing between int and double and it clearly says it works as expected, but calls it type-punning:
not because types are or are not similar, but because it's reading from a member it did not write. But reading from a member it did not write is what I understood aliasing is for (as the word suggests). I'm lost.
The questions:
can someone clarify the difference between aliasing and type-punning and what uses of the two techniques are working as expected in GCC? And what does the compiler flag do?
Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as a different type.
Formally, type-punning is undefined behaviour with only a few exceptions. It happens commonly when you fiddle with bits carelessly:
int mantissa(float f)
{
return (int&)f & 0x7FFFFF; // Accessing a float as if it's an int
}
The exceptions are (simplified)
Accessing integers as their unsigned/signed counterparts
Accessing anything as a char, unsigned char or std::byte
This is known as the strict-aliasing rule: the compiler can safely assume two expressions of different types never refer to the same object (except for the exceptions above) because they would otherwise have undefined behaviour. This facilitates optimizations such as
void transform(float* dst, const int* src, int n)
{
for(int i = 0; i < n; i++)
dst[i] = src[i]; // Can be unrolled and use vector instructions
// If dst and src alias the results would be wrong
}
What gcc says is it relaxes the rules a bit, and allows type-punning through unions even though the standard doesn't require it to
union {
int64_t num;
struct {
int32_t hi, lo;
} parts;
} u = {42};
u.parts.hi = 420;
This is the type-pun gcc guarantees will work. Other cases may appear to work but may one day silently be broken.
Terminology is a great thing, I can use it however I want, and so can everyone else!
are two types similar when they have the same size in bytes? If not, what are similar types?
Roughly speaking, types are similar when they differ by constness or signedness. Size in bytes alone is definitely not sufficient.
is aliasing a specific case of type-punning where types are similar?
Type punning is any technique that circumvents the type system.
Aliasing is a specific case of that which involves placing objects of different types at the same address. Aliasing is generally allowed when types are similar, and forbidden otherwise. In addition, one may access an object of any type through a char (or similar to char) lvalue, but doing the opposite (i.e. accessing an object of type char through a dissimilar type lvalue) is not allowed. This is guaranteed by both C and C++ standards, GCC simply implements what the standards mandate.
GCC documentation seems to use "type punning" in a narrow sense of reading a union member other than the one last written to. This kind of type punning is allowed by the C standard even when types are not similar. OTOH the C++ standard does not allow this. GCC may or may not extend the permission to C++, the documentation is not clear on this.
Without -fstrict-aliasing, GCC apparently relaxes these requirements, but it isn't clear to what exact extent. Note that -fstrict-aliasing is the default when performing an optimised build.
Bottom line, just program to the standard. If GCC relaxes the requirements of the standard, it isn't significant and isn't worth the trouble.
In ANSI C (AKA C89) you have (section 3.3.2.3 Structure and union members):
if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined
In C99 you have (section 6.5.2.3 Structure and union members):
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
IOW, union-based type punning is allowed in C, although the actual semantics may be different, depending on the language standard supported (note that the C99 semantics is narrower than the C89's implementation-defined).
In C99 you also have (section 6.5 Expressions):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
And there's a section (6.2.7 Compatible type and composite type) in C99 that describes compatible types:
Two types have compatible type if their types are the same. Additional rules for
determining whether two types are compatible are described in 6.7.2 for type specifiers,
in 6.7.3 for type qualifiers, and in 6.7.5 for declarators. ...
And then (6.7.5.1 Pointer declarators):
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Simplifying it a bit, this means that in C by using a pointer you can access signed ints as unsigned ints (and vice versa) and you can access individual chars in anything. Anything else would amount to aliasing violation.
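The two permitted cases named above can be sketched as follows. The helper names byte_sum and read_as_unsigned are invented for this illustration; the code only shows accesses that the quoted list allows.

```cpp
#include <cassert>
#include <cstddef>

// Allowed: any object may be inspected byte by byte through unsigned char.
unsigned byte_sum(const void* object, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(object);
    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += bytes[i];
    }
    return sum;
}

// Allowed: an int may be accessed through its unsigned counterpart.
unsigned read_as_unsigned(int& value) {
    return *reinterpret_cast<unsigned*>(&value);
}
```

Reading the same int through a float* or short* instead would fall outside the list and be an aliasing violation.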
You can find similar language in the various versions of the C++ standard. However, as far as I can see in C++03 and C++11 union-based type punning isn't explicitly allowed (unlike in C).
According to the footnote 88 in the C11 draft N1570, the "strict aliasing rule" (6.5p7) is intended to specify the circumstances in which compilers must allow for the possibility that things may alias, but makes no attempt to define what aliasing is. Somewhere along the line, a popular belief has emerged that accesses other than those defined by the rule represent "aliasing", and those allowed don't, but in fact the opposite is true.
Given a function like:
int foo(int *p, int *q)
{ *p = 1; *q = 2; return *p; }
Section 6.5p7 doesn't say that p and q won't alias if they identify the same storage. Rather, it specifies that they are allowed to alias.
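A small check of what that permission means in practice, reusing foo from above. Because both parameters are int*, the compiler has to allow for the store through q changing *p, so a call with both arguments pointing at the same int has to return 2.

```cpp
#include <cassert>

int foo(int* p, int* q) {
    *p = 1;
    *q = 2;
    return *p;  // cannot be folded to 1, because q may equal p
}
```

Had q been declared as float*, the rule would let the compiler assume the store through q leaves *p untouched.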
Note that not all operations which involve accessing storage of one type as another represent aliasing. An operation on an lvalue which is freshly visibly derived from another object doesn't "alias" that other object. Instead, it is an operation upon that object. Aliasing occurs if, between the time a reference to some storage is created and the time it is used, the same storage is referenced in some way not derived from the first, or code enters a context wherein that occurs.
Although the ability to recognize when an lvalue is derived from another is a Quality of Implementation issue, the authors of the Standard must have expected implementations to recognize some constructs beyond those mandated. There is no general permission to access any of the storage associated with a struct or union by using an lvalue of member type, nor does anything in the Standard explicitly say that an operation involving someStruct.member must be recognized as an operation on a someStruct. Instead, the authors of the Standard expected that compiler writers who make a reasonable effort to support constructs their customers need should be better placed than the Committee to judge the needs of those customers and fulfill them. Since any compiler that makes an even-remotely-reasonable effort to recognize derived references would notice that someStruct.member is derived from someStruct, the authors of the Standard saw no need to explicitly mandate that.
Unfortunately, the treatment of constructs like:
actOnStruct(&someUnion.someStruct);
int q = *(someUnion.intArray + i);
has evolved from "It's sufficiently obvious that actOnStruct and the pointer dereference should be expected to act upon someUnion (and consequently all the members thereof) that there's no need to mandate such behavior" to "Since the Standard doesn't require that implementations recognize that the actions above might affect someUnion, any code relying upon such behavior is broken and need not be supported". Neither of the above constructs is reliably supported by gcc or clang except in -fno-strict-aliasing mode, even though most of the "optimizations" that would be blocked by supporting them would generate code that is "efficient" but useless.
If you're using -fno-strict-aliasing on any compiler having such an option, almost anything will work. If you're using -fstrict-aliasing on icc, it will try to support constructs that use type punning without aliasing, though I don't know if there's any documentation about exactly what constructs it does or does not handle. If you use -fstrict-aliasing on gcc or clang, anything at all that works is purely by happenstance.
I think it's good to add a complementary answer, simply because when I asked the question I did not know how to fulfill my needs without using UNION: I got stubborn on using it because it seemed to answer precisely my needs.
The good way to do type punning and to avoid possible consequences of undefined behavior (depending on the compiler and other env. settings) is to use std::memcpy and copy the memory bytes from one type to another. This is explained - for example - here and here.
I've also read that often when a compiler produces valid code for type punning using unions, it produces the same binary code as if std::memcpy was used.
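A minimal sketch of the std::memcpy approach, applied to the mantissa example from earlier in this thread. It assumes a 32-bit IEEE-754 float; float_bits is a helper name introduced here, not something from the linked answers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");

// Copy the object representation of a float into an integer of the same size.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Same intent as the cast-based mantissa() above, without the aliasing violation.
std::uint32_t mantissa(float f) {
    return float_bits(f) & 0x7FFFFFu;
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a plain register move, though that is an observation about typical toolchains rather than a guarantee.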
Finally, even if this information does not directly answer my original question it's so strictly related that I felt it was useful to add it here.

Solving the circular dependency conundrum "elegantly"

So I am developing a programming language which compiles to bytecode for VM execution and also to C as an intermediate language for compiling to native binary. I chose C because it is low level enough and portable, saving a tremendous amount of effort by reusing existing compilers and not having to write compilers to assembly for each and every different platform and its oddities.
But existing compilers come with their drawbacks, one of which is the circular dependency issue. I want to solve circular dependencies in an elegant way (unlike C/C++) without awkward forward declarations, without having to use pointers and extra indirection and wasted memory, without having to separate declarations from definitions and so on... In other words, take this issue away from the developer like some programming languages do.
The way I see it, the main problem of current C/C++ compilers here is that they cannot "look into the future", even though it is not really the future, since the programmer's intent is already expressed in code. My compiler does not have that issue: it is not unaware of anything beyond a certain point of parsing progress, and it knows the sizes of objects with circular dependencies and can calculate the appropriate offsets and such.
I've already implemented "faked" inheritance which simply does "inline expansion" of inherited structures' members, so I am thinking I can also use the same approach to actually fake aggregation as well. In the most basic and simple example:
typedef struct {
int a;
} A;
typedef struct {
A a;
int b;
} B;
becomes:
typedef struct {
int A_a;
int b;
} B;
and the compiler does a bit of "translation":
B b;
b.a.a = 7;
becomes:
b.A_a = 7;
And in the same fashion all structures are collapsed down to a single root structure which only contains primitive types. This way there are absolutely no types used in structures whose sizes are not known in advance so the order of definition becomes irrelevant. Naturally, this mess is hidden away from the user and is only for the compiler's "eyes to see" while the user side is being kept structured and readable. And it goes without saying, but the binary footprint is preserved for compatibility with regular C/C++ code, the collapsed structure is binary identical to a regular structure that would use aggregation or inheritance.
So my question is: Does this sound like a sound idea? Anything that could go wrong I am missing?
EDIT: I only aim to solve the C/C++ related difficulties with circular dependencies, not the "chicken or egg" logical paradox. Obviously it is impossible for two objects to contain each-other without leading to some form of infinite loop.
You cannot safely use pointers to the substructures because you cannot get pointers to "compatible types" by pointing to the primitive members. E.g. after
struct Foo {
short a;
int b;
};
struct Bar {
struct Foo foo;
};
struct Bar bar;
the pointers &bar.foo and &bar.foo.a have different types and cannot be used interchangeably. They also cannot be cast to each other's types without violating the strict aliasing rule, triggering undefined behavior.
The problem can be avoided by inlining the entire struct definition each time:
struct Bar {
struct { short a; int b; } foo;
};
Now &bar.foo is a pointer to struct {short; int;} which is a compatible type for struct Foo.
(There may also be padding/alignment differences between struct-typed members and primitive members, but I couldn't find an example of these.)
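One situation where such a padding difference does appear on common ABIs is tail padding: a nested struct keeps its own trailing padding, while the flattened members do not. The struct names below are invented for the illustration, and the exact offsets depend on the platform's size and alignment of int.

```cpp
#include <cassert>
#include <cstddef>

struct Inner {
    int i;
    char c;   // followed by tail padding so sizeof(Inner) is a multiple of alignof(int)
};

struct Nested {
    Inner in;
    char d;   // starts after all of Inner, tail padding included
};

struct Flattened {
    int in_i;
    char in_c;
    char d;   // can sit directly after in_c
};

std::size_t nested_offset_of_d() { return offsetof(Nested, d); }
std::size_t flattened_offset_of_d() { return offsetof(Flattened, d); }
```

With a 4-byte, 4-aligned int the offsets are typically 8 and 5, so a flattening compiler would have to insert explicit padding members to stay layout-compatible with the nested form.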

What is the difference between a proper defined union and a reinterpret_cast?

Can you propose at least 1 scenario where there is a substantial difference between
union {
T var_1;
U var_2;
}
and
var_2 = reinterpret_cast<U> (var_1)
?
The more I think about this, the more they look like the same thing to me, at least from a practical viewpoint.
One difference that I found is that while the size of a union is as big as its largest member, the reinterpret_cast as described in this post can lead to truncation, so the plain old C-style union is even safer than the newer C++ cast.
Can you outline the differences between these two?
Contrary to what the other answers state, from a practical point of view there is a huge difference, although there might not be such a difference in the standard.
From the standard point of view, reinterpret_cast is only guaranteed to work for roundtrip conversions and only if the alignment requirements of the intermediate pointer type are not stronger than those of the source type. You are not allowed (*) to write through one pointer type and read through another pointer type.
At the same time, the standard requires similar behavior from unions: it is undefined behavior to read out of a union member other than the active one (the member that was last written to)(+).
Yet compilers often provide additional guarantees for the union case, and all compilers I know of (VS, g++, clang++, xlC_r, intel, Solaris CC) guarantee that you can read out of a union through an inactive member and that it will produce a value with exactly the same bits set as those that were written through the active member.
This is particularly important with high optimizations when reading from network:
double ntohdouble(const char *buffer) { // [1]
union {
int64_t i;
double f;
} data;
memcpy(&data.i, buffer, sizeof(int64_t));
data.i = ntohll(data.i);
return data.f;
}
double ntohdouble(const char *buffer) { // [2]
int64_t data;
double dbl;
memcpy(&data, buffer, sizeof(int64_t));
data = ntohll(data);
dbl = *reinterpret_cast<double*>(&data);
return dbl;
}
The implementation in [1] is sanctioned by all compilers I know (gcc, clang, VS, sun, ibm, hp), while the implementation in [2] is not and will fail horribly in some of them when aggressive optimizations are used. In particular, I have seen gcc reorder the instructions and read into the dbl variable before evaluating ntohll, thus producing the wrong results.
(*) With the exception that you are always allowed to read through a [signed|unsigned] char* regardless of what the real object (original pointer type) was.
(+) Again with some exceptions: if the active member shares a common prefix with another member, you can read that prefix through the compatible member.
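For comparison, a sketch of a third variant that relies on neither the union guarantee nor a pointer cast. It is not from the answer above: decode_network_double is a hypothetical name, the byte order is handled by hand instead of with ntohll, and the code assumes a 64-bit IEEE-754 double whose byte order matches that of 64-bit integers on the host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");

// Hypothetical helper: decodes a big-endian (network order) double from 8 bytes.
double decode_network_double(const unsigned char* buffer) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | buffer[i];  // assemble most significant byte first
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);  // copy bytes instead of casting pointers
    return value;
}
```

Because the only cross-type access goes through memcpy, there is nothing for type-based alias analysis to reorder incorrectly.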
There are some technical differences between a proper union and (let's assume) a proper and safe reinterpret_cast. However, I can't think of any of these differences which cannot be overcome.
The real reason to prefer a union over reinterpret_cast in my opinion isn't a technical one. It's for documentation.
Supposing you are designing a bunch of classes to represent a wire protocol (which I guess is the most common reason to use type-punning in the first place), and that wire protocol consists of many messages, submessages and fields. If some of those fields are common, such as msg type, seq#, etc, using a union simplifies tying these elements together and helps to document exactly how the protocol appears on the wire.
Using reinterpret_cast does the same thing, obviously, but in order to really know what's going on you have to examine the code that advances from one packet to the next. Using a union you can just take a look at the header and get an idea what's going on.
In C++11, a union is a class type and can hold a member with non-trivial member functions. You can't simply cast from one member to another.
§ 9.5.3
[ Example: Consider the following union:
union U {
int i;
float f;
std::string s;
};
Since std::string (21.3) declares non-trivial versions of all of the special member functions, U will have
an implicitly deleted default constructor, copy/move constructor, copy/move assignment operator, and destructor. To use U, some or all of these member functions must be user-provided. — end example ]
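To illustrate what "user-provided" means here, the following is one possible sketch of a wrapper around such a union. The type IntOrString and its Kind tag are invented for the example; copying is simply disabled to keep it short.

```cpp
#include <cassert>
#include <string>

struct IntOrString {
    enum class Kind { Int, Str };
    Kind kind;   // records which union member is currently alive
    union {      // anonymous union: members are accessed directly as i and s
        int i;
        std::string s;
    };

    explicit IntOrString(int value) : kind(Kind::Int), i(value) {}
    explicit IntOrString(const std::string& value) : kind(Kind::Str), s(value) {}

    // Copying would need to inspect kind; disabled in this sketch.
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    ~IntOrString() {
        if (kind == Kind::Str) {
            s.~basic_string();  // the string member must be destroyed by hand
        }
    }
};
```

Switching from one member to the other would likewise require destroying the old member and constructing the new one explicitly, which is why a plain cast between members is not an option.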
From a practical point of view, they're most probably 100% identical, at least on real, non-fictional computers. You take the binary representation of one type and stuff it into another type.
From a language lawyer point of view, using reinterpret_cast is well-defined for some occasions (e.g. pointer to integer conversions) and implementation-specific otherwise.
Union type punning, on the other hand, is very clearly undefined behaviour, always (though undefined does not necessarily mean "doesn't work"). The standard says that the value of at most one of the non-static data members can be stored in a union at any time. This means that if you set var_1 then var_1 is valid, but var_2 is not.
However, since var_1 and var_2 are stored at the same memory location, you can of course still read and write any of the types as you like, and assuming they have the same storage size, no bits are "lost".

A well supported alternative to c++ unions?

I think that unions can be perfect for what I have in mind, especially considering that my code should run on a really heterogeneous family of machines, including low-powered ones. What is bugging me is that the people who create compilers don't seem to care much about offering good union support; for example, this table is practically empty when it comes to Unrestricted Unions support, and this is a really unpleasant sight for my projects.
Are there alternatives to unions that can at least mimic the same properties?
Unions are well supported on most compilers; what's not well supported are unions that contain members with non-trivial constructors (unrestricted unions). In practice, you'd almost always want a custom constructor when creating unions, so not having unrestricted unions is more of a matter of inconvenience.
Alternatively, you can always use void pointer pointing to a malloc-ed memory with sufficient size for the largest member. The drawback is you'd need explicit type casting.
One popular alternative to union is Boost.Variant.
The types you use in it have to be copy-constructible.
Update:
C++17 introduced std::variant. Its requirements are based on Boost.Variant, modernized to take C++17 features into account. It does not carry the overhead of being compatible with C++98 compilers (like Boost), and it comes for free with the standard library. If available, it is better to use it as an alternative to unions.
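A short sketch of what that looks like in use, assuming a C++17 compiler. The alias Value and the function describe are names picked for the example.

```cpp
#include <cassert>
#include <string>
#include <variant>

// A value that holds either an int or a std::string, and knows which.
using Value = std::variant<int, std::string>;

std::string describe(const Value& v) {
    if (std::holds_alternative<int>(v)) {
        return "int:" + std::to_string(std::get<int>(v));
    }
    return "string:" + std::get<std::string>(v);
}
```

Unlike a raw union, the variant tracks the active alternative itself and runs the right constructor and destructor, and std::get throws std::bad_variant_access if the wrong alternative is requested.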
You can always do essentially the same thing using explicit casts:
struct YourUnion {
alignas(A) alignas(int) char x[sizeof(A) > sizeof(int) ? sizeof(A) : sizeof(int)]; // sized and aligned for the largest member
A &field1() { return *(A*)&x[0]; }
int &field2() { return *(int*)&x[0]; }
};
YourUnion y;
new(&y.field1()) A(); // construct A
y.field1().~A(); // destruct A
y.field2() = 1;
// ...
(This is zero overhead compared to, e.g., Boost.Variant.)
EDIT: more union-like and without templates.

Casting big POD into small POD - guaranteed to work?

Suppose I've a POD struct which has more than 40 members. These members are not built-in types, rather most of them are POD structs, which in turn have lots of members, most of which are POD structs again. This pattern goes up to many levels - POD has POD has POD and so on - up to even 10 or so levels.
I cannot post the actual code, so here is one very simple example of this pattern:
//POD struct
struct Big
{
A a[2]; //POD has POD
B b; //POD has POD
double dar[6];
int m;
bool is;
double d;
char c[10];
};
And A and B are defined as:
struct A
{
int i;
int j;
int k;
};
struct B
{
A a; //POD has POD
double x;
double y;
double z;
char *s;
};
It's really very simplified version of the actual code which was written (in C) almost 20 years back by Citrix Systems when they came up with ICA protocol. Over the years, the code has been changed a lot. Now we've the source code, but we cannot know which code is being used in the current version of ICA, and which has been discarded, as the discarded part is also present in the source code.
That is the background of the problem. The problem is: now we've the source code, and we're building a system on the top of ICA protocol, for which at some point we need to know the values of few members of the big struct. Few members, not all. Fortunately, those members appear in the beginning of the struct, so can we write a struct which is part of the big struct, as:
//Part of struct Big
//Whatever members it has, they are in the same order as they appear in Big.
struct partBig
{
A a[2];
B b;
double dar[6];
//rest is ignored
};
Now suppose, we know pointer to Big struct (that we know by deciphering the protocol and data streams), then can we write this:
Big *pBig = GetBig();
partBig *part = (partBig*)pBig; //Is this safe?
/*here can we pretend that part is actually Big, so as to access
first few members safely (using part pointer), namely those
which are defined in partBig?*/
I don't want to define the entire Big struct in our code, as it has too many POD members and if I define the struct entirely, I've to first define hundreds of other structs. I don't want to do that, as even if I do, I doubt if I can do that correctly, as I don't know all the structs correctly (as to which version is being used today, and which is discarded).
I've done the casting already, and it seems to work for last one year, I didn't see any problem with that. But now I thought why not start a topic and ask everyone. Maybe, I'll have a better solution, or at least, will make some important notes.
Relevant references from the language specification will be appreciated. :-)
Here is a demo of such casting: http://ideone.com/c7SWr
The language specification has some features that are similar to what you are trying to do: 6.5.2.3/5 states that if a union contains several structs that share a common initial sequence of members, you are allowed to inspect these common members through any of the structs in the union, regardless of which one is currently active. So, from this guarantee one can easily derive that structs with a common initial sequence of members should have identical memory layout for these common members.
However, the language does not seem to explicitly allow doing what you are doing, i.e. reinterpreting one struct object as another unrelated struct object through a pointer cast. (The language allows you to access the first member of a struct object through a pointer cast, but not what you do in your code). So, from that perspective, what you are doing might be a strict-aliasing violation (see 6.5/7). I'm not sure about this though, since it is not immediately clear to me whether the intent of 6.5/7 was to outlaw this kind of access.
However, in any case, I don't think that compilers that use strict-aliasing rules for optimization do it at aggregate type level. Most likely they only do it at fundamental type level, meaning that your code should work fine in practice.
Yes, assuming your small structure is laid out with the same packing/padding rules as the big one, you'll be fine. The memory layout order of fields in a structure is well-defined by the language specifications.
The only way to get partBig from Big is using reinterpret_cast (or C style, as in your question), and the standard does not specify what happens in that case. However, it should work as expected. It would probably fail only for some super exotic platforms.
The relevant type-aliasing rule is 3.10/10: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ...".
The list contains two cases that are relevant to us:
a type similar (as defined in 4.4) to the dynamic type of the object
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
The first case isn't sufficient: it merely covers cv-qualifications. The second case allows the union hack that AndreyT mentions (the common initial sequence), but not anything else.
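If the pointer cast is a concern, one cautious alternative is to copy the leading bytes into a real partBig object and to check the shared layout at compile time. This is only a sketch: copy_prefix is a made-up helper, Big is spelled out here solely so the checks can be written, and the approach leans on both structs being trivially copyable with an identical prefix layout rather than on an explicit guarantee in the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct A { int i; int j; int k; };
struct B { A a; double x; double y; double z; char* s; };
struct Big { A a[2]; B b; double dar[6]; int m; bool is; double d; char c[10]; };
struct partBig { A a[2]; B b; double dar[6]; };

// Compile-time checks that the shared members sit at the same offsets.
static_assert(offsetof(Big, b) == offsetof(partBig, b), "prefix layout differs");
static_assert(offsetof(Big, dar) == offsetof(partBig, dar), "prefix layout differs");
static_assert(sizeof(partBig) <= sizeof(Big), "prefix must fit inside Big");

// Hypothetical helper: copies the leading bytes of a Big into a partBig object.
partBig copy_prefix(const Big& big) {
    partBig part;
    std::memcpy(&part, &big, sizeof part);
    return part;
}
```

In the real code base the static_asserts could only be written where the full definition of Big is visible, so they are more of a sanity check for a test build than something the consuming code can rely on.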