The response to LWG 484 says that "convertible" is "well-defined in the core". Which clauses does it refer to?
I have this question because the requirements of is_convertible are currently defined carefully in [meta.rel]/5, which is not a core clause. Although the technical benefits of that definition are clearly noted, its coexistence with other styles in the library clauses seems confusing. So there are further questions about the general usage.
Is the meaning of the requirements implied by is_convertible exactly the same as that of the "convertible to" (without the code font) wording in the library clauses?
If so, why are both used?
If not, is there any guideline or rationale differentiating these subtle use cases?
"Convertible to" refers to something that can be the result of a standard conversion sequence, defined in [conv]. Since is_convertible is defined in terms of the conversions the core language allows during copy-initialization of a return value, I don't see any difference.
Edit: "implicit conversion sequence" might be another useful term, not mentioned in the question. It is defined in [over.best.ics].
The "'convertible' is well-defined in core" statement was added at some point between 2004 and 2009, though that page doesn't say exactly when. So I looked at the C++03 and C++11 standards, and it appears that neither actually has a definition of the term "convertible". However, I think we can infer that "convertible" is generally intended to mean "implicitly convertible", and the meaning of "implicitly convertible" is given by [conv] ([conv.general] in newer editions of the standard).
For example C++11 [conv]/3 says:
An expression e can be implicitly converted to a type T if and only if the declaration T t=e; is well-formed, for some invented temporary variable t (8.5). ...
However, it should be noted that the definition of "implicitly converted" uses an expression, not a type, as the source, whereas "convertible to" in the library sometimes has a type as the source. In that case, "T is convertible to U" should be interpreted as "all expressions of type T can be implicitly converted to type U".
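To illustrate that reading, here is a minimal sketch (Source and Target are invented names):

#include <type_traits>

// "int is convertible to long": any expression of type int can be used to
// copy-initialize a long.
static_assert(std::is_convertible<int, long>::value, "");

// An explicit-only conversion does not count as "convertible":
struct Source {};
struct Target { explicit Target(Source) {} };
static_assert(!std::is_convertible<Source, Target>::value, "");
// Target t = Source{};   // ill-formed: the constructor is explicit
// Target t2(Source{});   // OK, but that is direct-initialization, not implicit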
There are other plausible meanings of "convertible", for example:
The first sentence of [expr.static.cast] says "The result of the expression static_cast<T>(v) is the result of converting the expression v to type T". This is more powerful than an implicit conversion. It can, for example, convert void* to int*, or an enum type to an integral type.
The sections of the C++ standard that specify the behaviour of (T) cast-expression and T ( expression-list(opt) ) are titled "Explicit type conversion (cast notation)" and "Explicit type conversion (functional notation)". These conversions are much more powerful than implicit conversions; they can call explicit constructors and explicit conversion operators, and even perform static_casts, reinterpret_casts, and const_casts (and even bypass access control in some cases).
But if we look at how the term "convertible" is actually used in the standard, it seems that it must be intended to mean "implicitly convertible". For example, C++11 [allocator.uses.trait]/1 says:
Remark: automatically detects whether T has a nested allocator_type that is convertible from Alloc. Meets the BinaryTypeTrait requirements (20.9.1). The implementation shall provide a definition
that is derived from true_type if a type T::allocator_type exists and is_convertible<Alloc, T::allocator_type>::value != false, otherwise it shall be derived from false_type. [...]
Evidently the intent here was that "convertible" means the same thing as what is_convertible tests for, that is, implicit convertibility (with a caveat that will be discussed below).
Another example is that in C++11 [container.requirements.general], Table 96, it is stated that for a container X, the type X::iterator shall be "convertible to X::const_iterator". The general understanding of this is that it means implicitly convertible, i.e., this is well-formed:
void foo(std::vector<int>& v) {
    std::vector<int>::const_iterator it = v.begin();
}
Imagine if you had to use a cast, or even just direct-initialization notation, to perform this conversion. Could LWG have meant to use the word "convertible" with a meaning that would make such an implementation conforming? It seems very doubtful.
Wait, does "convertible to" have the same meaning as what the standard type trait std::is_convertible tests for? Assuming that "convertible to" means "implicitly convertible to" with the caveat above, strictly speaking, no, because there is the edge case where std::is_convertible<void, void>::value is true but void is not implicitly convertible to itself (since the definition of a void variable is ill-formed). I think this is the only edge case (also including the other 15 variations involving cv-qualified void), although I'm not 100% sure about that.
However, this doesn't explain why std::is_convertible is sometimes used in the library specification. Actually, it is only used once in C++11 (other than in its own definition), namely in [allocator.uses.trait]/1, and there, it doesn't seem that there is any intent for it to mean anything other than "implicitly convertible", since it is irrelevant what happens when Alloc is void (as it is mentioned elsewhere that non-class types are not permitted as allocator types). The fact that std::is_convertible is even mentioned here seems like it's just a mild case of an implementation detail leaking into the specification. This is probably also the case for most other mentions that appear in newer editions of the standard.
C++20 introduced a concept called std::convertible_to, where std::convertible_to<From, To> is only modelled if
From is implicitly convertible to To, and
static_cast<To>(e) is well-formed, where e has type std::add_rvalue_reference_t<From> (i.e., From is also explicitly convertible to To via static_cast), and
some other requirements hold, which I won't get into here.
It's reasonable to ask whether some uses of "convertible to" actually mean std::convertible_to, given the similar names, and other considerations. In particular, the one mentioned by LWG 484 itself. C++20 [input.iterators] Table 85 states that if X meets the Cpp17InputIterator requirements for the value type T, then *a must be "convertible to T", where a has type X or const X.
This is almost certainly another case where "convertible to", as used in the library specification, implies "implicitly convertible to", but I think that it also should imply "explicitly convertible to", where the implicit and explicit conversions give the same results. We can construct pathological iterator types for which they don't. See this Godbolt link for an example.
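In that spirit, a minimal sketch of such a pathological pair (From and To are invented names; an iterator whose operator* returned From would exhibit the same problem):

struct From;

struct To {
    int which = 0;
    To() = default;
    explicit To(const From&) : which(1) {}   // chosen by static_cast / direct-init
};

struct From {
    operator To() const { To t; t.which = 2; return t; }   // chosen by copy-init
};

// To a = From{};                   // implicit conversion:  a.which == 2
// To b = static_cast<To>(From{});  // explicit conversion:  b.which == 1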
I think both users and standard library implementers should feel free to assume that if a function accepts input iterators, explicit conversions via static_cast or direct-initialization notation from *a to T can be used with the same result as implicit conversions. (Such explicit conversions may, for example, be used when calling function templates, if we want the deduced type of the function template specialization's argument to be T& or const T& and not some other U& or const U& where U is the actual type of *a.) In other words, I think "convertible to" in the input iterator requirements really means something like std::convertible_to, not std::is_convertible_v. I would be surprised if standard library implementations truly relied only on implicit convertibility in this case, and not on both implicit and explicit convertibility (with the two being equivalent). (They could implement the library in an extremely careful way that uses only implicit conversions in cases where one would normally use an explicit conversion, but I'm not sure that they actually do. And even if they are indeed that clever, it is not reasonable to expect the same from anyone who is not a standard library implementer.)
But that's an issue for LWG to resolve. Notice that LWG 484 was reopened in 2009 and remains open to this day. It's probably a lot of work to go through all the uses of "convertible to" and "convertible from" and determine which ones should mean "implicitly convertible to" and "implicitly convertible from" and which ones should mean std::convertible_to (as well as if any uses of "implicitly convertible to" need to be changed to std::convertible_to).
I've seen it come up a few times on StackOverflow and elsewhere that decltype(sizeof(T)) can be used with std::void_t to SFINAE on whether T is complete. This technique is even documented by Raymond Chen on Microsoft's blog, in the post titled Detecting in C++ whether a type is defined, with the explicit comment:
I’m not sure if this is technically legal, but all the compilers I tried seemed to be okay with it.
Is this behavior reliable and well-defined as per the C++ standard?
The only indication I can find in the standard is from [expr.sizeof]/1 wherein it states:
... The sizeof operator shall not be applied to an expression that has function or incomplete type, to the parenthesized name of such types, or to a glvalue that designates a bit-field ...
However, it is unclear to me whether the wording "shall not be applied" implies that this is "invalid" for the purposes of substitution as per the rules in [temp], or whether it is simply ill-formed.
ℹ️ Note: This question is not directed at any particular version of the standard, but it would be interesting to compare if this has changed at any point.
"Shall not be applied" means that it would normally be ill-formed. In an SFINAE context, if something would normally be ill-formed due to resulting in "an invalid type or expression", this becomes a substitution failure, as long as it is in the "immediate context" (C++20 [temp.deduct]/8) and not otherwise excluded from SFINAE (e.g. see p9 regarding lambda expressions).
There is no difference between "invalid" and "ill-formed" in this context. p8 explicitly says: "An invalid type or expression is one that would be ill-formed, with a diagnostic required, if written using the substituted arguments." This wording has been present since C++11. However, in C++03, invalid expressions were not substitution failures. This is the famous "expression SFINAE" feature that was added in C++11, after compiler implementers were sufficiently convinced that they would be able to implement it.
There is no rule in the standard that says that sizeof expressions are an exception to the SFINAE rules, so as long as an invalid sizeof expression occurs in the immediate context, SFINAE applies.
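For reference, a minimal sketch of the technique under discussion (essentially the trait from Chen's post):

#include <type_traits>

// sizeof(T) is invalid for an incomplete T, so the partial specialization is
// discarded by SFINAE and the primary template (false) is chosen instead.
template <typename T, typename = void>
constexpr bool is_type_complete_v = false;

template <typename T>
constexpr bool is_type_complete_v<T, std::void_t<decltype(sizeof(T))>> = true;

struct Incomplete;   // declared, never defined
struct Complete {};

static_assert(!is_type_complete_v<Incomplete>, "");
static_assert(is_type_complete_v<Complete>, "");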
The "immediate context" has still not been explicitly defined in the standard. An answer by Jonathan Wakely, a GCC dev, explains the intent. Eventually, someone might get around to formally defining it in the standard.
However, in the case of incomplete types, the problem is that this technique is very dangerous. First, if the completeness check is performed twice in the same translation unit on the same type, the instantiation is only performed once; this implies that the second time it is checked, the result will be the same as the first time (e.g. still false, even if the type has since been completed), because is_type_complete_v<T> will simply refer to the previous instantiation. Chen's post appears to simply be wrong about this: GCC, Clang, and MSVC all behave the same way. See godbolt. It's possible that the behaviour was different on an older version of MSVC.
Second, if there is cross-translation-unit variance, that is, is_type_complete_v<T> is instantiated in one translation unit where it is false, and instantiated in another translation unit where it is true, the program is ill-formed, no diagnostic required. See C++20 [temp.point]/7.
For this reason, completeness checks are generally not done; instead, library implementers either say that you are allowed to pass incomplete types to their templates and they will work properly, or that you must pass a complete type but the behaviour is undefined if you violate this requirement, as it cannot be reliably checked at compile time.
One creative way around the template instantiation rules is to use a macro with __COUNTER__ to make sure that there is a fresh instantiation every time the type trait is used; the is_type_complete_v template also has to be defined with internal linkage, to avoid the issue of cross-TU variance. I got this technique from this answer. Unfortunately, __COUNTER__ is not in standard C++, but the technique should work on compilers that support it.
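A sketch of that workaround (non-standard __COUNTER__; helper names invented, with IS_COMPLETE as the macro referred to below):

#include <type_traits>

namespace {   // internal linkage, to sidestep cross-TU variance on the trait
    template <typename T, int /*Discriminator*/, typename = void>
    constexpr bool is_complete_impl = false;

    template <typename T, int Discriminator>
    constexpr bool is_complete_impl<T, Discriminator,
                                    std::void_t<decltype(sizeof(T))>> = true;
}

// Each expansion produces a distinct Discriminator, hence a fresh instantiation.
#define IS_COMPLETE(T) is_complete_impl<T, __COUNTER__>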
(I looked into whether the C++20 source_location feature can replace the non-standard __COUNTER__ in this technique. I think it can't, because IS_COMPLETE may be referenced from the same line and column but within two different template instantiations that somehow both decide to check the same type, which is incomplete in one and complete in the other.)
According to cppreference, C++20 supports floating-point parameters in templates.
I am, however, unable to find any compiler support information on that site, or on others.
Current gcc trunk simply accepts it; the others reject it.
I would just like to know whether this is a very low-priority feature and/or when to expect it to become commonly supported.
The only related thing I can find is:
P0732R2 Class types in non-type template parameters. Kudos if anyone could briefly explain that instead.
It seems that the real question that can be answered here is about the history of this feature, so that whatever compiler support can be understood in context.
Limitations on non-type template parameter types
People have been wanting class-type non-type template parameters for a long time. The answers there are somewhat lacking; what really makes support for such template parameters (really, of non-trivial user-defined types) complicated is their unknown notion of identity: given
struct A {/*...*/};
template<A> struct X {};
constexpr A f() {/*...*/}
constexpr A g() {/*...*/}
X<f()> xf;
X<g()> &xg=xf; // OK?
how do we decide whether X<f()> and X<g()> are the same type? For integers, the answer seems intuitively obvious, but a class type might be something like std::vector<int>, in which case we might have
// C++23, if that
using A=std::vector<int>;
constexpr A f() {return {1,2,3};}
constexpr A g() {
    A ret={1,2,3};
    ret.reserve(1000);
    return ret;
}
and it's not clear what to make of the fact that both objects contain the same values (and hence compare equal with ==) despite having very different behavior (e.g., for iterator invalidation).
P0732 Class types in non-type template parameters
It's true that this paper first added support for class-type non-type template parameters, in terms of the new <=> operator. The logic was that classes that defaulted that operator were "transparent to comparisons" (the term used was "strong structural equality") and so programmers and compilers could agree on a definition of identity.
P1185 <=> != ==
Later it was realized that == should be separately defaultable for performance reasons (e.g., it allows an early exit for comparing strings of different lengths), and the definition of strong structural equality was rewritten in terms of that operator (which comes for free along with a defaulted <=>). This doesn't affect this story, but the trail is incomplete without it.
P1714 NTTP are incomplete without float, double, and long double!
It was discovered that class-type NTTPs and the unrelated feature of constexpr std::bit_cast allowed a floating-point value to be smuggled into a template argument inside a type like std::array<std::byte,sizeof(float)>. The semantics that would result from such a trick would be that every representation of a float would be a different template argument, despite the fact that -0.0==0.0 and (given float nan=std::numeric_limits<float>::quiet_NaN();) nan!=nan. It was therefore proposed that floating-point values be allowed directly as template arguments, with those semantics, to avoid encouraging widespread adoption of such a hacky workaround.
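A sketch of that trick (FloatConstant and bytes_of are invented names; it assumes std::array is a structural type, which is true on the major implementations though not strictly guaranteed):

#include <array>
#include <bit>
#include <cstddef>
#include <type_traits>

template <std::array<std::byte, sizeof(float)> Bytes>
struct FloatConstant {
    // Recover the float from its object representation.
    static constexpr float value = std::bit_cast<float>(Bytes);
};

constexpr auto bytes_of(float f) {
    return std::bit_cast<std::array<std::byte, sizeof(float)>>(f);
}

// The values compare equal, but the specializations are distinct:
static_assert(FloatConstant<bytes_of(0.0f)>::value ==
              FloatConstant<bytes_of(-0.0f)>::value);
static_assert(!std::is_same_v<FloatConstant<bytes_of(0.0f)>,
                              FloatConstant<bytes_of(-0.0f)>>);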
At the time, there was a lot of confusion around the idea that (given template<auto> int vt;) x==y might differ from &vt<x>==&vt<y>, and the proposal was rejected as needing more analysis than could be afforded for C++20.
P1907R0 Inconsistencies with non-type template parameters
It turns out that == has a lot of problems in this area. Even enumerations (which have always been allowed as template parameter types) can overload ==, and using them as template arguments simply ignores that overload entirely. (This is more or less necessary: such an operator might be defined in some translation units and not others, or might be defined differently, or have internal linkage, etc.) Moreover, what an implementation needs to do with a template argument is canonicalize it: to compare one template argument (in, say, a call) to another (in, say, an explicit specialization) would require that the latter had somehow already been identified in terms of the former while somehow allowing the possibility that they might differ.
This notion of identity already differs from == for other types as well. Even P0732 recognized that references (which can also be the type of template parameters) aren't compared with ==, since of course x==y does not imply that &x==&y. Less widely appreciated was that pointers-to-members also violate this correspondence: because of their different behavior in constant evaluation, pointers to different members of a union are distinct as template arguments despite comparing ==, and pointers-to-members that have been cast to point into a base class have similar behavior (although their comparison is unspecified and hence disallowed as a direct component of constant evaluation).
In fact, in November 2019 GCC had already implemented basic support for class-type NTTPs without requiring any comparison operator.
P1837 Remove NTTPs of class type from C++20
These incongruities were so numerous that it had already been proposed that the entire feature be postponed until C++23. In the face of so many problems in so popular a feature, a small group was commissioned to specify the significant changes necessary to save it.
P1907R1 (structural types)
These stories about template arguments of class type and of floating-point type reconverge in the revision of P1907R0 which retained its name but replaced its body with a solution to National Body comments that had also been filed on the same subject. The (new) idea was to recognize that comparisons had never really been germane, and that the only consistent model for template argument identity was that two arguments were different if there was any means of distinguishing them during constant evaluation (which has the aforementioned power to distinguish pointers-to-members, etc.). After all, if two template arguments produce the same specialization, that specialization must have one behavior, and it must be the same as would be obtained from using either of the arguments directly.
While it would be desirable to support a wide range of class types, the only ones that could be reliably supported by what was a new feature introduced (or rather rewritten) at almost the last possible moment for C++20 were those where every value that could be distinguished by the implementation could be distinguished by its clients—hence, only those that have all public members (that recursively have this property). The restrictions on such structural types are not quite as strong as those on an aggregate, since any construction process is permissible so long as it is constexpr. It also has plausible extensions for future language versions to support more class types, perhaps even std::vector<T>—again, by canonicalization (or serialization) rather than by comparison (which cannot support such extensions).
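As a quick illustration of the adopted model (the names Point, Widget, and twice are invented), C++20 accepts:

struct Point { int x, y; };      // all members public: a structural type

template <Point P>
struct Widget {
    static constexpr int sum = P.x + P.y;
};

template <double D>              // floating-point NTTPs follow the same model
constexpr double twice = 2 * D;

static_assert(Widget<Point{1, 2}>::sum == 3);
static_assert(twice<1.5> == 3.0);
// twice<0.0> and twice<-0.0> are distinct variables: the value representations
// differ, even though 0.0 == -0.0.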
The general solution
This newfound understanding has no relationship to anything else in C++20; class-type NTTPs using this model could have been part of C++11 (which introduced constant expressions of class type). Support was immediately extended to unions, but the logic is not limited to classes at all; it also established that the longstanding prohibitions of template arguments that were pointers to subobjects or that had floating-point type had also been motivated by confusion about == and were unnecessary. (While this doesn't allow string literals to be template arguments for technical reasons, it does allow const char* template arguments that point to the first character of static character arrays.)
In other words, the forces that motivated P1714 were finally recognized as inevitable mathematical consequences of the fundamental behavior of templates and floating-point template arguments became part of C++20 after all. However, neither floating-point nor class-type NTTPs were actually specified for C++20 by their original proposals, complicating "compiler support" documentation.
Introduction
The standard specifies that each concept is related to two predicates:
predicate "is statisfied by": a concept is satisfied by a sequence of template argument when it evaluates to true. This is almost a syntactic check.
predicate "is modeled by": A sequence Args of template arguments is said to model a concept C if Args satisfies C ([temp.constr.decl]) and meets all semantic requirements (if any) given in the specification of C. [res.on.requirements]
For some concepts, the requirements that make a satisfied concept modeled are clearly expressed. Example: [concept.assignable]
LHS and RHS model assignable_from<LHS, RHS> only if
addressof(lhs = rhs) == addressof(lcopy)
[...]
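For instance, here is a sketch (the type is invented) that satisfies assignable_from without modeling it, because it violates exactly that semantic requirement:

#include <concepts>

struct Weird {
    Weird() = default;
    Weird& operator=(const Weird&) {
        static Weird other;
        return other;   // well-formed, but not a reference to *this
    }
};

static_assert(std::assignable_from<Weird&, const Weird&>);  // satisfied...
// ...but not modeled: for Weird a, b, addressof(a = b) != addressof(a)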
But I wonder whether the syntactic requirements also implicitly imply semantic requirements.
Question
Do the syntactic predicates implicitly imply requirements for the concept to be modeled?
I see two kinds of implicit requirements:
The concept is satisfied only because the syntactically checked expressions are unevaluated; if those expressions were actually evaluated, the program would be ill-formed.
The concept is satisfied because the syntactically checked expressions are not evaluated, but evaluating those expressions would result in the program having undefined behavior.
Examples
For example, let's consider the default_initializable concept, defined here: [concept.default.init].
default_initializable is satisfied by A<int> but the program is ill-formed if a variable of type A<int> is default-initialized (demo):
#include <concepts>

template <class T>
struct A {
    A() {
        f(T{});   // f is only looked up when the constructor body is instantiated
    }
};

static_assert(std::default_initializable<A<int>>); // A<int> satisfies default_initializable
A<int> a{}; // compile-time error: 'f' was not declared in this scope
default_initializable is satisfied by A, but default-initialization of A results in undefined behavior (when the default-initialization is not preceded by a zero-initialization) (demo):
struct A {
    int c;
    A() {
        c++;   // reads the indeterminate value of c
    }
};

static_assert(std::default_initializable<A>); // A satisfies default_initializable
auto p = new A; // undefined behavior: indeterminate value as operand of operator++
a concept is satisfied by a sequence of template arguments when it evaluates to true. This is almost a syntactic check.
No, it is not "almost" anything: it is a syntactic check. The constraints specified by a requires clause (for example) verify that a specific syntax is legal syntax for that type. This is all that "satisfying a concept" means.
Does the syntactic predicates implicitly imply requirement for the concept to be modeled?
... no. If satisfying a concept also implied modeling the concept, then the standard wouldn't need different terms for these.
The point of having such a distinction is the recognition that the concept language feature can't specify every requirement that concepts as a concept should encapsulate. So satisfying-a-concept is just the language part, while modelling-a-concept includes things that the language can't do.
But that question is kind of separate from what your two examples show. Your examples represent the difference between "valid syntax" and "can be compiled/executed". Satisfying a concept only cares about the former. And modelling a concept only cares about the latter to the extent that said semantic behavior is explicitly specified.
There is nothing in the standard about implicit semantic requirements. There is no statement to the effect of "all expressions/statements in a concept must be able to be compiled and/or executed in order for it to be modeled". Nor is it intended to.
However much we try to pretend it's more than this, concepts as it exists in C++20 is nothing more than a more convenient mechanism for performing SFINAE. SFINAE can't test compilable/executable validity of the contents of some expression, so neither can concepts. And neither does concepts attempt to pretend that it can.
I have a problem understanding what can and cannot be done using unions with GCC. I have read the questions about it (in particular here and here), but they focus on the C++ standard, and I feel there's a mismatch between the C++ standard and practice (the commonly used compilers).
In particular, I recently found confusing information in the GCC online docs while reading about the compilation flag -fstrict-aliasing. It says:
-fstrict-aliasing
Allow the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same.
For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.
Pay special attention to code like this:
union a_union {
    int i;
    double d;
};

int f() {
    union a_union t;
    t.d = 3.0;
    return t.i;
}
The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common.
Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
This is what I think I understood from this example and my doubts:
1) aliasing only works between similar types, or char
Consequence of 1): aliasing - as the word suggests - is when you have one value and two members to access it (i.e. the same bytes);
Doubt: are two types similar when they have the same size in bytes? If not, what are similar types?
Consequence of 1): for non-similar types (whatever that means), aliasing does not work;
2) type punning is when we read a different member than the one we wrote to; it's common and it works as expected as long as the memory is accessed through the union type;
Doubt: is aliasing a specific case of type-punning where types are similar?
I get confused because it says unsigned int and double are not similar, so aliasing does not work; then in the example it's aliasing between int and double and it clearly says it works as expected, but calls it type-punning:
not because types are or are not similar, but because it's reading from a member it did not write. But reading from a member it did not write is what I understood aliasing is for (as the word suggests). I'm lost.
The questions:
can someone clarify the difference between aliasing and type-punning and what uses of the two techniques are working as expected in GCC? And what does the compiler flag do?
Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as a different type.
Formally, type-punning is undefined behaviour, with only a few exceptions. It happens commonly when you fiddle with bits carelessly:
int mantissa(float f)
{
    return (int&)f & 0x7FFFFF; // Accessing a float as if it's an int
}
The exceptions are (simplified)
Accessing integers as their unsigned/signed counterparts
Accessing anything as a char, unsigned char or std::byte
This is known as the strict-aliasing rule: the compiler can safely assume two expressions of different types never refer to the same object (except for the exceptions above) because they would otherwise have undefined behaviour. This facilitates optimizations such as
void transform(float* dst, const int* src, int n)
{
    for(int i = 0; i < n; i++)
        dst[i] = src[i]; // Can be unrolled and use vector instructions
                         // If dst and src alias the results would be wrong
}
What gcc says is that it relaxes the rules a bit, and allows type-punning through unions even though the standard doesn't require it to:
union {
    int64_t num;
    struct {
        int32_t hi, lo;
    } parts;
} u = {42};
u.parts.hi = 420;   // after writing parts.hi, reading u.num would be the type-pun
This is the type-pun gcc guarantees will work. Other cases may appear to work but may one day silently be broken.
Terminology is a great thing, I can use it however I want, and so can everyone else!
are two types similar when they have the same size in bytes? If not, what are similar types?
Roughly speaking, types are similar when they differ by constness or signedness. Size in bytes alone is definitely not sufficient.
is aliasing a specific case of type-punning where types are similar?
Type punning is any technique that circumvents the type system.
Aliasing is a specific case of that which involves placing objects of different types at the same address. Aliasing is generally allowed when types are similar, and forbidden otherwise. In addition, one may access an object of any type through a char (or similar to char) lvalue, but doing the opposite (i.e. accessing an object of type char through a dissimilar type lvalue) is not allowed. This is guaranteed by both C and C++ standards, GCC simply implements what the standards mandate.
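A sketch of that asymmetry (hypothetical functions):

#include <cstddef>
#include <cstdio>

void dump(const float& f) {
    // OK: any object may be inspected through an unsigned char lvalue.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&f);
    for (std::size_t i = 0; i < sizeof f; ++i)
        std::printf("%02x ", p[i]);
}

float broken(const char* buf) {
    // Not OK: storage whose actual type is char may not be read
    // through a float lvalue.
    return *reinterpret_cast<const float*>(buf);
}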
GCC documentation seems to use "type punning" in a narrow sense of reading a union member other than the one last written to. This kind of type punning is allowed by the C standard even when types are not similar. OTOH the C++ standard does not allow this. GCC may or may not extend the permission to C++, the documentation is not clear on this.
Without -fstrict-aliasing, GCC apparently relaxes these requirements, but it isn't clear to what exact extent. Note that -fstrict-aliasing is the default when performing an optimised build.
Bottom line, just program to the standard. If GCC relaxes the requirements of the standard, it isn't significant and isn't worth the trouble.
In ANSI C (AKA C89) you have (section 3.3.2.3 Structure and union members):
if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined
In C99 you have (section 6.5.2.3 Structure and union members):
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
IOW, union-based type punning is allowed in C, although the actual semantics may be different, depending on the language standard supported (note that the C99 semantics is narrower than C89's implementation-defined behaviour).
In C99 you also have (section 6.5 Expressions):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
And there's a section (6.2.7 Compatible type and composite type) in C99 that describes compatible types:
Two types have compatible type if their types are the same. Additional rules for
determining whether two types are compatible are described in 6.7.2 for type specifiers,
in 6.7.3 for type qualifiers, and in 6.7.5 for declarators. ...
And then (6.7.5.1 Pointer declarators):
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Simplifying it a bit, this means that in C by using a pointer you can access signed ints as unsigned ints (and vice versa) and you can access individual chars in anything. Anything else would amount to aliasing violation.
You can find similar language in the various versions of the C++ standard. However, as far as I can see in C++03 and C++11 union-based type punning isn't explicitly allowed (unlike in C).
According to the footnote 88 in the C11 draft N1570, the "strict aliasing rule" (6.5p7) is intended to specify the circumstances in which compilers must allow for the possibility that things may alias, but makes no attempt to define what aliasing is. Somewhere along the line, a popular belief has emerged that accesses other than those defined by the rule represent "aliasing", and those allowed don't, but in fact the opposite is true.
Given a function like:
int foo(int *p, int *q)
{ *p = 1; *q = 2; return *p; }
Section 6.5p7 doesn't say that p and q won't alias if they identify the same storage. Rather, it specifies that they are allowed to alias.
Note that not all operations which involve accessing storage of one type as another represent aliasing. An operation on an lvalue which is freshly visibly derived from another object doesn't "alias" that other object. Instead, it is an operation upon that object. Aliasing occurs if, between the time a reference to some storage is created and the time it is used, the same storage is referenced in some way not derived from the first, or code enters a context wherein that occurs.
Although the ability to recognize when an lvalue is derived from another is a Quality of Implementation issue, the authors of the Standard must have expected implementations to recognize some constructs beyond those mandated. There is no general permission to access any of the storage associated with a struct or union by using an lvalue of member type, nor does anything in the Standard explicitly say that an operation involving someStruct.member must be recognized as an operation on a someStruct. Instead, the authors of the Standard expected that compiler writers who make a reasonable effort to support constructs their customers need should be better placed than the Committee to judge the needs of those customers and fulfill them. Since any compiler that makes an even-remotely-reasonable effort to recognize derived references would notice that someStruct.member is derived from someStruct, the authors of the Standard saw no need to explicitly mandate that.
Unfortunately, the treatment of constructs like:
actOnStruct(&someUnion.someStruct);
int q=*(someUnion.intArray+i);
has evolved from "It's sufficiently obvious that actOnStruct and the pointer dereference should be expected to act upon someUnion (and consequently all the members thereof) that there's no need to mandate such behavior" to "Since the Standard doesn't require that implementations recognize that the actions above might affect someUnion, any code relying upon such behavior is broken and need not be supported". Neither of the above constructs is reliably supported by gcc or clang except in -fno-strict-aliasing mode, even though most of the "optimizations" that would be blocked by supporting them would generate code that is "efficient" but useless.
If you're using -fno-strict-aliasing on any compiler having such an option, almost anything will work. If you're using -fstrict-aliasing on icc, it will try to support constructs that use type punning without aliasing, though I don't know if there's any documentation about exactly what constructs it does or does not handle. If you use -fstrict-aliasing on gcc or clang, anything at all that works is purely by happenstance.
I think it's good to add a complementary answer, simply because when I asked the question I did not know how to fulfill my needs without using a union: I got stubborn about using one because it seemed to answer my needs precisely.
The good way to do type punning and to avoid possible consequences of undefined behavior (depending on the compiler and other env. settings) is to use std::memcpy and copy the memory bytes from one type to another. This is explained - for example - here and here.
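For example, a minimal sketch of the std::memcpy approach (bits_of is an invented name):

#include <cstdint>
#include <cstring>

std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "size mismatch");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);   // copies the object representation; well-defined
    return u;                        // compilers typically optimize the copy away
}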
I've also read that often when a compiler produces valid code for type punning using unions, it produces the same binary code as if std::memcpy was used.
Finally, even if this information does not directly answer my original question it's so strictly related that I felt it was useful to add it here.
https://stackoverflow.com/a/3601661/368896 and others provide a definition of lvalue and rvalue (referencing the standard), as well as xvalue, glvalue, and prvalue.
However, the definition of rvalue refers to the definition of xvalue, and the definition of xvalue refers to "the result of certain kinds of expressions involving rvalue references".
Therefore, to understand the technical definition of rvalue, one must determine which kinds of "expressions involving rvalue references" are referred to in the above definition. I have taken the time to attempt to follow the standard (and different postings) to track through this definition. I would like a definition that does not require tracking through other definitions (except for more obvious, lower-level definitions).
Is this a wild goose chase? Is it possible to provide a concise, but rigorous, definition of the difference between an rvalue expression and an lvalue expression, that covers all possible cases, and that does not make reference to "the result of certain kinds of expressions involving rvalue references"? If so, what is such a definition?
What constitutes an rvalue is defined by a set of ad-hoc rules, so I don't think "concise" is going to be possible.
xvalues are fairly easy: they're "stuff that returns a T&&." This includes calls to functions returning T&&, explicit casts to T&&, and accessing a class data member of an xvalue (whether directly or via pointer-to-data-member).
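A few sketches of those cases (declarations only, for exposition):

#include <utility>

int&& f();              // the call expression f() is an xvalue
struct S { int m; };
S&& g();

void demo(S s) {
    int&& a = f();                      // function returning T&&
    int&& b = static_cast<int&&>(s.m);  // explicit cast to T&&
    int&& c = g().m;                    // member access on an xvalue object
    S&&  d = std::move(s);              // std::move is itself just such a cast
}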
The problem is prvalues, which are all over the place. You can get a general sense of what they are, but the specifics involve a lot of ad-hoc rules. Like the fact that return valueArgument; can treat valueArgument as an rvalue (so that it can be implicitly moved from), even though it's an lvalue everywhere else.
Basically, the standards committee went through the standard and looked for places where they wanted to implicitly move, and then said, "You're an rvalue."
So a "concise but rigorous" definition is not going to be likely.