Is strict aliasing is c or c++ thing?

Is strict aliasing is c or c++ thing? - c++

In ISO/IEC 9899:TC2, the standard says following
6.3.2.3 Pointers
A pointer to an object or incomplete type may be converted to a pointer to a different
object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undeﬁned. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of
the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
So, it is not clear from the standard that a pointer of one type can be casted to pointer of another type.

Strict aliasing rule is defined somewhere else. This is the wording:
C (ISO/IEC 9899:1999 6.5/7):
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object,
a qualiﬁed version of a type compatible with the effective type of
the object,
a type that is the signed or unsigned type corresponding to the
effective type of the object,
a type that is the signed or unsigned type corresponding to a
qualiﬁed version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
C++ (ISO/IEC 14882:2011 3.10 [basicl.lval] / 15):
If a program attempts to access the stored value of an object through
an lvalue of other than one of the following types the behavior is
undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the
dynamic type of the object,
a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned
types among its elements or non-static data members (including,
recursively, an element or non-static data member of a subaggregate
or contained union),
a type that is a (possibly cv-qualified) base class type of the
dynamic type of the object,
a char or unsigned char type.
The C standard doesn't prohibit you from casting the pointer to an unrelated type, provided there are no allignment problems. However, due to the strict aliasing rule, you basically can't dereference a pointer obtained from such a cast. So the only useful thing to do with such "invalid" pointer is to cast it back to the correct type (or a compatible type).
It's mostly the same in C++ with reinterpret_cast (5.2.10 [expr.reinterpret.cast] / 7):
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.

Related

Does casting to an unrelated reference type violate the strict aliasing rule?

The strict aliasing rule says
If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined:
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
— an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
I am wondering if the following program already contains undefined behavior and if there is "an access to the stored value":
#include <cstdint>
void foo() {
std::int32_t i = 1;
float f = 1.0f;
std::int32_t& r = reinterpret_cast<std::int32_t&>(f);
std::int32_t* p = reinterpret_cast<std::int32_t*>(&f);
}
From what I see, the cast of the float pointer to the int reference is equivalent to `*reinterpret_cast(&x):
A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly
converted to the type “pointer to T2” using a reinterpret_cast The
result refers to the same object as the source glvalue, but with the
specified type. [ Note: That is, for lvalues, a reference cast
reinterpret_cast(x) has the same effect as the conversion
*reinterpret_cast(&x) with the built-in & and * operators (and similarly for reinterpret_cast(x)). —end note ]
For pointers, reinterpret_cast boils down to conversion to void* and then to the target type:
An object pointer can be explicitly converted to an object pointer of a different type.72 When a prvalue v of object pointer type is
converted to the object pointer type “pointer to cv T”, the result is
static_cast(static_cast(v)).
The semantics of the two static casts are defined as:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2
is the same cv-qualification as, or greater cv-qualification than,
cv1. The null pointer value is converted to the null pointer value of
the destination type. If the original pointer value represents the
address A of a byte in memory and A satisfies the alignment
requirement of T, then the resulting pointer value represents the same
address as the original pointer value, that is, A. The result of any
other such pointer conversion is unspecified.
Since int32_t and float have same size and alignment, I should get a new pointer pointing to the same address. What I am wondering is if
a reference cast reinterpret_cast(x) has the same effect as the
conversion *reinterpret_cast(&x) with the built-in & and * operators
already constitutes an access to the stored value or if that must be made somewhere later to violate the strict aliasing rule.

The casts you made are not an access to the objects i or f.
3.1 access [defns.access]
⟨execution-time action⟩ to read or modify the value of an object
Since your program does not attempt to do the above, it does not violate strict aliasing. Trying to use those pointers/references to read or write will be a violation, however. So this is a fine line to tread. But merely obtaining the references/pointers is not against the first paragraph you stated. Casting to unrelated reference/pointer types only deals with an object's identity/address, and not its value.

reinterpret_cast of a reference to a shared_ptr

Let B be derived from class A. By reading various posts I've got an impression that casting like in
const std::shared_ptr<const A> a(new B());
const std::shared_ptr<const B>& b = reinterpret_cast<const std::shared_ptr<const B>&>(a);
is for some reason discouraged and that one should use reinterpret_pointer_cast instead. However, I would like to avoid creating a new shared_ptr for performance reasons. Is the above code legal? Does it lead to undefined behavior? It seems to work in gcc and in Visual Studio.

You want static_pointer_cast.
const std::shared_ptr<const A> a(new B());
const std::shared_ptr<const B> b = std::static_pointer_cast<const B>(a);
I highly doubt the above will cause any performance issues. But if you have evidence that a shared_ptr creates a performance problem, then fallback to the raw pointer:
const B* pB = static_cast<const B*>(a.get());
Another hint. Please try to avoid reinterpret_cast between classes with an inheritance relationship. In cases where there are virtual methods and/or multiple inheritance, the static_cast will correctly adjust the pointer offset to the correct vtable or base offset. But reinterpret_cast will not. (Or technically: undefined behavior)

reinterpret_cast will usually lead to UB. Sometimes you are willing to take the risk of using it, for performances reasons, but you will try to avoid this kind of thing as much as you can. In this case, it better for you to use static_pointer_cast.
Pay attention, that even if you don't know, in this case, which other cast can you use, and you willing to take the risk with reinterpret_cast, you must use some validations after and before the casting- otherwise you will be able to get a lot of errors, and a lot of time spending.

First, you create an object a of type const std::shared_ptr<const A> a and initialize it with a pointer to some type B. This only works if you can assign a B* to an A*, so there should be a relationship such as inheritance. Ignoring this, you convert an object of some type to a reference to another type with reinterpret_cast:
A glvalue expression of type T1 can be cast to the type “reference to
T2” if an expression of type “pointer to T1” can be explicitly
converted to the type “pointer to T2” using a reinterpret_cast
The result refers to the same object as the source glvalue, but with
the specified type. [ Note: That is, for lvalues, a reference cast
reinterpret_cast(x) has the same effect as the conversion
*reinterpret_cast(&x) with the built-in & and * operators (and similarly for reinterpret_cast(x)). —end note ]
For pointers, reinterpret_cast boils down to conversion to void* and then to the target type:
An object pointer can be explicitly converted to an object pointer of
a different type.72 When a prvalue v of object pointer type is
converted to the object pointer type “pointer to cv T”, the result is
static_cast<cv T*>(static_cast<cv void*>(v)).
The semantics of the two static casts are defined as:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue
of type “pointer to cv2 T,” where T is an object type and cv2 is the
same cv-qualification as, or greater cv-qualification than, cv1. The
null pointer value is converted to the null pointer value of the
destination type. If the original pointer value represents the address
A of a byte in memory and A satisfies the alignment requirement of T,
then the resulting pointer value represents the same address as the
original pointer value, that is, A. The result of any other such
pointer conversion is unspecified.
The platform I am working on has near and far pointers which are 16 or 32 bit. In that case, the types shared_ptr<A> and shared_ptr<B> are of different size and alignment, and casting one into the other is then unspecified behavior. If alignment matches, the result of the static casts is defined.
However, the first clause about reinterpret_cast to a reference also contains a note
[ Note: That is, for lvalues, a reference
cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with
the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). —end note ]
so basically, the cast is semantically identical to a pointer conversion with immediate dereferencing. Even if the pointers are of identical size (and compatible alignment), using the casted pointer will violate the strict alias rule since dereferencing is an access.
If a program attempts to access the stored value of an object
through a glvalue of other than one of the
following types the behavior is undefined:53
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type
of the object,
— an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic
data members (including, recursively, an element or non-static data member of a subaggregate
or contained union),

Using functionalities of a shared_ptr, or any other standard class or template, is only defined when calling the functions (including member functions) for the class of the type which you pass to the function (including as implicit this argument):
Nothing in the standard defines what happens when you call a standard function expecting a Foo and passing a Bar, for any two standard types Foo and Bar (or even for user types.)
That's not defined; that's undefined. By not meeting the most basic precondition: to use arguments of the correct type.

Is it a strict aliasing violation to alias a struct as its first member?

Sample code:
struct S { int x; };
int func()
{
S s{2};
return (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
}
I believe this is common and considered acceptable. The standard does guarantee that there is no initial padding in the struct. However this case is not listed in the strict aliasing rule (C++17 [basic.lval]/11):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
(11.1) the dynamic type of the object,
(11.2) a cv-qualified version of the dynamic type of the object,
(11.3) a type similar (as defined in 7.5) to the dynamic type of the object,
(11.4) a type that is the signed or unsigned type corresponding to the dynamic type of the object,
(11.5) a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
(11.6) an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
(11.7) a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
(11.8) a char, unsigned char, or std::byte type.
It seems clear that the object s is having its stored value accessed.
The types listed in the bullet points are the type of the glvalue doing the access, not the type of the object being accessed. In this code the glvalue type is int which is not an aggregate or union type, ruling out 11.6.
My question is: Is this code correct, and if so, under which of the above bullet points is it allowed?

The behaviour of the cast comes down to [expr.static.cast]/13;
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original
pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T , then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
The definition of pointer-interconvertible is:
Two objects a and b are pointer-interconvertible if:
they are the same object, or
one is a union object and the other is a non-static data member of that object, or
one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object, or
there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
So in the original code, s and s.x are pointer-interconvertible and it follows that (int &)s actually designates s.x.
So, in the strict aliasing rule, the object whose stored value is being accessed is s.x and not s and so there is no problem, the code is correct.

I think it's in expr.reinterpret.cast#11
A glvalue expression of type T1, designating an object x, can be cast
to the type “reference to T2” if an expression of type “pointer to T1”
can be explicitly converted to the type “pointer to T2” using a
reinterpret_cast. The result is that of *reinterpret_cast<T2 *>(p)
where p is a pointer to x of type “pointer to T1”. No temporary is
created, no copy is made, and no constructors or
conversion functions are called [1].
[1] This is sometimes referred to as a type pun when the result refers to the same object as the source glvalue
Supporting #M.M's answer about pointer-incovertible:
from cppreference:
Assuming that alignment requirements are met, a reinterpret_cast does
not change the value of a pointer outside of a few limited cases
dealing with pointer-interconvertible objects:
struct S { int a; } s;
int* p = reinterpret_cast<int*>(&s); // value of p is "pointer to s.a" because s.a
// and s are pointer-interconvertible
*p = 2; // s.a is also 2
versus
struct S { int a; };
S s{2};
int i = (int &)s; // Equivalent to *reinterpret_cast<int *>(&s)
// i doesn't change S.a;

The cited rule is derived from a similar rule in C89 which would be nonsensical as written unless one stretches the meaning of the word "by", or recognizes what "Undefined Behavior" meant when C89 was written. Given something like struct S {unsigned dat[10];}s;, the statement s.dat[1]++; would clearly modify the stored value of s, but the only lvalue of type struct S in that expression is used solely for the purpose of producing a value of type unsigned*. The only lvalue which is used to modify any object is of type int.
As I see it, there are two related ways of resolving this issue: (1) recognizing that the authors of the Standard wanted to allow cases where an lvalue of one type was visibly derived from one of another type, but didn't want to get hung up on details of what forms of visible derivation must be accounted for, especially since the range of cases compilers would need to recognize would vary considerably based upon the styles of optimization they performed and the tasks for which they were being used; (2) recognizing that the authors of the Standard had no reason to think it should matter whether the Standard actually required that a particular construct be processed usefully, if it would be have been clear to everyone that there was reason to do otherwise.
I don't think there has consensus among the Committee members over whether a compiler given something like:
struct foo {int ct; int *dat;} it;
void test(void)
{
for (int i=0; i < it.ct; i++)
it.dat[i] = 0;
}
should be required to ensure that e.g. after it.ct = 1234; it.dat = &it.ct;, a call to test(); would zero it.ct and have no other effect. Parts of the Rationale would suggest that at least some committee members would have expected so, but the omission of any rule that would allow for an object of structure type to be accessed using an arbitrary lvalue of member type suggests otherwise. The C Standard has never really resolved this issue, and the C++ Standard cleans things up somewhat but doesn't really solve it either.

Does reinterpret_cast lead to undefined behavior?

I have a class template A which contains a container of pointers (T*):
template <typename T>
class A {
public:
// ...
private:
std::vector<T*> data;
};
and a bunch of functions like:
void f(const A<const T>&);
void g(const A<const T>&);
Is it OK to call these functions via a cast from A<const T> to A<T>?
A<double> a;
...
auto& ac = reinterpret_cast<const A<const double>&>(a);
f(ac);
I'm pretty sure that this code has undefined behaviour.
Is it dangerous to use such conversions in real life?

Although the reinterpret_cast itself might be unspecified behaviour, attempting to access the parameters once you've done the cast is undefined behaviour.
N3337 [basic.lval]/10:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type
of the object,
— an aggregate or union type that includes one of the aforementioned types among its elements or non-
static data members (including, recursively, an element or non-static data member of a subaggregate
or contained union),
— a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
— a char or unsigned char type.
Your example is none of the above.

As A<double> and A<const double> are unrelated types, it's actually unspecified (originally I thought undefined) behavior and correspondingly yes it's a bad idea to use in real life: You never know what system(s) or compiler(s) you may port to that change the behavior is strange ways.
Reference:
5.2.10/11:
An lvalue expression of type T1 can be cast to the type “reference to
T2” if an expression of type “pointer to T1” can be explicitly
converted to the type “pointer to T2” using a reinterpret_cast. That
is, a reference cast reinterpret_cast(x) has the same effect as
the conversion *reinterpret_cast(&x) with the built-in & and *
operators (and similarly for reinterpret_cast(x)).
So they've redirected us to an earlier section 5.2.10/7:
An object pointer can be explicitly converted to an object pointer of
a different type. ... ... Converting a prvalue of type
“pointer to T1” to the type “pointer to T2” (where T1 and T2 are
object types and where the alignment requirements of T2 are no
stricter than those of T1) and back to its original type yields the
original pointer value. The result of any other such pointer
conversion is unspecified.
If f and g are algorithms that work on containers, the easy solution is to change them to template algorithms that work on ranges (iterator pairs).

Where does the C++ standard allow pointers to undefined types?

Where in the C++ spec is this allowed? It's cool. I want to know how this is spec'd. I didn't realize the spec allowed having a pointer to a undefined type.
class T;
int main(int argc, char* argv[])
{
int x;
T* p = reinterpret_cast<T*>(&x);
return 0;
}

What you're doing may be legal, but that depends on the real definition of class T, so we can't know for sure based on the code you've shown.
Starting with C++11 §3.9.2/3:
Pointers to incomplete types are allowed although there are restrictions on what can be done with them.
Those restrictions are listed in §3.2/4:
A class type T must be complete if:
an object of type T is defined, or
a non-static class data member of type T is declared, or
T is used as the object type or array element type in a new-expression, or
an lvalue-to-rvalue conversion is applied to a glvalue referring to an object of type T, or
an expression is converted (either implicitly or explicitly) to type T, or
an expression that is not a null pointer constant, and has type other than void*, is converted to the type pointer to T or reference to T using an implicit conversion, a dynamic_cast or a static_cast, or
a class member access operator is applied to an expression of type T, or
the typeid operator or the sizeof operator is applied to an operand of type T, or
a function with a return type or argument type of type T is defined or called, or
a class with a base class of type T is defined, or
an lvalue of type T is assigned to, or
the type T is the subject of an alignof expression, or
an exception-declaration has type T, reference to T, or pointer to T.
The 6th bullet appears pertinent here, as we can see in §5.2.10/7 that a reinterpret_cast between pointer types is defined in terms of static_cast:
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.
But because reinterpret_cast static_casts to void* first, then to the real resulting pointer type, that 6th bullet doesn't apply.
So, so far you're still in legal territory, despite T being an incomplete type. However, if it turns out that T is not a standard-layout type or has stricter alignment requirements than int, then the last sentence in §5.2.10/7 holds true and you're invoking UB.

The only thing you're allowed to do after casting a pointer to an unrelated type (other than char*), is cast back to the original pointer type.
#cli_hlt beat me to providing the section.
Here's the rule itself:
An object pointer can be explicitly converted to an object pointer of a different type.
When a prvalue v of type "pointer to T1" is converted to the type "pointer to cv T2", the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type
"pointer to T1" to the type "pointer to T2" (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.
The strict aliasing rule prohibits access to an object through an unrelated type. See https://stackoverflow.com/a/7005988/103167
Another rule somewhat related to your question is found in section 5.4:
The operand of a cast using the cast notation can be a prvalue of type "pointer to incomplete class type". The destination type of a cast using the cast notation can be "pointer to incomplete class type" If both the operand and destination types are class types and one or both are incomplete, it is unspecified whether the static_cast or the reinterpret_cast interpretation is used, even if there is an inheritance relationship
between the two classes.

Section 5.2.10 (7) for your case (as of ISO/IEC14882:1998(E), as well as in the 2011 FDIS).

I'm not sure what you perceive as being allowed. There is a guarantee that you can reinterpret_cast from one pointer type to a sufficiently big other pointer type and back to the original type again and will be the original pointer. The specification for this is in 5.2.10 [expr.reinterpret.cast]. That is, the following is guaranteed to work:
T* ptr = ...;
S* sptr = reinterpret_cast<S*>(ptr);
T* tptr = reinterpret_cast<T*>(sptr);
assert(ptr == tptr);
We had an out of session discussion earlier this week on this topic: if the Death Station 9000 implementation (which would be a conforming implementation of C++ but also tries to break user code wherever it is allowed to do so) XORs the bit pattern of the pointer with a bit pattern randomly chosen at the beginning of the program execution, would this be a permissible implementation if the types involved in the reinterpret_cast<T>(x) are not involving char. The consensus was that this would be OK.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Is strict aliasing is c or c++ thing? - c++

Related

Does casting to an unrelated reference type violate the strict aliasing rule?

reinterpret_cast of a reference to a shared_ptr

Is it a strict aliasing violation to alias a struct as its first member?

Does reinterpret_cast lead to undefined behavior?

Where does the C++ standard allow pointers to undefined types?

Categories

Resources