Strict aliasing rule violation - c++

In the example at this link from the isocpp.org FAQ, a Fred object is constructed with placement new in a buffer that was allocated for another object, i.e.
char memory[sizeof(Fred)]
As far as I know, the strict aliasing rule allows the opposite: for an object of any type, we may have a char* point at it, dereference that pointer, and use it as we want.
But in the example the reverse is happening. What am I missing?

The strict aliasing rule doesn't say that a Fred* must be cast to char*, only that variables of type char* and Fred* may point to the same object and be used to access it.
Quoting [basic.lval] paragraph 8
If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined:
the dynamic type of the object,
[..]
a char or unsigned char type.

Placement-new creates a new object. It doesn't alias the old object. The old object (the char array in this example) is considered to stop existing when the placement-new executes.
Before placement-new, there is storage filled with char objects. After placement-new, there is storage filled with one Fred object.
Since there is no aliasing, there are no strict-aliasing problems.
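The point above can be illustrated with a minimal sketch. The Fred class here is a hypothetical stand-in (the FAQ's real definition is not shown in this thread), and construct_in_buffer is a name made up for illustration. The object is accessed only through the pointer returned by placement new, so it is never accessed through a type other than its own.

```cpp
#include <cassert>
#include <new>  // placement new

// Hypothetical stand-in for the FAQ's Fred class (assumption).
struct Fred {
    int value;
    explicit Fred(int v) : value(v) {}
};

int construct_in_buffer() {
    // Raw storage, sized and aligned for one Fred.
    alignas(Fred) char memory[sizeof(Fred)];

    // Placement new creates a Fred object in that storage and
    // returns a pointer to it.
    Fred* f = new (memory) Fred(42);

    // The Fred is accessed through a Fred lvalue, its dynamic type.
    int result = f->value;

    // Objects created with placement new are destroyed explicitly.
    f->~Fred();
    return result;
}
```

Keeping the pointer returned by the new-expression, rather than casting memory to Fred* afterwards, avoids any question about whether the cast pointer refers to the new object.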

Related

Is it possible for implicit object creation to not create objects in certain situations?

According to [intro.object]/10:
Some operations are described as implicitly creating objects within a specified region of storage. For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types ([basic.types.general]) in its specified region of storage if doing so would result in the program having defined behavior. If no such set of objects would give the program defined behavior, the behavior of the program is undefined. If multiple such sets of objects would give the program defined behavior, it is unspecified which such set of objects is created.
it can choose not to create objects if that would make the program legal.
Consider the following code (from this question):
void* p = operator new(1);
new (p) char;
delete static_cast<char*>(p);
operator new implicitly creates objects, according to [intro.object]/13:
Any implicit or explicit invocation of a function named operator new or operator new[] implicitly creates objects in the returned region of storage and returns a pointer to a suitable created object.
It also follows [intro.object]/10, so consider two options:
1. Implicitly creates a char object. See this answer.
2. Does not create a char object. First, p points to uninitialized memory allocated by operator new. Then a char object is explicitly created in it by placement new. Finally, p is used in the delete-expression.
The question is whether option 2 is legal, that is, whether p automatically points to the object created by placement new.
The standard rules at [basic.life]/8:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object.
But no object occupies the memory pointed to by p. Therefore, no replacement takes place, [basic.life]/8 doesn't apply in this case.
I didn't find a stipulation in the standard. So, does the standard allow option 2?
"it can choose not to create objects if that would make the program legal."
Um, no; it cannot. The literal text you quoted says "that operation implicitly creates and starts the lifetime of ... if doing so would result in the program having defined behavior." There is no conditional here, no choice about it. So if the program would otherwise have undefined behavior, and creating a particular object type would give it defined behavior, IOC will do that.
It is not optional; if it is available, it must happen.
Option 1 is not only possible, it is required. It is required because option 2 is undefined behavior.
Let's take it step by step.
void* p = operator new(1);
If IOC does not happen, then p does not point to an object. It merely points to memory. That's just what operator new does.
new (p) char;
This creates a char object in the storage pointed to by p. However, this does not cause p to point to that object. I can't cite a portion of the standard because the standard doesn't say what doesn't happen. [expr.new] explains how new expressions work, and nothing there, or in the definition of the placement-new functions, says that it changes the nature of the argument that it is given.
Therefore, p continues to point at storage and not an object in that storage.
delete static_cast<char*>(p);
This is where UB happens. [expr.delete]/11 tells us:
For a single-object delete expression, the deleted object is the object denoted by the operand if its static type does not have a virtual destructor, and its most-derived object otherwise.
Well, p does not denote an object at all. It just points to storage. static_cast<char*>(p) does not have the power to reach into that storage and get a pointer to an object in that storage. At least, not unless the object is pointer-interconvertible with the object pointed to by p. But since p doesn't point to an object at all at this point... the cast cannot change this.
Therefore, we have achieved undefined behavior.
However, this is where IOC gets triggered. There is a way for IOC to fix this. If operator new(1) performed implicit object creation, and that object would be a char, then this expression would return a pointer to that char. This is what the "pointer to a suitable created object" wording you cited means.
In this case, p points to an actual object of type char.
Then we create a new char, which overlays the original char. Because of that, [basic.life]/8 is triggered. This defines a concept called "transparently replaceable". The new object transparently replaces the old one, and therefore, pointers/references to the old object transparently become pointers/references to the new one:
a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object
Therefore, from this point onward, p points to the new char.
So the static_cast will now return a char* that points to a char, which delete can delete with well-defined behavior. Since the code otherwise would have UB and IOC can fix that UB... it must do so.
Now, if your example had been:
void* p = operator new(1);
auto p2 = new (p) char;
delete p2;
Does an object get created in p? The answer is... it doesn't matter. Because you never use p in a way that requires it to point to a created object, this program has well-defined behavior either way. So the question is irrelevant.
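The two variants discussed in this answer can be put side by side as a sketch. The function names are hypothetical. Per the reasoning above, the first variant is well-defined only because implicit object creation makes p point to a char, while the second never depends on that.

```cpp
#include <cassert>
#include <new>

// Variant from the question: the original pointer p is reused.
// Per the answer, this relies on operator new implicitly creating a
// char that the placement-new'd char transparently replaces.
char via_original_pointer() {
    void* p = ::operator new(1);
    new (p) char('a');
    char* c = static_cast<char*>(p);
    char result = *c;
    delete c;
    return result;
}

// Variant from the answer: only the pointer returned by placement
// new is used, so it does not matter what p pointed to.
char via_returned_pointer() {
    void* p = ::operator new(1);
    char* p2 = new (p) char('b');
    char result = *p2;
    delete p2;
    return result;
}
```

When there is a choice, the second form is the easier one to reason about, since its validity does not hinge on [intro.object]/10.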
In the case where the observable effect of both options is the same, both are permitted because of the “as-if rule”:
Allows any and all code transformations that do not change the observable behavior of the program.
The as-if rule

Is it valid to cast and access to implicit-lifetime types without explicit object creation?

char* t = (char*)malloc(sizeof(float) * 2);
*(float*)t = 1.0f; // or *reinterpret_cast<float*>(t) = 1.0f;
*((float*)t + 1) = 2.0f; // #1
In some SO questions, there are answers saying that above code is undefined behaviour because of strict-aliasing violation.
But, I read the paper P0593 recently.
I think the paper is saying that if you allocate or obtain storage using certain operations
(such as defining a char/byte array, malloc, operator new, ...),
you can use the storage as implicit-lifetime types without explicit object creation, because the objects you need are created implicitly.
If my understanding is correct, does the above code no longer violate the strict-aliasing rule?
In the above code, is a float array object created implicitly?
(If not, #1 is UB because it does pointer arithmetic on storage that is not an array.)
(If you can't understand what i'm saying, tell me plz.. I'm not good at English..)
Yes, the code is legal, and the objects are created implicitly. (since C++20)
I had doubts whether you need std::launder or not. Seems not, malloc does it implicitly (note "return a pointer to a suitable created object").
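A minimal sketch of the pattern this answer describes, assuming a C++20 compiler. The function name sum_two_floats is made up for illustration. Under the C++20 rules, malloc implicitly creates a float[2] here because doing so gives the later accesses defined behavior.

```cpp
#include <cassert>
#include <cstdlib>

float sum_two_floats() {
    // In C++20, malloc is one of the operations that implicitly
    // create objects and return a pointer to a suitable created object.
    void* raw = std::malloc(sizeof(float) * 2);
    if (raw == nullptr) {
        return 0.0f;  // allocation failed; nothing to do in this sketch
    }

    float* f = static_cast<float*>(raw);
    f[0] = 1.0f;   // float is an implicit-lifetime type
    f[1] = 2.0f;   // pointer arithmetic within the implicitly created array

    float total = f[0] + f[1];
    std::free(raw);
    return total;
}
```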

What are the requirements and restrictions on "pointer to suitable created object" in C++ 20 standard?

First I need to note that I have asked a similar question several times, and last time I have got an almost satisfactory answer.
However, the wording of the c++20 standard (draft) is not clear to me, and I am curious whether it is a defect within the standard or in the comprehension of the standard.
C++20 standard introduced the notion of implicit-lifetime types.
Scalar types, implicit-lifetime class types ([class.prop]), array types, and cv-qualified versions of these types are collectively called implicit-lifetime types.
Further the standard (draft) says https://eel.is/c++draft/basic.memobj#intro.object-10
Some operations are described as implicitly creating objects within a
specified region of storage. For each operation that is specified as
implicitly creating objects, that operation implicitly creates and
starts the lifetime of zero or more objects of implicit-lifetime types
([basic.types]) in its specified region of storage if doing so would
result in the program having defined behavior. If no such set of
objects would give the program defined behavior, the behavior of the
program is undefined. If multiple such sets of objects would give the
program defined behavior, it is unspecified which such set of objects
is created.
[Note 3: Such operations do not start the lifetimes of subobjects of
such objects that are not themselves of implicit-lifetime types.— end
note]
and https://eel.is/c++draft/basic.memobj#intro.object-11.
Further, after implicitly creating objects within a specified region
of storage, some operations are described as producing a pointer to a
suitable created object. These operations select one of the
implicitly-created objects whose address is the address of the start
of the region of storage, and produce a pointer value that points to
that object, if that value would result in the program having defined
behavior.
If no such pointer value would give the program defined behavior, the
behavior of the program is undefined.
If multiple such pointer values would give the program defined
behavior, it is unspecified which such pointer value is produced.
The paper P0593R6
Implicit creation of objects for low-level object manipulation provides motivation and more examples for these.
One of the motivations for introduction of implicit-lifetime objects into standard was dynamic construction of arrays in order to make the pointer arithmetic in the example non-UB.
One of the proposed changes to the standard 5.9. 16.5.3.5 Table 34: Cpp17Allocator requirements, (which is the same as Example 1 in allocator requirements ) gives the following example.
Example 1: When reusing storage denoted by some pointer value p,
launder(reinterpret_cast<T*>(new (p)byte[n * sizeof(T)])) can be used to implicitly create a suitable
array object and obtain a pointer to it.
Which basically implies that the idea is to make the syntax (T*)malloc(...) and (T*)(::operator new(... ) ) (and similar cases) valid for subsequent pointer arithmetic.
I also understand the idea of "superposition" - that the suitable created object to which we return the pointer is determined on the usage, as long as an object that allows to use that pointer "legally" exists, we are fine.
Hence, I believe that the following snippets are fine
int* numbers = (int*)::operator new(n * sizeof(int) );
// Fine. Operator new implicitly creates an int array of a size smaller than `n` that is determined later
// and returns a pointer to the first element. Both byte and byte array are implicitly constructed, so no
// problem here.
// The pointer to suitable created object points at numbers[0], which is implicitly constructed, so no
// problem here.
A slightly modified example from the cited paper..
alignas(int) alignas(float) char buffer[std::max(sizeof(int), sizeof(float))];
*(int*)buffer = 5;
// First usage implicitly created an int and we have a pointer to that int, (int*)buffer is the pointer
// to suitable created object.
new (buffer) std::byte[sizeof(buffer)];
// ends the lifetime of the int and implicitly creates a float instead of
// it since next usage (float*) buffer implies implicit float creation
*(float*)buffer = 2.0f; // (float*)buffer is the pointer to the suitable created object (float) that was
// implicitly created by previous statement
However, note that we have not had a problem yet: all the types are implicit-lifetime types, so returning a pointer to the first element of the array is fine, since that element has itself been implicitly created.
But what do we do in order to create an array of type T if T is not an implicit-lifetime type?
Well, that should not be a problem.
auto memory = ::operator new(sizeof(T) * n, std::align_val_t{alignof(T)});
// implicitly creates a T array of size n.
auto arr_ptr = reinterpret_cast<T(*)[n]>(memory); // returns a pointer to that array
auto arr = *arr_ptr; // dereferences the pointer to our implicitly constructed array, and the result
// should be valid
However, there is a problem with this construct.
First of all - one of the motivations in the proposal paper was to make the usage of casting to T* valid, not to T(*)[] or T(*)[n].
The examples in the standard and the proposal paper suggest that cast to T* is a valid usage.
Let's look at the following snippet.
T* arr = (T*)::operator new(sizeof(T) * n, std::align_val_t{alignof(T)});
Now we have a problem. arr points at an object that is not constructed. No matter how we look at it, no matter what the size of the implicitly constructed T array is, its first element is not an implicit-lifetime object and therefore a pointer value that points at arr[0] is not a pointer to one of the implicitly-created objects whose address is the address of the start of the region of storage, hence the requirement for a pointer to suitable created object is not satisfied. I believe that same reasoning makes the example in allocator.requirements invalid. More than that - if n in example is replaced with 1, we basically produce a pointer to T object of non implicit-lifetime type (it is not even array, but just a pointer to T that has not yet been constructed) which is just plain wrong.
So where is the problem? Is the example indeed invalid? But then the example in the proposal paper is still invalid (UB), and making it valid (not UB) was one of the motivations for introducing implicit-lifetime types into the standard.
So, how is this conflict resolved? Is there an "error" in the wording of the standard, so that a pointer to a suitable created object does not actually have to point to an object, and any valid pointer value that results in defined behavior is fine (a pointer to an array element whose lifetime has not yet begun is a valid value for pointer arithmetic as far as I am aware, and pointer arithmetic is what we want here, after all)? Or is a pointer to the first element, before the beginning of its lifetime, a pointer to a suitable created object (despite the element not having an implicit-lifetime type) for reasons I do not understand? Or are the examples in the motivation and the allocator.requirements snippet wrong, and despite being one of the main motivations for introducing implicit-lifetime types into the standard, they are still UB? Something else?
Disclaimer: Language-lawyer type of question.
The explanation in the answer to the cited question is fine in practice. Besides that, we have the standard library allocator facilities that allow creation of uninitialized arrays, which should cover most cases. After all, casting to (T*) (maybe along with some laundering) works in practice, and the intention of the proposal paper is to make it work. I believe that is the intention of the standard's authors as well. This question is about how the standard resolves this contradiction.
PS:
I do not mind if the answer to this question is merged with the cited one (as long as my question here gets answered), but I do believe that these questions are different - that one asked how to create an array without initializing its elements and this one asks about something that is (I believe so at least) an ambiguity (or contradiction) in the c++20 standard (well draft of it) and how to resolve it.
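For the practical side mentioned in the question (the standard library allocator facilities), here is a sketch of how an array of a non-implicit-lifetime type can be built without relying on the disputed wording. It assumes C++17 or later, uses std::string as an example of such a type, and build_and_join is a hypothetical name. It does not settle the language-lawyer question; it only sidesteps it by starting every element's lifetime explicitly before use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

std::string build_and_join(std::size_t n) {
    std::allocator<std::string> alloc;

    // Obtain storage for n strings; no std::string object is alive yet.
    std::string* arr = alloc.allocate(n);

    // Explicitly start the lifetime of each element.
    std::uninitialized_value_construct_n(arr, n);

    // The elements are now ordinary std::string objects.
    for (std::size_t i = 0; i < n; ++i) {
        arr[i] = std::to_string(i);
    }
    std::string joined;
    for (std::size_t i = 0; i < n; ++i) {
        joined += arr[i];
    }

    // End the element lifetimes, then release the storage.
    std::destroy_n(arr, n);
    alloc.deallocate(arr, n);
    return joined;
}
```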

Can std::aligned_storage only validly be used through placement new?

Have been reading a few strict aliasing questions, such as Cast array of bytes to POD or Aliasing `T*` with `char*` is allowed. Is it also allowed the other way around?
From these I gather that the only legal way to access a memory location declared to be any type (specifically also (array of) char) as another type is to invoke placement new on it, as that would change the dynamic type.
Since std::aligned_storage normally has to have an underlying type other than the intended use type, it seems to me it is impossible to use the storage without invoking placement new on it first.
So I would not be allowed to create aligned_storage for, e.g. a double and use it as a double via pointer casting? Or rather, before I would be allowed to access the memory as double via pointer cast, I'd have to do a placement new on it, "turning it into" a dynamic object of type double?
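The placement-new approach described in the question might look like the sketch below. The function name use_aligned_storage is made up for illustration. Note that std::aligned_storage is deprecated as of C++23, so a compiler in that mode may warn; an alignas-qualified unsigned char array is a common substitute.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

double use_aligned_storage() {
    // Uninitialized storage with the size and alignment of a double.
    std::aligned_storage_t<sizeof(double), alignof(double)> storage;

    // Placement new creates a double in the storage; the returned
    // pointer refers to that object.
    double* d = new (&storage) double(3.5);

    // Access goes through a double lvalue, the object's dynamic type.
    return *d;
}
```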

Does access by reference violate strict aliasing rule?

I know that int* ptr = (int*)buffer (where buffer is char*) breaks the strict-aliasing rule.
Does this syntax int& ref = (int&)(*buffer) also break the rule?
I had some SEGFAULTs due to violation of the strict aliasing rule, and this syntax has eliminated them. It is probably still incorrect, though, isn't it?
This is not ok (assuming you're going to use said reference to access the value). § 3.10 [basic.lval] ¶ 10 of the C++14 standard (quoting N4140) says (emphasis mine):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
It doesn't matter whether you attempt the access via a pointer or a reference. For a stored object of type char, none of the bullet points permits accessing it as an int.
The last bullet point only says that you may alias any other type as char but not vice versa. It makes sense because a char is the smallest addressable unit with the weakest alignment requirements.
If you want, using a pointer is the same as using a reference except that you need to dereference explicitly in order to access the value.
Strict aliasing rules mean that you should not dereference pointers of different types pointing to the same memory location.
Since in your posted code you never dereference, it's not possible to tell if this violates the rule without seeing all the code.
Also, aliasing to the char* type is an exception and does not violate the rule. Which means you can access a memory location containing any type by converting its pointer to char*, and dereferencing it.
To conclude:
If buffer points to a memory location that contains an int, and was converted from int* to char*, this is valid. However, you should use reinterpret_cast for this.
If buffer points to a memory location that contains chars, dereferencing the int* ptr does violate the rule.
The reference version is likely to suffer from the same problem, but the compiler has no obligation to prevent this or warn you about it.
Don't use C style casts, use reinterpret_cast instead, and read the standard about which uses have defined behavior.
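The permitted direction (viewing any object as bytes) can be sketched as follows. The function name bytes_round_trip is hypothetical. It reads an int's object representation through unsigned char, which the last bullet of the rule allows, and then rebuilds the int with memcpy instead of casting the byte buffer back to int*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

bool bytes_round_trip() {
    int value = 0x01020304;

    // Allowed: any object may be examined through unsigned char.
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&value);

    unsigned char copy[sizeof(int)];
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        copy[i] = bytes[i];
    }

    // Not done here: *reinterpret_cast<int*>(copy), which would access
    // unsigned char objects through an int lvalue.
    int rebuilt;
    std::memcpy(&rebuilt, copy, sizeof rebuilt);
    return rebuilt == value;
}
```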
Yes, it does.
Neither C nor C++ special-cases accesses via pointers versus other accesses; the strict aliasing rules apply regardless of whether you use a pointer, a reference, or any other lvalue.
If you run into trouble, the easiest solution is to use memcpy to copy the memory location into a local variable - any self-respecting compiler will completely optimise this memcpy away and only treat it as an aliasing hint (memcpy is also preferable over unions, because the union method is not as portable).
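A minimal sketch of the memcpy approach, with hypothetical helper names read_int and round_trip. It assumes the buffer holds at least sizeof(int) bytes that were produced from an int on the same platform.

```cpp
#include <cassert>
#include <cstring>

// Read an int out of a char buffer without forming an int lvalue
// that refers to the char objects.
int read_int(const char* buffer) {
    int value;
    std::memcpy(&value, buffer, sizeof value);
    return value;
}

// Write an int into a char buffer and read it back.
int round_trip(int original) {
    char buffer[sizeof(int)];
    std::memcpy(buffer, &original, sizeof original);
    return read_int(buffer);
}
```

Unlike the (int&)(*buffer) cast from the question, this also works when the buffer is not suitably aligned for an int.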