C++ reinterpret_cast safety with array references and move/copy assignment - c++

My teammates are writing a fixed-size implementation of std::vector for a safety-critical application. We're not allowed to use heap allocation, so they created a simple array wrapper like this:
template <typename T, size_t NUM_ITEMS>
class Vector
{
public:
void push_back(const T& val);
...more vector methods
private:
// Internal storage
T storage_[NUM_ITEMS];
...implementation
};
A problem we encountered with this implementation is that it requires element types to have default constructors (which is not a requirement of std::vector, and it created porting difficulties). I decided to hack on their implementation to make it behave more like std::vector and came up with this:
template <typename T, size_t NUM_ITEMS>
class Vector
{
public:
void push_back(const T& val);
...more vector methods
private:
// Internal storage
typedef T StorageType[NUM_ITEMS];
alignas(T) char storage_[NUM_ITEMS * sizeof(T)];
// Get correctly typed array reference
StorageType& get_storage() { return reinterpret_cast<T(&)[NUM_ITEMS]>(storage_); }
const StorageType& get_storage() const { return reinterpret_cast<const T(&)[NUM_ITEMS]>(storage_); }
};
I was then able to just search and replace storage_ with get_storage() and everything worked. An example implementation of push_back might then look like:
template <typename T, size_t NUM_ITEMS>
void Vector<T, NUM_ITEMS>::push_back(const T& val)
{
get_storage()[size_++] = val;
}
In fact, it worked so easily that it got me thinking: is this a good/safe use of reinterpret_cast? Is the code directly above a suitable alternative to placement new, or are there risks associated with copy/move assignment to an uninitialized object?
EDIT: In response to a comment by NathanOliver, I should add that we cannot use the STL, because we cannot compile it for our target environment, nor can we certify it.

The code you've shown is only safe for POD types (Plain Old Data), where the object's representation is trivial and thus assignment to an unconstructed object is ok.
If you want this to work in all generality (which I assume you do, since you are using a template), then for a type T it is undefined behavior to use the object prior to constructing it. That is, you must construct the object before you, e.g., assign to that location. That means you need to call the constructor explicitly on demand. The following code block demonstrates an example of this:
template <typename T, size_t NUM_ITEMS>
void Vector<T, NUM_ITEMS>::push_back(const T& val)
{
// potentially an overflow test here
// explicitly call copy constructor to create the new object in the buffer
new (reinterpret_cast<T*>(storage_) + size_) T(val);
// in case that throws, only inc the size after that succeeds
++size_;
}
The above example demonstrates placement new, which takes the form new (void*) T(args...). It calls the constructor but does not actually perform an allocation. The visual difference is the inclusion of the void* argument to operator new itself, which is the address of the object to act on and call the constructor for.
And of course when you remove an element you'll need to destroy that explicitly as well. To do this for a type T, simply call the pseudo-method ~T() on the object. Under templated context the compiler will work out what this means, either an actual destructor call, or no-op for e.g. int or double. This is demonstrated below:
template<typename T, size_t NUM_ITEMS>
void Vector<T, NUM_ITEMS>::pop_back()
{
if (size_ > 0) // safety test, you might rather this throw, idk
{
// explicitly destroy the last item and dec count
// canonically, destructors should never throw (very bad)
reinterpret_cast<T*>(storage_)[--size_].~T();
}
}
Also, I would avoid returning a reference to an array from your get_storage() method, as it carries length information and would seem to imply that all elements are valid (constructed) objects, which of course they're not. I suggest you provide one method for getting a pointer to the start of the contiguous array of constructed objects, and another for getting the number of constructed objects. These are the .data() and .size() methods of e.g. std::vector<T>, which would make use of your class less jarring to seasoned C++ users.
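Putting the pieces of this answer together, a minimal sketch could look like the following. The class name FixedVector and the decision to delete copying are assumptions made to keep the example short; they are not part of the original code. It constructs with placement new, destroys explicitly, and exposes data() and size() instead of an array reference.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new
#include <string>   // only for the usage example

// Hypothetical name; the question's class is called Vector.
template <typename T, std::size_t NUM_ITEMS>
class FixedVector
{
public:
    FixedVector() = default;
    // Copy and move are left out to keep the sketch short.
    FixedVector(const FixedVector&) = delete;
    FixedVector& operator=(const FixedVector&) = delete;

    ~FixedVector()
    {
        while (size_ > 0) pop_back();   // destroy only what was constructed
    }

    void push_back(const T& val)
    {
        assert(size_ < NUM_ITEMS);      // precondition: capacity not exceeded
        ::new (static_cast<void*>(data() + size_)) T(val);
        ++size_;                        // bump the size only after construction succeeded
    }

    void pop_back()
    {
        assert(size_ > 0);
        data()[--size_].~T();
    }

    // Pointer to the first element and number of constructed elements.
    T* data() { return reinterpret_cast<T*>(storage_); }
    const T* data() const { return reinterpret_cast<const T*>(storage_); }
    std::size_t size() const { return size_; }

    T& operator[](std::size_t i)
    {
        assert(i < size_);
        return data()[i];
    }

private:
    alignas(T) unsigned char storage_[NUM_ITEMS * sizeof(T)];
    std::size_t size_ = 0;
};
```

With this, FixedVector<std::string, 4> can hold strings even though no std::string is ever default-constructed in the unused slots.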

Is this a good/safe use of reinterpret_cast?
Is the code directly above a suitable alternative to placement new
No. No.
or are there risks associated with copy/move assignment to an uninitialized object?
Yes. The behaviour is undefined.
1. Assuming memory is uninitialised, copying the vector has undefined behaviour.
2. No object of type T has started its lifetime at the memory location. This is super bad when T is not trivial.
3. The reinterpretation violates the strict aliasing rules.
First is fixed by value-initialising the storage. Or by making the vector non-copyable and non-movable.
Second is fixed by using placement new.
Third is technically fixed by using the pointer returned by placement new, but you can avoid storing that pointer by std::laundering after reinterpreting the storage.
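As a rough illustration of those fixes, here is a sketch of a copyable buffer. The class name Buffer and its member names are made up for this example, and it assumes C++17 for std::launder. The copy constructor touches only constructed elements, every element is created with placement new, and access goes through std::launder instead of a stored pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, std::launder (C++17)
#include <string>   // only for the usage example

// Hypothetical class name; a sketch of a copyable fixed-capacity buffer.
template <typename T, std::size_t N>
class Buffer
{
public:
    Buffer() = default;

    // Copy element by element: only the constructed objects are read,
    // and each copy is created with placement new in the new storage.
    Buffer(const Buffer& other)
    {
        try
        {
            for (std::size_t i = 0; i < other.size_; ++i) push_back(other[i]);
        }
        catch (...)
        {
            clear();   // the destructor will not run if the constructor throws
            throw;
        }
    }
    Buffer& operator=(const Buffer&) = delete;   // omitted from the sketch

    ~Buffer() { clear(); }

    void push_back(const T& val)
    {
        assert(size_ < N);
        ::new (static_cast<void*>(bytes_ + size_ * sizeof(T))) T(val);
        ++size_;
    }

    void clear()
    {
        while (size_ > 0) slot(--size_)->~T();
    }

    const T& operator[](std::size_t i) const
    {
        assert(i < size_);
        return *slot(i);
    }

    std::size_t size() const { return size_; }

private:
    // std::launder yields a pointer to the T that placement new created here.
    T* slot(std::size_t i)
    {
        return std::launder(reinterpret_cast<T*>(bytes_ + i * sizeof(T)));
    }
    const T* slot(std::size_t i) const
    {
        return std::launder(reinterpret_cast<const T*>(bytes_ + i * sizeof(T)));
    }

    alignas(T) unsigned char bytes_[N * sizeof(T)];
    std::size_t size_ = 0;
};
```

The helper slot() is only ever called for indices that hold a live object, which is what std::launder expects.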

Related

Initializing an array of trivially_copyable but not default_constructible objects from bytes. Confusion in [intro.object]

We are initializing (large) arrays of trivially_copyable objects from secondary storage, and questions such as this or this leave us with little confidence in our implemented approach.
Below is a minimal example to try to illustrate the "worrying" parts in the code.
Please also find it on Godbolt.
Example
Let's have a trivially_copyable but not default_constructible user type:
struct Foo
{
Foo(double a, double b) :
alpha{a},
beta{b}
{}
double alpha;
double beta;
};
Trusting cppreference:
Objects of trivially-copyable types that are not potentially-overlapping subobjects are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
Now, we want to read a binary file into a dynamic array of Foo. Since Foo is not default constructible, we cannot simply:
std::unique_ptr<Foo[]> invalid{new Foo[dynamicSize]}; // Error, no default ctor
Alternative (A)
Using uninitialized unsigned char array as storage.
std::unique_ptr<unsigned char[]> storage{
new unsigned char[dynamicSize * sizeof(Foo)] };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << reinterpret_cast<Foo *>(storage.get())[index].alpha << "\n";
Is there UB because objects of type Foo are never explicitly created in storage?
Alternative (B)
The storage is explicitly typed as an array of Foo.
std::unique_ptr<Foo[]> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
This alternative was inspired by this post. Yet, is it better defined? It seems there is still no explicit creation of objects of type Foo.
It is notably getting rid of the reinterpret_cast when accessing the Foo data member (this cast might have violated the Type Aliasing rule).
Overall Questions
Are any of these alternatives defined by the standard? Are they actually different?
If not, is there a correct way to implement this (without first initializing all Foo instances to values that will be discarded immediately after)
Is there any difference in undefined behaviours between versions of the C++ standard?
(In particular, please see this comment with regard to C++20)
What you're trying to do ultimately is create an array of some type T by memcpying bytes from elsewhere without default constructing the Ts in the array first.
Pre-C++20 cannot do this without provoking UB at some point.
The problem ultimately comes down to [intro.object]/1, which defines the ways objects get created:
An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]).
If you have a pointer of type T*, but no T object has been created in that address, you can't just pretend that the pointer points to an actual T. You have to cause that T to come into being, and that requires doing one of the above operations. And the only available one for your purposes is the new-expression, which requires that the T is default constructible.
If you want to memcpy into such objects, they must exist first. So you have to create them. And for arrays of such objects, that means they need to be default constructible.
So if it is at all possible, you need a (likely defaulted) default constructor.
In C++20, certain operations can implicitly create objects (provoking "implicit object creation" or IOC). IOC only works on implicit lifetime types, which for classes:
A class S is an implicit-lifetime class if it is an aggregate or has at least one trivial eligible constructor and a trivial, non-deleted destructor.
Your class qualifies, as it has a trivial copy constructor (which is "eligible") and a trivial destructor.
If you create an array of byte-wise types (unsigned char, std::byte, or char), this is said to "implicitly create objects" in that storage. This property also applies to the memory returned by malloc and operator new. This means that if you do certain kinds of undefined behavior to pointers to that storage, the system will automatically create objects (at the point where the array was created) that would make that behavior well-defined.
So if you allocate such storage, cast a pointer to it to a T*, and then start using it as though it pointed to a T, the system will automatically create Ts in that storage, so long as it was appropriately aligned.
Therefore, your alternative A works just fine:
When you apply [index] to your casted pointer, C++ will retroactively create an array of Foo in that storage. That is, because you used the memory like an array of Foo exists there, C++20 will make an array of Foo exist there, exactly as if you had created it back at the new unsigned char statement.
However, alternative B will not work as is. You did not use new Foo[n] to create the array, so you cannot use delete[] to delete it. You can still use unique_ptr, but you'll have to create a deleter that explicitly calls operator delete[] on the pointer:
struct mem_delete
{
template<typename T>
void operator()(T *ptr) const
{
::operator delete[](ptr);
}
};
std::unique_ptr<Foo[], mem_delete> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
Again, storage[index] creates an array of T as if it were created at the time the memory was allocated.
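To make alternative A concrete, here is a hedged sketch that wraps the byte buffer in a small class. FooBlock and round_trip_demo are hypothetical names, a std::istringstream stands in for the binary file, and the whole thing assumes C++20 semantics for implicit object creation. The std::launder call is a conservative addition of this sketch, not something the answer above requires.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <memory>
#include <new>          // std::launder
#include <sstream>      // only for the usage sketch
#include <string>       // only for the usage sketch
#include <type_traits>

struct Foo
{
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

// The checkable part of "implicit-lifetime" for this class.
static_assert(std::is_trivially_copyable<Foo>::value, "Foo must be trivially copyable");
static_assert(std::is_trivially_destructible<Foo>::value, "Foo must be trivially destructible");

// Hypothetical helper: owns the raw bytes and exposes them as Foo objects.
class FooBlock
{
public:
    FooBlock(std::istream& input, std::size_t count)
        : bytes_{new unsigned char[count * sizeof(Foo)]}, count_{count}
    {
        input.read(reinterpret_cast<char*>(bytes_.get()),
                   static_cast<std::streamsize>(count * sizeof(Foo)));
        ok_ = static_cast<bool>(input);   // false if fewer bytes were available
    }

    bool ok() const { return ok_; }
    std::size_t size() const { return count_; }

    const Foo& operator[](std::size_t i) const
    {
        assert(ok_ && i < count_);
        return std::launder(reinterpret_cast<const Foo*>(bytes_.get()))[i];
    }

private:
    std::unique_ptr<unsigned char[]> bytes_;
    std::size_t count_;
    bool ok_ = false;
};

// Usage sketch: serialise two Foo values to bytes, then read them back.
inline bool round_trip_demo()
{
    const Foo src[2] = {Foo{1.0, 2.0}, Foo{3.0, 4.0}};
    std::istringstream in{std::string(reinterpret_cast<const char*>(src), sizeof src)};
    FooBlock block(in, 2);
    return block.ok() && block[0].beta == 2.0 && block[1].alpha == 3.0;
}
```

Because the buffer is still owned as unsigned char[], the default unique_ptr deleter matches the allocation and no custom deleter is needed.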
My first question is: What are you trying to achieve?
Is there an issue with reading each entry individually?
Are you assuming that your code will speed up by reading an array?
Is latency really a factor?
Why can't you just add a default constructor to the class?
Why can't you enhance input.read() to read directly into an array? See std::extent_v<T>
Assuming the constraints you defined, I would start with writing it the simple way, reading one entry at a time, and benchmark it.
Having said that, that which you describe is a common paradigm and, yes, can break a lot of rules.
C++ is very (overly) cautious about things like alignment which can be issues on certain platforms and non-issues on others. This is only "undefined behaviour" because no cross-platform guarantees can be given by the C++ standard itself, even though many techniques work perfectly well in practice.
The textbook way to do this is to create an empty buffer and memcpy into a proper object, but as your input is serialised (potentially by another system), there isn't actually a guarantee that the padding and alignment will match the memory layout which the local compiler determined for the sequence so you would still have to do this one item at a time.
My advice is to write a unit-test to ensure that there are no issues and potentially embed that into the code as a static assertion. The technique you described breaks some C++ rules but that doesn't mean it's breaking, for example, x86 rules.
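The one-entry-at-a-time approach with a static assertion might be sketched as follows. The function name decode and the specific layout check are assumptions for illustration; the check would need to match whatever layout the serialising side really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct Foo
{
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

// Compile-time checks that the byte-wise copy is meaningful for this type.
static_assert(std::is_trivially_copyable<Foo>::value,
              "memcpy requires a trivially copyable type");
static_assert(sizeof(Foo) == 2 * sizeof(double),
              "sketch assumes the serialised layout has no padding");

// Hypothetical helper: decode `count` records from a raw byte buffer.
std::vector<Foo> decode(const unsigned char* bytes, std::size_t count)
{
    std::vector<Foo> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        Foo item{0.0, 0.0};   // a properly constructed Foo to copy into
        std::memcpy(&item, bytes + i * sizeof(Foo), sizeof(Foo));
        out.push_back(item);
    }
    return out;
}
```

Each Foo exists before memcpy writes into it, so this form does not depend on which C++ standard is in use, at the cost of one placeholder construction per element.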
Alternative (A): Accessing a —non-static— member of an object before its lifetime begins.
The behavior of the program is undefined (See: [basic.life]).
Alternative (B): Implicit call to the implicitly deleted default constructor.
The program is ill-formed (See: [class.default.ctor]).
I'm not sure about the latter. If someone more knowledgeable knows if/why this is UB please correct me.
You can manage the memory yourself, and then return a unique_ptr which uses a custom deleter. Since you can't use new[], you can't use the plain version of unique_ptr<T[]>; you need to destroy the elements and deallocate the storage manually through an allocator.
template <class Allocator = std::allocator<Foo>>
struct FooDeleter : private Allocator {
using traits = std::allocator_traits<Allocator>;
using pointer = typename traits::pointer;
explicit FooDeleter(const Allocator &alloc, size_t len) : Allocator(alloc), len(len) {}
void operator()(pointer p) {
Allocator &alloc = *this;
// destroy every element, then release the raw storage
for (pointer i = p; i != p + len; ++i) {
traits::destroy(alloc, i);
}
traits::deallocate(alloc, p, len);
}
size_t len;
};
std::unique_ptr<Foo[], FooDeleter<>> create(size_t len) {
using traits = std::allocator_traits<std::allocator<Foo>>;
std::allocator<Foo> alloc;
Foo *p = nullptr, *i = nullptr;
try {
p = traits::allocate(alloc, len);
for (i = p; i != p + len; ++i) {
traits::construct(alloc, i, 1.0, 2.0);
}
} catch (...) {
// destroy only the elements that were fully constructed
while (i > p) {
traits::destroy(alloc, --i);
}
if (p)
traits::deallocate(alloc, p, len);
throw;
}
return std::unique_ptr<Foo[], FooDeleter<>>{p, FooDeleter<>(alloc, len)};
}

Why does my variant return a value not equal to what it was assigned?

I am trying to create a simple variant as a learning exercise.
I want to do this without dynamically allocating memory, as is specified by the c++ specification for std::variant.
To simplify things, my variant can only take two values.
Here is my implementation:
//variant style class for two types
template<typename T1, typename T2>
class Either {
using Bigest = std::conditional<sizeof(T1) >= sizeof(T2), T1, T2>;
using ByteArray = std::array<std::byte, sizeof(Bigest)>;
ByteArray val;
std::optional<std::type_index> containedType;
public:
Either() : containedType(std::nullopt) {}
template<typename T>
Either(const T& actualVal) : containedType(typeid(T)) { //ToDo check T is one of correct types
ByteArray* ptr = (ByteArray*)&actualVal;
val = *ptr;
}
class BadVariantAccess {};
template<typename T>
inline T& getAs() const {
if(containedType == typeid(T)) {
T* ptr = (T*)val.data();
return *ptr;
}
else throw BadVariantAccess();
}
};
However, when I test this I get an incorrect number after trying to get the value:
int main() {
Either<int,float> e = 5;
std::cout << e.getAs<int>() << std::endl;
return 0;
}
Returns a random number (e.g 272469509).
What is the problem with my implementation, and how can I fix it?
Your program exhibits Undefined Behavior for various reasons, so it's a bit meaningless to attempt to explain why you observe the given behaviour. But here are some serious issues with your code:
It's not safe to alias arbitrary memory as std::array<std::byte, N>. This is a violation of the Strict Aliasing Rule, the only exceptions to which are accesses through char, unsigned char and std::byte. When doing ByteArray* ptr = (ByteArray*)&actualVal; val = *ptr;, you are invoking std::array's copy assignment operator and passing an instance that does not exist. At this point, literally anything could happen. You should instead either copy the bytes one by one (for trivial types), or use placement new to copy-construct an object in your byte-based storage.
Your storage is not aligned, and this can cause crashes or serious performance penalties at runtime, depending on your target platform. I'm not immediately sure whether this can cause Undefined Behaviour but you should certainly address this if you want to continue with such low-level memory management.
Your converting constructor does not check the actual size in memory of the type being assigned. If, on some platform, you have sizeof(int) > sizeof(float), then when you construct an Either<int, float> from a float, you will read bytes past the end of the float and easily cause Undefined Behavior. You must take the size of the type being assigned into account.
If you plan to store anything but trivial types (i.e. std::string or std::vector, not just primitives), you'll need to make calls to the appropriate copy/move constructor/assignment operators. For constructors (default, move, copy), you'll need to use placement new to construct live objects in the pre-allocated storage. Furthermore, you'll need to use type-erasure to store some function that will destroy the contained object as the correct type. A lambda inside a std::function<void(std::byte*)> could be very useful here, which simply calls the destructor: [](std::byte* data){ (*reinterpret_cast<T*>(data)).~T(); } This would have to be assigned anytime you store a new type. Note that this sort of situation is just about the only time you would ever manually want to call a destructor.
I strongly urge you to do some careful reading about how to do such low-level memory management correctly. It's really easy to do incorrectly and not have any idea until you get plagued with bizarre bugs much later.
As was kindly pointed out by @MilesBudnek, using Biggest = std::conditional<sizeof(T1) >= sizeof(T2), T1, T2> will give the type trait itself, and so an instance of Biggest will actually be an instance of a specialization of the struct std::conditional and not T1 or T2. You probably meant std::conditional_t<...> or std::conditional<...>::type. This likely has the effect that your Either class will only ever allocate a single byte, which is obviously incorrect.
ByteArray* ptr = (ByteArray*)&actualVal;
This is nonsensical. You are saying that ptr points to a std::array, but it doesn't. Of course you'll get garbage if you dereference it.
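For reference, a corrected two-type variant could be sketched along these lines. This is an illustration under assumptions, not the asker's or any answerer's code: it tracks the active alternative with a private enum instead of std::type_index or a type-erased destructor, it deletes copying to stay short, and it assumes C++17 for std::launder.

```cpp
#include <cassert>
#include <cstddef>
#include <new>          // placement new, std::launder (C++17)
#include <string>       // only for the usage example
#include <type_traits>

// Hypothetical two-alternative variant; a sketch, not a std::variant replacement.
template <typename T1, typename T2>
class Either
{
    static_assert(!std::is_same<T1, T2>::value, "sketch assumes two distinct types");

    enum class Tag { none, first, second };

    // Storage that is large enough and suitably aligned for either alternative.
    static constexpr std::size_t kSize  = sizeof(T1) > sizeof(T2) ? sizeof(T1) : sizeof(T2);
    static constexpr std::size_t kAlign = alignof(T1) > alignof(T2) ? alignof(T1) : alignof(T2);
    alignas(kAlign) unsigned char bytes_[kSize];
    Tag tag_ = Tag::none;

    // Destroy the contained object as its real type.
    void destroy()
    {
        if (tag_ == Tag::first)  std::launder(reinterpret_cast<T1*>(bytes_))->~T1();
        if (tag_ == Tag::second) std::launder(reinterpret_cast<T2*>(bytes_))->~T2();
        tag_ = Tag::none;
    }

public:
    class BadVariantAccess {};

    Either() = default;
    // Placement new copy-constructs a live object inside the byte storage.
    Either(const T1& v) : tag_(Tag::first)  { ::new (static_cast<void*>(bytes_)) T1(v); }
    Either(const T2& v) : tag_(Tag::second) { ::new (static_cast<void*>(bytes_)) T2(v); }
    Either(const Either&) = delete;             // copying omitted from the sketch
    Either& operator=(const Either&) = delete;
    ~Either() { destroy(); }

    template <typename T>
    T& getAs()
    {
        constexpr Tag wanted = std::is_same<T, T1>::value ? Tag::first
                             : std::is_same<T, T2>::value ? Tag::second
                                                          : Tag::none;
        static_assert(wanted != Tag::none, "T must be one of the two alternatives");
        if (tag_ != wanted) throw BadVariantAccess();
        return *std::launder(reinterpret_cast<T*>(bytes_));
    }
};
```

Because the storage is sized and aligned for the larger of the two types and objects are created with placement new, non-trivial alternatives such as std::string are handled as well as int or float.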

C++ proper way to move element of aligned_storage array

So as the title says, I'm wondering what the proper way is to move an element in an array such as:
std::array<std::aligned_storage_t<sizeof(T), alignof(T)>, N> data;
Is it as simple as doing:
data[dst] = data[src];
Or do I need to add something else like a move, being that its storage is uninitialized, do I need to use the copy or move constructors, something like:
new (&data[dst]) T(std::move(data[src]));
Since data[src] is not of the proper type T, do I need to instead do:
new (&data[dst]) T(std::move(*std::launder(reinterpret_cast<T*>(&data[src]))));
I'm looking for the most flexible way of moving the item for anything T might be, including move only types etc.
Basically I'm creating a packed array that always moves elements to be contiguous in memory, even when ones are removed to prevent holes in the active section of the array.
Edit:
As the comments want a minimal example, I guess something like:
template<class T, std::size_t N>
class SimpleExampleClass {
std::array<std::aligned_storage_t<sizeof(T), alignof(T)>, N> data;
public:
void move_element(std::size_t src, std::size_t dst) {
// data[dst] = data[src]; ?
// or
// new (&data[dst]) T(std::move(data[src]));
// or
// new (&data[dst]) T(std::move(*std::launder(reinterpret_cast<T*>(&data[src]))));
// or
// something else?
// then I would need some way to clean up the src element, not sure what would suffice for that.
// as calling a destructor on it could break something that was moved potentially?
}
// Other functions to manipulate the data here... (example below)
template<typename ...Args>
void emplace_push(Args&&... args) noexcept {
new (&data[/*some index*/]) T(std::forward<Args>(args)...);
}
void push(T item) noexcept {
emplace_push(std::move(item));
}
};
std::aligned_storage itself is, roughly speaking, just a collection of bytes. There is nothing to move, and std::move(data[src]) is just a no-op. You should first use placement new to create an object and then you can move that object by move-constructing it at the new location.
Simple example:
auto ptr = new (&data[0]) T();
new (&data[1]) T(std::move(*ptr));
std::destroy_at(ptr);
In the case of T being something like unique_ptr, or any other similar edge case, there shouldn't be any issue with calling destroy on the old element index, correct?
Moving from an object leaves it in some valid state, and the object still has to be destroyed.
Since data[0] is just a collection of bytes, would a pointer to it work, or would that pointer need to be reinterpret_cast before being used in the move constructor?
It will work if it is adorned with reinterpret_cast and std::launder, like you wrote in your question:
new (&data[1]) T(std::move(*std::launder(reinterpret_cast<T*>(&data[0]))));
The Standard library contains some useful functions for working with uninitialized memory. The complete list can be found here (see the Uninitialized storage section).
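Applied to the packed-array use case from the question, the move-then-destroy pattern might be sketched like this. PackedArray and erase are hypothetical names, the sketch uses an alignas byte array in place of std::aligned_storage_t, and it assumes C++17 and a non-throwing move constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>       // std::destroy_at, std::unique_ptr (usage example)
#include <new>          // placement new, std::launder
#include <type_traits>
#include <utility>

// Hypothetical packed array: live elements always occupy indices [0, size).
template <class T, std::size_t N>
class PackedArray
{
    static_assert(std::is_nothrow_move_constructible<T>::value,
                  "sketch assumes moving an element cannot throw");

    alignas(T) unsigned char data_[N * sizeof(T)];
    std::size_t size_ = 0;

    void* raw(std::size_t i) { return data_ + i * sizeof(T); }
    T* ptr(std::size_t i) { return std::launder(reinterpret_cast<T*>(data_ + i * sizeof(T))); }

public:
    PackedArray() = default;
    PackedArray(const PackedArray&) = delete;
    PackedArray& operator=(const PackedArray&) = delete;
    ~PackedArray() { while (size_ > 0) std::destroy_at(ptr(--size_)); }

    std::size_t size() const { return size_; }

    T& operator[](std::size_t i)
    {
        assert(i < size_);
        return *ptr(i);
    }

    template <typename... Args>
    T& emplace_push(Args&&... args)
    {
        assert(size_ < N);
        T* p = ::new (raw(size_)) T(std::forward<Args>(args)...);
        ++size_;
        return *p;
    }

    // Remove element i and fill the hole with the last element.
    void erase(std::size_t i)
    {
        assert(i < size_);
        std::destroy_at(ptr(i));                      // slot i is raw storage again
        const std::size_t last = size_ - 1;
        if (i != last)
        {
            ::new (raw(i)) T(std::move(*ptr(last)));  // move-construct into the hole
            std::destroy_at(ptr(last));               // moved-from object still needs destruction
        }
        --size_;
    }
};
```

This works for move-only types such as std::unique_ptr, since nothing is ever copied or assigned, only constructed and destroyed.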

How does std::launder affect containers?

Consider the following, simplified and incomplete, implementation of a fixed-sized vector:
template<typename T>
class Vec {
T *start, *end;
public:
T& operator[](ssize_t idx) { return start[idx]; }
void pop() {
end--;
end->~T();
}
template<typename... U>
void push(U&&... args) {
new (end) T { std::forward<U>(args)... };
end++;
}
};
Now consider the following T:
struct T {
const int i;
};
And the following use case:
Vec<T> v;
v.push(1);
std::cout << v[0].i;
v.pop();
v.push(2);
std::cout << v[0].i;
The index operator uses the start pointer to access the object. The object at that point was destroyed by pop and another object was created in its storage location by push(2). If I read the documentation surrounding std::launder correctly, this means that the behavior of v[0] in the line below is undefined.
How is std::launder supposed to be used to correct this code? Do we have to launder start and end each time placement new is used? Current implementations of the stdlib seem to be using code similar to the one posted above. Is the behavior of these implementations undefined?
How is std::launder supposed to be used to correct this code? Do we have to launder start and end each time placement new is used?
From P0532R0, you could avoid needing to call launder() if the return value of placement new is assigned to end. You would not need to change your start pointer unless the vector was empty since the object currently pointed to by start would still have an active lifetime with the code you provided.
The same paper indicates that launder() is a no-op unless the object lifetime has ended and has been replaced with a new object, so using launder() will not incur a performance penalty if it is unnecessary:
[...] the type of std::launder(this) is equivalent to just this as Richard Smith pointed out: Remember that launder(p) is a no-op unless p points to an object whose lifetime has ended and where a new object has been created in the same storage.
Current implementations of the stdlib seem to be using code similar to the one posted above. Is the behavior of these implementations undefined?
Yes. P0532R0 also discusses this issue and the content is similar to the discussion in the question's comments: vector does not use placement new directly, the return value of the placement new call is lost in the chain of function calls to the vector's allocator, and in any event placement new is used element by element so constructing the internal vector machinery cannot use the return value anyway. launder() appears to be the tool intended to be used here. However, the pointer type specified by the allocator is not required to be a raw pointer type at all and launder() only works for raw pointers. The current implementation is currently undefined for some types; launder() does not seem to be the appropriate machinery for solving the generic case for allocator based containers.
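For the raw-pointer case only, the two options mentioned above (using the pointer returned by placement new, or laundering the stored pointer on access) might look like the sketch below. The constructor taking a caller-supplied buffer and the capacity_ member are assumptions added so the example is complete; Item plays the role of the question's T with a const member.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, std::launder (C++17)
#include <utility>

struct Item
{
    const int i;
};

// Hypothetical fixed-capacity Vec over a caller-supplied raw buffer.
template <typename T>
class Vec
{
    T* start_;
    T* end_;
    std::size_t capacity_;

public:
    // `buffer` must be aligned for T and large enough for `capacity` objects.
    Vec(void* buffer, std::size_t capacity)
        : start_(static_cast<T*>(buffer)), end_(start_), capacity_(capacity) {}
    Vec(const Vec&) = delete;
    Vec& operator=(const Vec&) = delete;
    ~Vec() { while (end_ != start_) pop(); }

    std::size_t size() const { return static_cast<std::size_t>(end_ - start_); }

    // Launder on access: start_ may predate the object currently living there.
    T& operator[](std::size_t idx)
    {
        assert(idx < size());
        return *std::launder(start_ + idx);
    }

    void pop()
    {
        assert(end_ != start_);
        --end_;
        std::launder(end_)->~T();
    }

    // The pointer returned by placement new refers to the new object directly.
    template <typename... U>
    T& push(U&&... args)
    {
        assert(size() < capacity_);
        T* created = ::new (static_cast<void*>(end_)) T{std::forward<U>(args)...};
        ++end_;
        return *created;
    }
};
```

As the answer notes, this does not carry over to allocator-aware containers whose pointer type is not a raw pointer.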

Prevent calls to default constructor for an array inside class

While writing an offset array class (your indices go from, let's say, 100 to 1000, so you create a class that takes that into account without wasting the first 100 slots in the array), I ran into a problem.
How do I initialize a class that has a C array of elements, when T doesn't have a default constructor? Basically I want the array to be totally uninitialized. Example:
class MyClass
{
MyClass(int i)
{
}
};
template <typename T, size_t n, size_t offset>
struct offsetedIdxArray
{
T data[n];// error line : error C2512: 'MyClass' : no appropriate default constructor available
offsetedIdxArray()
{
}
T& operator [](size_t pos)
{
return data[(pos-offset)];
}
};
usage:
offsetedIdxArray<MyClass, 1024,offset> oia;
Adding a default constructor is not an option because the class I use is in fact a library class.
EDIT: Not related to the problem described here, but it turned out that my precious library class doesn't have a copy ctor, just a move ctor, so I had to use a vector of unique_ptr.
To get a statically-sized uninitialized portion of storage, you can use an "untyped" buffer of aligned storage, like std::aligned_storage<sizeof(T[n]), alignof(T)>::type in C++11 (in C++03 you need to use a char[sizeof(T[n])+something] and do manual corrections for alignment, or use a char[sizeof(T[n])] and compiler extensions to specify alignment).
That means using placement new for constructors, and explicit destructor calls for destruction. It is up to you to track what parts of that storage have objects (and thus needs destruction) and what parts don't have objects (and can't have destructors called on). It is also up to you to cater for when the client requests an element that isn't initialized at all (there's no object at the place it's supposed to be).
An alternative is to use an array of boost::optionals, and then you don't have to care about destruction and can simply assign new elements to their respective index.
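A sketch of the optional-based alternative, using C++17 std::optional as a stand-in for boost::optional (an assumption; the answer predates it). The names OffsetArray, emplace and has are hypothetical, and MyClass here is a stand-in for the library class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>

// Stand-in for the library class: it has no default constructor.
class MyClass
{
public:
    explicit MyClass(int i) : value(i) {}
    int value;
};

// Hypothetical offset-indexed array; valid indices are [Offset, Offset + N).
template <typename T, std::size_t N, std::size_t Offset>
class OffsetArray
{
    std::optional<T> slots_[N];   // every slot starts empty, no T is constructed

public:
    // Construct an element in place at logical index `pos`.
    template <typename... Args>
    T& emplace(std::size_t pos, Args&&... args)
    {
        assert(pos >= Offset && pos - Offset < N);
        return slots_[pos - Offset].emplace(std::forward<Args>(args)...);
    }

    // True if an object currently lives at logical index `pos`.
    bool has(std::size_t pos) const
    {
        return pos >= Offset && pos - Offset < N && slots_[pos - Offset].has_value();
    }

    T& operator[](std::size_t pos)
    {
        assert(has(pos));   // accessing an empty slot is a caller error
        return *slots_[pos - Offset];
    }
};
```

The optional tracks which slots hold objects and destroys them automatically, at the cost of one flag per element.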