How to exactly simulate new T[n] with an allocator? - c++

The expression new T[n] may or may not initialize each object in the array, depending on what T is.
How do I replicate this initialization behavior using an allocator?
struct Foo
{
int x;
Foo() : x(1)
{ }
};
Foo *p = new Foo[1];
assert(p[0].x == 1);

In C++03, the allocator interface only knows one way to initialize objects, and that's to copy from another object. C++11 has more.
You're asking for default initialization, which means (approximately), "either do nothing or call the default constructor". The allocator interface cannot do the latter in C++03.
I suppose you could write something like:
T *ra = allocator.allocate(1);
if (!is_pod<T>::value) {
// in C++03
allocator.construct(ra, T());
// in C++11
allocator.construct(ra);
}
That is_pod test might be wrong, though. Check the standard for exactly which conditions make default initialization do nothing. Obviously is_pod doesn't exist in C++03, but I vaguely recall that Boost has something of the kind that works on most implementations.
I think that you're fighting the design here. The allocator interface was designed for use by containers. Containers were designed not to contain uninitialized elements, so they have no use for default initialization.
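For what it's worth, here is a minimal sketch of how the same effect can be had in C++17, assuming you are willing to take only the memory from the allocator and bypass its construct hook. The helper names allocate_default_init and destroy_and_deallocate are made up for illustration; std::uninitialized_default_construct_n default-initializes each element, so it runs Foo's constructor but leaves an int untouched, much like new T[n].

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Foo {
    int x;
    Foo() : x(1) {}
};

// Hypothetical helper: get storage for n objects from the allocator, then
// default-initialize them in place (no-op for trivial types such as int).
template <class T, class Alloc = std::allocator<T>>
T* allocate_default_init(std::size_t n, Alloc alloc = Alloc()) {
    T* p = std::allocator_traits<Alloc>::allocate(alloc, n);
    try {
        // Destroys any already-constructed elements itself if a constructor throws.
        std::uninitialized_default_construct_n(p, n);
    } catch (...) {
        std::allocator_traits<Alloc>::deallocate(alloc, p, n);
        throw;
    }
    return p;
}

// Hypothetical counterpart: destroy the n objects and return the storage.
template <class T, class Alloc = std::allocator<T>>
void destroy_and_deallocate(T* p, std::size_t n, Alloc alloc = Alloc()) {
    std::destroy_n(p, n);
    std::allocator_traits<Alloc>::deallocate(alloc, p, n);
}
```

Calling allocate_default_init<Foo>(3) yields three objects with x == 1, while allocate_default_init<int>(3) leaves the ints uninitialized, so they must be written before being read.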

Related

Initializing an array of trivially_copyable but not default_constructible objects from bytes. Confusion in [intro.object]

We are initializing (large) arrays of trivially_copyable objects from secondary storage, and questions such as this or this leave us with little confidence in our implemented approach.
Below is a minimal example to try to illustrate the "worrying" parts in the code.
Please also find it on Godbolt.
Example
Let's have a trivially_copyable but not default_constructible user type:
struct Foo
{
Foo(double a, double b) :
alpha{a},
beta{b}
{}
double alpha;
double beta;
};
Trusting cppreference:
Objects of trivially-copyable types that are not potentially-overlapping subobjects are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
Now, we want to read a binary file into a dynamic array of Foo. Since Foo is not default constructible, we cannot simply:
std::unique_ptr<Foo[]> invalid{new Foo[dynamicSize]}; // Error, no default ctor
Alternative (A)
Using uninitialized unsigned char array as storage.
std::unique_ptr<unsigned char[]> storage{
new unsigned char[dynamicSize * sizeof(Foo)] };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << reinterpret_cast<Foo *>(storage.get())[index].alpha << "\n";
Is this UB because objects of actual type Foo are never explicitly created in storage?
Alternative (B)
The storage is explicitly typed as an array of Foo.
std::unique_ptr<Foo[]> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
This alternative was inspired by this post. Yet, is it better defined? It seems there is still no explicit creation of objects of type Foo.
It is notably getting rid of the reinterpret_cast when accessing the Foo data member (this cast might have violated the Type Aliasing rule).
Overall Questions
Are any of these alternatives defined by the standard? Are they actually different?
If not, is there a correct way to implement this (without first initializing all Foo instances to values that will be discarded immediately after)?
Is there any difference in undefined behaviours between versions of the C++ standard?
(In particular, please see this comment with regard to C++20)
What you're trying to do ultimately is create an array of some type T by memcpying bytes from elsewhere without default constructing the Ts in the array first.
Pre-C++20 cannot do this without provoking UB at some point.
The problem ultimately comes down to [intro.object]/1, which defines the ways objects get created:
An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]).
If you have a pointer of type T*, but no T object has been created in that address, you can't just pretend that the pointer points to an actual T. You have to cause that T to come into being, and that requires doing one of the above operations. And the only available one for your purposes is the new-expression, which requires that the T is default constructible.
If you want to memcpy into such objects, they must exist first. So you have to create them. And for arrays of such objects, that means they need to be default constructible.
So if it is at all possible, you need a (likely defaulted) default constructor.
In C++20, certain operations can implicitly create objects (provoking "implicit object creation" or IOC). IOC only works on implicit-lifetime types, which for classes means:
A class S is an implicit-lifetime class if it is an aggregate or has at least one trivial eligible constructor and a trivial, non-deleted destructor.
Your class qualifies, as it has a trivial copy constructor (which is "eligible") and a trivial destructor.
If you create an array of byte-wise types (unsigned char, std::byte, or char), this is said to "implicitly create objects" in that storage. This property also applies to the memory returned by malloc and operator new. This means that if you do certain kinds of undefined behavior to pointers to that storage, the system will automatically create objects (at the point where the array was created) that would make that behavior well-defined.
So if you allocate such storage, cast a pointer to it to a T*, and then start using it as though it pointed to a T, the system will automatically create Ts in that storage, so long as it was appropriately aligned.
Therefore, your alternative A works just fine:
When you apply [index] to your casted pointer, C++ will retroactively create an array of Foo in that storage. That is, because you used the memory like an array of Foo exists there, C++20 will make an array of Foo exist there, exactly as if you had created it back at the new unsigned char statement.
However, alternative B will not work as is. You did not use a new Foo[n] expression to create the array, so you cannot use delete[] to delete it, which is what unique_ptr<Foo[]> does by default. You can still use unique_ptr, but you'll have to create a deleter that explicitly calls operator delete[] on the pointer:
struct mem_delete
{
    template<typename T>
    void operator()(T *ptr) const
    {
        // Matches the ::operator new[] call used to obtain the storage.
        ::operator delete[](ptr);
    }
};
std::unique_ptr<Foo[], mem_delete> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
Again, storage[index] creates an array of Foo as if it were created at the time the memory was allocated.
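Putting the pieces together, a minimal sketch of the C++20 route might look like the following. It assumes the bytes arrive in an in-memory buffer instead of a file so that it stays self-contained; RawDelete, FooArray, and load_foos are names invented for this illustration. It relies on the implicit object creation rules described above, so before C++20 the same code is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>

struct Foo {
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};
static_assert(std::is_trivially_copyable_v<Foo>);
static_assert(!std::is_default_constructible_v<Foo>);

// Hypothetical deleter: the storage came from ::operator new, so release it
// with ::operator delete. Foo is trivially destructible, so no destructor runs.
struct RawDelete {
    void operator()(Foo* p) const { ::operator delete(p); }
};
using FooArray = std::unique_ptr<Foo[], RawDelete>;

// Hypothetical helper: copy count serialized Foo records into fresh storage.
FooArray load_foos(const unsigned char* bytes, std::size_t count) {
    // In C++20, ::operator new implicitly creates objects in the returned storage.
    void* raw = ::operator new(count * sizeof(Foo));
    std::memcpy(raw, bytes, count * sizeof(Foo));
    return FooArray{static_cast<Foo*>(raw)};
}
```

With a real file, the memcpy would be replaced by input.read(reinterpret_cast<char *>(raw), count * sizeof(Foo)).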
My first question is: What are you trying to achieve?
Is there an issue with reading each entry individually?
Are you assuming that your code will speed up by reading an array?
Is latency really a factor?
Why can't you just add a default constructor to the class?
Why can't you enhance input.read() to read directly into an array? See std::extent_v<T>
Assuming the constraints you defined, I would start with writing it the simple way, reading one entry at a time, and benchmark it.
Having said that, that which you describe is a common paradigm and, yes, can break a lot of rules.
C++ is very (overly) cautious about things like alignment which can be issues on certain platforms and non-issues on others. This is only "undefined behaviour" because no cross-platform guarantees can be given by the C++ standard itself, even though many techniques work perfectly well in practice.
The textbook way to do this is to create an empty buffer and memcpy into a proper object, but as your input is serialised (potentially by another system), there isn't actually a guarantee that the padding and alignment will match the memory layout which the local compiler determined for the sequence so you would still have to do this one item at a time.
My advice is to write a unit-test to ensure that there are no issues and potentially embed that into the code as a static assertion. The technique you described breaks some C++ rules but that doesn't mean it's breaking, for example, x86 rules.
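As a rough sketch of the "one item at a time" approach, assuming each record was serialized as two consecutive doubles in the host's own representation (an assumption about the file format, not something the question states), the bytes can be copied into plain doubles and each Foo built through its real constructor. decode_foos is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Foo {
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

// Hypothetical helper: decode count records of two doubles each.
// No Foo is ever accessed before its constructor has run.
std::vector<Foo> decode_foos(const unsigned char* bytes, std::size_t count) {
    std::vector<Foo> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        double fields[2];
        // memcpy into doubles is well defined in every C++ version.
        std::memcpy(fields, bytes + i * sizeof fields, sizeof fields);
        out.emplace_back(fields[0], fields[1]);
    }
    return out;
}
```

This costs one extra copy per record, which is the thing to benchmark before reaching for anything less portable.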
Alternative (A): Accessing a non-static member of an object before its lifetime begins.
The behavior of the program is undefined (See: [basic.life]).
Alternative (B): Implicit call to the implicitly deleted default constructor.
The program is ill-formed (See: [class.default.ctor]).
I'm not sure about the latter. If someone more knowledgeable knows if/why this is UB please correct me.
You can manage the memory yourself, and then return a unique_ptr which uses a custom deleter. Since you can't use new[], you can't use the plain version of unique_ptr<T[]>, and you need to manually run the destructors and deallocate the storage using an allocator.
#include <cstddef>
#include <memory>

template <class Allocator = std::allocator<Foo>>
struct FooDeleter : private Allocator {
    using traits = std::allocator_traits<Allocator>;
    using pointer = typename traits::pointer;
    explicit FooDeleter(const Allocator &alloc, std::size_t len) : Allocator(alloc), len(len) {}
    void operator()(pointer p) {
        Allocator &alloc = *this;
        // Destroy every element, then release the storage.
        for (pointer i = p; i != p + len; ++i) {
            traits::destroy(alloc, i);
        }
        traits::deallocate(alloc, p, len);
    }
    std::size_t len;
};
std::unique_ptr<Foo[], FooDeleter<>> create(std::size_t len) {
    std::allocator<Foo> alloc;
    using traits = std::allocator_traits<std::allocator<Foo>>;
    Foo *p = alloc.allocate(len);
    Foo *i = p;
    try {
        for (; i != p + len; ++i) {
            traits::construct(alloc, i, 1.0, 2.0);
        }
    } catch (...) {
        // Destroy only the elements that were fully constructed.
        while (i != p) {
            traits::destroy(alloc, --i);
        }
        alloc.deallocate(p, len);
        throw;
    }
    return std::unique_ptr<Foo[], FooDeleter<>>{p, FooDeleter<>(alloc, len)};
}

How to convert C array to std::initializer_list?

I know a pointer to an array and its size. What container can be created from it? I tried to do this:
std::initializer_list<int> foo(arr, arr + size);
It works for the MSVC, but not for the gcc
std::initializer_list is a reference type designed just for supporting list-initialization, and it only has a default constructor and the implicit copy constructor. Any other constructor is an extension.
What you can do is initializing the target-container directly from an iterator-range, without involving any intermediate views.
The standard container to use unless you know better would be std::vector. Or would using a simple view like std::span be enough for you?
If you need an actual data owning container, then what you want is a std::vector. This is going to cost you a copy and an allocation. If all you need is to act like a container, then what you want is the upcoming std::span from C++20. It takes a pointer and a size and wraps it in an interface that is like an array.
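A short sketch of the owning option, assuming int elements as in the question; to_vector is a made-up helper name. The iterator-range constructor of std::vector accepts plain pointers, so no intermediate std::initializer_list is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy size elements starting at arr into a vector.
// Pointers act as iterators here, so this is the ordinary range constructor.
std::vector<int> to_vector(const int* arr, std::size_t size) {
    return std::vector<int>(arr, arr + size);
}
```

If a non-owning view is enough and C++20 is available, std::span<int>(arr, size) plays the same role without the copy or the allocation.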
MSVC's use of
std::initializer_list<int> foo(arr, arr + size);
is not standard. Per the standard, the only constructor for std::initializer_list is
constexpr initializer_list() noexcept;
There are kind of two questions here, each with their own answer.
How to convert C array to std::initializer_list?
You can't. It doesn't really make sense to. std::initializer_list is really only used to initialize (as its name implies) objects. It's basically what is created from the {} notation, like this:
myObject obj = {0,1,2,3,4};
Attempting to create an instance of std::initializer_list yourself isn't really useful in any other sense that I can think of, especially since its only public constructor (constexpr since C++14) creates an empty list; a non-empty one can only come from a braced-init-list.
If you have some object foo that accepts an std::initializer_list, like this:
class foo {
public:
    foo(std::initializer_list<int> list) {
        //...
    }
};
And you are wondering how to create this object without an std::initializer_list, then the answer is to simply add another constructor:
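class foo {
public:
    // an actual array (taken by reference, so the size N is deduced)
    template <std::size_t N>
    foo(const int (&arr)[N]) {
        //...
    }
    // as a pointer plus an element count
    foo(const int* arr, std::size_t size) {
        //...
    }
};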
If you are using a third party library, or some other library you don't control, that does not have another constructor, then chances are it's not intended to be used this way. In that case, I would consult your documentation or vendor.
What container can be created from it?
Any sequence container. Most of them have some sort of constructor that accepts pointers to an object (actually, it technically takes iterators, but pointers will work the same in this context) or an array. They are pretty much designed for easy conversion from C arrays. Which one you use will depend on your situation.
Also, std::span (which is not listed as a "sequence container") has been mentioned as a possible container that can get created from a C array at low cost. Although, I can't vouch for them personally, as I'm not too familiar with the upcoming standard.
Final note: If MSVC allows this, then either a) you're possibly in C++11 (though I can't confirm if this was allowed in C++11 either, just that the constructor is not constexpr in C++11) or b) it is a compiler bug in MSVC.

Constexpr alternative to placement new to be able to leave objects in memory uninitialized?

I am trying to create a static container which has stack-based memory and can hold N instances of T. Much like std::vector, I want currently unused memory to not contain initialized items of T. This is usually solved with placement new, but that's not possible to use in constexpr.
Using unions
I found a trick that you can use a union for this as follows:
template <typename value_type>
union container_storage_type
{
struct empty{};
constexpr container_storage_type(): uninitialized{}{}
constexpr container_storage_type(value_type v): value(v){}
constexpr void set(value_type v)
{
*this = container_storage_type{v};
}
empty uninitialized;
value_type value;
};
This lets you store items uninitialized by setting the empty member and this works around the limitation that all members in constexpr have to be initialized.
Now the problem with this approach is that if value_type is a type that implements operator=, the rule for unions says:
If a union contains a non-static data member with a non-trivial special member function (copy/move constructor, copy/move assignment, or destructor), that function is deleted by default in the union and needs to be defined explicitly by the programmer.
This means that to be able to use this trick, I need to implement operator= in the union too, but how would that look?
constexpr container_storage_type& operator=(const container_storage_type& other)
{
value = other.value; //ATTEMPT #1
//*this = container_storage_type(other.value); // ATTEMPT #2
return *this;
}
Attempt #1: This does not seem possible as the compiler complains that changing the active member of a union is simply disallowed in constant expressions.
Attempt #2: This works in the set() method from the previous snippet, as it doesn't change the active member per se, but reassigns the whole union. This trick seems unable to be used in the assignment operator however since that causes endless recursion...
Am I missing something here, or is this truly a dead end for using unions as a placement-new alternative in constexpr?
Are there other alternatives to placement new that I have completely missed?
https://godbolt.org/z/km0nTY Code that illustrates the problem
In C++17, you can't.
The current restrictions on what you cannot do in constant expressions include:
an assignment expression ([expr.ass]) or invocation of an assignment operator ([class.copy.assign]) that would change the active member of a union;
a new-expression;
There really is no way around that.
In C++20, you will be able to, but probably not the way you think. The latter restriction is going to be relaxed in C++20 as a result of P0784 to something like:
a new-expression (8.3.4), unless the selected allocation function is a replaceable global allocation function (21.6.2.1, 21.6.2.2);
That is, new T will become fine but new (ptr) T will still not be allowed. As part of making std::vector constexpr-friendly, we need to be able to manage "raw" memory - but we still can't actually manage truly raw memory. Everything still has to be typed. Dealing with raw bytes is not going to work.
But std::allocator doesn't entirely deal in raw bytes. allocate(n) gives you a T* and construct takes a T* as a location and a bunch of arguments and creates a new object at that location. You may be wondering at this point how this is any different from placement new - and the only difference is that sticking with std::allocator, we stay in the land of T* - but placement new uses void*. That distinction turns out to be critical.
Unfortunately, this has an interesting consequence: your constexpr version "allocates" memory (though it allocates compiler memory, which will get promoted to static storage as necessary, so this does what you want), while your pure runtime version surely does not want to allocate memory at all; indeed, that is the whole point. To that end, you will have to use is_constant_evaluated() to switch between allocating during constant evaluation and not allocating at runtime. This is admittedly not beautiful, but it should work.
Your storage can look something like this:
// For trivial objects
using data_t = const array<remove_const_t<T>, Capacity>;
alignas(alignof(T)) data_t data_{};
// For non-trivial objects (aligned_storage_t takes a size and an alignment, not a type)
alignas(alignof(T)) aligned_storage_t<sizeof(T), alignof(T)> data_[Capacity]{};
This will allow you to create a const array of non-const objects. Then constructing objects will look something like this:
// Not real code, for trivial objects
data_[idx] = T(forward<Args>(args)...);
// For non-trivial objects
new (end()) T(forward<Args>(args)...);
Placement new is mandatory here. You will be able to have the storage at compile-time, but you cannot construct it at compile-time for non-trivial objects.
You will also need to take into account whether or not your container is zero-sized, etc. I suggest you look at existing implementations for fixed sized vectors and there are even some proposals for constexpr fixed sized vectors like p0843r1.

c++ type trait to say "trivially movable" - examples of

I would define "trivially movable" by
Calling the move constructor (or the move assignment operator) is
equivalent to memcpy the bytes to the new destination and not calling
the destructor on the moved-from object.
For instance, if you know that this property holds, you can use realloc to resize a std::vector or a memory pool.
Types failing this would typically have pointers to their own contents that need to be updated by the move constructor/assignment operator.
There is no such type trait in the standard that I can find.
I am wondering whether this already has a (better) name, whether it's been discussed and whether there are some libraries making use of such a trait.
Edit 1:
From the first few comments, std::is_trivially_move_constructible and std::is_trivially_move_assignable are not equivalent to what I am looking for.
I believe they would give true for types containing pointers to themselves, since reading your own member seems to fall under "trivial" operation.
Edit 2:
When properly implemented, types which point to themselves won't be trivially_move_constructible or move_assignable because the move ctor / move assignment operator are not trivial anymore.
Though, we ought to be able to say that unique_ptr can be safely copied to a new location provided we don't call its destructor.
I think what you need is std::is_trivially_relocatable from proposal P1144. Unfortunately the proposal didn't make it into C++20, so we shouldn't expect it before 2023. Which is sad, because this type trait would enable great optimizations for std::vector and similar types.
Well, this got me thinking... It is very important to overload type traits of structs that hold a pointer to themselves.
The following code demonstrates how fast a bug can creep in code, when type_traits are not defined properly.
#include <memory>
#include <type_traits>
struct A
{
int a;
int b;
int* p{&a};
};
int main()
{
auto p = std::make_unique<A>();
A a = std::move(*p); // memberwise move: a.p still points at p->a, not at a.a, and dangles once p is destroyed.
return std::is_move_assignable<A>::value; // <-- yet, this returns true.
}
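To make the distinction from the two edits in the question concrete, here is a small sketch; Naive, Fixed, and points_to_self are names made up for the illustration. It shows that the existing traits answer a different question than "can I memcpy this to a new place and forget the source".

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Self-pointing type that relies on the implicit copy/move operations.
struct Naive {
    int a = 0;
    int* p = &a;
};

// Same idea, but the copy/move constructors re-point p at the new object.
struct Fixed {
    int a = 0;
    int* p = &a;
    Fixed() = default;
    Fixed(const Fixed& o) : a(o.a), p(&a) {}
    Fixed(Fixed&& o) noexcept : a(o.a), p(&a) {}
};

// The trait is true for Naive even though a bytewise move breaks its invariant.
static_assert(std::is_trivially_move_constructible_v<Naive>);
// Once the type is written properly, the trait correctly becomes false.
static_assert(!std::is_trivially_move_constructible_v<Fixed>);
// unique_ptr could be relocated bytewise, yet the trait is false for it too.
static_assert(!std::is_trivially_move_constructible_v<std::unique_ptr<int>>);

// Checks the invariant "p points at my own a".
inline bool points_to_self(const Naive& n) { return n.p == &n.a; }
```

So neither direction lines up: the trait says yes to a broken type and no to unique_ptr, which is the gap a trait in the style of is_trivially_relocatable is meant to fill.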

How To Destruct Destructor-less Types Constructed via 'Placement New'

So I have built a class in which I intended to use std::aligned_storage to store different types of up to 16 bytes, for a 'Variant' class. Theoretically it should be able to store any POD type and common containers such as std::string and std::map.
I went by the code example found here, and it seemed like it was made for exactly what I was looking for: http://en.cppreference.com/w/cpp/types/aligned_storage
My version, basically:
class Variant {
public:
Variant() { /* construct */ }
Variant(std::map<int,int> v) {
new(&m_data) std::map<int,int>(v); // construct std::map<int,int> at &m_data
m_type = TYPE_MAP;
}
~Variant() {
if (m_type == TYPE_MAP) {
// cool, now destruct..?
reinterpret_cast<std::map<int, int>*>(&m_data)->~/*???????????????*/();
}
}
private:
// type of object in m_data
enum Type m_type;
// chunk of space for allocating to
std::aligned_storage<16, std::alignment_of<std::max_align_t>::value>::type m_data;
};
My problem comes with the destruction. As you can see at /*???????????????*/, I'm not sure what to call in place of ~T() in the cppreference.com example:
reinterpret_cast<const T*>(data+pos)->~T(); // I did the same thing, except that I know what T is. Is that a problem?
In my mind, I'm doing exactly the same thing, disregarding template anonymity. Problem is, std::map doesn't have any std::map::~map() destructor method, only a std::map::~_Tree, which is clearly not intended for direct use. So, in the cppreference.com example code, what would ~T() be calling if T was an std::map<int,int>, and what is the proper way for me to call the destructor for an object with a known type in std::aligned_storage? Or am I overcomplicating things, and are the clear() methods in these STL containers guaranteed to do the equivalent of full destruction?
Or, is there any simpler way around this? As I've possibly misunderstood something along the way regarding my intended usage of std::aligned_storage.
It sounds like you have read the header file where std::map is defined and think that std::map has no destructor because you could not find the declaration of a destructor.
However, in C++, a type that doesn't have a destructor declared will have a destructor implicitly declared by the compiler. This implicit destructor will call the destructors of bases and non-static members. It sounds like the std::map in your library implementation is a thin layer over _Tree. Therefore, all that needs to be done to destroy the map is to destroy the tree. Therefore, the compiler's default destructor does the trick.
It is allowed to write ->~map() in your case, and it will call the implicitly defined destructor, and the map will be destroyed correctly. You may also use this syntax with scalar types such as int (but not arrays, for some reason).
I'm not sure what to call in place of ~T()
Your type, which is named map:
reinterpret_cast<std::map<int, int>*>(&m_data)->~map();
Which if it makes you feel better you can put in a function template:
template <class T>
void destroy_as(void* p) {
static_cast<T*>(p)->~T();
}
destroy_as<std::map<int, int>>(&m_data);
Problem is, std::map doesn't have any std::map::~map() destructor method
It may be compiler generated, but the type assuredly has a destructor. All types have destructors. Some may be explicitly or implicitly deleted, but they exist.
Note that your aligned_storage is probably too small to store a map: on common implementations, sizeof(std::map<int, int>) is larger than 16.
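Both points (the explicit destructor call and the buffer size) can be combined in a small sketch. Slot is a hypothetical single-type stand-in for the Variant in the question, assuming C++17 for std::launder; the buffer is sized and aligned from T itself instead of a hard-coded 16.

```cpp
#include <cassert>
#include <map>
#include <new>
#include <utility>

// Hypothetical wrapper: holds exactly one T in raw storage.
template <class T>
class Slot {
public:
    template <class... Args>
    explicit Slot(Args&&... args) {
        // Placement new constructs the T inside the raw buffer.
        ::new (static_cast<void*>(buf_)) T(std::forward<Args>(args)...);
    }
    // Explicit destructor call; for T = std::map<int, int> this is ~map().
    ~Slot() { get().~T(); }
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;

    T& get() { return *std::launder(reinterpret_cast<T*>(buf_)); }

private:
    // Sized and aligned for T, so it cannot be too small.
    alignas(T) unsigned char buf_[sizeof(T)];
};
```

A real variant would additionally track which type is active and dispatch the destructor call on that tag, as the m_type check in the question does.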