How can I efficiently clone a dynamically allocated array? - c++

I have a class which is a templated smart pointer meant for wrapping dynamically allocated arrays. I know that there are classes in the STL that can be used for this, especially in C++11, but this is a widely-used internal class.
I wish to write a Clone() method for it. My initial implementation used std::copy, but I realized I should be able to avoid the default construction when allocating the array.
My attempt at a PoC ends up with a segmentation fault:
#include <iostream>
#include <algorithm>
class A
{
public:
A(int j) : i(j) {}
~A() {std::cout << "Destroying " << i << std::endl;}
private:
int i;
};
int main()
{
int a[] = {1, 2, 3};
A* arr = static_cast<A*>(::operator new[](sizeof(A) * 3));
std::uninitialized_copy(a, a + 3, arr);
delete [] arr;//::operator delete[](arr);
}
How do I create a dynamically allocated array of T, initialized with std::uninitialized_copy, so that it can be deleted with 'delete []' (i.e. treated as if it was allocated with a simple 'new T[N]')?
Since it seems people have had trouble understanding what I'm asking, here's the essence of my question:
#include <algorithm>
template <typename T>
T* CloneArray(T* in_array, size_t in_count)
{
if (!in_array)
return nullptr;
T* copy = new T[in_count];
try
{
std::copy(in_array, in_array + in_count, copy);
}
catch (...)
{
delete [] copy;
throw;
}
return copy;
}
How would I rewrite this function in a way that prevents T::T() from being called (if it even exists!), while returning the exact same result (let's assume our types are well behaved in that T t; t = other; and T t(other); are equivalent), including the fact that the result of the function can be deleted using the standard delete [] operator.

How do I create a dynamically allocated array of T, initialized with std::uninitialized_copy, so that it can be deleted with 'delete []' (i.e. treated as if it was allocated with a simple 'new T[N]')?
So, given the relatively simple requirement that the memory can be deleted with delete[], let's see what options we have.
Note: All quotes from the standard are from the C++14 draft N3797 and I'm not the best at standard-interpreting so take this with a grain of salt.
Mixing malloc()/free() and new[]/delete[]
Undefined, since new doesn't necessarily call malloc, see §18.6.1/4 (default behavior of operator new):
Default behavior:
Executes a loop: Within the loop, the function first attempts to allocate the requested storage. Whether the attempt involves a call to the Standard C library function malloc is unspecified.
Avoiding default-initialization
So, seeing that we're required to use new[] if we want to use delete[],
looking at the standard for information about initialization in a new-expression §5.3.4/17:
A new-expression that creates an object of type T initializes that object as follows:
If the new-initializer is omitted, the object is default-initialized (8.5); if no initialization is performed, the object has indeterminate value.
Otherwise, the new-initializer is interpreted according to the initialization rules of 8.5 for direct-initialization.
and going to §8.5/7:
To default-initialize an object of type T means:
if T is a (possibly cv-qualified) class type (Clause 9), the default constructor (12.1) for T is called (and the initialization is ill-formed if T has no default constructor or overload resolution (13.3) results in an ambiguity or in a function that is deleted or inaccessible from the context of the initialization);
if T is an array type, each element is default-initialized;
we see that if we omit a new-initializer in our new[], all the elements of the array will be default initialized via their default constructors.
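To make that concrete, here is a minimal sketch that counts constructor calls. The Counted type and the helper function are hypothetical names invented for this illustration; they are not part of the question's code base.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type that counts how often each constructor runs.
struct Counted
{
    static int default_ctor_calls;
    static int copy_ctor_calls;
    int value;

    Counted() : value(0) { ++default_ctor_calls; }
    Counted(const Counted& other) : value(other.value) { ++copy_ctor_calls; }
};

int Counted::default_ctor_calls = 0;
int Counted::copy_ctor_calls = 0;

// Allocates n elements without a new-initializer and reports how many
// default-constructor calls that allocation triggered.
int default_ctor_calls_for_new_array(std::size_t n)
{
    const int before = Counted::default_ctor_calls;
    Counted* arr = new Counted[n];   // no new-initializer: default-initialized
    const int calls = Counted::default_ctor_calls - before;
    // Every element was default-constructed before anything could be copied in.
    assert(calls == static_cast<int>(n));
    delete[] arr;
    return calls;
}
```

Calling default_ctor_calls_for_new_array(3) should report three default constructions and no copy constructions, which is exactly the work the question wants to avoid.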
So, what if we include a new-initializer, do we have any options? Going back to its definition in §5.3.4/1:
new-initializer:
(expression-list opt)
braced-init-list
The only possibility we are left with is a braced-init-list (expression-list is for non-array new-expressions). I managed to get it working for objects with compile time size, but obviously that's not terribly helpful. For reference (portions of code adapted from here):
#include <iostream>
#include <utility>
struct A
{
int id;
A(int i) : id(i) {
std::cout << "c[" << id << "]\t";}
A() : A(0) {}
~A() {std::cout << "d[" << id << "]\t";}
};
template<class T, std::size_t ...I>
T* template_copy_impl(T* a, std::index_sequence<I...>) {
return new T[sizeof...(I)]{std::move(a[I])...};
}
template<class T, std::size_t N,
typename Indices = std::make_index_sequence<N>>
T* template_copy(T* a) {
return template_copy_impl<T>(a, Indices());
}
int main()
{
const std::size_t N = 3;
A* orig = new A[N];
std::cout << std::endl;
// modify original so we can see what's going on
for (int i = 0; i < N; ++i)
orig[i].id = 1 + i;
A* copy = template_copy<A, N>(orig);
for (int i = 0; i < N; ++i)
copy[i].id *= 10;
delete[] orig;
std::cout << std::endl;
delete[] copy;
std::cout << std::endl;
}
Which when compiled with -std=c++1y (or equivalent) should output something like:
c[0] c[0] c[0]
d[3] d[2] d[1]
d[30] d[20] d[10]
Different types in new[] vs delete[]
To summarize, not only are we required to use new[] if we want to use delete[], but when we omit a new-initializer our objects are default-initialized. So what if we allocate the memory using a fundamental type (similar, in a way, to using placement new)? That leaves the memory uninitialized, right? Yes, but it's undefined to delete the memory with something like T* ptr = /* whatever */; delete[] ptr if it was allocated with a different type. See §5.3.5/2:
In the second alternative (delete array), the value of the operand of delete may be a null pointer value or a pointer value that resulted from a previous array new-expression. If not, the behavior is undefined. [ Note: this means that the syntax of the delete-expression must match the type of the object allocated by new, not the syntax of the new-expression. — end note ]
and §5.3.5/3, which hints at the same thing:
In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.
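If the delete[] requirement can be relaxed, the question's PoC can be made well-defined by pairing the raw allocation with a matching manual teardown. The following is only a sketch under that assumption: clone_array_raw, destroy_array_raw and NoDefault are hypothetical names, and the result must never be handed to delete[].

```cpp
#include <cassert>
#include <cstddef>
#include <memory>   // std::uninitialized_copy
#include <new>      // ::operator new, ::operator delete

// Hypothetical helper: copy-constructs `count` elements into raw storage,
// so T::T() is never called. The result must NOT be passed to delete[].
// Assumes T does not need more than the default allocation alignment.
template <typename T>
T* clone_array_raw(const T* in_array, std::size_t count)
{
    if (!in_array)
        return nullptr;
    // Guard against overflow in sizeof(T) * count.
    assert(count <= static_cast<std::size_t>(-1) / sizeof(T));
    void* raw = ::operator new(sizeof(T) * count);
    T* copy = static_cast<T*>(raw);
    try
    {
        // If a copy constructor throws, uninitialized_copy destroys the
        // elements it already built before rethrowing.
        std::uninitialized_copy(in_array, in_array + count, copy);
    }
    catch (...)
    {
        ::operator delete(raw);
        throw;
    }
    return copy;
}

// Hypothetical counterpart: destroys in reverse order, then frees the storage.
template <typename T>
void destroy_array_raw(T* array, std::size_t count)
{
    if (!array)
        return;
    for (std::size_t i = count; i > 0; --i)
        array[i - 1].~T();
    ::operator delete(array);
}

// Hypothetical element type without a default constructor.
struct NoDefault
{
    int v;
    explicit NoDefault(int x) : v(x) {}
};
```

The price is that the caller has to carry the element count to destroy_array_raw, which is the bookkeeping that new T[N] and delete[] normally hide.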
Other options?
Well, you could still use unique_ptrs as others have suggested. Although given that you're stuck with a large code base it's probably not feasible. Again for reference, here's what my humble implementation might look like:
#include <iostream>
#include <memory>
struct A
{
int id;
A(int i) : id(i) {
std::cout << "c[" << id << "]\t";}
A() : A(0) {}
~A() {std::cout << "d[" << id << "]\t";}
};
template<class T>
struct deleter
{
const bool wrapped;
std::size_t size;
deleter() :
wrapped(true), size(0) {}
explicit deleter(std::size_t uninit_mem_size) :
wrapped(false), size(uninit_mem_size) {}
void operator()(T* ptr)
{
if (wrapped)
delete[] ptr;
else if (ptr)
{
// backwards to emulate destruction order in
// normally allocated arrays
for (std::size_t i = size; i > 0; --i)
ptr[i - 1].~T();
std::return_temporary_buffer<T>(ptr);
}
}
};
// to make it easier on ourselves
template<class T>
using unique_wrap = std::unique_ptr<T[], deleter<T>>;
template<class T>
unique_wrap<T> wrap_buffer(T* orig)
{
return unique_wrap<T>(orig);
}
template<class T>
unique_wrap<T> copy_buffer(T* orig, std::size_t orig_size)
{
// get uninitialized memory
auto mem_pair = std::get_temporary_buffer<T>(orig_size);
// get_temporary_buffer can return less than what we ask for
if (mem_pair.second < orig_size)
{
std::return_temporary_buffer(mem_pair.first);
throw std::bad_alloc();
}
// create a unique ptr with ownership of our memory, making sure to pass
// the size of the uninitialized memory to the deleter
unique_wrap<T> a_copy(mem_pair.first, deleter<T>(orig_size));
// perform the actual copy and return the unique_ptr
std::uninitialized_copy_n(orig, orig_size, a_copy.get());
return a_copy;
}
int main()
{
const std::size_t N = 3;
A* orig = new A[N];
std::cout << std::endl;
// modify original so we can see what's going on
for (int i = 0; i < N; ++i)
orig[i].id = 1 + i;
unique_wrap<A> orig_wrap = wrap_buffer(orig);
{
unique_wrap<A> copy = copy_buffer(orig, N);
for (int i = 0; i < N; ++i)
copy[i].id *= 10;
// if we are passing the original back we can just release it
A* back_to_somewhere = orig_wrap.release();
delete[] back_to_somewhere;
std::cout << std::endl;
}
std::cout << std::endl;
}
Which should output:
c[0] c[0] c[0]
d[3] d[2] d[1]
d[30] d[20] d[10]
Lastly, you might be able to override the global or class operator new/delete, but I wouldn't suggest it.

Since A already is a smart pointer to a class that wraps dynamically allocated memory, it's guaranteed that the memory will not be deallocated until all references are released. Thus you can use a simple array or vector and copy the smart pointers around, no need to dynamically allocate the array.
For example:
typedef std::vector<A<SomeType> > AVector;
AVector copy(AVector in)
{
AVector copyArray = in;
return copyArray;
}

Related

Uninitialized default constructor in c++: munmap_chunk(): invalid pointer

having this code:
#include <iostream>
#include <iterator>
#include <initializer_list>
#include <algorithm>
class Foo {
public:
Foo() = default;
explicit Foo(size_t size) :size(size){
ar = new double[size];
}
Foo(std::initializer_list<double> initList): Foo(initList.size()){
std::copy(initList.begin(), initList.end(), ar);
}
Foo(double *values, size_t size):size(size), ar(values){}
Foo(const Foo &rhs): Foo(rhs.size){
std::copy(rhs.ar, rhs.ar+size, ar);
}
~Foo(){delete[] ar;}
Foo &operator=(Foo rhs){
swap(*this, rhs);
return *this;
}
void print(){
std::copy(ar, ar+size, std::ostream_iterator<double>(std::cout, " "));
std::cout << std::endl;
}
private:
size_t size;
double *ar;
static void swap(Foo &f, Foo &s){
std::swap(f.size, s.size);
std::swap(f.ar, s.ar);
}
};
int main() {
using namespace std;
size_t size = 100;
auto *values = new double[size];
for(int i = 0; i<100; i++){
double fraction = ((10+i) % 10) / 10.0;
values[i] = i + fraction;
}
Foo f(values, size);
// Foo g; //IF THIS IS NOT BRACED-INITIALIZED, I GOT munmap_chunk(): invalid pointer
Foo g{};
g = f;
g.print();
}
The only difference between the program running and crashing is whether I initialize Foo g with braces or not. Why is that important? I know the braces will value-initialize the class, which means the double *ar would be nullptr. If it is not brace-initialized, then the double *ar is indeterminate. But what does that mean? How can a pointer be indeterminate? Is it the same as nullptr? And why does the program break when the pointer is indeterminate?
If it is not brace-initialized, then the double *ar is indeterminate. But what does that mean? How can a pointer be indeterminate?
Because you are not assigning any value to the pointer, not even nullptr. So its value will consist of whatever random bytes were already stored in the memory location that the pointer is occupying.
When a default constructor is declared with = default, that just means the compiler will implicitly generate a constructor that will default-initialize each class member for you. Any member that is a class type will have its default constructor called, and any member that is a non-class type will either have its default value assigned if such a value is explicitly specified, or else it will not be assigned any value at all. The latter is what is happening in your situation.
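The difference between the two forms of initialization can be sketched as follows. Holder is a hypothetical stand-in for Foo, reduced to the parts that matter here; its second constructor is non-owning and exists only so that the class, like Foo, has other constructors besides the defaulted one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Foo: a defaulted default constructor and
// members without default member initializers.
struct Holder
{
    Holder() = default;
    Holder(double* values, std::size_t n) : size(n), ar(values) {}   // non-owning

    std::size_t size;
    double* ar;
};

void check_value_initialization()
{
    // Empty braces value-initialize. Because the default constructor is not
    // user-provided, the members are zero-initialized first.
    Holder h{};
    assert(h.size == 0);
    assert(h.ar == nullptr);

    // `Holder h2;` would default-initialize instead and leave both members
    // indeterminate. Reading them is undefined behavior, so it is not done here.
}
```

This is why Foo g{}; happens to work in the question while Foo g; does not.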
Is it the same as nullptr?
No.
And why does the program break, when the pointer is indeterminate?
Because the pointer does not point at valid memory, any attempt to dereference it or to pass it to delete[] has undefined behavior. In your program the trigger is g = f: the copy-and-swap assignment swaps g's garbage pointer into the temporary rhs, and the destructor of rhs then unconditionally calls delete[] on it. That is safe only if the pointer is nullptr or points at memory that was allocated with new[].
In your case, you should add default values for your non-class type members, eg:
private:
size_t size = 0;
double *ar = nullptr;
That way, if any constructor does not explicitly set values to them, the compiler will still assign their default values to them. In this case, Foo() = default; will generate an implicit default constructor that is roughly equivalent to:
Foo() : size(0), ar(nullptr) {}
You are also missing a move constructor. Your existing operator= assignment operator acts sufficiently as a copy assignment operator, but adding a move constructor will allow it to also act as a sufficient move assignment operator, eg:
Foo(Foo &&rhs): size(rhs.size), ar(rhs.ar){
rhs.ar = nullptr;
rhs.size = 0;
}
Or:
Foo(Foo &&rhs){
size = std::exchange(rhs.size, 0);
ar = std::exchange(rhs.ar, nullptr);
}
Or:
Foo(Foo &&rhs): Foo(){
swap(*this, rhs);
}
Also, your Foo(double*, size_t) constructor is broken. It takes ownership of the passed-in double* pointer, but nothing guarantees that the pointer came from new[], so the destructor cannot safely delete[] it.
This constructor needs to allocate its own double[] array and copy the source values into it, just like the Foo(std::initializer_list) constructor does, eg:
Foo(const double *values, size_t size): Foo(size) {
std::copy(values, values+size, ar);
}
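Putting the fixes above together, the class might look roughly like this. It is a sketch rather than a drop-in replacement: count() and operator[] are accessors added here only so the behavior can be checked, and print() is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <utility>

class Foo
{
public:
    Foo() = default;
    explicit Foo(std::size_t n) : size(n), ar(new double[n]) {}
    Foo(std::initializer_list<double> init) : Foo(init.size())
    {
        std::copy(init.begin(), init.end(), ar);
    }
    // Copies the caller's values instead of taking ownership of the pointer.
    Foo(const double* values, std::size_t n) : Foo(n)
    {
        std::copy(values, values + n, ar);
    }
    Foo(const Foo& rhs) : Foo(rhs.size)
    {
        std::copy(rhs.ar, rhs.ar + size, ar);
    }
    Foo(Foo&& rhs) noexcept : size(rhs.size), ar(rhs.ar)
    {
        rhs.size = 0;
        rhs.ar = nullptr;
    }
    ~Foo() { delete[] ar; }

    // By-value parameter: serves as both copy and move assignment.
    Foo& operator=(Foo rhs) noexcept
    {
        std::swap(size, rhs.size);
        std::swap(ar, rhs.ar);
        return *this;
    }

    // Accessors added for this sketch only.
    std::size_t count() const { return size; }
    double operator[](std::size_t i) const
    {
        assert(i < size);
        return ar[i];
    }

private:
    std::size_t size = 0;      // default member initializers keep `Foo g;` safe
    double* ar = nullptr;
};
```

With the default member initializers in place, Foo g; followed by g = f; no longer hands a garbage pointer to delete[].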
That all being said, a much better and safer design would be to replace your manual double[] array with std::vector instead, and let it handle all of the memory management and copy/move operations for you, eg:
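#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <iostream>
#include <iterator>
#include <vector>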
class Foo {
public:
Foo() = default;
explicit Foo(size_t size) : ar(size){}
Foo(std::initializer_list<double> initList) : ar(initList){}
Foo(const double *values, size_t size) : ar(values, values+size){}
// compiler-generated copy/move constructors, copy/move assignment operators,
// and destructor will suffice, so no need to declare them explicitly...
void print() const {
std::copy(ar.cbegin(), ar.cend(), std::ostream_iterator<double>(std::cout, " "));
std::cout << std::endl;
}
private:
std::vector<double> ar;
};

how to call destructor on some of the objects in a Dynamic Array

I finally got around to trying placement new to create an efficient dynamic array. The purpose is to understand how it works, not to replace class vector. The constructor works. A block is allocated but uninitialized. As each element is added, it is initialized. But I don't see how to use placement delete to call the destructor on only those elements that exist. Can anyone explain that one? This code works for allocating the elements one by one as the array grows, but the delete is not right.
template<typename T>
class DynArray {
private:
uint32_t capacity;
uint32_t size;
T* data;
void* operator new(size_t sz, T* place) {
return place;
}
void operator delete(void* p, DynArray* place) {
}
public:
DynArray(uint32_t capacity) :
capacity(capacity), size(0), data((T*)new char[capacity*sizeof(T)]) {}
void add(const T& v) {
new(data+size++) T(v);
}
~DynArray() {
for (int i = 0; i < size; i++)
delete (this) &data[i];
delete [] (char*)data;
}
};
A placement delete doesn't make much sense since the destructor already does what a placement delete is supposed to do.
An ordinary delete calls the destructor and then releases the memory that was allocated for the object with new. However, unlike an ordinary new, a placement new does not allocate memory, it only initialises it. Therefore, a placement delete would only have to call the destructor of the object to be "deleted".
All you need is to call the destructor of each object of the array directly:
~DynArray() {
for (uint32_t i = 0; i < size; i++)
data[i].~T();
delete [] (char*)data; // release the raw buffer allocated in the constructor
}
Since C++17 you can also use the function template std::destroy instead of directly calling the destructor:
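~DynArray() {
auto first = std::addressof(data[0]);
auto last = std::next(first, size);
std::destroy(first, last); // declared in <memory>
delete [] (char*)data; // release the raw buffer allocated in the constructor
}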
You actually found the only case (at least that I'm aware of) where you want to invoke the destructor manually:
~DynArray() {
for (int i = 0; i < size; i++)
data[i].~T();
delete [] (char*)data;
}
Combined with a trivial class and main, you should get the expected results:
struct S {
~S() { std::cout << __PRETTY_FUNCTION__ << '\n'; }
};
int main() {
DynArray<S> da{10};
da.add(S{});
return 0;
}
Note that the destructor is called twice: once for the temporary S{} passed to add(), and once for the copy that DynArray constructed from it in its buffer.
$./a.out
S::~S()
S::~S()
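For reference, a self-contained variant of the whole class could look like the sketch below. It swaps the char buffer for ::operator new and ::operator delete, uses std::size_t instead of uint32_t, and disables copying to stay short. Tracked is a hypothetical element type that counts live instances so the destructor calls can be verified.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new, ::operator new / ::operator delete

template <typename T>
class DynArray
{
public:
    explicit DynArray(std::size_t capacity)
        : capacity_(capacity), size_(0),
          data_(static_cast<T*>(::operator new(capacity * sizeof(T)))) {}

    DynArray(const DynArray&) = delete;             // copying left out of the sketch
    DynArray& operator=(const DynArray&) = delete;

    ~DynArray()
    {
        // Destroy only the elements that were actually constructed,
        // in reverse order, then release the raw storage.
        for (std::size_t i = size_; i > 0; --i)
            data_[i - 1].~T();
        ::operator delete(data_);
    }

    void add(const T& v)
    {
        assert(size_ < capacity_);                  // the sketch does not grow
        ::new (static_cast<void*>(data_ + size_)) T(v);
        ++size_;                                    // only after construction succeeded
    }

    std::size_t size() const { return size_; }

private:
    std::size_t capacity_;
    std::size_t size_;
    T* data_;
};

// Hypothetical element type that tracks how many instances are alive.
struct Tracked
{
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;
```

After two add() calls with temporaries, Tracked::alive should be 2 (the temporaries are gone, the stored copies remain), and it should drop back to 0 once the DynArray goes out of scope.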

Using std::memcpy to object of non-trivially copyable type

The standard says we can use std::memcpy in the following way:
For any trivially copyable type T, if two pointers to T point to
distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a
base-class subobject, if the underlying bytes (1.7) making up obj1 are
copied into obj2, obj2 shall subsequently hold the same value as obj1.
What problems could we run into if we applied that function to an object of a non-trivially copyable type? The following code behaves as if X were trivially copyable:
#include <iostream>
#include <cstring>
using std::cout;
using std::endl;
struct X
{
int a = 6;
X(){ }
X(const X&)
{
cout << "X()" << endl;
}
};
X a;
X b;
int main()
{
a.a = 10;
std::memcpy(&b, &a, sizeof(X));
cout << b.a << endl; //10
}
You asked:
What problems could we run into if we applied that function to an object of a non-trivially copyable type?
Here's a very simple example that illustrates the problem of using std::memcpy for objects of non-trivially copyable type.
#include <cstring>
struct A
{
A(int size) : size_(size), data_(new int[size]) {}
~A() { delete [] data_; }
// The copy constructor and the copy assignment operator need
// to be implemented for the class too. They have been omitted
// to keep the code here minimal.
int size_;
int* data_;
};
int main()
{
A a1(10);
A a2(20);
std::memcpy(&a1, &a2, sizeof(A));
// When we return from the function, the original data_ of a1
// is a memory leak. The data_ of a2 is deleted twice.
return 0;
}
Consider this program:
#include <cstring>
#include <memory>
int main() {
std::shared_ptr<int> x(new int);
{
std::shared_ptr<int> y;
std::memcpy((void*)&y, (void*)&x, sizeof(x));
}
*x = 5;
}
Because we copied x to y using memcpy instead of the assignment operator, the reference counts did not get updated. So, at the end of that block, the destructor of y is called. It finds that it has a reference count of 1, meaning it is the only shared_ptr instance pointing to the heap-allocated integer. So it deletes it.
The last line of main will likely segfault, because x points to an object that has been deleted.
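One way to catch this kind of mistake at compile time is to wrap memcpy behind a check on std::is_trivially_copyable. The helper name bytewise_copy and the Point type are hypothetical, introduced only for this sketch.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical helper: a byte-wise copy that refuses to compile for types
// that are not trivially copyable.
template <typename T>
void bytewise_copy(T& dst, const T& src)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only defined for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

struct Point { int x; int y; };   // trivially copyable

// shared_ptr has a non-trivial copy constructor, so bytewise_copy would
// reject it at compile time instead of corrupting the reference count.
static_assert(!std::is_trivially_copyable<std::shared_ptr<int>>::value,
              "shared_ptr must be copied with its copy constructor");

void demo_bytewise_copy()
{
    Point a{1, 2};
    Point b{0, 0};
    bytewise_copy(b, a);
    assert(b.x == 1 && b.y == 2);
}
```

With such a wrapper, both the A example and the shared_ptr example above would fail to compile rather than fail at run time.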

Return type of array subscription operator for a wrapper of type ** is *&?

Suppose I have a class A, whose constructor requires an argument x.
class A
{
public:
int a;
A(int x) { a = x; std::cout << a << std::endl; }
~A() {}
};
Now I want to allocate an array of A, and wrap it in another class B (in reality it should be a 2-dimensional array of A, mapped onto a 1-dimensional array, which is why I need to wrap it). Since constructor of A requires argument, I cannot use new[] (…right?), so I have to have an A**. Also I don’t want B to know about x, so my B is like this:
class B
{
private:
A** As;
const int n;
public:
B(int nn): n(nn) { As = new A*[n]; }
~B() { delete[] As; }
A* at(int i) { return As[i]; }
const A* at(int i) const { return As[i]; }
};
Note that “subscription operator” loosely means that at() function. Now my main function is like this:
int main()
{
B b(3);
int x = -1;
for(int i = 0; i < 3; i++)
{
b.at(i) = new A(x);
}
return 0;
}
When I compile this with g++, it prints an error “lvalue required as left operand of assignment” at my “new” line. Then I change my signature of at() to
A*& at(int i)
and it works.
What’s bothering me is that A*&, which just looks weird to me…
Is this A*& something I should use? Or any other way to deal with an array of objects, whose constructor requires arguments? BTW we don’t have c++11 and boost available on our target machine…
"Or any other way to deal with an array of objects, whose constructor requires arguments?"
Use std::vector; you need neither C++11 nor any additional libraries for it:
#include <vector>
...
std::vector<A> myObjects(n, A(0)); // objects will be constructed by calling A(0)
Your class B could look the following way:
class B
{
private:
std::vector<A> As;
const int n;
public:
B(int n): As(n, A(0)), n(n) { } // initializers listed in declaration order
// no explicit destructor needed
// memory management is handled by std::vector object automatically
A& at(int i) { return As[i]; }
const A& at(int i) const { return As[i]; }
};
Note that a vector's elements are stored in a contiguous block of memory, and their lifetime is tied to the lifetime of the B instance. Once the B is destroyed, so is the vector and so are the elements stored in it. If n is a constant known at compile time, you might also consider std::array, but note that it requires C++11 (or the TR1/Boost equivalent).
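Since the question mentions a 2-dimensional array mapped onto a 1-dimensional one, the same idea can be sketched like this. Grid and Cell are hypothetical names, the row-major layout is an assumption, and the code deliberately avoids C++11 features because the target machine lacks them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type with no default constructor, as in the question.
struct Cell
{
    int a;
    explicit Cell(int x) : a(x) {}
};

// Hypothetical 2-D wrapper stored row by row in one std::vector.
// The caller supplies the fill value, so Grid never needs to know
// which constructor arguments Cell takes.
class Grid
{
public:
    Grid(std::size_t rows, std::size_t cols, const Cell& fill)
        : rows_(rows), cols_(cols), cells_(rows * cols, fill) {}

    Cell& at(std::size_t row, std::size_t col)
    {
        assert(row < rows_ && col < cols_);
        return cells_[row * cols_ + col];   // row-major index
    }
    const Cell& at(std::size_t row, std::size_t col) const
    {
        assert(row < rows_ && col < cols_);
        return cells_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<Cell> cells_;
};
```

Because at() returns a reference, g.at(1, 2) = Cell(7); replaces the stored element in place, with no A** or manual delete involved.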
The & in a parameter or return type turns it into a reference.
Without &, the value is copied into a temporary, so later changes to that copy do not affect the original variable.
With a reference, the parameter or return value is just another name for the original object and refers to the same memory.
In this example, b.at(i) returns a temporary copy of the pointer As[i], which cannot be used on the left side of an assignment.
When at() returns A*&, the result refers to the original As[i] slot under a different name, so assigning to it changes the stored pointer.
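The difference can be seen in a small sketch. PtrTable is a hypothetical, non-owning table of three pointers, invented for this illustration; get() returns the pointer by value and slot() returns it by reference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table of three int pointers (non-owning, to keep the sketch short).
class PtrTable
{
public:
    PtrTable()
    {
        for (std::size_t i = 0; i < 3; ++i)
            slots_[i] = 0;
    }

    // Returns a copy of the stored pointer; assigning to the result
    // (t.get(0) = ...) does not compile because it is not an lvalue.
    int* get(std::size_t i) const
    {
        assert(i < 3);
        return slots_[i];
    }

    // Returns a reference to the stored pointer itself, so the caller
    // can assign a new pointer into the table.
    int*& slot(std::size_t i)
    {
        assert(i < 3);
        return slots_[i];
    }

private:
    int* slots_[3];
};
```

Writing t.slot(0) = &value; stores the address in the table, whereas changing a local copy obtained from t.get(1) leaves the table untouched.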

Simulating new[] with argument constructor

If I am not modifying any static variable inside the argument constructor, is below the proper way to simulate new T[N] (x,y); (array new with arguments) ?
template<typename T>
void* operator new [] (size_t size, const T &value)
{
T* p = (T*) malloc(size);
for(int i = size / sizeof(T) - 1; i >= 0; i--)
memcpy(p + i, &value, sizeof(T));
return p;
}
Usage will be,
struct A
{
A () {} // default
A (int i, int j) {} // with arguments
};
int main ()
{
A *p = new(A(1,2)) A[10]; // instead of new A[10](1,2)
}
I'd suggest
std::vector<A> v(10, A(1,2));
I realize that this doesn't really address the question for arrays.
You could use
p = &v[0];
since the standard guarantees contiguous storage. Be very careful with resizing the vector though, because it could invalidate p.
I checked boost::array<> (which adapts C style arrays), but it doesn't define constructors...
This isn’t OK. You are copying objects into uninitialised memory without invoking proper copy semantics.
As long as you’re only working with PODs, this is fine. However, when working with objects that are not PODs (such as your A) you need to take precautions.
Apart from that, operator new cannot be used in this way. As Alexandre has pointed out in the comments, the array won’t be initialised properly since C++ will call constructors for all elements after having called your operator new, thus overwriting the values:
#include <cstdlib>
#include <iostream>
template<typename T>
void* operator new [] (size_t size, T value) {
T* p = (T*) std::malloc(size);
for(int i = size / sizeof(T) - 1; i >= 0; i--)
new(p + i) T(value);
return p;
}
struct A {
int x;
A(int x) : x(x) { std::cout << "int ctor\n"; }
A() : x(0) { std::cout << "default ctor\n"; }
A(const A& other) : x(other.x) { std::cout << "copy ctor\n"; }
};
int main() {
A *p = new(A(42)) A[2];
for (unsigned i = 0; i < 2; ++i)
std::cout << p[i].x << std::endl;
}
This yields:
int ctor
copy ctor
copy ctor
default ctor
default ctor
0
0
… not the desired outcome.
That's not okay. C++ will call the non-trivial default constructor of each element if T has one (struct A in your example does), and that would re-construct objects in memory that is already occupied.
An appropriate solution would be to use std::vector (recommended) or call ::operator new[] to allocate memory, then call constructors using placement-new and taking care of exceptions if any.
You should also consider that operator new[] may be asked for more memory than the bare sizeof(T) * n.
This extra memory may be needed because C++ must know how many objects to destroy when delete[] p; is called. It cannot reliably infer that number from the size of the block allocated by new T[sz], because the memory may come from a custom memory manager (as in your case), so there is no way to tell how much was allocated from the pointer alone.
This also means that your attempt to provide already-initialized objects will fail: the array actually returned to the application may not start at the address you returned from your custom operator new[], so your initialization could be misaligned with the real elements.
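The size that the implementation actually requests can be observed with a class-specific operator new[]. The sketch below uses a hypothetical Probe type; whether any extra bytes are requested, and how many, is implementation-specific, so the code only checks the lower bound.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type that records the byte count passed to its operator new[].
struct Probe
{
    static std::size_t last_request;
    int value;

    Probe() : value(0) {}
    ~Probe() {}   // non-trivial destructor: delete[] has to know the element count

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;
        return ::operator new[](bytes);
    }
    static void operator delete[](void* p) noexcept
    {
        ::operator delete[](p);
    }
};
std::size_t Probe::last_request = 0;

// Allocates and frees n elements, returning the size that was requested.
std::size_t bytes_requested_for(std::size_t n)
{
    Probe* p = new Probe[n];
    delete[] p;
    // The request covers at least the elements; it may also include
    // bookkeeping space in front of them.
    assert(Probe::last_request >= n * sizeof(Probe));
    return Probe::last_request;
}
```

Comparing bytes_requested_for(4) with 4 * sizeof(Probe) on a given compiler shows whether that implementation adds bookkeeping space in front of the elements.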
template <typename myType> myType * buildArray(size_t numElements,const myType & startValue) {
myType * newArray=(myType *)malloc(sizeof(myType)*numElements);
if (NULL!=newArray) {
size_t index;
for (index=0;index<numElements;++index) {
new (newArray+index) myType(startValue);
}
}
return newArray;
}
template <typename myType> void destroyArray(size_t numElements,myType * oldArray) {
size_t index;
for (index=0;index<numElements;++index) {
(oldArray+index)->~myType();
}
free(oldArray);
}
A * p=buildArray(10,A(1,2));
destroyArray(10,p);
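destroyArray could also be written like this, depending on the platform you are building for. Be aware that malloc_size and _msize report the size of the underlying allocation, which can be larger than what was requested, so the computed element count can overshoot: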
template <typename myType> void destroyArray(myType * oldArray) {
size_t numElements=malloc_size(oldArray)/sizeof(myType); //or _msize with Visual Studio
size_t index;
for (index=0;index<numElements;++index) {
(oldArray+index)->~myType();
}
free(oldArray);
}