I am working on an application with high performance and memory needs. By that I mean 80 cores and 500 GB of RAM. To save some memory, I use my own dynamic array (16 B overhead) as opposed to std::vector (24 B overhead), which matters if you have billions of them.
My question relates to expanding that array. The reallocation looks like this:
//private
template <class ArrType>
void DynamicArray<ArrType>::reallocate(unsigned newCapacity) {
if (newCapacity < _size) return;
if (capacity == newCapacity) return;
ArrType * newArray = new ArrType[newCapacity];
capacity = newCapacity;
//for (unsigned i = 0; i < _size; i++) {
// newArray[i] = array[i];
//}
memcpy(newArray, array, _size * sizeof(ArrType));
if(array) delete [] array;
array = newArray;
}
As you can see, this is a pretty standard reallocation, but I tested memcpy and it was about 10 times faster than using a for loop. The problem is that when I call delete[], it calls the destructors of the ArrType objects, which is a problem when ArrType has its own dynamic allocations: the copies in newArray will then refer to freed memory. Is there any way to delete the old array without calling destructors?
Replace your memcpy with:
std::move(array, array + _size, newArray);
And require that the type ArrType must have a correct move or copy assignment operator.
But in real life, just use vector<ArrType>.
In fact vector is better than this: rather than allocating an array (which runs a constructor if the type has one) and then move-assigning (which over-writes what new just did) it allocates raw memory and then uses the move constructor with placement new.
So, if you absolutely positively need a version of vector that uses a smaller type for size_type than the one in your implementation I suppose the thing to do is to re-implement vector under a new name with that change. You can use the source in your implementation to help you: that way you will have solutions in front of you to this problem and all the other problems involved.
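The raw-memory approach described above can be sketched roughly as follows. This is a minimal illustration, not the asker's class and not any library's actual implementation: reallocate_raw and demo_reallocate_strings are hypothetical names, and the sketch assumes that the old buffer came from ::operator new, that ArrType's move constructor does not throw, and that the type has no extended alignment requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: moves `size` elements from `old_data` into a freshly
// allocated raw buffer with room for `new_capacity` elements.
// Assumes old_data was obtained from ::operator new and T's move
// constructor does not throw.
template <class T>
T* reallocate_raw(T* old_data, std::size_t size, std::size_t new_capacity) {
    assert(new_capacity >= size);
    // Raw memory only: no T constructor runs here.
    T* new_data = static_cast<T*>(::operator new(new_capacity * sizeof(T)));
    for (std::size_t i = 0; i != size; ++i) {
        // Move-construct into the raw slot, then destroy the moved-from source.
        ::new (static_cast<void*>(new_data + i)) T(std::move(old_data[i]));
        old_data[i].~T();
    }
    // Release the old storage; no further destructors are run.
    ::operator delete(old_data);
    return new_data;
}

// Usage sketch: two strings survive growth from capacity 2 to capacity 8.
bool demo_reallocate_strings() {
    typedef std::string S;
    S* p = static_cast<S*>(::operator new(2 * sizeof(S)));
    ::new (static_cast<void*>(p)) S("alpha");
    ::new (static_cast<void*>(p + 1)) S("long enough to need its own heap buffer");
    p = reallocate_raw(p, 2, 8);
    bool ok = (p[0] == "alpha") && (p[1] == "long enough to need its own heap buffer");
    p[0].~S();
    p[1].~S();
    ::operator delete(p);
    return ok;
}
```

Because every old element is destroyed exactly once, right after it has been moved from, the double-destruction problem from the question does not arise.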
In order to use placement new instead of automatically attempting to call the default constructor, I'm allocating an array using reinterpret_cast<Object*>(new char[num_elements * sizeof(Object)]) instead of new Object[num_elements].
However, I'm not sure how I should be deleting the array so that the destructors get called correctly. Should I loop through the elements, call the destructor manually for each element, and then cast the array to a char* and use delete[] on that, like this:
for (size_t i = 0; i < num_elements; ++i) {
array[i].~Object();
}
delete[] reinterpret_cast<char*>(array);
Or is it sufficient if I don't call the destructor manually for each element, and simply rely on delete[] to do that since the type of the array is Object*, like delete[] array?
What I'm worried about is that not every platform might be able to determine the number of elements in the array correctly that way, because I didn't allocate the array using a type of the right size. An answer to a question about "how delete[] knows the size of the operand" suggests that a possible implementation of delete[] would be to store the number of allocated elements (rather than the number of bytes).
If delete[] is indeed implemented that way, then using just delete[] array would try to destroy too many elements, because the array was created with more char elements than the number of Object elements that fit in it. So in that case, the only reliable way to delete the array would be to manually call the destructors, cast the array to a char*, and then use delete[].
However, another logical way to implement it would be to store the size of the array in bytes, rather than the number of elements, and then, when delete[] is called, divide the size of the array by the size of the type to get the number of elements whose destructors must be called. If this method is used, then just using delete[] array where array has a type of Object* would be sufficient.
So my question is: can I rely on delete[] to correctly call the destructors of the elements in the operand array, if the array was originally not allocated with the right type?
This is the code I'm using:
template <typename NumberType>
NeuronLayer<NumberType>::NeuronLayer(size_t num_inputs, size_t num_neurons, const NumberType *weights)
: neurons(reinterpret_cast<Neuron<NumberType>*>(new char[num_neurons * sizeof(Neuron<NumberType>)])),
num_neurons(num_neurons), num_weights(0) {
for (size_t i = 0; i < num_neurons; ++i) {
Neuron<NumberType> &neuron = neurons[i];
new(&neuron) Neuron<NumberType>(num_inputs, weights + num_weights);
num_weights += neuron.GetNumWeights();
}
}
and
template <typename NumberType>
NeuronLayer<NumberType>::~NeuronLayer() {
delete[] neurons;
}
or
template <typename NumberType>
NeuronLayer<NumberType>::~NeuronLayer() {
for (size_t i = 0; i < num_neurons; ++i) {
neurons[i].~Neuron();
}
delete[] reinterpret_cast<char*>(neurons);
}
Calling delete[] on an Object* will call the destructor once for every object allocated by new[]. new Object[N] typically stores N before the actual array, and delete[] certainly knows where to look.
Your code doesn't store that count. And it can't, since it's an unspecified implementation detail where and how the count is stored. As you speculate, there are two obvious ways: element count and array size, and one obvious location (before the array). Even so, there could be alignment issues, and you can't predict what type is used for the size.
Also, new unsigned char[N] is a special case since delete[] doesn't need to call destructors of char. In that case new[] doesn't need to store N at all. So you can't even bank on that size being stored, even if new Object[N] would have stored a size.
Here is portable code that manages a dynamic array of objects. It's essentially std::vector:
void * addr = ::operator new(sizeof(Object) * num_elements);
Object * p = static_cast<Object *>(addr);
for (std::size_t i = 0; i != num_elements; ++i)
{
::new (p + i) Object(/* some initializer */);
}
// ...
for (std::size_t i = 0; i != num_elements; ++i)
{
std::size_t ri = num_elements - i - 1;
(p + ri)->~Object();
}
::operator delete(addr);
This is the general pattern for organizing dynamic storage if you want very low-level control. The upshot is that dynamic arrays should never have been a language feature and are much better implemented in a library. As I said above, this code is pretty much identical to the existing standard library gadget called std::vector<Object>.
I have a dynamic array as a member of my class. I'm trying to find an efficient way to resize it and keep all of the information in it. I know that vectors would work well for this but I want to do this with a dynamic array instead.
My class has a dynamic array of type unsigned _int8 called data.
Is the following acceptable?
unsigned _int8 * temp = data;
data = new unsigned _int8[NewSize]();
if(OldSize >= NewSize)
{
for(int i = 0; i < NewSize; i++)
data[i] = temp[i];
}
else
{
for(int i = 0; i < OldSize; i++)
data[i] = temp[i];
}
delete [] temp;
Or should I do this a different way? Any suggestions?
Edit
Fixed an error in my example and changed char to unsigned _int8.
Edit 2
I will not be reallocating often, if at all. I want the functionality to be there to avoid having to write the code to create a new object and copy everything over if it's needed.
The class I am writing is for creating and saving Bitmap (.bmp) images. The array simply holds the file bytes. The image size will (should) be known when I create the object.
Since the array is using a POD (plain old data) type, you can replace the loops with memcpy() instead:
unsigned _int8 *temp = new unsigned _int8[NewSize];
if (OldSize >= NewSize)
memcpy(temp, data, NewSize * sizeof(unsigned _int8));
else
{
memcpy(temp, data, OldSize * sizeof(unsigned _int8));
memset(&temp[OldSize], 0, (NewSize-OldSize) * sizeof(unsigned _int8));
}
delete[] data;
data = temp;
Or at least use std::copy() (for POD types, std::copy() is like memcpy(), but for non-POD types it uses loops so object assignment semantics are preserved):
unsigned _int8 *temp = new unsigned _int8[NewSize];
if (OldSize >= NewSize)
std::copy(data, &data[NewSize], temp);
else
{
std::copy(data, &data[OldSize], temp);
std::memset(&temp[OldSize], 0, (NewSize-OldSize) * sizeof(unsigned _int8));
}
delete[] data;
data = temp;
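The distinction drawn above between POD and non-POD element types can also be made explicit with a type trait. The following is only a sketch: copy_elements and demo_copy_elements are hypothetical names, and std::is_trivially_copyable (C++11) is used as the test for "safe to memcpy".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helpers: pick the copy strategy from the element type.
template <class T>
void copy_elements(const T* src, std::size_t n, T* dst, std::true_type) {
    if (n != 0) std::memcpy(dst, src, n * sizeof(T));  // the bytes are the whole value
}
template <class T>
void copy_elements(const T* src, std::size_t n, T* dst, std::false_type) {
    std::copy(src, src + n, dst);  // element-wise copy assignment
}
template <class T>
void copy_elements(const T* src, std::size_t n, T* dst) {
    copy_elements(src, n, dst, std::is_trivially_copyable<T>());
}

// Usage sketch: bytes take the memcpy path, strings take the std::copy path.
bool demo_copy_elements() {
    unsigned char a[4] = {1, 2, 3, 4};
    unsigned char b[4] = {0, 0, 0, 0};
    copy_elements(a, 4, b);
    std::string s[2] = {"x", "y"};
    std::string t[2];
    copy_elements(s, 2, t);
    return b[0] == 1 && b[3] == 4 && t[0] == "x" && t[1] == "y";
}
```

In practice std::copy alone is usually enough, since implementations are free to lower it to a memory copy for trivially copyable types.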
That being said, you really should use std::vector<unsigned _int8> instead. It handles these details for you. This type of array management is what you have to use in C, but you really should not use it in C++ if you can avoid it. Use native C++ functionality instead.
By doing it this way, every time a new element is added to the array, it must be resized.
And the resize operation is Θ(n), so the insert operation also becomes Θ(n).
The common procedure is to double (or triple, etc.) the array size every time it has to be resized. With this, a single resize operation is still Θ(n), but the amortized insertion cost is Θ(1).
Also, usually the capacity is separated from the size, because the capacity is an implementation detail, while the size is part of the interface of the array.
And you may want to verify, when elements are removed, if the capacity is too big, and if so, decrease it, otherwise, once it gets big, that space will never be released.
You can see more about it here:
http://en.wikipedia.org/wiki/Dynamic_array
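To make the difference concrete, here is a toy model that only counts element copies. It is a sketch under simplifying assumptions: copies_for_appends is a hypothetical function, the array starts with capacity 1, and a growth factor of 1 stands for "grow to exactly the size needed".

```cpp
#include <cassert>
#include <cstddef>

// Counts how many element copies n appends cause in a toy growth model.
// growth_factor == 1 models "resize to exactly the size needed" (capacity + 1).
std::size_t copies_for_appends(std::size_t n, std::size_t growth_factor) {
    std::size_t capacity = 1, size = 0, copies = 0;
    for (std::size_t i = 0; i != n; ++i) {
        if (size == capacity) {
            copies += size;  // every existing element moves to the new block
            capacity = (growth_factor > 1) ? capacity * growth_factor : capacity + 1;
        }
        ++size;
    }
    return copies;
}
```

In this model, 1024 appends cost 1023 copies with doubling (1 + 2 + ... + 512) but 523776 copies when growing by one element at a time (1 + 2 + ... + 1023), which is the Θ(1) versus Θ(n) per-insertion difference described above.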
The problem with this approach is that you resize to just the size needed. This would mean that when you insert a new element the time needed to do it varies a lot.
So for example if you keep doing a "push_back" like operation then you would reallocate all the time.
An alternative is to allocate extra capacity to avoid frequent reallocations, which are costly in terms of performance.
std::vector, for example, allocates extra capacity so that the amortized cost of growing is constant, which is what makes it efficient.
Here is a link that explains it in detail:
Amortized analysis of std::vector insertion
The question: How to use "placement new" for creating an array with dynamic size? or more specifically, how to allocate memory for array elements from a pre-allocated memory.
I am using the following code:
void* void_array = malloc(sizeof(Int));
Int* final_array = new(void_array) Int;
This guarantees that the final_array* (the array pointer) is allocated from the place that is reserved by void_array*. But what about the final_array elements? I want them to be allocated from a pre-allocated memory as well.
P.S.: I have to say that I'm using an API that gives me some control over a tile architecture. There is a function that works exactly like malloc, but also has other features, e.g. it lets you control the properties of the allocated memory. So what I basically need to do is use that malloc-like function to allocate memory with my desired properties (e.g. from which memory bank, where it is cached, etc.).
First off, let's make sure we all agree on the separation of memory allocation and object construction. With that in mind, let's assume we have enough memory for an array of objects:
void * mem = std::malloc(sizeof(Foo) * N);
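Now, you cannot use placement array-new here, because an array new-expression may ask for an unspecified amount of extra space for its own bookkeeping, so you cannot know how much memory you would have to provide. The correct thing to do is construct each element separately: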
for (std::size_t i = 0; i != N; ++i)
{
new (static_cast<Foo*>(mem) + i) Foo;
}
(The cast is only needed for the pointer arithmetic. The actual pointer required by placement-new is just a void pointer.)
This is exactly how the standard library containers work, by the way, and how the standard library allocators are designed. The point is that you already know the number of elements, because you used it in the initial memory allocation. Therefore, you have no need for the magic provided by C++ array-new, which is all about storing the array size somewhere and calling constructors and destructors.
Destruction works in reverse:
for (std::size_t i = 0; i != N; ++i)
{
(static_cast<Foo*>(mem) + i)->~Foo();
}
std::free(mem);
One more thing you must know about, though: Exception safety. The above code is in fact not correct unless Foo has a no-throwing constructor. To code it correctly, you must also store an unwind location:
std::size_t cur = 0;
try
{
for (std::size_t i = 0; i != N; ++i, ++cur)
{
new (static_cast<Foo*>(mem) + i) Foo;
}
}
catch (...)
{
for (std::size_t i = 0; i != cur; ++i)
{
(static_cast<Foo*>(mem) + i)->~Foo();
}
throw;
}
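The unwind logic above does not have to be written by hand. As a hedged sketch: std::uninitialized_fill_n from <memory> constructs copies of a value into raw memory and, if one of the constructors throws, destroys the elements it has already built before rethrowing. The helper names make_raw_array and destroy_raw_array are hypothetical, and copy construction from a prototype value is an assumption that differs from the default construction used above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical helper: allocates room for n objects and copy-constructs each
// one from `value`. If a constructor throws, std::uninitialized_fill_n has
// already destroyed the elements built so far, so only the raw block is freed.
template <class T>
T* make_raw_array(std::size_t n, const T& value) {
    void* mem = std::malloc(sizeof(T) * (n != 0 ? n : 1));
    if (mem == nullptr) throw std::bad_alloc();
    T* first = static_cast<T*>(mem);
    try {
        std::uninitialized_fill_n(first, n, value);
    } catch (...) {
        std::free(mem);
        throw;
    }
    return first;
}

// Destroys the n elements in reverse order of construction, then frees the block.
template <class T>
void destroy_raw_array(T* first, std::size_t n) {
    for (std::size_t i = n; i != 0; --i) first[i - 1].~T();
    std::free(first);
}
```

C++17 adds std::uninitialized_default_construct_n and std::destroy_n, which cover the default-construction case from the answer in the same way.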
Instead of using a custom malloc, you should replace operator new() and use it. This is not the new operator (the new-expression); there is a function actually called operator new(), confusing as it may seem, which is the function used by the normal (non-placement) new-expression in order to get raw memory upon which to construct objects. Of course, you only need to replace it if you need special memory management; otherwise the default version works fine.
The way to use it is as follows, assuming your array size will be size:
Int* final_array = static_cast<Int*>(size == 0 ? 0 : operator new(sizeof(Int) * size));
Then you can construct and destroy each element independently. For instance, for element n:
// Create
new(final_array + n) Int; // use whatever constructor you want
// Destroy
(final_array + n)->~Int();
I have made for school purposes my own take on a dynamically allocated array using templates.
While what I'm about to ask works, I don't know how and why and I've reached the point where I need to know.
template <typename TElement>
void DynamicArray<TElement>::ensureCapacity () {
if (capacity >= elemNumb) {
return; //we have space to store the values
}
//we need to allocate more space for the values
TElement *auxArray = myArray;
//create space to hold more numbers
capacity = capacity * 2;
myArray = new TElement[capacity];
//copy the values
for (int i = 0; i < size; i++) {
myArray[i] = auxArray[i];
}
//release the memory
delete[] auxArray;
}
I need to know how TElement *auxArray = myArray; works. Is it just a pointer assignment, or are the elements copied one by one? I need to understand this so that I can work out the complexity of my algorithm. I don't mind if someone tells me the complexity, but the real answer I'm looking for is how it works.
Also, with myArray = new TElement[capacity]; I do this before deleting the old myArray. Does this delete the old one, or is it still floating somewhere in memory in one form or another?
This
TElement *auxArray = myArray;
just means that auxArray points to whatever myArray is pointing to. There is no copying of anything else, it is just a pointer copy.
This
myArray = new TElement[capacity];
means that myArray now points to a new, dynamically allocated TElement array. The expression doesn't delete anything. But auxArray is pointing to what myArray was pointing to before this assignment, so when you delete auxArray, you release the resources originally pointed to by myArray.
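The same sequence can be tried out in isolation. This is just an illustrative sketch with int elements; the function name pointer_copy_aliases is made up, while the variable names mirror the question.

```cpp
#include <cassert>

// Returns true when copying the pointer produced an alias to the old block
// rather than a copy of its elements.
bool pointer_copy_aliases() {
    int* myArray = new int[3]{1, 2, 3};
    int* auxArray = myArray;          // O(1): copies one address, no elements
    myArray = new int[6]{};           // myArray now points at a second block
    bool old_block_intact = (auxArray[0] == 1 && auxArray[2] == 3);
    bool distinct_blocks = (auxArray != myArray);
    delete[] auxArray;                // frees the first block
    delete[] myArray;                 // frees the second block
    return old_block_intact && distinct_blocks;
}
```

So the pointer copy itself is constant time; the linear cost in ensureCapacity comes from the element-by-element copy loop.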
I am thinking of how I can implement std::vector from the ground up.
How does it resize the vector?
realloc only seems to work for plain old structs, or am I wrong?
It is a simple templated class which wraps a native array. It does not use malloc/realloc. Instead, it uses the passed allocator (which by default is std::allocator).
Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, often they follow a non-linear growth pattern.
UPDATE: in C++11, the elements will be moved instead of copy constructed if it is possible for the stored type.
In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many could be in the vector.
So as a starting point a vector will need to look somewhat like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_;
typename A::size_type capacity_;
typename A::size_type size_;
A allocator_;
};
The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end() (which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size() call (which now needs a subtraction). In which case it could look like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_; // points to first element
T* end_capacity_; // points to one past internal storage
T* end_; // points to one past last element
A allocator_;
};
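To illustrate the trade-off just described, here is a stripped-down, non-owning version of the three-pointer layout. The member names follow the skeleton above, but the struct ThreePointerLayout itself is hypothetical and leaves out allocation entirely.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: shows which operations need arithmetic in this layout.
template <class T>
struct ThreePointerLayout {
    T* data_;          // points to first element
    T* end_capacity_;  // points to one past internal storage
    T* end_;           // points to one past last element

    T* begin() const { return data_; }
    T* end() const { return end_; }  // no addition needed
    std::size_t size() const {       // one subtraction
        return static_cast<std::size_t>(end_ - data_);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_capacity_ - data_);
    }
};
```

With the size/capacity layout the roles are swapped: size() is a plain member read and end() becomes data_ + size_.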
I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.
NOTE: This is ignoring a common optimization where the empty base class optimization is used for the allocator. I think that is a quality of implementation detail, and not a matter of correctness.
Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).
Note that it does not use new [] either -- it uses the allocator that's passed, but that's required to allocate raw memory, not an array of objects like new [] does. You then need to use placement new to construct objects in place. [Edit: well, you could technically use new char[size], and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]
When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factor compared to the old size to meet the requirement for amortized constant complexity for push_back. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.
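The point about re-using freed blocks can be explored with a small simulation. It rests on a strong, idealised assumption that is labeled in the code: all previously freed blocks are adjacent and can be merged into one hole. Real allocators may behave quite differently, and first_reusable_step is a hypothetical name.

```cpp
#include <cassert>

// Idealised model (an assumption, not how any particular allocator behaves):
// every block freed so far is adjacent and can be merged into one hole.
// Returns the first reallocation step at which that hole is large enough to
// hold the new block, or -1 if this never happens within max_steps.
int first_reusable_step(double factor, int max_steps) {
    double live = 1.0;   // size of the block currently in use
    double freed = 0.0;  // total size of all blocks released so far
    for (int step = 1; step <= max_steps; ++step) {
        double wanted = live * factor;
        if (freed >= wanted) return step;
        freed += live;   // the old block is released only after the copy
        live = wanted;
    }
    return -1;
}
```

Under this model a factor of 2 never fits, because the freed blocks sum to one unit less than the current block and the request is twice the current block, whereas a factor of 1.5 fits at the fifth reallocation (8.125 units freed against 7.59375 requested).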
From Wikipedia, as good an answer as any.
A typical vector implementation consists, internally, of a pointer to
a dynamically allocated array,[2] and possibly data members holding
the capacity and size of the vector. The size of the vector refers to
the actual number of elements, while the capacity refers to the size
of the internal array. When new elements are inserted, if the new size
of the vector becomes larger than its capacity, reallocation
occurs.[2][4] This typically causes the vector to allocate a new
region of storage, move the previously held elements to the new region
of storage, and free the old region. Because the addresses of the
elements change during this process, any references or iterators to
elements in the vector become invalidated.[5] Using an invalidated
reference causes undefined behaviour.
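The invalidation described in the quote can be observed with std::vector itself. This is a small sketch; the function name storage_moves_on_reallocation is made up, and the old address is stored as an integer so that no dangling pointer is ever dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends until the vector reallocates and reports whether the storage moved.
// A pointer, reference or iterator taken before that point must not be used
// afterwards.
bool storage_moves_on_reallocation() {
    std::vector<int> v(4, 0);
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v.data());
    const std::size_t old_capacity = v.capacity();
    while (v.capacity() == old_capacity) v.push_back(0);  // force one reallocation
    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(v.data());
    return after != before;
}
```

The addresses differ because the new region is allocated while the old one is still in use; the old region is freed only after the elements have been moved.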
Like this:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_vector.h
(official gcc mirror on github)
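///Implement Vector class
#include <iostream>
using std::cout;
using std::endl;
class MyVector {
int *int_arr;
int capacity;
int current;
public:
MyVector() {
int_arr = new int[1];
capacity = 1;
current = 0;
}
~MyVector() { delete[] int_arr; } // release the buffer; copying is not handled in this example
void Push(int nData);
void PushData(int nData, int index);
void PopData();
int GetData(int index);
int GetSize();
void Print();
};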
void MyVector::Push(int data)
{
if (current == capacity){
int *temp = new int[2 * capacity];
for (int i = 0; i < capacity; i++)
{
temp[i] = int_arr[i];
}
delete[] int_arr;
capacity *= 2;
int_arr = temp;
}
int_arr[current] = data;
current++;
}
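void MyVector::PushData(int data, int index)
{
// Writing one past the last element appends; the value (not the index) is pushed.
if (index == current){
Push(data);
}
else if (index >= 0 && index < current)
int_arr[index] = data;
}
void MyVector::PopData(){
if (current > 0) // nothing to remove from an empty vector
current--;
}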
int MyVector::GetData(int index)
{
if (index >= 0 && index < current){
return int_arr[index];
}
return -1; // out of range: sentinel value, so the function never falls off the end
}
int MyVector::GetSize()
{
return current;
}
void MyVector::Print()
{
for (int i = 0; i < current; i++) {
cout << int_arr[i] << " ";
}
cout << endl;
}
int main()
{
MyVector vect;
vect.Push(10);
vect.Push(20);
vect.Push(30);
vect.Push(40);
vect.Print();
std::cout << "\nTop item is "
<< vect.GetData(3) << std::endl;
vect.PopData();
vect.Print();
cout << "\nTop item is "
<< vect.GetData(vect.GetSize() - 1) << endl;
return 0;
}
It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. Use reserve() if you have to use push_back().
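The effect of reserve() can be measured by watching capacity(). The following is a minimal sketch; count_reallocations is a hypothetical helper, and the exact number of reallocations without reserve() depends on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often capacity() changes while appending n ints.
// If reserve_first is true, the storage is requested once up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i != n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

After reserve(n), the n calls to push_back are guaranteed not to reallocate, so the count is zero in that case.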
You'd need to define what you mean by "plain old structs."
realloc by itself only creates a block of uninitialized memory. It does no object allocation. For C structs, this suffices, but for C++ it does not.
That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector exactly in this case!), you'd need to:
Make sure you're consistently using malloc/realloc/free throughout your class.
Use "placement new" to initialize objects in your memory chunk.
Explicitly call destructors to clean up objects before freeing your memory chunk.
This is actually pretty close to what vector does in my implementation (GCC/libstdc++), except it uses the C++ low-level routines ::operator new and ::operator delete to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.
Since vector is a template, you actually should have its source to look at if you want a reference – if you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector or thereabouts.
You can implement them with a resizing-array implementation.
When the array becomes full, create an array with twice the size and copy all the contents to the new array. Do not forget to delete the old array.
As for deleting elements from the vector, shrink the array (halve its capacity) when it becomes a quarter full. This strategy prevents performance glitches when someone repeatedly inserts and deletes around the point where the array is half full.
It can be mathematically proved that the total time for n insertions is still linear, that is, amortized constant time per insertion, which is asymptotically the same as you would get with a plain fixed-size array.
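The grow-when-full, shrink-at-a-quarter strategy can be sketched as follows. This is a minimal illustration for int elements only; ResizingStack is a hypothetical class, and copying is disabled to keep it short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical int stack: doubles when full, halves when a quarter full.
class ResizingStack {
public:
    ResizingStack() : data_(new int[1]), capacity_(1), size_(0) {}
    ~ResizingStack() { delete[] data_; }
    ResizingStack(const ResizingStack&) = delete;
    ResizingStack& operator=(const ResizingStack&) = delete;

    void push(int value) {
        if (size_ == capacity_) resize(2 * capacity_);
        data_[size_++] = value;
    }
    int pop() {
        assert(size_ > 0);
        int value = data_[--size_];
        // Shrink only at a quarter full, so the array is half full afterwards.
        if (size_ > 0 && size_ == capacity_ / 4) resize(capacity_ / 2);
        return value;
    }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    void resize(std::size_t new_capacity) {
        int* fresh = new int[new_capacity];
        for (std::size_t i = 0; i != size_; ++i) fresh[i] = data_[i];
        delete[] data_;  // do not forget the old array
        data_ = fresh;
        capacity_ = new_capacity;
    }
    int* data_;
    std::size_t capacity_;
    std::size_t size_;
};
```

After any resize the array is exactly half full, so a run of alternating pushes and pops cannot trigger a resize on every operation.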
realloc only works on heap memory. In C++ you usually want to use the free store.