I wrote a program that solves the flow shop scheduling problem.
I need help optimizing the slowest parts of my program.
First, there is the 2D array allocation:
this->_perm = new Chromosome*[f];
//... for (...)
this->_perm[i] = new Chromosome[fM1];
This works fine, but a problem occurs when I try to delete the array:
delete [] _perm[i];
The line above takes extremely long to execute. Each Chromosome array has about 300k elements: allocating it takes less than a second, but deleting it takes far more than a minute.
I would appreciate any suggestions for improving the delete part.
On a general note, you should never manually manage memory in C++. This will lead to leaks, double-deletions and all kinds of nasty inconveniences. Use proper resource-handling classes for this. For example, std::vector is what you should use for managing a dynamically allocated array.
To get back to your problem at hand, you first need to know what delete [] _perm[i] does: It calls the destructor for every Chromosome object in that array and then frees the memory. Now you do this in a loop, which means this will call all Chromosome destructors and perform f deallocations. As was already mentioned in a comment to your question, it is very likely that the Chromosome destructor is the actual culprit. Try to investigate that.
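To make that concrete, here is a minimal sketch that counts how many destructors run when the rows are deleted. CountedChromosome and count_destructor_calls are hypothetical stand-ins (the real Chromosome class isn't shown in the question), so treat this as an illustration rather than a measurement of your program.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Chromosome: it only counts destructor calls.
struct CountedChromosome {
    static std::size_t destroyed;
    ~CountedChromosome() { ++destroyed; }
};
std::size_t CountedChromosome::destroyed = 0;

// Allocates rows * cols objects the way the question does, frees them again,
// and returns how many destructors ran in the process.
std::size_t count_destructor_calls(std::size_t rows, std::size_t cols) {
    CountedChromosome::destroyed = 0;
    CountedChromosome** perm = new CountedChromosome*[rows];
    for (std::size_t i = 0; i < rows; ++i)
        perm[i] = new CountedChromosome[cols];
    for (std::size_t i = 0; i < rows; ++i)
        delete[] perm[i];  // runs cols destructors, then frees the row
    delete[] perm;         // plain pointers: nothing to destroy here
    return CountedChromosome::destroyed;
}
```

If each of those destructor calls does real work (for example, freeing memory of its own), the cost is multiplied by the number of elements.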
You can, however, change your memory handling to improve the speed of allocation and deallocation. As Nawaz has shown, you could allocate one big chunk of memory and use that. I'd use a std::vector for a buffer:
void f(std::size_t row, std::size_t col)
{
std::size_t sizeMemory = sizeof(Chromosome) * row * col;
std::vector<unsigned char> buffer(sizeMemory); // allocate all memory at once
std::vector<Chromosome*> chromosomes(row);
// use algorithm as shown by Nawaz
std::size_t j = 0 ;
for(std::size_t i = 0 ; i < row ; i++ )
{
//...
}
make_baby(chromosomes); //use chromosomes
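// chromosomes only holds row pointers, so destroy the objects themselves,
// which live contiguously in buffer
Chromosome* first = reinterpret_cast<Chromosome*>(&buffer[0]);
in_place_destruct(first, first + row * col);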
// automatic freeing of memory holding pointers in chromosomes
// automatic freeing of buffer memory
}
template< typename InpIt >
void in_place_destruct(InpIt begin, InpIt end)
{
typedef typename std::iterator_traits<InpIt>::value_type value_type; // to call dtor
while(begin != end)
(begin++)->~value_type(); // call dtor
}
However, despite handling all memory through std::vector this still is not fully exception-safe, as it needs to call the Chromosome destructors explicitly. (If make_baby() throws an exception, the function f() will be aborted early. While the destructors of the vectors will delete their content, one only contains pointers, and the other treats its content as raw memory. No guard is watching over the actual objects created in that raw memory.)
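One way to close that gap is a small guard object that owns both the raw memory and the objects constructed in it. The following is only a sketch under my own assumptions: constructed_range and Tracked are made-up names, and the elements are default-constructed for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical element type that tracks how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Guard that owns raw storage plus the objects constructed in it so far.
template <typename T>
class constructed_range {
public:
    explicit constructed_range(std::size_t n)
        : first_(static_cast<T*>(::operator new(n * sizeof(T)))), count_(0) {
        try {
            for (; count_ != n; ++count_)
                ::new (static_cast<void*>(first_ + count_)) T();
        } catch (...) {
            destroy_all();  // undo the partially built range
            throw;
        }
    }
    ~constructed_range() { destroy_all(); }
    constructed_range(const constructed_range&) = delete;
    constructed_range& operator=(const constructed_range&) = delete;

    T* data() { return first_; }
    std::size_t size() const { return count_; }

private:
    void destroy_all() {
        while (count_ != 0)
            first_[--count_].~T();  // destroy in reverse order
        ::operator delete(first_);
    }
    T* first_;
    std::size_t count_;
};

// Returns the number of live objects after an exception unwinds the scope.
int alive_after_throw() {
    try {
        constructed_range<Tracked> objects(5);
        throw std::runtime_error("simulated failure in make_baby()");
    } catch (const std::runtime_error&) {
        // the guard's destructor has already run at this point
    }
    return Tracked::alive;
}
```

With such a guard, an exception thrown while the objects are in use still destroys them and frees the memory.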
The best solution I can see is to use a one-dimensional array wrapped in a class that allows two-dimensional access to the elements in that array. (Memory is one-dimensional, after all, on current hardware, so the system is already doing this.) Here's a sketch of that:
class chromosome_matrix {
public:
chromosome_matrix(std::size_t row, std::size_t col)
: row_(row), col_(col), data_(row*col)
{
// data_ contains row*col constructed Chromosome objects
}
// not needed, compiler generated dtor will do the right thing
//~chromosome_matrix()
// these rely on pointer arithmetic to access a row
Chromosome* operator[](std::size_t row) {return &data_[row*col_];}
const Chromosome* operator[](std::size_t row) const {return &data_[row*col_];}
private:
std::size_t row_;
std::size_t col_;
std::vector<Chromosome> data_;
};
void f(std::size_t row, std::size_t col)
{
chromosome_matrix cm(row, col);
Chromosome* first_row = cm[0]; // get a whole row (operator[] takes a row index)
Chromosome& chromosome1 = first_row[0]; // get one object
Chromosome& chromosome2 = cm[1][2]; // access object directly
// make baby
}
Check your destructors.
If you were allocating a built-in type (e.g. an int), then allocating 300,000 of them would be more expensive than the corresponding delete. But that's a relative term: 300k elements allocated in a single block is pretty fast.
As you're allocating 300k Chromosomes, the allocator has to allocate 300k * sizeof(Chromosome) bytes, and as you say it's fast. I can't see it doing much besides just that (i.e. the constructor calls are optimised into nothingness).
However, when you come to delete, it not only frees all that memory but also calls the destructor for each object. If it's slow, I would guess that the destructor for each object takes a small but noticeable time, which adds up when you have 300k of them.
I would suggest using placement new. The allocation and the deallocation can then each be done in a single statement!
int sizeMemory = sizeof(Chromosome) * row * col;
char* buffer = new char[sizeMemory]; //allocation of memory at once!
vector<Chromosome*> chromosomes;
chromosomes.reserve(row);
int j = 0 ;
for(int i = 0 ; i < row ; i++ )
{
//only construction of object. No allocation!
Chromosome *pChromosome = new (&buffer[j]) Chromosome[col];
chromosomes.push_back(pChromosome);
j = j+ sizeof(Chromosome) * col;
}
for(int i = 0 ; i < row ; i++ )
{
for(int j = 0 ; j < col ; j++ )
{
//only destruction of object. No deallocation!
chromosomes[i][j].~Chromosome();
}
}
delete [] buffer; //actual deallocation of memory at once!
std::vector can help.
Special memory allocators too.
In order to use placement new instead of automatically attempting to call the default constructor, I'm allocating an array using reinterpret_cast<Object*>(new char[num_elements * sizeof(Object)]) instead of new Object[num_elements].
However, I'm not sure how I should be deleting the array so that the destructors get called correctly. Should I loop through the elements, call the destructor manually for each element, and then cast the array to a char* and use delete[] on that, like this:
for (size_t i = 0; i < num_elements; ++i) {
array[i].~Object();
}
delete[] reinterpret_cast<char*>(array);
Or is it sufficient if I don't call the destructor manually for each element, and simply rely on delete[] to do that since the type of the array is Object*, like delete[] array?
What I'm worried about, is that not every platform might be able to determine the amount of elements in the array correctly that way, because I didn't allocate the array using a type of the right size. An answer to a question about "how delete[] knows the size of the operand" suggests that a possible implementation of delete[] would be to store the number of allocated elements (rather than the amount of bytes).
If delete[] is indeed implemented that way, that would suggest that using just delete[] array would try to delete too many elements, because the array was created with more char elements than how many Object elements fit in it. So in that case, the only reliable way to delete the array would be to manually call the destructors, cast the array to a char*, and then use delete[].
However, another logical way to implement it would be to store the size of the array in bytes, rather than the amount of elements, and then when calling delete[], divide the size of the array by the size of the type to get the amount of elements to call the destructor of. If this method is used, then just using delete[] array where array has a type of Object* would be sufficient.
So my question is: can I rely on delete[] to correctly call the destructors of the elements in the operand array, if the array was originally not allocated with the right type?
This is the code I'm using:
template <typename NumberType>
NeuronLayer<NumberType>::NeuronLayer(size_t num_inputs, size_t num_neurons, const NumberType *weights)
: neurons(reinterpret_cast<Neuron<NumberType>*>(new char[num_neurons * sizeof(Neuron<NumberType>)])),
num_neurons(num_neurons), num_weights(0) {
for (size_t i = 0; i < num_neurons; ++i) {
Neuron<NumberType> &neuron = neurons[i];
new(&neuron) Neuron<NumberType>(num_inputs, weights + num_weights);
num_weights += neuron.GetNumWeights();
}
}
and
template <typename NumberType>
NeuronLayer<NumberType>::~NeuronLayer() {
delete[] neurons;
}
or
template <typename NumberType>
NeuronLayer<NumberType>::~NeuronLayer() {
for (size_t i = 0; i < num_neurons; ++i) {
neurons[i].~Neuron();
}
delete[] reinterpret_cast<char*>(neurons);
}
Calling delete[] on an Object* will call the destructor once for every object allocated by new[]. new Object[N] typically stores N before the actual array, and delete[] certainly knows where to look.
Your code doesn't store that count. And it can't, since it's an unspecified implementation detail where and how the count is stored. As you speculate, there are two obvious ways: element count and array size, and one obvious location (before the array). Even so, there could be alignment issues, and you can't predict what type is used for the size.
Also, new unsigned char[N] is a special case since delete[] doesn't need to call destructors of char. In that case new[] doesn't need to store N at all. So you can't even bank on that size being stored, even if new Object[N] would have stored a size.
Here is portable code that manages a dynamic array of objects. It's essentially std::vector:
void * addr = ::operator new(sizeof(Object) * num_elements);
Object * p = static_cast<Object *>(addr);
for (std::size_t i = 0; i != num_elements; ++i)
{
::new (p + i) Object(/* some initializer */);
}
// ...
for (std::size_t i = 0; i != num_elements; ++i)
{
std::size_t ri = num_elements - i - 1;
(p + ri)->~Object();
}
::operator delete(addr);
This is the general pattern for organizing dynamic storage if you want very low-level control. The upshot is that dynamic arrays should never have been a language feature and are much better implemented in a library. As I said above, this code is pretty much identical to the existing standard library gadget called std::vector<Object>.
I create a 2D array of Nodes (the Node class is in a separate file) and I'm wondering how exactly to deallocate it (see below). I've tried many ways and memory leaks still appear.
board = new Node * [r];
//creates a column for each element in the row
for(int i = 0; i < r; i++) {
board [i] = new Node [c];
}
(r is the rows and c is the cols)
I've done this:
for(int i = 0; i < r; i++) {
delete [] board[i];
}
delete [] board;
But apparently that's not enough.
The code you have is correct and sufficient. However, it would be better to use RAII so that you do not need to explicitly call delete at all (perhaps not even new). For example, you could create a std::vector<std::vector<Node>>, or better still, some sort of matrix class (not in the standard library, unfortunately).
Your solution is the correct way to free a two-dimensional array. However, you may still get a memory leak if Node uses dynamic memory and its destructor is not properly defined.
As others have said, you're correctly pairing up all your new[]s and delete[]s: assuming no errors occur, the memory allocated by this code will be correctly deallocated.
The only issue is that errors may occur, and in particular exceptions may be thrown:
new[] can throw an exception if it fails to allocate memory (this doesn't normally happen on desktop OSes, but you should still write code as if it does).
Node's constructor may throw. Even if you've designed the constructor not to throw, you generally shouldn't take advantage of that knowledge. Just write code as if it throws.
In fact, you should just generally write code as if pretty much anything can throw. For more detailed info on writing exception safe code, and on what the exceptions to this rule are you can read the info at http://exceptionsafecode.com
The best way to make this code exception safe is to use RAII. For example use a vector instead of new[]/delete[].
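As a minimal sketch of that idea (Node here is a hypothetical stand-in for your class, and make_board and count_nodes are names I made up), a vector of vectors gives you the same board[i][j] syntax with no explicit new or delete:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's Node class.
struct Node {
    int value = 0;
};

// Builds an r-by-c board; every row is a std::vector that frees itself,
// even if an exception is thrown halfway through construction.
std::vector<std::vector<Node>> make_board(std::size_t r, std::size_t c) {
    return std::vector<std::vector<Node>>(r, std::vector<Node>(c));
}

// Counts the nodes on the board, just to show ordinary element access.
std::size_t count_nodes(const std::vector<std::vector<Node>>& board) {
    std::size_t n = 0;
    for (const auto& row : board)
        n += row.size();
    return n;
}
```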
Using an array of pointers and a separate allocation for each row makes sense for 'ragged' arrays, where each row can be a different length. In your case you have a rectangular array, so you can use a single allocation for the whole thing.
std::vector<Node> board(rows*columns);
board[row_index*columns + column_index] // replaces board[row_index][column_index]
You can hide the implementation by putting this in a class:
class Board {
std::vector<Node> board_data;
public:
const int rows;
const int columns;
Board(int rows_, int columns_)
: board_data(rows_*columns_)
, rows(rows_)
, columns(columns_)
{}
struct board_index { int row, column; };
Node &operator[](board_index i) {
assert(0 <= i.row && i.row < rows);
assert(0 <= i.column && i.column < columns);
return board_data[i.row * columns + i.column];
}
};
Board board(r, c);
With the above implementation, you replace board[i][j] with board[{i, j}].
board[{i, j}] = ... // assign to a place on the board
board[{i, j}].foo(); // call a Node method
std::cout << board[{i, j}]; // print a Node
// etc.
I am working on an application with high performance and memory needs. With that I mean 80 cores and 500 GB of RAM. To save some memory, I use my own dynamic array (16 B overhead) as opposed to std::vector (24 B overhead), which matters if you have billions of them.
My question relates to expanding that array which looks like this:
//private
template <class ArrType>
void DynamicArray<ArrType>::reallocate(unsigned newCapacity) {
if (newCapacity < _size) return;
if (capacity == newCapacity) return;
ArrType * newArray = new ArrType[newCapacity];
capacity = newCapacity;
//for (unsigned i = 0; i < _size; i++) {
// newArray[i] = array[i];
//}
memcpy(newArray, array, _size * sizeof(ArrType));
if(array) delete [] array;
array = newArray;
}
As you can see, this is a pretty standard reallocation, but I tested memcpy and it was about 10 times faster than using a for loop. The problem is that when I call delete[], it calls the destructors of the ArrType objects, which is a problem when ArrType has its own dynamic allocations: the copies in newArray will then refer to deleted memory. Is there any way to delete the old array without calling destructors?
Replace your memcpy with:
std::move(array, array + _size, newArray);
And require that the type ArrType must have a correct move or copy assignment operator.
But in real life, just use vector<ArrType>.
In fact vector is better than this: rather than allocating an array (which runs a constructor if the type has one) and then move-assigning (which over-writes what new just did) it allocates raw memory and then uses the move constructor with placement new.
So, if you absolutely positively need a version of vector that uses a smaller type for size_type than the one in your implementation I suppose the thing to do is to re-implement vector under a new name with that change. You can use the source in your implementation to help you: that way you will have solutions in front of you to this problem and all the other problems involved.
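For illustration, here is roughly what that raw-memory approach looks like. This is a sketch, not the code of any particular standard library: grow_storage, destroy_storage and grow_demo are hypothetical names, and it assumes the move constructor does not throw (a real vector has to handle that case as well).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Moves `size` objects from old_data into freshly allocated raw storage with
// room for new_capacity objects, destroys the moved-from originals, releases
// the old storage and returns the new block.
// Assumptions: old_data came from ::operator new, new_capacity >= size,
// and T's move constructor does not throw.
template <typename T>
T* grow_storage(T* old_data, std::size_t size, std::size_t new_capacity) {
    T* new_data = static_cast<T*>(::operator new(new_capacity * sizeof(T)));
    for (std::size_t i = 0; i != size; ++i) {
        ::new (static_cast<void*>(new_data + i)) T(std::move(old_data[i]));
        old_data[i].~T();  // the old object is destroyed exactly once
    }
    ::operator delete(old_data);  // frees memory only, runs no destructors
    return new_data;
}

// Destroys `size` objects and releases their storage.
template <typename T>
void destroy_storage(T* data, std::size_t size) {
    for (std::size_t i = 0; i != size; ++i)
        data[i].~T();
    ::operator delete(data);
}

// Usage: two strings survive a move from a 2-slot into an 8-slot block.
std::string grow_demo() {
    std::string* data =
        static_cast<std::string*>(::operator new(2 * sizeof(std::string)));
    ::new (static_cast<void*>(data)) std::string("flow");
    ::new (static_cast<void*>(data + 1)) std::string("shop");
    data = grow_storage(data, 2, 8);
    std::string joined = data[0] + " " + data[1];
    destroy_storage(data, 2);
    return joined;
}
```

Because the old block is released with ::operator delete rather than delete[], no destructor runs on memory that was never constructed.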
I am building a class called ParticleMatrix that stores a two dimensional array of the object Ball. I want to dynamically allocate space for them. The code looks like this.
/*
* allocateParticle takes a width w, height h, restlength RL then allocates space for
* and constructs a 2D array of Particles of subclass Ball.
*/
void ParticleMatrix::allocParticles(int w, int h, float RL)
{
// Gets the number of particles in the xDirection
xPart = getNrPart(w,RL);
// Gets the number of particles in the yDirection
yPart = getNrPart(h,RL);
// Allocates a row of pointers to pointers.
ballArray = new Ball*[xPart];
// The ID of the particles.
int ID = 0;
// For every particle in the xDirection
for(int x = 0; x<xPart; x++)
{
// Allocate a row of Ball-pointers yPart long.
ballArray[x] = new Ball[yPart];
// For every allocated space
for(int y = 0; y<yPart; y++)
{
// Construct a Ball
ballArray[x][y] = Ball( ID, RL*(float)x, RL*(float)y);
ID++;
}
}
}
The problem occurs with the line "ballArray[x] = new Ball[yPart]". CodeBlocks gives me the compiler error " error: no matching function for call to 'Ball::Ball()' ".
I have 4 constructors for Ball with different signatures, none looking like: "Ball()".
I have tried with adding a constructor "Ball::Ball()" and it compiles then but I feel like I should be able to just allocate space for an object and later instantiate them.
What I'm wondering is: Why can't I allocate space for the object Ball without having a constructor "Ball::Ball()" in the code above?
and: If it is possible in some way to allocate space without the constructor "Ball::Ball()", how would I go about doing it?
I know I can create the constructor "Ball::Ball()" and give the objects some dummy values then later set them to their required values, but I feel uncomfortable doing this since I don't know why I couldn't just "Allocate space -> instantiate object". I hope I was able to explain my issue. Thanks!
Instead of new T, which gets memory and calls the constructor, you can call operator new with a size you supply. This only provides memory and nothing else. You can then use placement new at properly calculated locations, which invokes only your constructor on the location you pass instead of allocating anew. Search for these terms to see examples.
But normally you're not supposed to do any of that, your task can be done well using a std::vector<Ball> with way less effort and more security.
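A minimal sketch of the std::vector route, assuming a Ball with the three-argument constructor used in the question (the Ball definition and make_balls below are my own stand-ins): reserve() allocates space without constructing anything, and emplace_back() constructs each Ball in place, so no default constructor is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Ball with no default constructor, like the one in the question.
struct Ball {
    Ball(int id, float x, float y) : id(id), x(x), y(y) {}
    int id;
    float x, y;
};

// Flat, row-major storage: element (x, y) lives at index x * yPart + y.
std::vector<Ball> make_balls(std::size_t xPart, std::size_t yPart, float RL) {
    std::vector<Ball> balls;
    balls.reserve(xPart * yPart);  // allocates space, constructs nothing
    int id = 0;
    for (std::size_t x = 0; x < xPart; ++x)
        for (std::size_t y = 0; y < yPart; ++y)
            balls.emplace_back(id++, RL * static_cast<float>(x),
                               RL * static_cast<float>(y));
    return balls;
}
```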
Another "C++ way" of doing this is using std::allocator.
It provides you allocate and deallocate which only reserve memory without constructing elements.
std::allocator<Ball*> ball_ptr_allocator;
std::allocator<Ball> ball_allocator;
Ball ** ptr = ball_ptr_allocator.allocate(10);
for (size_t x=0; x<10; ++x)
{
ptr[x] = ball_allocator.allocate(10);
for (size_t y=0; y<10; ++y)
{
ball_allocator.construct(&ptr[x][y], ID, RL*(float)x, RL*(float)y);
// if you do not have access to C++11 use this:
// ball_allocator.construct(&ptr[x][y], Ball(ID, RL*(float)x, RL*(float)y));
++ID;
}
}
Note several issues here:
I'd generally suggest using unsigned types for sizes (like size_t for example).
If you make the allocator a member (for example, std::allocator<Ball> m_ballAlloc;), you can access it in the destructor etc. to deallocate the memory again.
You have to (somehow) keep track of the constructed elements and allocated memory. If one of the constructions will throw an exception you should be able to clean up the constructed elements and deallocate the allocated memory.
For the deallocation tracking you can go with an additional loop in your allocParticles
for(size_t x = 0; x<xPart; x++) ballArray[x] = nullptr;
Now you know that every ballArray[i] that is not a nullptr needs to be deallocated.
But you'll have to destroy your elements first.
If you make your ID a member variable of the class you can use it to destroy constructed elements (since it's only incremented after element construction).
I wrote a destructor example that covers the destruction of ballArray only; note that you may have to take care of other resources too, if present.
~ParticleMatrix (void)
{
// check this here and set ballArray to nullptr upon construction
// of the ParticleMatrix
if (ballArray != nullptr)
{
// start to destroy the Balls
size_t destroycounter(0U);
for(size_t x = 0; x<xPart; x++)
{
for(size_t y = 0; y<yPart; y++)
{
if (destroycounter < ID)
{
m_ballAlloc.destroy(&ballArray[x][y]);
++destroycounter;
}
}
}
// deallocate 2nd dimension arrays
for(size_t x = 0; x<xPart; x++)
{
if (ballArray[x] != nullptr) m_ballAlloc.deallocate(ballArray[x], yPart);
}
// deallocate first dimension
delete [] ballArray;
}
}
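One caveat: the construct and destroy members of std::allocator were deprecated in C++17 and removed in C++20, so on newer compilers the same steps go through std::allocator_traits. Here is a small sketch of that variant; the Ball definition and allocator_traits_demo are hypothetical, and the rollback in the catch block is just one way to keep track of what was constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical Ball with no default constructor.
struct Ball {
    Ball(int id, float x, float y) : id(id), x(x), y(y) {}
    int id;
    float x, y;
};

// Allocates four Balls, constructs them in place, then tears everything
// down again; returns the ID of the last Ball as a simple check.
int allocator_traits_demo() {
    using Alloc = std::allocator<Ball>;
    using Traits = std::allocator_traits<Alloc>;
    Alloc alloc;
    const std::size_t n = 4;
    Ball* balls = Traits::allocate(alloc, n);  // memory only, no constructors
    std::size_t constructed = 0;
    try {
        for (; constructed != n; ++constructed)
            Traits::construct(alloc, balls + constructed,
                              static_cast<int>(constructed), 0.0f, 0.0f);
    } catch (...) {
        // roll back whatever was constructed before the failure
        while (constructed != 0)
            Traits::destroy(alloc, balls + --constructed);
        Traits::deallocate(alloc, balls, n);
        throw;
    }
    int last_id = balls[n - 1].id;
    while (constructed != 0)
        Traits::destroy(alloc, balls + --constructed);
    Traits::deallocate(alloc, balls, n);
    return last_id;
}
```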
In C++, a new expression does not just allocate space for an object, but also constructs it. If you don't have a default constructor Ball::Ball(), then new Ball[n] cannot construct each object in the array. With new[], there is no "just allocate space", in principle...
I want to implement an array that can grow as new values are added, just like in Java. I don't have any idea how to do this. Can anyone show me a way?
This is done for learning purposes, thus I cannot use std::vector.
Here's a starting point: you only need three variables, nelems, capacity and a pointer to the actual array. So, your class would start off as
class dyn_array
{
T *data;
size_t nelems, capacity;
};
where T is the type of data you want to store; for extra credit, make this a template class. Now implement the algorithms discussed in your textbook or on the Wikipedia page on dynamic arrays.
Note that the new/delete allocation mechanism does not support growing an array like C's realloc does, so you'll actually be moving data's contents around when growing the capacity.
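To give you an idea of the growing step, here is a learning-oriented sketch of the class above as a template with a doubling strategy. The push_back, grow and size members are my additions, and it assumes T is default-constructible and copy-assignable, which a production container would not require.

```cpp
#include <cassert>
#include <cstddef>

// Learning sketch: requires T to be default-constructible and copy-assignable.
template <typename T>
class dyn_array {
public:
    dyn_array() : data(nullptr), nelems(0), capacity(0) {}
    ~dyn_array() { delete[] data; }
    dyn_array(const dyn_array&) = delete;             // copying left out
    dyn_array& operator=(const dyn_array&) = delete;  // to keep this short

    void push_back(const T& value) {
        if (nelems == capacity)
            grow();
        data[nelems++] = value;
    }
    T& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return nelems; }

private:
    // new[]/delete[] cannot resize in place, so allocate a bigger block,
    // copy the elements over and release the old block.
    void grow() {
        std::size_t new_capacity = capacity == 0 ? 1 : capacity * 2;
        T* new_data = new T[new_capacity];
        try {
            for (std::size_t i = 0; i != nelems; ++i)
                new_data[i] = data[i];
        } catch (...) {
            delete[] new_data;  // a copy failed: keep the old array intact
            throw;
        }
        delete[] data;
        data = new_data;
        capacity = new_capacity;
    }

    T* data;
    std::size_t nelems, capacity;
};
```

Doubling the capacity keeps the number of reallocations logarithmic in the number of elements, which is the usual argument for this growth strategy.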
I would like to take the opportunity to interest you in an interesting but somewhat difficult topic: exceptions.
If you start allocating memory yourself and subsequently playing with raw pointers, you will find yourself in the difficult position of avoiding memory leaks.
Even if you are entrusting the book-keeping of the memory to the right class (say std::unique_ptr<char[]>), you still have to ensure that operations that change the object leave it in a consistent state should they fail.
For example, here is a simple class with an incorrect resize method (which is at the heart of most code):
template <typename T>
class DynamicArray {
public:
// Constructor
DynamicArray(): size(0), capacity(0), buffer(0) {}
// Destructor
~DynamicArray() {
if (buffer == 0) { return; }
for(size_t i = 0; i != size; ++i) {
T* t = buffer + i;
t->~T();
}
free(buffer); // using delete[] would require all objects to be built
}
private:
size_t size;
size_t capacity;
T* buffer;
};
Okay, so that's the easy part (although already a bit tricky).
Now, how do you push a new element at the end ?
template <typename T>
void DynamicArray<T>::resize(size_t n) {
// The *easy* case
if (n <= size) {
for (; n < size; ++n) {
(buffer + n)->~T();
}
size = n;
return;
}
// The *hard* case
// new size
size_t const oldsize = size;
size = n;
// new capacity
if (capacity == 0) { capacity = 1; }
while (capacity < n) { capacity *= 2; }
// new buffer (copied)
T* newbuffer = (T*)malloc(capacity*sizeof(T));
try {
// copy
for (size_t i = 0; i != oldsize; ++i) {
new (newbuffer + i) T(*(buffer + i));
}
free(buffer);
buffer = newbuffer;
} catch(...) {
free(newbuffer);
throw;
}
}
Feels right no ?
I mean, we even take care of a possible exception raised by T's copy constructor! yeah!
Do note the subtle issue we have though: if an exception is thrown, we have changed the size and capacity members but still have the old buffer.
The fix is obvious, of course: we should first change the buffer, and then the size and capacity. Of course...
But it is "difficult" to get it right.
I would recommend using an alternative approach: create an immutable array class (the capacity should be immutable, not the rest), and implement an exception-less swap method.
Then, you'll be able to implement the "transaction-like" semantics much more easily.
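Here is a sketch of what I mean, under my own naming (FixedBuffer, grow_to and the deliberately fragile test type are all hypothetical): the new buffer is filled on the side, and only a swap that cannot throw commits the change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>
#include <utility>

// Hypothetical buffer whose capacity is fixed at construction.
template <typename T>
class FixedBuffer {
public:
    explicit FixedBuffer(std::size_t capacity)
        : size_(0), capacity_(capacity), buffer_(nullptr) {
        if (capacity != 0) {
            buffer_ = static_cast<T*>(std::malloc(capacity * sizeof(T)));
            if (buffer_ == nullptr) throw std::bad_alloc();
        }
    }
    ~FixedBuffer() {
        while (size_ != 0) buffer_[--size_].~T();
        std::free(buffer_);
    }
    FixedBuffer(const FixedBuffer&) = delete;
    FixedBuffer& operator=(const FixedBuffer&) = delete;

    // Precondition: size() < capacity().
    void push_back(const T& value) {
        ::new (static_cast<void*>(buffer_ + size_)) T(value);
        ++size_;  // counted only after the constructor succeeded
    }
    void swap(FixedBuffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(capacity_, other.capacity_);
        std::swap(buffer_, other.buffer_);
    }
    T& operator[](std::size_t i) { return buffer_[i]; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    std::size_t size_;
    std::size_t capacity_;
    T* buffer_;
};

// Grows `array` to new_capacity with all-or-nothing semantics.
template <typename T>
void grow_to(FixedBuffer<T>& array, std::size_t new_capacity) {
    if (new_capacity <= array.capacity()) return;
    FixedBuffer<T> tmp(new_capacity);
    for (std::size_t i = 0; i != array.size(); ++i)
        tmp.push_back(array[i]);  // may throw; array is untouched so far
    array.swap(tmp);              // commit step, cannot throw
}  // tmp now owns the old storage (or the partial copy) and releases it

// Test type whose copy constructor throws once copies_left reaches zero.
struct Fragile {
    static int copies_left;
    int value;
    explicit Fragile(int v) : value(v) {}
    Fragile(const Fragile& other) : value(other.value) {
        if (copies_left == 0) throw std::runtime_error("copy failed");
        --copies_left;
    }
};
int Fragile::copies_left = 0;

// Returns true if a failed grow_to leaves the array exactly as it was.
bool grow_is_transactional() {
    FixedBuffer<Fragile> a(2);
    Fragile::copies_left = 100;
    a.push_back(Fragile(1));
    a.push_back(Fragile(2));
    Fragile::copies_left = 1;  // the second copy inside grow_to will throw
    bool threw = false;
    try {
        grow_to(a, 8);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    return threw && a.size() == 2 && a.capacity() == 2 &&
           a[0].value == 1 && a[1].value == 2;
}
```

The members of the original object are only touched inside swap, so there is no window in which size, capacity and buffer disagree.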
An array that grows dynamically as elements are added is called a dynamic array or growable array; here is a complete implementation of a dynamic array.
In C and C++, array notation is basically just shorthand for pointer arithmetic.
So in this example.
int fib [] = { 1, 1, 2, 3, 5, 8, 13};
This:
int position5 = fib[5];
Is the same thing as saying this:
int position5 = *(int*)((char*)fib + (5 * sizeof(int)));
So basically array indexing is just pointer arithmetic: in most expressions an array decays to a pointer to its first element.
So if you want automatic allocation, you will need to write some wrapper functions that call malloc() or new (C and C++ respectively).
Although you might find vectors are what you are looking for...
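A small sketch to check the equivalence for yourself (the three helper functions are just illustrative names): indexing, pointer arithmetic and explicit byte offsets all reach the same element.

```cpp
#include <cassert>
#include <cstddef>

// Plain indexing.
int element_by_index(const int* a, std::size_t i) { return a[i]; }

// Pointer arithmetic: a + i advances by i elements, not i bytes.
int element_by_pointer(const int* a, std::size_t i) { return *(a + i); }

// The same access spelled out in bytes.
int element_by_bytes(const int* a, std::size_t i) {
    const char* bytes = reinterpret_cast<const char*>(a);
    return *reinterpret_cast<const int*>(bytes + i * sizeof(int));
}
```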