This is my first time implementing a graph where the total number of nodes is known when the constructor is called, and performance is the highest priority.
I have never allocated memory before, so the process is a little hazy.
The number of nodes required is (n*(n+1))/2, where n is the length of the string passed to the constructor.
#include <string>

struct ColorNode {
    ColorNode* lParent;
    ColorNode* rParent;
    char color;
};

class ParentGraph {
    std::string base;
    int len, nodes;
public:
    ParentGraph(std::string b) : base(b) {
        len = base.length();
        nodes = (len * (len + 1)) / 2;
        // how do I allocate enough memory for "nodes" copies of ColorNode?
    }
};
What is the best practice for allocating memory in this instance?
Will allocating the memory beforehand make a significant difference in performance?
It may turn out that an array or vector is a better choice, but I really need the practice in both data structures and memory allocation.
Thanks for the consideration.
Use
std::vector<ColorNode> nodes;
and life will be very simple after that.
You can help std::vector if you know the size you want:
auto nodes = std::vector<ColorNode>(size);
This will allocate a contiguous array on the heap for you and manage its growth, allocation, deallocation, etc.
You will get basically the same in-memory structure if you do new ColorNode[size] (or even malloc(....) if some evil person tried to persuade you that raw malloc would be faster), but then you have to do all the nasty management yourself.
You only need to diverge from this advice if you have too many objects to fit into one contiguous memory block. If that's the case, say so.
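Applied to the constructor from the question, a minimal sketch might look like this (the member name nodes and the use of resize are illustrative choices, not the only way to do it):

#include <string>
#include <vector>

struct ColorNode {
    ColorNode* lParent;
    ColorNode* rParent;
    char color;
};

class ParentGraph {
    std::string base;
    int len;
    std::vector<ColorNode> nodes;  // one contiguous heap block, freed automatically
public:
    ParentGraph(std::string b) : base(b) {
        len = base.length();
        nodes.resize((len * (len + 1)) / 2);  // value-initializes every ColorNode
    }
};

A nice side effect: because the node count is fixed and the vector is sized once up front, it never reallocates, so ColorNode* pointers between elements stay valid.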
I have identified a bottleneck in my C++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked together a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over and over again using push_back. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector should be until the for-loop has completed.)
std::vector<float> data_vector;
// Put a bunch of floats into data_vector

std::vector<float> my_vector;
while (some_condition_is_true) {
    my_vector.clear();
    for (std::size_t i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            my_vector.push_back(data_vector[i]);
        }
    }
    // Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
    // Change the elements of data_vector, but not the size of data_vector
}
Use std::copy_if, and reserve data_vector.size() for my_vector initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<float> my_vec;
my_vec.reserve(data_vec.size());
std::copy_if(data_vec.begin(), data_vec.end(), std::back_inserter(my_vec),
             [](const auto& el) { return el > 1; });
Note that you could skip the reserve call here if you expect the number of elements for which your predicate evaluates to true to be much less than the size of data_vector.
Though there are various great solutions posted by others for your query, it seems there is still not much explanation of the memory allocation that you say you do not understand, so I would like to share my knowledge of this topic with you.
Firstly, in C++ there are several kinds of memory: the stack, the heap, and the data segment.
The stack is for local variables. It has some important features: variables on it are deallocated automatically, operations on it are very fast, and its size is OS-dependent and small (typically a few megabytes), so storing large buffers on the stack can cause a stack overflow.
Heap memory can be accessed globally. Its important features: its size can be extended dynamically when needed and is much larger than the stack's; operations on it are slower than on the stack; and manual deallocation is required (although in modern OSes the memory is reclaimed automatically when the program ends).
The data segment is for global and static variables. In fact, this piece of memory can be divided into even smaller parts, e.g. the BSS.
In your case, a vector is used. The elements of a vector are stored in an internal, dynamically sized array, and a dynamically sized array has to be created on the heap, so the vector's elements live in an internal array on the heap. To grow a dynamic array, a process called memory reallocation is needed: allocate a bigger block, move the elements over, and free the old block. If every single push triggered a reallocation, the overhead for a user who keeps enlarging the vector would be high. To deal with this, a vector allocates a piece of memory that is larger than the current need, i.e. it reserves memory for potential future use. Therefore, in your code, a reallocation is not performed every time push_back() is called. However, if the data being copied is large, the memory reserved for future use runs out and reallocation does occur. To avoid that, vector.reserve() may be used.
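A small sketch that makes this growth policy visible (the exact capacities printed are implementation-defined):

#include <iostream>
#include <vector>

int main() {
    std::vector<float> v;
    auto cap = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(1.0f);
        if (v.capacity() != cap) {  // a reallocation just happened
            cap = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << cap << '\n';
        }
    }
    // Only a handful of lines print: push_back reallocates rarely, not on every call.
}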
I am a newbie, so hopefully I have not made any mistakes in my sharing. Hope this helps.
Run the code twice: the first time, only count how many new elements you will need; then use reserve to allocate all the memory you need up front.
while (some_condition_is_true) {
    my_vector.clear();

    // First pass: count how many elements will be copied.
    int newLength = 0;
    for (std::size_t i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            newLength++;
        }
    }
    my_vector.reserve(newLength);

    // Second pass: copy the matching elements.
    for (std::size_t i = 0; i < data_vector.size(); i++) {
        if (data_vector[i] > 1) {
            my_vector.push_back(data_vector[i]);
        }
    }

    // Do stuff with my_vector and change data_vector
}
I doubt that allocating my_vector is the problem, especially if the while loop is executed many times, since the capacity of my_vector should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector corresponding to the size of data_vector:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
    my_vector.clear();
    for (auto value : data_vector) {
        if (value > 1)
            my_vector.push_back(value);
    }
}
If you are on Linux you can reserve memory for my_vector up front to prevent the std::vector reallocations that are the bottleneck in your case. Note that thanks to overcommit, reserve will not waste memory, so any rough upper estimate for the reserve value will fit your needs; in your case the size of data_vector will be enough. This line of code before the while loop should fix the bottleneck:
my_vector.reserve(data_vector.size());
I understand that we can use the size() function to obtain the vector size, for example:
std::vector<int> abc;
abc.resize(3);
abc.size();
My question is: how can I know the memory size of a vector? Take an example:
std::vector<int> abc;
abc.reserve(7);
//the size of memory that has been allocated for abc
You use the member function capacity() to obtain the allocated capacity
std::vector<int> abc;
abc.reserve(7);
std::cout << abc.capacity() << std::endl;
To get the memory allocated, in bytes, you can do:
sizeof(int) * abc.capacity();
This assumes you know that your value_type is int. If you don't:
sizeof(decltype(abc.back())) * abc.capacity();
The real answer is that you can't. Others have suggested ways that will often work, but you can't depend on capacity reflecting the actual memory allocated in any way.
For one thing, the heap will often allocate more memory than was requested. This has to do with optimizations against fragmentation, etc. The vector has no way of knowing how much memory was actually allocated, only what it requested.
So capacity at best gives you a very rough estimate.
Use the capacity member function - http://en.cppreference.com/w/cpp/container/vector/capacity
The standard makes strong guarantees that the memory is contiguous, so the size is
sizeof(abc[0]) * abc.capacity();
or (assuming abc holds at least two elements)
((char*)&abc[1] - (char*)&abc[0]) * abc.capacity();
Since std::vector can store complex objects (such as std::string) that do their own internal memory management and may allocate additional memory, determining the total memory usage can be hard.
For a vector containing simple objects such as int, the suggested solution using capacity and sizeof will work though.
I have a
priority_queue<node*, std::vector<node*>, CompareNodes> heap;
Let's say the node consists of:
class node {
public:
    int value;
    int key;
    int order = 1000000;
};
How do I free the memory after I'm done with the priority queue?
My approach doesn't seem to be working:
while (heap.top()) {
    node* t = heap.top();
    heap.pop();
    delete t;
}
Looks like you'll want to do something more like this:
while (!heap.empty())
{ /* the rest ... */ }
If the heap is empty there is nothing to return, so calling .top() is undefined behavior (it does not throw), and that is exactly the state you reach once you have popped the last element.
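Putting the fix together with the loop body from the question:

while (!heap.empty()) {
    node* t = heap.top();
    heap.pop();
    delete t;
}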
Also, if available, you should use
priority_queue<std::unique_ptr<node>, std::vector<std::unique_ptr<node>>, CompareNodes> heap;
so you don't have to worry about clearing the memory yourself.
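A sketch of the owning version; the comparator here is a hypothetical CompareNodes rewritten to look through the smart pointers (the ordering by order is just an example):

#include <memory>
#include <queue>
#include <vector>

// node as defined in the question
class node {
public:
    int value;
    int key;
    int order = 1000000;
};

// Hypothetical comparator adapted to compare through unique_ptr.
struct CompareNodes {
    bool operator()(const std::unique_ptr<node>& a,
                    const std::unique_ptr<node>& b) const {
        return a->order > b->order;  // example: smallest order on top
    }
};

int main() {
    std::priority_queue<std::unique_ptr<node>,
                        std::vector<std::unique_ptr<node>>,
                        CompareNodes> heap;
    heap.push(std::make_unique<node>());
    // When heap goes out of scope, every remaining node is deleted automatically.
}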
Just like most std:: containers, the memory may or may not be freed when you want it to be. Memory is usually kept around for a longer time so that when you perform a heap.push or equivalent operation, the memory doesn't need to be allocated again.
Think of std::vector, which has to allocate a new block of memory for the entire vector each time it grows (vector data must be contiguous in memory). It is more efficient for std::vector to perform one large allocation and keep the memory around so that growth doesn't kill performance, since growing means: a) allocate new space big enough, b) copy the entire contents of the existing vector to the new space, c) delete the old space.
Bottom line is you can't force it to free memory for individual items.
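For example (capacities here assume a typical implementation):

std::vector<int> v(1000);
v.clear();            // size() == 0, but capacity() typically stays >= 1000
// To actually release the memory, either:
v.shrink_to_fit();    // C++11: a non-binding request
// or the pre-C++11 swap idiom, which is guaranteed to drop the old buffer:
std::vector<int>().swap(v);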
Why is it not possible to get the length of a buffer allocated in this fashion?
AType * pArr = new AType[nVariable];
When the same array is deallocated
delete [] pArr;
the runtime must know how much to deallocate. Is there any means of accessing the length before deleting the array? If not, why is no such API provided that would fetch the length?
Is there any means of accessing the length before deleting the array?
No, there is no way to determine that.
The standard does not require the implementation to remember and provide the specifics of the number of elements requested through new.
The implementation may simply insert specific bit patterns at the end of allocated memory blocks instead of remembering the number of elements, and simply look for that pattern while freeing the memory.
In short, it is solely an implementation detail.
On a side note, there are two options to practically overcome this problem:
You can simply use a std::vector, which provides member functions like size(), or
you can simply do the bookkeeping yourself.
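A minimal sketch of the second option (the names are just illustrative):

size_t count = nVariable;        // remember the element count yourself...
AType* pArr = new AType[count];  // ...alongside the allocation
// ... count stays available whenever the length is needed ...
delete[] pArr;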
new allocates at least as much memory as you requested, and you already know how much you requested, so you can calculate the length easily. You can find the size of each item using sizeof.
Total memory requested / Memory required for 1 item = No of Items
The runtime DOES know how much was allocated. However, such details are compiler-specific, so you don't have any cross-platform way to handle it.
If you would like the same functionality and the ability to track the size, you could use a std::vector as follows:
std::vector< AType > pArr( nVariable );
This has the added advantage of using RAII as well.
The delete operator doesn't need to be told the size to free the allocated memory, just like the free function doesn't. This is because that problem is left to the underlying memory allocator, not to your code.
The runtime must deallocate the same amount as it allocated, and it does keep track of this in some manner (usually very indirectly). But there's no reliable way of getting from the amount allocated to the number of elements: the amount allocated cannot be less than the number of elements times the size of each element, but it will often be more. Alignment considerations, for example, mean that new char[5] and new char[8] will often allocate the same amount of memory, and there are various allocation strategies which can cause significantly more memory to be allocated than what is strictly necessary.
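On glibc you can see this over-allocation directly with malloc_usable_size (non-portable, purely illustrative):

#include <cstdio>
#include <cstdlib>
#include <malloc.h>  // glibc-specific header for malloc_usable_size

int main() {
    char* a = static_cast<char*>(std::malloc(5));
    char* b = static_cast<char*>(std::malloc(8));
    // Due to alignment and size classes, both blocks often have the same usable size.
    std::printf("%zu %zu\n", malloc_usable_size(a), malloc_usable_size(b));
    std::free(a);
    std::free(b);
}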
No, not really. At least not in a platform-independent, defined way.
Most implementations store the size of a dynamically allocated array before the actual array though.
There is no portable way in C++ to get the size of a dynamically allocated array from the raw pointer.
Under MSVC and WIN32 you can get the size of the allocated block with the _msize(void*) function.
see https://msdn.microsoft.com/en-us/library/z2s077bc.aspx for further details.
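A sketch of that (MSVC-only; _msize is documented for pointers obtained from malloc/calloc/realloc, so the example uses malloc rather than new[]):

#include <malloc.h>  // MSVC CRT header declaring _msize

int* p = static_cast<int*>(malloc(10 * sizeof(int)));
size_t blockBytes = _msize(p);  // bytes in the heap block, >= 10 * sizeof(int)
free(p);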
I use this "dirty" method, only for debugging purposes:
T *p = new T[count];
size_t size = (char*)&(p[count]) - (char*)p;
This gives the size of the real data but not any extra space that may have been allocated by the compiler.
For already aligned types T, it is equal to:
size_t size = sizeof(T) * count;
Of course this doesn't work if you don't know the count of items in the array.
Why not keep a bit of extra info yourself, like this:
template <typename T> class AType
{
public:
    AType(size_t s) : a_size(s), data(new T[s]) {}

    ~AType()
    {
        delete[] data;  // delete[] on a null pointer is a no-op, so no check is needed
    }

    size_t getSize() const
    {
        return a_size * sizeof(T);
    }

private:
    size_t a_size;
    T* data;
};
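Usage might then look like:

AType<int> arr(10);
size_t bytes = arr.getSize();  // 40 on a platform where sizeof(int) == 4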
I have a long array of data (n entities). Every object in this array has some values (say, m values per object), and I have a loop like:
myType* A;
// reading the array of objects
std::vector<anotherType> targetArray;
int i, j, k = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
    {
        if (check(A[i].fields[j]))
        {
            // creating and adding the object to targetArray
            targetArray[k] = someGenerator(A[i].fields[j]);
            k++;
        }
    }
In some cases I have n * m valid objects, in others (n * m) / 10 or fewer.
The question is: how do I allocate memory for targetArray?
targetArray.reserve(n*m);
// Do work
targetArray.shrink_to_fit();
Count the elements without generating objects, then allocate as much memory as I need and go through the loop one more time.
Resize the array on every iteration where new objects are created.
I see a huge tactical mistake in each of my methods. Is there another way to do it?
What you are doing here is called premature optimization. By default, std::vector increases its memory footprint geometrically as it runs out of room for new objects; for example, a typical implementation roughly doubles the capacity each time it runs out of space, so the number of reallocations grows only logarithmically with the final size. Just stick with push_back and get your code working.
You should start thinking about memory allocation optimization only when the above approach proves itself to be a bottleneck in your design. If that ever happens, I think the best bet would be to come up with a good approximation of the number of valid objects and just call reserve() on the vector, something like your first approach. Just make sure your shrink-to-fit step is correct, because vectors don't like to shrink: shrink_to_fit() is only a non-binding request, and before C++11 you had to use the swap trick.
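As a sketch, both shrink variants (using the names from the question):

// C++11 and later: a non-binding request to drop the excess capacity.
targetArray.shrink_to_fit();

// Pre-C++11 "swap trick": copy into a right-sized temporary, then swap buffers.
std::vector<anotherType>(targetArray).swap(targetArray);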
Resizing the array on every step is no good, and std::vector won't really do that unless you try hard.
Doing an extra pass through the list of objects can help, but it may also hurt, as you could easily waste CPU cycles, thrash the CPU cache, etc. If in doubt, profile it.
The typical way would be to use targetArray.push_back(). This reallocates the memory when needed and avoids two passes through your data. It has a system for reallocating the memory that makes it pretty efficient, doing fewer reallocations as the vector gets larger.
However, if your check() function is very fast, you might get better performance by going through the data twice, determining how much memory you need and making your vector the right size to begin with. I would only do this if profiling has determined it is really necessary though.
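A sketch of that two-pass variant, reusing the names from the question (A, n, m, check, and someGenerator are assumed as declared there):

// First pass: count the valid objects without generating them.
std::size_t valid = 0;
for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
        if (check(A[i].fields[j]))
            ++valid;

// Allocate once, then generate in the second pass.
targetArray.clear();
targetArray.reserve(valid);
for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
        if (check(A[i].fields[j]))
            targetArray.push_back(someGenerator(A[i].fields[j]));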