How to avoid reallocation using the STL (C++)

This question is derived from the topic:
vector reserve c++
I am using a data structure of type vector<vector<vector<double> > >. It is not possible to know the size of each of these vectors (except the outer one) before items (doubles) are added. I can get an approximate size (an upper bound) on the number of items in each "dimension".
A solution with shared pointers might be the way to go, but I would like to try a solution where the vector<vector<vector<double> > > simply has enough space .reserve()d (or has allocated enough memory in some other way).
Will A.reserve(500) (assuming 500 is the size or, alternatively, an upper bound on the size) be enough to hold "2D" vectors of large size, say [1000][10000]?
The reason for my question is mainly because I cannot see any way of reasonably estimating the size of the interior of A at the time of .reserve(500).
An example of my question:
vector<vector<vector<int> > > A;
A.reserve(500+1);
vector<vector<int> > temp2;
vector<int> temp1 (666,666);
for(int i=0;i<500;i++)
{
A.push_back(temp2);
for(int j=0; j< 10000;j++)
{
A.back().push_back(temp1);
}
}
Will this ensure that no reallocation is done for A?
If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation occurs at all?
In the above please disregard the fact that memory could be wasted due to conservative .reserve() calls.
Thank you all in advance!

Your example will cause a lot of copying and allocations.
vector<vector<vector<double>>> A;
A.reserve(500+1);
vector<vector<double>> temp2;
vector<double> temp1 (666,666);
for(int i=0;i<500;i++)
{
A.push_back(temp2);
for(int j=0; j< 10000;j++)
{
A.back().push_back(temp1);
}
}
Q: Will this ensure that no reallocation is done for A?
A: Yes.
Q: If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation occurs at all?
A: Here temp1 already knows its own length at creation time and will not be modified, so adding temp1.reserve(1000) will only force an unneeded reallocation.
I don't know what the vector classes copy in their copy ctor; using A.back().reserve(10000) should work for this example.
Update: Just tested with g++: the capacity of temp2 is not copied, so temp2.reserve(10000) will not work.
And please use the source formatting when you post code; it makes it more readable :-).

How can reserving 500 entries in A beforehand be enough for [1000][1000]?
You need to reserve at least 1000 for A (your actual upper bound), and then whenever you add an entry to A, reserve another 1000 or so in it (again the upper bound, but for the second dimension).
i.e.
A.reserve(UPPERBOUND);
for(int i = 0; i < UPPERBOUND; ++i)
{
A.push_back(vector<vector<int> >()); // reserve() alone creates no elements, so A[i] would be out of bounds
A.back().reserve(UPPERBOUND);
}
BTW, reserve reserves a number of elements, not a number of bytes.

The reserve function will work properly for your vector A, but will not work as you expect for temp1 and temp2.
The temp1 vector is initialized with a given size, so it already has the proper capacity, and you don't need to call reserve on it as long as you don't plan to increase its size.
Regarding temp2, the capacity is not carried over in a copy. Since push_back adds a copy to your vector, with code like this
vector<vector<double>> temp2;
temp2.reserve(1000);
A.push_back(temp2); //A.back().capacity() == 0
you are just increasing the memory allocated for temporaries that will soon be deallocated, not the capacity of the elements stored in the vector as you expect. If you really want to use a vector of vectors as your solution, you will have to do something like this
vector<vector<double>> temp2;
A.push_back(temp2);
A.back().reserve(1000); //A.back().capacity() == 1000
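The difference between reserving on the temporary and reserving on the stored element can be checked with a small experiment. This is only a sketch: countRowReallocations is a made-up helper, the leaf vectors of three doubles are arbitrary, and it counts only reallocations of the row buffers (each leaf copy still allocates its own storage).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<std::vector<double> > > Vec3D;

// Hypothetical helper: appends `outer` rows to a fresh Vec3D, fills each row
// with `perRow` small leaf vectors, and counts how often a row's capacity
// changes (each change is one reallocation of that row's buffer).
std::size_t countRowReallocations(std::size_t outer, std::size_t perRow,
                                  bool reserveAfterPush)
{
    assert(outer > 0 && perRow > 0);
    Vec3D A;
    A.reserve(outer);                       // A itself never reallocates below
    const std::vector<double> leaf(3, 1.0);
    std::size_t reallocations = 0;

    for (std::size_t i = 0; i < outer; ++i)
    {
        A.push_back(std::vector<std::vector<double> >());
        if (reserveAfterPush)
            A.back().reserve(perRow);       // reserve on the stored row, not on a temporary

        for (std::size_t j = 0; j < perRow; ++j)
        {
            const std::size_t before = A.back().capacity();
            A.back().push_back(leaf);
            if (A.back().capacity() != before)
                ++reallocations;
        }
    }
    return reallocations;
}
```

Calling countRowReallocations(500, 10000, true) should report zero row reallocations, while passing false lets every row grow step by step.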

I had the same issue one day. A clean way to do this (I think) is to write your own allocator and use it for the inner vectors (it is the last template parameter of std::vector<>). The idea is to write an allocator that doesn't actually allocate memory but simply returns the right address inside the memory of your outer vector. You can easily compute this address if you know the sizes of all the previous vectors.

In order to avoid copying and reallocation for a data structure such as vector<vector<vector<double> > >, I would suggest the following:
vector<vector<vector<double> > > myVector(FIXED_SIZE);
In order to 'assign' a value to it, don't define your inner vectors until you actually know their required dimensions, then use swap() instead of assignment:
vector<vector<double> > innerVector( KNOWN_DIMENSION );
myVector[i].swap( innerVector );
Note that push_back() will do a copy operation and might cause reallocation, while swap() won't (assuming same allocator types are used for both vectors).
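A minimal sketch of the swap() idea, assuming the default allocator. swapIntoSlot is a hypothetical helper that also reports whether the storage was handed over rather than copied, by comparing the address of the first row before and after the swap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Inner;

// Hypothetical helper: hands `inner` over to slot i of `outer` via swap()
// and returns true if the row storage kept its address (nothing was copied).
bool swapIntoSlot(std::vector<Inner>& outer, std::size_t i, Inner& inner)
{
    assert(i < outer.size());
    const std::vector<double>* before = inner.empty() ? nullptr : &inner[0];
    outer[i].swap(inner);   // exchanges the internal buffers of the two vectors
    const std::vector<double>* after = outer[i].empty() ? nullptr : &outer[i][0];
    return before == after;
}
```

After the call, inner holds whatever was in the slot before (an empty vector if the slot was freshly constructed), so it can be reused for the next row.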

It seems to me that you need a real matrix class instead of nesting vectors. Have a look at boost, which has some strong sparse matrix classes.

Ok, now I have done some small-scale testing on my own. I used a "2DArray" obtained from http://www.tek-tips.com/faqs.cfm?fid=5575 to represent a structure whose memory is allocated once, with a fixed size. For the dynamic allocation I used vectors, almost as indicated in my original post.
I tested the following code (hr_time is a timing routine found on the web which, due to the anti-spam rules, I unfortunately cannot link to, but credit goes to David Bolton for providing it):
#include <vector>
#include "hr_time.h"
#include "2dArray.h"
#include <iostream>
using namespace std;
int main()
{
vector<int> temp;
vector<vector<int> > temp2;
CStopWatch mytimer;
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
temp2.push_back(temp);
for(int j=0; j< 2000; j++)
{
temp2.back().push_back(j);
}
}
mytimer.stopTimer();
cout << "With vectors without reserved: " << mytimer.getElapsedTime() << endl;
vector<int> temp3;
vector<vector<int> > temp4;
temp3.reserve(1001);
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
temp4.push_back(temp3);
for(int j=0; j< 2000; j++)
{
temp4.back().push_back(j);
}
}
mytimer.stopTimer();
cout << "With vectors with reserved: " << mytimer.getElapsedTime() << endl;
int** MyArray = Allocate2DArray<int>(1000,2000);
mytimer.startTimer();
for(int i=0; i<1000; i++)
{
for(int j=0; j< 2000; j++)
{
MyArray[i][j]=j;
}
}
mytimer.stopTimer();
cout << "With 2DArray: " << mytimer.getElapsedTime() << endl;
//Test
for(int i=0; i<1000; i++)
{
for(int j=0; j< 200; j++)
{
//cout << "My Array stores :" << MyArray[i][j] << endl;
}
}
return 0;
}
It turns out that there is a difference of approximately a factor of 10 for these sizes. I should thus reconsider whether dynamic allocation is appropriate for my application, since speed is of utmost importance!

Why not subclass the inner containers and call reserve() in their constructors?

If the matrix does get really large and sparse, I'd try a sparse matrix library too. Otherwise, before messing with allocators, I'd try replacing vector with deque. A deque does not move its existing elements when it grows at either end, and it offers random access almost as fast as a vector's.
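The deque property mentioned above can be observed directly: appending at the end of a std::deque never moves the elements already stored. The sketch below uses a made-up helper, firstElementStaysPut, which watches the address of the first element while more elements are appended.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: returns true if the first element of `c` stayed at
// the same address while `count` default-constructed elements were appended.
// The container is taken by value, so the caller's copy is not modified.
template <typename Container>
bool firstElementStaysPut(Container c, std::size_t count)
{
    assert(!c.empty());
    const typename Container::value_type* first = &c.front();
    for (std::size_t i = 0; i < count; ++i)
        c.push_back(typename Container::value_type());
    return first == &c.front();
}
```

For a std::deque this returns true for any count. For a std::vector without a prior reserve() it will usually return false once the buffer has been reallocated.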

This was more or less answered here. So your code would look something like this:
vector<vector<vector<double> > > foo(maxdim1,
vector<vector<double> >(maxdim2,
vector<double>(maxdim3)));

Related

C++ Removing empty elements from array

I only want to add a[i] into the result array if the condition is met, but this method causes empty elements in the array as it adds to result[i]. Is there a better way to do this?
for(int i=0; i<N; i++)
{
if(a[i]>=lower && a[i]<=upper)
{
count++;
result[i]=a[i];
}
}
You can let result stay empty at first, and only push_back a[i] when the condition is met:
std::vector<...> result;
for (int i = 0; i < N; i++)
{
if (a[i] >= lower && a[i] <= upper)
{
result.push_back(a[i]);
}
}
You can leave count out, as result.size() will tell you how many elements satisfied the condition.
For a more modern solution, as Some programmer dude suggested, you can use std::copy_if in combination with std::back_inserter to achieve the same thing:
std::vector<...> result;
std::copy_if(a.begin(), a.end(), std::back_inserter(result),
[&](auto n) {
return n >= lower && n <= upper;
});
Arrays in C++ are dumb.
When passed around they decay to a pointer to their first element, and they do not carry their length with them.
If you just write arr[i], you have to be sure that you aren't out of bounds. Going out of bounds is undefined behavior, as you don't know what part of memory you have written over. You could just as well overwrite a different variable or the beginning of another array.
So when you try to add results to an array, the array must already have been created with enough space.
This boilerplate of deleting and creating dumb arrays so that the array can grow is done very efficiently by the std::vector container, which remembers the number of elements stored, the number of elements that could be stored, and the array itself. Every time you try to add an element when the reserved space is full, it allocates a new, larger array (typically 1.5 or 2 times the size of the original one) and copies or moves the data over. A single push_back is therefore O(n) in the worst case but amortized O(1).
Then the answer from Stack Danny applies.
Also, use emplace_back instead of push_back if you can: it constructs the element in place from the constructor arguments, and otherwise behaves like push_back, so it avoids unnecessary copies.
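The growth behavior described above can be made visible by counting capacity changes while appending. This is a small sketch with a made-up function name; the exact count without reserve() depends on the standard library's growth factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n ints to a vector and counts how often the capacity changes,
// i.e. how often the vector had to allocate a new array and move the data.
// If reserveFirst is true, room for all n elements is reserved up front.
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::vector<int> v;
    if (reserveFirst)
        v.reserve(n);
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        const std::size_t before = v.capacity();
        v.push_back(static_cast<int>(i));
        if (v.capacity() != before)
            ++reallocations;
    }
    assert(v.size() == n);
    return reallocations;
}
```

With reserveFirst set, the count is zero. Without it, the count grows only logarithmically with n, which is why the average cost per push_back stays constant.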
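count=0;
for(int i=0; i<N; i++)
{
if(a[i]>=lower && a[i]<=upper)
{
result[count] = a[i]; // write to the next free slot first
count++; // then advance, so the first match lands in result[0]
}
}
Try this.
Your code was copying elements from a[i] into result[i], which leaves gaps wherever the condition is not met.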
For example, if a[0] and a[2] meet the required condition, but a[1] doesn't, then your code will do the following:
result[0] = a[0];
result[2] = a[2];
Notice how result[1] remains empty because a[1] didn't meet the required condition. To avoid empty positions in the result array, use another variable for copying instead of i.

How do I generate a vector during each iteration of loop, store data, and then delete that vector?

I am trying to create a vector of vectors in my program. I wish to have a double loop; the inner loops checks for a certain condition, and if that condition is met, a value is stored in my vector. Once the inner loop runs its course, that "temp" vector is stored in the main vector. My idea was to clear my "temp" (inner) vector, but vector.clear() deletes everything in my main vector as well. This is my vector code:
vector <int> vectortestInner;
vector <vector<int> > vectortestOuter(10);
I populate my vectors here:
void vectorTest()
{
for (int i=0; i<vectortestOuter.size(); i++)
{
for (int j=0; j<vectortestOuter.size(); j++)
{
vectortestInner.push_back(j);
}
vectortestOuter[i]=vectortestInner;
//vectortestInner.clear();
}
}
and attempt printing the contents like this:
for(int i=0; i<vectortestOuter.size(); i++)
{
for (int j=0; j<vectortestInner.size(); i++)
{
cout<<vectortestInner[j]<<endl;
}
}
So far, it seems to be printing 0s, (when I want it to print 1-10), and if I call clear();, it just outputs empty lines.
What am I doing wrong, and how can I achieve what I am trying to do? Thanks!
Populating (or repopulating, since that's what your first function does) can be done with
// remove any global declaration of vectortestInner since we won't use it
void vectorTest()
{
for (int i=0; i<vectortestOuter.size(); i++)
{
std::vector<int> vectortestInner;
vectortestInner.reserve(vectortestOuter.size());
for (int j=0; j<vectortestOuter.size(); j++)
{
vectortestInner.push_back(j);
}
vectortestOuter[i]=vectortestInner;
// vectortestInner ceases to exist here
}
}
This locally constructs vectortestInner in every iteration of the outer loop, so it is destroyed at the end of each iteration as well. The reserve() call avoids repeated reallocation (but it relies on the fact that your inner loop appends vectortestOuter.size() elements in total).
Yes, this reconstructs vectortestInner every time. But that is not actually any worse than clearing and repopulating it every time (since those are the most significant operations done in construction and destruction).
To print the elements of your vector of vectors, you actually need to refer to them. Your code has a flaw in that you are (somehow) assuming vectortestInner magically provides a means of accessing the elements of vectortestOuter. That is not so.
for(int i=0; i<vectortestOuter.size(); i++)
{
for (int j=0; j<vectortestOuter[i].size(); j++) // also using j++ here, not i++
{
cout<<vectortestOuter[i][j]<<endl;
}
}
There are other inefficiencies in your code that I haven't addressed. Rather than using [] consider using iterators as well. I'll leave that as an exercise.
You are printing the temporary vector (vectortestInner) in your inner loop, not a vector contained in the main vector (vectortestOuter[i]).
for(int i=0; i<vectortestOuter.size(); i++)
{
for (int j=0; j<vectortestOuter[i].size(); j++)
{
cout<<vectortestOuter[i][j]<<endl;
}
}
With the printing function changed, the clear of vectortestInner should work as expected.

std multiset insert and keep length fixed

I am interested in inserting elements in a std::multiset but I would like to keep the set fixed length. Every time an element is inserted, the last element will be removed. I came up with the following solution
#include <cstdlib>
#include <iostream>
#include <set>
#include <utility>
int main(){
std::multiset<std::pair<double, int>> ms;
for (int i=0; i<10; i++){
ms.insert(std::pair<double, int>(double(rand())/RAND_MAX, i));
}
ms.insert(std::pair<double, int>(0.5, 10));
ms.erase(--ms.end());
for(auto el : ms){std::cout<<el.first<<"\t"<<el.second<<std::endl;}
return 0;
}
I will be doing something similar to this many times in my code on sets of a size in the order of 1000 elements. Is there a more performant way of doing this? I am worried that the erase will cause memory reallocation and slow down the code.
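One possible refinement, sketched here under the assumption that "last" means the largest element: compare the new value with the current last element first, and skip both the insert and the erase when the new value would immediately be removed again. insertBounded and the Scores typedef are made-up names. Since std::multiset is node-based, erase() releases only the removed node and leaves the other elements where they are.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

typedef std::multiset<std::pair<double, int> > Scores;

// Hypothetical helper: inserts `value` while keeping at most `maxSize`
// elements. Returns true if `value` ended up in the set.
bool insertBounded(Scores& ms, const Scores::value_type& value, std::size_t maxSize)
{
    assert(maxSize > 0);
    if (ms.size() < maxSize)
    {
        ms.insert(value);
        return true;
    }
    Scores::iterator last = std::prev(ms.end());
    if (!(value < *last))
        return false;   // value would become the last element and be erased again
    ms.erase(last);     // removes exactly one node
    ms.insert(value);
    return true;
}
```

The result is the same set as with insert followed by erase(--ms.end()), but values that do not belong in the set cost only one comparison.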

Why can't I insert 6 million elements in STL set?

I am trying to insert a little over 6.5 million elements (ints) into an STL set. Here is the code:
set<int> s;
cout << s.max_size() << endl;
for(int i = 0; i < T.MULT * T.MAXP; i++) {
s.insert(a[i]);
}
T.MULT is 10; T.MAXP is 666013.
a is an array - statically allocated - (int a[T.MULT * T.MAXP];) that contains distinct elements.
After about 4.6 million elements s.insert() throws a bad_alloc exception. The resource monitor available on Windows 7 says I have 3 GB free memory left.
What am I doing wrong? Why can't STL set allocate the memory?
Edit: Here is the full code: http://ideone.com/rdrEnt
Edit2: apparently the inserted elements might not be distinct after all, but that should not be a problem.
Edit3: Here is a simplified version of the code: http://ideone.com/dTp0fZ
The problem actually lies in the fact that you statically allocated the array A with more than 6.5 million elements, which corrupts your program's stack space. If you allocate the array on the heap, it works. I changed the code as follows, based on your description, and it worked fine.
int *A = new int[T.MULT * T.MAXP];
for (int i= 0; i < T.MULT * T.MAXP; ++i)
{
A[i] = i; //for simplicity purpose, your array may have different elem. values
}
set<int> s;
for (int i = 0; i < T.MULT * T.MAXP; ++i )
{
s.insert(A[i]);
}
cout << s.size();
set<int>::iterator iter;
int count = 0;
for (iter = s.begin(); iter != s.end(); ++ iter)
{
cout << *iter << " ";
count ++;
if (count == 100)
{
cout <<endl;
count = 0;
}
}
delete [] A;
return 0;
It worked perfectly fine with both vector and set. It can print all those 6.6 million elements on the screen.
As other posts indicated, you may also want to try STXXL if you have interest.
You might want to take a look at STXXL.
While I can't answer your question directly, I think it is more efficient to store your data in a std::vector, sort it, and then use std::binary_search to test for the existence of the item. Storage in a std::set is relatively expensive compared to that of std::vector. That's because there is some overhead when storing each element.
As an example, here's how you could do it. This sorts the static array.
std::sort(a,a+(T.MULT*T.MAXP));
bool existence=std::binary_search(a,a+(T.MULT*T.MAXP),3);
Fast and easy.
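The same idea can be written against a heap-allocated std::vector instead of the fixed-size array, which also sidesteps the stack-size question. This is a sketch with made-up function names; duplicates are removed so that the result behaves like the contents of a set.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: turns an arbitrary list of ints into a sorted,
// duplicate-free vector that can be searched with std::binary_search.
std::vector<int> makeSortedUnique(std::vector<int> values)
{
    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());
    // Sanity check: no two neighbouring elements are equal any more.
    assert(std::adjacent_find(values.begin(), values.end()) == values.end());
    return values;
}

// Membership test in O(log n) on a vector prepared by makeSortedUnique().
bool contains(const std::vector<int>& sorted, int x)
{
    return std::binary_search(sorted.begin(), sorted.end(), x);
}
```

This fits best when the data is built once and queried many times; inserting into the middle of a sorted vector later on is O(n) per element.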

Efficient memory allocation for large nested vectors

I'm creating a huge matrix that is stored inside nested vectors:
typedef vector<vector<pair<unsigned int, char>>> Matrix;
The outer vector will eventually contain ~400,000 vectors, each of which contains at most ~220 pairs (most contain fewer). This takes about 1 GB of RAM and is done like this:
Matrix matrix;
for (unsigned int i = 0; i < rows; i++) {
vector<pair<unsigned int, char>> row;
for (unsigned int j = 0; j < cols; j++) {
// ...calculations...
row.push_back( pair<unsigned int, char>(x, y) );
}
matrix.push_back(row);
}
The first 20% go quite fast, but the larger the outer vector grows, the slower the whole process gets. I'm pretty sure that some optimization is possible, but I'm not an expert in this field. Are there any simple tricks to speed this up? Or are there any major faults in my attempt?
It would be better to just use a single one dimensional vector and wrap up the row, column indexing in some functions/class. This way the memory for the entire matrix is guaranteed to be contiguous.
And instead of using push_back allocate the entire matrix up front:
std::vector<pair<unsigned int, char>> matrix(rows * cols);
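A minimal sketch of such a wrapper, assuming row-major layout and a fixed number of columns per row. FlatMatrix and its member names are made up for illustration. Note that it allocates rows * cols cells even when most rows are shorter, so it trades memory for a single contiguous allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper around one contiguous buffer of rows * cols cells.
class FlatMatrix
{
public:
    typedef std::pair<unsigned int, char> Cell;

    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols) {}   // one allocation

    // Row-major access: cell (r, c) lives at index r * cols + c.
    Cell& at(std::size_t r, std::size_t c)
    {
        assert(r < rows_ && c < cols_);
        return cells_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<Cell> cells_;
};
```

Filling it looks like m.at(i, j) = FlatMatrix::Cell(x, y); inside the two loops from the question, with no push_back and therefore no reallocation.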
I would start with the obvious optimization.
If you know the number of rows (or a usable upper bound) before you start filling in the values, just reserve the space beforehand. Most of the time spent pushing back a lot of values goes into reallocating memory and copying the values already contained.
Matrix matrix(rows);
for(unsigned i = 0; i < rows; i++) {
vector<pair<unsigned int, char>> row(cols);
for(unsigned j = 0; j < cols; j++) { // j must be initialized
row[j] = pair<unsigned int, char>(x, y); // x, y from the ...calculations... in the question
}
matrix[i] = row;
}
Using the VS 2010 compiler, the following turned out to work best:
Matrix matrix;
matrix.reserve(rows);
vector<pair<unsigned int, char>> row;
row.reserve(cols);
for (unsigned int i = 0; i < rows; i++) {
for (unsigned int j = 0; j < cols; j++) {
// ...calculations...
row.push_back( pair<unsigned int, char>(x, y) );
}
matrix.push_back(row);
row.clear();
}
Creating just a single vector that is reused to build all the rows consumes much less memory than creating a fresh one that allocates memory for "cols" entries every time. I'm not really sure why that is, though.
However, I'm accepting Andreas' answer as this one is only a solution for my specific case while his answer provided the general information needed for such optimizations.
The problem is a lot of data copying when the outer vector grows. Consider changing your typedef to
typedef vector< shared_ptr< vector<pair<unsigned int, char>> > > Matrix;
and doing matrix.reserve(rows) before you start filling it with values.
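A sketch of how rows could be added with this typedef, assuming C++11 <memory>. appendRow and rowAt are made-up helpers. When the outer vector grows, only the shared_ptr handles are copied or moved; the row objects themselves stay where they are.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

typedef std::vector<std::pair<unsigned int, char> > Row;
typedef std::vector<std::shared_ptr<Row> > Matrix;

// Hypothetical helper: appends a new empty row with room for `expectedCols`
// pairs and returns a reference to it.
Row& appendRow(Matrix& matrix, std::size_t expectedCols)
{
    matrix.push_back(std::make_shared<Row>());
    matrix.back()->reserve(expectedCols);
    return *matrix.back();
}

// Hypothetical bounds-checked accessor for row i.
Row& rowAt(Matrix& matrix, std::size_t i)
{
    assert(i < matrix.size());
    return *matrix[i];
}
```

A row is then filled with appendRow(matrix, cols).push_back(...), and matrix.reserve(rows) beforehand avoids even the reallocation of the handle array.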