I am interested in the most performant way of copying large containers. Imagine a vector that stores, for example, 60,000,000 entries (probably long doubles) or many more. When solving, for example, an ODE (ordinary differential equation), it can be necessary (depending on the algorithm in use) to keep a copy of the old values, which are used to calculate the new values. Consider the following (imaginary) example:
// This container is inside a class (so it is initialized and stored at the memory during runtime)
vector<long double> Y(60000000,0);
// Later on in a function
void solve()
{
for(int i=0; i<iMax; ++i)
{
// Make a copy for the field one is solving for (depending on the algorithm in use if it is needed)
// The following is not the best solution as we allocate and deallocate YprefIter for each
// Iteration; imagine iMax = 1000000
vector<long double> YprefIter = Y;
...
// Do some analysis (simplified);
Y = something * YprefIter * something + anything
// As YprefIter might be used somewhere else, we cannot update Y only
Y = YprefIter + Y * whatever
...
}
}
Sure, by moving the vector<long double> YprefIter in front of the loop, we no longer create and destroy the object in each iteration. This should definitely be a better choice:
// This container is inside a class (so it is initialized and kept)
vector<long double> Y(60000000,0);
// Later on in a function
void solve()
{
vector<long double> YprefIter (Y.size(), 0);
for(int i=0; i<iMax; ++i)
{
// Make a copy for the iteration algorithm
// Better solution as we get rid of the memory allocation and deallocation
YprefIter = Y;
...
}
}
However, I am asking myself whether there are more advanced solutions around, such as using move semantics or other techniques I am not aware of that make better use of recent developments in the language. I would expect that the strategies above are not state of the art. It just came to my mind that I could use two pointers and switch the objects they point to in each iteration. This is just a thought and I did not test my logic here, but the idea is that I would not need to copy anything. Maybe this is a better solution, and if such things work, I am sure something like it is already implemented in C++ :)
// This container is inside a class (so it is initialized and kept)
vector<long double> Y(60000000,0);
// Later on in a function
void solve()
{
// Create the second object
vector<long double> YprefIter (Y.size(), 0);
// Pointer 1 and Pointer 2
vector<long double>* pToY = NULL;
vector<long double>* pToYPrefIter = NULL;
// Set pointer pToY to point to Y
pToY = &Y;
for(int i=0; i<iMax; ++i)
{
// Switch the Pointer fields for each iteration
if (i%2)
{
pToY = &Y;
pToYPrefIter = &YprefIter;
}
else
{
pToY = &YprefIter;
pToYPrefIter = &Y;
}
// Work with the pointers afterwards
...
}
}
Any comment is appreciated. Tobi
The first snippet is basically something like this
for(int i=0; i<iMax; ++i)
{
vector<long double> YprefIter = Y;
// ...
Y = f(YprefIter);
// ...
}
In this case, you could simply swap the two vectors:
// Initialize Y_old
vector<long double> Y_old = whatever(),
Y;
for(int i=0; i<iMax; ++i)
{
// ...
Y = f(Y_old);
// ...
// std::swap on two vectors exchanges their internal buffers in constant time; no values are copied.
std::swap(Y_old, Y);
}
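To make the swap idea concrete, here is a minimal sketch. The update rule in step and the names step, solve and yPrev are placeholders made up for illustration; the real solver step would go in their place. The scratch vector is allocated once, and each iteration only exchanges the two buffers before overwriting one of them:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical update rule standing in for the real solver step:
// every new value is computed from the old values only.
void step(const std::vector<long double>& oldValues,
          std::vector<long double>& newValues)
{
    assert(oldValues.size() == newValues.size());
    for (std::size_t k = 0; k < oldValues.size(); ++k)
        newValues[k] = 0.5L * oldValues[k] + 1.0L;
}

// Runs iMax iterations on y, using one scratch buffer allocated once.
void solve(std::vector<long double>& y, int iMax)
{
    std::vector<long double> yPrev(y.size());   // single allocation, outside the loop
    for (int i = 0; i < iMax; ++i)
    {
        std::swap(y, yPrev);   // constant time: yPrev now holds the previous iterate
        step(yPrev, y);        // overwrite y with the new iterate
    }
}
```

Note that this only works if the new iterate can be computed entirely from the old one, because after the swap y holds stale values until step overwrites them.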
Related
#include<bits/stdc++.h>
using namespace std;
class Heap
{
vector <int> v;
int length;
public:
void create(vector <int> v, int s);
void display();
};
void Heap::create(vector <int> v, int s)
{
length=s+1;
for(int i=1;i<=s;i++)
{
this->v[i]=v[i-1];
}
int temp;
int j;
for(int i=2;i<length;i++)
{
temp=v[i];
j=i;
while(j>1&&temp>v[j/2])
{
swap(v[j],v[j/2]);
j=j/2;
}
if(j==1)
{
v[j]=temp;
}
}
}
void Heap::display()
{
for(int i=1;i<length;i++)
{
cout<<v[i]<<"\t";
}
cout<<endl;
}
int main()
{
vector <int> v;
int ans=1;
int d;
while(ans==1)
{
cout<<"Enter the Data\n";
cin>>d;
v.push_back(d);
cout<<"Do you want to enter more data?\n";
cin>>ans;
}
cout<<endl;
Heap h;
h.create(v,((int)v.size()));
h.display();
}
When I execute this code, it asks me to enter the data value. I enter all the data values I want to enter and press the Enter button. It shows a segmentation error. Also, the execution is taking a lot of time, which is very unusual. I use Code::Blocks version 20.
When I execute this code, it asks me to enter the data value. I enter all the data values I want to enter and press the Enter button
Yeah, I'm not interested in guessing what you typed in order to reproduce your problem. I'm also not interested in guessing whether the issue is in your I/O code or the code you think you're testing.
Always remove interactive input when you're preparing a minimal reproducible example so that other people can actually reproduce it.
Sometimes removing the interactive input may fix your problem, in which case you've learnt something important (and probably want to ask a different question about your input code).
it shows segmentation error
A segmentation fault interrupts your program at the exact point where it happens. If you run your program in a debugger, it will show you where this is, and the state of everything in your program when it happened. You should try this, and learn to use your debugger.
this->v[i]=v[i-1];
As correctly pointed out in the other answer, there is a bug on this line.
You correctly called push_back when reading input, so you could just do the same here. Alternatively you need to explicitly size this->v before indexing elements that don't exist.
The other main problem with this function is that it mixes up this->v (used, illegally, only once on the line above) and v which is a local copy of the v in main, and which goes out of scope and is lost forever at the end of the function.
Just give your variables different names so you don't have to write this->v on all the other lines where you currently refer to v. Also, consider passing the original v by const ref instead of making a copy.
NB. I do see and understand that you're deliberately switching to 1-based indexing for the sort. If for some reason you can't just use std::sort or std::make_heap, you could at least explicitly set the zeroth element to zero, and then just std::copy the rest.
Finally, Heap::create really looks like it should just be a constructor. Forcing two-phase initialization is poor style in general, and I don't see any reason for it here.
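As a rough sketch of those last suggestions (a constructor instead of create, the input passed by const reference, and std::make_heap doing the work), something like the following could serve as a starting point. It assumes zero-based indexing and that library algorithms are allowed; the member name data_ and the accessors are my own choices, not part of the original code:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Heap
{
public:
    // Copies the input once and arranges the copy as a max-heap (zero-based).
    explicit Heap(const std::vector<int>& input) : data_(input)
    {
        std::make_heap(data_.begin(), data_.end());
    }

    bool empty() const { return data_.empty(); }

    // Largest element; only valid on a non-empty heap.
    int top() const
    {
        assert(!data_.empty());
        return data_.front();
    }

    // Read-only access to the heap-ordered storage.
    const std::vector<int>& data() const { return data_; }

private:
    std::vector<int> data_;   // the vector already knows its own size
};
```

With this shape there is no separate length member to keep in sync, and a Heap can never be observed in a half-initialized state.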
First issue: you use 'this->v' before giving it any elements. At this point:
this->v[i]=v[i-1];
this->v has size 0 and has no elements that could be accessed via an index.
Furthermore, you have used the wrong indices for it. Assuming this->v has been sized, the correct index access looks like this:
this->v[i-1]=v[i-1];
Finally, it is better to sort a std::vector with the built-in std::sort function:
#include <algorithm>
std::sort(this->v.begin(), this->v.end());
This is obviously a school exercise. So I will only give you pointers as to where your code goes wrong.
class Heap
{
// vector <int> v; // v is not a suitable name for a class member, it's too short
// int length; // why length ? Your container declared above has length information, using
// a duplicate can only introduce opportunities for bugs!!!
vector<int> heap; // I've also renamed it in code below
public:
void create(vector <int> v, int s);
void display();
};
// some documentation is needed here...
// As I read it, it should be something like this, at least (this may reveal some bug):
//
// Initializes heap from range [v[1], v[s])
// void Heap::create(vector <int> v, int s) // you do not need a copy of the source vector!
void Heap::create(const vector<int>& v, int s) // use a const reference instead.
{
// This is not how you assign a vector from a range
// length=s+1;
// for(int i=1;i<=s;i++)
// {
// this->v[i]=v[i-1];
// }
// check inputs always, I'll throw, but you should decide how to handle errors
// This test assumes you want to start at v[1], see comment below.
if (0 > s || s >= v.size())
throw std::out_of_range ("parameter 's' is out of range in Heap::Create()");
// assign our private storage, why the offset by '1' ????
// I doubt this is a requirement of the assignment.
heap.assign(v.begin() + 1, v.begin() + s + 1);
//int temp; // these trivial variables are not needed outside the loop.
//int j;
// why '2' ?? what happens to the first element of heap?
// shouldn't the largest element be already stored there at this point?
// something is obviously missing before this line.
// you'll notice that v - the parameter - is used, and heap, our
// private storage is left unchanged by your code. Another hint
// that v is not suitable for a member name.
for(int i = 2; i < (int)v.size(); i++)
{
int temp = v[i]; // temp defined here
int j = i;
//while(j > 1 && temp > v[j/2]) // avoid using while() when you can use for().
//{
// swap(v[j],v[j/2]);
// j=j/2;
//}
// This is your inner loop. it does not look quite right
for (; j > 1 && temp > v[j / 2]; j = j / 2)
swap(v[j], v[j/2]);
if (j == 1)
v[j] = temp;
}
}
void Heap::display()
{
for(int i=1;i<length;i++)
{
cout<<v[i]<<"\t";
}
cout<<endl;
}
From reading your code, it seems you forgot that vectors are zero-based arrays, i.e. the first element of vector v is v[0], and not v[1]. This creates all kinds of nearly unrecoverable errors in your code.
As a matter of personal preference, I'd declare Heap as deriving publicly from std::vector, instead of storing data in a member variable. Just something you should consider. You could use std::vector<>::at() to access and assign elements within the object.
As is, your code will not function correctly, even after fixing the memory access errors.
I have a strange problem. There is a vector of structures. With a temporary structure, I push_back to that vector of structures. But when I check the first member's cnt, I see that it has changed. Any idea? (The code below is a simplified but representative version.)
struct Vector
{
float *dim;
Vector ()
{
dim = new float [3];
}
};
struct Face
{
float an_N, an_P;
int P, N;
Vector Af;
float Ad;
Vector cnt;
float ifac;
float mf;
};
std::vector <Face> face;
Face temp_face;
for (;;)
{
temp_face.cnt.dim[0] = 0.f;
temp_face.cnt.dim[1] = 0.f;
temp_face.cnt.dim[2] = 0.f;
for (int q=0; q<n_vtx_2D; ++q)
{
temp_face.cnt = temp_face.cnt + pt[vtx[q]] / n_vtx_2D;
}
face.push_back(temp_face);
}
std::cout << face[0].cnt.dim[0] << std::endl;
Output
0.25
0
The default compiler generated copy constructor (and assignment operator) is being used for Vector and Face (but most importantly Vector). As the code re-uses the same instance of Face, named temp_face, all the instances of Face in the vector face point to the same Face.cnt.dim array (as the faces will contain copies of temp_face).
I can't see any reason for dynamically allocating the array inside Vector, as it has a fixed size. I suggest changing it to:
struct Vector
{
float dim[3];
};
Or you need to implement copy constructor, assignment operator and destructor for Vector.
See What is The Rule of Three? for more information.
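A minimal sketch of the fixed-size variant, reduced to the members that matter here (buildFaces is a made-up helper for illustration). Because dim is stored inline, each push_back stores an independent copy, and reusing the temporary no longer changes elements that are already in the vector:

```cpp
#include <cassert>
#include <vector>

struct Vector
{
    float dim[3];   // stored inline, so copying a Vector copies the three floats
};

struct Face
{
    Vector cnt;
};

// Pushes two faces built from one reused temporary and returns them.
std::vector<Face> buildFaces()
{
    std::vector<Face> faces;
    Face temp = Face();          // value-initialized: all components start at zero

    temp.cnt.dim[0] = 0.25f;
    faces.push_back(temp);       // stores an independent copy of temp

    temp.cnt.dim[0] = 0.0f;      // reusing temp does not touch faces[0]
    faces.push_back(temp);

    assert(faces.size() == 2);
    return faces;
}
```

With the original pointer member, both elements would share one array and faces[0].cnt.dim[0] would read 0 after the second assignment.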
There is a function, accepting 2D-array:
void foo ( double ** p )
{ /*writing some data into p*/ }
I wouldn't like to pass a raw 2D array into this function because I don't want to manage memory by calling new and delete. I also wouldn't like to change the function signature to void foo ( std::vector< std::vector<double> >& ). And I can't make foo a template function (in my project it's a COM-interface method).
I would like to pass some RAII object in place of the raw one, like
void foo ( double * p ){}
std::vector<double> v(10);
foo( &v[0] );
Is there a way to do this for 2D arrays? I tried
std::vector< std::vector<int> > v;
foo( &v[0][0] )
and
std::vector< std::tr1::shared_ptr<int> > v;
but I get compile errors: error C2664 - cannot convert parameter.
Also, can one be sure that raw address arithmetic inside the function works correctly in this case?
No C++11; the sizes of the 2D array are known.
One possible solution is to change:
std::vector< std::vector<int> > v;
foo( &v[0][0] )
to
std::vector< std::vector<int> > v;
std::vector<int*> v_ptr;
for (...) {
v_ptr.push_back(&v[i][0]);
}
foo( &v_ptr[0] );
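Spelled out for double and without C++11, the idea could look like the sketch below. The body of foo and the constants ROWS and COLS are assumptions standing in for the real interface method and the known sizes; callFoo is a hypothetical wrapper:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sizes; the question only says they are known.
const std::size_t ROWS = 3;
const std::size_t COLS = 4;

// Stand-in for the fixed interface: writes row * 10 + col into every cell.
void foo(double** p)
{
    for (std::size_t r = 0; r < ROWS; ++r)
        for (std::size_t c = 0; c < COLS; ++c)
            p[r][c] = static_cast<double>(r * 10 + c);
}

// Owns the storage in nested vectors and hands foo a table of row pointers.
std::vector<std::vector<double> > callFoo()
{
    std::vector<std::vector<double> > v(ROWS, std::vector<double>(COLS));

    std::vector<double*> rowPtrs;
    rowPtrs.reserve(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        rowPtrs.push_back(&v[i][0]);   // address of the first element of row i

    assert(rowPtrs.size() == ROWS);
    foo(&rowPtrs[0]);                  // double** view of the row table
    return v;
}
```

The row pointers stay valid only as long as the inner vectors are not resized, so the table should be built right before the call.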
Although Andreas already gave an answer that solves your particular problem, I would like to point out that, although it works, it is probably not what you want to do.
As you are using C++ (and not an easier to use language), I am assuming that you care about performance. In that case your approach is in all likelihood wrong from the beginning, as double** represents an array of arrays (or rather can represent) and not a 2D array. Why is this bad? Because, just as with the std::vector< std::vector<int> > example, multiple allocations are required, and the resulting memory is non-contiguous (which is bad for the cache) unlike a double array[N][M].
For 2D arrays you should in fact use 1D arrays with index calculation, which is far more cache-friendly. This of course does not allow [x][y]-style indexing (instead you have to use [x+N*y] or [y+M*x], depending on your choice of "Fortran order" or "C order"), but it avoids cache issues and only requires a single allocation (and a simple double* pointer).
If you are iterating all elements of the array as in
for(unsigned x = 0; x < N; ++x) for(unsigned y = 0; y < M; ++y)
{
unsigned idx = y + M * x;
p[idx] = bar(x, y); // equivalent to p[x][y] = bar(x,y) in the array-of-arrays case
}
You can even avoid the multiplications as this is basically just a 1D iteration with additional generation of 2D indices
for(unsigned x = 0, idx = 0; x < N; ++x) for(unsigned y = 0; y < M; ++y, ++idx)
{
p[idx] = bar(x, y);
}
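If the double** signature has to stay, the two ideas can be combined: keep all cells in one contiguous std::vector<double> and build a small table of row pointers into it. A sketch, assuming C order; flatIndex and rowPointers are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of element (x, y) in an N x M array stored in C order (row-major).
inline std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t M)
{
    return y + M * x;
}

// Builds a table with one pointer per row, each pointing into the
// contiguous buffer. The table is valid until cells is resized.
std::vector<double*> rowPointers(std::vector<double>& cells,
                                 std::size_t N, std::size_t M)
{
    assert(cells.size() == N * M);
    double* base = cells.empty() ? 0 : &cells[0];
    std::vector<double*> rows(N);
    for (std::size_t x = 0; x < N; ++x)
        rows[x] = base + x * M;   // start of row x
    return rows;
}
```

A caller would pass &rows[0] as the double** argument; p[x][y] inside the function then refers to the same cell as cells[flatIndex(x, y, M)].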
Understanding std::set.insert & std::vector behavior.
Please consider the following scenario:
A.h
class A {
public:
A(uint argId, vector<double> argValues);
bool operator<(const A& argA) const;
private:
uint id;
vector<double> values;
};
A.cpp
A::A(uint argId, vector<double> argValues) {
this->id = argId;
this->values = argValues;
}
bool A::operator<(const A& argA) const {
// it's guaranteed that there's always at least one element in the vector
return this->values[0] < argA.values[0];
}
B.cpp
std::set<A> mySet;
for (uint i = 0; i < (uint) 10; i++)
{
vector<double> tempVector(3);
for (uint j = 0; j < (uint) 3; j++) {
tempVector[j] = j;
}
mySet.insert(A(i + 1, tempVector));
}
In my understanding, tempElement owns a deep-copied vector (values), because the vector was passed by value to its constructor and assigned. Therefore looping over i shouldn't break the elements already added to my set. BUT inserting *tempElement breaks with SIGSEGV. By my logic this should work... Any help is appreciated!
EDIT: the code crashes during the insertion process (second element); set invokes the LT-operator, tries to access the vector of the passed argument - but cannot. Before the creation of A where I pass the id and the vector I check if the passed vector contains the right elements.
For a small vector it shouldn't matter, but if you have a large array and it will be expensive to keep copying it, your A should contain some kind of pointer that shallow-copies. There are several options:
boost::shared_array<double>
boost::shared_ptr<vector<double> >
boost::shared_ptr<double> but with array deleter passed in on construction.
Make A non-copyable and have a set of (shared) pointers to A with some comparison functor that compares what is in the pointers rather than the pointers themselves.
Note that with shared_array or shared_ptr<double> you won't be able to extract the size (number of elements), so you would have to store that separately.
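A sketch of the last option, assuming C++11's std::shared_ptr is available (boost::shared_ptr would be used the same way). The names ByFirstValue, ASet and addElement are made up for this example. The comparator looks through the pointers, so the set is ordered by values[0] as in the question, and the set itself only stores pointers:

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <vector>

struct A
{
    unsigned id;
    std::vector<double> values;   // assumed non-empty, as in the question
};

// Orders pointers by the first value of the pointed-to object, not by address.
struct ByFirstValue
{
    bool operator()(const std::shared_ptr<const A>& lhs,
                    const std::shared_ptr<const A>& rhs) const
    {
        assert(!lhs->values.empty() && !rhs->values.empty());
        return lhs->values[0] < rhs->values[0];
    }
};

typedef std::set<std::shared_ptr<const A>, ByFirstValue> ASet;

// Creates a new A on the heap and inserts a pointer to it.
// Returns false if an element with an equivalent first value already exists.
bool addElement(ASet& s, unsigned id, const std::vector<double>& values)
{
    std::shared_ptr<A> a = std::make_shared<A>();
    a->id = id;
    a->values = values;           // the only copy of the vector
    return s.insert(a).second;
}
```

As with any std::set, two elements whose first values are equal count as duplicates, so the second insert is rejected.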
I don't think the problem is in this code. However, I notice you have a vector tempVector but you assign the values to tempComponents instead. I can't see the declaration of tempComponents, but my guess is that it has a different size.
Working code with numerous changes - but I don't see the problem that you were describing.
#include <memory>
#include <set>
#include <vector>
using namespace std;
typedef unsigned int uint;
class A {
public:
A(uint argId, vector<double> argValues)
{
this->id = argId;
this->values = argValues;
}
bool operator < ( A const& a ) const
{
return a.id < id;
}
uint id;
vector<double> values;
};
int main()
{
std::set<A> mySet;
for (uint i = 0; i < (uint) 10; i++)
{
vector<double> tempVector(3);
for (uint j = 0; j < (uint) 3; j++) {
tempVector[j] = j;
}
std::unique_ptr<A> tempElement(new A(i + 1, tempVector));
mySet.insert(*tempElement);
}
return 0;
}
No, there's no reason for inserting into myset here to cause a crash. The problem must lie elsewhere. Perhaps in A's copy ctor if you're not using the default one.
However, your code is leaking memory. When you insert into the set, *tempElement is copied into the set, and then the original that you allocated with new is no longer used but is never deleted. Instead you could just do A tempElement(i+1,tempVector); so that after the object is copied into the set it gets properly destroyed. Or, perhaps better in this case, you could just construct it as a temporary passed directly to insert: mySet.insert(A(i+1,tempVector)), in which case the object will be moved instead of copied, reducing the overhead. Or you could just construct the object in place to avoid even moving: mySet.emplace(i+1,tempVector);
Also I'm assuming that by tempComponents[j] = j; you meant tempVector[j] = j. You could replace that loop with std::iota(begin(tempVector),end(tempVector),0). Edit: or you could use the new initializer syntax. Furthermore, since the vector is the same every time, you could use just one outside the loop:
vector<double> tempVector = {0.0,1.0,2.0};
std::set<A> mySet;
for (uint i = 0; i < (uint) 10; i++)
{
mySet.emplace(i+1,tempVector);
}
C++03 compilers won't support emplace or the new initializer syntax, and iota would be a compiler extension for them (it's from the original SGI STL, so some may have it). For those you would still use insert and use a for loop to initialize tempVector or use an array:
double tempVector_init[] = {0.0,1.0,2.0};
vector<double> tempVector(tempVector_init,tempVector_init+3);
std::set<A> mySet;
for (uint i = 0; i < (uint) 10; i++)
{
mySet.insert(A(i+1,tempVector));
}
I have written this function
vector<long int>* randIntSequence(long int n) {
vector<long int> *buffer = new vector<long int>(n, 0);
for(long int i = 0; i < n; i++)
buffer->at(i);
long int j; MTRand myrand;
for(long int i = buffer->size() - 1; i >= 1; i--) {
j = myrand.randInt(i);
swap(buffer[i], buffer[j]);
}
return buffer;
}
but when I call it from main, myvec = randIntSequence(10), I see that myvec is always empty. Should I modify the return value?
The swap call is indexing the buffer pointer as if it were an array, so it swaps whole vector objects (most of which don't exist) instead of elements. You mean to swap around the items of the vector. Try this modification:
swap((*buffer)[i], (*buffer)[j]);
Secondary to that, your at calls don't set the values as you expect. You are pulling out the items in the vector but not setting them to anything. Try one of these statements:
buffer->at(i) = i;
(*buffer)[i] = i;
You never assign to any of the elements in the vector pointed to by buffer:
for (long int i = 0; i < n; i++)
buffer->at(i); // do you mean to assign something here?
You end up with the vector containing n zeroes.
Your question has already been answered, so I'll make this CW, but this is how your code should look.
std::vector<long int> randIntSequence(long int n)
{
std::vector<long int> buffer(n);
for(int i=0; i<n; ++i)
buffer[i] = i;
std::random_shuffle(buffer.begin(), buffer.end());
return buffer;
}
There is absolutely no reason you should be using a pointer here. And unless you have some more advanced method of random shuffling, you should be using std::random_shuffle. You might also consider using boost::counting_iterator to initialize the vector:
std::vector<long int> buffer(
boost::counting_iterator<long int>(0),
boost::counting_iterator<long int>(n));
Though that may be overkill.
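One caveat for newer compilers: std::random_shuffle was deprecated in C++14 and removed in C++17. With C++11 or later, the same function can be sketched with std::shuffle and an explicit random engine. The seed parameter is my addition so that runs can be reproduced:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the values 0 .. n-1 in random order.
// The seed parameter is an addition for this sketch, not part of the original.
std::vector<long> randIntSequence(std::size_t n, unsigned seed)
{
    std::vector<long> buffer(n);
    std::iota(buffer.begin(), buffer.end(), 0L);          // fill with 0, 1, ..., n-1
    std::mt19937 engine(seed);                            // Mersenne Twister engine
    std::shuffle(buffer.begin(), buffer.end(), engine);   // uniform random permutation
    return buffer;
}
```

For non-reproducible sequences, the engine could be seeded from std::random_device instead of a fixed number.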
Since the question is about STL, and all you want is a vector with random entries then:
std::vector<long int> v(10);
generate( v.begin(), v.end(), std::rand ); // range is [0,RAND_MAX]
// or if you provide long int MTRand::operator()()
generate( v.begin(), v.end(), MTRand() );
But if you want to fix your function then
n should be size_t not long int
First loop is no-op
As John is saying, buffer is a pointer, so buffer[0] is your vector, and buffer[i] for i!=0 is garbage. It seems you have been very lucky to get a zero-sized vector back instead of a corrupt one!
Is your intention to do random shuffle? If yes, you are shuffling around zeros. If you just want to generate random entries then why don't you just loop the vector (from 0 to buffer->size(), not the other way around!!) and assign your random number?
C++ is not garbage collected, and you probably don't want smart pointers for such simple stuff, so you'll be sure to end up with leaks. If the reason for creating the vector on the heap and returning it by pointer is to avoid a copy for performance's sake, then my advice is: don't do it! The following is the (almost) perfect alternative, both for clarity and for performance:
vector<T> randIntSequence( size_t n ) {
vector<T> buffer(n);
// bla-bla
return buffer;
}
If you think there is excess copying around in here, read this and trust your compiler.
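A small sketch to illustrate the point, assuming C++11 or later: the function records where its element storage lives and returns the vector by value. The caller ends up with the very same storage, because the return is either elided or performed as a move. The name makeSequence and the extra storageInside parameter exist only for this demonstration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a vector with 0 .. n-1 and reports, via storageInside, where its
// element storage lived while still inside the function.
std::vector<long> makeSequence(std::size_t n, const long*& storageInside)
{
    std::vector<long> buffer(n);
    for (std::size_t i = 0; i < n; ++i)
        buffer[i] = static_cast<long>(i);
    storageInside = buffer.data();
    return buffer;   // elided or moved; the elements are not copied one by one
}
```

Comparing storageInside with data() of the returned vector shows the same address, so no element-wise copy took place.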