Adding elements in loop to container (lifetime) - c++

If I add elements to a vector using the code below, then at the time I call foo, the elements (automatic variables) of vec have been destroyed since the scope in which they are created ends.
std::vector<A> vec;
for (int i = 0; i < n; i++) {
A a;
vec.push_back(a);
}
foo(vec);
My question is now what the textbook solution to such a problem is.

No, the elements in vec will be different copies of a.
However, you need to allocate the size of vec if you want to use operator[] or else use vec.push_back():
for (int i = 0; i < n; i++) vec.push_back(A());
EDIT (after question change):
Even though push_back() takes its argument as a reference, internally it will make a copy of it. It takes its argument by reference to avoid making an unnecessary copy prior to making the copy to store internally.

Don't worry about stack variables. When you push a value into a std::vector, the container creates its own copy of the variable in storage that it manages (on the heap, with the default allocator). Thus, all your elements will still exist after you leave the scope.

You can define your variable as a global, set its value inside the loop, and then push it back.

Related

Scope for containers in C++

Let's say I have something like the following piece of code:
int main() {
vector< vector<int> > vecs;
int n_vecs;
cin >> n_vecs;
for (int i = 0; i < n_vecs; i++) {
int n_nums;
cin >> n_nums;
vector<int> tmp;
for (int j = 0; j < n_nums; j++) {
int num;
cin >> num;
tmp.push_back(num);
}
vecs.push_back(tmp);
}
}
which populates a vector of vector<int>s gradually. From some testcases, I understand that after the for loops finish, the vector is constructed as expected. However, I can't understand why that is: shouldn't the tmp vector be out of scope after the outer for loop finishes? Does a copy of it get inserted into the vecs vector? (The same applies for maps)
Yes, a copy is made. See documentation for push_back:
The new element is initialized as a copy of value
There are two overloads for the function. In your example, the void push_back( const T& value ) overload is chosen because you are passing a named object and have not applied std::move, hence the copy.
If you passed an unnamed object or applied std::move, then the other overload would be chosen, "stealing" the contents of tmp to initialise the new object at the end of the vecs vector.
But that doesn't even matter for your question. In either case, the fact that tmp's lifetime ends afterwards (because of its scope in the source code) is irrelevant. The vecs vector already contains and owns a new element, and what happens to tmp is none of its concern.
Perhaps the most important thing to keep in mind is that C++ standard containers are designed so that you can use them as easily as plain ints. You can copy them, return them, pass them around and assign them, and the results will always be as expected. This is because C++ standard containers own their contents.
Does a copy of it get inserted into the vecs vector?
Yes.

Delete member of struct on heap after std::move()

I'm currently refactoring and changing existing code to C++11, and I wonder if I have a memory leak. My code has a struct with a std::vector in it as well as a method shrink() that shrinks this vector down to its negative elements.
struct mystruct_t {
int other_stuff;
std::vector <int> loc;
// Adds elements to loc vector
void add(int pos){
loc.push_back(pos);
}
// Shrink the list
void shrink () {
std::vector<int> tmp;
for (unsigned int i = 0; i < loc.size(); ++i) {
if (loc[i] < 0) tmp.push_back (loc[i]);
}
loc = tmp;
std::vector<int>(loc).swap (loc);
}
mystruct_t(): other_stuff(0) {}
};
In another function I create a new instance of this struct like this:
mystruct_t* c = new mystruct_t;
c->add(2);
c->add(3);
...
And later I call the shrink() method of this struct.
c->shrink();
Now I'm not sure what's happening with the "old" loc vector after the shrink function?
Will it get destroyed automatically or do I have to destroy it by hand? And if the latter, how would I do that?
I also tried to change shrink() to more C++11 style by change it to:
void shrink (){
std::vector<int> tmp;
for (auto &currLoc : loc) {
if (currLoc < 0) tmp.push_back (currLoc);
}
loc = std::move(tmp);
}
But the question remains the same: what is happening to the "old" loc vector? Additionally, this seems to increase the memory usage. I'm new to C++11 and not sure if I totally misunderstand the concept.
Now I'm not sure what's happening with the "old" loc vector after the shrink function?
There is no "old" loc vector. Through the lifetime of a mystruct_t object, it has exactly one member vector loc. You never get a new member or throw away an old one.
When you copy assign to the member (loc = tmp;), the buffer contained within the vector is renewed. The vector owns the buffer, and the vector takes care that it is destroyed properly. The same applies when you move assign in the C++11 version.
Will it get destroyed automatically
If you refer to the memory allocated by the vector, then yes.
or do I have to destroyed by hand?
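You have to destroy by hand only whatever you created by hand. You didn't call new for the vector or its buffer, so you don't call delete for them. The mystruct_t object that you did create with new is a different matter: it still needs a matching delete, and its destructor then destroys loc.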
additionally this seems to increase the memory usage.
Your c++11 version lacks the "shrink to fit" part of the original (std::vector<int>(loc).swap (loc);). In c++11 you can do:
loc = std::move(tmp);
loc.shrink_to_fit();
In the pre-C++11 version, you can get rid of the copy assignment and simply construct the temporary from tmp, and swap it with loc:
std::vector<int> tmp;
// copy the objects you want
std::vector<int>(tmp).swap(loc);
std::move itself is just a cast to an rvalue reference, so it causes no additional memory usage.
When you move-assign a vector, the destination releases its old buffer and takes over the source's buffer pointer instead of copying the elements. So it's a very fast operation: only the pointer to the data changes hands.

are C++ structs fully copied or just referenced when assigned with '='?

If structs are fully copied, then the first loop is more expensive than the second one, because it is performing an additional copy for each element of v.
vector<MyStruct> v;
for (int i = 0; i < v.size(); ++i) {
MyStruct s = v[i];
doSomething(s);
}
for (int i = 0; i < v.size(); ++i) {
doSomething(v[i]);
}
Suppose I want to write efficient code (as in loop 2) but at the same time I want to name the MyStruct elements that I draw from v (as in loop 1). Can I do that?
Structs (and all variables for that matter) are indeed fully copied when you use =. Overloading the = operator and the copy constructor can give you more control over what happens, but there is no way you can use these to change the behavior from copying to referencing. You can work around this by creating a reference like this:
for (int i = 0; i < v.size(); ++i) {
MyStruct& s = v[i]; //& creates reference; no copying performed
doSomething(s);
}
Note that the struct will still be fully copied when you pass it to the function, unless the argument is declared as a reference. This is a common pattern when taking structs as arguments. For instance,
void doSomething(structType x);
will generally perform worse than
void doSomething(const structType& x);
if sizeof(structType) is greater than sizeof(structType*). The const is used to prevent the function from modifying the argument, imitating pass-by-value behavior.
In your first example, the object will be copied over and you will have to deal with the cost of the overhead of the copy.
If you don't want the cost of the overhead, but still want to have a local name for the object, then you could use a reference.
for (int i = 0; i < v.size(); ++i) {
MyStruct& s = v[i];
doSomething(s);
}
You can use references or pointers to avoid copying and having a name to relate to.
vector<MyStruct> v;
for (int i = 0; i < v.size(); ++i) {
MyStruct& s = v[i];
doSomething(s);
}
However, since you use a vector for your container, using iterators might be a good idea. doSomething should take its argument by const reference, though; otherwise you'll still copy when passing the argument to it.
vector<MyStruct> v;
for (vector<MyStruct>::iterator it = v.begin(); it != v.end(); ++it) {
doSomething(*it);
}
In your examples, you are creating copies. However not all uses of operator '=' will result in a copy. C++11 allows for 'move construction' or 'move assignment' in which case you aren't actually copying the data; instead, you're just (hopefully) making a high-speed move from one structure to another. (Naturally, what it ACTUALLY does is entirely dependent upon how the move constructor or move assignment operator is implemented, but that's the intent.)
For example:
std::vector<int> foo(); // returns a long vector
std::vector<int> myVector = std::move(foo());
This will cause a MOVE construction, which hopefully just performs a very efficient re-pointing of the memory in the new myVector object, meaning that you don't have to copy the huge amount of data. (Because foo() already returns an rvalue, the explicit std::move is not needed to get a move here, and writing it prevents the copy elision described next.)
Don't forget, however, about the return-value optimization, as well. This was just a trivial example. RVO is actually superior to move semantics when it can be used. RVO allows the compiler to simply avoid any copying or moving at all when an object is returned, instead just using it directly on the stack where it was returned (see http://en.wikipedia.org/wiki/Return_value_optimization). No copy or move constructor is called at all.
Copied*. Unless you overload the assignment operator. Also, structs and classes in C++ are the same in this respect; their copy behaviour does not differ as it does in C#.
If you want to dive deep into C++ you can also look up move semantics (the move constructor and move assignment operator), but it is generally best for beginners to ignore that.
C++ does not have garbage collection, and gives more control over memory management. If you want behaviour similar to C# references, you can use pointers. If you use pointers, you should prefer smart pointers (What is a smart pointer and when should I use one?).
* Keep in mind, if the struct stores a pointer, the pointer in a copied struct will point to the same location. If the object in that location is changed, both structs' pointers will see the changed object.
P.S.: I assume you come from a C# background based on the vocabulary in your question.

C++ return value optimization

This code:
#include <vector>
std::vector<float> getstdvec() {
std::vector<float> v(4);
v[0] = 1;
v[1] = 2;
v[2] = 3;
v[3] = 4;
return v;
}
int main() {
std::vector<float> v(4);
for (int i = 0; i != 1000; ++i)
{
v = getstdvec();
}
}
My incorrect understanding here is that the function getstdvec shouldn't have to actually allocate the vector that it's returning. When I run this in valgrind/callgrind, I see there are 1001 calls to malloc; 1 for the initial vector declaration in main, and 1000 for every loop iteration.
What gives? How can I return a vector (or any other object) from a function like this without having to allocate it every time?
edit: I'm aware I can just pass the vector by reference. I was under the impression that it was possible (and even preferable) to write a function like this that returns an object without incurring an unnecessary allocation.
When you call a function, for a return type like std::vector<T> the compiler provides memory for the returned object. The called function is responsible for constructing the instance it returns in this memory slot.
The RVO/NRVO now allows the compiler to omit creating a local temporary object, copy-constructing the returned value in the memory slot from it, destructing the temporary object and finally returning to the caller. Instead, the called function simply constructs the local object in the return slot's memory directly and at the end of the function, it just returns.
From the caller's perspective, this is transparent: it provides memory for the returned value, and when the called function returns, there is a valid instance. The caller may now use this object and is responsible for calling the destructor and freeing the memory later on.
This means that the RVO/NRVO only work for when you call a function to construct a new instance, not when you assign it. The following is an example of where RVO/NRVO could be applied:
std::vector<float> v = getstdvec();
but your original code uses a loop and in each iteration, the result from getstdvec() needs to be constructed and this temporary is assigned to v. There is no way that the RVO/NRVO could remove this.
You can pass it by reference. Copy elision lets the v in your getstdvec() be constructed directly in the caller's return slot, skipping the copy usually associated with returning by value, but it will NOT skip the allocation done by v(4) in your function. In order to avoid that, you need to take the vector in by reference:
#include <vector>
void getstdvec(std::vector<float>& v){
v.resize(4); // will only reallocate if v's capacity is too small
v[0] = 1; v[1] = 2; v[2] = 3; v[3] = 4;
}
int main() {
std::vector<float> v(4);
for (int i=0; i!=1000;++i)
getstdvec(v);
}
You're doing copy-assignment in your loop, not copy-construction. The RVO optimization only applies to constructing variables from a return value, not assigning to them.
I can't quite make out the real problem you're trying to solve here. With more details it might be possible to provide a good answer that addresses your underlying problem.
As it stands, to return from your function in such a way you'll need to create a temporary vector to return each time the function is called.
The simplest answer is to pass the already created vector object into the function.
std::vector<float> getstdvec(std::vector<float> &myvec){
In that case you don't really have to return it so
void getstdvec(std::vector<float> &myvec){
Instead of using the return value, you can use a reference:
void getstdvec(std::vector<float> &v)
which avoids copying the temporary object.
How can I return a vector (or any other object) from a function like this without having to allocate it every time?
The way you wrote it, you declare a local vector with size 4, so each time the function is called, it is going to allocate the memory. If you mean that you always modify the same vector, then you may consider passing the vector by reference instead.
For example:
void getstdvec(std::vector<float>& vec)
{ //^^
//do something with vec
}
Inside main, you declare the vector and allocate space as you did. You now do the following:
for (int i=0; i!=1000;++i)
{ //^^^minor: Don't use magic number in the code like this,
//define a const instead
getstdvec(vec);
}

How to insert objects in a vector efficiently and correctly

Suppose I want to declare a vector of objects. I can do it this way -
vector<mynode> nodes;
But if the size of the mynode is large, this would be bad. So I think of doing it this way -
vector<mynode*> nodes;
But the above declaration has an obvious problem that I'm storing addresses and it's not safe at all. For instance, if I add objects in a for loop -
vector<mynode*> nodes;
for (int i=0; i<10; i++)
{
mynode mn;
nodes.push_back(&mn);
}
This will lead to errors as I can never guarantee if the contents of the pointer are actually ok.
So, I decide to use this declaration -
vector<mynode&> nodes;
for (int i=0; i<10; i++)
{
mynode mn;
nodes.push_back(mn);
}
Is this ok? Safe? It gives a compilation error on the first line itself. Please suggest some efficient way of storing the objects in a vector. Thanks a lot.
I can do it this way -
vector<mynode> nodes;
But if the size of the mynode is large, this would be bad.
No, it would not. You need to store the objects anyway. If you are worried about copying large objects, you have some solutions:
Use std::vector<std::unique_ptr<my_node>> (or another smart pointer), which automatically releases objects on destruction. This is the best solution if my_node is polymorphic.
Use std::vector<my_node> and use the emplace_back function to construct objects in place (beware if you're using visual studio 2010, this function does not do what it is supposed to do).
Still use std::vector<my_node> and use push_back with a rvalue reference, as in
v.push_back(std::move(some_node));
to move already constructed objects.
Anyway, a good rule of thumb is to have the copy constructor/assignment deleted (or private) for most non-lightweight objects. Containers are still functional (provided, again, that you use C++11) and your concerns are moot.
Using references is essentially the same as using pointers (it's just that you don't need to dereference them in code).
If you want to automatically ensure that the objects inserted to vector don't get deleted without copying them, you should use smart pointers from boost or c++11.
vector< smart_ptr<mynode> > nodes;
for (int i=0; i<10; i++)
{
smart_ptr<mynode> mn(new mynode()); // smart_ptr stands for a real smart pointer type, e.g. std::shared_ptr
nodes.push_back(mn);
}
I don't see pointers being so bad here. It's not void or something. Inserting a reference as in your example saves a reference to a temporary object located on the stack, and this will go out of scope...