C++ return value optimization

This code:
#include <vector>
std::vector<float> getstdvec() {
std::vector<float> v(4);
v[0] = 1;
v[1] = 2;
v[2] = 3;
v[3] = 4;
return v;
}
int main() {
std::vector<float> v(4);
for (int i = 0; i != 1000; ++i)
{
v = getstdvec();
}
}
My incorrect understanding here is that the function getstdvec shouldn't have to actually allocate the vector that it's returning. When I run this in valgrind/callgrind, I see there are 1001 calls to malloc; 1 for the initial vector declaration in main, and 1000 for every loop iteration.
What gives? How can I return a vector (or any other object) from a function like this without having to allocate it every time?
edit: I'm aware I can just pass the vector by reference. I was under the impression that it was possible (and even preferable) to write a function like this that returns an object without incurring an unnecessary allocation.

When you call a function, for a return type like std::vector<T> the compiler provides memory for the returned object. The called function is responsible for constructing the instance it returns in this memory slot.
The RVO/NRVO now allows the compiler to omit creating a local temporary object, copy-constructing the returned value in the memory slot from it, destructing the temporary object and finally returning to the caller. Instead, the called function simply constructs the local object in the return slot's memory directly and at the end of the function, it just returns.
From the caller's perspective, this is transparent: it provides memory for the returned value, and when the called function returns, that memory holds a valid instance. The caller may now use this object and is responsible for calling the destructor and freeing the memory later on.
This means that RVO/NRVO only applies when you call a function to construct a new instance, not when you assign its result to an existing object. The following is an example of where RVO/NRVO could be applied:
std::vector<float> v = getstdvec();
but your original code uses a loop, and in each iteration the result of getstdvec() has to be constructed as a temporary and then assigned to v. RVO/NRVO cannot remove that.
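The difference between constructing from a return value and assigning one can be made visible with an instrumented type. The following is a minimal sketch, not part of the original answer: Tracer, Counts, make_tracer and the two driver functions are hypothetical names, and the counters stand in for the allocations the question observed.

```cpp
// Hypothetical counters, used only for illustration.
struct Counts {
    int constructed = 0;    // default constructions
    int copy_assigned = 0;  // copy assignments
    int move_assigned = 0;  // move assignments
};

Counts g_counts;

// Hypothetical type that records how it is created and assigned.
struct Tracer {
    Tracer() { ++g_counts.constructed; }
    Tracer(const Tracer&) {}
    Tracer(Tracer&&) noexcept {}
    Tracer& operator=(const Tracer&) { ++g_counts.copy_assigned; return *this; }
    Tracer& operator=(Tracer&&) noexcept { ++g_counts.move_assigned; return *this; }
};

Tracer make_tracer() { return Tracer(); }

// Initialization: the result is built for t, no assignment happens.
Counts construct_once() {
    g_counts = Counts();
    Tracer t = make_tracer();
    (void)t;
    return g_counts;
}

// Assignment in a loop: every iteration builds a new temporary
// and then assigns it to the existing object.
Counts assign_in_loop(int iterations) {
    g_counts = Counts();
    Tracer t;
    for (int i = 0; i != iterations; ++i) {
        t = make_tracer();
    }
    return g_counts;
}
```

With three iterations, assign_in_loop constructs four objects (the initial one plus one temporary per iteration), which mirrors the 1001 allocations seen for 1000 iterations in the question.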

You can pass it by reference. Copy elision lets the v in getstdvec() be constructed directly in the memory reserved for the return value, which skips the copy usually associated with returning by value, but it will NOT skip the v(4) allocation in your function. To avoid that, you need to take the vector in by reference:
#include <vector>
void getstdvec(std::vector<float>& v){
v.resize(4); // will only reallocate if the capacity is too small
v[0] = 1; v[1] = 2; v[2] = 3; v[3] = 4;
}
int main() {
std::vector<float> v(4);
for (int i=0; i!=1000;++i)
getstdvec(v);
}
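One way to check that the by-reference version really reuses the caller's buffer is to compare the address returned by data() across calls. This is a small sketch under that assumption; fill_stdvec and buffer_is_reused are hypothetical names, not from the answer above.

```cpp
#include <vector>

// Hypothetical helper: writes into a vector owned by the caller.
void fill_stdvec(std::vector<float>& v) {
    v.resize(4);  // no reallocation when the capacity is already >= 4
    v[0] = 1; v[1] = 2; v[2] = 3; v[3] = 4;
}

// Returns true if the buffer address never changed across all calls.
bool buffer_is_reused(int iterations) {
    std::vector<float> v(4);
    const float* first = v.data();
    for (int i = 0; i != iterations; ++i) {
        fill_stdvec(v);
        if (v.data() != first) {
            return false;
        }
    }
    return true;
}
```

Calling buffer_is_reused(1000) corresponds to the loop in the question: the single allocation happens when v is declared, and none of the calls allocates again.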

You're doing copy-assignment in your loop, not copy-construction. RVO only applies when a variable is constructed from a return value, not when one is assigned to.
I can't quite make out the real problem you're trying to solve here. With more details it might be possible to provide a good answer that addresses your underlying problem.
As it stands, to return from your function in such a way you'll need to create a temporary vector to return each time the function is called.

The simplest answer is to pass the already created vector object into the function.
std::vector<float> getstdvec(std::vector<float> &myvec){
In that case you don't really have to return it so
void getstdvec(std::vector<float> &myvec){

Instead of using the return value, you can use a reference:
void getstdvec(std::vector<float> &v)
This avoids the copy of the temporary object.

How can I return a vector (or any other object) from a function like this without having to allocate it every time?
In your version, you declare a local vector of size 4, so each time the function is called it has to allocate memory. If you mean to always modify the same vector, then consider passing the vector by reference instead.
For example:
void getstdvec(std::vector<float>& vec)
{ //^^
//do something with vec
}
Inside main, declare the vector and allocate its space as you did, then do the following:
for (int i=0; i!=1000;++i)
{ //^^^minor: Don't use magic number in the code like this,
//define a const instead
getstdvec(vec);
}

Related

How the vector gets returned even though it is a local variable inside a method of a class

The vector<int> bfs is local to the method bfs_of_graph, so how can we return it? I would expect it to be erased from memory, so that only garbage values are printed in the calling method. But I find the values are present. How?
class Solution {
public:
vector<int> bfs_of_graph(int no_of_nodes, vector<int> adj_list[]) {
queue<int> qt;
vector<int> bfs;
vector<int> visited(no_of_nodes + 1, 0);
for (int i = 1; i <= no_of_nodes; ++i) {
// Node in graph not visited
if (visited[i] == 0) {
qt.push(i);
visited[i] = 1;
while (!qt.empty()) {
int node = qt.front();
qt.pop();
bfs.push_back(node);
for (auto ele: adj_list[node]) {
if (visited[ele] == 0) {
qt.push(ele);
visited[ele] = 1;
}
}
}
}
}
return bfs;
}
};
Until C++11, the basic behavior according to the standard was to invoke the copy constructor of the returned object (vector<int> in your case). However, the standard explicitly allowed compilers to omit that copy, and many compilers and optimizers applied this copy-elision optimization (also known as RVO).
C++11 introduced move semantics. In some cases, local objects can be moved into the caller's object. You can see here for more info: C++11 rvalues and move semantics confusion (return statement). Some compilers and optimizers still continued to use the copy-elision optimization when appropriate.
C++17 introduced guaranteed copy-elision in certain cases. See here: How does guaranteed copy elision work?.
The bottom line: even without any modern C++ stuff, the code you mentioned can work by creating a copy of the local object (thus constructing the object on the caller side). But nowadays, usually there will be no copy (either a move, or copy-elision all the way).
If you just look at it then you see that the return type is vector<int>. It's not a pointer nor a reference. The vector is returned as value. Which means it gets copied.
Copied to where? Say you have the following:
std::vector<int> v = bfs_of_graph(...);
Internally the compiler places v on the stack and passes the address of v in the structure return register. The return statement would then copy the temporary bfs into the object pointed to by the structure return register.
This is how it used to be. But over time people noticed that this can be rather slow and wanted to get rid of the extra copy on return and that is now possible in modern C++.
One of the ways the standard allows this is copy elision, also known as RVO (return value optimization), or NRVO when the returned object is a named local. This is what the compiler will typically use in your example. There is only a single return bfs; so when you declare vector<int> bfs; at the start of the function, that doesn't actually create a separate object. Instead the compiler uses the object pointed to by the structure return register in place of bfs. So the function modifies the caller's v directly and never has a temporary for bfs at all.
In cases where this is not possible, there is another mechanism that at least reduces the cost of copying, called move semantics. For a vector, move semantics means that the destination vector takes over the data part of the source vector. So it only copies the size, capacity and data pointer, which takes constant time. None of the data (on the heap) itself is copied.
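As a small illustration of the point that returning a local vector by value is safe, here is a sketch with a hypothetical function first_n_squares standing in for bfs_of_graph. It has the same shape: one named local vector and a single return statement.

```cpp
#include <vector>

// Hypothetical stand-in for bfs_of_graph: fills a local vector and
// returns it by value.
std::vector<int> first_n_squares(int n) {
    std::vector<int> out;
    for (int i = 1; i <= n; ++i) {
        out.push_back(i * i);
    }
    // Single named return: a candidate for NRVO; if the compiler does not
    // elide, the vector is moved (C++11 and later) or copied (before C++11).
    return out;
}
```

Whichever mechanism the compiler picks, the caller receives a fully valid vector, for example std::vector<int> v = first_n_squares(4); leaves v holding 1, 4, 9, 16.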

How to return the reference of a declared vector in the method body?

I have this method :
vector<float> MyObject::getResults(int n = 1000)
{
vector<float> results(n, 0);
// do some stuff
return results;
}
Of course this is not optimized and I want to return a reference of this vector but I cannot simply do this :
const vector<float>& MyObject::getResults(int n = 1000)
{
vector<float> results(n, 0);
// do some stuff
return results;
}
This doesn't work: the vector will be destroyed at the end of the method because it's a local variable.
So the only solution I found to solve this problem is to create a private vector in MyObject and return a reference to this vector :
const vector<float>& MyObject::getResults(int n = 1000)
{
this->results.clear();
this->results.resize(n, 0);
// do some stuff
return results;
}
Is this the right way to do that? Do you have any other solution to propose?
What's most efficient?
Return by value. Don't worry, no copying occurs. This is best practice:
// Use this
vector<float> getResults(int n = 1000);
Why is this? Local variables returned from a function are not copied. They are moved into the location where the return value will be stored:
// Result moved into v; no copying occurs
vector<float> v = getResults();
// Result moved into memory allocated by new; no copying occurs
vector<float>* q = new vector<float>(getResults());
How does this work?
When a function returns an object, it returns it in one of two ways:
In the registers
In memory
You can only return simple objects like ints and doubles in the registers. For values returned in memory, the function is passed a pointer to the location that it needs to place the return value.
When you call new vector<float>(getResults());, the following things happen:
The computer allocates memory for a new vector
It gives the location of that memory to getResults(), along with any other parameters.
getResults constructs the vector in that memory, no need to copy.
What about returning a reference to a member variable?
Generally speaking, this is a premature optimization that may not provide much, or any, benefit, and it makes your code more complex and more prone to bugs.
If you assign the output of getResults to a vector, then the data will get copied anyways:
MyObject m;
vector<float> v = m.getResults(); // if getResults returns a const reference, the data gets copied
On the other hand, if you assign the output of getResults to a const reference, this can make managing the lifetime of MyObject much more complex. In the below example, the reference you return is invalidated as soon as the function ends because m gets destroyed.
vector<float> const& evilDoNotUseThisFunction() {
MyObject m;
vector<float> const& ref = m.getResults();
return ref; // This is a bug - ref is invalid when m gets destroyed
}
What's the difference between copying and moving for std::vector?
Copying loops over all the elements of a vector. When a vector is copied, all the data stored by the vector gets copied:
vector<float> a = getVector(); // Get some vector
vector<float> b = a; // Copies a
This is equivalent to the following code:
vector<float> a = getVector(); // Get some vector
vector<float> b(a.size()); // Allocate vector of size a
// Copy data; this is O(n)
float* data = b.data();
for(float f : a) {
*data = f;
data++;
}
Moving doesn't loop over any elements. When a vector is constructed by move, it's as though it's swapped with an empty vector:
vector<float> a = getVector(); // Get some vector
vector<float> b = std::move(a); // Move a into b
is equivalent to:
vector<float> a = getVector(); // Get some vector
vector<float> b; // Make empty vector (no memory allocated)
std::swap(a, b); // Swap a with b; very fast; this is O(1)
TL;DR: Copying copies all the data in a loop. Moving just swaps out who owns the memory.
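The difference can be observed by comparing buffer addresses. This is a sketch assuming the default std::allocator; BufferReport and compare_copy_and_move are hypothetical names introduced for the example.

```cpp
#include <utility>
#include <vector>

// Hypothetical result type for the comparison below.
struct BufferReport {
    bool copy_has_own_buffer;  // the copy allocated a separate buffer
    bool move_took_buffer;     // the moved-to vector owns the original buffer
    bool source_empty;         // the moved-from vector was left empty
};

BufferReport compare_copy_and_move() {
    std::vector<float> a(1000, 1.0f);
    const float* original = a.data();

    std::vector<float> copy = a;              // O(n): new buffer, elements copied
    bool own = copy.data() != original;

    std::vector<float> moved = std::move(a);  // O(1): buffer ownership transferred
    bool took = moved.data() == original;

    return BufferReport{own, took, a.empty()};
}
```

All three fields come out true: the copy lives in its own allocation, while the move construction hands over the existing buffer and leaves the source empty.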
How do we know results gets moved? C++11 requires that local variables get moved automatically when they're returned. You don't have to call move.
Does a swap actually occur? In many cases, no. A swap is already cheap, but the compiler can be clever and optimize out the swap entirely. It does this by constructing your results vector in the memory where it'll be returning results. This is called Named Return Value Optimization. See https://shaharmike.com/cpp/rvo/#named-return-value-optimization-nrvo
Of course this is not optimized
It's fine. Specifically, since C++11 you don't need to do anything extra here.
In any case you should worry about optimization only after you have something that is correct, and a way to profile it.
Anyway, returning a reference to a private vector is less than ideal - it extends the lifetime of the vector unnecessarily, and may lead to re-entrancy problems later, like any other stateful function.
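The re-entrancy concern can be sketched as follows. The class below is a hypothetical reduction of the question's MyObject with both variants side by side; getResultsRef and the two helper functions are names made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reduction of the question's class.
class MyObject {
public:
    // Stateful variant: every call overwrites the same member vector.
    const std::vector<float>& getResultsRef(std::size_t n) {
        results_.assign(n, 0.0f);
        return results_;
    }
    // Value variant: every call produces an independent vector.
    std::vector<float> getResults(std::size_t n) const {
        return std::vector<float>(n, 0.0f);
    }
private:
    std::vector<float> results_;
};

// Size seen through the first reference after a second call was made.
std::size_t size_seen_through_old_reference() {
    MyObject m;
    const std::vector<float>& first = m.getResultsRef(2);
    m.getResultsRef(5);   // silently changes what `first` refers to
    return first.size();
}

// Size of the first result when results are returned by value.
std::size_t size_of_old_value() {
    MyObject m;
    std::vector<float> first = m.getResults(2);
    std::vector<float> second = m.getResults(5);
    (void)second;
    return first.size();
}
```

With the reference variant the first result changes size from 2 to 5 behind the caller's back, while the by-value variant keeps each result independent.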

Delete member of struct on heap after std::move()

I'm currently refactoring existing code and changing it to C++11, and I wonder if I have a memory leak. My code has a struct with a std::vector in it, as well as a shrink() method that reduces this vector to its negative elements.
struct mystruct_t {
int other_stuff;
std::vector <int> loc;
// Adds elements to loc vector
void add(int pos){
loc.push_back(pos);
}
// Shrink the list
void shrink () {
std::vector<int> tmp;
for (unsigned int i = 0; i < loc.size(); ++i) {
if (loc[i] < 0) tmp.push_back (loc[i]);
}
loc = tmp;
std::vector<int>(loc).swap (loc);
}
mystruct_t(): other_stuff(0) {}
};
In another function I create a new instance of this struct like this:
mystruct_t* c = new mystruct_t;
c->add(2);
c->add(3);
...
And later I call the shrink() method of this struct.
c->shrink();
Now I'm not sure what's happening with the "old" loc vector after the shrink function?
Will it get destroyed automatically or do I have to destroy it by hand? And if the latter, how would I do that?
I also tried to change shrink() to more C++11 style by change it to:
void shrink (){
std::vector<int> tmp;
for (auto &currLoc : loc) {
if (currLoc < 0) tmp.push_back (currLoc);
}
loc = std::move(tmp);
}
But the question remains the same: what is happening to the "old" loc vector? Additionally, this seems to increase the memory usage. I'm new to C++11 and not sure if I totally misunderstand the concept.
Now I'm not sure what's happening with the "old" loc vector after the shrink function?
There is no "old" loc vector. Through the lifetime of a mystruct_t object, it has exactly one member vector loc. You never get a new member or throw away an old one.
When you copy assign to the member (loc = tmp;), the buffer contained within the vector is renewed. The vector owns the buffer, and the vector takes care that it is destroyed properly. The same applies when you move assign in the C++11 version.
Will it get destroyed automatically
If you refer to the memory allocated by the vector, then yes.
or do I have to destroyed by hand?
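You have to destroy by hand only what you created by hand. You didn't create the vector or its buffer with new, so you don't call delete on them. (The mystruct_t object itself, which you did create with new, still needs a matching delete.)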
additionally this seems to increase the memory usage.
Your c++11 version lacks the "shrink to fit" part of the original (std::vector<int>(loc).swap (loc);). In c++11 you can do:
loc = std::move(tmp);
loc.shrink_to_fit();
In the pre-C++11 version, you can get rid of the copy assignment and simply construct the temporary from tmp, and swap it with loc:
std::vector<int> tmp;
// copy the objects you want
std::vector<int>(tmp).swap(loc);
std::move itself is just a cast to an rvalue reference, so it causes no additional memory usage.
When you move-assign a vector, the destination takes over the source's buffer pointer instead of copying the elements. So it's a very fast operation: only the pointer to the data changes hands.
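Putting the suggestions from these answers together, a shrink routine could look like the sketch below. It is written as a hypothetical free function, keep_negatives, rather than as the member function from the question, so that it stands on its own.

```cpp
#include <utility>
#include <vector>

// Hypothetical free-function version of shrink(): keeps only the
// negative elements of loc.
void keep_negatives(std::vector<int>& loc) {
    std::vector<int> tmp;
    for (int value : loc) {
        if (value < 0) {
            tmp.push_back(value);
        }
    }
    // loc adopts tmp's buffer; the vector releases loc's previous buffer
    // itself, so no manual delete is needed.
    loc = std::move(tmp);
    // Non-binding request to drop unused capacity.
    loc.shrink_to_fit();
}
```

Because shrink_to_fit() is only a request, the resulting capacity is implementation-dependent; the contents are what the function guarantees.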

Adding elements in loop to container (lifetime)

If I add elements to a vector using the code below, then at the time I call foo, the elements (automatic variables) of vec have been destroyed since the scope in which they are created ends.
std::vector<A> vec;
for (int i = 0; i < n; i++) {
A a;
vec.push_back(a);
}
foo(vec);
My question is now what the textbook solution to such a problem is
No, the elements in vec will be different copies of a.
However, you need to allocate the size of vec if you want to use operator[] or else use vec.push_back():
for (int i = 0; i < n; i++) vec.push_back(A());
EDIT (after question change):
Even though push_back() takes its argument as a reference, internally it will make a copy of it. It takes its argument by reference to avoid making an unnecessary copy prior to making the copy to store internally.
Don't worry about stack variables. When you push a value into a std::vector, the container creates a copy of it on the heap. Thus, all your elements will still exist after you leave the scope.
You can define your variable as a global, set its value inside your loop, and then push it back.
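That push_back stores a copy can be checked directly. In this sketch A is a hypothetical struct with a single id field, and build is a made-up helper that mirrors the loop from the question.

```cpp
#include <vector>

// Hypothetical element type.
struct A {
    int id;
};

// Mirrors the loop from the question: each local `a` is copied into vec.
std::vector<A> build(int n) {
    std::vector<A> vec;
    for (int i = 0; i < n; ++i) {
        A a{i};
        vec.push_back(a);  // stores a copy of a
        a.id = -1;         // changing the local does not touch the stored copy
    }                      // a is destroyed here; the copy in vec lives on
    return vec;
}
```

After build(3), the vector holds elements with ids 0, 1 and 2, even though every local a has gone out of scope and was overwritten with -1 before that.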

passing a vector of pointers and erasing duplicates

I am trying to erase a vector of pointers that I pass by value into some function. The reason why I pass by value is that I plan to erase these values in numerous calls to the function. So if I pass by pointer/reference I could not achieve this.
First of all is the statement above correct?
Here is some example code:
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*>* boson_candidates, vector<Particle*> child_candidates){
vector<Particle*> used_leptons.clear();
// This needs deleting at some point
m_unduplicated_bosons = new vector<Boson*>();
for(int i_b = 0; boson_candidates->size(); i_b++){
vector<Particle*>::iterator child1_finder = find(used_leptons.begin(), used_leptons.end(), boson_candidates->at(i_b)->Child1());
//Search pointer will reach end of collection if child isn't in the used_leptons vector
if (child1_finder == used_leptons.end()) {
vector<Particle*>::iterator child2_finder = find(used_leptons.begin(), used_leptons.end(), boson_candidates->at(i_b)->Child2());
if (child2_finder == used_leptons.end()) {
used_leptons.push_back(boson_candidates->at(i_b)->Child1());
used_leptons.push_back(boson_candidates->at(i_b)->Child2());
// And add the boson to the vector of final bosons
unduplicated_bosons->push_back(boson_candidates->at(i_b));
}
}
}
// Now make a vector of unused leptons
for (int i_l = 0; i_l < used_leptons.size(); i_l++) {
vector<Particle*>::iterator lepton_finder = find(child_candidates.begin(), child_candidates.end(), used_leptons.at(i_l));
child_candidates.erase(lepton_finder);
}
return unduplicated_bosons;
}
I would then use this member function inside the class like so
vector<Boson*> *m_boson_finals_elpair = remove_duplicates(&m_boson_electronPair_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_mupair = remove_duplicates(&m_boson_muonPair_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_elneutrino = remove_duplicates(&m_boson_electronNeutrino_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_muneutrino = remove_duplicates(&m_boson_muonNeutrino_candidates, m_all_particle_candidates);
My question is:
Would m_all_particle_candidates which is
vector<Particle*> m_all_particle_candidates;
be different in each call of remove_duplicates?
I think what I am trying to ask is: since I have passed by value, is it only the entry that lepton_finder points to that is erased from the vector, and not the actual Particle object?
Note: There was a typo in the remove_duplicates function. I passed by pointer and not by value. It should be by value.
I'm a little confused about what you are saying about passing by value and passing by reference, so I'm going to give a short explanation on that first:
When passing by value, the variable that the method is called with remains unchanged (since a copy is passed into the called method). Be careful though, passing by value can also incur a heavy performance penalty, since the whole variable is copied! In the case of a vector holding many elements this might take quite some time!
When passing by reference (or more or less equivalently by pointer) the outer variable is also changed - since you're only passing a reference into the method, which is referencing the same actual space in memory as the original variable!
So basically the difference is that with call by value the original caller's value remains unchanged, while with call by reference a reference to the original caller's value is passed in, and therefore this value can change on both ends.
Now which method is needed simply depends on what you want to achieve. Pass by Value if the variable you're passing into the method should remain unchanged (m_all_particle_candidates in your example). Or if you need it to change, then pass by reference/pointer.
If the passed-in variable shouldn't change, but you also only need a read-only version of the variable inside the method, then the possible performance problems introduced by passing by value can be overcome by using a const reference. In your case, however, you seem to need a full copy (meaning a normal pass-by-value).
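The three ways of passing a vector described above can be sketched side by side. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// By value: works on a private copy; the caller's vector is untouched.
std::size_t drop_last_by_value(std::vector<int> v) {
    if (!v.empty()) {
        v.pop_back();
    }
    return v.size();
}

// By reference: modifies the caller's vector.
std::size_t drop_last_by_reference(std::vector<int>& v) {
    if (!v.empty()) {
        v.pop_back();
    }
    return v.size();
}

// By const reference: read-only access without copying.
std::size_t peek_size(const std::vector<int>& v) {
    return v.size();
}
```

Starting from a vector of three elements, drop_last_by_value returns 2 but leaves the caller's vector at size 3, whereas drop_last_by_reference returns 2 and the caller's vector shrinks to size 2 as well.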
Does the code presented in the OP compile? I don't think so. In fairness, it should be passed through a compiler before posting.
typedef struct {
long double x, y, z;
} V3;
void fnExpectingPtrToVec(vector<V3> * pvec) {
}
void fnExpectingVec(vector<V3> vec) {
}
void testVecs() {
vector<V3> v;
//fnExpectingPtrToVec(v); Does not compile
fnExpectingPtrToVec(&v);
fnExpectingVec(v);
}
If it is expecting a pointer to a vector in the 2nd param, and you passed in a vector instead, then it's a compile error.
When you fix the function to accept a vector, not a pointer to one, and call it with your vector it will make a copy and the repeated calls to the function will leave m_all_particle_candidates unchanged.
You're not passing the vector by value.
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*>* boson_candidates, vector<Particle*> *child_candidates);
will pass a pointer to the vector by value. But the pointer, which is a copy of the original one, will point to the same vector as the original.
So you're basically changing the same vector as outside the call.
To pass by value, you need:
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*> boson_candidates, vector<Particle*> child_candidates);
But be careful when doing so. The vectors are copied on every call, but because they hold pointers, only the pointers are copied: the Boson and Particle objects themselves are not duplicated, and both copies of a vector point to the same objects.
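What a by-value copy of a vector of pointers does and does not protect can be shown with a small sketch. Particle here is a hypothetical one-field struct, not the class from the question, and erase_first_and_rename is a made-up helper.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Particle class.
struct Particle {
    int id;
};

// Takes the vector by value. Returns the number of entries left in the
// local copy after erasing one.
std::size_t erase_first_and_rename(std::vector<Particle*> candidates, int new_id) {
    if (candidates.empty()) {
        return 0;
    }
    candidates.back()->id = new_id;        // visible to the caller: same Particle
    candidates.erase(candidates.begin());  // not visible: only the copy shrinks
    return candidates.size();
}
```

Erasing from the copy leaves the caller's vector with all of its entries, so repeated calls each start from the full list. Writing through one of the copied pointers, however, changes the shared Particle that the caller's vector also points to.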