passing a vector of pointers and erasing duplicates - c++

I am trying to erase elements from a vector of pointers that I pass by value into some function. The reason I pass by value is that I plan to erase these values in numerous calls to the function, so if I passed by pointer/reference I could not achieve this.
First of all, is the statement above correct?
Here is some example code:
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*>* boson_candidates, vector<Particle*> child_candidates){
vector<Particle*> used_leptons;
// This needs deleting at some point
vector<Boson*>* unduplicated_bosons = new vector<Boson*>();
for(int i_b = 0; i_b < boson_candidates->size(); i_b++){
vector<Particle*>::iterator child1_finder = find(used_leptons.begin(), used_leptons.end(), boson_candidates->at(i_b)->Child1());
//Search pointer will reach end of collection if child isn't in the used_leptons vector
if (child1_finder == used_leptons.end()) {
vector<Particle*>::iterator child2_finder = find(used_leptons.begin(), used_leptons.end(), boson_candidates->at(i_b)->Child2());
if (child2_finder == used_leptons.end()) {
used_leptons.push_back(boson_candidates->at(i_b)->Child1());
used_leptons.push_back(boson_candidates->at(i_b)->Child2());
// And add the boson to the vector of final bosons
unduplicated_bosons->push_back(boson_candidates->at(i_b));
}
}
}
// Now make a vector of unused leptons
for (int i_l = 0; i_l < used_leptons.size(); i_l++) {
vector<Particle*>::iterator lepton_finder = find(child_candidates.begin(), child_candidates.end(), used_leptons.at(i_l));
child_candidates.erase(lepton_finder);
}
return unduplicated_bosons;
}
I would then use this member function inside the class like so:
vector<Boson*> *m_boson_finals_elpair = remove_duplicates(&m_boson_electronPair_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_mupair = remove_duplicates(&m_boson_muonPair_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_elneutrino = remove_duplicates(&m_boson_electronNeutrino_candidates, m_all_particle_candidates);
vector<Boson*> *m_boson_finals_muneutrino = remove_duplicates(&m_boson_muonNeutrino_candidates, m_all_particle_candidates);
My question is:
Would m_all_particle_candidates which is
vector<Particle*> m_all_particle_candidates;
be different in each call of remove_duplicates?
I think what I am trying to ask is: is only the element that the iterator lepton_finder points to erased from the vector, and not the actual Particle object, since I have passed by value?
Note: There was a typo in the remove_duplicates function. I passed by pointer and not by value; it should be by value.

I'm a little confused about what you are saying about passing by value and passing by reference, so I'm going to give a short explanation on that first:
When passing by value, the variable that the method is called with remains unchanged (since a copy is passed into the called method). Be careful though, this case can also incur a heavy performance penalty, since the whole variable is copied! In case of a vector holding many elements this might take quite some time! Passing by value is achieved like this in C++:
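A minimal sketch of what that could look like (the function names are made up for illustration, they are not from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the vector is taken BY VALUE, so `values` is a private copy.
std::size_t erase_first(std::vector<int> values) {
    if (!values.empty()) {
        values.erase(values.begin());  // only the copy shrinks
    }
    return values.size();
}

// Returns true if the caller's vector is untouched after the call.
bool caller_unchanged_by_value() {
    std::vector<int> original{1, 2, 3};
    std::size_t size_inside = erase_first(original);
    assert(size_inside == 2);      // the copy lost one element
    return original.size() == 3;   // the original did not
}
```

Every call to erase_first starts from a fresh copy of whatever the caller passes in.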
When passing by reference (or more or less equivalently by pointer) the outer variable is also changed - since you're only passing a reference into the method, which is referencing the same actual space in memory as the original variable!
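The by-reference and by-pointer variants could be sketched like this; again, the helper names are hypothetical and only meant to show the effect on the caller's vector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Both helpers operate on the caller's vector, not on a copy.
void erase_first_by_ref(std::vector<int>& values) {
    if (!values.empty()) {
        values.erase(values.begin());
    }
}

void erase_first_by_ptr(std::vector<int>* values) {
    if (values != nullptr && !values->empty()) {
        values->erase(values->begin());
    }
}

// Returns the size of the caller's vector after one call of each helper.
std::size_t size_after_both_calls() {
    std::vector<int> original{1, 2, 3};
    erase_first_by_ref(original);   // original is now {2, 3}
    assert(original.size() == 2);
    erase_first_by_ptr(&original);  // original is now {3}
    return original.size();
}
```

Here each call sees the changes made by the previous one, which is the opposite of the by-value behaviour.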
So basically the difference is that with call by value the original caller's value remains unchanged, while with call by reference a reference to the original caller's value is passed in, and therefore this value can change on both ends.
Now which method is needed simply depends on what you want to achieve. Pass by value if the variable you're passing into the method should remain unchanged (m_all_particle_candidates in your example). Or if you need it to change, then pass by reference/pointer.
If the passed-in variable shouldn't change, but you also only need a read-only version of the variable inside the method, then the possible performance problems introduced by passing by value can be overcome by using a const reference. In your case, however, you seem to need a full copy (meaning a normal pass-by-value).
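For the read-only case, a const reference could look roughly like this (sum is a hypothetical function used only for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical read-only helper: no copy is made, and the function
// is not allowed to modify `values`.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    // values.push_back(0);  // would not compile: values is const here
    return total;
}

// Returns true if the caller's vector still has its original size afterwards.
bool sum_leaves_input_intact() {
    std::vector<int> original{1, 2, 3};
    assert(sum(original) == 6);
    return original.size() == 3;
}
```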

Does the code presented in the OP compile? I don't think so. In fairness, it should be passed through a compiler before posting.
typedef struct {
long double x, y, z;
} V3;
void fnExpectingPtrToVec(vector<V3> * pvec) {
}
void fnExpectingVec(vector<V3> vec) {
}
void testVecs() {
vector<V3> v;
//fnExpectingPtrToVec(v); Does not compile
fnExpectingPtrToVec(&v);
fnExpectingVec(v);
}
If it is expecting a pointer to a vector in the 2nd param, and you passed in a vector instead, then it's a compile error.
When you fix the function to accept a vector, not a pointer to one, and call it with your vector it will make a copy and the repeated calls to the function will leave m_all_particle_candidates unchanged.

You're not passing the vector by value.
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*>* boson_candidates, vector<Particle*> *child_candidates);
will pass a pointer to the vector by value. But the pointer, which is a copy of the original one, will point to the same vector as the original.
So you're basically changing the same vector as outside the call.
To pass by value, you need:
vector<Boson*>* BosonMaker::remove_duplicates(vector<Boson*> boson_candidates, vector<Particle*> child_candidates);
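But be careful when doing so. Copying will occur, although for a vector of pointers only the pointers are copied: the Boson and Particle objects themselves are shared between the copy and the original, so their copy constructors and assignment operators are not involved. If you need independent objects, you would have to copy them yourself.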
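To make it concrete what a by-value vector of pointers does and does not copy, here is a small sketch. Particle is reduced to a stand-in struct and erase_and_flip is a hypothetical function, so treat it as an illustration rather than the question's real code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { int charge; };  // stand-in for the question's Particle class

// By value: the vector (the list of pointers) is copied, the Particle objects are not.
std::size_t erase_and_flip(std::vector<Particle*> candidates) {
    if (candidates.empty()) {
        return 0;
    }
    candidates.front()->charge = -1;       // changes the shared Particle object
    candidates.erase(candidates.begin());  // removes a pointer from the copy only
    return candidates.size();
}

struct Outcome {
    std::size_t caller_size;  // size of the caller's vector after the call
    int first_charge;         // charge of the first Particle after the call
};

Outcome run_shallow_copy_demo() {
    Particle a{+1};
    Particle b{+1};
    std::vector<Particle*> all{&a, &b};
    erase_and_flip(all);
    return Outcome{all.size(), all.front()->charge};
}
```

So erasing inside the function never shrinks the caller's vector, but anything done through the pointers is visible everywhere.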

Related

C++ storing local variable into a vector prevents it from being destroyed when out of scope

I've encountered this piece of existing code, and being still new to C++, I don't understand why storing a local variable into a vector makes it still accessible later.
Here's the simplified flow of the code, where BackendCall is some class defined earlier:
void SaveBackendCall(BackendCall* backend_call,
vector<BackendCall>* logged_calls) {
logged_calls->push_back(*backend_call);
}
void AddNewCall(vector<BackendCall>* logged_calls) {
BackendCall backend_call; // The local variable in question.
SaveBackendCall(&backend_call, logged_calls);
}
vector<BackendCall> logged_calls;
AddNewCall(&logged_calls);
for (auto i = logged_calls.begin(); i != logged_calls.end(); ++i) {
i->access_stuff(); // This still works?
}
Wouldn't the local variable backend_call in AddNewCall() be destroyed after the function returns? Then I don't understand what is actually stored in logged_calls.
Would it make sense to convert the vector of BackendCall objects into a vector of unique_ptrs?
As the commenters have noted, vector<T>::push_back(someT) pushes a copy of someT onto the vector. It's true that by the time you get to your iteration loop, the original variable named backend_call has already departed for the great stack in the sky (which shall never overflow); or possibly it's been reincarnated as i (who knows? I don't want to debate religion here). But its memory lives on in the form of a facsimile inside logged_calls's buffer.
The vector is a vector of BackendCall objects. When adding objects to a vector, the copy constructor is called.
If you look at SaveBackendCall(), you will notice that it dereferences the pointer, which means it passes a reference to the object, not the pointer, to push_back(). (If the vector stored pointers, this would be dangerous code, since the object pointed to is destroyed when AddNewCall() returns.)
So what is happening here is:
AddNewCall() creates a local variable.
It's passing a pointer to that variable to SaveBackendCall().
SaveBackendCall() is passing a reference to the object to push_back().
push_back() implicitly makes a copy of the BackendCall and adds it to the vector.
Depending on the complexity and size of the BackendCall object, that copy may be an expensive operation. This can become especially problematic when the push_back requires the vector to be resized.
Yes, IMO it would probably be beneficial to use a vector of smart pointers to avoid unnecessary copies.
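The copy made by push_back() can be seen in a small sketch like the one below. BackendCall is reduced to a hypothetical struct with a single name member, since the real class is not shown in the question:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct BackendCall { std::string name; };  // hypothetical stand-in for the real class

void AddNewCall(std::vector<BackendCall>* logged_calls) {
    BackendCall backend_call{"local"};      // destroyed when this function returns
    logged_calls->push_back(backend_call);  // the vector stores its own copy
    backend_call.name = "changed";          // does not affect the stored copy
}

// Returns the name stored in the vector after AddNewCall() has returned.
std::string first_logged_name() {
    std::vector<BackendCall> logged_calls;
    AddNewCall(&logged_calls);
    assert(logged_calls.size() == 1);
    return logged_calls.front().name;
}
```

The element in the vector keeps the value it had at the time of the push_back, independent of what later happens to the local variable.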

meaning of reference and pointer together?

In my project, there is a definition of a function call like this.
int32 Map(void * &pMemoryPointer)
In the calling place, the parameter passed is a void*. Why can't we just receive it as a pointer itself, instead of this?
Without knowing what the Map function does, I'd guess that it sets the pointer. Therefore it has to be passed by reference.
Using a reference to a pointer, you can allocate memory and assign it to the pointer inside the function. For example
void DoSomething(int*& pointerReference)
{
// Do some stuff...
pointerReference = new int[someSize];
// Do some other stuff...
}
The other way to write functions like that is to return the pointer, but since the Map function in the question already returns something else (an int32), that option can't be used here.
Reading it backwards, this means that pMemoryPointer is a reference (&) to a pointer (*) to void. This means that whatever pointer you pass gets referenced, and any modification that the function will do to pMemoryPointer will also affect the original (passed) pointer (e.g. changing the value of pMemoryPointer will also change the value of the original pointer).
why cant we just receive it as a pointer itself, instead of this?
That's because by doing that, you are copying the pointer and any change that you'll make to the copy doesn't reflect to the original one.
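void im_supposed_to_modify_a_pointer(void* ptr) { // Oh no! ptr is only a copy
    ptr = reinterpret_cast<void*>(0xBADF00D); // changes the local copy only
}
int* my_ptr = reinterpret_cast<int*>(0xD06F00D);
im_supposed_to_modify_a_pointer(my_ptr);
ASSERT(my_ptr == reinterpret_cast<int*>(0xBADF00D)); // FAIL!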
That's a weird function prototype IMHO, but it means
(Update) that the Map function accepts a reference to a void pointer as a parameter.
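So I think it is similar in effect to declaring the function like this, except that the caller would then have to pass the address of the pointer explicitly (Map(&p) instead of Map(p)):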
int32 Map(void** pMemoryPointer)
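The three ways of receiving a pointer can be compared side by side in a sketch like this. All names here are hypothetical, and int* is used instead of void* just to keep the example short:

```cpp
#include <cassert>

static int target = 42;  // hypothetical object the pointer should end up pointing at

void set_by_copy(int* p)       { p = &target; }   // changes only the local copy
void set_by_reference(int*& p) { p = &target; }   // changes the caller's pointer
void set_by_pointer(int** p)   { *p = &target; }  // same effect, caller passes &ptr

bool copy_leaves_caller_null() {
    int* ptr = nullptr;
    set_by_copy(ptr);
    return ptr == nullptr;
}

bool reference_updates_caller() {
    int* ptr = nullptr;
    set_by_reference(ptr);
    return ptr == &target;
}

bool pointer_to_pointer_updates_caller() {
    int* ptr = nullptr;
    set_by_pointer(&ptr);  // note the explicit address-of at the call site
    return ptr == &target;
}
```

The reference version and the pointer-to-pointer version have the same effect; they differ only in the syntax at the call site and inside the function.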

Ensure a returned value is not a pointer

I want to make sure that the GetConnections method returns an exact copy of connections. I will be editing it outside of the existing Node, and my program will most likely stop functioning if it returns a pointer to the memory location (thus editing the vector of the node). How can I make sure I am returning a clone / copy and not a pointer?
std::vector<NodeConnection*> Node::GetConnections()
{
    return this->connections;
}
class Node {
private:
    std::vector<NodeConnection*> connections;
public:
    // getters
    std::vector<NodeConnection*> GetConnections();
};
The NodeConnection* in the vector itself will not be edited, so that's not the issue here.
You can tell what you are returning by looking at the function signature:
SomeType* functionName(ArgType arg) - the function returns a pointer. Whatever the pointer is pointing to can be modified by the caller.
SomeType const * functionName(ArgType arg) - the function returns a pointer to const. Whatever the pointer is pointing to can be examined, but cannot be modified by the caller.
SomeType& functionName(ArgType arg) - the function returns a reference. Whatever the reference is referencing can be modified by the caller.
const SomeType& functionName(ArgType arg) - the function returns a const reference. Whatever the reference is referencing can be examined, but cannot be modified by the caller.
SomeType functionName(ArgType arg) - the function returns a copy. Any modifications the caller might do to the returned value will not be reflected on the original being returned.
Your function's return type is that of the fifth kind - you are returning by value, i.e. your code makes a copy of the vector of pointers. It should be noted that although the callers cannot modify the vector inside your class, they could certainly call methods on the objects pointed to by the vector's elements. If some of these methods make changes to the items, the items in the original vector will see these changes as well. In other words, when you copy a vector of pointers, you get a shallow copy.
Also note that it is not necessary to return a copy if all you want is preventing modification: returning a pointer to const or a const reference would achieve the same result with higher efficiency, because the copy would be avoided.
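A rough sketch of both options, with NodeConnection reduced to a hypothetical struct and with Add() and ViewConnections() invented for the example (they are not part of the question's class):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct NodeConnection { int weight; };  // hypothetical stand-in

class Node {
public:
    void Add(NodeConnection* c) { connections.push_back(c); }
    // Returns a copy of the vector; the pointers inside are copied, not the objects.
    std::vector<NodeConnection*> GetConnections() const { return connections; }
    // Returns a read-only view without copying.
    const std::vector<NodeConnection*>& ViewConnections() const { return connections; }
private:
    std::vector<NodeConnection*> connections;
};

// Edits the returned copy and reports the node's own connection count afterwards.
std::size_t count_after_editing_copy() {
    NodeConnection c{1};
    Node node;
    node.Add(&c);
    std::vector<NodeConnection*> copy = node.GetConnections();
    copy.clear();  // only the copy is emptied
    return node.ViewConnections().size();
}

// Shallow copy: the copied pointers still refer to the node's objects.
int weight_after_editing_through_copy() {
    NodeConnection c{1};
    Node node;
    node.Add(&c);
    node.GetConnections().front()->weight = 7;
    return node.ViewConnections().front()->weight;
}
```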

C++ Pass by value/reference

I have the following code snippet:
vector<DEMData>* dems = new vector<DEMData>();
ConsumeXMLFile(dems);
if(!udp_open(2500))
{
}
I want the ConsumeXMLFile method to populate the vector with DEMData objects built from reading an XML file. When ConsumeXMLFile returns, the dems vector is empty. I think I'm running into a pass by value problem.
By the looks of this the ConsumeXMLFile takes a pointer to a vector so I doubt this is a pass by value problem.
Are you at any point reassigning the pointer that is passed into the function? In other words, does your function look anything like this:
void ConsumeXMLFile(vector<DEMData>* dems)
{
// ... some code ...
dems = new vector<DEMData>();
// ...more code...
}
This is a common mistake that I see beginning C++ programmers (and C programmers) make. What is going on here is that a pointer to the dems vector is being passed by value, which means that if you modify the pointed-to vector, that will affect the vector possessed by the caller. However, if you modify the pointer itself (which is passed by value), this will not affect the pointer possessed by the caller. After the re-assignment, the dems pointer in ConsumeXMLFile will point to a totally different vector than the dems pointer that the caller holds.
One of the things that makes me suspect that you might be doing this is that this is C++ and there's no clear reason why you would want to pass a pointer to the vector instead of a reference to the vector otherwise.
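The effect of reassigning the pointer can be reproduced in a few lines. The function names below are hypothetical and the vector holds plain ints instead of DEMData, so this is only a sketch of the suspected mistake, not the actual ConsumeXMLFile:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reassigns its own copy of the pointer; the caller's vector never sees the push_back.
void fill_after_reassign(std::vector<int>* dems) {
    std::vector<int> other;
    dems = &other;       // from here on, dems no longer refers to the caller's vector
    dems->push_back(1);  // lands in `other`, which dies at the end of the function
}

// Modifies the pointed-to vector: visible to the caller.
void fill_through_pointer(std::vector<int>* dems) {
    dems->push_back(1);
}

// The reference version cannot be accidentally re-pointed at another vector.
void fill_by_reference(std::vector<int>& dems) {
    dems.push_back(1);
}

std::size_t size_after_reassign() {
    std::vector<int> dems;
    fill_after_reassign(&dems);
    return dems.size();
}

std::size_t size_after_pointer_fill() {
    std::vector<int> dems;
    fill_through_pointer(&dems);
    return dems.size();
}

std::size_t size_after_reference_fill() {
    std::vector<int> dems;
    fill_by_reference(dems);
    return dems.size();
}
```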

How are objects passed to functions C++, by value or by reference?

Coming from C#, where class instances are passed by reference (that is, a copy of the reference is passed when you call a function, instead of a copy of the value), I'd like to know how this works in C++. In the following case, _poly = poly, is it copying the value of poly to _poly, or what?
#include <vector>
using namespace std;
class polynomial {
vector<int> _poly;
public:
void Set(vector<int> poly);
};
void polynomial::Set(vector<int> poly) {
_poly = poly; // <---------------- this line
}
poly's values will be copied into _poly -- but you will have made an extra copy in the process. A better way to do it is to pass by const reference:
void polynomial::Set(const vector<int>& poly) {
_poly = poly;
}
EDIT I mentioned in comments about copy-and-swap. Another way to implement what you want is
void polynomial::Set(vector<int> poly) {
_poly.swap(poly);
}
This gives you the additional benefit of having the strong exception guarantee instead of the basic guarantee. In some cases the code might be faster, too, but I see this as more of a bonus. The only thing is that this code might be called "harder to read", since one has to realize that there's an implicit copy.
This will do a shallow-copy of the vector of ints. This will generally work as you would expect (_poly will end up containing the same values as poly).
You would see some strange behavior if you had pointers (only the pointer values would be copied, not the objects they point to).
In general, you would want to pass that parameter by const reference:
void polynomial::Set( const vector<int>& poly )
In this case, passing by const reference will not affect the outcome and will be more efficient since it will eliminate an unneeded copy of the vector being passed into the method.
This will copy the entire vector. Assignment is by value in C++. If you are assigning a pointer, the pointer value is assigned. References may not be reassigned to refer to another object once initialized, so assignment of them alters the referent object.
The copy operator for vectors will copy the contents of the vector over.
There are three possibilities:
Pass by value
void someFunction(SomeClass theObject);
Pass a pointer
void someFunction(SomeClass *theObject);
Pass by reference
void someFunction(SomeClass &theObject);
Your vector will be copied.
What's actually going on is that the "=" operator of vector has been overloaded to do the actual copy.
Yes, the line you point to is copying the entire vector. Furthermore, there will be a copy on the function call as well, since the parameter is taken by value rather than by (const) reference.
Basically, if the vector has any size to it, this is VERY expensive.
Unless you assign or pass a parameter by reference (using the & prefix) you are passing by value. For classes, this means that a copy of the object is constructed using either a supplied or implicitly generated (shallow) copy constructor for the type. This can be expensive - and is often undesirable.
In your example, the vector is copied twice - once when it is passed as a parameter to the Set() method, and again when it is assigned to the _poly member.
You could avoid the first copy by passing the vector by reference:
void polynomial::Set(const vector<int>& poly) // passes the original parameter by reference
{
_poly = poly; // still makes a copy
}
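If you want to see the number of copies for yourself, one way is to use an element type that counts them. The sketch below assumes a C++11 compiler; Counted and Holder are hypothetical types invented for the experiment, with Holder playing the role of polynomial:

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Counted {
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { ++copies; return *this; }
};
int Counted::copies = 0;

class Holder {
    std::vector<Counted> _items;
public:
    void SetByValue(std::vector<Counted> items) { _items = items; }            // copy in, copy again
    void SetByConstRef(const std::vector<Counted>& items) { _items = items; }  // one copy
    void SetBySwap(std::vector<Counted> items) { _items.swap(items); }         // copy in, then swap
};

// Each function returns the number of element copies made for a 3-element source.
int copies_by_value() {
    std::vector<Counted> source(3);
    Holder h;
    Counted::copies = 0;
    h.SetByValue(source);
    return Counted::copies;
}

int copies_by_const_ref() {
    std::vector<Counted> source(3);
    Holder h;
    Counted::copies = 0;
    h.SetByConstRef(source);
    return Counted::copies;
}

int copies_by_swap() {
    std::vector<Counted> source(3);
    Holder h;
    Counted::copies = 0;
    h.SetBySwap(source);
    return Counted::copies;
}
```

With three elements, the by-value version should copy each element twice, while the const reference and swap versions should copy each element once.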