I am profiling an old legacy C++ application.
I saw that lots of vector<class> variables are declared:
vector<someClass> myVec1;
vector<someClass> myVec2;
vector<someClass> myVec3;
These vectors can grow quite large. In the code I also found that sometimes this is done:
myVec2 = myVec1;
This assignment actually makes a copy of the data, and if the vector is large the operation is slow.
Is there any way to make myVec2 only a reference to myVec1 without having to refactor all the code that relies on these variables (i.e. allocating them dynamically)?
Note that after the assignment myVec1 is not used anymore.
You could try myVec2.swap(myVec1);, which should be very fast. Since you don't care about myVec1 any more, it doesn't matter that it now contains the original contents of myVec2.
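As a rough sketch of the swap approach, assuming a hypothetical SomeClass in place of the question's someClass and a C++11 compiler for brevity: the helper below checks that the element buffer itself changes owner, so no element is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's someClass.
struct SomeClass {
    int value;
};

// Hands the contents of `source` to `target` without copying any element.
// Afterwards `source` holds whatever `target` held before the call.
void transfer_by_swap(std::vector<SomeClass>& target, std::vector<SomeClass>& source) {
    target.swap(source);  // constant time: only the internal pointers are exchanged
}

// Returns true if the swap handed over the buffer rather than the elements.
bool swap_keeps_buffer() {
    std::vector<SomeClass> myVec1(1000, SomeClass{7});
    std::vector<SomeClass> myVec2(3, SomeClass{1});
    const SomeClass* buffer_before = myVec1.data();
    transfer_by_swap(myVec2, myVec1);
    return myVec2.data() == buffer_before
        && myVec2.size() == 1000
        && myVec1.size() == 3;
}
```

If the old contents of myVec2 should be released right away, clearing myVec1 after the swap is one option.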
You can declare a reference variable like this:
vector<someClass> myVec1;
vector<someClass>& myVec2 = myVec1;
However, if myVec1 goes out of scope, myVec2 will dangle.
Also, if you are using a C++11 compiler, your vector can be 'moved' instead of copied.
Hard to say what you need to do without knowing your use case.
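For the C++11 route mentioned above, a minimal sketch (SomeClass is again a hypothetical stand-in): move assignment lets myVec2 take over the buffer of myVec1. With the default allocator, mainstream implementations do this without touching the elements.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's someClass.
struct SomeClass {
    int value;
};

// Returns true if move assignment handed over the buffer of myVec1.
bool move_keeps_buffer() {
    std::vector<SomeClass> myVec1(1000, SomeClass{7});
    std::vector<SomeClass> myVec2;
    const SomeClass* buffer_before = myVec1.data();
    // myVec1 is left in a valid but unspecified state; only reuse it after
    // assigning to it or calling clear().
    myVec2 = std::move(myVec1);
    return myVec2.data() == buffer_before && myVec2.size() == 1000;
}
```

This only fits because myVec1 is not used after the assignment, as stated in the question.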
If you do not need to reassign your vector variable, you can use a C++ reference like this:
vector<someClass> myVec1;
vector<someClass>& myVec2 = myVec1;
(this works like assigning a constant pointer to myVec2 but preserves the semantics of a "by-value" variable)
More about references in C++ and possible pitfalls here.
Why would you want to? If you want to reference myVec1, then reference it by the name myVec1.
Copying a vector is extremely useful: the copy can serve as a snapshot, or you can initialize a new vector from one that is already in a useful state.
Anyway, copying a few megabytes of data isn't all that slow anymore. Milliseconds at worst.
Related
I'm getting to grips with references in C++ and I have a small query surrounding references & scoping, for this it's probably best to create an example:
Imagine I have a method in "BankDatabase.cpp" which takes a bank record by reference and adds it to a data structure (also by reference).
void AddRecord( BankRecord& bankRecord )
{
//Add record to data structure by reference
}
If I run a method like so:
void TestAddRecord( BankDatabase& bankDatabase )
{
BankRecord bankRecord { "John", "Doe", 9999 };
bankDatabase.AddRecord( bankRecord );
}
To my mind, "bankRecord" falls out of scope (as do its two strings and int) and is thus cleared from memory at the end of the "TestAddRecord" method, leaving "bankDatabase" pointing at some empty memory?
If so, what's the generally accepted standard / resolution to such a scenario? It seems a little mad to have to pass things by value...
In that case, passing by value seems like the way to go. Dynamically allocating the BankRecord and storing a pointer will work too. Storing references to objects you do not own is rarely a good idea.
Note that the two strings and the int are members of bankRecord, so they do not survive either: they are destroyed together with bankRecord when TestAddRecord returns.
The best way to answer these concerns is to step through the code in the debugger and see what the container does with the variable being appended. Look especially at the constructor calls as you step into the data structure's append functions. Because I do not know your underlying data structure, it is difficult for me to tell you more. I will assume it is a std::vector until told otherwise.
You may be surprised to learn that references passed through a function do not tell the entire story about when it will go in and out of scope. I often think of C++ references as pointers that do not need nullptr checks.
Your code will work fine as long as the referenced object is copied into the vector, or the object the reference points to is not destroyed while the container still uses it. In your particular case, the reference stays valid if it refers to a member variable or to memory on the heap that outlives the container.
If the reference refers to a variable created on the stack, and the container keeps only that reference (or a pointer to it) rather than a copy, then you will have lifetime problems once the variable goes out of scope.
You should also look into emplace() if you have C++11 and the compiler supports move semantics.
In short, the most important thing you can do here is step through the code and see what Constructors are being called. This will give you the answer you desire.
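To make the copy-into-the-container case concrete, here is a small sketch. The BankRecord field names and the accessors At and Size are assumptions for illustration, not taken from the question. AddRecord takes its argument by const reference, and push_back then copies the record into storage owned by the database, so the local variable can safely go out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical layout; the question only shows two strings and an int.
struct BankRecord {
    std::string firstName;
    std::string lastName;
    int accountNumber;
};

class BankDatabase {
public:
    // No copy at the call site; push_back makes the one copy that the
    // database owns from then on.
    void AddRecord(const BankRecord& bankRecord) { records_.push_back(bankRecord); }
    const BankRecord& At(std::size_t i) const { return records_.at(i); }
    std::size_t Size() const { return records_.size(); }

private:
    std::vector<BankRecord> records_;
};

void TestAddRecord(BankDatabase& bankDatabase) {
    BankRecord bankRecord{"John", "Doe", 9999};
    bankDatabase.AddRecord(bankRecord);
}  // bankRecord is destroyed here; the copy inside bankDatabase lives on

// Reads the stored record after the local one has gone out of scope.
std::string stored_last_name() {
    BankDatabase db;
    TestAddRecord(db);
    return db.At(0).lastName;
}
```

Passing by const reference and storing by value keeps the cost at a single copy, which is usually what a container of records wants.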
This is happening to me a lot during the making of this program and I thought it was better to ask you guys.
For example, if I have a loop that calls a specific structure of a vector, is it better to call the vector over and over like this:
FP_list[n].callsign=...
FP_list[n].de_airport=...
FP_list[n].ar_airport=...;
FP_list[n].aircraft_type=...
FP_list[n].trueairspeed=...
FP_list[n].FL_route.push_back(Aircraft.GetClearedAltitude());
FP_list[n].last_WP=...
FP_list[n].next_WP=...
...
Or to declare a temporary variable and use it from that point on like this:
FP temp=FP_list[n];
temp.callsign=...
...
temp.next_WP=...
Which one is better in terms of memory consumption and running time?
Thank you in advance
If FP_list is an std::vector or similar you can do:
FP& p = FP_list[n];
^^^ use a reference
p.callsign = ...;
p.de_airport = ...;
p.ar_airport = ...;
This code uses a reference to access the data. A reference gives you direct access to the element it refers to. It works a bit like a pointer. Now you have to call operator[] only once, and your code is much more compact.
As noted in the comments, be careful: references might be invalidated if you make changes to the vector itself, e.g. adding or removing elements.
This assumes you actually want to change the contents stored in the vector. If you do not want to change them, you have to create a copy: FP p = FP_list[n];.
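The difference between the two forms can be sketched as follows, assuming a heavily reduced, hypothetical FP struct: writing through a reference changes the element inside the vector, while writing to a copy leaves the vector untouched.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced version of the question's FP structure.
struct FP {
    std::string callsign;
    int trueairspeed;
};

// Edits the element through a reference; the vector sees the change.
std::string callsign_after_reference_edit() {
    std::vector<FP> FP_list(2);
    FP& p = FP_list[1];
    p.callsign = "ABC123";
    return FP_list[1].callsign;
}

// Edits a copy; the element stored in the vector stays as it was.
std::string callsign_after_copy_edit() {
    std::vector<FP> FP_list(2);
    FP temp = FP_list[1];
    temp.callsign = "ABC123";
    return FP_list[1].callsign;
}
```

With the copy, the changes would have to be written back (FP_list[n] = temp;) to take effect, which costs a second copy.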
Efficiency is a trade-off. The way you wrote the code, it is making a copy of the structure. Depending on how expensive making that copy is, it may be far worse than the extra time to evaluate the index expression.
My conclusion: Write the code as cleanly as possible so it is obvious what it is doing, then let the optimizer in the compiler worry about efficiency. If performance does become an issue, then profile first so you can be sure you are hand-optimizing the right problem.
Assume I have some object, such as:
std::map<int, std::vector<double> > some_map;
Simple question: is it more efficient to do the following
std::vector<double> vec = some_map[some_index];
or referencing it
std::vector<double>& vec = some_map[some_index];
Can anyone explain in short what typically happens behind the scenes here?
Thanks very much in advance!
The two have different semantics, and aren't interchangeable.
The first gives you a copy, which you can modify however you
wish, without changing anything in the map. The second gives
you a reference to the data element in the map; any
modifications modify the contents of the map. Also, although
probably not an issue, be aware that if the map is destructed
before the reference goes out of scope, the reference will
dangle.
With regards to performance, it depends on what's in the vector,
and what you do with it later; in most cases, the reference will
probably have better performance, but you shouldn't worry about
it until the profiler says you have to. (And if you do use the
reference, make it const, unless you really do want to be able
to modify the contents of the map.)
Creating a reference is more efficient, but you should note that these two statements are different in semantics and have different behaviors.
If you do
std::vector<double> vec = some_map[some_index];
The copy constructor of std::vector is called to copy the whole vector some_map[some_index] into vec. In this way, you get a fresh new vector vec. They are independent objects, and any changes to vec do not affect the original map.
If you use
std::vector<double>& vec = some_map[some_index];
then vec refers directly to some_map[some_index] and copy is avoided. However, be aware that if you later change vec, the change will be reflected in both vec and some_map[some_index] since they refer to the same object. To prevent undesirable changes, it is safer to use a const reference:
const std::vector<double>& vec = some_map[some_index];
Referencing is much more efficient, both in terms of memory used and cpu cycles. Your first line of code makes a copy of the vector, which includes copying every item in the vector. In the second, you're simply referring to the existing vector. No copies are made.
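A short sketch of what the answers above describe, using made-up keys and values: the copy is independent of the map, the reference is an alias for the stored vector. The last helper shows a related detail of std::map::operator[], which inserts an empty vector when the key is missing.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Appends to a copy; the vector stored in the map keeps its size.
std::size_t size_after_copy_edit() {
    std::map<int, std::vector<double> > some_map;
    some_map[1] = std::vector<double>(3, 0.5);
    std::vector<double> vec = some_map[1];  // copies all three elements
    vec.push_back(1.0);
    return some_map[1].size();
}

// Appends through a reference; the vector stored in the map grows.
std::size_t size_after_reference_edit() {
    std::map<int, std::vector<double> > some_map;
    some_map[1] = std::vector<double>(3, 0.5);
    std::vector<double>& vec = some_map[1];  // no copy, just an alias
    vec.push_back(1.0);
    return some_map[1].size();
}

// operator[] default-constructs an entry for a key that is not present.
std::size_t entries_after_lookup_of_missing_key() {
    std::map<int, std::vector<double> > some_map;
    const std::vector<double>& vec = some_map[42];
    (void)vec;
    return some_map.size();
}
```

If the lookup must not insert, find() (or at() in C++11) is the usual alternative.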
This is presumably a simple C++ question, but I'm relearning C++ and don't know some of the basics. I have a class that includes a struct with a vector of objects in it, so something like this:
struct my_struct{
Irrelevant_Object object;
vector<tuple> tuple_list;
};
The struct and the tuple (another struct) are predefined by the architecture and given to me in my method, so I can't change them. I want to generate and insert a tuple into the originally empty tuple_list.
The simple solution is to have a method which allocates a new tuple object, fills in the tuple data, then calls tuple_list.push_back() and passes in the allocated tuple. But this would require allocating a new tuple only to have the push_back method copy all of the contents of the (large) tuple struct into an already defined memory space of the vector. So I'm paying the expense of an allocation/delete as well as the lesser expense of copying the tuple contents into the vector to do it this way. It seems rather inefficient, and since this method would be in the critical path of the function I would prefer something faster (admittedly I doubt this method would be the bottleneck, and I know early optimization == bad. However, I'm asking this question more to learn something about C++ syntax than out of a desperate need to actually do this in my code).
So my question is, is there a quicker way to fill the contents of my tuple list without allocating and copying a tuple? If this was an array I could make the array as large as I want, then pass a reference to tuple_list[0] to the function that creates the tuple. That way the function could fill the empty contents of the already allocated tuple within the array without allocating a new one or copying from one tuple to another. I tried to do that with the vector out of curiosity and ended up with a seg fault when my iterator pointed to 0x0, so I assume that syntax doesn't work for vectors. So is there a quick way of doing this assignment?
Since this is a question as much to learn the language as for actual use, feel free to throw in any other tangentially relevant stuff you think is interesting; I'm looking to learn.
Thanks.
In C++11, you can use std::vector::emplace_back, which constructs the new object in-place, therefore there is no copying when you use this method.
By using this method, you could do this:
my_struct some_struct;
some_struct.tuple_list.emplace_back(1, 5, "bleh");
Assuming your tuple object contains this constructor:
tuple::tuple(int, int, const std::string&)
Edit: You can also use move semantics to store a pre-allocated tuple:
my_struct some_struct;
tuple a_tuple;
/* modify a_tuple, initialize it, whatever... */
some_struct.tuple_list.push_back(std::move(a_tuple)); // move it into your vector
Or use a reference to the tuple after it has been stored in the vector:
my_struct some_struct;
some_struct.tuple_list.emplace_back(1, 5, "bleh");
// store a reference to the last element(the one we've just inserted)
tuple &some_tuple = some_struct.tuple_list.back();
some_tuple.foo();
In all of the above solutions you're creating only one tuple while also avoiding copying.
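One way to check this claim is to count copy constructions. The sketch below uses a hypothetical Tuple type with the constructor assumed above plus an instrumented copy constructor; it is not the asker's real struct.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's tuple struct.
struct Tuple {
    static int copies;  // number of copy constructions so far
    int a;
    int b;
    std::string text;

    Tuple(int a_, int b_, const std::string& t) : a(a_), b(b_), text(t) {}
    Tuple(const Tuple& o) : a(o.a), b(o.b), text(o.text) { ++copies; }
    Tuple(Tuple&& o) noexcept : a(o.a), b(o.b), text(std::move(o.text)) {}
};
int Tuple::copies = 0;

// push_back with an lvalue copies the tuple into the vector.
int copies_with_push_back() {
    Tuple::copies = 0;
    std::vector<Tuple> tuple_list;
    Tuple t(1, 5, "bleh");
    tuple_list.push_back(t);
    return Tuple::copies;
}

// emplace_back constructs the tuple directly inside the vector's storage.
int copies_with_emplace_back() {
    Tuple::copies = 0;
    std::vector<Tuple> tuple_list;
    tuple_list.emplace_back(1, 5, "bleh");
    return Tuple::copies;
}

// push_back with std::move uses the move constructor instead of a copy.
int copies_with_move() {
    Tuple::copies = 0;
    std::vector<Tuple> tuple_list;
    Tuple t(1, 5, "bleh");
    tuple_list.push_back(std::move(t));
    return Tuple::copies;
}
```

For a plain struct without a matching constructor, pushing a default-constructed element and then filling it through tuple_list.back() is a comparable option.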
I'm having some trouble finding the best way to accomplish what I have in mind due to my inexperience. I have a class where I need a vector of objects. So my first question will be:
is there any problem having this: vector< AnyType > *container and then on the constructor initialize it with new (and deleting it on the destructor)?
Another question is: if this vector is going to store objects, shouldn't it be more like vector< AnyType* > so they could be dynamically created? In that case how would I return an object from a method and how to avoid memory leaks (trying to use only STL)?
Yes, you can do vector<AnyType> *container and new/delete it. Just be careful when you do subscript notation to access its elements; be sure to say (*container)[i], not container[i], or worse, *container[i], which will probably compile and lead to a crash.
When you do a vector<AnyType>, constructors/destructors are called automatically as needed. However, this approach may lead to unwanted object copying if you plan to pass objects around. Although vector<AnyType> lends itself to better syntactic sugar for the most obvious operations, I recommend vector<AnyType*> for non-primitive objects simply because it's more flexible.
is there any problem having this: vector< AnyType > *container and then on the constructor initialize it with new (and deleting it on the destructor)
No there isn't a problem. But based on that, neither is there a need to dynamically allocate the vector.
Simply make the vector a member of the class:
class foo
{
std::vector<AnyType> container;
...
};
The container will be automatically constructed/destructed along with the instance of foo. Since that was your entire description of what you wanted to do, just let the compiler do the work for you.
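A minimal sketch of that idea, with int standing in for AnyType and the member functions add and size invented for illustration: no constructor, destructor or copy operation has to be written by hand, and copies of the object get their own independent vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Foo {
public:
    void add(int value) { container_.push_back(value); }
    std::size_t size() const { return container_.size(); }

private:
    // Plain member: built and destroyed together with each Foo instance.
    std::vector<int> container_;
};

// The compiler-generated copy constructor deep-copies the vector.
bool copy_is_independent() {
    Foo a;
    a.add(1);
    Foo b = a;
    b.add(2);
    return a.size() == 1 && b.size() == 2;
}
```

With a raw vector pointer and new/delete, the same class would need a hand-written copy constructor and assignment operator to avoid a double delete.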
Don't use new and delete for anything.
Sometimes you have to, but usually you don't, so try to avoid it and see how you get on. It's hard to explain exactly how without a more concrete example, but in particular if you're doing:
SomeType *myobject = new SomeType();
... use myobject for something ...
delete myobject;
return;
Then firstly this code is leak-prone, and secondly it should be replaced with:
SomeType myobject;
... use myobject for something (replacing -> with . etc.) ...
return;
Especially don't create a vector with new - it's almost always wrong because in practice a vector almost always has one well-defined owner. That owner should have a vector variable, not a pointer-to-vector that they have to remember to delete. You wouldn't dynamically allocate an int just to be a loop counter, and you don't dynamically allocate a vector just to hold some values. In C++, all types can behave in many respects like built-in types. The issues are what lifetime you want them to have, and (sometimes) whether it's expensive to pass them by value or otherwise copy them.
shouldn't it be more like vector< AnyType* > so they could be dynamically created?
Only if they need to be dynamically created for some other reason, aside from just that you want to organise them in a vector. Until you hit that reason, don't look for one.
In that case how would I return an object from a method and how to avoid memory leaks (trying to use only STL)?
The standard libraries don't really provide the tools to avoid memory leaks in all common cases. If you must manage memory, I promise you that it is less effort to get hold of an implementation of shared_ptr than it is to do it right without one.
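Since C++11, shared_ptr is part of the standard library as std::shared_ptr, so a sketch of the suggestion could look like this. AnyType here is a hypothetical class with a live-instance counter, and make_object is an invented factory name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical element type that tracks how many instances are alive.
struct AnyType {
    static int alive;
    int id;

    explicit AnyType(int i) : id(i) { ++alive; }
    ~AnyType() { --alive; }
    AnyType(const AnyType&) = delete;
    AnyType& operator=(const AnyType&) = delete;
};
int AnyType::alive = 0;

// Returning a shared_ptr from a method transfers shared ownership safely.
std::shared_ptr<AnyType> make_object(int id) {
    return std::make_shared<AnyType>(id);
}

// Returns the number of live objects after the vector has gone out of scope.
int alive_after_scope() {
    {
        std::vector<std::shared_ptr<AnyType> > objects;
        objects.push_back(make_object(1));
        objects.push_back(make_object(2));
        std::shared_ptr<AnyType> first = objects[0];  // second owner of object 1
        assert(AnyType::alive == 2);
    }  // vector and `first` are destroyed; each object is deleted exactly once
    return AnyType::alive;
}
```

When each object has a single owner, std::unique_ptr is the lighter alternative.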