How should I pass read only vectors in functions in C++?

One way to do it is the usual way as const&
const vector<string>& input
but I was wondering maybe I should do it this way instead:
const vector<const string&>& input
I am not sure which one is better. Basically I want to pass
a vector only for reading purposes, and I want to do it efficiently,
and avoid unnecessary copies.

A vector of references is not possible in C++ and won't compile at all: vector elements must be objects, and a reference is not an object (it may be implemented without any storage).
Besides that, even if it were possible, you can't simply convert a std::vector<T> to a std::vector<T&>, or to a std::vector<T*> (which is a valid type).
Now, aside from the technical aspects: there is no need for this. It would actually create more overhead. If you're passing a reference to your vector, its data is left completely untouched. Typically the only operation performed is passing the address of the vector, on the stack or in a register.
So go with your first solution.
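A minimal sketch of the first solution, to make the point concrete. The function name totalLength is hypothetical and only serves as an illustration; the vector and its strings are read through const references, so nothing is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sums the lengths of all strings without copying
// the vector or any of its elements.
std::size_t totalLength(const std::vector<std::string>& input) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const std::string& s = input[i];  // element access yields a const reference
        total += s.size();
    }
    return total;
}
```

Because input is const, any attempt to call push_back or assign to an element inside the function is rejected at compile time.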

Related

Structure of structures

I'm making a global struct of structures by declaring them in the header file like this:
const int Numberof=8;
struct FP
{
std::string callsign;
std::string de_airport;
std::string ar_airport;
std::string aircraft_type;
int trueairspeed;
std::vector<std::string> route;
std::vector<int> FL_route;
int current_state;
std::string current_WP;
std::string hour_lastWP;
std::string next_WP;
std::string hour_nextWP;
};
struct FP FP_list[Numberof];
Problem is, I defined Numberof as 8 just to make it through my case. In the future that value is going to vary and I won't know its value until way later. What I'm interested in is a way of adding an instance to FP_list every time FP_list is called. Is there any way of doing it?
I know that making std::vector<FP> FP_list and then using vector::push_back is a solution, but since I don't know when and where my program ends I won't be able to swap my vector properly. Is that a problem, not calling .swap(Numberof)?
Edit: Oh and also, what I said about FP_list.swap(Numberof) applies to my objects of structure FP. Will it be a problem if I don't swap route and FL_route?
You can use std::vector<FP> FP_list and then use vector::push_back to add elements to it. Based on the structures given, there's no need to swap anywhere (nor even an option to do so).
swap is to swap the contents of two different vectors of the same type, which you don't have.
If you read somewhere that you should use swap after finishing using a vector, you may have misunderstood the scenario described. I can't come up with a scenario where it would be useful off the top of my head, but it does not relate to when the program finishes (if you consider that swap just swaps the content of 2 vectors, the other vector will still be left with the data, so whether it's in the one or the other, it doesn't really matter - it still needs to be freed).
For future reference, std::vector<FP> FP_list was the solution, and in the function where I fill the structures, the way to add one element to the vector each time I call the function is:
FP_list.push_back(FP());
That's exactly what I was looking for.
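As a rough sketch of that solution, assuming a trimmed-down FP with only three of the original fields: the helper addFlightPlan is a hypothetical name, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Trimmed-down version of the struct from the question (assumption: the
// remaining fields behave the same way).
struct FP {
    std::string callsign;
    std::vector<std::string> route;
    int current_state;
};

std::vector<FP> FP_list;  // grows on demand, so no fixed Numberof is needed

// Hypothetical helper: appends one flight plan and returns a reference to it.
// The reference is only valid until the next push_back on FP_list.
FP& addFlightPlan(const std::string& callsign) {
    FP_list.push_back(FP());  // FP() value-initializes: empty strings, current_state == 0
    FP& fp = FP_list.back();
    fp.callsign = callsign;
    return fp;
}
```

The vector and everything inside it (including route) is released by its destructor, so no swap is needed when the program ends.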

vector copy constructor C++ : does it have to be linear time?

I have a vector containing objects of type STL map, and I do vector.push_back(some map).
This unfortunately calls the map copy constructor, and wastes a lot of time. I understand that I can get around this by keeping a vector of (smart) pointers to maps - but this got me wondering - I read that STL anyway keeps its data on the heap and not on the stack - so why is the copy ctor not O(1) time, by simply copying pointers?
If you don't need the original map anymore after pushing a copy of it into the vector, write:
some_vector.push_back(std::move(some_map));
If you don't have a C++11 compiler yet, add an empty map and then swap that with the original:
some_vector.resize(some_vector.size() + 1);
some_vector.back().swap(some_map);
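Both variants above can be wrapped up as follows. This is only a sketch; the typedef Map and the function names appendByMove and appendBySwap are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> Map;

// C++11: move the map into the vector; its nodes are handed over, not copied.
void appendByMove(std::vector<Map>& v, Map& m) {
    v.push_back(std::move(m));  // m is left in a valid but unspecified state
}

// C++03: append an empty map, then swap contents in constant time.
void appendBySwap(std::vector<Map>& v, Map& m) {
    v.resize(v.size() + 1);
    v.back().swap(m);           // m now holds the (empty) contents of the new element
}
```

In both cases the source map should be treated as emptied afterwards; only the swap version guarantees that it is actually empty.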
To answer your question directly: to do that, it would have to start with some sort of copy on write mechanism -- when you put something into a vector, it's required to be a copy of the original (or at least act like one). For example, if I push a map onto my vector, and then remove an item from the original map, that item should still be there in the copy of the map that was pushed onto the vector.
Then it would have to keep track of all the pointers, and ensure that the pointee (the map in this case) remained valid until all those pointers were themselves destroyed. It's certainly possible to do that. Quite a few languages, for example, provide garbage collection largely for this reason. Most of those change the semantics of things, so when/if you (for example) create a vector of maps, putting a map into the vector has reference semantics -- i.e., when you modify the original map, that's supposed to change any "copies" of it that you put into other collections.
As you've observed, you can do any/all of the above in C++ if you really want. The reason it doesn't right now is that most of the C++ standard library is built around value semantics instead of reference semantics. Either is (IMO, anyway) a perfectly valid and reasonable approach -- some languages take one, others take the other. Either/both can work just fine, but value semantics happens to be the choice that was made in C++.
If you want to copy pointers, create a vector of pointers to map. You can do that.
std::vector<std::map<A,B>* > x;
It doesn't do this automatically because it can't know who you want to manage the memory. Should the objects of the map be destroyed when the vector goes out of scope? What if the original map is still in scope?

c++ vector construct with given memory

I'd like to use a std::vector to control a given piece of memory. First of all, I'm pretty sure this isn't good practice, but curiosity has got the better of me and I'd like to know how to do this anyway.
The problem I have is a method like this:
vector<float> getRow(unsigned long rowIndex)
{
float* row = _m->getRow(rowIndex); // row is now a piece of memory (of a known size) that I control
vector<float> returnValue(row, row+_m->cols()); // construct a new vec from this data
delete [] row; // delete the original memory
return returnValue; // return the new vector
}
_m is a DLL interface class which returns an array of float which it is the caller's responsibility to delete. So I'd like to wrap this in a vector and return that to the user... but this implementation allocates new memory for the vector, copies the data into it, deletes the returned memory, and then returns the vector.
What I'd like to do is to straight up tell the new vector that it has full control over this block of memory so when it gets deleted that memory gets cleaned up.
UPDATE: The original motivation for this (memory returned from a DLL) has been fairly firmly squashed by a number of responders :) However, I'd love to know the answer to the question anyway... Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?
The obvious answer is to use a custom allocator; however, you might find that this is really quite a heavyweight solution for what you need. If you want to do it, the simplest way is to take the allocator defined by the implementation (the default second template argument to vector<>), copy it, and make it work as required.
Another solution might be to define a template specialisation of vector, define as much of the interface as you actually need and implement the memory customisation.
Finally, how about defining your own container with a conforming STL interface, defining random access iterators etc. This might be quite easy given that underlying array will map nicely to vector<>, and pointers into it will map to iterators.
Comment on UPDATE: "Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?"
Surely the simple answer here is "No". Provided you want the result to be a vector<>, then it has to support growing as required, such as through the reserve() method, and that will not be possible for a given fixed allocation. So the real question is really: what exactly do you want to achieve? Something that can be used like vector<>, or something that really does have to in some sense be a vector, and if so, what is that sense?
Vector's default allocator doesn't provide this type of access to its internals. You could do it with your own allocator (vector's second template parameter), but that would change the type of the vector.
It would be much easier if you could write directly into the vector:
vector<float> getRow(unsigned long rowIndex) {
vector<float> row (_m->cols());
_m->getRow(rowIndex, &row[0]); // writes _m->cols() values into &row[0]
return row;
}
Note that &row[0] is a float* and it is guaranteed for vector to store items contiguously.
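A self-contained sketch of this approach follows. The free function fillRow is a hypothetical stand-in for a DLL call that writes into a caller-supplied buffer; the real _m->getRow overload may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the DLL call: writes `cols` values into `out`.
void fillRow(unsigned long rowIndex, float* out, std::size_t cols) {
    for (std::size_t c = 0; c < cols; ++c)
        out[c] = static_cast<float>(rowIndex * cols + c);
}

std::vector<float> getRow(unsigned long rowIndex, std::size_t cols) {
    std::vector<float> row(cols);          // one allocation, owned by the vector
    if (!row.empty())
        fillRow(rowIndex, &row[0], cols);  // &row[0] is only valid for a non-empty vector
    return row;
}
```

The vector is sized up front and never resized by the callee, so the buffer pointer stays valid for the whole call.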
The most important thing to know here is that different DLL/Modules have different Heaps. This means that any memory that is allocated from a DLL needs to be deleted from that DLL (it's not just a matter of compiler version or delete vs delete[] or whatever). DO NOT PASS MEMORY MANAGEMENT RESPONSIBILITY ACROSS A DLL BOUNDARY. This includes creating a std::vector in a dll and returning it. But it also includes passing a std::vector to the DLL to be filled by the DLL; such an operation is unsafe since you don't know for sure that the std::vector will not try a resize of some kind while it is being filled with values.
There are two options:
Define your own allocator for the std::vector class that uses an allocation function that is guaranteed to reside in the DLL/Module from which the vector was created. This can easily be done with dynamic binding (that is, make the allocator class call some virtual function). Since dynamic binding will look-up in the vtable for the function call, it is guaranteed that it will fall in the code from the DLL/Module that originally created it.
Don't pass the vector object to or from the DLL. You can use, for example, a function getRowBegin() and getRowEnd() that return iterators (i.e. pointers) in the row array (if it is contiguous), and let the user std::copy that into its own, local std::vector object. You could also do it the other way around, pass the iterators begin() and end() to a function like fillRowInto(begin, end).
This problem is very real, although many people neglect it without knowing. Don't underestimate it. I have personally suffered silent bugs related to this issue and it wasn't pretty! It took me months to resolve it.
I have checked in the source code, and boost::shared_ptr and boost::shared_array use dynamic binding (first option above) to deal with this.. however, they are not guaranteed to be binary compatible. Still, this could be a slightly better option (usually binary compatibility is a much lesser problem than memory management across modules).
Your best bet is probably a std::vector<shared_ptr<MatrixCelType>>.
Lots more details in this thread.
If you're trying to change where/how the vector allocates/reallocates/deallocates memory, the allocator template parameter of the vector class is what you're looking for.
If you're simply trying to avoid the overhead of construction, copy construction, assignment, and destruction, then allow the user to instantiate the vector, then pass it to your function by reference. The user is then responsible for construction and destruction.
It sounds like what you're looking for is a form of smart pointer. One that deletes what it points to when it's destroyed. Look into the Boost libraries or roll your own in that case.
The Boost.SmartPtr library contains a whole lot of interesting classes, some of which are dedicated to handle arrays.
For example, behold scoped_array:
int main(int argc, char* argv[])
{
boost::scoped_array<float> array(_m->getRow(atoi(argv[1])));
return 0;
}
The issue, of course, is that scoped_array cannot be copied, so if you really want a std::vector<float>, @Fred Nurk's answer is probably the best you can get.
In the ideal case you'd want the equivalent of unique_ptr in array form; C++11 provides exactly that as std::unique_ptr<T[]>.
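For reference, C++11's std::unique_ptr<T[]> is such an array form. A minimal sketch, assuming the array was allocated with new[] in the same module that will delete it; makeRow and ownRow are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for an API that returns a new[]-allocated array
// which the caller must delete[].
float* makeRow(std::size_t cols) {
    float* row = new float[cols];
    for (std::size_t c = 0; c < cols; ++c)
        row[c] = static_cast<float>(c);
    return row;
}

// Takes ownership; delete[] runs automatically when the unique_ptr is destroyed.
std::unique_ptr<float[]> ownRow(std::size_t cols) {
    return std::unique_ptr<float[]>(makeRow(cols));
}
```

Unlike a vector, the smart pointer does not remember the element count, so the caller still has to carry the size separately.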

Avoid making copies with vectors of vectors

I want to be able to have a vector of vectors of some type such as:
vector<vector<MyStruct> > vecOfVec;
I then create a vector of MyStruct, and populate it.
vector<MyStruct> someStructs;
// Populate it with data
Then finally add someStructs to vecOfVec;
vecOfVec.push_back(someStructs);
What I want to do is avoid having the copy constructor calls when pushing the vector. I know this can be accomplished by using a vector of pointers, but I'd like to avoid that if possible.
One strategy I've thought of seems to work, but I don't know if I'm over-engineering this problem.
// Push back an empty vector
vecOfVec.push_back(vector<MyStruct>());
// Swap the empty with the filled vector (constant time)
vecOfVec.back().swap(someStructs);
This seems like it would add my vector without having to do any copies, but this seems like something a compiler would already be doing during optimization.
Do you think this is a good strategy?
Edit: Simplified my swap statement due to some suggestions.
The swap trick is as good as it gets with C++03. In C++0x, you'll be able to use the vector's move constructor via std::move to achieve the same thing in a more obvious way.
Another option is to not create a separate vector<MyStruct>, but instead have the code that creates it accept a vector<MyStruct>& argument and operate on that. Then you add a new empty element to your outer vector<vector<MyStruct> >, and pass a reference to it to the code that will fill it.
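The fill-in-place option could look roughly like this. MyStruct is reduced to a single int member, and populate and appendPopulated are hypothetical names for the example.

```cpp
#include <cassert>
#include <vector>

struct MyStruct { int value; };

// Hypothetical producer: fills a caller-supplied vector instead of returning one.
void populate(std::vector<MyStruct>& out, int count) {
    for (int i = 0; i < count; ++i) {
        MyStruct s;
        s.value = i;
        out.push_back(s);
    }
}

void appendPopulated(std::vector<std::vector<MyStruct> >& vecOfVec, int count) {
    vecOfVec.push_back(std::vector<MyStruct>());  // add an empty inner vector
    populate(vecOfVec.back(), count);             // fill it in place; the filled vector is never copied
}
```

The reference returned by back() must not be kept across a later push_back on the outer vector, since that may reallocate.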
"I know this can be accomplished by using a vector of pointers, but I'd like to avoid that if possible."
Why?
That would be the most intuitive/readable/maintainable solution and would be much better than any weird hacks anyone comes up with (such as the swap you show).
Tim,
There's a common pattern to solve this. This is called smart pointers, and the best one to use is boost::shared_ptr.
Then, never pass the vector by value or store it directly. Instead, store boost::shared_ptr<vector<MyStruct> >. You don't need to care about allocations/deallocations (when the containing vector is destroyed, so will be the others, just as in your code), and you can access the inner members almost the same way. The copy is, however, avoided by means of the smart pointer object's reference counting mechanism.
Let me show you how.
using boost::shared_ptr;
vector<shared_ptr<vector<MyStruct> > > vecOfVecs;
shared_ptr<vector<MyStruct> > someStructs(new vector<MyStruct>);
// fill in the vector someStructs
someStructs->push_back(MyStruct()); // or whatever struct you usually add
//...
vecOfVecs.push_back(someStructs); // Look! No copy!
If you do not already use boost::shared_ptr, I recommend downloading it from boost.org rather than implementing your own. It is a really irreplaceable tool, soon to be in the C++ standard library.
You can either do something like vect.push_back(vector<MyStruct>()); and do vect.back().push_back(MyStruct()); or use smart pointers and have a vector of smart pointers to vector<MyStruct>
I think the swap idea is already fine, but can be written much easier:
vecOfVec.push_back(vector<MyStruct>());
vecOfVec.back().swap(someStructs);

How do I return hundreds of values from a C++ function?

In C++, whenever a function creates many (hundreds or thousands of) values, I used to have the caller pass an array that my function then fills with the output values:
void computeValues(int input, std::vector<int>& output);
So, the function will fill the vector output with the values it computes. But this is not really good C++ style, as I'm realizing now.
The following function signature is better because it doesn't commit to using a std::vector, but could use any container:
void computeValues(int input, std::insert_iterator<int> outputInserter);
Now, the caller can call with some inserter:
std::vector<int> values; // or could use deque, list, map, ...
computeValues(input, std::back_inserter(values));
Again, we don't commit to using std::vector specifically, which is nice, because the user might just need the values in a std::set etc. (Should I pass the iterator by value or by reference?)
My question is: Is the insert_iterator the right or standard way to do it? Or is there something even better?
EDIT: I edited the question to make it clear that I'm not talking about returning two or three values, but rather hundreds or thousands. (Imagine you have to return all the files you find in a certain directory, or all the edges in a graph etc.)
Response to Edit: Well, if you need to return hundreds or thousands of values, a tuple of course would not be the way to go. Best pick the solution with the iterator then, but it's best not to use any specific iterator type.
If you use iterators, you should use them as generic as possible. In your function you have used an insert iterator like insert_iterator< vector<int> >. You lost any genericity. Do it like this:
template<typename OutputIterator>
void computeValues(int input, OutputIterator output) {
...
}
Whatever you give it, it will work now. But it will not work if you have different types in the return set. You can use a tuple then. Also available as std::tuple in the next C++ Standard:
boost::tuple<int, bool, char> computeValues(int input) {
....
}
If the number of values varies and the type of each value comes from a fixed set, like (int, bool, char), you can look into a container of boost::variant. This, however, implies changes only on the call side. You can keep the iterator style from above:
std::vector< boost::variant<int, bool, char> > data;
computeValues(42, std::back_inserter(data));
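The generic output-iterator signature suggested above can be sketched as follows. The body (writing squares) is a made-up computation for illustration, and returning the iterator is a design choice borrowed from standard algorithms, not something the original signature requires.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical computation: writes the squares 0*0 .. (input-1)*(input-1).
template <typename OutputIterator>
OutputIterator computeValues(int input, OutputIterator output) {
    for (int i = 0; i < input; ++i) {
        *output = i * i;  // assignment through the iterator inserts into the container
        ++output;
    }
    return output;        // mirrors standard algorithms such as std::copy
}
```

The same function then works with std::back_inserter(someVector), std::inserter(someSet, someSet.end()), or a plain pointer into a sufficiently large array. The iterator is passed by value, as the standard algorithms do.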
You could return a smart pointer to a vector. That should work and no copy of the vector will be made.
If you don't want to keep the smart pointer for the rest of your program, you could simply create a vector before calling the function, and swap both vectors.
Actually, your old method of passing in the vector has a lot to recommend it -- it's efficient, reliable, and easy to understand. The disadvantages are real but don't apply equally in all cases. Are people really going to want the data in an std::set or list? Are they really going to want to use the long list of numbers without bothering to assign it to a variable first (one of the reasons to return something via 'return' rather than a parameter)? Being generic is nice, but there is a cost in your programming time that may not be redeemed.
If you ever have a group of objects, chances are you have at least a few methods that work on that group of objects (otherwise, what are you doing with them?)
If that's the case, it would make sense to have those methods in a class that contain both said objects and methods.
If that makes sense and you have such a class, return it.
I virtually never find myself thinking that I wish I could return more than one value. By the fact that a method should only do one small thing, your parameters and return values tend to have a relationship, and so are more often than not deserving of a class that contains them, so returning more than one value is rarely interesting (Maybe I wished for it 5 times in 20 years--each time I refactored instead, came up with a better result and realized my first attempt was sub-standard.)
One other option is boost::tuple: http://www.boost.org/doc/libs/1_38_0/libs/tuple/doc/tuple_users_guide.html
int x, y;
boost::tie(x,y) = bar();
A standard container works for homogeneous objects (which you can return).
The standard library way is to abstract an algorithm from the container and use iterators to bridge the gap between.
If you need to pass more than a single type think of structs/classes.
My question is: Is the insert_iterator the right or standard way to do it?
Yes. Otherwise, you would need to have at least as many elements in your container as there will be computed values. This is not always possible, especially if you want to write to a stream. So, you are good.
Your example with insert_iterator won't work, because insert_iterator is a template requiring a container for a parameter. You could declare it
void computeValues(int input, std::insert_iterator<vector<int> > outputInserter);
or
template<class Container>
void computeValues(int input, std::insert_iterator<Container> outputInserter);
The first will tie you back to a vector<int> implementation, without any obvious advantages over your initial code. The second is less restrictive, but implementing as a template will give you other constraints that might make it a less desirable choice.
I'd use something like
std::auto_ptr<std::vector<int> > computeValues(int input)
{
std::auto_ptr<std::vector<int> > r(new std::vector<int>);
r->push_back(...); // Hundreds of these
return r;
}
No copying overhead in the return or risk of leaking (if you use auto_ptr correctly in the caller).
I'd say your new solution is more general, and better style. I'm not sure I'd worry too much about style in c++, more about usability and efficiency.
If you're returning a lot of items, and know the size, using a vector would allow you to reserve the memory in one allocation, which may or may not be worth it.
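A small sketch of the reserve idea, using the vector-reference signature from the question. It assumes the number of results is known to equal input, and the values written (multiples of two) are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: exactly `input` values are produced.
void computeValues(int input, std::vector<int>& output) {
    if (input <= 0)
        return;
    // Reserve room once so the push_back calls below never reallocate.
    output.reserve(output.size() + static_cast<std::size_t>(input));
    for (int i = 0; i < input; ++i)
        output.push_back(i * 2);
}
```

Whether the single allocation is worth it depends on how many values are produced and how often the function is called.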