I have been overthinking (some may say underthinking, let's see what happens) the const-ness of STL containers and their elements.
I have been looking for a discussion of this, but the results have been surprisingly sparse. So I'm not necessarily looking for a definite answer here, I'd be just as happy with a discussion that gets the gears in my head moving again.
Let's say I have a class that keeps std::strings in a std::vector. My class is a dictionary that reads words from a dictionary file. They will never be changed. So it seems prudent to declare it as
std::vector<const std::string> m_myStrings;
However, I've read scattered comments that you shouldn't use const elements in a std::vector, since the elements need to be assignable.
Question:
Are there cases when const elements are used in std::vector (excluding hacks etc)?
Are const elements used in other containers? If so, which ones, and when?
I'm primarily talking about value types as elements here, not pointers.
My class is a dictionary that reads words from a dictionary file. They will never be changed.
Encapsulation can help here.
Have your class keep a vector<string>, but make it private.
Then add an accessor to your class that returns a const vector<string> &, and make the callers go through that.
The callers cannot change the vector, and operator [] on the vector will hand them const string &, which is exactly what you want.
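A minimal sketch of this encapsulation. The class name Dictionary, its constructor, and the accessor name words() are hypothetical, chosen only for illustration:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical dictionary class; all names here are illustrative only.
class Dictionary {
public:
    // The words are handed over once (in the question they would be read
    // from the dictionary file) and are never modified afterwards.
    explicit Dictionary(std::vector<std::string> words)
        : m_words(std::move(words)) {}

    // Callers only ever see the container through a const reference, so
    // operator[] on the result yields const std::string&.
    const std::vector<std::string>& words() const { return m_words; }

private:
    std::vector<std::string> m_words;  // the elements themselves are not const
};
```

With this interface, something like dict.words()[0] = "x" or dict.words().push_back("x") fails to compile, while the class itself can still fill the vector freely.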
No, for the reason you state.
In the context of std::vector, I don't think it makes sense to use a const qualifier with its template parameter because a std::vector is dynamic by nature and may be required to "move" in memory in order to "resize" itself.
In the C++03 standard, std::vector is guaranteed stored in contiguous memory. This almost requires that std::vector be implemented with some form of an array. But how can we create a dynamic size-changing array? We cannot simply just "append" memory to the end of it--that would either require an additional node (and a linked list) or actually physically putting our additional entries at the end of the array, which would be either out-of-bounds or require us to just reserve more memory in the first place.
Thus, I would assume that std::vector needs to allocate a new array, copy or move its members over to the new array, and then delete the old one.
It is not guaranteed that the move or copy assignment of every type that could be used with std::vector leaves the source object unchanged; it is considered good form to add the const qualifier to the source parameter, but it is not required. Therefore, we cannot allow a std::vector<const T>.
Related: How is C++ std::vector implemented?
Consider using
std::vector<std::shared_ptr<const std::string>>
instead?
Let's assume that I have a class
class Foo
{
public:
Foo (const std::string&);
virtual ~Foo()=default;
private:
//some private properties
};
And I want to create many instances of this class. Since I aim for good performance, I want to allocate the memory at once for all of them (at this point, I know the exact number but only at runtime). However, each object shall be constructed with an individual constructor parameter from a vector of parameters
std::vector<std::string> parameters;
Question: How can this be achieved?
My first try was to start with a std::vector<Foo>, then reserve(parameters.size()) and use emplace_back(...) in a loop. However, I cannot use this approach because I use pointers to the individual objects and want to be sure that they are not moved to a different location in memory by the internal methods of std::vector. To avoid this, I tried to delete the copy constructor of Foo, to be sure at compile time that no methods can be called that might copy the objects to a different location, but then I cannot use emplace_back(...) anymore. The reason is that emplace_back might need to grow the vector and copy all the elements to the new location; it does not know that I reserved enough space.
I see three possibilities:
Use vector with reserve + emplace_back. You have the guarantee that your elements don't get moved as long as you don't exceed the capacity.
Use malloc + placement new. This allows you to allocate raw memory and then construct each element one by one e.g. in a loop.
If you already have a range of parameters from which to construct your objects, as in the example, you can probably (depending on your implementation of std::vector) use std::vector's iterator-based constructor like this:
std::vector<Foo> v(parameters.begin(),parameters.end());
The first solution has the advantage of being much simpler, and it keeps all the other goodies of a vector, such as taking care of destruction and keeping the size around.
The second solution might be faster, because you skip the housekeeping done by vector's emplace_back, and it works even with a deleted move/copy constructor if that is important to you, but it leaves you with dozens of possibilities for errors.
The third solution, if applicable, is IMHO the best. It also works with deleted copy/move constructors, should not have any performance overhead, and gives you all the advantages of using a standard container.
It does however rely on the constructor first determining the size of the range (e.g. via std::distance) and I'm not sure if this is guaranteed for any kind of iterators (in practice, all implementations do this at least for random access iterators). Also in some cases, providing appropriate iterators requires writing some boilerplate code.
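A minimal sketch of the first possibility (reserve + emplace_back). The member name_ and the accessor name() are assumptions added for illustration; the question's Foo only shows the constructor and destructor:

```cpp
#include <cassert>
#include <string>
#include <vector>

class Foo {
public:
    explicit Foo(const std::string& name) : name_(name) {}
    virtual ~Foo() = default;
    const std::string& name() const { return name_; }  // hypothetical accessor
private:
    std::string name_;  // hypothetical private property
};

// Builds one Foo per parameter with a single allocation up front.
std::vector<Foo> make_foos(const std::vector<std::string>& parameters) {
    std::vector<Foo> foos;
    foos.reserve(parameters.size());  // one allocation for all elements
    for (const std::string& p : parameters) {
        // size() never exceeds capacity() in this loop, so no reallocation
        // happens and pointers to earlier elements remain valid.
        foos.emplace_back(p);
    }
    return foos;
}
```

Pointers taken into the returned vector stay valid as long as nothing is inserted beyond the reserved capacity. Note that emplace_back still requires Foo to be copyable or movable, which is why this variant does not compile once the copy constructor is deleted.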
I am trying to connect two existing codebases — one in C, the other in C++. The C++ code uses std::vector whereas the other one is based on arrays of double. I would like to pass arrays of double from the C code, perform operations on std::vectors in the C++ code, and eventually have these operations reflected in the arrays of double.
Is it possible to create a std::vector that matches the memory occupied by the array of double?
I have tried several options, but they all involve the creation of a new vector and a copy of the array of double into that vector. For instance:
void fcn(double* a, int sizeofa)
{
std::vector<double> vect_a;
vect_a.assign(a, a + sizeofa);
// operations on vect_a
for (int i=0;i<sizeofa;i++) { a[i] = vect_a[i]; }
}
As noted in the comments, std::vector manages its own memory; you can't make it use some other memory as the backing store (it would have no idea what to do if the size changed, among other issues).
But you may not need a vector at all; if you're just using vector for non-dynamic size related features, it's highly likely you could just use the functions from <algorithm> to perform the same work directly on the array that you wanted to use vector's methods to accomplish.
In C++, functions requiring a container often denote the container by Iterator pairs, which are templated. This is very convenient, especially when interfacing with external libraries, because an iterator isn't a type. It is a concept, which is just an interface that defines what a type should look like. Turns out, C style pointers are valid iterators. This means that you can use any C++ function that accepts an iterator with any C array.
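As a minimal sketch of using pointers as iterators, the function below works directly on the caller's array with <algorithm>, so no copy into a std::vector is needed. The function name process_in_place and the chosen operations (sort, then double each value) are hypothetical stand-ins for the "operations on vect_a" in the question:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical replacement for fcn(): operates on the C array in place.
void process_in_place(double* a, std::size_t n) {
    // Raw pointers satisfy the random-access iterator requirements.
    std::sort(a, a + n);
    // Write the results back into the same array.
    std::transform(a, a + n, a, [](double x) { return 2.0 * x; });
}
```

Because the algorithms write through the pointers, the changes are visible in the C code's array as soon as the function returns.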
So, now to answer your question. In other answers, it was made clear that you cannot make a std::vector control the memory allocated by a C array, because a std::vector requires full ownership of its data and wouldn't know how to deallocate memory it did not allocate. You can copy the C array into a vector, but there is no point in using a std::vector unless you want its resizing capabilities.
In summary: try not to pass std::vectors into functions, because iterators are more generic. If you must avoid templates (virtual functions, etc.), then use a C-style array, because those are very flexible too: you can turn a std::vector into a C array, but the other way requires a copy.
I know this is hard if you have already made your code interface with std::vectors, in which case a copy is the only possible way. Prefer C-style arrays when you don't need to resize the array, and maybe in the future std::array_view (the idea was later standardized in C++20 as std::span).
Is it safe to have a std::array of dynamic object, for example std::array<std::string, 3>, and to resize the contents (the strings) ? (because it can be problematic to have a raw C array of strings)
Yes, because std::array is just a friendly template that wraps an underlying C-style array. You can think of it as something like this:
template <typename T, std::size_t size>
class Array {
    // ...
    T vals[size];  // the elements live directly inside the object
};
Change T to string above and you'll quickly realize that anything that you can do to the contents of an array of strings you can do with a std::array of strings. This includes resizing, deleting, whatever you can imagine.
To think even deeper about it, think about it this way. The std::array holds a string. The string has no idea where it's being held. The array might tell the string to make a copy of itself (through a copy constructor or assignment) when, say, the array itself is assigned. However, this happens entirely through the string's public interface. The fact that the string is being held by some data structure doesn't limit that string's functionality; it just makes the holder (in this case std::array) yet another client of the string's public interface.
As containers like std::array need to work with a large variety of types, they tend to make relatively few, typically well-documented assumptions about the type T passed in: for example, that T can be copy constructed, default constructed, and assigned. It's then typically up to the implementer* of T to ensure these few assumptions are valid.
*There is a very advanced topic called template specialization where one could write a specialized version of array just for say "string". Aside from vector<bool> these are pretty rare with the standard containers.
Assuming you mean resizing the strings, then yes.
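A minimal sketch showing that the strings inside a std::array can be resized freely while the array itself keeps its fixed element count. The function name make_names and the sample values are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <string>

std::array<std::string, 3> make_names() {
    std::array<std::string, 3> names = {{"a", "b", "c"}};
    names[0] += " longer than before";  // the string grows its own buffer
    names[1].resize(10, 'x');           // "b" padded with 'x' to length 10
    names[2].clear();                   // now an empty string
    return names;                       // still exactly 3 elements
}
```

Each std::string manages its own heap buffer, so changing a string's length never changes the size or layout of the array that holds it.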
I'd like to use a std::vector to control a given piece of memory. First of all I'm pretty sure this isn't good practice, but curiosity has the better of me and I'd like to know how to do this anyway.
The problem I have is a method like this:
vector<float> getRow(unsigned long rowIndex)
{
float* row = _m->getRow(rowIndex); // row is now a piece of memory (of a known size) that I control
vector<float> returnValue(row, row+_m->cols()); // construct a new vec from this data
delete [] row; // delete the original memory
return returnValue; // return the new vector
}
_m is a DLL interface class which returns an array of float which it is the caller's responsibility to delete. So I'd like to wrap this in a vector and return that to the user... but this implementation allocates new memory for the vector, copies the data into it, deletes the returned memory, and then returns the vector.
What I'd like to do is to straight up tell the new vector that it has full control over this block of memory so when it gets deleted that memory gets cleaned up.
UPDATE: The original motivation for this (memory returned from a DLL) has been fairly firmly squashed by a number of responders :) However, I'd love to know the answer to the question anyway... Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?
The obvious answer is to use a custom allocator; however, you might find that this is quite a heavyweight solution for what you need. If you want to do it, the simplest way is to take the allocator defined by the implementation (the default second template argument to vector<>), copy it, and make it work as required.
Another solution might be to define a template specialisation of vector, define as much of the interface as you actually need and implement the memory customisation.
Finally, how about defining your own container with a conforming STL interface, defining random access iterators etc. This might be quite easy given that underlying array will map nicely to vector<>, and pointers into it will map to iterators.
Comment on UPDATE: "Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?"
Surely the simple answer here is "No". Provided you want the result to be a vector<>, then it has to support growing as required, such as through the reserve() method, and that will not be possible for a given fixed allocation. So the real question is really: what exactly do you want to achieve? Something that can be used like vector<>, or something that really does have to in some sense be a vector, and if so, what is that sense?
Vector's default allocator doesn't provide this type of access to its internals. You could do it with your own allocator (vector's second template parameter), but that would change the type of the vector.
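To illustrate the point about the type changing, here is a minimal sketch of a C++11-style allocator. MallocAllocator and MallocVector are hypothetical names, and the allocator simply forwards to malloc/free; it does not adopt a pre-existing block, it only shows that the allocator becomes part of the vector's type:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical minimal allocator that obtains memory from malloc/free.
template <class T>
struct MallocAllocator {
    using value_type = T;

    MallocAllocator() = default;
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) {}

    T* allocate(std::size_t n) {
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// Stateless allocators always compare equal.
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) { return false; }

using MallocVector = std::vector<float, MallocAllocator<float>>;

// The allocator is a template argument, so this is not a std::vector<float>.
static_assert(!std::is_same<MallocVector, std::vector<float>>::value,
              "a custom allocator yields a different vector type");
```

A function declared to return std::vector<float> therefore cannot return a MallocVector without copying the elements.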
It would be much easier if you could write directly into the vector:
vector<float> getRow(unsigned long rowIndex) {
vector<float> row (_m->cols());
_m->getRow(rowIndex, &row[0]); // writes _m->cols() values into &row[0]
return row;
}
Note that &row[0] is a float* and it is guaranteed for vector to store items contiguously.
The most important thing to know here is that different DLL/Modules have different Heaps. This means that any memory that is allocated from a DLL needs to be deleted from that DLL (it's not just a matter of compiler version or delete vs delete[] or whatever). DO NOT PASS MEMORY MANAGEMENT RESPONSIBILITY ACROSS A DLL BOUNDARY. This includes creating a std::vector in a dll and returning it. But it also includes passing a std::vector to the DLL to be filled by the DLL; such an operation is unsafe since you don't know for sure that the std::vector will not try a resize of some kind while it is being filled with values.
There are two options:
Define your own allocator for the std::vector class that uses an allocation function that is guaranteed to reside in the DLL/Module from which the vector was created. This can easily be done with dynamic binding (that is, make the allocator class call some virtual function). Since dynamic binding will look-up in the vtable for the function call, it is guaranteed that it will fall in the code from the DLL/Module that originally created it.
Don't pass the vector object to or from the DLL. You can use, for example, a function getRowBegin() and getRowEnd() that return iterators (i.e. pointers) in the row array (if it is contiguous), and let the user std::copy that into its own, local std::vector object. You could also do it the other way around, pass the iterators begin() and end() to a function like fillRowInto(begin, end).
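A minimal sketch of the second option. RowSource is a hypothetical stand-in for the class exported by the DLL, and copyRow is a hypothetical caller-side helper; it uses the vector's iterator-range constructor, which has the same effect as a std::copy into a local vector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical module-side class: it owns the storage and only hands out
// pointers into it, so no allocation or deallocation crosses the boundary.
class RowSource {
public:
    RowSource(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols) {
        // Fill with 0, 1, 2, ... so that rows are easy to tell apart.
        for (std::size_t i = 0; i < data_.size(); ++i)
            data_[i] = static_cast<float>(i);
    }
    const float* getRowBegin(std::size_t row) const {
        return data_.data() + row * cols_;
    }
    const float* getRowEnd(std::size_t row) const {
        return getRowBegin(row) + cols_;
    }
private:
    std::size_t cols_;
    std::vector<float> data_;
};

// Caller side: the vector is created, filled, and destroyed by the caller.
std::vector<float> copyRow(const RowSource& src, std::size_t row) {
    return std::vector<float>(src.getRowBegin(row), src.getRowEnd(row));
}
```

In a real DLL the class would expose only the two pointer-returning functions across the boundary; the std::vector member here is just a convenient way to give the sketch some storage.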
This problem is very real, although many people neglect it without knowing. Don't underestimate it. I have personally suffered silent bugs related to this issue and it wasn't pretty! It took me months to resolve it.
I have checked the source code, and boost::shared_ptr and boost::shared_array use dynamic binding (the first option above) to deal with this. However, they are not guaranteed to be binary compatible. Still, this could be a slightly better option (usually binary compatibility is a much lesser problem than memory management across modules).
Your best bet is probably a std::vector<shared_ptr<MatrixCelType>>.
Lots more details in this thread.
If you're trying to change where/how the vector allocates/reallocates/deallocates memory, the allocator template parameter of the vector class is what you're looking for.
If you're simply trying to avoid the overhead of construction, copy construction, assignment, and destruction, then allow the user to instantiate the vector, then pass it to your function by reference. The user is then responsible for construction and destruction.
It sounds like what you're looking for is a form of smart pointer. One that deletes what it points to when it's destroyed. Look into the Boost libraries or roll your own in that case.
The Boost.SmartPtr library contains a whole lot of interesting classes, some of which are dedicated to handle arrays.
For example, behold scoped_array:
int main(int argc, char* argv[])
{
boost::scoped_array<float> array(_m->getRow(atoi(argv[1])));
return 0;
}
The issue, of course, is that scoped_array cannot be copied, so if you really want a std::vector<float>, @Fred Nurk's answer is probably the best you can get.
In the ideal case you'd want the equivalent of unique_ptr but in array form; C++11 provides exactly that as std::unique_ptr<T[]>.
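A minimal sketch of the array form of unique_ptr, std::unique_ptr<T[]>, which is available since C++11. The function makeRow is a hypothetical stand-in for _m->getRow(), assumed to return memory allocated with new[]:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for _m->getRow(): the caller must delete[] the result.
float* makeRow(std::size_t cols) {
    float* row = new float[cols];
    for (std::size_t i = 0; i < cols; ++i)
        row[i] = static_cast<float>(i);
    return row;
}

// Takes ownership of the raw array; delete[] runs automatically when the
// unique_ptr is destroyed, and ownership can be moved but not copied.
std::unique_ptr<float[]> getRowOwned(std::size_t cols) {
    return std::unique_ptr<float[]>(makeRow(cols));
}
```

Unlike a std::vector, the unique_ptr does not remember the element count, so the caller still has to track the number of columns separately. The DLL heap caveat from the other answer applies here as well.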
I just started learning about pointers in C++, and I'm not very sure on when to use pointers, and when to use actual objects.
For example, in one of my assignments we have to construct a gPolyline class, where each point is defined by a gVector. Right now my variables for the gPolyline class looks like this:
private:
vector<gVector3*> points;
If I had vector< gVector3 > points instead, what difference would it make? Also, is there a general rule of thumb for when to use pointers? Thanks in advance!
The general rule of thumb is to use pointers when you need to, and values or references when you can.
If you use vector<gVector3> inserting elements will make copies of these elements and the elements will not be connected any more to the item you inserted. When you store pointers, the vector just refers to the object you inserted.
So if you want several vectors to share the same elements, so that changes in the element are reflected in all the vectors, you need the vectors to contain pointers. If you don't need such functionality storing values is usually better, for example it saves you from worrying about when to delete all these pointed to objects.
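A minimal sketch of that difference. Point3 is a hypothetical stand-in for gVector3, and the function name is illustrative:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for gVector3.
struct Point3 {
    double x, y, z;
};

// Returns true if the copy stays unchanged while both pointer vectors
// observe the modification made to the original object.
bool copiesVersusPointers() {
    Point3 p = {1.0, 2.0, 3.0};

    std::vector<Point3> byValue;
    byValue.push_back(p);   // stores an independent copy of p

    std::vector<Point3*> first;
    std::vector<Point3*> second;
    first.push_back(&p);    // both vectors refer to the same object
    second.push_back(&p);

    p.x = 42.0;             // modify the original

    return byValue[0].x == 1.0    // the copy is unaffected
        && first[0]->x == 42.0    // the pointers see the change
        && second[0]->x == 42.0;
}
```

The pointers here refer to a local object, so they are only valid while p is alive; with new-allocated objects the vectors' owner would also have to decide who deletes them.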
Pointers are generally to be avoided in modern C++. The primary purpose for pointers nowadays revolves around the fact that pointers can be polymorphic, whereas explicit objects are not.
When you need polymorphism nowadays though it's better to use a smart pointer class -- such as std::shared_ptr (if your compiler supports C++0x extensions), std::tr1::shared_ptr (if your compiler doesn't support C++0x but does support TR1) or boost::shared_ptr.
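A minimal sketch of polymorphic elements held through std::shared_ptr. The Shape hierarchy and makeShapes are hypothetical examples, not part of the question:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical class hierarchy used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
struct Square : Shape {
    std::string name() const override { return "square"; }
};

// A vector of values could hold only one concrete type; a vector of smart
// pointers to the base class can hold any derived type.
std::vector<std::shared_ptr<Shape>> makeShapes() {
    std::vector<std::shared_ptr<Shape>> shapes;
    shapes.push_back(std::make_shared<Circle>());
    shapes.push_back(std::make_shared<Square>());
    return shapes;
}
```

The objects are destroyed automatically when the last shared_ptr referring to them goes away, so no manual delete loop is needed.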
Generally, it's a good idea to use pointers when you have to, but references or, alternatively, objects (think of values) when you can.
First you need to know whether gVector3 fulfils the requirements of standard containers, namely whether the type gVector3 is copyable and assignable. It is useful if gVector3 is default constructible as well (see the UPDATE note below).
Assuming it does, then you have two choices, store objects of gVector3 directly in std::vector
std::vector<gVector3> points;
points.push_back(gVector3(1, 2, 3)); // std::vector will make a copy of the passed object
or manage creation (and also destruction) of gVector3 objects manually.
std::vector<gVector3*> points;
points.push_back(new gVector3(1, 2, 3));
//...
When the points array is no longer needed, remember to walk through all its elements and call the delete operator on each one.
Now, it's your choice whether to manipulate gVector3 objects as values, because (given the condition above) the copy constructor and assignment operator make the following operations possible:
gVector3 v1(1, 2, 3);
gVector3 v2;
v2 = v1; // assignment
gVector3 v3(v2); // copy construction
or you may want or need to allocate objects of gVector3 in dynamic storage using new operator. Meaning, you may want or need to manage lifetime of those objects on your own.
By the way, you may be also wondering When should I use references, and when should I use pointers?
UPDATE: Here is an explanation of the note on default constructibility. Thanks to Neil for pointing out that it was initially unclear. As Neil correctly noticed, it is not required by the C++ standard; I mentioned it because it is an important and useful property. If type T is not default constructible, the user should be aware of potential problems, which I try to illustrate below:
#include <vector>
struct T
{
int i;
T(int i) : i(i) {}
};
int main()
{
// Request vector of 10 elements
std::vector<T> v(10); // Compilation error about missing T::T() function/ctor
}
You can use pointers or objects - it's really the same at the end of the day.
If you have a pointer, you'll need to allocate space for the actual object (then point to it) any way. At the end of the day, if you have a million objects regardless of whether you are storing pointers or the objects themselves, you'll have the space for a million objects allocated in the memory.
When to use pointers instead? If you need to pass the objects themselves around, modify individual elements after they are in the data structure without having to retrieve them each and every time, or if you're using a custom memory manager to manage the allocation, deallocation, and cleanup of the objects.
Putting the objects themselves in the STL structure is easier and simpler. It requires less * and -> operators which you may find to be difficult to comprehend. Certain STL objects would need to have the objects themselves present instead of pointers in their default format (i.e. hashtables that need to hash the entry - and you want to hash the object, not the pointer to it) but you can always work around that by overriding functions, etc.
Bottom line: use pointers when it makes sense to. Use objects otherwise.
Normally you use objects.
It's easier to eat an apple than an apple on a stick (OK, a 2 meter stick, because I like candy apples).
In this case just make it a vector<gVector3>
If you had a vector<gVector3*>, this implies that you are dynamically allocating new gVector3 objects (using the new operator). If so, you need to call delete on these pointers at some point, and std::vector is not designed to do that.
But every rule has an exception.
If gVector3 is a huge object that costs a lot to copy (hard to tell; read your documentation), then it may be more efficient to store pointers. But in this case I would use boost::ptr_vector<gVector3>, as this automatically manages the lifetime of the objects.
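If Boost is not available, a similar effect can be sketched with the standard library only. Note that this swaps in std::vector<std::unique_ptr<T>> (C++11) in place of boost::ptr_vector; BigVector3 and its instance counter are hypothetical and exist only to make the cleanup observable:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical expensive-to-copy type that counts its live instances.
struct BigVector3 {
    static int& alive() { static int n = 0; return n; }

    double x, y, z;
    BigVector3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) { ++alive(); }
    BigVector3(const BigVector3&) = delete;             // never copied
    BigVector3& operator=(const BigVector3&) = delete;
    ~BigVector3() { --alive(); }
};

// Returns the number of live objects after the owning vector is gone.
int liveAfterScope() {
    {
        std::vector<std::unique_ptr<BigVector3>> points;
        points.push_back(std::unique_ptr<BigVector3>(new BigVector3(1, 2, 3)));
        points.push_back(std::unique_ptr<BigVector3>(new BigVector3(4, 5, 6)));
        // Growing the vector moves only the pointers, not the objects.
    }  // the vector destroys each unique_ptr, which deletes its object
    return BigVector3::alive();
}
```

No explicit delete loop is needed, and the objects themselves are never copied when the vector reallocates.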