I want to know if there is a way to keep track of addresses of elements in a vector.
I like to use a vector of plain elements, such as std::vector<MyObject> vec, because:
std::vector does all the allocation / deallocation for me
it also guarantees that my elements are stored in contiguous memory
I can benefit from all the code that works with vectors (e.g. <algorithm>)
This is exactly what I want. The problem arises when I want to store the address / reference of elements of my vector in other objects. More precisely, the problem arises when std::vector needs to reallocate memory.
I tried to find alternatives to my problem:
Use a vector of pointers / smart pointers: No, the pointers will be allocated contiguously but not the elements.
Use a vector of pointers / smart pointers and write my own operator new/new[] for MyObject: Hmm, that seems better, but no. By using a vector to allocate my elements I can say: "this particular set (I do not mean std::set) of elements must be allocated contiguously, not all of them". In fact I may want other sets of elements to be allocated contiguously because of the way I use them, and a vector is exactly what (I think) I need for that. It also implies that I would be doing the job I want the vector to do.
Why not use Boost.MultiIndex?: In some ways that would do what I want, because I want to store pointers / smart pointers to my vector elements in other containers. But no again, because I really want to store references / pointers / smart pointers to my vector's elements inside other objects, not just other containers.
What I would love to have is a vector that can give me a pointer object that will always point to the address of the desired element, and I would use it like this:
std::vector<MyObject> vec;
// insert some elements
...
// get a pointer object by index or by using an iterator
// does something like that exist?
std::vector<MyObject>::pointer ptr = vec.get_pointer_at(5);
// do what I want on the vector except removing the element
...
// use my pointer whatever reallocations occurred or not
ptr->doSomething();
That sounds like an iterator that is never invalidated, except that I don't need/want to perform arithmetic on it (+x, -x, ++, --).
So, can someone point me to a way to achieve what I want, or explain why/where I'm wrong in wanting to do this? Please accept my apologies for my lack of STL knowledge if there is a well-known solution that I missed, or if this question has already been answered.
Edit:
I think that if I have to code such a pointer myself, it means I want something useless or I'm wrong somewhere (unless someone has already written a template for that). So I'm rather looking for a validated C++ idiom to get rid of this problem.
Although std::vector does not give you such a pointer, there is no reason why you cannot make one yourself. All it takes is a class that keeps a reference to the std::vector object and the index, and overloads the unary operator * and the member-access operator -> (you need four overloads: const and non-const for each operator).
You would use this pointer like this:
std::vector<int> vect = {2, 4, 6, 8, 10, 12, 14, 16};
vect_ptr<int> ptr(vect, 5); // <<== You need to implement this
*ptr = 123;
cout << *ptr << endl;
The implementation of these overloads would return the result of calling vect.at(index) (or its address, for operator ->). This would look like a pointer from the outside, but it always refers to whichever element currently sits at that index, even after the std::vector has reallocated.
As far as I know, there is nothing in the standard library nor in Boost to address your problem. A solution would be to implement your own kind of element pointer:
template<typename T>
class vector_element
{
public:
vector_element( std::vector<T>& v, std::size_t i )
: m_container( v ), m_element_index(i)
{ }
T& operator*() { return m_container[m_element_index]; }
T* operator->() { return &m_container[m_element_index]; }
private:
std::vector<T>& m_container;
std::size_t m_element_index;
};
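A slightly fuller sketch of the same idea, with the const overloads mentioned above and a small check that the handle keeps working after the vector has grown. The names element_handle and handle_survives_growth are made up for illustration, and the sketch assumes nothing is inserted or erased in front of the tracked index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical name: an index-based handle that stays usable after reallocation.
template <typename T>
class element_handle
{
public:
    element_handle(std::vector<T>& v, std::size_t i)
        : m_container(&v), m_index(i) {}

    T&       operator*()        { return (*m_container)[m_index]; }
    const T& operator*()  const { return (*m_container)[m_index]; }
    T*       operator->()       { return &(*m_container)[m_index]; }
    const T* operator->() const { return &(*m_container)[m_index]; }

private:
    std::vector<T>* m_container; // a pointer, so the handle itself stays assignable
    std::size_t     m_index;
};

// Returns true if the handle still sees element 5 after the vector has grown.
inline bool handle_survives_growth()
{
    std::vector<int> vec = {2, 4, 6, 8, 10, 12, 14, 16};
    element_handle<int> h(vec, 5);
    *h = 123;
    for (int i = 0; i < 1000; ++i)
        vec.push_back(i); // appending 1000 elements forces at least one reallocation
    return *h == 123;
}
```

The handle costs one extra indirection per access, and it silently refers to a different element if something is inserted or erased before index 5.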
Related
I have a vector of vectors and I wish to delete myvec[i] from memory entirely, free up the room, and so on. Will .erase or .clear do the job for me? If not, what should I do?
Completely Removing The Vector
If you want to completely remove the vector at index i in your myvec, so that myvec[i] will no longer exist and myvec.size() will be one less than it was before, you should do this:
myvec.erase (myvec.begin() + i); // Note: begin() + i needs random-access iterators (vector, deque)
This will completely deallocate all memory owned by myvec[i] and will move all the elements after it (myvec[i + 1], myvec[i + 2], etc.) one place back so that myvec will have one less vector in it.
Emptying But Keeping The Vector
However, if you don't want to remove the ith vector from myvec, and you just want to completely empty it while keeping the empty vector in place, there are several methods you can use.
Basic Method
One technique that is commonly used is to swap the vector you want to empty out with a new and completely empty vector, like this:
// suppose the type of your vectors is vector<int>
vector<int>().swap (myvec[i]);
In practice this frees all the memory in myvec[i]; it's fast and it doesn't allocate any new heap memory.
This is used because the method clear does not offer such a guarantee. If you clear the vector, it always does set its size to zero and destruct all the elements, but it might not (depending on the implementation) actually free the memory.
In C++11, you can do the same with two function calls (thanks for the helpful comment), although shrink_to_fit is formally a non-binding request:
myvec[i].clear();
myvec[i].shrink_to_fit();
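To see the difference between clear() and the swap idiom, one can compare the capacity left behind by each. This is a minimal sketch; the function names are made up, and the exact capacities are what the common standard library implementations report rather than something the standard spells out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Capacity left in an inner vector after clear() alone.
inline std::size_t capacity_after_clear()
{
    std::vector<std::vector<int> > myvec(3, std::vector<int>(1000, 7));
    myvec[1].clear();                  // size becomes 0, the buffer is normally kept
    return myvec[1].capacity();
}

// Capacity left in an inner vector after swapping with an empty temporary.
inline std::size_t capacity_after_swap()
{
    std::vector<std::vector<int> > myvec(3, std::vector<int>(1000, 7));
    std::vector<int>().swap(myvec[1]); // the temporary takes the buffer and frees it
    return myvec[1].capacity();
}
```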
Generalization
You can write a small function that would work for most (probably all) STL containers and more:
template <typename T>
void Eviscerate (T & x)
{
T().swap (x);
}
which you use like this:
Eviscerate (myvec[i]);
This is obviously cleaner and more readable, not to mention more general.
In C++11, you can also use decltype to write a generic solution (independent of the type of your container and elements), but it's very ugly and I only put it here for completeness:
// You should include <type_traits> for std::remove_reference
typename std::remove_reference<decltype(myvec[i])>::type().swap(myvec[i]);
My recommended method is the Eviscerate function above.
myvec.erase( myvec.begin() + i ) will remove myvec[i]
completely, calling its destructor, and freeing all of its
dynamically allocated memory. It will not reduce the memory
used directly by myvec: myvec.size() will be reduced by one,
but myvec.capacity() will be unchanged. To remove this last
residue, C++11 has myvec.shrink_to_fit(), which might remove
it; otherwise, you'll have to make a complete copy of myvec,
then swap it in:
void
shrink_to_fit( MyVecType& target )
{
MyVecType tmp( target.begin(), target.end() );
target.swap( tmp );
}
(This is basically what shrink_to_fit will do under the hood.)
This is a very expensive operation, for very little real gain,
at least with regards to the removal of single elements; if you
are erasing a large number of elements, it might be worth
considering it after all of the erasures.
Finally, if you want to erase all of the elements,
myvec.clear() is exactly the same as myvec.erase() on each
element, with the same considerations described above. In this
case, creating an empty vector and swapping is a better
solution.
I want to ask whether there are any problems with copying a vector of pointer items. Do I need to use strcpy or memcpy because there may be a deep copy problem?
For instance:
class B;
class A
{
....
private:
std::vector<B*> bvec;
public:
void setB(std::vector<B*>& value)
{
this->bvec = value;
}
};
int main()
{
....
std::vector<B*> value; // and already has values (must be B*, not const B*, to match setB)
A a;
a.setB(value);
}
This example only assigns the value to the member variable bvec inside class A. Do I need to use memcpy, since std::vector<B*> bvec; has pointer items? I am confused about deep copies in C++; could you clear that up for me? Thank you.
Think about this: if you remove and delete an item from the vector value after you call setB, then the vector in A will have a pointer that is no longer valid.
So either you need to do a "deep copy", have guarantees that the above scenario will never happen, or use shared smart pointers like std::shared_ptr instead of raw pointers. If you need pointers, I would recommend the last.
There is another alternative, and that is to store the vector in A as a reference to the real vector. However, this has other problems, like the real vector needs to be valid through the lifetime of the object. But here too you can use smart pointers, and allocate the vector dynamically.
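The shared-pointer variant could look roughly like the sketch below. B is reduced to a hypothetical one-member struct, and first() and first_after_source_is_gone() are helpers invented for the demonstration, not part of the question's code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct B { int value; }; // hypothetical stand-in for the question's B

class A
{
public:
    // Copies the shared_ptrs: A now co-owns every B in the vector.
    void setB(const std::vector<std::shared_ptr<B>>& value) { bvec = value; }
    int first() const { return bvec.front()->value; }

private:
    std::vector<std::shared_ptr<B>> bvec;
};

// The source vector goes out of scope, yet the B object stays alive inside A.
inline int first_after_source_is_gone()
{
    A a;
    {
        std::vector<std::shared_ptr<B>> value;
        value.push_back(std::make_shared<B>(B{7}));
        a.setB(value);
    } // value is destroyed here; a still holds a reference to the B
    return a.first();
}
```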
It is unlikely you need strcpy or memcpy to solve your problem. However, I'm not sure what your problem is.
I will try to explain copying as it relates to std::vector.
When you assign value to bvec in setB, the vector is copied element by element. If you have a vector of objects, each object is copied. If you have a vector of pointers, each pointer is copied, but the objects they point to are not, so both vectors end up referring to the same B objects.
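A small sketch of what copying a vector of raw pointers does and does not copy. B is a hypothetical one-member struct and value_seen_through_original is a name made up for the example.

```cpp
#include <cassert>
#include <vector>

struct B { int value; }; // hypothetical stand-in for the question's B

// Copies a vector of pointers, modifies an object through the copy,
// and returns the value seen through the original vector.
inline int value_seen_through_original()
{
    B b1{1}, b2{2};
    std::vector<B*> value = {&b1, &b2};
    std::vector<B*> bvec = value; // copies the two pointers, not the B objects
    bvec[0]->value = 42;          // modifies b1 itself
    return value[0]->value;       // the original vector sees the change
}
```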
Another option is to simply copy the pointer to the vector if you wish to reference the elements later on. Just be careful to manage the lifetimes properly!
I hope that helps!
You probably want to define your copy constructor for class A to ensure the problem you're asking about is handled correctly (though not by using memcpy or strcpy). Always follow the rule of three here. I'm pretty sure with std::vector you're good, but if not, then use a for loop instead of memcpy.
Edit: The below question was answered by this. I have a new updated question, is it any more efficient to use: (my friend said it is inefficient to put a vector of a vector because it uses sequential memory and to realloc when you push_back means it takes more time to find the location where a chunk of memory for the entire large vector can be placed)
(where Picture is a vector of lines, Line is a vector of points)
std::vector<Point> *LineVec;
std::vector<Line> PictureVec;
versus
std::vector<Point> LineVec;
std::vector<Line> PictureVec;
struct Point{
int x;
int y;
};
I'm trying to get a vector of a vector and my friend told me that it's inefficient to put a vector of a vector because it uses sequential memory and a vector of a vector will require huge amounts of space. So what he suggested was using a vector of pointers to vectors. Therefore the inner vector looks like this. Clearly I'm very new to C++ and would appreciate any insight.
struct Shape{
int c;
int d;
};
std::vector<Shape> *intvec;
When I want to push back into this, how would I do so? Something like this?
Shape s;
s.c=1;
s.d=1;
intvec->push_back(s);
Also, I wrote an iterator to go through, however it does not seem to work, hence why I believe the above code does not work. Finally my last concern is, while the above code works, it gives really weird values for my output. Large numbers that are 7 digits long and definitely not the values I put in for s.c and s.d
for(std::vector<Shape>::iterator it=Shapes->begin();it<Shapes->end();it++){
Shape s = (*it);
std::cout << s.c << s.d << std::endl;
}
Using a vector of pointers to vectors is not more efficient than a vector of vectors. It's less efficient, because it introduces an extra level of indirection. It also does not cause all elements of the resulting 2-d array to be allocated contiguously.
The reason is that a vector is practically a pointer to an array, in the sense that a vector<T> is implemented roughly as
template <typename T>
class vector
{
T *p; // pointer to array of elements
size_t nelems, capacity;
public:
// interface
};
so that a vector of vectors behaves, performance-wise, like a dynamic array of pointers to arrays.
[Note: I can't quote the C++ standard chapter and verse, but I'm pretty sure it constrains std::vector's operations and complexity in such a way that the above is the only practical way of implementing it.]
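For the Picture/Line/Point case in the question, a plain vector of vectors needs no pointers at all. A minimal sketch follows; the typedefs and the make_picture helper are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x; int y; };
typedef std::vector<Point> Line;    // a line is a vector of points
typedef std::vector<Line>  Picture; // a picture is a vector of lines

// Builds a picture with `lines` lines of `points_per_line` points each.
inline Picture make_picture(std::size_t lines, std::size_t points_per_line)
{
    Picture picture;
    for (std::size_t i = 0; i < lines; ++i) {
        Line line;
        for (std::size_t j = 0; j < points_per_line; ++j) {
            Point p = { static_cast<int>(i), static_cast<int>(j) };
            line.push_back(p);
        }
        // The outer vector only stores the small Line object (pointer plus
        // bookkeeping); the points themselves live in the line's own heap buffer.
        picture.push_back(line);
    }
    return picture;
}
```

When the outer vector reallocates, only those small Line objects are copied or moved, not one huge contiguous block holding every point.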
As to your updated question about whether it is more efficient to use a pointer to a vector rather than the vector itself: in some cases it is. A specific example is passing a vector as a function parameter.
EX:
void somefunction(std::vector<int> hello)
In this case the copy constructor for std::vector is invoked any time this function is called (which copies the vector completely, INCLUDING the elements contained in the vector). Passing by reference gets rid of this extra copy.
As for whether push_back itself is more efficient when using a pointer to a vector: no, it's not (the two should take roughly the same time).
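The parameter-passing options side by side, as a short sketch. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// By value: the whole vector, elements included, is copied on every call.
inline long sum_by_value(std::vector<int> v)
{
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// By const reference: no copy, and the function cannot modify the caller's vector.
inline long sum_by_const_ref(const std::vector<int>& v)
{
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// By non-const reference: no copy, and changes are visible to the caller.
inline void append_zero(std::vector<int>& v)
{
    v.push_back(0);
}
```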
If I have a vector in C++, I know I can safely pass it as an array (pointer to the contained type):
void some_function(size_t size, int array[])
{
// impl here...
}
// ...
std::vector<int> test;
some_function(test.size(), &test[0]);
Is it safe to do this with a nested vector?
void some_function(size_t x, size_t y, size_t z, int* multi_dimensional_array)
{
// impl here...
}
// ...
std::vector<std::vector<std::vector<int> > > test;
// initialize with non-jagged dimensions, ensure they're not empty, then...
some_function(test.size(), test[0].size(), test[0][0].size(), &test[0][0][0]);
Edit:
If it is not safe, what are some alternatives, both if I can change the signature of some_function, and if I can't?
Short answer is "no".
The elements of std::vector<std::vector<std::vector<int> > > test; are not placed in one contiguous memory area.
You can only expect multi_dimensional_array to point to a contiguous memory block of size test[0][0].size() * sizeof(int). But that is probably not what you want.
It is risky to take the address of a location in a vector and hold on to it. It might seem to work, but don't count on it once the vector can grow.
The reason is closely tied to why a vector is a vector and not an array. We want a vector to grow dynamically, unlike an array. We want appending to a vector to have a constant cost that does not depend on the size of the vector, as with an array until you hit its allocated size.
So how does the magic work? When there is no more internal space for the next element, a new, larger block is allocated (typically 1.5 to 2 times the size of the old one, depending on the implementation). The old elements are copied or moved to the new block and the old block is released, which leaves any pointer into the old block dangling. Growing geometrically is what makes the average (amortized) cost of an insertion at the end constant.
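The growth behaviour can be observed without touching any dangling pointer by watching capacity(). This is a minimal sketch with made-up function names; the exact number of reallocations depends on the implementation's growth factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the buffer was replaced while appending n elements.
inline std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t moves = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) { // capacity changed: new buffer
            ++moves;
            last_capacity = v.capacity();
        }
    }
    return moves;
}

// With reserve() up front, push_back never reallocates until size exceeds
// the reserved capacity, so the address of the first element stays the same.
inline bool address_stable_with_reserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);
    v.push_back(0);
    const int* first = &v[0];
    for (std::size_t i = 1; i < n; ++i)
        v.push_back(static_cast<int>(i));
    return first == &v[0];
}
```

Because the capacity grows geometrically, the number of reallocations stays small compared to the number of insertions.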
Is it safe to do this with a nested vector?
Yes, IF you want to access the inner-most vector only, and as long as you know the number of elements it contains and you don't try accessing more than that.
But judging by your function signature, it seems that you want to access all three dimensions; in that case, no, that isn't valid.
The alternative is that you can call the function some_function(size_t size, int array[]) for each inner-most vector (if that solves your problem); and for that you can do this trick (or something similar):
void some_function(std::vector<int> & v1int)
{
//the final call to some_function(size_t size, int array[])
//which actually process the inner-most vectors
some_function(v1int.size(), &v1int[0]);
}
void some_function(std::vector<std::vector<int> > & v2int)
{
//call some_function(std::vector<int> & v1int) for each element!
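//(needs <algorithm>; the cast selects one overload, a bare some_function is ambiguous here)
std::for_each(v2int.begin(), v2int.end(), static_cast<void (*)(std::vector<int> &)>(some_function));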
}
//call some_function(std::vector<std::vector<int> > & v2int) for each element!
std::for_each(test.begin(), test.end(), static_cast<void (*)(std::vector<std::vector<int> > &)>(some_function));
A very simple solution would be to simply copy the contents of the nested vector into one vector and pass it to that function. But this depends on how much overhead you are willing to take.
That being said: nested vectors aren't good practice. A matrix class that stores everything in contiguous memory and manages access is more efficient and less ugly, and could offer something like T* matrix::get_raw(), although the ordering of the contents would still be an implementation detail.
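Such a matrix class could be sketched as follows. The class name, the row-major layout and the member names are choices made for this example, not an established API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix: one contiguous buffer, row-major element order.
template <typename T>
class Matrix
{
public:
    Matrix(std::size_t rows, std::size_t cols, const T& init = T())
        : m_rows(rows), m_cols(cols), m_data(rows * cols, init) {}

    T&       operator()(std::size_t r, std::size_t c)       { return m_data[r * m_cols + c]; }
    const T& operator()(std::size_t r, std::size_t c) const { return m_data[r * m_cols + c]; }

    std::size_t rows() const { return m_rows; }
    std::size_t cols() const { return m_cols; }

    // Raw pointer to rows() * cols() contiguous elements, e.g. for a C API.
    T* get_raw() { return m_data.data(); }

private:
    std::size_t    m_rows;
    std::size_t    m_cols;
    std::vector<T> m_data;
};
```

Note that T = bool would not work with get_raw(), because std::vector<bool> is a packed specialization without data().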
Simple answer - no, it is not. Did you try compiling this? And why not just pass the whole 3D vector as a reference? If you are trying to access old C code in this manner, then you cannot.
It would be much safer to pass the vector, or a reference to it:
void some_function(std::vector<std::vector<std::vector<int>>> & vector);
You can then get the size and items within the function, leaving less risk for mistakes. You can copy the vector or pass a pointer/reference, depending on expected size and use.
If you need to pass across modules, then it becomes slightly more complicated.
Trying to use &top_level_vector[0] and pass that to a C-style function that expects an int* isn't safe.
To support correct C-style access to a multi-dimensional array, all the bytes of the whole hierarchy of arrays would have to be contiguous. In a C++ std::vector, this is true for the items contained by a vector, but not for the vector itself. If you try to take the address of the top-level vector's first element, à la &top_level_vector[0], you're going to get an array of vectors, not an array of int.
The vector structure isn't simply an array of the contained type. It is implemented as a structure containing a pointer, as well as size and capacity book-keeping data. Therefore the question's std::vector<std::vector<std::vector<int> > > is more or less a hierarchical tree of structures, stitched together with pointers. Only the final leaf nodes in that tree are blocks of contiguous int values. And each of those blocks of memory is not necessarily contiguous with any other block.
In order to interface with C, you can only pass the contents of a single vector. So you'll have to create a single std::vector<int> of size x * y * z. Or you could decide to re-structure your C code to handle a single 1-dimensional stripe of data at a time. Then you could keep the hierarchy, and only pass in the contents of leaf vectors.
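The first option, copying into a single vector of size x * y * z, might look like this sketch. flatten and sum_block are hypothetical names; sum_block merely stands in for the real C-style function and assumes row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C-style function that expects x*y*z contiguous ints.
inline long sum_block(std::size_t x, std::size_t y, std::size_t z, const int* data)
{
    long total = 0;
    for (std::size_t i = 0; i < x * y * z; ++i)
        total += data[i];
    return total;
}

// Copies a non-jagged nested vector into one contiguous block; the element
// at [i][j][k] ends up at index (i * y + j) * z + k.
inline std::vector<int> flatten(const std::vector<std::vector<std::vector<int> > >& nested)
{
    std::vector<int> flat;
    for (std::size_t i = 0; i < nested.size(); ++i)
        for (std::size_t j = 0; j < nested[i].size(); ++j)
            flat.insert(flat.end(), nested[i][j].begin(), nested[i][j].end());
    return flat;
}
```

The copy costs one pass over the data per call, so keeping the data flat from the start is cheaper if the C function is called often.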
If I want to declare a vector of unknown size, then assign values to index 5, index 10, index 1, index 100, in that order. Is it easily doable in a vector?
It seems there's no easy way, because if I initialize a vector without a size, I can't access index 5 without first allocating memory for it by doing resize() or six push_back()'s. But resize clears previously stored values in a vector. I can construct the vector by giving it a size to begin with, but I don't know how big the vector should be.
So how can I not have to declare a fixed size, and still access non-continuous indices in a vector?
(I doubt an array would be easier for this task).
Would an std::map between integer keys and values not be an easier solution here? Vectors will require a contiguous allocation of memory, so if you're only using the occasional index, you'll "waste" a lot of memory.
Resize doesn't clear the vector. You can easily do something like:
if (v.size() <= n)
v.resize(n+1);
v[n] = 42;
This will preserve all values in the vector and add just enough default initialized values so that index n becomes accessible.
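Wrapped in a helper, the same check handles the out-of-order assignments from the question (indices 5, 10, 1, 100). The name set_at is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns v[n] = value, growing the vector first if index n does not exist yet.
inline void set_at(std::vector<int>& v, std::size_t n, int value)
{
    if (v.size() <= n)
        v.resize(n + 1); // existing elements are kept, new ones are zero
    v[n] = value;
}
```

After setting indices 5, 10, 1 and 100 in that order, the vector has 101 elements and every index that was never assigned holds 0.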
That said, if you don't need all indexes or contiguous memory, you might consider a different data structure.
resize() doesn't clear previously stored values in a vector; existing elements are kept and only the newly added ones are value-initialized.
I would also argue that if this is what you need to do, then it's possible that vector is not the container for you. Did you consider using a map, maybe?
Data structures which do not contain a contiguous set of values are known as sparse or compressed data structures. It seems that this is what you are looking for.
If this is the case, you want a sparse vector. There is one implemented in Boost.
Sparse structures are typically used to conserve memory. It is possible from your problem description that you don't actually care about memory use, but about addressing elements that don't yet exist (you want an auto-resizing container). In this case a simple solution with no external dependencies is as follows:
Create a template class that holds a vector and forwards all vector methods to it. Change your operator[] to resize the vector if the index is out of bounds.
// A vector that resizes on dereference if the index is out of bounds.
template<typename T>
struct resize_vector
{
typedef typename std::vector<T>::size_type size_type;
// ... Repeat for iterator/value_type typedefs etc
size_type size() const { return m_impl.size(); }
// ... Repeat for all other vector methods you want
T& operator[](size_type i)
{
if (i >= size())
m_impl.resize(i + 1); // Grow so that index i becomes valid
return m_impl[i];
}
// You may want a const overload of operator[] that throws
// instead of resizing (or make m_impl mutable, but that's ugly).
private:
std::vector<T> m_impl;
};
As noted in other answers, elements aren't cleared when a vector is resized. Instead, when new elements are added by a resize, they are value-initialized (for a class type, its default constructor is called). You therefore need to know when using this class that operator[] may return a reference to a default-constructed object. The default constructor of T should therefore set the object to a sensible value for this purpose. You may use a sentinel value, for example, if you need to know whether the element has previously been assigned a value.
The suggestion to use a std::map<size_t, T> also has merit as a solution, provided you don't mind the extra memory use per element, non-contiguous element storage and O(log N) lookup rather than O(1) for the vector. This all boils down to whether you want a sparse representation or automatic resizing; hopefully this answer covers both.
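For comparison, the std::map variant of the same use case, as a minimal sketch. The helper name make_sparse and the stored values are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stores values only at the indices that were actually assigned.
inline std::map<std::size_t, int> make_sparse()
{
    std::map<std::size_t, int> m;
    m[5]   = 50;   // operator[] creates the entry on first use
    m[10]  = 100;
    m[1]   = 10;
    m[100] = 1000;
    return m;
}
```

The map holds four entries instead of 101 elements, and count() or find() tells you whether an index was ever assigned, without needing a sentinel value. Iteration visits the keys in ascending order.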