Checking whether element exists in a Vector - c++

I have a vector and it is going to hold 10 objects maximum. When I create the vector I believe it creates 10 "blank" objects of the class it will hold. Therefore I have a problem because my code relies on checking that a vector element is null and obviously it is never null.
How can I check whether a vector object contains an element I inserted, or one of the default constructor "blank" objects upon initialisation?
Is there a technique round this?
(I need to check for null because I am writing a recursive algorithm and the termination point, when the recursive function returns is when an object in the vector is null)

An instance of a class cannot be null. Only a pointer.
You do, however, have size() that you can use.
typedef stdd::vector<SomeClass> vec;
//define some vec, v
for (vec::size_type i = 0, s = vec.size(); i < s; ++i) {
//do something with v[i]
}
With a recursive function you could use this idea by passing along a maximum index.
void recursiveFunc(vec& v, vec::size_type s);
Then when checking your condition to recurse, you would need to check "am I at the end of the vector?"
Alternatively, instead of working on indexes, you could use iterators:
template <typename Iterator>
void recursiveFunc(Iterator begin, const Iterator& end);
If done correctly (and if possible in your situation), this could decouple your manipulation from being aware of the underlying data being stored in a vector.
The loop to go over the vector would then look like:
while (begin != end) {
//do something with *begin
++begin;
}

std::vector only inserts "real" objects. It (at least normally) allocates raw memory, and uses placement new to construct objects in that memory as needed. The only objects it'll contain will be the ones you put there.
Of course, if you want to, you can create a vector containing a number of copies of an object you pass to the constructor. Likewise, when you resize a vector, you pass an object it'll copy into the new locations if you're making it larger.
Neither of those is really the norm though. In a typical case, you'll just create a vector, which will start out containing 0 objects. You'll use push_back to add objects to the vector. When you search through the vector, the only objects there will be the ones you put there with push_back, and you don't need to worry about the possibility of it containing any other objects.
If you just want to check whether the vector is empty, you can just use:
if (your_vector.empty())
...which will (obviously enough) return true if it's empty, and false if it contains at least one object.

As #Corbin mentioned, size() will return the number of elements in the vector. It is guaranteed not to have any holes in between(contiguous), so you assured vector[vector.size()] is empty.

Related

What's the fastest way to reinitialize a vector?

What is the fastest way to reset all values for a large vector to its default values?
struct foo
{
int id;
float score;
};
std::vector<foo> large_vector(10000000);
The simplest way would be to create a new vector, but I guess it takes more time to reallocate memory than to reinitialize an existing one?
I have to iterate over the vector to collect non-zero scores (could be thousands or millions) before resetting it. Should I reset the structs one by one in this loop?
Edit:
The vector size is fixed and 'default value' means 0 for every struct member (all floats and ints).
What's the fastest way to reinitialize a vector?
Don't.
Just record the fact that the vector has no valid entries by calling clear(). This has the advantage of being both (probably) optimal, and guaranteed correct, and also being perfectly expressive. IMO none of the suggested alternatives should be considered unless profiling shows an actual need.
Your element type is trivial, so the linear upper-bound on complexity should in reality be constant for a decent quality implementation - there's no need to destroy each element in turn.
No memory is deallocated, or needs to be re-allocated later.
You'll just need to push_back or emplace_back when you're writing into the vector after clear()ing, instead of using operator[].
To make this consistent with the first use, don't initialize your vector with 10000000 value-constructed elements, but use reserve(10000000) to pre-allocate without initialization.
eg.
int main() {
vector<foo> v;
v.reserve(10000000);
while(keep_running) {
use(v);
v.clear();
}
}
// precondition: v is empty, so
// don't access v[i] until you've done
// v.push_back({id,score})
// at least i+1 times
void use(vector<foo> &v) {
}
Since you need to zero your elements in-place, the second fastest general-purpose solution is probably to alter the loop above to
while(keep_running) {
v.resize(10000000);
use(v);
v.clear();
}
or alternatively to remove the clear() and use fill() to overwrite all elements in-place.
If non-zero elements are sparse, as may be the case if you're updating them based on some meaningful index, it might be faster to zero them on the fly as your main loop iterates over the vector.
Again, you really need to profile to find out which is better for your use case.
In order to determine the fastest way you will need to run some benchmarks.
There are a number of different ways to "reinitialise" a vector:
Call clear(), for trivial types this should be roughly equivalent to just doing vector.size = 0. The capacity of the vector doesn't change and no elements are deallocated. Destructors will be called on elements if they exist. As you push_back, emplace_back or resize the vector the old values will be overwritten.
Call assign(), e.g. large_vector.assign( large_vector.size(), Foo() );. This will iterate through the whole vector resetting every element to its default value. Hopefully the compiler will manage to optimise this to a memset or similar.
As your type is trivial, if you want to just reset every element to 0 you should be able to do a memset, e.g.: memset( large_vector.data(), 0, sizeof(Foo)*large_vector.size() );.
Call std::fill e.g. std::fill( large_vector.begin(), large_vector.end(), Foo() );, this should be similar to assign or memset.
What is the fastest way to reset all values for a large vector to its default values?
Depends on what vector in its "default values" means.
If you want to remove all elements, most efficient is std::vector::clear.
If you want to keep all elements in the vector but set their state, then you can use std::fill:
std::fill(large_vector.begin(), large_vector.end(), default_value);
If the element type is trivial, and the "default value" is zero†, then std::memset may be optimal:
static_assert(std::is_trivially_copyable_v<decltype(large_vector[0])>);
std::memset(large_vector.data(), 0, large_vector.size() * sizeof(large_vector[0]));
To verify that std::memset is worth the trouble, you should measure (or inspect assembly). The optimiser may do the work for you.
† Zero in the sense that all bits are unset. C++ does not guarantee that this is a representation for a zero float. It also doesn't guarantee it to be a null pointer, in case your non-minimal use case uses pointers.

How does std::vector::end() iterator work in memory?

Today, I was attempting to extract a subset of N elements from a vector of size M, where N < M. I realised that I did not need to create a new copy, only needed to modify the original, and moreover, could take simply the first N elements.
After doing a few brief searches, there were many answers, the most attractive one being resize() which appears to truncate the vector down to length, and deal neatly with the memory issues of erasing the other elements.
However, before I came across vector.resize(), I was trying to point the vector.end() to the N+1'th position. I knew this wouldn't work, but I wanted to try it regardless. This would leave the other elements past the N'th position "stranded", and I believe (correct me if i'm wrong) this would be an example of a memory leak.
On looking at the iterator validity on http://www.cplusplus.com/reference/vector/vector/resize/,
we see that if it shrinks, vector.end() stays the same. If it expands, vector.end() will move (albeit irrelevant to our case).
This leads me to question, what is the underlying mechanic of vector.end()? Where does it lie in memory? It can be found incrementing an iterator pointing to the last element in the vector, eg auto iter = &vector.back(), iter++, but in memory, is this what happens?
I can believe that at all times, what follows vector.begin() should be the first element, but on resize, it appears that vector.end() can lie elsewhere other than past the last element in the vector.
For some reason, I can't seem to find the answer, but it sounds like a very basic computer science course would contain this information. I suppose it is stl specific, as there are probably many implementations of a vector / list that all differ...
Sorry for the long post about a simple question!
you asked about "the underlying mechanic of vector.end()". Well here is (a snippet of) an oversimplified vector that is easy to digest:
template <class T>
class Simplified_vector
{
public:
using interator = T*;
using const_interator = const T*;
private:
T* buffer_;
std::size_t size_;
std::size_t capacity_;
public:
auto push_back(const T& val) -> void
{
if (size_ + 1 > capacity_)
{
// buffer increase logic
//
// this usually means allocation a new larger buffer
// followed by coping/moving elements from the old to the new buffer
// deleting the old buffer
// and make `buffer_` point to the new buffer
// (along with modifying `capacity_` to reflect the new buffer size)
//
// strong exception guarantee makes things a bit more complicated,
// but this is the gist of it
}
buffer_[size_] = val;
++size_;
}
auto begin() const -> const_iterator
{
return buffer_;
}
auto begin() -> iterator
{
return buffer_;
}
auto end() const -> const_iterator
{
return buffer_ + size_;
}
auto end() -> iterator
{
return buffer_ + size_;
}
};
Also see this question Can std::vector<T>::iterator simply be T*? for why T* is a perfectly valid iterator for std::vector<T>
Now with this implementation in mind let's answer a few of your misconceptions questions:
I was trying to point the vector.end() to the N+1'th position.
This is not possible. The end iterator is not something that is stored directly in the class. As you can see it's a computation of the begging of the buffer plus the size (number of elements) of the container. Moreover you cannot directly manipulate it. The internal workings of the class make sure end() will return an iterator pointing to 1 past the last element in the buffer. You cannot change this. What you can do is insert/remove elements from the container and the end() will reflect these new changes, but you cannot manipulate it directly.
and I believe (correct me if i'm wrong) this would be an example of a
memory leak.
you are wrong. Even if you somehow make end point to something else that what is supposed to point, that wouldn't be a memory leak. A memory leak would be if you would lost any reference to the dynamically allocated internal buffer.
The "end" of any contiguous container (like a vector or an array) is always one element beyond the last element of the container.
So for an array (or vector) of X elements the "end" is index X (remember that since indexes are zero-based the last index is X - 1).
This is very well illustrated in e.g. this vector::end reference.
If you shrink your vector, the last index will of course also change, meaning that the "end" will change as well. If the end-iterator does not change, then it means you have saved it from before you shrank the vector, which will change the size and invalidate all iterators beyond the last element in the vector, including the end iterator.
If you change the size of a vector, by adding new elements or by removing elements, then you must re-fetch the end iterator. The existing iterator objects you have will not automatically be updated.
Usually the end isn't stored in an implementation of vector. A vector stores:
A pointer to the first element. If you call begin(), this is what you get back.
The size of the memory block that's been managed. If you call capacity() you get back the number of elements that can fit in this allocated memory.
The number of elements that are in use. These are elements that have been constructed and are in the first part of the memory block. The rest of the memory is unused, but is available for new elements. If the entire capacity gets filled, to add more elements the vector will allocate a larger block of memory and copy all the elements into that, and deallocate the original block.
When you call end() this returns begin() + size(). So yes, end() is a pointer that points to one beyond the last element.
So the end() isn't a thing that you can move. You can only change it by adding or removing elements.
If you want to extract a number of elements 'N' you can do so by reading those from begin() to begin() + 'N'.
for( var it = vec.begin(); it != begin() + n; ++it )
{
// do something with the element (*it) here.
}
Many stl algorithms take a pair of iterators for the begin and end of a range of elements you want to work with. In your case, you can use vec.begin() and vec.begin() + n as the begin and end of the range you're interested in.
If you want to throw away the elements after n, you can do vec.resize(n). Then the vector will destruct elements you don't need. It might not change the size of the memory block the vector manages, the vector might keep the memory around in case you add more elements again. That's an implementation detail of the vector class you're using.

Safety deleting elements of the container

Could you suggest safety deleting of element of std::vector in to cases:
1. Clear all elements of the vector;
2. Erase one element depending on condition.
What are the dangers of this code:
typename std::vector<T*>::iterator it;
for (it=std::vector<T*>::begin();it!=std::vector<T*>::end();it++)
{
if (*it) delete *it;
}
Thank's.
You don't remove the element from the vector. So the vector element is pointing to the location as before, i.e. the same T. However, since you have deleted that T you can not dereference the pointer anymore - that would be UB and may crash your program.
delete is calling the destructor of T (great, it's something you shall do) but deleteis not changing the vector. Consequently the iterator is valid all the time.
Either you should remove the element for which you have called delete or at least set the vector element to nullptr.
typename std::vector<T*>::iterator it;
for (it=std::vector<T*>::begin();it!=std::vector<T*>::end();it++)
{
delete *it;
*it = nullptr; // Only needed when you don't erase the vector element
}
That solution requires that you always check for nullptr before using any element of the vector.
In most cases the best solution is however to remove the element from the vector.
In the case where you destroy all elements by calling delete on every element, simply call clear on the vector after the loop.
For example, for you first two questions which appeared to regard clearing a vector
std::vector<int> v;//maybe have elements or not...
Clear the vector by calling clear
v.clear();
If you wish to remove elements while you walk over the vector which satisfy a condition (i.e. a predicate) using v.erase(it) you need to be careful. Fortunately, erase returns the iterator after the removed position, so something like
for (auto it=v.begin();it!=v.end();)
{
if (predicate(it))
it = v.erase(it);
else
++it;
}
Obviously you can use std::remove_if instead (with std::erase) if you want to shrink your vector slightly, avoiding a "awfully performant hand-written remove_if" as a commenter described this.
If you want to delete things as you loop round, you can either hand-do a loop yourself or use an algorithm - but smart pointers make far more sense. I suspect you are wanting to delete the items possibly in addition to shrinking the vector since you said " I found that element of this vector can be used after its deleting". If you do want to store raw pointers in a container, just delete the items, you don't the the if in your suggested code. You need to consider when you plan on doing this.
Note: changing the contents of an iterator doesn't invalidate an iterator.
You should consider using smart pointers, for exception safety etc.

Freeing up memory by deleting a vector in a vector in C++

I have a vector of vectors and I wish to delete myvec[i] from memory entirely, free up the room, and so on. Will .erase or .clear do the job for me? If not, what should I do?
Completely Removing The Vector
If you want to completely remove the vector at index i in your myvec, so that myvec[i] will no longer exist and myvec.size() will be one less that it was before, you should do this:
myvec.erase (myvec.begin() + i); // Note that this only works on vectors
This will completely deallocate all memories owned by myvec[i] and will move all the elements after it (myvec[i + 1], myvec[i + 2], etc.) one place back so that myvec will have one less vector in it.
Emptying But Keeping The Vector
However, if you don't want to remove the ith vector from myvec, and you just want to completely empty it while keeping the empty vector in place, there are several methods you can use.
Basic Method
One technique that is commonly used is to swap the vector you want to empty out with a new and completely empty vector, like this:
// suppose the type of your vectors is vector<int>
vector<int>().swap (myvec[i]);
This is guaranteed to free up all the memory in myvec[i], it's fast and it doesn't allocate any new heap memory or anything.
This is used because the method clear does not offer such a guarantee. If you clear the vector, it always does set its size to zero and destruct all the elements, but it might not (depending on the implementation) actually free the memory.
In C++11, you can do what you want with two function calls: (thanks for the helpful comment)
myvec[i].clear();
myvec[i].shrink_to_fit();
Generalization
You can write a small function that would work for most (probably all) STL containers and more:
template <typename T>
void Eviscerate (T & x)
{
T().swap (x);
}
which you use like this:
Eviscerate (myvec[i]);
This is obviously cleaner and more readable, not to mention more general.
In C++11, you can also use decltype to write a generic solution (independent of the type of your container and elements,) but it's very ugly and I only put it here for completeness:
// You should include <utility> for std::remove_reference
typename std::remove_reference<decltype(myvec[i])>::type().swap(myvec[i]);
My recommended method is the Eviscerate function above.
myvec.erase( myvec.begin() + i ) will remove myvec[i]
completely, calling its destructor, and freeing all of its
dynamically allocated memory. It will not reduce the memory
used directly by myvec: myvec.size() will be reduced by one,
but myvec.capacity() will be unchanged. To remove this last
residue, C++11 has myvec.shrink_to_fit(), which might remove
it; otherwise, you'll have to make a complete copy of myvec,
then swap it in:
void
shrink_to_fit( MyVecType& target )
{
MyVecType tmp( target.begin(), target.end() );
target.swap( tmp );
}
(This is basically what shring_to_fit will do under the hood.)
This is a very expensive operation, for very little real gain,
at least with regards to the removal of single elements; if you
are erasing a large number of elements, it might be worth
considering it after all of the erasures.
Finally, if you want to erase all of the elements,
myvec.clear() is exactly the same as myvec.erase() on each
element, with the same considerations described above. In this
case, creating an empty vector and swapping is a better
solution.

C++: Assigning values to non-continuous indexes in vectors?

If I want to declare a vector of unknown size, then assign values to index 5, index 10, index 1, index 100, in that order. Is it easily doable in a vector?
It seems there's no easy way. Cause if I initialize a vector without a size, then I can't access index 5 without first allocating memory for it by doing resize() or five push_back()'s. But resize clears previously stored values in a vector. I can construct the vector by giving it a size to begin with, but I don't know how big the vector should.
So how can I not have to declare a fixed size, and still access non-continuous indices in a vector?
(I doubt an array would be easier for this task).
Would an std::map between integer keys and values not be an easier solution here? Vectors will require a contiguous allocation of memory, so if you're only using the occasional index, you'll "waste" a lot of memory.
Resize doesn't clear the vector. You can easily do something like:
if (v.size() <= n)
v.resize(n+1);
v[n] = 42;
This will preserve all values in the vector and add just enough default initialized values so that index n becomes accessible.
That said, if you don't need all indexes or contigous memory, you might consider a different data structure.
resize() doesn't clear previously stored values in a vector.
see this documentation
I would also argue that if this is what you need to do then its possible that vector may not be the container for you. Did you consider using map maybe?
Data structures which do not contain a contiguous set of values are known as sparse or compressed data structures. It seems that this is what you are looking for.
If this is case, you want a sparse vector. There is one implemented in boost, see link text
Sparse structures are typically used to conserve memory. It is possible from your problem description that you don't actually care about memory use, but about addressing elements that don't yet exist (you want an auto-resizing container). In this case a simple solution with no external dependencies is as follows:
Create a template class that holds a vector and forwards all vector methods to it. Change your operator[] to resize the vector if the index is out of bounds.
// A vector that resizes on dereference if the index is out of bounds.
template<typename T>
struct resize_vector
{
typedef typename std::vector<T>::size_type size_type;
// ... Repeat for iterator/value_type typedefs etc
size_type size() const { return m_impl.size() }
// ... Repeat for all other vector methods you want
value_type& operator[](size_type i)
{
if (i >= size())
resize(i + 1); // Resize
return m_impl[i];
}
// You may want a const overload of operator[] that throws
// instead of resizing (or make m_impl mutable, but thats ugly).
private:
std::vector<T> m_impl;
};
As noted in other answers, elements aren't cleared when a vector is resized. Instead, when new elements are added by a resize, their default constructor is called. You therefore need to know when using this class that operator[] may return you a default constructed object reference. Your default constructor for <T> should therefore set the object to a sensible value for this purpose. You may use a sentinel value for example, if you need to know whether the element has previously been assigned a value.
The suggestion to use a std::map<size_t, T> also has merit as a solution, provided you don't mind the extra memory use, non-contiguous element storage and O(logN) lookup rather than O(1) for the vector. This all boils down to whether you want a sparse representation or automatic resizing; hopefully this answer covers both.