Appending to a container - c++

I have a doubt regarding the generic algorithm copy in C++.
To copy from the source container bottom into the destination container ret,
copy(bottom.begin(), bottom.end(), back_inserter(ret));
works but
copy(bottom.begin(), bottom.end(), ret.end());
does not. Do these two statements have different implications?

Check what the statements do – there is no magic involved. In particular, copy is (essentially) just a loop. Simplified:
template <typename In, typename Out>
void copy(In begin, In end, Out target) {
    // The output iterator may be a different type from the input iterators.
    while (begin != end)
        *target++ = *begin++;
}
And back_inserter really does what the name says.
So in effect, without the back_inserter you do not expand the target container, you just write past its end: iterators don’t change their underlying container. The back_inserter function, on the other hand, creates a specialised iterator which does hold a reference to its original container and calls push_back when you dereference and assign to it.
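As a minimal sketch of the difference (the helper names append_all and append_all_insert are made up for this illustration), the following appends one vector to another, once through std::back_inserter and once through the container's own range insert:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: appends every element of `bottom` to a copy of `ret`.
std::vector<int> append_all(std::vector<int> ret, const std::vector<int>& bottom) {
    // back_inserter(ret) calls ret.push_back(x) for every value assigned
    // through it, so ret grows as needed.
    std::copy(bottom.begin(), bottom.end(), std::back_inserter(ret));
    return ret;
}

// Same effect without an inserter: the container performs the insertion itself.
std::vector<int> append_all_insert(std::vector<int> ret, const std::vector<int>& bottom) {
    ret.insert(ret.end(), bottom.begin(), bottom.end());
    return ret;
}
```

Passing ret.end() directly to std::copy would compile as well, but it would write through a past-the-end iterator, which is undefined behavior.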

In the first one you are giving copy a way to insert: an insert iterator that knows which container to append to.
In the second one you are only giving it an iterator to the end of the container.

Both return iterators, but...
ret.end() returns an iterator pointing to the end of the
container. It can be decremented, but not incremented (since it
already points to the end of the sequence), and it cannot be
dereferenced unless it is decremented (again, because it points
to one past the end of the sequence).
back_inserter(ret) is a function which returns
a back_insertion_iterator, which is a very special type of
"iterator" (category OutputIterator): its increment
functions are no-ops, dereferencing it returns *this, and
assigning a value type to it calls push_back on the owning
container. (In other words, it's not an iterator at all, except
for the C++ standard; but it presents the interface of one to do
something very different.)
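To make that description concrete, here is a simplified sketch of how such an iterator could be written. The class my_back_insert_iterator and the function my_copy are hypothetical stand-ins for illustration, not the library's actual implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical, simplified stand-in for std::back_insert_iterator.
template <typename Container>
class my_back_insert_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit my_back_insert_iterator(Container& c) : container_(&c) {}

    // Assigning a value appends it to the owning container.
    my_back_insert_iterator& operator=(const typename Container::value_type& v) {
        container_->push_back(v);
        return *this;
    }

    // Dereference and increment do no real work; they exist only so that
    // the expression `*it++ = x` compiles.
    my_back_insert_iterator& operator*() { return *this; }
    my_back_insert_iterator& operator++() { return *this; }
    my_back_insert_iterator operator++(int) { return *this; }

private:
    Container* container_;
};

// The same loop as a simplified std::copy, written out by hand.
template <typename In, typename Out>
Out my_copy(In first, In last, Out target) {
    while (first != last) *target++ = *first++;
    return target;
}

// Appends {1, 2, 3} to a vector that already holds {0}.
std::vector<int> demo_back_insert() {
    std::vector<int> src{1, 2, 3};
    std::vector<int> dst{0};
    my_copy(src.begin(), src.end(), my_back_insert_iterator<std::vector<int>>(dst));
    return dst;
}
```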

Related

Why don't we write an asterisk (*) when declaring an iterator in the STL?

When we use an iterator, we declare the type iterator and then itr as an object, without any * as we would write when declaring a pointer variable. Yet when we print the values of a vector through the iterator, itr is written as *itr, even though we never declared a pointer.
Is the pointer hidden, or does it work in the background?
Example like:
iterator itr;
*itr
How does this work? Does * mean something else for an iterator, or does *itr act like a normal pointer dereference?
If it works like a pointer variable, why do we not write * when declaring itr?
An iterator is an object that lets you travel (or iterate) over each object in a collection or stream. It is a sort of generalization of pointers. That is, pointers are one example of an iterator.
Iterators implement concepts required by various algorithms such as forward iteration (meaning it can be incremented to move forward in the collection), bi-directional iteration (meaning it can go forward and backward), and random access (meaning you can index an arbitrary item in the collection).
For instance, moving backward can't typically happen in a stream, so stream's iterators are typically forward iterators only because once you access a value, you can't go back in the stream. A linked list's iterators are bi-directional because you can move forward or backward, but you cannot access them by indexing because the nodes are not typically in contiguous memory, so you can't calculate with an index where an arbitrary element is. A vector's iterators are random access and very much like pointers. (C++20 made these categories more precise, so the old categories are now called "Legacy".)
There are also special-purpose iterators, such as the one returned by std::back_inserter, which appends an item to the end of a container when a value is assigned to its referent.
So, you can see that iterators allow you to be more precise in defining what your consumer of iterators expects. If your algorithm requires bi-directional iteration, you can communicate that and limit it so it won't work with forward-only iterators.
As for the * operator, it is similar to the * operator for a pointer. In both cases, it means, "give me the value referred to by this handle". It is implemented via operator overloading. You do not need the * when declaring an iterator because it is not a pointer, which is a lower-level construct in the language. Rather, it is an object with pointer-like semantics.
To answer your questions below:
No, the * is not automatically created. When you declare an iterator you are declaring an object. When the class for that object is defined, it may or may not have an operator overload for the * operator (or the == or the + or any other operators).
When you go to use the object, such as passing it to a function, the types will need to match up. If a function you were passing it to requires an iterator (e.g. std::sort()), then no dereferencing * is needed. If the function was expecting a value of the type the iterator refers to, then you would need to dereference it first. In that case the compiler calls the overloaded operator* and returns the value.
That is the nature of overloaded operators -- they look like ordinary operators but ultimately are resolved to a function call defined by the creator of the class. It works the same as if you defined a matrix class that has plus and minus operators overloaded.
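As a small sketch of that idea, the class below overloads *, ++ and != without containing any pointer at all. The names counting_iterator, sum and sum_of_array are invented for this example:

```cpp
#include <cassert>

// Hypothetical iterator-like class that "points" at successive integers.
// It holds no pointer; operator* simply returns the stored value.
class counting_iterator {
public:
    explicit counting_iterator(int v) : value_(v) {}
    int operator*() const { return value_; }                      // *it
    counting_iterator& operator++() { ++value_; return *this; }   // ++it
    bool operator!=(const counting_iterator& o) const { return value_ != o.value_; }
private:
    int value_;
};

// Works with anything that supports *, ++ and !=, including raw pointers.
template <typename It>
int sum(It first, It last) {
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// The same template applied to plain pointers into an array.
int sum_of_array() {
    int a[] = {1, 2, 3};
    return sum(a, a + 3);
}
```

Calling sum(counting_iterator(1), counting_iterator(5)) resolves each *first to counting_iterator::operator*, while sum_of_array uses the built-in pointer dereference; the calling code looks the same in both cases.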
How does this work? Does * mean something else for an iterator, or does *itr act like a normal pointer dereference?
It depends on what type stands behind iterator. It can be an alias for a pointer:
using iterator = int *;
iterator itr;
*itr; // it is pointer dereferencing in this case.
Or it can be a user defined type:
struct iterator {
int &operator*();
};
iterator itr;
*itr; // it means itr.operator*() here
So without knowing what type iterator is, it is impossible to say what * actually does here. In practice you should not need to care, because library developers implement it so that the difference does not matter to you.

Reseat the container an iterator "points" to

Suppose I have a std::list myList and an iterator myIt that I am using to point to some element of the list.
Now I make a shallow copy copiedList of myList (so I can reorder it). Similarly I can make a copy copiedIt of myIt, however it still references myList instead of copiedList, meaning I cannot sensibly compare it against copiedList.end(), as I may have modified that list.
Is there a (standard) way to reseat copiedIt to reference copiedList instead? This should be semantically valid, as long as I have not made any changes to the copy.
My current solution is to use the original iterator to std::find the pointed-to element in the copy of the list, but while that works and causes no problems, it seems inelegant.
You can use std::next and std::distance, like this:
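template <class Container>
typename Container::iterator reseat(typename Container::iterator it, const Container &source, Container &target)
{
    // source is const, so source.begin() is a const_iterator; naming the
    // iterator type explicitly lets `it` convert to const_iterator as well.
    return std::next(target.begin(), std::distance<typename Container::const_iterator>(source.begin(), it));
}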
In prose: find the distance of it from the beginning of its container, and take an iterator to the same-distant element in the new container.
This could easily be generalised to allow source and target to be of different types. With slightly more work, generalisation to constant iterators is also possible.
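One possible shape of that generalisation is sketched below. The name reseat_any and the demo function are assumptions made for this example; the iterator is taken as a const_iterator, to which a plain iterator converts implicitly. For a std::list both std::distance and std::next are linear in the distance travelled.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical generalisation: source and target may be different container
// types, and both iterator and const_iterator arguments are accepted.
template <class Source, class Target>
typename Target::iterator reseat_any(typename Source::const_iterator it,
                                     const Source& source, Target& target) {
    return std::next(target.begin(), std::distance(source.begin(), it));
}

int demo_reseat() {
    std::list<int> original{10, 20, 30};
    auto it = std::next(original.begin());          // points at 20 in original
    std::list<int> copied = original;
    auto copied_it = reseat_any(it, original, copied);
    copied.reverse();                               // list iterators survive reordering
    *copied_it += 1;                                // modifies the copy only
    return *copied_it;
}
```

The reseating has to happen before the copy is reordered, since it relies on both containers still having their elements in the same order.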
How do you copy the list? If you iterate over the first list and insert the items individually, you get the iterator at each step: http://www.cplusplus.com/reference/list/list/insert/
This is because list::insert returns an iterator that points to the first of the newly inserted elements.
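A rough sketch of that approach follows; the function name copy_and_track is hypothetical. It copies the list one element at a time and remembers the iterator that insert returns for the element of interest:

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Copies `source` into `target` element by element and returns the iterator
// in `target` that corresponds to `wanted` in `source`. If `wanted` is
// source.end(), target.end() is returned.
template <typename T>
typename std::list<T>::iterator copy_and_track(
        const std::list<T>& source,
        typename std::list<T>::const_iterator wanted,
        std::list<T>& target) {
    target.clear();
    auto tracked = target.end();   // a list's end iterator stays valid across inserts
    for (auto it = source.begin(); it != source.end(); ++it) {
        auto inserted = target.insert(target.end(), *it);
        if (it == wanted) tracked = inserted;
    }
    return tracked;
}

int demo_copy_and_track() {
    std::list<int> original{10, 20, 30};
    std::list<int> copied;
    auto it = copy_and_track(original, std::next(original.cbegin()), copied);
    return *it;   // the element of `copied` matching the 20 in `original`
}
```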

Decrementing an off the end iterator

I was reading today about how for containers that support bidirectional iteration, this piece of code is valid:
Collection c(10, 10);
auto last = --c.end();
*last;
That got me thinking, is it required that when submitting a pair of bidirectional iterators [beg, end) to an algorithm in the STL that --end is defined? If so, should the result be dereferenceable?
ie
void algo(T beg, T end){
//...
auto iter = --end;
//...
*iter;
}
If the algorithm requires a range defined by bidirectional iterators first and last, then --last needs to be valid under the same conditions that ++first does -- namely that the range isn't empty. The range is empty if and only if first == last.
If the range isn't empty, then --last evaluates to an iterator that refers to the last element in the range, so *--last indeed also needs to be valid.
That said, there aren't all that many standard algorithms that require specifically a bidirectional iterator (and don't require random-access). prev, copy_backward, move_backward, reverse, reverse_copy, stable_partition, inplace_merge, [prev|next]_permutation.
If you look at what some of those do, you should see that typically the algorithm does decrement the end-of-range iterator and dereference the result.
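For illustration, here is a hand-written sketch in the spirit of std::reverse (the names my_reverse and demo_my_reverse are made up, and this is not the library's own code). It decrements last only after checking that the range is non-empty:

```cpp
#include <cassert>
#include <list>
#include <utility>

// Reverses [first, last) using only bidirectional-iterator operations.
template <typename BidirIt>
void my_reverse(BidirIt first, BidirIt last) {
    // `first != last` guarantees a non-empty range, so --last is valid and
    // the resulting iterator is dereferenceable.
    while (first != last && first != --last) {
        std::swap(*first, *last);
        ++first;
    }
}

std::list<int> demo_my_reverse(std::list<int> values) {
    my_reverse(values.begin(), values.end());
    return values;
}
```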
As James says, for containers the function end() returns an iterator by value. There is no general requirement that for iterators that --x should be a well-formed expression when x is an rvalue of the type. For example, pointers are bidirectional iterators, and a function declared as int *foo(); returns a pointer by value, and --foo() is not a well-formed expression. It just so happens that for the containers you've looked at in your implementation, end() returns a class type which has operator-- defined as a member function, and so the code compiles. It also works since the container isn't empty.
Be aware that there is a difference in this respect between:
auto last = --c.end();
vs.
auto last = c.end();
--last;
The former decrements an rvalue, whereas the latter decrements an lvalue.
You read wrong. The expression --c.end() is never authorized. If the
iterator isn't at least bidirectional, it is, in fact, expressly
forbidden, and requires a compiler error. If the collection is empty,
it is undefined behavior. And in all other cases, it will work if
it compiles, but there is no guarantee that it will compile. It failed
to compile with many early implementations of std::vector, for
example, where the iterator was just a typedef to a pointer. (In fact,
I think formally that it is undefined behavior in all cases, since
you're violating a constraint on a templated implementation. In
practice, however, you'll get what I just described.)
Arguably, because it isn't guaranteed, a good implementation will cause
it to fail to compile, systematically. For various reasons, most don't.
Don't ask me why, because it's incredibly simple to get it to fail
systematically: just make the operator-- on the iterator a free
function, rather than a member.
EDIT (additional information):
The fact that it isn't required is probably a large part of the
motivation behind std::next and std::prev in C++11. Of course,
every project I've worked on has had them anyway. The correct way to
write this is:
prev( c.end() );
And of course, the constraints that the iterator be bidirectional or
better, and that the container not be empty, still hold.
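A short sketch of the difference, using a function that returns a raw pointer by value (g_data, data_end, last_via_prev and last_element are names invented for this example):

```cpp
#include <cassert>
#include <iterator>
#include <vector>

int g_data[] = {1, 2, 3};

// Returns a pointer (which is a bidirectional iterator) by value.
int* data_end() { return g_data + 3; }

int last_via_prev() {
    // --data_end() would not compile: the returned pointer is an rvalue.
    // std::prev takes its argument by value, so it works for any
    // bidirectional iterator.
    return *std::prev(data_end());
}

// Hypothetical helper; the caller must still guarantee a non-empty container.
int last_element(const std::vector<int>& c) {
    assert(!c.empty());
    return *std::prev(c.end());
}
```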
Each algorithm will tell you what type of iterator it requires. When a bidirectional iterator is called for, then naturally it will need to support decrementing.
Whether --end is possible depends on whether end == beg.
It's only required for algorithms that require bidirectional iterators.

vector::erase and reverse_iterator

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).
Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:
grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).
std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
iterate(vect_rit = it->rbegin() -> it->rend())
{
// ...
what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
if (what_to_delete.size() > threshold)
what_to_delete.erase(what_to_delete.begin());
// ...
}
}
Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.
When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:
The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i- 1)
Which means that to delete a vect_rit, I use:
vector.erase(--vect_rit.base());
Now, according to C++11 standard §23.3.6.5/3:
iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.
How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?
Edit:
Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-dereferenceable iterator after the first element was deleted. Whereas when storing classic iterators (converting to const_iterator) while inserting to what_to_delete worked correctly.
Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?
Thanks!
In the question you have already quoted exactly what the standard says a reverse_iterator is:
The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)
Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator (reverse_iterator::current). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current. All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.
If you erase --vect_rit.base(), you are in effect erasing --current, so current will be invalidated.
As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator's base iterator" in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406).
You can use the even uglier expression, (++vect_rit).base(), to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1
Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.
However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit:
vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));
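As a sketch of this re-synchronization inside a loop (the function name erase_small_from_back and the filtering rule are assumptions for the example), the following walks a vector from back to front and erases as it goes:

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Removes every element smaller than `limit`, walking the vector back to front.
std::vector<int> erase_small_from_back(std::vector<int> v, int limit) {
    using rev = std::vector<int>::reverse_iterator;
    for (rev rit = v.rbegin(); rit != v.rend(); ) {
        if (*rit < limit) {
            // std::next(rit).base() is the forward iterator to *rit.
            // erase() returns the iterator following the removed element;
            // wrapping that in a reverse_iterator refers to the element
            // just before the erased position, i.e. the next one to visit.
            rit = rev(v.erase(std::next(rit).base()));
        } else {
            ++rit;
        }
    }
    return v;
}
```

Here std::next(rit).base() plays the same role as (++vect_rit).base() above, without modifying rit before the erase.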
From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:
namespace std {
template <class Iterator>
class reverse_iterator ...
{
...
Iterator base() const; // explicit
...
protected:
Iterator current;
...
};
}
And from §24.5.1.3.3:
Iterator base() const; // explicit
Returns: current.
Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.
Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you vector.erase(--vect_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceding element, and so your next vector.erase(...) is undefined behavior.
Just in case that's clear as mud, let me say that differently:
std::vector<T> v=...;
...
// it_1 and it_2 are contiguous (assumes v has at least two elements)
std::vector<T>::reverse_iterator it_1=v.rbegin();
std::vector<T>::reverse_iterator it_2=it_1;
++it_2;
// Erase it_1's pointee:
// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();
// but that points one too far in, so decrement;
--tmp_it;
// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());
// perform erasure
v.erase(tmp_it); // invalidates all iterators pointing at or past *tmp_it
// (like, say it_2.base()...)
// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!
// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);
In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.
The reverse_iterator, just like the normal iterator, points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed w.r.t. the regular iterator (and also ++ and --, > and < etc).
This is interesting to know, but doesn't really imply an answer to the main question.
If you read the language carefully, it says:
Invalidates iterators and references at or after the point of the erase.
References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.

Checking if an iterator is valid

Is there any way to check if an iterator (whether it is from a vector, a list, a deque...) is (still) dereferenceable, i.e. has not been invalidated?
I have been using try-catch, but is there a more direct way to do this?
Example: (which doesn't work)
list<int> l;
for (int i = 1; i < 10; i++) {
l.push_back(i * 10);
}
list<int>::iterator itd = l.begin();
itd++;
if (something) {
l.erase(itd);
}
/* now, in other place.. check if it points to somewhere meaningful */
if (itd != l.end())
{
// blablabla
}
I assume you mean "is an iterator valid," that it hasn't been invalidated due to changes to the container (e.g., inserting/erasing to/from a vector). In that case, no, you cannot determine if an iterator is (safely) dereferenceable.
As jdehaan said, if the iterator wasn't invalidated and points into a container, you can check by comparing it to container.end().
Note, however, that if the iterator is singular -- because it wasn't initialized or it became invalid after a mutating operation on the container (vector's iterators are invalidated when you increase the vector's capacity, for example) -- the only operation that you are allowed to perform on it is assignment. In other words, you can't check whether an iterator is singular or not.
std::vector<int>::iterator iter = vec.begin();
vec.resize(vec.capacity() + 1);
// iter is now singular, you may only perform assignment on it,
// there is no way in general to determine whether it is singular or not
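Since the state cannot be queried, one common workaround for vectors (not part of the answer above, and only a sketch) is to remember a position as an index and rebuild the iterator after any operation that may reallocate. The function name read_after_growth is made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int read_after_growth() {
    std::vector<int> vec{10, 20, 30};
    std::vector<int>::iterator iter = vec.begin() + 1;   // refers to 20
    // Remember the position as an index before anything can reallocate.
    std::size_t index = static_cast<std::size_t>(iter - vec.begin());
    vec.resize(vec.capacity() + 1);                      // forces a reallocation
    // The old iterator must not be used any more; rebuild it from the index.
    iter = vec.begin() + static_cast<std::ptrdiff_t>(index);
    return *iter;
}
```

An index stays meaningful across reallocation as long as no elements before it are inserted or erased.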
Non-portable answer: Yes - in Visual Studio
Visual Studio's STL iterators have a "debugging" mode which does exactly this. You wouldn't want to enable this in ship builds (there is overhead), but it is useful in checked builds.
Read about it on VC10 here (this system can and in fact does change every release, so find the docs specific to your version).
Edit: Also, I should add: debug iterators in Visual Studio are designed to immediately explode when you use them (instead of undefined behavior), not to allow "querying" of their state.
Usually you test it by checking if it is different from the end(), like
if (it != container.end())
{
// then dereference
}
Moreover, using exception handling to replace logic is bad in terms of both design and performance. Your question is a good one, and the try-catch is definitely worth replacing in your code. Exception handling, as the name says, should only be used for rare, unexpected issues.
Is there any way to check if a iterator (whether it is from a vector, a list, a deque...) is (still) dereferencable, i.e has not been invalidated ?
No, there isn't. Instead you need to control access to the container while your iterator exists, for example:
Your thread should not modify the container (invalidating the iterator) while it is still using an instantiated iterator for that container
If there's a risk that other threads might modify the container while your thread is iterating, then in order to make this scenario thread-safe your thread must acquire some kind of lock on the container (so that it prevents other threads from modifying the container while it's using an iterator)
Work-arounds like catching an exception won't work.
This is a specific instance of the more general problem, "can I test/detect whether a pointer is valid?", the answer to which is typically "no, you can't test for it: instead you have to manage all memory allocations and deletions in order to know whether any given pointer is still valid".
Trying and catching is not safe: an "out of bounds" iterator will not, or at least will seldom, throw.
As alemjerus says, an iterator can always be dereferenced, no matter what ugliness lies beneath. It is quite possible to iterate into other areas of memory and write to areas that hold other objects. I have looked at code and watched variables change for no apparent reason. That is a bug that is really hard to detect.
Also it is wise to remember that inserting and removing elements might potentially invalidate all references, pointers and iterators.
My best advice would be to keep your iterators under control, and always keep an "end" iterator at hand to be able to test if you are at the "end of the line" so to speak.
In some of the STL containers, the current iterator becomes invalid when you erase the element it points to. This happens because the erase operation changes the internal memory structure of the container, so incrementing the existing iterator leads to an undefined location.
When you do the following, itd is incremented before erase runs, and erase receives a copy of its old value.
if (something) l.erase(itd++);
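Inside a loop, the same idea might look like the sketch below (function names and the filtering rule are invented for illustration). The first version uses the iterator that erase returns, the second uses the post-increment form shown above, which is safe for std::list because erasing a node leaves iterators to other nodes valid:

```cpp
#include <cassert>
#include <list>

// Uses the return value of erase, which refers to the element after the
// removed one.
std::list<int> remove_multiples_of(std::list<int> l, int n) {
    for (auto it = l.begin(); it != l.end(); ) {
        if (*it % n == 0) {
            it = l.erase(it);
        } else {
            ++it;
        }
    }
    return l;
}

// Post-increment form: `it` already refers to the next node when erase runs.
std::list<int> remove_multiples_postinc(std::list<int> l, int n) {
    for (auto it = l.begin(); it != l.end(); ) {
        if (*it % n == 0) {
            l.erase(it++);
        } else {
            ++it;
        }
    }
    return l;
}
```

For a std::vector only the first form is appropriate, because erase invalidates every iterator at or after the erased position.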
Is there any way to check if an iterator is dereferencable
Yes, with gcc debugging containers available as GNU extensions. For std::list you can use __gnu_debug::list instead. The following code will abort as soon as an attempt is made to use an invalid iterator. Because debugging containers impose extra overhead, they are intended for use only when debugging.
#include <debug/list>
int main() {
__gnu_debug::list<int> l;
for (int i = 1; i < 10; i++) {
l.push_back(i * 10);
}
auto itd = l.begin();
itd++;
l.erase(itd);
/* now, in other place.. check if itd points to somewhere meaningful */
if (itd != l.end()) {
// blablabla
}
}
$ ./a.out
/usr/include/c++/7/debug/safe_iterator.h:552:
Error: attempt to compare a singular iterator to a past-the-end iterator.
Objects involved in the operation:
iterator "lhs" # 0x0x7ffda4c57fc0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = singular;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
iterator "rhs" # 0x0x7ffda4c580c0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = past-the-end;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
Aborted (core dumped)
The parameters of the erase function of any std container (as you have listed in your question, i.e. whether it is a vector, a list, a deque...) are always iterators of that same container.
The single-iterator overload of erase removes exactly the element that the iterator points at. The two-iterator overload removes the range [first, last): the element that first points at and everything up to, but not including, the element that last points at. The point is that the iterator passed to the single-iterator erase always becomes invalid! Also:
Each iterator that was pointing at some element that has been erased from the container becomes invalid, but it doesn't pass the end of the container!
This means that an iterator that was pointing at some element that has been erased from the container cannot be compared to container.end().
This iterator is invalid, and so it is not dereferenceable, i.e. you cannot use the * or -> operators; it is also not incrementable, i.e. you cannot use the ++ operator; and it is not decrementable, i.e. you cannot use the -- operator.
It is also not comparable!!! I.e. you cannot even use the == or != operators.
Actually you cannot use any operator that is declared and defined in the std iterator.
You cannot do anything with this iterator, like null pointer.
In a debug build with checked iterators (which is what I am describing here), doing anything with an invalid iterator stops the program immediately: an assertion dialog window appears, and whichever option you choose, the program cannot continue. You can only terminate the process by clicking the Abort button. Without such checks the behavior is simply undefined.
The only things you can do with an invalid iterator are to assign it a new value (for example the beginning of the container) or to ignore it.
But before you decide what to do with an iterator after calling the erase function of the container, you must first know whether it is invalid.
I have written a function that checks whether a given iterator is invalid and returns true if it is. You can use the memcpy function to get the state of any object, and of course we first use the memset function to clear the new buffer:
// Requires <cstring> (memset, memcpy) and <list>.
bool IsNull(list<int>::iterator& i) // In your example you used list<int>; for a vector or deque, or a different element type, change this parameter type accordingly.
{
    unsigned char buffer[sizeof(i)];
    memset(buffer, 0, sizeof(i));
    memcpy(buffer, &i, sizeof(i));
    return *buffer == 0; // On my implementation an iterator is 12 bytes long, and if the first byte copied into the buffer is zero, the iterator is invalid; otherwise it is valid. This relies on the internal layout of one library's iterators and is not portable. I like to call invalid iterators "null iterators".
}
I tested this function before posting it here and found that it works for me.
I very much hope that I have fully answered your question and helped you!
There is a way, but it is ugly: you can use the std::distance function. Note that this only works while your_iter is still a valid iterator into container, so it detects the end position, not invalidation.
#include <iterator>
using namespace std;
auto distance_to_iter = distance(container.begin(), your_iter);
auto distance_to_end = distance(container.begin(), container.end());
bool is_your_iter_still_valid = distance_to_iter != distance_to_end;
Use erase with increment:
if (something) l.erase(itd++);
so that you can still test the validity of the iterator:
if (iterator != container.end()) {
    // iterator is dereferenceable!
}
If your iterator doesn't equal container.end() and is not dereferenceable, you're doing something wrong.