Switching Vectors Supplied Iterators - c++

I am designing my own generic tree container and am using the STL as a reference. However, when implementing my iterator class I noticed something about the STL's use of iterators.
As an example, the std::vector class relies on iterators as arguments for many of its methods (e.g. erase(const_iterator position)).
This got me wondering: given two vectors of the same template type, what happens if the first vector's iterator is supplied to the second vector in a method call? To help answer this question I have put together a simple program to illustrate my thoughts.
// Example program
#include <iostream>
#include <string>
#include <vector>
#include <iomanip>
void printVec(const std::string &label, const std::vector<int> &vec){
    std::cout << label << ": ";
    for (unsigned int i=0; i<vec.size(); i++){
        std::cout << std::setw(3) << vec[i] << ", ";
    }
    std::cout << std::endl;
}
int main()
{
    std::vector<int> test={0,1,2,3,4,5,6,7,8,9};
    std::vector<int> test2{10,11,12,13,14,15,16,17,18,19};
    std::vector<int>::iterator iter=test.begin();
    std::vector<int>::iterator iter2=test2.begin();
    printVec("One",test);
    printVec("Two",test2);
    // Advance both iterators five positions
    for (int i=0; i<5; i++, iter++, iter2++);
    std::cout << "One Pos: " << *iter << std::endl;
    std::cout << "Two Pos: " << *iter2 << std::endl;
    test.erase(iter2); // Switching the iterators and their respective vectors
    test2.erase(iter); // Switching the iterators and their respective vectors
    printVec("One",test);
    printVec("Two",test2);
}
Running this program results in a segmentation fault, which seems to indicate that this is undefined behavior. I hesitate to call this a flaw in the STL vector interface, but it sure seems that way.
So my question is this: is there any way to avoid this when designing my own container?

The iterator passed to a member function of a container must refer to an element within that container (or, in some cases, the past-the-end element returned by end()). If the iterator does not refer to the container you have Undefined Behavior.
There is no simple way to avoid that. About the closest you can come is to validate the iterators, which means you'd have to keep track of the container each iterator belongs to. This gets a bit complicated with some operations like swap or insert that don't invalidate existing iterators but leave them referring to the new container.
Some compilers, like Visual C++ when compiling in debug mode, can detect these sorts of problems at runtime and issue an appropriate notification.

Related

For Random Access Iterator (vector iterator), are the iterators C++ style pointers?

I have the following code to randomize the elements in a list container:
#include <vector>
#include <list>
#include <iterator>
#include <algorithm>
#include <iostream>
#include <cstdlib> // rand()
using namespace std;
template<class RandomAccessIterator>
void randomize(RandomAccessIterator iterBegin, RandomAccessIterator iterEnd)
{
    while (iterBegin != iterEnd)
    {
        iter_swap(iterBegin, iterBegin + rand() % (iterEnd - iterBegin));
        ++iterBegin;
    }
}
And then later in main():
int main()
{
    // container to apply the algorithm to.
    list<int> List = {34,77,16,2,35,76,18,2};
    // randomize example.
    cout << "calling randomize on sorted vector: " << endl;
    List.sort();
    vector<int> temp(List.begin(), List.end());
    cout << "before randomize: " << endl;
    for (vector<int>::iterator it = temp.begin(); it != temp.end(); it++)
    {
        cout << *it << " ";
    }
    cout << endl;
    randomize(temp.begin(),temp.end());
    cout << "after randomize: " << endl;
    for (vector<int>::iterator it = temp.begin(); it != temp.end(); it++)
    {
        cout << *it << " ";
    }
    cout << endl<<endl;
    return 0;
}
I had a couple of questions:
I believe performing iterEnd - iterBegin (as shown in the template function) is a valid operation, because both iterEnd and iterBegin are C++ style pointers, and subtracting these pointers gives the distance between them. Am I correct?
I tried the following in the immediate window:
iterEnd
{-33686019}
[ptr]: 0x00ba4f78 {-33686019}
[Raw View]: {...}
It means iterEnd is a pointer whose value is 0x00ba4f78, and it points to the garbage value -33686019. I believe I am correct here?
So, for random access iterators, the iterator is a pointer. Is that true for all iterator types (input/output iterators, forward iterators, bidirectional iterators)? If those iterators are not C++ style pointers, then what are they?
I also tried the following in the immediate window:
&iterEnd
0x006ff368 {-33686019}
[ptr]: 0x00ba4f78 {-33686019}
[Raw View]: 0x006ff368 {...}
&&iterEnd
expected an expression
Why is &iterEnd giving me an address? It should give me the message "expected an expression", as &&iterEnd does.
How are random access iterators implemented? I am asking because iterEnd gives me a pointer value and &iterEnd also gives me a (different) pointer value. Is the random access iterator a pointer within a pointer?
For Random Access Iterator (vector iterator), are the iterators C++ style pointers?
Short answer -- it depends on the compiler.
The internals of a vector iterator are implementation-defined. std::vector<T>::iterator could be a class with operator- overloaded, which gives the illusion of pointer subtraction. Code written on the assumption that vector iterators are simple pointers will therefore compile with some compilers and break with others.
One such famous case is Visual C++, where in version 6.0, vector iterators were simple pointers, thus many authors at that time using that compiler would write code with the assumption that a std::vector<T>::iterator was just a T*. The code compiled successfully and worked correctly due to the fact that vector iterators were implemented as pointers.
An example would be something like this:
#include <vector>
void foo(char *c)
{
}
int main()
{
    std::vector<char> vc;
    foo(vc.begin());
}
No compile errors, since vc.begin() was a simple char *.
Then came the subsequent versions of Visual C++, and that code that used to compile successfully under 6.0 is now broken. The std::vector<T>::iterator was no longer a simple T*, but a struct. A lot of code based on the (faulty) reasoning of an iterator being a simple pointer had to be changed for Visual C++ > version 6.0.

std::list.end() is not returning "past-the-end" iterator

I recently started learning about C++ iterators and pointers, and while messing around with some basic exercises I came upon a situation which I think is really unusual.
#include <iostream>
#include <vector>
#include <time.h>
#include <list>
#include <array>
using namespace std;
template<typename ITER>
void print_with_hyphens(ITER begin, ITER end){
    cout << "End: " << *(end) << endl;
    cout << "Begin: " << *begin << endl;
    for(ITER it = begin; it != end; it++){
        cout << *it << endl;
    }
    cout << endl << "Finished" << endl;
}
int main()
{
    vector<int> v { 1, 2, 3, 4, 5};
    list<int> l { 1, 2, 3, 4, 5};
    print_with_hyphens(v.begin(), v.end());
    print_with_hyphens(l.begin(), l.end());
    // print_with_hyphens(a.begin(), a.end());
    return 0;
}
And when I run it like this, I get this unusual result:
[Image: results of the code]
Now, the vector is returning a weird (random, if I'm not mistaken) value, because it's trying to access a value that doesn't exist, hence, "past the end" iterator.
And it should be the same for lists, yet, the list is returning the value 5. Shouldn't it also return a "past the end" iterator?
Things such as dereferencing an invalid iterator or accessing an out-of-bounds array index produce undefined behavior.
This means the C++ standard does not specify what should happen if you do it. Anything might happen, such as a segmentation fault, or getting a random value, depending on things like your standard library implementation and compiler.
Needless to say, programs should not rely on undefined behavior.
The phrase "past-the-end" in this context is abstract. It means the iterator is off the end of the logical sequence of elements in the container. It does not mean there is some literal bit of data located just after the container in memory that you can access and read.
Because it's "past-the-end" and doesn't refer to any actual element, dereferencing the end iterator is not permitted. Doing so anyway is what produces the strange results you see.

How can an iterator in C++ be printed?

Suppose I have declared a vector in C++ like this:
vector<int> numbers = {4,5,3,2,5,42};
I can iterate over it with the following code:
for (vector<int>::iterator it = numbers.begin(); it!=numbers.end(); it++){
    // code goes here
}
Now consider the code inside the body of the for loop.
I can access and change any value using this iterator. Say I want to increase every value by 10 and then print it. The code would be:
*it+=10;
cout << *it << endl;
I can print the address of both the iterator and the elements being iterated over.
The address of the iterator can be printed by:
cout << &it << endl;
The address of the current element can be printed by:
cout << &(*it) << endl;
But why can't the iterator itself be printed by doing the following?
cout << it <<endl;
At first I thought the convention came from Java, for security reasons. But if so, why can I print its address?
Is there any other way to do this? If not, why not?
Yes, there is a way to do it!
You can't print the iterator directly because it is not defined to have a printable value.
But you can perform arithmetic on it, and that gives you a value to print (the iterator's position).
Do the following:
cout << it - v.begin();
Example:
#include <iostream>
#include <algorithm>
#include <vector>
#include <iterator>
using namespace std;
int main () {
    vector<int> v = {20,3,98,34,20,11,101,201};
    sort (v.begin(), v.end());
    vector<int>::iterator low,up;
    low = lower_bound (v.begin(), v.end(), 20);
    up = upper_bound (v.begin(), v.end(), 20);
    std::cout << "lower_bound at position " << (low - v.begin()) << std::endl;
    std::cout << "upper_bound at position " << (up - v.begin()) << std::endl;
    return 0;
}
Output of the above code:
lower_bound at position 2
upper_bound at position 4
Note: this is just a way to get the job done; I am not claiming that the iterator itself can be printed.
...
There is no predefined output operator for the standard iterators because there is no conventional meaning of printing an iterator. What would you expect such an operation to print? While you seem to expect to see the address of the object the iterator refers to, I find that not clear at all.
There is no universal answer to that, so the committee decided not to add those operators. (The last half sentence is a guess; I am not part of the committee.)
If you want to print those iterators, I would define a function like print(Iterator); (or something like this, whatever fits your needs) that does what you want. I would not add an operator << for iterators for the reason I mentioned above.
Why can't the iterator itself be printed by doing the following?
Because it is not defined to have a printable value.
Is there any other way to do this?
Not by default: the implementation does not provide it. You could try modifying the implementation's own code, but that would be a daunting task.
If not, why?
Because there is no well-defined way to express it.

Why is vector.begin() needed when deleting an element?

Deleting the 4th element in a vector requires the following code:
vector<int> v;
....
....
v.erase(v.begin()+3); // Erase v[3], i.e. the 4th element
I'm wondering why we need the v.begin() part.
It would be nicer just to write:
v.erase(3); // Erase v[3], i.e. the 4th element
begin() is a member of vector, so the erase method could just as well handle that part for us, making our code more readable and easier to understand.
There is probably a good reason and I'd like to know it.
Can someone explain or link to an explanation?
Thanks.
If this is something you'd like to be able to do, it's easy enough to come up with a function template (or is it a template function?) that will do this with a slightly different, but probably similar enough syntax:
#include <iostream>
#include <ostream>
#include <vector>
template <typename T>
void erase_at( T& container, size_t pos)
{
    container.erase(container.begin() + pos);
}
using namespace std;
int main() {
    vector<int> v;
    v.push_back(0);
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);
    for (vector<int>::iterator i = v.begin(); i != v.end(); ++i) {
        cout << *i << " ";
    }
    cout << endl;
    erase_at(v, 3); // <-- instead of `v.erase(v.begin() + 3)`
    for (vector<int>::iterator i = v.begin(); i != v.end(); ++i) {
        cout << *i << " ";
    }
    cout << endl;
    return 0;
}
Presumably, one of the key things here is to create a uniform interface to the standard library containers.
Let's look at std::set<T>, std::vector<T>, and std::list<T>. Only one of these, std::vector<T>, provides a random access iterator. For the others, reaching the element three positions past container.begin() can be relatively expensive. This matters especially because the user probably has an iterator already: they found the element in the container and now want to remove it.
Because vector.erase takes an iterator, not an integer.
You can see that in the documentation: http://www.cplusplus.com/reference/vector/vector/erase/
I think the STL developers wanted a uniform erase interface across all the generic container classes (vector, list, etc.).

What's this unexpected std::vector behavior?

I found something surprising with std::vector that I thought I'd ask about here to hopefully get some interesting answers.
The code below simply copies a string into a char vector and prints the contents of the vector in two ways.
#include <vector>
#include <string>
#include <iostream>
int main()
{
    std::string s("some string");
    std::vector<char> v;
    v.reserve(s.size()+1);
    // copy using index operator
    for (std::size_t i=0; i<=s.size(); ++i)
        v[i] = s[i];
    std::cout << "&v[0]: " << &v[0] << "\n";
    std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";
    // copy using push_back
    for (std::size_t i=0; i<=s.size(); ++i)
        v.push_back(s[i]);
    std::cout << "&v[0]: " << &v[0] << "\n";
    std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";
    return 0;
}
Building and running this yields:
$ g++ main.cpp -o v && ./v
&v[0]: some string
begin/end:
&v[0]: some string
begin/end: some string
My expectation was that it would print the string correctly in both cases, but assigning character by character using the index operator doesn't print anything when later using begin() and end() iterators.
Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?
Is there a reasonable explanation for this behaviour? :)
I've only tried this with gcc 4.6.1 so far.
Typical example of Undefined Behavior.
You are only ever allowed to access elements by index (using operator[]) between 0 and v.size()-1 (inclusive).
Using reserve does not modify the size, only the capacity. Had you used resize instead, it would have worked as expected.
In the first case, you have undefined behaviour. reserve sets the capacity, but leaves the size as zero. Your loop then writes to invalid locations beyond the end of the vector. Printing using the (invalid) pointer appears to work (although there is no guarantee of that), since you've written the string to the memory that it points at; printing using the iterator range prints nothing, because the vector is still empty.
The second loop correctly increases the size each time, so that the vector actually contains the expected contents.
Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?
[] is intended to be as fast as possible, so it does no range checking. If you want a range check, use at(), which will throw an exception on an out-of-range access. If you want to resize the vector, you have to do it yourself.