I have the following C++ code using set_union() from the STL <algorithm> header:
9 int first[] = {5, 10, 15, 20, 25};
10 int second[] = {50, 40, 30, 20, 10};
11 vector<int> v(10);
12 vector<int>::iterator it;
13
14 sort(first, first+5);
15 sort(second, second+5);
16
17 it = set_union(first, first + 5, second, second + 5, v.begin());
18
19 cout << int(it - v.begin()) << endl;
I read through the documentation of set_union at http://www.cplusplus.com/reference/algorithm/set_union/ . I have two questions:
Line 17. I understand set_union() returns an OutputIterator. I
thought iterators are objects returned from a container object
(e.g. an instantiated vector class, where calling blah.begin()
returns the iterator object). I am trying to understand what
the "it" returned from set_union points to. Which object?
Line 19. What does "it - v.begin()" equate to? I am guessing from the output value of "8" that it is the size of the union, but how?
Would really appreciate if someone can shed some light.
Thank you,
Ahmed.
The documentation for set_union states that the returned iterator points past the end of the constructed range, in your case to one past the last element in v that was written by set_union.
This is also why it - v.begin() yields the length of the set union. Note that you can simply subtract the two only because vector<T>::iterator satisfies the RandomAccessIterator concept. Ideally, you should use std::distance to compute the interval between two iterators.
Your code snippet can be written more idiomatically as follows:
int first[] = {5, 10, 15, 20, 25};
int second[] = {50, 40, 30, 20, 10};
std::vector<int> v;
v.reserve(10); // reserve instead of setting an initial size
std::sort(std::begin(first), std::end(first));
std::sort(std::begin(second), std::end(second));
// use std::begin/end instead of hard coding length
std::set_union(std::begin(first), std::end(first),
               std::begin(second), std::end(second),
               std::back_inserter(v));
// using back_inserter ensures the code works even if the vector is not
// initially set to the right size
// (the iterator returned here is a back_insert_iterator, which cannot be
// passed to std::distance together with v.begin(), so use v.end() instead)
std::cout << std::distance(v.begin(), v.end()) << std::endl;
std::cout << v.size() << std::endl;
// these lines will output the same result unlike your example
In response to your comment below
What is the use of creating a vector of size 10 or reserving size 10
In your original example, the vector must have an initial size of at least 8 to prevent undefined behavior, because set_union writes 8 elements to the output range. Reserving 10 elements is an optimization that avoids multiple reallocations of the vector. It is typically not needed, and often not feasible, since you won't know the exact size of the result in advance (although the sum of the two input sizes is always a safe upper bound).
I tried with size 1, works fine
A size of 1 definitely does NOT work fine with your code; it is undefined behavior. set_union will write past the end of the vector. You get a seg fault with size 0 for the same reason. There's no point in speculating why the same thing doesn't happen in the first case; that's just the nature of undefined behavior.
Does set_union trim the size of the vector, from 10 to 8. Why or is that how set_union() works
You're only passing an iterator to set_union, so it knows nothing about the underlying container. There's no way it could trim excess elements, or make room for more if needed. It simply writes to the output iterator and increments it after each write. This is why I suggested using back_inserter: it is an iterator adaptor that calls vector::push_back() whenever the iterator is written to. This guarantees that set_union will never write beyond the bounds of the vector.
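To make the two options concrete, here is a minimal sketch of both. The helper names union_presized and union_appended are made up for illustration, and both assume the inputs are already sorted. The first one shows how the caller, not set_union, has to trim the unused tail.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: write into a vector pre-sized to the worst case,
// then erase the unused tail using the iterator that set_union returns.
std::vector<int> union_presized(const std::vector<int>& a,
                                const std::vector<int>& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> out(a.size() + b.size()); // a union can never be larger
    std::vector<int>::iterator last =
        std::set_union(a.begin(), a.end(), b.begin(), b.end(), out.begin());
    out.erase(last, out.end()); // set_union cannot shrink the vector itself
    return out;
}

// Hypothetical helper: let back_inserter grow the vector via push_back.
std::vector<int> union_appended(const std::vector<int>& a,
                                const std::vector<int>& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> out;
    out.reserve(a.size() + b.size()); // optional, only avoids reallocations
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}
```

With the sorted inputs from the question, both helpers should return the same 8-element vector.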
first: "it" is an iterator to the end of the constructed range, i.e. one past the last element written. Here that is v.begin() + 8, not v.end(), because v was created with 10 elements.
second: it - v.begin() equals 8 because vector iterators are random access iterators, so subtracting them works like pointer arithmetic and gives the number of elements between them. In general, it is better to use std::distance than to rely on raw subtraction:
cout << distance(v.begin(), it) << endl;
I am new to C++. I was trying using the accumulate algorithm from the numeric library. Consider the following code segment.
int a[3] = {1, 2, 3};
int b = accumulate(a, a + 3, 0);
It turns out that the code segment works and b gets assigned 1+2+3+0=6.
If a was a vector<int> instead of an array, I would call accumulate(a.begin(), a.end(), 0). a.end() would point to one past the end of a. In the code segment above, a + 3 is analogous to a.end() in that a + 3 also points to one past the end of a. But with the primitive array type, how does the program know that a + 3 is pointing at one past the end of some array?
The end iterator is merely used for comparison as a stop marker. It does not matter what it really points to, as it will never be dereferenced. The accumulate function iterates over all elements in the half-open range [a, a + 3) and stops as soon as it reaches the stop marker.
In the C-style array case, the stop marker is the address one past the last element of the array. This is the same in the std::vector case.
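As a rough illustration, a simplified stand-in for accumulate could look like the sketch below. This is not the library's actual source, and the name sum_range is made up here. The point is only that the end iterator appears in the loop condition and nowhere else.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for std::accumulate (hypothetical, for illustration).
template <typename InputIt, typename T>
T sum_range(InputIt first, InputIt last, T init)
{
    // 'last' is only ever compared against, never dereferenced.
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

// Usage with both a C-style array and a vector.
void sum_range_examples()
{
    int a[3] = {1, 2, 3};
    assert(sum_range(a, a + 3, 0) == 6); // a + 3 is just the stop marker
    std::vector<int> v(a, a + 3);
    assert(sum_range(v.begin(), v.end(), 0) == 6);
}
```

The function never needs to know where the array ends; the caller supplies that information through the second argument.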
I found some c++ code that I would like to understand. In this code they use
int airplane = min_element(min_cost_airplane.begin(),
min_cost_airplane.end()) - min_cost_airplane.begin();
But I don't know what this line of code exactly accomplishes. min_cost_airplane is a vector. I understand the min_element function, but I can't wrap my head around the - min_cost_airplane.begin() at the end. Is this structure commonly used? What I understand is that this line takes an iterator to the smallest element in the vector and subtracts an iterator to the first element of the vector. So what does the iterator point to?
Can someone please help me?
The std::min_element algorithm returns an iterator. You can dereference that iterator to access the "minimum" element of the container. If, instead, you want to know the index of the element you need to compute it as the distance from the beginning of the container.
For random-access iterators you can subtract the iterators to get the offset, or index value. That's what your example does. There's also a std::distance function that computes the index but it also works for non-random access iterators. For example:
auto iter = std::min_element(min_cost_airplane.begin(), min_cost_airplane.end());
int index = std::distance(min_cost_airplane.begin(), iter);
std::min_element returns an iterator to the first instance of the minimum value in the given range.
begin returns an iterator to the first element in the container.
std::vector iterators are random access iterators, so you can subtract one from another, which yields the distance between them in terms of elements (it's pointer arithmetic under the hood). To be more generic and clearer, write
auto airplane = std::distance(
min_cost_airplane.begin(),
std::min_element(min_cost_airplane.begin(), min_cost_airplane.end())
);
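To show why std::distance is the more generic choice, here is a small sketch that works for both std::vector and std::list, whereas iterator subtraction would not compile for the list. The helper name index_of_min is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: index of the first smallest element of a non-empty
// container. Only forward iteration is required, so std::list works too.
template <typename Container>
typename Container::difference_type index_of_min(const Container& c)
{
    assert(!c.empty()); // an empty container has no minimum element
    return std::distance(c.begin(), std::min_element(c.begin(), c.end()));
}
```

For a random access container, std::distance does the same subtraction as the original line; for a list it walks the elements instead.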
std::min_element finds the smallest element in the half-open range [first, last).
std::vector<int> v{3, 8, 4, 2, 5, 9};
std::vector<int>::iterator result = std::min_element(v.begin(), v.end());
//result iterator to the minimum value in vector v.
std::cout << "min element is: " << *result;
output: min element is: 2
Note : The smallest value in vector v is 2.
I've read a lot of posts about reference, pointer and iterator invalidation. For instance, I've read that insertion invalidates all references to the elements of a deque, so why don't I get errors in the following code?
#include <deque>
int main()
{
std::deque<int> v1 = { 1, 3, 4, 5, 7, 8, 9, 1, 3, 4 };
int& a = v1[6];
std::deque<int>::iterator it = v1.insert(v1.begin() + 2, 3);
int c = a;
return a;
}
When I run this I get 9 as result, so "a" is still referring to the right element.
In general I didn't manage to get invalidation errors. I tried different containers and even with pointers and iterators.
Sometimes, an operation that could invalidate something, doesn't.
I'm not familiar enough with the std::deque implementation to comment, but if you did push_back on a std::vector, for example, all your iterators, references and pointers to elements of the vector might be invalidated, because std::vector needed to allocate more memory to accommodate the new element and ended up moving all the data to a new location where that memory was available.
Or, you might get nothing invalidated, because the vector had enough space to just construct a new element in place, or was lucky enough to get enough new memory at the end of its current memory location, and did not have to move anything, while still having changed size.
Usually, the documentation specifies which operations can invalidate what. For example, search for "invalidate" in https://en.cppreference.com/w/cpp/container/deque .
Additionally, particular implementations of the standard data structures might be even safer than the standard guarantees - but relying on that will make your code highly non-portable, and potentially introduce hidden bugs when the unspoken safety guarantees change: everything will seem to work just fine until it doesn't.
The only safe thing to do is to read the specification carefully and never rely on something not getting invalidated when it does not guarantee that.
Also, as Enrico pointed out, you might get cases where your references/pointers/iterators get invalidated, but reading from them yields a value that looks fine, so such a simple method for testing if something has been invalidated will not do.
The following code, on my system, shows the effect of the undefined behavior.
#include <deque>
#include <iostream>
int main()
{
std::deque<int> v1 = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
for (auto e : v1) std::cout << e << ' ';
std::cout << std::endl;
int& a = v1[1];
int& b = v1[2];
int& c = v1[3];
std::cout << a << ' ' << b << ' ' << c << std::endl;
std::deque<int>::iterator it = v1.insert(v1.begin() + 2, -1);
for (auto e : v1) std::cout << e << ' ';
std::cout << std::endl;
v1[7] = -3;
std::cout << a << ' ' << b << ' ' << c << std::endl;
return a;
}
Its output for me is:
1 2 3 4 5 6 7 8 9 10
2 3 4
1 2 -1 3 4 5 6 7 8 9 10
-1 3 4
If the references a, b, and c, were still valid, the last line should have been
2 3 4
Please, do not deduce from this that a has been invalidated while b and c are still valid. They're all invalid.
Try it out; maybe you are "lucky" and it shows the same for you. If it doesn't, play around with the number of elements in the container and a few insertions. At some point you may see something strange, as in my case.
Addendum
The ways std::deque can be implemented make the invalidation mechanism a bit more complex than for the "simpler" std::vector. You also have fewer ways to check whether something is actually going to suffer from undefined behavior. With std::vector, for instance, you can tell whether a push_back will invalidate existing references: the member function capacity tells you whether the container already has enough space to accommodate the bigger size required by inserting further elements with push_back. For instance, if size gives 8 and capacity gives 10, you can push_back two more elements "safely". If you push one more, the array will have to be reallocated.
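The capacity check described above can be sketched as follows. The helper name push_back_keeps_references is made up for this example, and the sketch only covers references to existing elements (the end() iterator is invalidated by push_back in any case).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: true if one more push_back is guaranteed not to
// reallocate, so references to existing elements remain valid.
bool push_back_keeps_references(const std::vector<int>& v)
{
    return v.size() < v.capacity();
}

void capacity_example()
{
    std::vector<int> v;
    v.reserve(10);
    for (int i = 0; i < 8; ++i) v.push_back(i); // size 8, capacity >= 10
    int& first = v[0];
    assert(push_back_keeps_references(v));
    v.push_back(8); // no reallocation, so 'first' still refers to v[0]
    first = 42;
    assert(v[0] == 42);
}
```

std::deque has no comparable member function, which is part of why the same kind of check is not available there.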
Say I have a std::deque<int> d containing 100 values, from 0 to 99. Given the following:
Unlike vectors, deques are not guaranteed to store all its elements in
contiguous storage locations: accessing elements in a deque by
offsetting a pointer to another element causes undefined behavior.
It appears the line below is not valid:
int invalidResult = *(d.begin() + 81); // might give me 81, but NOT GUARANTEED, right?
My question is this: does an iterator take care of this?
std::deque<int>::iterator it = d.begin();
int isThisValid = *(it + 81); // 81 every time? or does it result in undefined behavior?
At one point, I had thought that the iterator would handle any discontinuities in the underlying storage, but now I'm not so sure. Obviously, if you use it++ 81 times, *it will give you 81 as a result.
Can someone say for sure?
For what it's worth, I am not using C++11.
It appears the line below is not valid:
int invalidResult = *(d.begin() + 81); // might give me 81, but NOT GUARANTEED, right?
On the contrary. The statement is perfectly valid and the behaviour is guaranteed (assuming d.size() >= 82). This is because std::deque::begin returns an iterator, not a pointer, so the quoted rule does not apply.
std::deque<int>::iterator it = d.begin();
int isThisValid = *(it + 81); // 81 every time? or does it result in undefined behavior?
This is pretty much equivalent to the previous code, except you've used a named variable, instead of a temporary iterator. The behaviour is exactly the same.
Here is an example of what you may not do:
int* pointer = &d.front();
pointer[offset] = 42; // oops
According to the reference documentation, std::deque provides a RandomAccessIterator, so your example will work.
std::deque<int>::iterator it = d.begin();
int isThisValid = *(it + 81); // will be fine assuming the deque is that large
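For anyone who wants to check this, here is a small sketch that avoids C++11 features, since the question mentions not using C++11. The helper names make_sequence and element_at are made up for illustration.

```cpp
#include <cassert>
#include <deque>

// Fill a deque with 0..n-1, as in the question.
std::deque<int> make_sequence(int n)
{
    std::deque<int> d;
    for (int i = 0; i < n; ++i) d.push_back(i);
    return d;
}

// Hypothetical helper: read one element through iterator arithmetic and
// through operator[]. Both are well defined for a valid index.
int element_at(const std::deque<int>& d,
               std::deque<int>::difference_type index)
{
    typedef std::deque<int>::size_type size_type;
    assert(index >= 0 && static_cast<size_type>(index) < d.size());
    const int via_iterator = *(d.begin() + index); // iterator handles the chunks
    const int via_subscript = d[static_cast<size_type>(index)];
    assert(via_iterator == via_subscript);
    return via_iterator;
}
```

Calling element_at(make_sequence(100), 81) should give 81, because the iterator's operator+ accounts for however the deque splits its storage.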
I want to traverse a list in C++, but only up to the fifth element from the end, not all the way to the end.
But I see that there is no "-" operator defined, so I cannot use
list<>::iterator j=i-5;
I could do it using the size() function and keeping a count, but is there a more direct way?
Counting is the only practical way that may avoid effectively traversing the list in some way. Otherwise you can write
auto myEnd = myList.end();
std::advance(myEnd, -5); // std::advance modifies its argument and returns void
but this just traverses the last five list elements to reach the desired point, so it's no faster or more elegant than most other solutions. However, an integer loop requires keeping both an integer count and an iterator, whereas this only requires the iterator, so in that regard it may be nicer.
If your <list> has an O(1) size(), and the distance back from the end is large, use an integer loop; otherwise the above is nice.
List doesn't support random access iterators. You can use reverse iterator and counter.
You could use std::advance to get the iterator to the fifth from last.
std::list has bidirectional iterators. To get the iterator five positions before the end iterator, you apply operator --, which is defined for bidirectional iterators, five times. The C++ Standard provides two functions that perform this task. The first, std::advance, has been available since C++98. The second, std::prev, appeared in C++11. It is simpler to use std::prev because it returns the needed iterator. For example
std::list<int> l = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
std::copy( l.begin(), std::prev( l.end(), 5 ), std::ostream_iterator<int>( std::cout, " " ) );
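If std::prev is not available, roughly the same thing can be sketched with std::advance. The helper names nth_from_end and sum_all_but_last are made up for this example. Note the size check: stepping back past begin() is undefined behavior, so a list shorter than n elements has to be ruled out first.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical helper: iterator n positions before end(), using only
// std::advance so it does not depend on C++11.
std::list<int>::const_iterator nth_from_end(const std::list<int>& l,
                                            std::list<int>::size_type n)
{
    assert(n <= l.size()); // stepping before begin() is undefined behavior
    std::list<int>::const_iterator it = l.end();
    std::advance(it, -static_cast<std::list<int>::difference_type>(n));
    return it;
}

// Hypothetical helper: sum everything except the last n elements.
int sum_all_but_last(const std::list<int>& l, std::list<int>::size_type n)
{
    int sum = 0;
    const std::list<int>::const_iterator stop = nth_from_end(l, n);
    for (std::list<int>::const_iterator it = l.begin(); it != stop; ++it) {
        sum += *it;
    }
    return sum;
}
```

For the list 1 to 10 and n = 5, the loop visits 1 through 5 and stops before 6.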
In addition to the available answers, I'd recommend sticking to a standard algorithm for traversing the list rather than dealing with iterators directly, if you can.
For example:
auto l = list<int>{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
for_each(begin(l), prev(end(l), 5), [](const int& i) {
cout << i << endl;
});
http://ideone.com/6wNuMP