sort in the C++ standard library is called as:
sort (first element, last element);
So if I have an array:
int a[n];
I should call sort as:
sort(&a[0], &a[n-1]);
since a[0] is the first element and a[n-1] the last. When I do so, however, it doesn't sort the last element. To get a fully sorted array, I must use:
sort(&a[0], &a[n]);
Why is this?
Because ranges in the STL are always defined as half-open ranges, from the iterator to the first element to the "one-past-the-end" iterator. With C++11 you can use:
int a[n];
sort(std::begin(a),std::end(a));
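A minimal sketch of the difference between the two calls from the question. The function names and the sample values are made up for illustration; both functions return the array contents as a vector so the result is easy to inspect.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Sorts {5, 4, 3, 2, 1} the way the question first tried: the second
// argument points AT the last element, so that element is outside the range.
std::vector<int> sort_up_to_last_element() {
    int a[] = {5, 4, 3, 2, 1};
    const int n = 5;
    std::sort(&a[0], &a[n - 1]);   // sorts a[0]..a[3] only
    assert(a[n - 1] == 1);         // the last element was never touched
    return std::vector<int>(std::begin(a), std::end(a));
}

// Sorts the same values over the whole half-open range [begin, end).
std::vector<int> sort_whole_array() {
    int a[] = {5, 4, 3, 2, 1};
    std::sort(std::begin(a), std::end(a));   // std::end(a) is a + 5
    return std::vector<int>(std::begin(a), std::end(a));
}
```

The first function leaves the trailing 1 where it was, while the second produces a fully sorted sequence.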
Format for sort in STL in c++ is,
sort (first element, last element);
No, it's not. You are supposed to provide an iterator for the first element, and a one-past-the-end iterator, as you've discovered.
The Standard Library in general uses semi-open intervals to describe ranges through iterators. Otherwise it would be impossible for empty ranges to be expressed:
// An empty container!
std::vector<int> v;
// Pretend that `v.end()` returns an iterator for the actual last element,
// with the same caveat as `v.begin()` that the case where no elements
// exist gives you some kind of "sentinel" iterator that does not represent
// any element at all and cannot be dereferenced
std::vector<int>::iterator a = v.begin(), b = v.end();
// Oh no, this would think that there's one element!
std::sort(a, b);
As per documentation, std::find returns
last
if no element is found. What does that mean? Does it return an iterator pointing to the last element in the container? Or does it return an iterator pointing to .end(), i.e. pointing outside the container?
The following code prints 0, which is not an element of the container. So, I guess std::find returns an iterator outside the container. Could you please confirm?
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
    vector<int> vec = {1, 2, 3, 1000, 4, 5};
    auto itr = std::find(vec.begin(), vec.end(), 456);
    cout << *itr;
}
last is the name of the second parameter to find. It doesn't know what kind of container you're using, just the iterators that you give it.
In your example, last is vec.end(), which is (by definition) not dereferenceable, since it's one past the last element. So by dereferencing it, you invoke undefined behaviour, which in this case manifests as printing out 0.
Algorithms apply to ranges, which are defined by a pair of iterators. Those iterators are passed as arguments to the algorithm. The first iterator points at the first element in the range, and the second iterator points one past the end of the range. Algorithms that can fail return a copy of the past-the-end iterator when they fail. That's what std::find does: if there is no matching element it returns its second argument.
Note that the preceding paragraph does not use the word "container". Containers have member functions that give you a range that you can use to get at the elements of the container, but there are also ways of creating iterators that have no connection to any container.
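A minimal sketch of the check this implies: compare the result of std::find against the iterator that was passed as last before dereferencing it. The helper name contains is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: reports whether `value` occurs in `vec`.
bool contains(const std::vector<int>& vec, int value) {
    std::vector<int>::const_iterator itr = std::find(vec.begin(), vec.end(), value);
    // std::find returns its second argument when nothing matches,
    // so compare against that iterator before any dereference.
    if (itr == vec.end()) {
        return false;
    }
    assert(*itr == value);   // safe: itr refers to a real element here
    return true;
}
```

With the vector from the question, contains(vec, 456) is false and contains(vec, 1000) is true, and neither call dereferences the past-the-end iterator.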
Based on this documentation, it literally says:
"Return value:
Iterator to the first element satisfying the condition or last if no such element is found."
In your case, that is .end(), which is one position past the end of the vector.
I'm learning vectors and am confused about how the array is copied into the vector here
double p[] = {1, 2, 3, 4, 5};
std::vector<double> a(p, p+5);
I also know std::vector<double> a(3,5); means "make room for 3 elements and initialize them with 5". How does the above code work?
My second question is about this paragraph, from the same text I copied the above code from:
Understanding the second point is crucial when working with vectors or
any other standard containers. The controlled sequence is always
expressed in terms of [first, one-past-last)—not only for ctors, but
also for every function that operates on a range of elements.
I don't know what is meant by [first, one-past-last).
I understand the notation mathematically, but I don't know why or how the vector copies the array in this way.
Edited
another related question
The member function end() returns an iterator that "points" to
one-past-the-last-element in the sequence. Note that dereferencing the
iterator returned by end() is illegal and has undefined results.
Can you explain what this one-past-the-last-element is, and why it is used?
Never dereference end() or rend() from STL containers, as they do not point to valid elements.
This picture below can help you visualize this.
The advantages of a half-open range are:
1. An empty range needs no special handling: it is simply the case begin() == end().
2. Iterating over the elements is intuitive: advance the iterator until it equals end().
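Both points can be seen in one loop. The sketch below uses a made-up function, sum_range, that walks a half-open range of vector iterators; the empty case falls out of the same loop condition.

```cpp
#include <cassert>
#include <vector>

// Sums the elements of [first, last) by advancing until the iterator equals last.
int sum_range(std::vector<int>::const_iterator first,
              std::vector<int>::const_iterator last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;   // never executed when first == last
    }
    return total;
}

// Usage: the same call works for a full and for an empty container.
void example() {
    std::vector<int> values = {1, 2, 3};
    std::vector<int> empty;
    assert(sum_range(values.begin(), values.end()) == 6);
    assert(sum_range(empty.begin(), empty.end()) == 0);
}
```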
Strongly coupled with containers (e.g. vector, list, map) is the concept of iterators. An iterator is a C++ abstraction of a pointer. I.e. an iterator points to an object inside the container (or to one past the last element), and dereferencing the iterator means accessing that element.
Let's take, for instance, a vector of 4 elements:
| 0 | 1 | 2 | 3 |   |
  ^           ^   ^
  |           |   |
  |           |   one past the end (outside of the container elements)
  |           last element
  first element
The (algorithms in the) standard template library operate on ranges, rather than on containers. This way you can apply operations, not only to the entire container, but also to ranges (consecutive elements of the container).
A range is specified by [first, last) (inclusive first, exclusive last). That's why you need an iterator to one past the end: to specify a range equal to the entire contents of the container. But as that iterator points outside of it, it is illegal to dereference it.
The constructor of std::vector has several overloads.
For std::vector<double> a(3,5); the fill constructor is used:
explicit vector (size_type n);
vector (size_type n, const value_type& val,
const allocator_type& alloc = allocator_type());
This takes a size as its first parameter, the value to give each newly created object as its second parameter, and an optional allocator as its third.
double p[] = {1, 2, 3, 4, 5};
std::vector<double> a(p, p+5);
Uses another overload of the constructor, namely the range constructor:
template <class InputIterator>
vector (InputIterator first, InputIterator last,
const allocator_type& alloc = allocator_type());
This takes an iterator to the start of a range and an iterator one past its end, then traverses the range, adding each element to the vector until first == last.
The reason end() is implemented as one-past-the-last-element is that it allows implementations to detect the end of the range with a simple comparison:
while (first != last)
{
    // safely add the value of *first to the vector
    ++first;
}
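Filled in, that loop might look like the sketch below. copy_range is a hypothetical stand-in for the range constructor, not how any particular library implements it (a real implementation may, for instance, reserve memory up front).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the range constructor: copies [first, last) into a new vector.
template <class InputIterator>
std::vector<double> copy_range(InputIterator first, InputIterator last) {
    std::vector<double> result;
    while (first != last) {         // stops before the one-past-the-end position
        result.push_back(*first);   // only real elements are ever dereferenced
        ++first;
    }
    return result;
}

// Usage with the array from the question.
void example() {
    double p[] = {1, 2, 3, 4, 5};
    assert(copy_range(p, p + 5) == std::vector<double>(p, p + 5));
    assert(copy_range(p, p).empty());   // [p, p) is an empty range
}
```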
Iterators are an abstraction of pointers.
A half-open interval [a,b) is defined as all elements x>=a and x<b. The advantage of it is that [a,a) is well defined and empty for any a.
Anything that can be incremented and compared equal can define a half open interval. So [ptr1,ptr2) is the element ptr1 then ptr1+1 then ptr1+2 until you reach ptr2, but not including ptr2.
For iterators, it is similar -- except we do not always have random access. So we talk about next instead of +1.
Pointers still count as a kind-of iterator.
A range of iterators or pointers "talks about" the elements pointed to. So when a vector takes a pair of iterators (first, and one-past-the-end), this defines a half-open interval of iterators, which also defines a collection of values they point to.
You can construct the vector from such a half-open range. It copies the elements pointed to into the vector.
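As an illustration of "next instead of +1", here is a small sketch that builds a vector from part of a std::list, whose iterators are not random access. The function name slice and its index parameters are assumptions made for the example; it assumes 0 <= from <= to <= values.size().

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Copies the elements at positions [from, to) of a std::list into a vector.
// std::next steps through the list one element at a time instead of using `+`.
std::vector<int> slice(const std::list<int>& values, int from, int to) {
    std::list<int>::const_iterator first = std::next(values.begin(), from);
    std::list<int>::const_iterator last = std::next(values.begin(), to);   // one past the last element wanted
    return std::vector<int>(first, last);
}

void example() {
    const std::list<int> values = {10, 20, 30, 40, 50};
    const std::vector<int> middle = slice(values, 1, 4);
    assert(middle.size() == 3);
    assert(middle.front() == 20 && middle.back() == 40);
    assert(slice(values, 2, 2).empty());   // [a, a) is empty
}
```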
I'm a Java developer. I'm currently learning C++. I've been looking at code samples for sorting. In Java, one normally gives a sorting method the container it needs to sort e.g
sort(Object[] someArray)
I've noticed in C++ that you pass two args, the start and end of the container. My question is: how is the actual container accessed, then?
Here's sample code taken from Wikipedia illustrating the sort method
#include <iostream>
#include <algorithm>
#include <vector>
int main() {
std::vector<int> vec;
vec.push_back(10); vec.push_back(5); vec.push_back(100);
std::sort(vec.begin(), vec.end());
for (std::vector<int>::size_type i = 0; i < vec.size(); ++i)
std::cout << vec[i] << ' ';
}
vec.begin() and vec.end() return iterators. Iterators are a kind of pointer to the elements: you can read and modify the elements through them. That is what sort does using the iterators.
If it is an iterator, you can directly modify the object the iterator is referring to:
*it = X;
The sort function does not have to know about the containers, which is the power of the iterators. By manipulating the iterators, it can sort the complete container without even knowing exactly what container it is.
You should learn about iterators (http://www.cprogramming.com/tutorial/stl/iterators.html)
vec.begin() and vec.end() do not return the first and last elements of the vector. They actually return what is known as an iterator. An iterator behaves very much like a pointer to the elements. If you have an iterator i that you initialised with vec.begin(), you can get an iterator to the second element in the vector just by doing i++, the same as you would if you had a pointer to the first element in an array. Likewise you can do i-- to go backwards. For some iterators (known as random access iterators), you can even do i + 5 to get an iterator to the 5th element after i.
This is how the algorithm accesses the container. It knows that all of the elements that it should be sorting are between begin() and end(). It navigates around the elements by doing simple iterator operations. It can then modify the elements by doing *i, which gives the algorithm a reference to the element that i is pointing at. For example, if i is set to vec.begin(), and you do *i = 5;, you will change the value of the first element of vec.
This approach allows you to pass only part of a vector to be sorted. Let's say you only wanted to sort the first 5 elements of your vector. You could do:
std::sort(vec.begin(), vec.begin() + 5);
This is very powerful. Since iterators behave very much like pointers, you can actually pass plain old pointers too. Let's say you have an array int array[] = {4, 3, 2, 5, 1};, you could easily call std::sort(array, array + 5) (because the name of an array will decay to a pointer to its first element).
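A small sketch of sorting only part of a vector. sort_prefix is a hypothetical helper that generalises the vec.begin() + 5 call to any count not larger than the vector's size.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sorts only the first `count` elements of `vec`; the rest keep their positions.
void sort_prefix(std::vector<int>& vec, std::vector<int>::size_type count) {
    assert(count <= vec.size());   // the end iterator must stay inside [begin, end]
    std::sort(vec.begin(),
              vec.begin() + static_cast<std::vector<int>::difference_type>(count));
}

void example() {
    std::vector<int> vec = {9, 7, 8, 1, 3};
    sort_prefix(vec, 3);
    assert(vec[0] == 7 && vec[1] == 8 && vec[2] == 9);   // sorted prefix
    assert(vec[3] == 1 && vec[4] == 3);                  // untouched tail

    int array[] = {4, 3, 2, 5, 1};
    std::sort(array, array + 5);                         // plain pointers work too
    assert(array[0] == 1 && array[4] == 5);
}
```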
The container doesn't have to be accessed. That's the whole point of the design behind the Standard Template Library (which became part of the C++ standard library): The algorithms don't know anything about containers, just iterators.
This means they can work with anything that provides a pair of iterators. Of course all STL containers provide begin() and end() methods, but you can also use a regular old C array, or an MFC or glib container, or anything else, just by writing your own iterators for it. (And for C arrays, it's as simple as a and a+a_len for the begin and end iterators.)
As for how it works under the covers: Iterators follow an implicit protocol: you can do things like ++it to advance an iterator to the next element, or *it to get the value of the current element, or *it = 3 to set the value of the current element. (It's a bit more complicated than this, because there are a few different protocols: iterators can be random-access or forward-only, const or writable, etc. But that's the basic idea.) So, if sort is coded to restrict itself to the iterator protocol (and, of course, it is), it works with anything that conforms to that protocol.
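To make the protocol concrete, here is a sketch of a tiny algorithm that uses nothing but !=, ++ and * on its iterators. assign_all is a made-up name (it behaves much like std::fill); because it sticks to the protocol, the same code works on a vector, a list and a C array.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical algorithm: writes `value` to every element of [first, last).
// It relies only on the iterator protocol, not on any container type.
template <class ForwardIterator, class T>
void assign_all(ForwardIterator first, ForwardIterator last, const T& value) {
    for (; first != last; ++first) {
        *first = value;
    }
}

void example() {
    std::vector<int> vec(3, 0);
    std::list<int> lst(2, 0);
    int arr[4] = {0, 0, 0, 0};

    assign_all(vec.begin(), vec.end(), 7);
    assign_all(lst.begin(), lst.end(), 7);
    assign_all(arr, arr + 4, 7);   // pointers satisfy the same protocol

    assert(vec == std::vector<int>(3, 7));
    assert(lst == std::list<int>(2, 7));
    assert(arr[0] == 7 && arr[3] == 7);
}
```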
To learn more, there are many tutorials on the internet (and in the bookstore); there's only so much an SO answer can explain.
begin() and end() return iterators. See e.g. http://www.cprogramming.com/tutorial/stl/iterators.html
Iterators act like references into part of a container. That is, *iter = z; actually changes one of the elements in the container.
std::sort actually uses a swap function on references to the contained objects, so that any iterators you have already initialized remain in the same order but the values those iterators refer to are changed.
Note that std::list also has a member function called sort. It works the other way around: any iterators you have already initialized keep the same values, but the order of those iterators changes.
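The contrast can be checked with a short sketch. The two function names and the values 30, 10, 20 are arbitrary choices for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Takes an iterator to position 0 of a vector, sorts, and returns what it now refers to.
int vector_value_after_sort() {
    std::vector<int> vec = {30, 10, 20};
    std::vector<int>::iterator it = vec.begin();
    std::sort(vec.begin(), vec.end());
    assert(it == vec.begin());   // still the first position
    return *it;                  // but the value stored there is now the smallest
}

// Takes an iterator to the first node of a list, sorts, and reports whether
// that node has moved to the back of the list.
bool list_node_moved_to_back() {
    std::list<int> lst = {30, 10, 20};
    std::list<int>::iterator it = lst.begin();
    lst.sort();
    assert(*it == 30);                   // same node, same value
    return std::next(it) == lst.end();   // the node itself is now last
}
```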
I'm curious about the rationale behind the following code. For a given map, I can delete a range up to, but not including, end() (obviously,) using the following code:
map<string, int> myMap;
myMap["one"] = 1;
myMap["two"] = 2;
myMap["three"] = 3;
map<string, int>::iterator it = myMap.find("two");
myMap.erase( it, myMap.end() );
This erases the last two items using the range. However, if I used the single iterator version of erase, I half expected passing myMap.end() to result in no action as the iterator was clearly at the end of the collection. This is as distinct from a corrupt or invalid iterator which would clearly lead to undefined behaviour.
However, when I do this:
myMap.erase( myMap.end() );
I simply get a segmentation fault. I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing? I noticed that even this works:
myMap.erase( myMap.end(), myMap.end() );
(i.e. does nothing)
The reason I ask is that I have some code which receives a valid iterator to the collection (but which could be end()) and I wanted to simply pass this into erase rather than having to check first like this:
if ( it != myMap.end() )
myMap.erase( it );
which seems a bit clunky to me. The alternative is to recode so I can use the by-key-type erase overload but I'd rather not rewrite too much if I can help it.
The key is that in the standard library, ranges determined by two iterators are half-open ranges, written [a,b) in math notation. They include the element at the first iterator but not the one at the last iterator (if both are the same, the range is empty). At the same time, end() returns an iterator that is one beyond the last element, which perfectly matches the half-open range notation.
When you use the range version of erase it will never try to delete the element referenced by the last iterator. Consider a modified example:
map<int,int> m;
for (int i = 0; i < 5; ++i)
m[i] = i;
m.erase( m.find(1), m.find(4) );
At the end of the execution the map will hold two keys 0 and 4. Note that the element referred by the second iterator was not erased from the container.
On the other hand, the single iterator operation will erase the element referenced by the iterator. If the code above was changed to:
for (int i = 1; i <= 4; ++i )
m.erase( m.find(i) );
The element with key 4 will be deleted. In your case you will attempt to erase through the end iterator, which does not refer to a valid object.
I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case.
No, it is not hard to do, but the function was designed with a different contract in mind: the caller must pass in an iterator to an element in the container. Part of the reason for this is that in C++ most of the features are designed so that they incur the minimum cost possible, allowing the user to balance safety and performance on their side. The user can test the iterator before calling erase, but if that test was inside the library then the user would not be able to opt out of testing when she knows that the iterator is valid.
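If the check is wanted at many call sites, one option is to wrap it once. The sketch below is a hypothetical helper, erase_if_not_end, not something the standard library provides; callers who know their iterator is valid can keep calling erase directly and skip the test.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: erases the element at `it` unless `it` is end().
// Returns true if an element was removed.
template <class Map>
bool erase_if_not_end(Map& m, typename Map::iterator it) {
    if (it == m.end()) {
        return false;   // nothing to erase; erase(end()) would be undefined
    }
    m.erase(it);
    return true;
}

void example() {
    std::map<std::string, int> myMap;
    myMap["one"] = 1;
    myMap["two"] = 2;
    myMap["three"] = 3;

    assert(erase_if_not_end(myMap, myMap.find("two")));       // found: erased
    assert(!erase_if_not_end(myMap, myMap.find("missing")));  // find returned end()
    assert(!erase_if_not_end(myMap, myMap.end()));
    assert(myMap.size() == 2);
}
```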
n3337 23.2.4 Table 102
a.erase( q1, q2)
erases all the elements in the range [q1,q2). Returns q2.
So in the case of myMap.erase(myMap.end(), myMap.end()); the range [end(), end()) is empty, and the iterator returned by map::end() is not part of it.
a.erase(q)
erases the element pointed to by q. Returns an iterator pointing to the element immediately following q prior to the element being erased. If no such element exists, returns a.end().
I wouldn't have thought it difficult for map to check whether the
iterator equalled end() and not take action in that case. Is there
some subtle reason for this that I'm missing?
The reason is the same as why std::vector::operator[] does not have to check that the index is in range.
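The return value described in the a.erase(q) entry quoted above is what makes erasing while iterating straightforward. A minimal sketch, assuming C++11 or later (where std::map::erase returns an iterator); the function name and the odd-value condition are made up for illustration.

```cpp
#include <cassert>
#include <map>

// Removes every entry whose value is odd, using the iterator returned by erase.
void erase_odd_values(std::map<int, int>& m) {
    for (std::map<int, int>::iterator it = m.begin(); it != m.end(); ) {
        if (it->second % 2 != 0) {
            it = m.erase(it);   // the element after the erased one, or m.end()
        } else {
            ++it;
        }
    }
}

void example() {
    std::map<int, int> m;
    for (int i = 0; i < 5; ++i)
        m[i] = i;
    erase_odd_values(m);
    assert(m.size() == 3);                  // keys 0, 2 and 4 remain
    assert(m.count(1) == 0 && m.count(3) == 0);
}
```

The loop never passes end() to the single-iterator erase, because the loop condition rules that out before the body runs.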
When you use two iterators to specify a range, the range consists of the elements from the element that the first iterator points to up to but not including the element that the second iterator points to. So erase(it, myMap.end()) says to erase everything from it up to but not including end(). You could equally well pass an iterator that points to a "real" element as the second one, and the element that that iterator points to would not be erased.
When you use erase(it) it says to erase the element that it points to. The end() iterator does not point to a valid element, so erase(end()) doesn't do anything sensible. It would be possible for the library to diagnose this situation, and a debugging library will do that, but it imposes a cost on every call to erase to check what the iterator points to. The standard library doesn't impose that cost on users. You're on your own. <g>
Where does the C++ standard declare that the pair of iterators passed to std::vector::insert must not overlap the original sequence?
Edit: To elaborate, I'm pretty sure that the standard does not require the standard library to handle situations like this:
std::vector<int> v(10);
std::vector<int>::iterator first = v.begin() + 5;
std::vector<int>::iterator last = v.begin() + 8;
v.insert(v.begin() + 2, first, last);
However, I was unable to find anything in the standard, that would prohibit the ranges [first, last) and [v.begin(), v.end()) to overlap.
23.1.1/4 Sequence requirements has:
expression: a.insert(p,i,j)
return type: void
precondition: i and j are not iterators into a.
effect: inserts copies of elements in [i,j) before p.
So i and j cannot be iterators into your vector.
It makes sense, as during the insert operation, the vector may need to resize itself, and so the existing elements may first be copied to a new memory location (thereby invalidating the current iterators).
Consider the behavior if it was allowed. Every insert into the vector would both increase the distance between the start and end iterator by one and move the start iterator up one. Therefore the start iterator would never reach the end iterator and the algorithm would execute until an out of memory exception occurred.
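One way to stay within that precondition, sketched under the assumption that an extra copy is acceptable, is to copy the source range into a temporary vector first, so the iterators handed to insert do not refer into the target vector. insert_own_range is a hypothetical helper taking positions as indices.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<int>::size_type index_type;
typedef std::vector<int>::difference_type offset_type;

// Inserts a copy of the elements at positions [from, to) of `v` before position `pos`.
void insert_own_range(std::vector<int>& v, index_type pos, index_type from, index_type to) {
    assert(from <= to && to <= v.size() && pos <= v.size());
    // Copy the source elements out first: tmp's iterators are not iterators into v.
    const std::vector<int> tmp(v.begin() + static_cast<offset_type>(from),
                               v.begin() + static_cast<offset_type>(to));
    v.insert(v.begin() + static_cast<offset_type>(pos), tmp.begin(), tmp.end());
}

void example() {
    std::vector<int> v;
    for (int i = 0; i < 10; ++i)
        v.push_back(i);
    insert_own_range(v, 2, 5, 8);   // same positions as in the question
    assert(v.size() == 13);
    assert(v[2] == 5 && v[3] == 6 && v[4] == 7);   // the inserted copies
    assert(v[5] == 2 && v[12] == 9);               // original elements shifted back
}
```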