I'm confused as to whether I should increment an OutputIterator when I set it. See the following code, in which I'm trying to split a string. My problem is with the third line of the while loop: the code seems to work fine whether I write *oit = ... or *oit++ = ...
Can someone explain to me why?
template<class O> void split(string& s, O oit){
string::iterator jt, it = s.begin();
while(1){
jt = find(it, s.end(), ' ');
if(it == s.end() && jt == s.end()) return;
*oit++ = string(it, jt);
it = jt;
if(it != s.end() ) it++;
}
}
...
int main(){
string s;
getline(cin, s);
vector<string> v;
split(s, back_inserter(v));
copy(v.begin(), v.end(), ostream_iterator<string>(cout, "\n"));
}
std::back_inserter creates an iterator that inserts by calling push_back on the underlying collection. That means the increment isn't necessary for it to work correctly.
The same is not necessarily true of other output iterators though, so for your code to be correct, it should perform the increment, even though in this particular case it's basically ignored.
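To make the difference concrete, here is a minimal sketch (the helper names are made up for illustration) that writes three values through an output iterator, once without the increment and once with it. With std::back_inserter both variants append three elements, but with a plain vector iterator the version without the increment keeps overwriting the first element:

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: writes 1, 2, 3 WITHOUT incrementing the iterator.
template <class OutIt>
void write_without_increment(OutIt out) {
    *out = 1;
    *out = 2;
    *out = 3;
}

// Hypothetical helper: writes 1, 2, 3 and increments after each write,
// as the output iterator requirements expect.
template <class OutIt>
void write_with_increment(OutIt out) {
    *out++ = 1;
    *out++ = 2;
    *out++ = 3;
}

void demo_output_iterators() {
    std::vector<int> appended;
    write_without_increment(std::back_inserter(appended));
    assert(appended.size() == 3);      // push_back still ran three times

    std::vector<int> overwritten(3, 0);
    write_without_increment(overwritten.begin());
    assert(overwritten[0] == 3);       // every write landed on element 0
    assert(overwritten[1] == 0);       // the rest were never touched

    std::vector<int> filled(3, 0);
    write_with_increment(filled.begin());
    assert(filled[0] == 1 && filled[1] == 2 && filled[2] == 3);
}
```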
Just for what it's worth, in this particular case you can get the same basic effect with a lot less/simpler code:
string s;
getline(cin, s);
replace(s.begin(), s.end(), ' ', '\n');
cout << s;
The concept requires that you increment an output iterator for each write you do. Although the std::back_insert_iterator<T> may call push_back() on the corresponding object on each assignment to *it, the concept still demands that the increment operator is called. In principle, output iterators could be function calls but to fit into the same interface used also by pointers, they need to support the same operations.
The specification of std::back_insert_iterator<Cont> states that each assignment of a typename Cont::value_type calls cont.push_back(value) on the underlying type, operator*() just returns the iterator itself, as does operator++().
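As a rough illustration of that description, a stripped-down stand-in could look like the sketch below. The class name simple_back_inserter is hypothetical, and it leaves out the iterator typedefs the real std::back_insert_iterator provides, so it is only meant to show where the work happens:

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical stand-in for std::back_insert_iterator.
template <class Cont>
class simple_back_inserter {
public:
    explicit simple_back_inserter(Cont& c) : cont_(&c) {}

    // Assigning a value is where the real work happens.
    simple_back_inserter& operator=(const typename Cont::value_type& value) {
        cont_->push_back(value);
        return *this;
    }

    // Dereference and both increments do nothing except hand back the iterator.
    simple_back_inserter& operator*() { return *this; }
    simple_back_inserter& operator++() { return *this; }
    simple_back_inserter operator++(int) { return *this; }

private:
    Cont* cont_;
};

std::vector<int> demo_simple_back_inserter() {
    std::vector<int> v;
    simple_back_inserter<std::vector<int> > out(v);
    *out = 1;      // push_back(1)
    *out++ = 2;    // push_back(2); the increment itself does nothing
    ++out;         // no effect
    *out = 3;      // push_back(3)
    return v;
}
```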
Both parts work because standard iterators are designed to be functionally equivalent to raw pointers when used with generic programming. When raw pointers are used, they must be incremented to reach the subsequent address.
Johannes Schaub claims here
always use the prefix increment form for iterators whose definitions
you don't know. That will ensure your code runs as generic as
possible.
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
The iteration statement is always executed at the end of each iteration. That is regardless of the type of increment operator you use, or whether you use an increment operator at all.
The result of the iteration statement expression is not used, so it has no effect on how the loop behaves. The statement:
++it;
Is functionally equivalent to the statement:
it++;
Postfix and prefix increment expressions have different behaviour only when the result of the expression is used.
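A small sketch of the case where the result is used (the function name is just for illustration): dereferencing the result of a postfix increment reads the old position, while dereferencing the result of a prefix increment reads the new one.

```cpp
#include <cassert>
#include <vector>

void demo_prefix_vs_postfix() {
    std::vector<int> v;
    v.push_back(10);
    v.push_back(20);
    v.push_back(30);

    std::vector<int>::iterator a = v.begin();
    int from_postfix = *a++;   // reads v[0], then a moves on to v[1]
    assert(from_postfix == 10);
    assert(*a == 20);

    std::vector<int>::iterator b = v.begin();
    int from_prefix = *++b;    // b moves to v[1] first, then that is read
    assert(from_prefix == 20);
    assert(*b == 20);
}
```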
Why use the prefix increment form for iterators?
Because the postfix operation implies a copy. Copying an iterator is never faster than not copying it, and is potentially slower.
A typical implementation of postfix increment:
iterator tmp(*this); // copy
++(*this); // prefix increment
return tmp; // return copy of the temporary
// (this copy can be elided by NRVO)
When the result is not used, even the first copy can be optimized away but only if the operation is expanded inline. But that is not guaranteed.
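To observe the copy, one could instrument a toy iterator so that it counts its own copy constructions. The type CountingIter below is purely hypothetical. Because its copy constructor has a visible side effect, the copy into tmp has to happen, while the copy on return may or may not be elided:

```cpp
#include <cassert>

// Hypothetical toy iterator over ints that counts its own copies.
struct CountingIter {
    static int copies;
    const int* p;

    explicit CountingIter(const int* ptr) : p(ptr) {}
    CountingIter(const CountingIter& other) : p(other.p) { ++copies; }

    CountingIter& operator++() { ++p; return *this; }   // prefix: no copy
    CountingIter operator++(int) {                      // postfix: copies
        CountingIter tmp(*this);
        ++(*this);
        return tmp;
    }
    int operator*() const { return *p; }
};
int CountingIter::copies = 0;

// Number of copies made by a single prefix increment.
int copies_made_by_prefix() {
    int data[] = {1, 2, 3};
    CountingIter it(data);
    CountingIter::copies = 0;
    ++it;
    return CountingIter::copies;
}

// Number of copies made by a single postfix increment (result discarded).
int copies_made_by_postfix() {
    int data[] = {1, 2, 3};
    CountingIter it(data);
    CountingIter::copies = 0;
    it++;
    return CountingIter::copies;
}
```

For a real iterator without such side effects, an optimizing compiler that inlines the call is free to drop the unused copy entirely, which is the point made above.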
I wouldn't blindly use the rule "always use prefix increment with iterators". Some algorithms are clearer to express with postfix, although that is just my opinion. An example of an algorithm suitable for postfix increment:
template<class InIter, class OutIter>
OutIter copy(InIter first, InIter last, OutIter out) {
while(first != last)
*out++ = *first++;
return out;
}
Note that your code is equivalent to
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ) {
/* std::cout << *it; ... */
++it;
}
and it should be readily apparent that it doesn't matter if you write ++it; or it++;. (This also addresses your final point.)
But conceptually it++ needs to store, in its implementation, a copy of the unincremented value, as that is what the expression evaluates to.
it might be a big heavy object of which taking a value copy is computationally expensive, and your compiler might not be able to optimise away that implicit value copy taken by it++.
These days, for most containers, a compiler will optimise the arguably clearer it++ to ++it if the value of the expression is not used; i.e. the generated code will be identical.
I follow the author's advice and always use the pre-increment whenever possible, but I am (i) old fashioned and (ii) aware that plenty of expert programmers don't, so it's largely down to personal choice.
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
Because the for loop is defined to be equivalent to:
{
init_statement
while ( condition ) {
statement
iteration_expression ;
}
}
So
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
is equivalent to
{
std::vector<T>::iterator it = v.begin();
while ( it != v.end() ) {
/* std::cout << *it; ... */
++it ;
}
}
That means the loop body runs first with it at v.begin(), and only then is it advanced. Prefix increment increases the value and then returns a reference to the incremented object. As you can see, the returned object is not used at all in this case, so ++it and it++ lead to the same result.
The code below comes from an answer to this question on string splitting. It uses pointers, and a comment on that answer suggested it could be adapted for std::string. How can I use the features of std::string to implement the same algorithm, for example using iterators?
#include <vector>
#include <string>
using namespace std;
vector<string> split(const char *str, char c = ',')
{
vector<string> result;
do
{
const char *begin = str;
while(*str != c && *str)
str++;
result.push_back(string(begin, str));
} while (0 != *str++);
return result;
}
Ok so I obviously replaced char by string but then I noticed he is using a pointer to the beginning of the character. Is that even possible for strings? How do the loop termination criteria change? Is there anything else I need to worry about when making this change?
You can use iterators instead of pointers. Iterators provide a way to traverse containers, and can usually be thought of as analogous to pointers.
In this case, you can use the begin() member function (or cbegin() if you don't need to modify the elements) of a std::string object to obtain an iterator that references the first character, and the end() (or cend()) member function to obtain an iterator for "one-past-the-end".
For the inner loop, your termination criterion is the same; you want to stop when you hit the delimiter on which you'll be splitting the string. For the outer loop, instead of comparing the character value against '\0', you can compare the iterator against the end iterator you already obtained from the end() member function. The rest of the algorithm is pretty similar; iterators work like pointers in terms of dereference and increment:
std::vector<std::string> split(const std::string& str, const char delim = ',') {
std::vector<std::string> result;
auto end = str.cend();
auto iter = str.cbegin();
while (iter != end) {
auto begin = iter;
while (iter != end && *iter != delim) ++iter;
result.push_back(std::string(begin, iter));
if (iter != end) ++iter; // See note (**) below.
}
return result;
}
Note the subtle difference in the inner loop condition: it now tests whether we've hit the end before trying to dereference. This is because we can't dereference an iterator that points to the end of a container, so we must check this before trying to dereference. The original algorithm assumes that a null character ends the string, so we're ok to dereference a pointer to that position.
(**) The validity of iter++ != end when iter is already end is under discussion in Are end+1 iterators for std::string allowed?
I've added this if statement to the original algorithm to break the loop when iter reaches end in the inner loop. This avoids adding one to an iterator which is already the end iterator, and avoids the potential problem.
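If the goal is simply to use std::string's own features, another option is its find and substr members instead of iterators. The sketch below is an alternative, not a translation of the code above, and the name split_find is made up. One difference worth knowing: like the original pointer version, it keeps a trailing empty field (for example "a," yields "a" and ""), whereas the iterator version above drops it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical alternative: split using std::string::find and substr.
std::vector<std::string> split_find(const std::string& str, char delim = ',') {
    std::vector<std::string> result;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type pos = str.find(delim, start);
        if (pos == std::string::npos) {
            // No more delimiters: the rest of the string is the last field.
            result.push_back(str.substr(start));
            break;
        }
        result.push_back(str.substr(start, pos - start));
        start = pos + 1;   // continue just past the delimiter
    }
    return result;
}
```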
this is my homework:
Write a function that prints all strings with a length of 3. Your
solution must use a for loop with iterators.
void print3(const set<string>& str)
And this is my code:
void print3(const set<string>& str){
string st;
set<string,less<string>>::iterator iter;
for(iter=str.begin();iter!=str.end();++iter)
{st=*iter;
if(st.length()==3) cout<<st<<' ';
}
}
But I think it's not good. Does someone have better code? Please help me improve it.
I have another question about iterators:
string name[]={"halohg","nui","ght","jiunji"};
set<string> nameSet(name,name+4);
set<string>::iterator iter;
iter=nameSet.begin();
How can I access name[2]="ght" by using iterator?
I tried iter+2 but it has some problems. I think I have to use random access iterator but I don't know how to use it.
Please, help me. Thanks a lot!
Some thoughts on improvement:
You can get rid of string st; and just check if (iter->length() == 3).
Another improvement would be to use a const_iterator instead of an iterator, since you aren't modifying any of the items.
Also, adding less<string> as a template parameter is kind of useless, since that's the default compare functor anyway, so it can be removed.
And lastly, it's generally a good idea to declare your locals with minimal scope (so they don't pollute other scopes or introduce unexpected hiding issues), so usually you want to declare your iter in the for.
So it becomes:
for (set<string>::const_iterator iter = str.begin(); iter != str.end(); ++iter) {
if (iter->length() == 3) cout << *iter << ' ';
}
That's about as good as you can get, given your requirements.
As for your second question, set's iterator is not a random access iterator. It's a (constant) Bidirectional Iterator. You can use std::advance if you wanted, though, and do:
std::set<std::string>::iterator iter = nameSet.begin();
std::advance(iter, 2);
// iter now points to the third element (position 2) in sorted order
Just remember that set sorts its elements, so that position does not necessarily hold name[2].
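A short sketch of what that means for the data in the question (the helper names are hypothetical). The set stores the names in sorted order, "ght", "halohg", "jiunji", "nui", so advancing by 2 lands on "jiunji" rather than on name[2]. To reach a known value such as "ght", find is usually the better tool:

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Builds the set from the question's array.
std::set<std::string> make_name_set() {
    const std::string name[] = {"halohg", "nui", "ght", "jiunji"};
    return std::set<std::string>(name, name + 4);
}

// Returns the element at sorted position n (0-based).
// Assumes 0 <= n < s.size(); the walk is linear because set iterators
// are bidirectional, not random access.
std::string nth_in_sorted_order(const std::set<std::string>& s, int n) {
    std::set<std::string>::const_iterator it = s.begin();
    std::advance(it, n);
    return *it;
}

// Looks a value up directly instead of counting positions.
bool contains(const std::set<std::string>& s, const std::string& value) {
    return s.find(value) != s.end();
}
```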
I am looking for a standard library equivalent of this code for accumulating elements of an std container into a string with a delimiter separating consecutive entries:
string accumulate_with_delimiter( vector<string> strvect, string delimiter )
{
string answer;
for( vector<string>::const_iterator it = strvect.begin(); it != strvect.end(); ++it )
{
answer += *it;
if( it + 1 != strvect.end() )
{
answer += delimiter;
}
}
return answer;
}
Such code seems to be very common: printing out an array with delimiter " ", or saving into a CSV file with delimiter ",", etc. Therefore it's likely that a piece of code like that made its way into a standard library. std::accumulate comes close, but doesn't have a delimiter.
I don't think the standard C++ library has a nice approach to delimiting sequences. I typically end up using something like
std::ostringstream out;
if (!container.empty()) {
auto end(container.end());
std::copy(container.begin(), --end, std::ostream_iterator<T>(out, ", "));
out << *end;
}
Using std::accumulate() has a similar problem, although with the first element rather than the last. Using a custom add function, you could use it something like this:
std::string concat;
if (!container.empty()) {
auto begin(container.begin());
concat = std::accumulate(++begin, container.end(), container.front(),
[](std::string f, std::string s) { return f + ", " + s; });
}
In both cases the iterators need to be moved to another element. The code stores the iterators in named local variables before moving them, rather than modifying the result of begin() or end() directly, because the container may use pointers as iterators, in which case a pre-increment or pre-decrement on the temporary returned by begin() or end() doesn't compile.
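Since C++11 there is also std::next (and std::prev), which returns an advanced copy and leaves its argument alone, so it works on the result of begin() even when the iterator is a raw pointer. A sketch of the accumulate variant written that way, with a made-up function name:

```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

// Joins the strings with ", " using std::accumulate. The first element
// seeds the result and accumulation starts from the second element.
std::string join_with_accumulate(const std::vector<std::string>& container) {
    if (container.empty()) return std::string();
    return std::accumulate(std::next(container.begin()), container.end(),
                           container.front(),
                           [](const std::string& acc, const std::string& s) {
                               return acc + ", " + s;
                           });
}
```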
std::accumulate might be the correct answer, but you need the version which takes a custom adder. You can then provide your own lambda.
Remember to pass front() as the first value to accumulate, and start adding at begin() + 1. And test for empty vectors first of course.
I'm not sure if there is one in the recent Standard Library or not, but there is always boost::algorithm::join(strvec, delimiter).
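If Boost is not an option, a small hand-rolled helper covers most cases. The sketch below is one possible shape (the name join_range is hypothetical): it streams the first element on its own and then writes the delimiter before each remaining element, which avoids both the "last element" and the "first element" special cases at the call site.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: streams each element of [first, last) into a string,
// writing the delimiter before every element except the first.
template <class InIt>
std::string join_range(InIt first, InIt last, const std::string& delimiter) {
    std::ostringstream out;
    if (first != last) {
        out << *first;
        ++first;
    }
    for (; first != last; ++first) {
        out << delimiter << *first;
    }
    return out.str();
}
```

Because it only needs operator<< on the element type, the same helper works for numbers as well as strings.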
I'm working with iterators on C++ and I'm having some trouble here. It says "Debug Assertion Failed" on expression (this->_Has_container()) on line interIterator++.
distanceList is a vector< vector< DistanceNode > >. What am I doing wrong?
vector< vector<DistanceNode> >::iterator externIterator = distanceList.begin();
while (externIterator != distanceList.end()) {
vector<DistanceNode>::iterator interIterator = externIterator->begin();
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
//remove element pointed by interIterator
externIterator->erase(interIterator);
} // if
interIterator++;
} // while
externIterator++;
} // while
vector's erase() returns a new iterator to the next element. All iterators to the erased element and to elements after it become invalidated. Your loop ignores this, however, and continues to use interIterator.
Your code should look something like this:
if (condition)
interIterator = externIterator->erase(interIterator);
else
++interIterator; // (generally better practice to use pre-increment)
You can't remove elements from a sequence container while iterating over it, at least not the way you are doing it, because calling erase invalidates the iterator. You should assign the return value from erase to the iterator and suppress the increment:
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
interIterator = externIterator->erase(interIterator);
} else {
++interIterator;
}
}
Also, never use post-increment (i++) when pre-increment (++i) will do.
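For reference, here is the whole nested loop as a self-contained sketch, using plain int elements in place of DistanceNode (whose definition isn't shown in the question) and a made-up function name:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's loop: removes every element
// equal to `target` from each inner vector.
void erase_matching(std::vector<std::vector<int> >& lists, int target) {
    for (std::vector<std::vector<int> >::iterator outer = lists.begin();
         outer != lists.end(); ++outer) {
        std::vector<int>::iterator inner = outer->begin();
        while (inner != outer->end()) {
            if (*inner == target) {
                // erase returns a valid iterator to the element after the
                // erased one, so no increment is needed on this path.
                inner = outer->erase(inner);
            } else {
                ++inner;
            }
        }
    }
}
```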
I'll take the liberty to rewrite the code:
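// std::unary_function takes <Argument, Result>, in that order. It is
// deprecated in C++11 and removed in C++17, where the base can be dropped.
class ByReference: public std::unary_function<DistanceNode, bool>
{
public:
    explicit ByReference(const Reference& r): mReference(r) {}
    bool operator()(const DistanceNode& node) const
    {
        return node.getReference() == mReference;
    }
private:
    Reference mReference;
};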
typedef std::vector< std::vector< DistanceNode > >::iterator iterator_t;
for (iterator_t it = dl.begin(), end = dl.end(); it != end; ++it)
{
it->erase(
std::remove_if(it->begin(), it->end(), ByReference(tmp)),
it->end()
);
}
Why ?
The first loop (externIterator) iterates over a full range of elements without ever modifying the range itself, it's what a for is for, this way you won't forget to increment (admittedly a for_each would be better, but the syntax can be awkward)
The second loop is tricky: simply speaking, you're cutting the branch you're sitting on when you call erase, which requires jumping around (using the value returned). In this case the operation you want to accomplish (purging the list according to a certain criterion) is exactly what the erase-remove idiom is tailored for.
Note that the code could be tidied up if we had true lambda support at our disposal. In C++0x we would write:
std::for_each(distanceList.begin(), distanceList.end(),
    [&tmp](std::vector<DistanceNode>& vec)
    {
        vec.erase(
            std::remove_if(vec.begin(), vec.end(),
                [&tmp](const DistanceNode& dn) { return dn.getReference() == tmp; }
            ),
            vec.end()
        );
    }
);
As you can see, we don't see any iterator incrementing / dereferencing taking place any longer, it's all wrapped in dedicated algorithms which ensure that everything is handled appropriately.
I'll grant you the syntax looks strange, but I guess it's because we are not used to it yet.
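For anyone who wants to try the idiom in isolation, here is a self-contained C++11 sketch. The DistanceNode below is a hypothetical minimal stand-in (the real class isn't shown in the question), the reference is assumed to be an int, and a range-based for replaces std::for_each for the outer loop:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal DistanceNode; the real class is not shown in the question.
struct DistanceNode {
    int reference;
    int getReference() const { return reference; }
};

// Removes every node whose reference equals tmp from each inner vector.
void purge(std::vector<std::vector<DistanceNode> >& distanceList, int tmp) {
    for (std::vector<DistanceNode>& vec : distanceList) {
        // remove_if shifts the kept elements to the front and returns the
        // new logical end; erase then chops off the leftover tail.
        vec.erase(std::remove_if(vec.begin(), vec.end(),
                                 [&tmp](const DistanceNode& dn) {
                                     return dn.getReference() == tmp;
                                 }),
                  vec.end());
    }
}
```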