Unexpected behaviour of for_each with transform - c++

When I uncomment the commented line with std::transform then the above for_each won't print anything. The for_each below also does not print anything. I thought the code would take the elements from v, increase them and insert them into v2.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
void print(const int& what){
    cout<<what<<" ";
}
int change(const int& from){
    return from+1;
}
int main() {
    vector<int> v(5,10);
    vector<int> v2;
    for_each(v.begin(),v.end(),print);
    //transform(v.begin(),v.end(),v2.begin(),change);
    for_each(v2.begin(),v2.end(),print);
    return 0;
}

Your second collection is empty. To insert items into it, you'd want to use a std::back_inserter:
transform(v.begin(), v.end(), back_inserter(v2), change);
Note, however, that for_each isn't really the optimal choice here either. If you're going to use a standard algorithm, I'd recommend copying to an ostream_iterator:
copy(v.begin(), v.end(), ostream_iterator<int>(cout, " "));
transform(v.begin(), v.end(), back_inserter(v2), change);
copy(v2.begin(), v2.end(), ostream_iterator<int>(cout, " "));
If all you really want to do is add one to each item in the input, you may find it easier to use something like std::plus instead of writing that code for yourself (or, if you have C++11 available, you could use a lambda).
To answer the question you actually asked (why none of it really works when you do your transform as it was): you had undefined behavior attempting to access the vector outside its current bounds. As such, any behavior is allowed. It's often a little hard to see how UB later in a program could affect behavior of code before the UB has actually been invoked, but the standard is quite explicit in allowing that. Some compilers take advantage of this to enable optimizations (for example) that wouldn't be (or might not be) possible otherwise.

The destination range for std::transform must be large enough to hold the results. Your v2 is an empty vector. You could call v2.resize(v.size()) before calling std::transform or use back_inserter like so:
transform(v.begin(),v.end(),back_inserter(v2),change);

Related

Disturbing order of evaluation

When I work with my favorite containers, I tend to chain operations. For instance, in the well-known Erase–remove idiom:
v.erase( std::remove_if(v.begin(), v.end(), is_odd), v.end() );
From what I know of the order of evaluation, v.end() (on the rhs) might be evaluated before the call to std::remove_if. This is not a problem here since std::remove* only shuffle the vector without changing its end iterator.
But it could lead to really surprising constructs, for instance:
#include <iostream>
struct Data
{
    int v;
    int value() const { return v; }
};
auto inc(Data& data) { return ++data.v; }
void print_rhs(int, int value) { std::cout << value << '\n'; }
int main()
{
    Data data{0};
    print_rhs(inc(data), data.value()); // might print 0
}
This is surprising since print_rhs is called after inc has been called; which means data.v is 1 when print_rhs is called. Nevertheless, since data.value() might be evaluated before, 0 is a possible output.
I think it would be a nice improvement if the order of evaluation were less surprising; in particular, if the arguments of a function with side effects were evaluated before those without.
My questions are then:
Has that change ever been discussed or suggested in a C++ committee?
Do you see any problem it could bring?
Has that change ever been discussed or suggested in a C++ committee?
Probably.
Do you see any problem it could bring?
Yes. It could reduce optimization opportunities which exist today, and brings no direct benefit other than the ability to write more one-liners. But one-liners are not a good thing anyway, so this proposal would probably never get past -99 points.

Is there a better alternative to std::remove_if to remove elements from a vector?

The task of removing elements with a certain property from a std::vector or other container lends itself to a functional style implementation: Why bother with loops, memory deallocation and moving data around correctly?
However the standard way of doing this in C++ seems to be the following idiom:
std::vector<int> ints;
...
ints.erase(
    std::remove_if(ints.begin(),
                   ints.end(),
                   [](int x){return x < 0;}),
    ints.end());
This example removes all elements less than zero from an integer vector.
I find it not only ugly but also easy to use incorrectly. It is clear that std::remove_if cannot change the size of the vector (as its name would suggest) because it only gets iterators passed. But many developers, including myself, don't get that in the beginning.
So is there a safer and hopefully more elegant way to achieve this? If not, why?
I find it not only ugly but also easy to use incorrectly.
Don't worry, we all did at the start.
It is clear that std::remove_if cannot change the size of the vector (as its name would suggest) because it only gets iterators passed. But many developers, including myself, don't get that in the beginning.
Same. It confuses everyone. It probably shouldn't have been called remove_if all those years ago. Hindsight, eh?
So is there a safer and hopefully more elegant way to achieve this?
No
If not, why?
Because this is the safest, most elegant way that preserves performance when deleting items from a container in which deleting an item invalidates iterators.
Anticipating the follow-up question:
Anything I can do?
Yes, wrap this idiom into a function
template<class Container, class F>
auto erase_where(Container& c, F&& f)
{
    return c.erase(std::remove_if(c.begin(),
                                  c.end(),
                                  std::forward<F>(f)),
                   c.end());
}
The call in the motivating example then becomes:
auto is_negative = [](int x){return x < 0;};
erase_where(ints, is_negative);
or
erase_where(ints, [](int x){return x < 0;});
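This idiom is also packaged as the std::experimental::erase_if algorithm from the Library Fundamentals TS, which some standard libraries ship in <experimental/vector> (C++20 later standardized it as std::erase_if):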
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
#include <experimental/vector>
int main()
{
    std::vector<int> ints { -1, 0, 1 };
    std::experimental::erase_if(ints, [](int x){
        return x < 0;
    });
    std::copy(ints.begin(), ints.end(), std::ostream_iterator<int>(std::cout, ","));
}
This prints 0,1, (the trailing comma comes from the ostream_iterator delimiter).

Replace 'for loop' using std::for_each

I have a for loop in the below code and I would like to implement it using std::for_each. I have implemented it. Could someone please tell me if that is the best way to do it using std::for_each? If not, could you please suggest the right one?
#include <vector>
#include <cstdint>
#include <string>
#include <algorithm>
#include <iostream>
#include <sstream>
int main()
{
    std::vector<std::uint32_t> nums{3, 4, 2, 8, 15};
    std::stringstream list1;
    for (auto n : nums)
    {
        list1 << n<<",";
    }
    //Is this the right way to do using std::for_each so that above for loop can be done in 1 line??
    std::for_each(nums.begin(),nums.end(),[&list1](std::uint32_t n){ list1 << n << ","; });
}
Yes, your use of for_each is a reasonable analog of the preceding loop.
I feel obliged to point out, however, that I find for_each probably the least useful algorithm in the library. From what I've seen, using it generally indicates that you're still basically thinking in terms of loops, and just changing the syntax you use for those loops. I also think that range-based for loops have probably eliminated at least 90% of the (already few) legitimate uses there used to be for for_each.
In this case, your code is really imitating using std::copy with an std::ostream_iterator:
std::copy(nums.begin(), nums.end(),
std::ostream_iterator<std::uint32_t>(std::cout, ","));
Even this, however, is clumsy enough that I think it's open to question whether it's really an improvement over a range-based for loop.
Why don't you just test it out?
If you compare the compiler output for the range-based for loop (with auto) against the std::for_each version, the generated assembly is the same for both. It just doesn't make any difference for your example.
If you want to copy the data from one thing to another you can use std::copy
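#include <algorithm>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>
int main()
{
    std::vector<std::uint32_t> nums{3, 4, 2, 8, 15};
    std::stringstream list1;
    std::copy(nums.begin(), nums.end(),std::ostream_iterator<std::uint32_t>(list1,","));
    std::cout << list1.str();
}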
This will end the stream with a trailing comma, but that is the same thing you get in your code.
If you do not want this then you should look at Pretty-print C++ STL containers
Yes, that is right. There is no need for the reference in (std::uint32_t &n) if you had performance as the motive.

What is the difference between std::transform and std::for_each?

Both can be used to apply a function to a range of elements.
On a high level:
std::for_each ignores the return value of the function, and guarantees order of execution.
std::transform assigns the return value to the iterator, and does not guarantee the order of execution.
When do you prefer using the one versus the other? Are there any subtle caveats?
std::transform is the same as map. The idea is to apply a function to each element between the two iterators and obtain a different container composed of the results of applying that function. You may want to use it for, e.g., projecting an object's data member into a new container. In the following, std::transform is used to transform a container of std::strings into a container of std::size_ts.
std::vector<std::string> names = {"hi", "test", "foo"};
std::vector<std::size_t> name_sizes;
std::transform(names.begin(), names.end(), std::back_inserter(name_sizes), [](const std::string& name) { return name.size();});
On the other hand, you execute std::for_each for the sole side effects. In other words, std::for_each closely resembles a plain range-based for loop.
Back to the string example:
std::for_each(name_sizes.begin(), name_sizes.end(), [](std::size_t name_size) {
    std::cout << name_size << std::endl;
});
Indeed, starting from C++11 the same can be achieved with a terser notation using range-based for loops:
for (std::size_t name_size: name_sizes) {
    std::cout << name_size << std::endl;
}
Your high level overview
std::for_each ignores the return value of the function and guarantees order of execution.
std::transform assigns the return value to the iterator, and does not guarantee the order of execution.
pretty much covers it.
Another way of looking at it (to prefer one over the other);
Do the results (the return value) of the operation matter?
Is the operation on each element a member method with no return value?
Are there two input ranges?
One more thing to bear in mind (subtle caveat) is the change in the requirements of the operations of std::transform before and after C++11 (from en.cppreference.com);
Before C++11, they were required to "not have any side effects",
After C++11, this changed to "must not invalidate any iterators, including the end iterators, or modify any elements of the ranges involved"
Basically these were to allow the undetermined order of execution.
When do I use one over the other?
If I want to manipulate each element in a range, then I use for_each. If I have to calculate something from each element, then I would use transform. When using the for_each and transform, I normally pair them with a lambda.
That said, I find my current usage of the traditional for_each being diminished somewhat since the advent of the range based for loops and lambdas in C++11 (for (element : range)). I find its syntax and implementation very natural (but your mileage here will vary) and a more intuitive fit for some use cases.
Although the question has been answered, I believe that this example would clarify the difference further.
for_each is classed among the non-modifying STL operations, meaning that the algorithm itself does not change the elements of the collection or the collection itself. Therefore, the value returned by the function passed to for_each is always ignored and is not assigned to a collection element.
Nonetheless, it is still possible to modify elements of collection, for example when an element is passed to the f function using reference. One should avoid such behavior as it is not consistent with STL principles.
In contrast, transform belongs to the modifying STL operations: it applies the given operation (unary_op or binary_op) to the elements of the collection or collections and stores the results in a destination range, which may be another collection or the source itself.
#include <vector>
#include <iostream>
#include <algorithm>
#include <functional>
using namespace std;
void printer(int i) {
    cout << i << ", ";
}
int main() {
    int mynumbers[] = { 1, 2, 3, 4 };
    vector<int> v(mynumbers, mynumbers + 4);
    for_each(v.begin(), v.end(), negate<int>());//no effect as returned value of UnaryFunction negate() is ignored.
    for_each(v.begin(), v.end(), printer); //guarantees order
    cout << endl;
    transform(v.begin(), v.end(), v.begin(), negate<int>());//negates elements correctly
    for_each(v.begin(), v.end(), printer);
    return 0;
}
which will print:
1, 2, 3, 4,
-1, -2, -3, -4,
A real example of using std::transform is converting a string to uppercase. You can write code like this, where out is the destination string:
std::transform(s.begin(), s.end(), std::back_inserter(out), ::toupper);
If you try to achieve the same thing with std::for_each, like:
std::for_each(s.begin(), s.end(), ::toupper);
it won't convert the string to uppercase, because the value returned by ::toupper is discarded.

Remove elements from a c++ vector where the removal condition is dependent on other elements

The standard way to remove certain elements from a vector in C++ is the remove/erase idiom. However, the predicate passed to remove_if only takes the vector element under consideration as an argument. Is there a good STL way to do this if the predicate is conditional on other elements of the array?
To give a concrete example, consider removing all duplicates of a number immediately following it. Here the condition for removing the n-th element is conditional on the (n-1)-th element.
Before: 11234555111333
After: 1234513
There's a standard algorithm for this. std::unique will remove the elements that are duplicates of those preceding them (actually, just like remove_if it reorganizes the container so that the elements to be removed are gathered at the end of it).
Example on a std::string for simplicity:
#include <string>
#include <iostream>
#include <algorithm>
int main()
{
    std::string str = "11234555111333";
    str.erase(std::unique(str.begin(), str.end()), str.end());
    std::cout << str; // 1234513
}
Others mentioned std::unique already, for your specific example. Boost.Range has the adjacent_filtered adaptor, which passes both the current and the next element in the range to your predicate and is, thanks to the predicate, applicable to a larger range of problems. Boost.Range however also has the uniqued adaptor.
Another possibility would be to simply keep a reference to the range, which is easy to do with a lambda in C++11:
std::vector<T> v;
v.erase(std::remove_if(v.begin(), v.end(),
                       [&](T const& x){
                           // use v, with std::find for example
                       }), v.end());
In my opinion, it is easier to use a simple traversal (via a for loop) rather than std::bind. Of course, with std::bind you can use other functions and predicates (which depend on previous elements). But in your example, you can do it simply with std::unique.