std::transform with lambda: skip some items

I have some C++11 code like
std::vector<std::string> names;
std::map<std::string, std::string> first_to_last_name_map;
std::transform(names.begin(), names.end(), std::inserter(first_to_last_name_map, first_to_last_name_map.begin()), [](const std::string& i){
if (i == "bad")
return std::pair<std::string, std::string>("bad", "bad"); // Don't Want This
else
return std::pair<std::string, std::string>(i.substr(0,5), i.substr(5,5));
});
where I'm transforming a vector to a map using std::transform with a lambda function. My problem is that sometimes, as shown, I don't want to return anything from my lambda function, i.e. I basically want to skip that i and go to the next one (without adding anything to the map).
Is there any way to achieve what I'm thinking about? I can use boost if it helps. I want to avoid a solution where I have to do a pre-process or post-process on my vector to filter out the "bad" items; I should only need to look at each item once. Also, my actual logic is a bit more complicated than the if/else as written, so I think it would be nice to keep things encapsulated in this std::transform/lambda model if possible (though maybe what I'm trying to achieve isn't possible with this model).
EDIT: Just to emphasize, I'm looking to perform this operation (selectively processing vector elements and inserting them into a map) in the most efficient way possible, even if it means a less elegant solution or a big rewrite. I could even use a different map data type depending on what is most efficient.

template<class Src, class Sink, class F>
void transform_if(Src&& src, Sink&& sink, F&& f){
for(auto&& x:std::forward<Src>(src))
if(auto&& e=f(decltype(x)(x)))
*sink++ = *decltype(e)(e);
}
Now simply get a boost, std, or std::experimental optional, and have your f return an optional<blah>.
auto sink = std::inserter(first_to_last_name_map, first_to_last_name_map.begin());
using pair_type = decltype(first_to_last_name_map)::value_type;
transform_if(names, sink,
[](const std::string& i)->std::optional<pair_type>{
if (i == "bad")
return {}; // Don't Want This
else
return std::make_pair(i.substr(0,5), i.substr(5,5));
}
);
My personally preferred optional type also defines begin and end, which gives this algorithm:
template<class Src, class Sink, class F>
void polymap(Src&& src, Sink&& sink, F&& f){
for(auto&& x:std::forward<Src>(src))
for(auto&& e:f(decltype(x)(x)))
*sink++ = decltype(e)(e);
}
which now lets f return a range, where optional models a range of zero or one elements.
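For reference, here is a self-contained sketch of the transform_if approach using C++17's std::optional. The helper name split_names and the sample inputs are made up for illustration, and the sketch takes the source by const reference instead of forwarding it, which is a simplification of the version above.

```cpp
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Apply f to every element of src and write the result to sink
// only when f returns an engaged optional.
template <class Src, class Sink, class F>
void transform_if(const Src& src, Sink sink, F f) {
    for (const auto& x : src) {
        if (auto e = f(x)) {
            *sink++ = *std::move(e);
        }
    }
}

// Hypothetical helper: splits each name into (first 5 chars, next 5 chars)
// and skips the entry "bad". Assumes accepted names have at least 5 characters.
std::map<std::string, std::string> split_names(const std::vector<std::string>& names) {
    using pair_type = std::pair<std::string, std::string>;
    std::map<std::string, std::string> result;
    transform_if(names, std::inserter(result, result.end()),
                 [](const std::string& s) -> std::optional<pair_type> {
                     if (s == "bad") {
                         return std::nullopt;  // skip this element
                     }
                     return pair_type(s.substr(0, 5), s.substr(5, 5));
                 });
    return result;
}
```

Each element is visited exactly once, and nothing is inserted for elements where the lambda returns an empty optional.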

You can simply do a first pass with std::remove_if, e.g.:
std::vector<std::string> names;
std::map<std::string, std::string> first_to_last_name_map;
std::transform(names.begin(),
std::remove_if(names.begin(),
names.end(),
[](const std::string &str){
return str=="bad";
}),
std::inserter(first_to_last_name_map,
first_to_last_name_map.begin()),
[](const std::string& i){
return std::pair<std::string, std::string>(i.substr(0,5), i.substr(5,5));
});
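Note that remove_if does not erase anything: it moves the kept elements to the front of names and returns an iterator to the new logical end. The elements from that iterator onward are left in a valid but unspecified state, so names itself is modified by this call.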
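If names must stay untouched, a plain loop does the filtering and the insertion in a single pass without modifying the vector. This is only a sketch: build_name_map is a hypothetical name, and the substr calls assume every accepted name has at least five characters, as in the question.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: one pass over names, skipping "bad" entries,
// without modifying the input vector.
std::map<std::string, std::string> build_name_map(const std::vector<std::string>& names) {
    std::map<std::string, std::string> result;
    for (const auto& name : names) {
        if (name == "bad") {
            continue;  // nothing is inserted for this element
        }
        // Assumes name.size() >= 5; substr(5, 5) throws otherwise.
        result.emplace(name.substr(0, 5), name.substr(5, 5));
    }
    return result;
}
```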

You can use boost::adaptors::filtered to first filter the vector of the elements you don't want, before passing it to transform.
using boost::adaptors::filtered;
boost::transform(names | filtered([](std::string const& s) { return s != "bad"; }),
std::inserter(first_to_last_name_map, first_to_last_name_map.begin()),
[](std::string const& i) { return std::make_pair(i.substr(0,5), i.substr(5,5)); });

Related

constructing a std::vector using std::transform. Possibility to return unnamed result?

Let's have
class InputClass;
class OutputClass;
OutputClass const In2Out(InputClass const &in)
{
//conversion implemented
}
and finally
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
std::vector<OutputClass> res;
res.reserve(input.size());
//either
for (auto const &in : input)
res.emplace_back(In2Out(in));
return res;
//or something like
std::transform(input.begin(), input.end(), std::back_inserter(res), [](InputClass const &in){return In2Out(in);});
return res;
}
And now my question:
Can I rewrite the Convert function somehow avoiding the need to name the new container? I. e. is there a way to construct a vector directly using something roughly like std::transform or std::for_each?
As in (pseudocode, this unsurprisingly does not work or even build)
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
return std::transform(input.begin(), input.end(), std::back_inserter(std::vector<OutputClass>()), [](InputClass const &in){return In2Out(in);});
}
Searched, but did not find any elegant solution. Thanks!

Starting in C++20, you can use the new std::ranges::transform_view to accomplish what you want. It calls your transformation function for each element of the container it adapts, and you can pass the view's iterators to std::vector's iterator-range constructor, which can allocate the memory for the vector up front and then populate the elements. It still requires you to create a variable in the function, but it becomes much more streamlined. That would give you something like:
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
auto range = std::ranges::transform_view(input, In2Out);
return {range.begin(), range.end()};
}
Do note that this should optimize to the exact same code your function generates.

Yes it is possible, and quite simple when using boost:
struct A
{
};
struct B
{
};
std::vector<B> Convert(const std::vector<A> &input)
{
auto trans = [](const A&) { return B{}; };
return { boost::make_transform_iterator(input.begin(), trans), boost::make_transform_iterator(input.end(), trans) };
}
https://wandbox.org/permlink/ZSqt2SbsHeY8V0mt
But as others mentioned, this is weird and doesn't provide any gain (neither in performance nor in readability).

Can I rewrite the Convert function somehow avoiding the need to name the new container?
Not using just std::transform. std::transform itself never creates a container. It only inserts elements to an output iterator. And in order to both get output iterator to a container, and return the container later, you pretty much need a name (unless you allocate the container dynamically, which would be silly and inefficient).
You can of course write a function that uses std::transform, creates the (named) vector, and returns it. Then caller of that function doesn't need to care about that name. In fact, that's pretty much what your function Convert is.
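As an illustration of that last point, the named vector can be hidden once inside a small generic helper so that call sites never see it. This is a sketch only: transform_to_vector is a made-up name, not a standard or Boost function, and it is written against C++11.

```cpp
#include <algorithm>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper: applies f to every element of input and returns the
// results in a new vector. The output vector is named here, once, so that
// callers do not have to name one themselves.
template <class In, class F>
auto transform_to_vector(const std::vector<In>& input, F f)
    -> std::vector<typename std::decay<decltype(f(input.front()))>::type> {
    using Out = typename std::decay<decltype(f(input.front()))>::type;
    std::vector<Out> out;
    out.reserve(input.size());  // single allocation for the results
    std::transform(input.begin(), input.end(), std::back_inserter(out), f);
    return out;
}
```

With such a helper, the body of Convert could reduce to a single return statement, for example return transform_to_vector(input, In2Out);.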

How to use lambda for std::find_if

I am trying to use std::find_if to find an object that matches some criteria. Consider the following:
struct MyStruct
{
MyStruct(const int & id) : m_id(id) {}
int m_id;
};
...
std::vector<MyStruct> myVector; //... assume it contains things
MyStruct toFind(1);
std::vector<MyStruct>::iterator i = std::find_if(myVector.begin(), myVector.end(), ???);
I am not sure what to put in the ???
All the examples I have seen have a lambda that uses a hard-coded value to check for the ID. What I want is to return the iterator/success only if the id of toFind matches the id of one of the items in the vector.
None of the examples I have seen show me how to pass the two parameters.
EDIT
Additional info
There are two different scenarios I have to use this for
One in which there is an == operator for the struct
and another in which there is no operator == for the struct, and I can't create one because the criteria for finding a match in this scenario are not as rigid as those of an equality operator.
(And thanks to all who responded; I was able to use find() in one case and with your help was able to use find_if() for the other)

Try this:
std::find_if(
myVector.begin(), myVector.end(),
[&toFind](const MyStruct& x) { return x.m_id == toFind.m_id;});
Alternatively, if you had defined an appropriate == overload for MyStruct, you could just use find:
std::find(myVector.begin(), myVector.end(), toFind); // requires ==
The find_if version is usually best when you have some kind of heterogeneous lookup, for example if you were just given an int, not a value of MyStruct.
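A minimal sketch of that heterogeneous case, where only an int id is available and no MyStruct to compare against. The struct mirrors the one in the question; find_by_id is a hypothetical helper name.

```cpp
#include <algorithm>
#include <vector>

// Mirrors the struct from the question.
struct MyStruct {
    explicit MyStruct(int id) : m_id(id) {}
    int m_id;
};

// Hypothetical helper: returns an iterator to the first element whose m_id
// equals id, or v.end() if there is none. The id is captured by value.
std::vector<MyStruct>::const_iterator find_by_id(const std::vector<MyStruct>& v, int id) {
    return std::find_if(v.begin(), v.end(),
                        [id](const MyStruct& x) { return x.m_id == id; });
}
```

The capture list is what carries the second "parameter" into the predicate: find_if only ever passes the current element.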

This is where the lambda capture comes into play. Besides saying what type of parameters are to be passed to the lambda you can also say what existing variables are to be used to construct the lambda with. So in this case you would have something like
std::vector<MyStruct>::iterator i = std::find_if(myVector.begin(),
myVector.end(),
[&](const auto& val){ return val.m_id == toFind.m_id; } );
So the [&] says to capture by reference all variables used in the body of the lambda. The (const auto& val) parameter, which requires C++14, makes the lambda's operator() a template so that it accepts any type. Then in the body we compare what find_if passes in against toFind.

You may use the following:
MyStruct toFind(1);
std::vector<MyStruct>::iterator i =
std::find_if(myVector.begin(), myVector.end(),
[&](const auto& e) { return e.m_id == toFind.m_id; });

Do the following:
std::find_if(myVector.begin(), myVector.end(),
[&toFind] (const auto &ele) { return ele.m_id == toFind.m_id; });

Modern way to filter STL container?

Coming back to C++ after years of C# I was wondering what the modern - read: C++11 - way of filtering an array would be, i.e. how can we achieve something similar to this Linq query:
var filteredElements = elements.Where(elm => elm.filterProperty == true);
In order to filter a vector of elements (strings for the sake of this question)?
I sincerely hope the old STL style algorithms (or even extensions like boost::filter_iterator) requiring explicit methods to be defined are superseded by now?

See the example from cplusplus.com for std::copy_if:
std::vector<int> foo = {25,15,5,-5,-15};
std::vector<int> bar;
// copy only non-negative numbers:
std::copy_if (foo.begin(), foo.end(), std::back_inserter(bar), [](int i){return i>=0;} );
std::copy_if evaluates the lambda expression for every element in foo here and if it returns true it copies the value to bar.
The std::back_inserter allows us to actually insert new elements at the end of bar (using push_back()) with an iterator without having to resize it to the required size first.

In C++20, use filter view from the ranges library: (requires #include <ranges>)
// namespace views = std::ranges::views;
vec | views::filter([](int a){ return a % 2 == 0; })
lazily returns the even elements in vec.
(See [range.adaptor.object]/4 and [range.filter])
This is already supported by GCC 10. For Clang and older versions of GCC, the original range-v3 library can be used too, with #include <range/v3/view/filter.hpp> (or #include <range/v3/all.hpp>) and the ranges::views namespace instead of std::ranges::views.

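A more efficient approach, if you don't actually need a new copy of the list, is remove_if, which filters the original container in place. Note that std::remove_if only moves the kept elements to the front and returns the new logical end, so it must be paired with the container's erase member to actually shrink the container.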
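The in-place approach is usually written as the erase-remove idiom. A minimal sketch, with erase_negatives as a made-up helper name and the predicate chosen only for illustration:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: removes all negative values from v in place.
void erase_negatives(std::vector<int>& v) {
    // remove_if compacts the kept elements to the front and returns the new
    // logical end; erase then drops the leftover tail.
    v.erase(std::remove_if(v.begin(), v.end(), [](int i) { return i < 0; }),
            v.end());
}
```

The relative order of the kept elements is preserved.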

I think Boost.Range deserves a mention too. The resulting code is pretty close to the original:
#include <boost/range/adaptors.hpp>
// ...
using boost::adaptors::filtered;
auto filteredElements = elements | filtered([](decltype(elements)::value_type const& elm)
{ return elm.filterProperty == true; });
The only downside is having to explicitly declare the lambda's parameter type. I used decltype(elements)::value_type because it avoids having to spell out the exact type, and also adds a grain of genericity. Alternatively, with C++14's polymorphic lambdas, the type could be simply specified as auto:
auto filteredElements = elements | filtered([](auto const& elm)
{ return elm.filterProperty == true; });
filteredElements would be a range, suitable for traversal, but it's basically a view of the original container. If what you need is another container filled with copies of the elements satisfying the criteria (so that it's independent from the lifetime of the original container), it could look like:
using std::back_inserter; using boost::copy; using boost::adaptors::filtered;
decltype(elements) filteredElements;
copy(elements | filtered([](decltype(elements)::value_type const& elm)
{ return elm.filterProperty == true; }), back_inserter(filteredElements));

Improved pjm code following underscore-d suggestions:
template <typename Cont, typename Pred>
Cont filter(const Cont &container, Pred predicate) {
Cont result;
std::copy_if(container.begin(), container.end(), std::back_inserter(result), predicate);
return result;
}
Usage:
std::vector<int> myVec = {1,4,7,8,9,0};
auto filteredVec = filter(myVec, [](int a) { return a > 5; });

My suggestion for C++ equivalent of C#
var filteredElements = elements.Where(elm => elm.filterProperty == true);
Define a template function to which you pass a lambda predicate to do the filtering. The template function returns the filtered result. eg:
template<typename T>
vector<T> select_T(const vector<T>& inVec, function<bool(const T&)> predicate)
{
vector<T> result;
copy_if(inVec.begin(), inVec.end(), back_inserter(result), predicate);
return result;
}
To use it, here are some trivial examples:
std::vector<int> mVec = {1,4,7,8,9,0};
// keep only values > 5
auto gtFive = select_T<int>(mVec, [](auto a) {return (a > 5); });
// or > target
int target = 5;
auto gt = select_T<int>(mVec, [target](auto a) {return (a > target); });

C++/Boost Insert list items into map without manual loop

In a C++ program, I have a std::list and a std::map (although I am actually using boost::unordered_map) and I would like to know an elegant way of inserting all the elements in the list into the map. I would like the key to be the result of a method call on the elements in the list.
So for example I have:
std::list<Message> messages = *another list with elements*;
std::map<std::string, Message> message_map;
And I want to insert all the elements from the list into the map with the key being message.id(), for every message in messages.
Is there a way to do this without looping over the list and doing it manually? I can't use C++11 but I would still be interested in C++11 solutions for interest's sake. I am able to use boost.
Thank you.

A C++11 solution: You can use std::transform to transform from std::list elements to std::map elements:
std::transform(messages.begin(), messages.end(),
std::inserter(message_map, message_map.end()),
[](const Message& m) { return std::make_pair(m.id(), m); });
The equivalent can be done with C++03 by passing a function pointer instead of a lambda.
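A sketch of what that C++03 variant might look like. The Message class here is a stand-in invented for the example (the question does not show its definition), and to_map_entry and fill_map are hypothetical names.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <map>
#include <string>
#include <utility>

// Stand-in for the question's Message class (assumed interface: id()).
class Message {
public:
    Message() {}
    explicit Message(const std::string& id) : id_(id) {}
    std::string id() const { return id_; }
private:
    std::string id_;
};

// Free function used in place of the lambda: builds one map entry.
std::pair<std::string, Message> to_map_entry(const Message& m) {
    return std::make_pair(m.id(), m);
}

// Inserts every message into message_map, keyed by its id.
void fill_map(const std::list<Message>& messages,
              std::map<std::string, Message>& message_map) {
    std::transform(messages.begin(), messages.end(),
                   std::inserter(message_map, message_map.end()),
                   to_map_entry);  // the function name decays to a function pointer
}
```

As with any map insertion, a message whose id is already present in the map is not inserted a second time.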

Not sure if it is better than a for loop but you can use a functor
struct map_inserter {
std::map<std::string,Message>& t_map;
map_inserter(std::map<std::string,Message>& t_map) : t_map(t_map) {}
void operator()(const Message& m) {
t_map.insert(std::pair<std::string,Message>(m.id(),m));
}
};
You can use it like this:
std::map<std::string,Message> t_map;
std::for_each(messages.begin(), messages.end(), map_inserter(t_map));

If you can get iterators to the beginning and end of the list, you can use for_each().

You could use accumulate:
typedef std::map<std::string, Message> MessageMap;
MessageMap& addIdAndMessage(MessageMap& messageMap, const Message& message) {
messageMap[message.id()] = message;
return messageMap;
}
int main() {
std::list<Message> messages;
//...
MessageMap message_map =
std::accumulate(messages.begin(), messages.end(),
MessageMap(),
addIdAndMessage);
}

How can I return a copy of a vector containing elements not in a set?

Suppose I have the following two data structures:
std::vector<int> all_items;
std::set<int> bad_items;
The all_items vector contains all known items and the bad_items set contains the bad items. These two data structures are populated entirely independently of one another.
What's the proper way to write a method that will return a std::vector<int> containing all elements of all_items not in bad_items?
Currently, I have a clunky solution that I think can be done more concisely. My understanding of STL function adapters is lacking. Hence the question. My current solution is:
struct is_item_bad {
std::set<int> const* bad_items;
bool operator() (int const i) const {
return bad_items->count(i) > 0;
}
};
std::vector<int> items() const {
is_item_bad iib = { &bad_items };
std::vector<int> good_items(all_items.size());
std::remove_copy_if(all_items.begin(), all_items.end(),
good_items.begin(), iib);
return good_items;
}
Assume all_items, bad_items, is_item_bad and items() are all a part of some containing class. Is there a way to write the items() getter such that:
It doesn't need temporary variables in the method?
It doesn't need the custom functor, struct is_item_bad?
I had hoped to just use the count method on std::set as a functor, but I haven't been able to divine the right way to express that w/ the remove_copy_if algorithm.
EDIT: Fixed the logic error in items(). The actual code didn't have the problem, it was a transcription error.
EDIT: I have accepted a solution that doesn't use std::set_difference since it is more general and will work even if the std::vector isn't sorted. I chose to use the C++0x lambda expression syntax in my code. My final items() method looks like this:
std::vector<int> items() const {
std::vector<int> good_items;
good_items.reserve(all_items.size());
std::remove_copy_if(all_items.begin(), all_items.end(),
std::back_inserter(good_items),
[this] (int const i) { // bad_items is a member, so capture this
return bad_items.count(i) == 1;
});
return good_items;
}
On a vector of about 8 million items the above method runs in 3.1s. I benchmarked the std::set_difference approach and it ran in approximately 2.1s. Thanks to everyone who supplied great answers.

As jeffamaphone suggested, if you can sort any input vectors, you can use std::set_difference which is efficient and less code:
#include <algorithm>
#include <set>
#include <vector>
std::vector<int>
get_good_items( std::vector<int> const & all_items,
std::set<int> const & bad_items )
{
std::vector<int> good_items;
// Assumes all_items is sorted.
std::set_difference( all_items.begin(),
all_items.end(),
bad_items.begin(),
bad_items.end(),
std::back_inserter( good_items ) );
return good_items;
}
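If all_items is not already sorted, one option is to sort a copy first. The sketch below does that; good_items_from_unsorted is a hypothetical name. Two caveats apply: the result comes back in sorted order, not in the original order, and std::set_difference counts duplicates, so if a bad value occurs more often in all_items than in bad_items, the extra copies are kept.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: takes all_items by value so the caller's vector is
// left untouched, sorts the copy, then subtracts bad_items.
std::vector<int> good_items_from_unsorted(std::vector<int> all_items,
                                          const std::set<int>& bad_items) {
    std::sort(all_items.begin(), all_items.end());
    std::vector<int> good_items;
    good_items.reserve(all_items.size());
    std::set_difference(all_items.begin(), all_items.end(),
                        bad_items.begin(), bad_items.end(),
                        std::back_inserter(good_items));
    return good_items;
}
```

When duplicates of bad values may occur, the remove_copy_if approach with a set lookup avoids the duplicate caveat.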

Since your function is going to return a vector, you will have to make a new vector (i.e. copy elements) in any case. In which case, std::remove_copy_if is fine, but you should use it correctly:
#include <iostream>
#include <vector>
#include <set>
#include <iterator>
#include <algorithm>
#include <functional>
std::vector<int> filter(const std::vector<int>& all, const std::set<int>& bad)
{
std::vector<int> result;
remove_copy_if(all.begin(), all.end(), back_inserter(result),
[&bad](int i){return bad.count(i)==1;});
return result;
}
int main()
{
std::vector<int> all_items = {4,5,2,3,4,8,7,56,4,2,2,2,3};
std::set<int> bad_items = {2,8,4};
std::vector<int> filtered_items = filter(all_items, bad_items);
copy(filtered_items.begin(), filtered_items.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
}
To do this in C++98, I guess you could use mem_fun_ref and bind1st to turn set::count into a functor in-line, but there are issues with that (which resulted in deprecation of bind1st in C++0x) which means depending on your compiler, you might end up using std::tr1::bind anyway:
remove_copy_if(all.begin(), all.end(), back_inserter(result),
bind(&std::set<int>::count, bad, std::tr1::placeholders::_1)); // or std::placeholders in C++0x
and in any case, an explicit function object would be more readable, I think:
struct IsMemberOf {
const std::set<int>& bad;
IsMemberOf(const std::set<int>& b) : bad(b) {}
bool operator()(int i) const { return bad.count(i)==1;}
};
std::vector<int> filter(const std::vector<int>& all, const std::set<int>& bad)
{
std::vector<int> result;
remove_copy_if(all.begin(), all.end(), back_inserter(result), IsMemberOf(bad));
return result;
}

At the risk of appearing archaic:
std::set<int> badItems;
std::vector<int> items;
std::vector<int> goodItems;
for ( std::vector<int>::iterator iter = items.begin();
iter != items.end();
++iter)
{
int& item = *iter;
if ( badItems.find(item) == badItems.end() )
{
goodItems.push_back(item);
}
}

std::remove_copy_if returns an iterator to the target collection. In this case, it would return good_items.end() (or something similar). good_items goes out of scope at the end of the method, so this would cause some memory errors. You should return good_items or pass in a new vector<int> by reference and then clear, resize, and populate it. This would get rid of the temporary variable.
I believe you have to define the custom functor because the method depends on the object bad_items, which you couldn't specify without it getting hacky, AFAIK.
