Dereference iterator as part of a boost::bind composite chain - c++

I am trying to use bind to produce a function that:
Receives a map m
returns m.begin()->first
For that I am trying to use boost::bind:
typedef map<int,int>::const_iterator (map<int,int>::*const_begin_end) () const;
bind(&pair<const int,int>::first, bind(static_cast<const_begin_end>(&map<int, int>::begin), _1));
This doesn't work because the result of begin needs to be dereferenced. I thought something like
bind(&pair<const int,int>::first, bind(&operator*, bind(static_cast<const_begin_end>(&map<int, int>::begin), _1)));
But this wouldn't work since there is no global operator*.
Questions:
Is it possible to achieve this using boost::bind composite chains? How?
More easily readable alternatives?

I highly recommend Boost.Phoenix, it's my go-to library when it comes to writing functors on the fly in C++03. It is a superior alternative to Boost.Bind -- that library is showing its age.
For instance, Phoenix lets us use operators on its functors to represent an actual use of that operator when the functor is called. Thus arg1 + arg2 is a functor that returns the sum of its first two operands. This heavily cuts down on the bind noise. A first attempt could look like:
bind(&pair<const int, int>::first
, *bind(static_cast<const_begin_end>(&map<int, int>::begin), arg1))
(LWS demo)
But another strong point of Phoenix is that it comes with some batteries. In our case we're very much interested in <boost/phoenix/stl/container.hpp> because it includes lazy versions of the familiar container operations, including begin. This is very handy in our case:
// We don't need to disambiguate which begin member we want anymore!
bind(&pair<const int, int>::first, *begin(arg1))
(LWS demo)
As a final note, I'll add that C++11 bind expressions are specified such that pointer-to-members work on anything that uses operator*. So out-of-the-box you can do:
bind(&pair<const int, int>::first, bind(static_cast<const_begin_end>(&std::map<int, int>::begin), _1))
(LWS demo)
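To make the C++11 variant concrete, here is a minimal sketch using only std::bind. The names IntMap, ConstBegin and first_key are made up for illustration. Note that taking the address of a standard library member function works with common implementations but is not something the standard guarantees to be portable, so treat this as a demonstration rather than a recommendation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

typedef std::map<int, int> IntMap;
// Pointer-to-member type that selects the const overload of begin().
typedef IntMap::const_iterator (IntMap::*ConstBegin)() const;

// Returns the key of the first element. The map must not be empty.
inline int first_key(const IntMap& m) {
    using namespace std::placeholders;
    assert(!m.empty());
    // The inner bind yields an iterator; the outer bind applies the
    // pointer-to-member to it, which dereferences the iterator first.
    auto f = std::bind(&std::pair<const int, int>::first,
                       std::bind(static_cast<ConstBegin>(&IntMap::begin), _1));
    return f(m);
}
```

Calling first_key on a map holding the keys 1 and 3 returns 1, since the map is ordered by key.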

You can call bind with member function pointers, and member operators are nothing but member functions:
const_begin_end pBegin = &map<int,int>::begin;
x = bind(&std::pair<const int, int>::first,
bind(&std::map<int, int>::const_iterator::operator*,
bind(pBegin, _1)));
But seriously, you can as well just write a proper function that does what you need instead of that unreadable boost.bind mess (can you say "maintainability"?).
So, for C++03, a function:
template <class Map>
typename Map::key_type keyBegin(Map const& m)
{
return m.begin()->first;
}
or a C++03 functor (you can define it locally inside your function)
struct KeyBegin
{
typedef std::map<int, int> IntMap;
int operator()(IntMap const& m) const {
return m.begin()->first;
}
};
or a C++11 lambda (more readable than a bind orgy):
auto keyBegin = [](std::map<int, int> const& m) -> int {
return std::begin(m)->first;
};

Related

Is there anything to generically invert a C++ Comparator type?

For example, if I passed in std::less, I would want it to create a new comparator with behavior equal to std::greater. For use in templates.
Something that would enable a syntax like std::map<int, int, std::invert<std::less>> (except if the thing I'm looking for exists, that's not what it's called). Does this exist?
There is std::not_fn, which:
Creates a forwarding call wrapper that returns the negation of the callable object it holds.
However, in this case, it would result in a comparator with functionality of std::greater_equal, not std::greater. Furthermore, the type is not the same as std::greater_equal.
Example:
auto greater_eq = std::not_fn(std::less<>());
To get the equivalent of std::greater, flip the arguments instead. I don't think there's a function for that in the standard library, but it's simple to write yourself:
auto flip = [](auto fun) {
return [fun = std::move(fun)]
(const auto& arg1, const auto& arg2)
{
return fun(arg2, arg1);
};
};
auto greater = flip(std::less<>());
This could also be extended to support any number of arguments, flipping only the first two (making it analogous to flip from Haskell), as well as to support perfect forwarding of arguments, but those features may complicate the neat example quite a bit and are not needed for comparators.
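Since the question asks for something usable as a template argument, the same flipping idea can also be written as a class template. This is only a sketch: Flipped is a made-up name, not a standard library facility.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical adaptor: wraps a binary comparator and swaps its arguments,
// so Flipped<std::less<int>> orders like std::greater<int>.
template <class Compare>
struct Flipped {
    Compare comp;

    Flipped() : comp() {}
    explicit Flipped(Compare c) : comp(std::move(c)) {}

    template <class A, class B>
    bool operator()(const A& a, const B& b) const {
        return comp(b, a);  // arguments deliberately reversed
    }
};
```

With this, std::map<int, int, Flipped<std::less<int>>> iterates its keys in descending order.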

Can the use of C++11's 'auto' improve performance?

I can see why the auto type in C++11 improves correctness and maintainability. I've read that it can also improve performance (Almost Always Auto by Herb Sutter), but I miss a good explanation.
How can auto improve performance?
Can anyone give an example?
auto can aid performance by avoiding silent implicit conversions. An example I find compelling is the following.
std::map<Key, Val> m;
// ...
for (std::pair<Key, Val> const& item : m) {
// do stuff
}
See the bug? Here we are, thinking we're elegantly taking every item in the map by const reference and using the new range-for expression to make our intent clear, but actually we're copying every element. This is because std::map<Key, Val>::value_type is std::pair<const Key, Val>, not std::pair<Key, Val>. Thus, when we (implicitly) have:
std::pair<Key, Val> const& item = *iter;
Instead of taking a reference to an existing object and leaving it at that, we have to do a type conversion. You are allowed to take a const reference to an object (or temporary) of a different type as long as there is an implicit conversion available, e.g.:
int const& i = 2.0; // perfectly OK
The type conversion is an allowed implicit conversion for the same reason you can convert a const Key to a Key, but we have to construct a temporary of the new type in order to allow for that. Thus, effectively our loop does:
std::pair<Key, Val> __tmp = *iter; // construct a temporary of the correct type
std::pair<Key, Val> const& item = __tmp; // then, take a reference to it
(Of course, there isn't actually a __tmp object, it's just there for illustration, in reality the unnamed temporary is just bound to item for its lifetime).
Just changing to:
for (auto const& item : m) {
// do stuff
}
just saved us a ton of copies - now the referenced type matches the initializer type, so no temporary or conversion is necessary, we can just do a direct reference.
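To make the hidden copies visible, here is a small sketch that counts them. Counted and the two helper functions are made-up names for illustration only; the point is just that the mismatched pair type copies each element once, while auto const& copies nothing. Some compilers warn about the first loop, which is exactly the problem being shown.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical value type that counts how often it is copied.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { ++copies; return *this; }
};
int Counted::copies = 0;

// Iterates with the mismatched pair type; returns the number of copies made.
inline int copies_with_wrong_type(const std::map<int, Counted>& m) {
    Counted::copies = 0;
    for (const std::pair<int, Counted>& item : m) {
        (void)item;  // each iteration binds to a freshly converted temporary
    }
    return Counted::copies;
}

// Iterates with auto; the reference binds directly to the map's element.
inline int copies_with_auto(const std::map<int, Counted>& m) {
    Counted::copies = 0;
    for (const auto& item : m) {
        (void)item;
    }
    return Counted::copies;
}
```

For a map with three elements, the first function reports three copies and the second reports none.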
Because auto deduces the type of the initializing expression, there is no type conversion involved. Combined with templated algorithms, this means that you can get a more direct computation than if you were to make up a type yourself – especially when you are dealing with expressions whose type you cannot name!
A typical example comes from (ab)using std::function:
std::function<bool(T, T)> cmp1 = std::bind(f, _2, 10, _1); // bad
auto cmp2 = std::bind(f, _2, 10, _1); // good
auto cmp3 = [](T a, T b){ return f(b, 10, a); }; // also good
std::stable_sort(begin(x), end(x), cmp?); // cmp? is any of cmp1, cmp2, cmp3
With cmp2 and cmp3, the entire algorithm can inline the comparison call, whereas if you construct a std::function object, not only can the call not be inlined, but you also have to go through the polymorphic lookup in the type-erased interior of the function wrapper.
Another variant on this theme is that you can say:
auto && f = MakeAThing();
This is always a reference, bound to the value of the function call expression, and never constructs any additional objects. If you didn't know the returned value's type, you might be forced to construct a new object (perhaps as a temporary) via something like T && f = MakeAThing(). (Moreover, auto && even works when the return type is not movable and the return value is a prvalue.)
There are two categories.
auto can avoid type erasure. There are unnamable types (like lambdas), and almost unnamable types (like the result of std::bind or other expression-template like things).
Without auto, you end up having to type erase the data down to something like std::function. Type erasure has costs.
std::function<void()> task1 = []{std::cout << "hello";};
auto task2 = []{std::cout << " world\n";};
task1 has type erasure overhead -- a possible heap allocation, difficulty inlining it, and virtual function table invocation overhead. task2 has none. Lambdas need auto or other forms of type deduction to store without type erasure; other types can be so complex that they only need it in practice.
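A minimal sketch of the two ways of storing a callable follows. The names call_twice, with_auto and with_function are made up. Both variants compute the same result; the difference is only in how the call is dispatched, and this sketch does not measure that cost.

```cpp
#include <cassert>
#include <functional>

// Generic caller: F is deduced, so a lambda's closure type is preserved
// and the call can be resolved at compile time.
template <class F>
int call_twice(F f) {
    return f() + f();
}

inline int with_auto() {
    auto task = [] { return 21; };  // closure type kept, no type erasure
    return call_twice(task);
}

inline int with_function() {
    std::function<int()> task = [] { return 21; };  // type-erased wrapper
    return call_twice(task);  // each call goes through std::function
}
```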
Second, you can get types wrong. In some cases, the wrong type will work seemingly perfectly, but will cause a copy.
Foo const& f = expression();
will compile if expression() returns Bar const& or Bar or even Bar&, where Foo can be constructed from Bar. A temporary Foo will be created, then bound to f, and its lifetime will be extended until f goes away.
The programmer may have meant Bar const& f and not intended to make a copy there, but a copy is made regardless.
The most common example is the type of *std::map<A,B>::const_iterator, which is std::pair<A const, B> const& not std::pair<A,B> const&, but the error is a category of errors that silently cost performance. You can construct a std::pair<A, B> from a std::pair<const A, B>. (The key on a map is const, because editing it is a bad idea)
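One way to see that a temporary really is created is to compare addresses. In this sketch (the function names are made up), the reference declared with the wrong pair type does not point into the map, while the auto reference does.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Returns true if the reference refers to the element stored in the map.
inline bool wrong_type_aliases_element(const std::map<int, int>& m) {
    assert(!m.empty());
    // Binds to a temporary std::pair<int, int> converted from the element.
    const std::pair<int, int>& item = *m.begin();
    return &item.second == &m.begin()->second;
}

inline bool auto_aliases_element(const std::map<int, int>& m) {
    assert(!m.empty());
    // Deduces std::pair<const int, int>, so this binds to the element itself.
    const auto& item = *m.begin();
    return &item.second == &m.begin()->second;
}
```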
Both @Barry and @KerrekSB first illustrated these two principles in their answers. This is simply an attempt to highlight the two issues in one answer, with wording that aims at the problem rather than being example-centric.
The existing three answers give examples where using auto "makes it less likely to unintentionally pessimize", effectively making it "improve performance".
There is a flip side to the coin. Using auto with objects that have operators that don't return the basic object can result in incorrect (still compilable and runnable) code. For example, this question asks how using auto gave different (incorrect) results using the Eigen library, i.e. the following lines
const auto resAuto = Ha + Vector3(0.,0.,j * 2.567);
const Vector3 resVector3 = Ha + Vector3(0.,0.,j * 2.567);
std::cout << "resAuto = " << resAuto <<std::endl;
std::cout << "resVector3 = " << resVector3 <<std::endl;
resulted in different output. Admittedly, this is mostly due to Eigen's lazy evaluation, but that code is/should be transparent to the (library) user.
While performance hasn't been greatly affected here, using auto to avoid unintentional pessimization might be classified as premature optimization, or at least wrong ;).
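The same kind of surprise can be reproduced without Eigen. The sketch below uses std::vector<bool>, whose operator[] returns a proxy object rather than a bool; it is an analogous standard library case, not the Eigen code from the linked question, and the function names are made up.

```cpp
#include <cassert>
#include <vector>

// Stores the element as a real bool: a snapshot of the value at that time.
inline bool snapshot_with_bool() {
    std::vector<bool> flags(1, false);
    bool first = flags[0];
    flags[0] = true;
    return first;  // still the old value
}

// Stores the element with auto: deduces the proxy type, not bool.
inline bool snapshot_with_auto() {
    std::vector<bool> flags(1, false);
    auto first = flags[0];  // std::vector<bool>::reference
    flags[0] = true;
    return first;  // the proxy reads the bit as it is now
}
```

The first function returns false and the second returns true, even though the two bodies differ only in the declared type.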

C++11 std::set lambda comparison function

I want to create a std::set with a custom comparison function. I could define it as a class with operator(), but I wanted to enjoy the ability to define a lambda where it is used, so I decided to define the lambda function in the initialization list of the constructor of the class which has the std::set as a member. But I can't get the type of the lambda. Before I proceed, here's an example:
class Foo
{
private:
std::set<int, /*???*/> numbers;
public:
Foo () : numbers ([](int x, int y)
{
return x < y;
})
{
}
};
I found two solutions after searching: one, using std::function. Just have the set comparison function type be std::function<bool (int, int)> and pass the lambda exactly like I did. The second solution is to write a make_set function, like std::make_pair.
SOLUTION 1:
class Foo
{
private:
std::set<int, std::function<bool (int, int)>> numbers;
public:
Foo () : numbers ([](int x, int y)
{
return x < y;
})
{
}
};
SOLUTION 2:
template <class Key, class Compare>
std::set<Key, Compare> make_set (Compare compare)
{
return std::set<Key, Compare> (compare);
}
The question is, do I have a good reason to prefer one solution over the other? I prefer the first one because it makes use of standard features (make_set is not a standard function), but I wonder: does using std::function make the code (potentially) slower? I mean, does it lower the chance that the compiler inlines the comparison function, or should it be smart enough to behave exactly as it would if it were a lambda function type and not std::function (I know, in this case it can't be a lambda type, but you know, I'm asking in general)?
(I use GCC, but I'd like to know what popular compilers do in general)
SUMMARY, AFTER I GOT LOTS OF GREAT ANSWERS:
If speed is critical, the best solution is to use a class with operator(), aka a functor. It's easiest for the compiler to optimize and avoid any indirections.
For easy maintenance and a better general-purpose solution, using C++11 features, use std::function. It's still fast (just a little bit slower than the functor, but it may be negligible) and you can use any function - std::function, lambda, any callable object.
There's also an option to use a function pointer, but if there's no speed issue I think std::function is better (if you use C++11).
There's an option to define the lambda function somewhere else, but then you gain nothing from the comparison function being a lambda expression, since you could as well make it a class with operator() and the location of definition wouldn't be the set construction anyway.
There are more ideas, such as using delegation. If you want a more thorough explanation of all solutions, read the answers :)
It's unlikely that the compiler will be able to inline a std::function call, whereas any compiler that supports lambdas would almost certainly inline the functor version, including if that functor is a lambda not hidden by a std::function.
You could use decltype to get the lambda's comparator type:
#include <set>
#include <iostream>
#include <iterator>
#include <algorithm>
int main()
{
auto comp = [](int x, int y){ return x < y; };
auto set = std::set<int,decltype(comp)>( comp );
set.insert(1);
set.insert(10);
set.insert(1); // Dupe!
set.insert(2);
std::copy( set.begin(), set.end(), std::ostream_iterator<int>(std::cout, "\n") );
}
Which prints:
1
2
10
See it run live on Coliru.
Yes, a std::function introduces nearly unavoidable indirection to your set. While the compiler can always, in theory, figure out that all use of your set's std::function involves calling it on a lambda that is always the exact same lambda, that is both hard and extremely fragile.
Fragile, because before the compiler can prove to itself that all calls to that std::function are actually calls to your lambda, it must prove that no access to your std::set ever sets the std::function to anything but your lambda. Which means it has to track down all possible routes to reach your std::set in all compilation units and prove none of them do it.
This might be possible in some cases, but relatively innocuous changes could break it even if your compiler managed to prove it.
On the other hand, a functor with a stateless operator() has easy to prove behavior, and optimizations involving that are everyday things.
So yes, in practice I'd suspect std::function could be slower. On the other hand, the std::function solution is easier to maintain than the make_set one, and exchanging programmer time for program performance is pretty fungible.
make_set has the serious disadvantage that any such set's type must be inferred from the call to make_set. Often a set stores persistent state, and not something you create on the stack then let fall out of scope.
If you created a static or global stateless lambda auto MyComp = [](A const&, A const&)->bool { ... }, you can use the std::set<A, decltype(MyComp)> syntax to create a set that can persist, yet is easy for the compiler to optimize (because all instances of decltype(MyComp) are stateless functors) and inline. I point this out, because you are sticking the set in a struct. (Or does your compiler support
struct Foo {
auto mySet = make_set<int>([](int l, int r){ return l<r; });
};
which I would find surprising!)
Finally, if you are worried about performance, consider that std::unordered_set is much faster (at the cost of being unable to iterate over the contents in order, and having to write/find a good hash), and that a sorted std::vector is better if you have a 2-phase "insert everything" then "query contents repeatedly". Simply stuff it into the vector first, then sort, unique, and erase, then use the free equal_range algorithm.
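The two-phase sorted vector approach can be sketched roughly like this. The function names make_sorted_unique and contains are made up, and int is used as the element type only to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Phase 1 (after all insertions): sort, then drop adjacent duplicates.
inline void make_sorted_unique(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}

// Phase 2 (repeated queries): binary search on the sorted data.
inline bool contains(const std::vector<int>& v, int x) {
    auto range = std::equal_range(v.begin(), v.end(), x);
    return range.first != range.second;
}
```

A custom comparator can be passed to both std::sort and std::equal_range, as long as it is the same ordering in both places.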
A stateless lambda (i.e. one with no captures) can decay to a function pointer, so your type could be:
std::set<int, bool (*)(int, int)> numbers;
Otherwise I'd go for the make_set solution. If you won't use a one-line creation function because it's non-standard you're not going to get much code written!
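Applied to the class from the question, the function pointer variant might look like the sketch below. The add and first members are made up for the sake of a usable example, and the comparison is reversed (x > y) only so that the effect of the custom comparator is visible.

```cpp
#include <cassert>
#include <set>

class Foo {
    // The comparator type is a plain function pointer, so it can be named.
    std::set<int, bool (*)(int, int)> numbers;

public:
    // The captureless lambda converts implicitly to bool (*)(int, int).
    Foo() : numbers([](int x, int y) { return x > y; }) {}

    void add(int n) { numbers.insert(n); }

    // Returns the first element in the set's order. The set must not be empty.
    int first() const {
        assert(!numbers.empty());
        return *numbers.begin();
    }
};
```

Calls to the comparator go through the pointer, so this keeps one level of indirection, as noted elsewhere in the answers.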
From my experience playing around with the profiler, the best compromise between performance and beauty is to use a custom delegate implementation, such as:
https://codereview.stackexchange.com/questions/14730/impossibly-fast-delegate-in-c11
As the std::function is usually a bit too heavy. I can't comment on your specific circumstances, as I don't know them, though.
If you're determined to have the set as a class member, initializing its comparator at constructor time, then at least one level of indirection is unavoidable. Consider that as far as the compiler knows, you could add another constructor:
Foo () : numbers ([](int x, int y)
{
return x < y;
})
{
}
Foo (char) : numbers ([](int x, int y)
{
return x > y;
})
{
}
Once you have an object of type Foo, the type of the set doesn't carry information on which constructor initialized its comparator, so to call the correct lambda requires an indirection to the run-time selected lambda operator().
Since you're using captureless lambdas, you could use the function pointer type bool (*)(int, int) as your comparator type, as captureless lambdas have the appropriate conversion function. This would of course involve an indirection through the function pointer.
The difference depends heavily on your compiler's optimizations. If it optimizes away the std::function wrapper around the lambda, the two are equivalent; if not, you introduce an indirection in the former that you won't have in the latter.

Without using lambda how can I neatly create a function given an expression?

Would like to be able to do something like
std::map<EventType, boost::function<bool(int,int)> > callbackMap;
callbackMap[EVENT1] = boost::bind( magic_creator( this->m_isHerp && !this->m_isDerp ) , _1, _2 );
Basically, I give an expression that evaluates to true or false to magic_creator, and it returns a function that I can bind with boost. So in the above case, magic_creator would create a function that returns true regardless of _1 and _2. I am not able to use lambdas, as they are not available to me. Does anyone have anything for this?
P.S Assume callbackMap is part of some class and so is the current scope of the above code.
This is actually possible and not that ugly with either the Boost Lambda Library, Boost.Bind or Boost.Phoenix's bind.
All placeholders and binder types returned from a call to bind from any of those libraries will have all kinds of operators overloaded so you can easily create expressions with them "on the fly". With Boost.Bind:
#include <iostream>
#include <boost/bind.hpp>
struct X{ bool a, b; };
int main(){
using boost::bind;
auto op = bind(&X::a, _1) && !bind(&X::b, _1);
X x{true, false};
auto test = bind(op, x);
if(test())
std::cout << "Yay\n";
}
Live example.
This, of course, has the obvious disadvantage of going through member pointers. Also, in C++03, you had a mighty problem writing out the type of such a "lambda", since it's a huge templated mess. The little example above will yield a huge name, as can be seen here. And since a library solution was not "the best it could be", the standard committee added lambdas to the language. Yay.
Note that, while C++11's std::bind looks similar on the surface, it's a pure binder. It does not allow you to create expressions on-the-fly as it does not overload any operators for any related types.
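If the condition only needs to be evaluated once, when the callback is created, a plain hand-written functor is another lambda-free option. This is a sketch under that assumption: ConstantPredicate is a made-up name, magic_creator here is a hypothetical stand-in for the function the question asks for, and unlike the bind expression above it does not re-read the members at call time.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: ignores both arguments and returns a stored value.
struct ConstantPredicate {
    bool value;
    explicit ConstantPredicate(bool v) : value(v) {}
    bool operator()(int, int) const { return value; }
};

// Hypothetical stand-in for the question's magic_creator.
// The expression passed in is evaluated by the caller, before this runs.
inline ConstantPredicate magic_creator(bool value) {
    return ConstantPredicate(value);
}
```

The functor itself is valid C++03 and can be stored in a boost::function<bool(int,int)> directly, without going through bind.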

How to std::find using a Compare object?

I am confused about the interface of std::find. Why doesn't it take a Compare object that tells it how to compare two objects?
If I could pass a Compare object I could make the following code work, where I would like to compare by value, instead of just comparing the pointer values directly:
typedef std::vector<std::string*> Vec;
Vec vec;
std::string* s1 = new std::string("foo");
std::string* s2 = new std::string("foo");
vec.push_back(s1);
Vec::const_iterator found = std::find(vec.begin(), vec.end(), s2);
// not found, obviously, because I can't tell it to compare by value
delete s1;
delete s2;
Is the following the recommended way to do it?
template<class T>
struct MyEqualsByVal {
const T& x_;
MyEqualsByVal(const T& x) : x_(x) {}
bool operator()(const T& y) const {
return *x_ == *y;
}
};
// ...
vec.push_back(s1);
Vec::const_iterator found =
std::find_if(vec.begin(), vec.end(),
MyEqualsByVal<std::string*>(s2)); // OK, will find "foo"
find can't be overloaded to take a unary predicate instead of a value, because it's an unconstrained template parameter. So if you called find(first, last, my_predicate), there would be a potential ambiguity whether you want the predicate to be evaluated on each member of the range, or whether you want to find a member of the range that's equal to the predicate itself (it could be a range of predicates, for all the designers of the standard libraries know or care, or the value_type of the iterator could be convertible both to the predicate type, and to its argument_type). Hence the need for find_if to go under a separate name.
find could have been overloaded to take an optional binary predicate, in addition to the value searched for. But capturing values in functors, as you've done, is such a standard technique that I don't think it would be a massive gain: it's certainly never necessary since you can always achieve the same result with find_if.
If you got the find you wanted, you'd still have to write a functor (or use boost), since <functional> doesn't contain anything to dereference a pointer. Your functor would be a little simpler as a binary predicate, though, or you could use a function pointer, so it'd be a modest gain. So I don't know why this isn't provided. Given the copy_if fiasco I'm not sure there's much value in assuming there are always good reasons for algorithms that aren't available :-)
Since your T is a pointer, you may as well store a copy of the pointer in the function object.
Other than that, that is how it is done and there's not a whole lot more to it.
As an aside, it's not a good idea to store bare pointers in a container, unless you are extremely careful with ensuring exception safety, which is almost always more hassle than it's worth.
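For reference, the simplification of storing a copy of the pointer could look like this sketch. EqualsByValue and contains_equal are made-up names, and the functor is written for std::string pointers only, rather than as a template.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Unary predicate: holds a copy of the pointer and compares pointees.
struct EqualsByValue {
    const std::string* target;
    explicit EqualsByValue(const std::string* t) : target(t) {}
    bool operator()(const std::string* candidate) const {
        return *target == *candidate;
    }
};

// Returns true if any element points to a string equal to *needle.
inline bool contains_equal(const std::vector<std::string*>& vec,
                           const std::string* needle) {
    assert(needle);
    return std::find_if(vec.begin(), vec.end(), EqualsByValue(needle)) != vec.end();
}
```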
That's exactly what find_if is for - it takes a predicate that is called to compare elements.