Capturing a lambda in another lambda can violate const qualifiers - c++

Consider the following code:
int x = 3;
auto f1 = [x]() mutable
{
return x++;
};
auto f2 = [f1]()
{
return f1();
};
This will not compile, because f1() is not const, and f2 is not declared as mutable. Does this mean that if I have a library function that accepts an arbitrary function argument and captures it in a lambda, I always need to make that lambda mutable, because I don't know what users can pass in? Notably, wrapping f1 in std::function seems to resolve this problem (how?).

Does this mean that if I have a library function that accepts an arbitrary function argument and captures it in a lambda, I always need to make that lambda mutable, because I don't know what users can pass in?
That's a design decision for your library API. You can require client code to pass function objects with a const-qualified operator() (which is the case for non-mutable lambda expressions). If something different is passed, a compiler error is triggered. But if the context might require a function object argument that modifies its state, then yes, you have to make the internal lambda mutable.
An alternative would be to dispatch on the ability to invoke operator() on a const-qualified instance of the given function type. Something along those lines (note that this needs a fix for function objects with both const and non-const operator(), which results in an ambiguity):
template <class Fct>
auto wrap(Fct&& f) -> decltype(f(), void())
{
[fct = std::forward<Fct>(f)]() mutable { fct(); }();
}
template <class Fct>
auto wrap(Fct&& f) -> decltype(std::declval<const Fct&>()(), void())
{
[fct = std::forward<Fct>(f)]() { fct(); }();
}
Notably, wrapping f1 in std::function seems to resolve this problem (how?).
This is a bug in std::function due to its type-erasure and copy semantics. It allows non-const-qualified operator() to be invoked, which can be verified with such a snippet:
const std::function<void()> f = [i = 0]() mutable { ++i; };
f(); // Shouldn't be possible, but unfortunately, it is
This is a known issue, it's worth checking out Titus Winter's complaint on this.

I'll start by addressing your second question first. std::function type erases, and holds a copy of the functor it's initialized with. That means there's a layer of indirection between std::function::operator() and the actual functor's operator().
Envision if you will, holding something in your class by pointer. Then you may call a mutating operation on the pointee from a const member function of your class, because it doesn't affect (in a shallow view) the pointer that the class holds. This is a similar situation to what you observed.
As for your first question... "Always" is too strong a word. It depends on your goal.
If you want to support self mutating functors easily, then you should capture in a mutable lambda. But beware it may affect the library functions you may call now.
If you wish to favor non-mutating operations, then a non-mutable lambda. I say "favor" because as we observed, the type system can be "fooled" with an extra level of indirection. So the approach you prefer is only going to be easier to use, not impossible to go around. This is as the sage advice goes, make correct use of your API easy, and incorrect harder.

Related

How do correctly use a callable passed through forwarding reference?

I'm used to pass lambda functions (and other callables) to template functions -- and use them -- as follows
template <typename F>
auto foo (F && f)
{
// ...
auto x = std::forward<F>(f)(/* some arguments */);
// ...
}
I mean: I'm used to pass them through a forwarding reference and call them passing through std::forward.
Another Stack Overflow user argue (see comments to this answer) that this, calling the functional two or more time, it's dangerous because it's semantically invalid and potentially dangerous (and maybe also Undefined Behavior) when the function is called with a r-value reference.
I've partially misunderstand what he means (my fault) but the remaining doubt is if the following bar() function (with an indubitable multiple std::forward over the same object) it's correct code or it's (maybe only potentially) dangerous.
template <typename F>
auto bar (F && f)
{
using A = typename decltype(std::function{std::forward<F>(f)})::result_type;
std::vector<A> vect;
for ( auto i { 0u }; i < 10u ; ++i )
vect.push_back(std::forward<F>(f)());
return vect;
}
Forward is just a conditional move.
Therefore, to forward the same thing multiple times is, generally speaking, as dangerous as moving from it multiple times.
Unevaluated forwards don't move anything, so those don't count.
Routing through std::function adds a wrinkle: that deduction only works on function pointers and on function objects with a single function call operator that is not && qualified. For these, rvalue and lvalue invocation are always equivalent if both compiles.
I'd say the general rule applies in this case. You're not supposed to do anything with a variable after it was moved/forwarded from, except maybe assigning to it.
Thus...
How do correctly use a callable passed through forwarding reference?
Only forward if you're sure it won't be called again (i.e. on last call, if at all).
If it's never called more than once, there is no reason to not forward.
As for why your snippet could be dangerous, consider following functor:
template <typename T>
struct foo
{
T value;
const T &operator()() const & {return value;}
T &&operator()() && {return std::move(value);}
};
As an optimization, operator() when called on an rvalue allows caller to move from value.
Now, your template wouldn't compile if given this functor (because, as T.C. said, std::function wouldn't be able to determine return type in this case).
But if we changed it a bit:
template <typename A, typename F>
auto bar (F && f)
{
std::vector<A> vect;
for ( auto i { 0u }; i < 10u ; ++i )
vect.push_back(std::forward<F>(f)());
return vect;
}
then it would break spectacularly when given this functor.
If you're either going to just forward the callable to another place or simply call the callable exactly once, I would argue that using std::forward is the correct thing to do in general. As explained here, this will sort of preserve the value category of the callable and allow the "correct" version of a potentially overloaded function call operator to be called.
The problem in the original thread was that the callable was being called in a loop, thus potentially invoked more than once. The concrete example from the other thread was
template <typename F>
auto map(F&& f) const
{
using output_element_type = decltype(f(std::declval<T>()));
auto sequence = std::make_unique<Sequence<output_element_type>>();
for (const T& element : *this)
sequence->push(f(element));
return sequence;
}
Here, I believe that calling std::forward<F>(f)(element) instead of f(element), i.e.,
template <typename F>
auto map(F&& f) const
{
using output_element_type = decltype(std::forward<F>(f)(std::declval<T>()));
auto sequence = std::make_unique<Sequence<output_element_type>>();
for (const T& element : *this)
sequence->push(std::forward<F>(f)(element));
return sequence;
}
would be potentially problematic. As far as my understanding goes, the defining characteristic of an rvalue is that it cannot explicitly be referred to. In particular, there is naturally no way for the same prvalue to be used in an expression more than once (at least I can't think of one). Furthermore, as far as my understanding goes, if you're using std::move or std::forward or whatever other way to obtain an xvalue, even on the same original object, the result will be a new xvalue every time. Thus, there also cannot possibly be a way to refer to the same xvalue more than once. Since the same rvalue cannot be used more than once, I would argue (see also comments underneath this answer) that it would generally be a valid thing for an overloaded function call operator to do something that can only be done once in case the call happens on an rvalue, for example:
class MyFancyCallable
{
public:
void operator () & { /* do some stuff */ }
void operator () && { /* do some stuff in a special way that can only be done once */ }
};
The implementation of MyFancyCallable may assume that a call that would pick the &&-qualified version cannot possibly happen more than once (on the given object). Thus, I would consider forwarding the same callable into more than one call to be semantically broken.
Of course, technically, there is no universal definition of what it actually means to forward or move an object. In the end, it's really up to the implementation of the particular types involved to assign meaning there. Thus, you may simply specify as part of your interface that potential callables passed to your algorithm must be able to deal with being called multiple times on an rvalue that refers to the same object. However, doing so pretty much goes against all the conventions for how the rvalue reference mechanism is generally used in C++, and I don't really see what there possibly would be to be gained from doing this…

Should I always use `T&&` instead of `const T&` or `T&` to bind to a callback function?

template <typename T>
void myFunction(..., T && callback) {
...
callback(...);
...
}
Is it preferable to use T && than T& or const T&?
Or even simply T to pass by value instead of pass by reference.
Does function or lambdas have the concept of lvalue & rvalue? Can I std::move a function / lambdas?
Does const of const T& enforce that the function cannot modify its closure?
Taking a forwarding reference can make a difference, but you have to call the callback correctly to see it. for functions and lambdas it doesn't matter if they are an rvalue or lvalue, but if you have a functor, it can make a difference
template <typename T>
void myFunction(..., T && callback) {
callback(...);
}
Takes a forwarding reference, and then calls the callback as a lvalue. This can be an error if the function object was passed as an rvalue, and its call operator is defined as
operator()(...) && { ... }
since that is only callable on an rvalue. To make your function work correctly you need to wrap the function name in std::forward so you call it in the same value expression category it was passed to the function as. That looks like
template <typename T>
void myFunction(..., T && callback) {
std::forward<T>(callback)(...);
// you should only call this once since callback could move some state into its return value making UB to call it again
}
So, if you want to take rvalues and lvalues, and call the operator as an rvalue or lvalue, then the above approach should be used, since it is just like doing the call in the call site of myFunction.
You probably want to choose either T (with optional std::ref) or T&& and stick with one or the other.
T const& works if you want to let the caller know that myFunction won't modify callback. It doesn't work if callback might be stateful. Of course, if callback's operator() is marked const (or callback is a non-mutable lambda), then myFunction won't modify callback. Essentially, whoever calls myFunction can provide that constness guarantee for themselves.
T& works if you want myFunction to be allowed to take in a stateful functor. The drawback is that T& can't bind to an rvalue (e.g. myFunction([&](Type arg){ /* whatever */ })).
T is good for both stateful and stateless functors, and it works for both rvalues (e.g. lambdas) and lvalues. If somebody who calls myFunction wants changes to callback's state to be observable outside of myFunction (e.g. callback has more than just operator()), they can use std::ref. This is what the C++ standard library seems to go with.
T&& is similarly general-purpose (it can handle rvalues and/or stateful functors), but it doesn't require std::ref to make changes to callback be visible from outside of myFunction.
Is it preferable to use T && than T& or const T&?
Universal references allow perfect forwarding, so they are often preferable in generic template functions.
Or even simply T to pass by value instead of pass by reference.
This is often preferable when you wish to take ownership.
Does function or lambdas have the concept of lvalue & rvalue?
Yes. Every value expression has a value category.
Can I std::move a function / lambdas?
Yes. You can std::move pretty much anything.
Another matter is whether std::move'd object is moved from. And yet another matter is whether an object can be moved. Lambdas are movable unless they contain non-movable captures.
In the OP's example:
template <typename T>
void myFunction(..., T && callback) {
...
...
}
Passing a T&& will give you the best of both worlds. If we pass an r-value reference, the template resolves the function with an r-value reference, but if we pass an l-value reference, the function resolves as a l-value. When passing callback just remember to use std::forward to preserve the l/r value.
This only works if T is templated, this is not the case where we use something like a std::function.
So what do we pass in the non-templated example. The first thing to decide is whether callback can be called after the exit of the function. In the event that the callback will only be called during the scope of myFunc, you are probably better off using l-value references, since you can call the reference directly.
However if callback will be called after the scope of myFunc, using an r-value will allow you to move callback. This will save you the copy but forces you to guarantee that callback cannot be used anywhere else after passing to myFunc.
In C++ functions (not to be confused with std::function) are essentially just pointers. You can move pointers, though it's the same as copying them.
Lambdas under the hood are just more or less regular structs (or classes, it's the same) that implement operator() and potentially store some state (if your lambda captures it). So the move would function just the same as it would for regular structs.
Similarly, there could be move-only lambdas (because they have move-only values in their state). So sometimes, if you want to accept such lambdas, you need to pass by &&.
However, in some cases you want a copy (for example if you want to execute function in a thread).

Why does boost::bind store arguments of the type passed in rather than of the type expected by the function?

I recently ran into a bug in my code when using boost::bind.
From the boost::bind docs:
The arguments that bind takes are copied and held internally by the returned function object.
I had assumed that the type of the copy that was being held was based on the signature of the function. However, it is actually based on the type of the value passed in.
In my case an implicit conversion was happening to convert the type used in the bind expression to the type received by the function. I was expecting this conversion to happen at the site of the bind, however it happens when the resulting function object is used.
In retrospect I should have been able to figure this out from the fact that using boost::bind gives errors when types are not compatible only at the call site, not the bind site.
My question is:
Why does boost::bind work this way?
It seems to give worse compiler error messages
It seems to be less efficient when implicit conversion happens and there are multiple calls to the functor
But given how well Boost is designed I'm guessing there is a reason. Was it behavior inherited from std::bind1st/bind2nd? Is there a subtle reason why this would be hard/impossible to implement? Something else entirely?
To test that second theory I wrote up a little code snippet that seems to work, but there may well be features of bind I haven't accounted for since it's just a fragment:
namespace b = boost;
template<class R, class B1, class A1>
b::_bi::bind_t<R, R (*) (B1), typename b::_bi::list_av_1<B1>::type>
mybind(R (*f) (B1), A1 a1)
{
typedef R (*F) (B1);
typedef typename b::_bi::list_av_1<B1>::type list_type;
return b::_bi::bind_t<R, F, list_type> (f, list_type(B1(a1)));
}
struct Convertible
{
Convertible(int a) : b(a) {}
int b;
};
int foo(Convertible bar)
{
return 2+bar.b;
}
void mainFunc()
{
int x = 3;
b::function<int()> funcObj = mybind(foo, x);
printf("val: %d\n", funcObj());
}
Because the functor may support multiple overloads, which may give different behaviours. Even if this signature could be resolved when you knew all the arguments (and I don't know if Standard C++ can guarantee this facility) bind does not know all the arguments, and therefore it definitely cannot be provided. Therefore, bind does not possess the necessary information.
Edit: Just to clarify, consider
struct x {
void operator()(int, std::vector<float>);
void operator()(float, std::string);
};
int main() {
auto b = std::bind(x(), 1); // convert or not?
}
Even if you were to reflect on the struct and gain the knowledge of it's overloads, it's still undecidable as to whether you need to convert the 1 to a float or not.
There are different cases where you need the arguments to be processed at the call site.
The first such example is calling a member function, where you can either have the member called on a copy of the object (boost::bind( &std::vector<int>::push_back, myvector)) which most probably you don't want, or else you need to pass a pointer and the binder will dereference the pointer as needed (boost::bind( &std::vector<int>::push_back, &myvector )) --Note both options can make sense in different programs
Another important use case is passing an argument by reference to a function. bind will copy performing the equivalent to a pass-by-value call. The library offers the option of wrapping arguments through the helper functions ref and cref, both of which store a pointer to the actual object to be passed, and at the place of call they dereference the pointer (through an implicit conversion). If the conversion to the target type was performed at bind time, then this would be impossible to implement.
I think this is due to the fact that bind has to work with any callable entity, be it a function pointer, std::function<>, or your own functor struct with operator(). This makes bind generic on any type that can be called using (). I.e. Bind's implicit requirement on your functor is just that it can be used with ()
If bind was to store the function argument types, it would have to somehow infer them for any callable entity passed in as a type parameter. This would obviously not be as generic, since deducing parameter types of an operator() of a passed-in struct type is impossible without relying on the user to specify some kind of typedef (as an example). As a result the requirement on the functor (or concept) is no longer concrete/simple.
I am not entirely sure this is the reason, but it's one of the things that would be a problem.
EDIT: Another point as DeadMG mentions in another answer, overloads would create ambiguities even for standard function pointers, since the compiler would not be able to resolve the functor type. By storing the types you provide to bind and using (), this problem is also avoided.
A good example would binding "std::future"s to some ordinary function taking ordinary types:
Say I want to use an ordinary f(x,y) function in an incredibly asynchronous way. Namely, I want to call it like "f(X.get(), Y.get())". There's a good reason for this- I can just call that line and f's logic will run as soon as both inputs are available (I don't need separate lines of code for the join). To do this I need the following:
1) I need to support implicit conversions "std::future<T> -> T". This means std::future or my custom equivalent needs a cast operator:
operator T() { return get(); }
2) Next, I need to bind my generic function to hide all its parameters
// Hide the parameters
template<typename OUTPUT, typename... INPUTS>
std::function<OUTPUT()> BindVariadic(std::function<OUTPUT(INPUTS...)> f,
INPUTS&&... in)
{
std::function<OUTPUT()> stub = std::bind( f, std::forward<INPUTS>(in)...);
return stub;
}
With a std::bind that does the "std::function<T> -> T" conversion at call time, I only wait for all the input parameters to become available when I ACTUALLY CALL "stub()". If it did the conversion via operator T() at the bind, the logic would silently force the wait when I actually constructed "stub" instead of when I use it. That might be fatal if "stub()" cannot always run safely in the same thread I built it.
There are other use cases that also forced that design choice. This elaborate one for async processing is simply the one I'm personally familiar with.

How can I get around specifying variables in decltype expressions?

Assume I have the following exemplary function:
template <typename Fn> auto call(Fn fn) -> decltype(fn()) {
return fn();
}
The important thing about this function is that its return type depends on its template parameter, which can be inferred. So ultimately, the return type depends on how the function is called.
Now, we also have a test class:
struct X {
int u;
auto test() -> decltype(call([this]() -> double {this->u = 5; return 7.4;})) {
return call([this]() -> double {this->u = 5; return 7.4;});
}
};
as you can see, X::test calls call, returning the same return value. In this case, the return type is trivially given as double, but let's assume for a bit we didn't know what call does and that the lambda has a more complicated return type.
If we try to compile this, the compiler will complain, because we're using this at the top level (not in a scope that would allow an expression):
error: lambda-expression in unevaluated context
error: invalid use of ‘this’ at top level
However, I have to use the capture of the lambda which I pass to call in order to get call's return type right. How would you suggest to get around this, while still leaving the lambda?
Note: Of course I could move the lambda to be an operator() of some helper type, which I instantiate with a copy of the this pointer, but I'd like to avoid that boilerplate.
I think the real error to be concerned about is "lambda-expression in unevaluated context". You can't use a lambda in an unevaluated context because every lambda expression has a unique type. That is, if decltype([]{}) were allowed it would deduce a different type than []{} in some other context. I.e. decltype([]{}) fn = []{}; wouldn't work.
Unless you want to just explicitly write the return type rather than have it deduced, I don't think you have any choice but to create a real type that you can use in the contexts you need, with whatever boilerplate that entails.
Although if changing test to not be a member function is acceptable then you could use the fact that lambda's can deduce their return type by omitting it if the body is only a single return statement:
template <typename Fn> auto call(Fn fn) -> decltype(fn()) {
return fn();
}
struct X {
int u;
};
int main() {
auto test = [](X *x) { return call([x]() -> double {x->u = 5; return 7.4; });};
X x;
test(&x);
}
It would be nice if the trailing return type syntax for functions had the same property. I'm not sure why it doesn't.
It seems to be a made up (construed, artificial) question, since
If you get the lambda from somewhere else, then it's named and no problem binding this.
If you're not getting the lambda from somewhere else, then you know the result type.
In short, as the problem is currently stated (as I'm writing this answer) there's no problem except one imposed by your own will.
But if you insist on that, well, just pass this as an argument instead of binding it via the lambda definition. Then for the call to call, bind the argument. But, perhaps needless to say, since that only solves a made-up problem it's a real Rube Goldberg construction, a decent into over-flowering needless complexity that doesn't solve anything real outside its own intricacies.
What was the original real problem, if any?
You shouldn't always copy-and-paste function body to decltype. The point of introducing late-specified return type was that you'll be able to somehow infer correct return type from arguments.
e.g. auto f(T x) -> decltype(g(x)) { return h(), g(x); }, not -> decltype(h(), g(x))
So, in your case, double test() is enough, because we know behavior of call and we know return type of lambda function we pass to it.
In more complex case, we should reduce code inside decltype, by using knowledge about call and other stuff.

Wrapped lambda expressions

Here's a lamba wrapper expression defined:
function <int(double)> f =
[](double x) -> int{ return static_cast <int> (x*x); };
It is used like this:
f(someintvalue);
What is the difference between normal functions and wrapped lambdas?
question is - what is the difference between normal function and wrapped lambda?
Normal function is a normal function and what you call "wrapped lambda" is actually a function object.
By the way, why use std::function? You could simply write this:
auto f = [](double x) { return static_cast <int> (x*x); };
//call
int result = f(100.0);
Also, I omitted the return type, as it is implicitly known to the compiler from the return expression. No need to write -> int in the lambda expression.
Lambdas can capture surrounding scope ([=] or [&]) resulting in anonymous structs that contain a member function.
Certain tasks, like accessing lambda (types) across Translation Units, or taking standard pointer-to-memberfunction addresses from lambdas may prove hard/useless because the actual type of the automatically generated 'anonymous' type cannot be known (and is in fact implementation defined).
std::function<> was designed to alleviate exactly that problem: being able to bind a function pointer (semantically) to a lambda (or indeed, whatever callable object)
1: in fact the type can have external linkage, but you cannot portably refer to that from another translation unit because the actual type is implementation defined (and need not even be expressible in C++ code; within the original TU you'd be able to use decltype to get the magic type. Thanks, Luc for the good precisions about this)