Why are copy-capturing lambdas not DefaultConstructible in C++20?

C++20 introduces DefaultConstructible lambdas. However, cppreference.com states that this is only for stateless lambdas:
If no captures are specified, the closure type has a defaulted default constructor. Otherwise, it has no default constructor (this includes the case when there is a capture-default, even if it does not actually capture anything).
Why does this not extend to lambdas that capture things that are DefaultConstructible? For instance, why can [p{std::make_unique<int>(0)}](){ return p.get(); } not be DefaultConstructible, where the captured p would be nullptr?
Edit: For those asking why we would want this, the behavior only seems natural because one is forced to write something like this when calling standard algorithms that require functors to be default-constructible:
struct S{
S() = default;
int* operator()() const { return p.get(); }
std::unique_ptr<int> p;
};
So, we can pass in S{std::make_unique<int>(0)}, which does the same thing.
It seems like it would be much better to be able to write [p{std::make_unique<int>(0)}](){ return p.get(); } versus creating a struct that does the same thing.
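The cppreference rule quoted above can be checked with a type trait. What follows is only a minimal sketch, assuming a C++17 or later compiler; the helper `is_default_constructible_closure` and the `*_case` function names are made up for illustration. The captureless case is only expected to report true when the code is compiled as C++20.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical helper: closure types have no name, so deduce the type
// from an instance and ask the trait about it.
template <class F>
constexpr bool is_default_constructible_closure(const F&) {
    return std::is_default_constructible_v<F>;
}

// No captures at all: default constructible from C++20 on, not before.
inline bool captureless_case() {
    auto f = [] { return 42; };
    return is_default_constructible_closure(f);
}

// A capture-default that captures nothing still removes the constructor.
inline bool capture_default_case() {
    auto f = [=] { return 42; };
    return is_default_constructible_closure(f);
}

// The init-capture from the question: not default constructible.
inline bool init_capture_case() {
    auto f = [p = std::make_unique<int>(0)] { return p.get(); };
    return is_default_constructible_closure(f);
}

inline void check_quoted_rule() {
    assert(!capture_default_case());
    assert(!init_capture_case());
#if __cplusplus >= 202002L
    assert(captureless_case());  // only checked when built as C++20
#endif
}
```

Calling `check_quoted_rule()` from `main` runs all the checks.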

There are two reasons not to do it: conceptual and safety.
Despite the desires of some C++ programmers, lambdas are not meant to be a short syntax for a struct with an overloaded operator(). That is what C++ lambdas are made of, but that's not what lambdas are.
Conceptually, a C++ lambda is supposed to be a C++ approximation of a lambda function. The capture functionality is not meant to be a way to write members of a struct; it's supposed to mimic the proper lexical scoping capabilities of lambdas. That's why they exist.
Creating such a lambda (initially, not by copy/move of an existing one) outside of the lexical scope that it was defined within is conceptually vacuous. It doesn't make sense to write a thing bound to a lexical scope, then create it outside of the scope it was built for.
That's also why you cannot access those members outside of the lambda. Because, even though they could be public members, they exist to implement proper lexical scoping. They're implementation details.
To construct a "lambda" that "captures variables" without actually capturing anything only makes sense from a meta-programming perspective. That is, it only makes sense when focusing on what lambdas happen to be made of, rather than what they are. A lambda is implemented as a C++ struct with captures as members, and the capture expressions don't even technically have to name local variables, so those members could theoretically be value initialized.
If you are unconvinced by the conceptual argument, let's talk safety. What you want to do is declare that any lambda shall be default constructible if all of its captures are non-reference captures and are of default constructible types. This invites disaster. Why?
Because the writer of many such lambdas didn't ask for that. If a lambda captures a unique_ptr<T> by moving from a variable that points to an object, it is 100% valid (under the current rules) for the code inside that lambda to assume that the captured value points to an object. Default construction, while syntactically valid, is semantically nonsense in this case.
With a proper named type, a user can easily control if it is default constructible or not. And therefore, if it doesn't make sense to default construct a particular type, they can forbid it. With lambdas, there is no such syntax; you have to impose an answer on everyone. And the safest answer for capturing lambdas, the one that is guaranteed to never break code, is "no."
By contrast, default construction of captureless lambdas can never be incorrect. Such functions are "pure" (with respect to the contents of the functor, since the functor has no contents). This also matches with the above conceptual argument: a captureless lambda has no proper lexical scope and therefore spawning it anywhere, even outside of its original scope, is fine.
If you want the behavior of a named struct... just make a named struct. You don't even need to default the default constructor; you'll get one by default (if you declare no other constructors).
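To make the safety argument concrete, here is a small sketch of the two choices a named type gives you. The names `MaybeNull` and `NeverNull` are invented for this example; the first mirrors the struct from the question, the second is one possible way to forbid default construction when the call operator relies on a non-null pointer.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// The struct from the question, with no constructors declared:
// the implicit default constructor leaves `p` null.
struct MaybeNull {
    std::unique_ptr<int> p;
    int* operator()() const { return p.get(); }
};

// A variant whose author wants `p` to always point at an object,
// so default construction is explicitly forbidden.
class NeverNull {
public:
    NeverNull() = delete;
    explicit NeverNull(std::unique_ptr<int> q) : p(std::move(q)) {
        assert(p != nullptr);  // the invariant the call operator relies on
    }
    int operator()() const { return *p; }

private:
    std::unique_ptr<int> p;
};

static_assert(std::is_default_constructible_v<MaybeNull>);
static_assert(!std::is_default_constructible_v<NeverNull>);
```

A lambda offers no place to write the equivalent of `NeverNull() = delete;`, which is the point of the answer above.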

Related

Why use `std::bind_front` over lambdas in C++20?

As mentioned in a similarly worded question (Why use bind over lambdas in C++14?), the answer was: no reason (it also explained why it would be better to use lambdas).
My question is: if in C++14 there was no longer a reason to use bind, why did the standards committee find it necessary to add std::bind_front in C++20?
Does it now have any new advantage over a lambda?
bind_front binds the first X parameters, but if the callable calls for more parameters, they get tacked onto the end. This makes bind_front very readable when you're only binding the first few parameters of a function.
The obvious example would be creating a callable for a member function that is bound to a specific instance:
type *instance = ...;
//lambda
auto func = [instance](auto &&... args) -> decltype(auto) {return instance->function(std::forward<decltype(args)>(args)...);};
//bind
auto func = std::bind_front(&type::function, instance);
The bind_front version is a lot less noisy. It gets right to the point, having exactly 3 named things: bind_front, the member function to be called, and the instance on which it will be called. And that's all that our situation calls for: a marker to denote that we're creating a binding of the first parameters of a function, the function to be bound, and the parameter we want to bind. There is no extraneous syntax or other details.
By contrast, the lambda has a lot of stuff we just don't care about at this location. The auto... args bit, the std::forward stuff, etc. It's a bit harder to figure out what it's doing, and it's definitely much longer to read.
Note that bind_front doesn't allow bind's placeholders at all, so it's not really a replacement. It's more a shorthand for the most useful forms of bind.
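For anyone who wants to try the comparison, here is a compilable sketch of both forms. `Accumulator` and the two `bind_add_*` functions are hypothetical stand-ins for `type` and `function` above. `std::bind_front` is only available with a C++20 standard library, so the sketch checks the `__cpp_lib_bind_front` feature-test macro and falls back to the lambda otherwise.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical type standing in for `type` in the snippet above.
struct Accumulator {
    int total = 0;
    int add(int a, int b) { total += a + b; return total; }
};

// Hand-written equivalent: bind the instance, forward everything else.
inline auto bind_add_with_lambda(Accumulator* acc) {
    assert(acc != nullptr);
    return [acc](auto&&... args) -> decltype(auto) {
        return acc->add(std::forward<decltype(args)>(args)...);
    };
}

inline auto bind_add_with_bind_front(Accumulator* acc) {
    assert(acc != nullptr);
#if defined(__cpp_lib_bind_front)
    // The instance becomes the first argument; the rest are appended.
    return std::bind_front(&Accumulator::add, acc);
#else
    // Pre-C++20 library: fall back so the sketch still compiles.
    return bind_add_with_lambda(acc);
#endif
}
```

Both callables are then used the same way, e.g. `bind_add_with_bind_front(&acc)(1, 2)`.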
The paper that proposed it, Simplified partial function application, has some compelling use cases. I will summarize them here, because otherwise I would have to quote most of the paper, so definitely go check it out:
Automatic perfect forwarding
Using a lambda would involve std::forward boilerplate
Propagating mutability
When an object is stored by value, std::bind and std::bind_front propagate constness, but with a capturing lambda the user must choose either a mutable or a const version, which creates problems
Preserving return type
Using a lambda would involve -> decltype(auto) boilerplate on the user side.
Preserving value category
Like preserving mutability, except now we are talking about lvalue/rvalue and only std::bind_front does this correctly
Supporting one-shot invocation
A consequence of propagating mutability and preserving value category
Preserving exception specification
This matters even more now that the exception specification is part of the type system
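As an illustration of just one item from the list, "Preserving return type", here is a lambda-only sketch (so it compiles before C++20 as well). `Slot` and `compare_return_types` are invented names. It shows what is lost when the `-> decltype(auto)` boilerplate is left out.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical callee that returns a reference.
struct Slot {
    int value = 0;
    int& get() { return value; }
};

inline void compare_return_types() {
    Slot slot;

    // Without a trailing return type, `auto` deduction drops the reference.
    auto copies = [](Slot& s) { return s.get(); };
    // `decltype(auto)` keeps exactly what the callee returns.
    auto forwards = [](Slot& s) -> decltype(auto) { return s.get(); };

    static_assert(std::is_same_v<decltype(copies(slot)), int>);
    static_assert(std::is_same_v<decltype(forwards(slot)), int&>);

    forwards(slot) = 5;            // writes through to slot.value
    assert(slot.value == 5);
    int snapshot = copies(slot);   // an independent copy
    assert(snapshot == 5);
}
```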
cppreference has some useful notes as well:
This function is intended to replace std::bind. Unlike std::bind, it
does not support arbitrary argument rearrangement and has no special
treatment for nested bind-expressions or std::reference_wrappers. On
the other hand, it pays attention to the value category of the call
wrapper object and propagates exception specification of the
underlying call operator.

Where is capture by reference more correct in the standard algorithms?

CppCoreGuideline F.52 states that it is more correct to capture by reference for lambdas that are used in algorithms.
I fail to see why - the algorithms are mostly defined with value semantics.
In what situations is capturing by reference more correct?
Note that the guideline doesn't say "for correctness," it says "for efficiency and correctness." It's certainly more efficient to capture by reference, since the functors and predicates used in standard algorithms are passed by value. If you need access to big(gish) local objects in them, capturing by value would mean copying them with each copy of the functor. Capture by reference lets you work on the local variables directly.
I confess that I cannot actually think of a scenario where using references would help correctness. The reason is simple: entities captured by value are const-qualified by default, so if you intend to modify a local variable in the lambda and accidentally capture it by copy instead of reference, you'll get a compilation error (unless you mark the lambda's call operator mutable, at which point you're obviously paying enough attention not to need a rule of thumb).
For one, capturing by value is not always possible. The objects in the example, for instance, contain threads and are hence most likely not copy constructible.
Another example would be random number generators in a loop: usually you want to ensure you don't get the same sequence over and over again, which is what happens if you capture by value (as angw points out, however, your lambda would have to be mutable to work in the first place).
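The random number case can be reproduced in a few lines. This is a sketch with made-up function names, using `std::mt19937` with a fixed seed purely so the behaviour is repeatable. `std::generate` takes its generator by value, which is what makes the two capture modes differ.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Capture the engine by value: std::generate advances a copy of the
// lambda, so the engine stored in `draw` never moves on and both
// vectors receive the same numbers.
inline bool by_value_repeats_sequence() {
    std::mt19937 engine(42);
    auto draw = [engine]() mutable { return engine(); };
    std::vector<std::mt19937::result_type> first(3), second(3);
    std::generate(first.begin(), first.end(), draw);
    std::generate(second.begin(), second.end(), draw);
    return first == second;
}

// Capture by reference: every copy of the lambda advances the one
// shared engine, so the second vector continues the sequence.
inline bool by_reference_repeats_sequence() {
    std::mt19937 engine(42);
    auto draw = [&engine] { return engine(); };
    std::vector<std::mt19937::result_type> first(3), second(3);
    std::generate(first.begin(), first.end(), draw);
    std::generate(second.begin(), second.end(), draw);
    return first == second;
}

inline void check_rng_capture() {
    assert(by_value_repeats_sequence());
    assert(!by_reference_repeats_sequence());
}
```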
The guideline states:
F.52: Prefer capturing by reference in lambdas that will be used locally, including passed to algorithms
For efficiency and correctness, you nearly always want to capture by reference when using the lambda locally. This includes when writing or calling parallel algorithms that are local because they join before returning.
As far as efficiency is concerned, capturing by reference ensures you're not copying large objects around and wasting resources (since the lambda itself can also be copied around). Plus, sometimes it's the only viable way, if your objects are non-copyable.
Regarding correctness, I'd be inclined to agree with the other answers. Still, bearing in mind that
lambdas [..] will be used locally
(so that we don't have to deal with dangling references), one might speculate that there are corner cases where capture by value might (arguably) behave unintuitively:
#include <iostream>
int the_variable = 42;
void test( int& value ) {
auto modify_the_variable = [value] () mutable {
value = 2; // Not actually a reference this one
};
modify_the_variable();
}
int main()
{
test(the_variable);
std::cout << the_variable; // Still 42
}
One might expect that, since capturing by value is in effect and the lambda is marked as mutable, the type of the captured value would be int&. However, §5.1.5/16 says otherwise:
An entity is captured by copy if
(16.1) — it is implicitly captured, the capture-default is =, and the captured entity is not *this, or
(16.2) — it is explicitly captured with a capture that is not of the form this, & identifier, or & identifier
initializer.
For each entity captured by copy, an unnamed non-static data member is declared in the closure type. The
declaration order of these members is unspecified. The type of such a data member is the referenced type
if the entity is a reference to an object, an lvalue reference to the referenced function type if the entity
is a reference to a function, or the type of the corresponding captured entity otherwise
(emphasis mine)
In this case capturing by reference would do the right thing. Note that the guideline says:
including passed to algorithms
i.e. not only limited to standard library algorithms.
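For comparison, here are the by-copy and by-reference versions side by side as a self-checking sketch. The function names are invented for the example; the behaviour is the one described by the quoted paragraph.

```cpp
#include <cassert>

// Copy capture: the closure stores an int copied from the referenced
// object, even though the parameter itself is an int&.
inline void set_through_copy(int& value) {
    auto modify = [value]() mutable { value = 2; };  // changes the copy only
    modify();
}

// Reference capture: the closure refers to the caller's object.
inline void set_through_reference(int& value) {
    auto modify = [&value] { value = 2; };
    modify();
}

inline void check_capture_modes() {
    int the_variable = 42;
    set_through_copy(the_variable);
    assert(the_variable == 42);  // untouched
    set_through_reference(the_variable);
    assert(the_variable == 2);   // modified
}
```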
What makes you think that "The algorithms are mostly defined with value semantics"?
If you look at the algorithms that need to store an internal value, e.g. std::find, std::fill, std::count etc., they all take their inputs by const reference.
I would agree, however, that lambdas in general can be used outside of the scope they are defined in, which, as you mentioned in your GitHub issue, can lead to dangling references. This is why the guideline specifically mentions lambdas used in algorithms.
It's safe to say that capturing a local object by reference enables the algorithm to use that object with pretty much no overhead and, of course, with no unnecessary copies.
The "correctness" argument obviously refers to lambdas that mutate the captured value. Capturing by value in this case is easy to overlook as it would still compile. Essentially the guideline says "Instead of trying to decide what kind of capture this specific case requires, simply always capture by reference".
It's important to notice that all that reasoning does not apply to iterators and functor objects, as they are not part of the lambda's closure. Take a look at the std::find declaration:
template< class InputIt, class T >
InputIt find( InputIt first, InputIt last, const T& value );
const T& value is what matters here - it is what your lambda would've captured if you implemented the same logic with find_if.
Iterators are just a part of looping mechanics and the fact that they are passed by value is completely irrelevant to the guideline in question.
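To illustrate the parallel, here is a sketch of the same search written once with std::find and once with std::find_if plus a reference capture. The function names are made up for the example. In both versions the needle is never copied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// std::find receives the needle as `const T& value`.
inline std::ptrdiff_t index_via_find(const std::vector<std::string>& words,
                                     const std::string& needle) {
    return std::distance(words.begin(),
                         std::find(words.begin(), words.end(), needle));
}

// The find_if version gets the same effect by capturing by reference.
inline std::ptrdiff_t index_via_find_if(const std::vector<std::string>& words,
                                        const std::string& needle) {
    auto matches = [&needle](const std::string& word) { return word == needle; };
    return std::distance(words.begin(),
                         std::find_if(words.begin(), words.end(), matches));
}

inline void check_find_variants() {
    const std::vector<std::string> words{"alpha", "beta", "gamma"};
    assert(index_via_find(words, "beta") == 1);
    assert(index_via_find_if(words, "beta") == 1);
    assert(index_via_find_if(words, "delta") == 3);  // not found: end index
}
```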

Is it wise to prefer lambdas to function objects?

After some searching and testing, I have learned the following facts about lambda expressions: 1) when we write a lambda expression, the compiler creates an anonymous function object type for it and makes the lambda an instance of that type; 2) the captured variables of a lambda expression are used to initialize the member data of the created function object; 3) when we store a lambda function, we actually get a named instance of the function object; 4) a generic lambda function is actually a function object template; 5) a stored (plain and even generic) lambda expression can be declared and defined with template; and 6) a stored lambda expression template can even be partially specialized, just as function objects.
Given all the features of lambdas stated above, it seems to me that, through lambdas, we are able to do whatever we used to do with function objects, and regarding efficiency, they should have the same performance.
On the other hand, lambdas also have additional advantages: 1) a lambda expression is more understandable than a function object, especially for inline, short functions; and 2) defining a stored lambda can be seen as a kind of syntactic sugar for defining a function object, and making an instance of it.
Therefore, for me, it seems that we have no reasons to define a function object manually any more.
Of course, I also have some concerns about substituting lambdas for function objects universally, like 1) for functions longer than, say, 10 lines, defining them as a stored lambda may be unusual (or even awkward, I don't know), and 2) defining lambdas at file level may (or may not, I am not quite sure) cause some unexpected problems.
Here are my questions: is it wise to prefer lambdas to function objects? Are there any other advantages that function objects have but lambdas do not? Are my concerns reasonable? And is there any other concern that I should be aware of when using lambdas instead of function objects universally?
Thanks for any reply!
Lambdas are a terse syntax for certain kinds of function objects.
They cannot be trivially constructed, they can have exactly one (possibly template) operator(), and their type cannot be named without first having access to an instance and using decltype.
As of C++14, they are not constexpr friendly, and they are not guaranteed to be trivially copyable even if their state should be.
Two lambdas with the same capture types and method do not share a type unless declared at the same spot; this can cause symbol bloat.
You cannot declare other operations in a lambda besides (), like friend bool operator< or ==, or whatever.
Given these restrictions, sure, use lambdas. Terseness has lots of utility.
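Two of the restrictions listed above are easy to demonstrate. This is a sketch with invented names (`AddN`, `check_lambda_restrictions`): identical-looking lambdas have distinct types, and only a named functor can carry extra operations such as `operator==`.

```cpp
#include <cassert>
#include <type_traits>

// A named functor can declare more than operator().
struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
    friend bool operator==(const AddN& a, const AddN& b) { return a.n == b.n; }
};

inline void check_lambda_restrictions() {
    // Same signature and body, but two separate closure types.
    auto first = [](int x) { return x + 1; };
    auto second = [](int x) { return x + 1; };
    static_assert(!std::is_same_v<decltype(first), decltype(second)>);
    assert(first(1) == second(1));

    // The named type supports comparison; closures do not.
    assert(AddN{1} == AddN{1});
    assert(AddN{2}(3) == 5);
}
```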

Why would one ever use lambda beyond the declaring scope or functions called from the declaring scope?

There are several ways to pass callable objects as parameters or to store them for future use. You can create a class with operator(), you can define a function and pass a pointer to it, and, since C++11, you can define a lambda via [](){} syntax.
I appreciate lambda syntax as a shortcut in expressions such as find_if that often beg for a compact callable expression. What I don't understand about lambdas is the desire to use them outside the point of their declaration and risk introducing dangling references and such. C++ already has a powerful way to pass callable objects around which is much safer than lambdas, and in those situations there is no benefit from the compact expression of a lambda.
Thus the question: why does C++11 allow use of a lambda outside the function that declares it or the functions called from it (and therefore introduce the risk of dangling references, etc.)? Could you give an example where keeping a lambda alive outside the declaring function would be desirable?
Consider a function which is registered to be called when a future event occurs. It would be convenient to define it as a lambda, but it has to live beyond the scope in which it is defined:
for example
m_button->setOnClick(YOUR LAMBDA GOES HERE);
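Spelled out a little further, the idea might look like the sketch below. The `Button` class is a hypothetical stand-in for a real GUI widget, and `install_click_counter` is an invented helper. The lambda captures a `std::shared_ptr` by value, so nothing dangles after the registering function returns.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the GUI widget in the line above.
class Button {
public:
    void setOnClick(std::function<void()> handler) {
        on_click_ = std::move(handler);
    }
    void click() const {
        if (on_click_) on_click_();
    }

private:
    std::function<void()> on_click_;
};

// Registers a handler and returns. The lambda is stored inside the
// button and keeps the counter alive through its shared_ptr capture.
inline std::shared_ptr<int> install_click_counter(Button& button) {
    auto clicks = std::make_shared<int>(0);
    button.setOnClick([clicks] { ++*clicks; });
    return clicks;
}

inline void check_click_counter() {
    Button button;
    auto clicks = install_click_counter(button);
    button.click();
    button.click();
    assert(*clicks == 2);
}
```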
What I don't understand about lambdas is the desire to use them outside the point of their declaration and risk introducing dangling references and such. C++ already has a powerful way to pass callable objects around which is much safer than lambdas, and in those situations there is no benefit from the compact expression of a lambda.
(1) A lambda isn't inherently less safe than any other way of defining function objects. The way of passing a lambda is exactly the same as passing an instance of a named functor.
You can store references in a named functor, and you can capture references in a lambda. Storing a reference to a local object in either of those cases is a severe bug if the function object outlives the scope where those references were bound.
Whether the syntax of lambda is beneficial or not, is a matter of preference. I suppose, one could argue that because lambdas make the definition of functors simpler, it also makes the definition of broken functors simpler.
why does C++11 allow use of a lambda outside the function that declares it or the functions called from it (and therefore introduce the risk of dangling references, etc.)?
Firstly, I imagine such semantic limitation would be hard to implement. You can't make them non-copyable because that would make them useless in standard algorithms.
Secondly, because storing a function object for later use is useful, see (2) and using lambdas isn't more dangerous than using instances of named functors, see (1).
Could you give an example where keeping a lambda alive outside the declaring function would be desirable?
(2) Just about any asynchronous callback situation. std::async, std::thread, GUI and other event systems. Callable function objects will be stored for later use in those situations and typically the objects do outlive the scope where they were created.
In general, and also in this case, the advantage of lambdas over named functor types is that you get to place the function definition right where it's used. Well, you can never have the definition where it's actually used in a generic situation of asynchronous callbacks, but the point of registering the callback is as close as you can get.
The disadvantage of lambdas is their hard-for-humans-to-parse syntax, which is an explosion of different brackets, braces and parentheses. Again, this is a matter of preference.

Is there a reason lambdas with an empty capture-list can't be default constructed?

C++'s lambdas would be convenient to use in templates that need function objects but alas, they cannot be default constructed.
As discussed in this question, this makes sense for lambdas that have a non-empty capture-list.
Instantiating C++ lambda by its type
Kerrek explains:
The code doesn't make sense. Imagine you have a capturing lambda like
this:
{
int n = 0;
auto t = [&n](int a) -> int { return n += a; };
}
What could it possibly mean to default-construct an object of type
decltype(t)?
What about lambdas with an empty capture-list? Is there a reason those also don't make sense to default construct? Is there anything more to it than "the standard says so"?
In general, lambdas were specified as little as possible to solve specific use cases.
Other possibly useful things, like "a lambda that only copies trivially copyable data must be trivially copyable", were also omitted. (The standard does not specify if a lambda is trivially copyable or not.)
The upside is that it makes lambda an easier to implement feature, which is important. The downside is that this rules out certain uses that are not in the "intended" set.
If you think that "a capture free lambda must have a zero-argument constructor" is an important thing, propose it. But this takes one use of lambdas (easy local capture and creation of function objects) and morphs it into something else (easy creation of stateless function objects whose types can be passed around).
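A typical "template that needs a function object" is an ordered container with a custom comparator. The sketch below uses an invented function name, `sorted_descending`, and passes the closure to the constructor, which works under every standard. As the first question on this page notes, C++20 makes captureless closure types default constructible, so under C++20 the constructor argument could be omitted.

```cpp
#include <cassert>
#include <set>
#include <vector>

inline std::vector<int> sorted_descending(const std::vector<int>& input) {
    auto descending = [](int a, int b) { return a > b; };
    // Passing the closure works under every standard. Under C++20 the
    // argument could be dropped, because the captureless closure type
    // is default constructible there.
    std::set<int, decltype(descending)> ordered(descending);
    ordered.insert(input.begin(), input.end());
    return std::vector<int>(ordered.begin(), ordered.end());
}

inline void check_sorted_descending() {
    assert((sorted_descending({3, 1, 2, 3}) == std::vector<int>{3, 2, 1}));
}
```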