Construction of lambda object in case of specified captures in C++ - c++

Starting from C++20 closure types without captures have default constructor, see https://en.cppreference.com/w/cpp/language/lambda:
If no captures are specified, the closure type has a defaulted default constructor.
But what about closure types that capture, how can their objects be constructed?
One way is by using std::bit_cast (provided that the closure type can be trivially copyable). And Visual Studio compiler provides a constructor for closure type as the example shows:
#include <bit>
int main() {
int x = 0;
using A = decltype([x](){ return x; });
// ok everywhere
constexpr A a = std::bit_cast<A>(1);
static_assert( a() == 1 );
// ok in MSVC
constexpr A b(1);
static_assert( b() == 1 );
}
Demo: https://gcc.godbolt.org/z/dnPjWdYx1
Considering that both Clang and GCC reject A b(1), the standard does not require the presence of this constructor. But can a compiler provide such constructor as an extension?

But what about closure types that capture, how can their objects be constructed?
You can't. They can only be created from the lambda expression.
And no, bit_cast does not "work everywhere". There is no rule in the C++ standard which requires that any particular lambda type must be trivially copyable (or the same size as its capture member for that matter). The fact that no current implementations break your code does not mean that future implementations cannot.
And it definitely won't work if you have more than one capture member.
Just stop treating lambdas like a cheap way to create a type. If you want to make a callable type with members that you can construct, do that:
#include <bit>
int main() {
struct A
{
int x = 0;
constexpr auto operator() {return x;}
};
// ok everywhere
constexpr A b(1);
static_assert( b() == 1 );
}

Since this is tagged language-lawyer, here's what the C++ standard has to say about all this.
But what about closure types that capture, how can their objects be constructed?
The actual part of the standard that cppreference link is referencing is [expr.prim.lambda.general] - 7.5.5.1.14:
The closure type associated with a lambda-expression has no default constructor if the lambda-expression has a lambda-capture and a defaulted default constructor otherwise. It has a defaulted copy constructor and a defaulted move constructor ([class.copy.ctor]). It has a deleted copy assignment operator if the lambda-expression has a lambda-capture and defaulted copy and move assignment operators otherwise ([class.copy.assign]).
However, clauses 1 and 2 say:
The type of a lambda-expression (which is also the type of the closure object) is a unique, unnamed non-union class type, called the closure type, whose properties are described below.
The closure type is not an aggregate type. An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing:
[unrelated stuff]
Which means that (apart from the unrelated exceptions), the described interface of the lambda as stated is exhaustive. Since no other constructors than the default one is listed, then that's the only one that is supposed to be there.
N.B. : A lambda may be equivalent to a class-based functor, but it is not purely syntactical sugar. The compiler/implementation does not need a constructor in order to construct and parametrize the lambda's type. It's just programmers who are prevented from creating instances by the lack of constructors.
As far as extensions go:
But can a compiler provide such constructor as an extension?
Yes. A compiler is allowed to provide this feature as an extension as long as all it does is make programs that would be ill-formed functional.
From [intro.compliance.general] - 4.1.1.8:
A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any well-formed program. Implementations are required to diagnose programs that use such extensions that are ill-formed according to this document. Having done so, however, they can compile and execute such programs.
However, for the feature at hand, MSVC would be having issues in its implementation as an extension:
It should be emmiting a diagnostic.
By its own documentation, it should refuse the code when using /permissive-. Yet it does not.
So it looks like MSVC is, either intentionally or not, behaving as if this was part of the language, which is not the case as far as I can tell.

Related

Lambda closure type constructors

The cppreference shows that there are different rules for lambda closure type constructors.
Default Construction - Until C++14
ClosureType() = delete; (until C++14)
Closure types are not Default Constructible. Closure types have a
deleted (until C++14)no (since C++14) default constructor.
Default Construction - Since C++14
Closure types have no (since C++14) default constructor.
Default Construction - Since C++20
If no captures are specified, the closure type has a defaulted default
constructor. Otherwise, it has no default constructor (this includes
the case when there is a capture-default, even if it does not actually
capture anything).
Copy Assignment Operator - Until C++20
The copy assignment operator is defined as deleted (and the move
assignment operator is not declared). Closure types are not
CopyAssignable.
Copy Assignment Operator - Since C++20
If no captures are specified, the closure type has a defaulted copy
assignment operator and a defaulted move assignment operator.
Otherwise, it has a deleted copy assignment operator (this includes
the case when there is a capture-default, even if it does not actually
capture anything).
What is the reason behind this change in the rules? Did standard committee identified some short comings in the standard for lambda closure type construction? If so, what are those short comings?
There was a shortcoming. We couldn't use lambdas quite as "on the fly" as one might have wanted. C++20 (with the addition of allowing lambdas in unevaluated contexts) makes this code valid:
struct foo {
int x, y;
};
std::map<foo, decltype([](foo const& a, foo const& b) { return a.x < a.y; })> m;
Note how we defined the compare function inline? No need to create a named functor (could be a good idea otherwise, but we aren't forced too). And there's no need to break the declaration in two:
// C++17
auto cmp = [](foo const& a, foo const& b) { return a.x < a.y; };
std::map<foo, decltype(cmp)> m(cmp); // And also need to pass and hold it!
Usages like this (and many more) were the motivating factor in making this change. In the example above, the anonymous functors type will bring all the benefits a named functor type can bring. Default initialization and EBO among them.

Forwarding to an Aggregate Initializer?

It seems to me that aggregate initialization (of suitable types) is not considered a constructor that you can actually call (except a few cases.)
As an example, if we have a very simple aggregate type:
struct Foo {
int x, y;
};
then this obviously works:
auto p = new Foo {42, 17}; // not "Foo (42, 17)" though...
but this doesn't work on any compiler that I've tested (including latest versions of MSVC, GCC, and Clang):
std::vector<Foo> v;
v.emplace_back(2, 3);
Again, it seems to me that any code that wants to call the constructor for a type T (in this case, the code in vector::emplace_back that forwards the passed arguments to the c'tor of T,) cannot use aggregate initialization simply (it seems) because they use parentheses instead of curly braces!
Why is that? Is it just a missed feature (nobody has proposed/implemented it yet,) or there are deeper reasons? This is a little strange, because aggregate types by definition have no other constructor to make the resolution ambiguous, so the language could have just defined a default aggregate constructor (or something) that would have all the members as defaulted arguments.
Is it just a matter of syntax? If the implementation of vector::emplace_back in the above example had used placement new with curly braces instead of parentheses, would it have worked?
Note: I want to thank those comments who've pointed out the behavior of vector and emplace because their comments will be valuable to those who'll find this question using those keywords, but I also want to point out that those are just examples. I picked the most familiar and concise example, but my point was about explicitly calling the aggregate initializer in any code (or in placement new, more specifically.)
For what it's worth, P0960 "Allow initializing aggregates from a parenthesized list of values" does exactly what it says. It seems to have passed EWG and is on its way into C++20.
aggregate types by definition have no other constructor to make the resolution ambiguous
That is incorrect. All classes have default constructors, as well as copy/move constructors. Even if you = delete them or they are implicitly deleted, they still technically have such constructors (you just can't call them).
C++ being C++, there are naturally corner cases where even P0960 does the "wrong thing", as outlined in the paper:
struct A;
struct C
{
operator A(); //Implicitly convertible to `A`
};
struct A { C c; }; //First member is a `C`
C c2;
A a(c2);
The initialization of a is a case of ambiguity. Two things could happen. You could perform implicit conversion of c2 to an A, then initialize a from the resulting prvalue. Or you could perform aggregate initialization of a by a single value of type C.
P0960 takes the backwards compatible route: if a constructor could be called (under existing rules), then it always takes priority. Parentheses only invoke aggregate initialization if there is no constructor that could have been called.
https://en.cppreference.com/w/cpp/language/aggregate_initialization
Aggregate Initialization is not a constructor.
According to this document, you did not define any constructors and meet the other conditions for Aggregate Initialization. (Refer to the item "class type" in the section "Explanation") That means your code does not call something like automatically generated constructor of signature Foo(int, int) but it is just another feature.
The document says about its effect is:
Each array element, or non-static class member, in order of array subscript/appearance in the class definition, is copy-initialized from the corresponding clause of the initializer list.
As vector::emplace_back(Args&&... args) works this way and it cannot find such constructor.
The arguments args... are forwarded to the constructor as std::forward<Args>(args)....
So it does not find such constructor.
Thinking this way, it also makes sense that your code cannot compile auto p = new Foo (42, 17);.
One more example, if you write any kind of constructor(even Foo::Foo() {}), auto p = new Foo {42, 17}; does not work. Because now it does not meet the condition for Aggregate Initialization.
As far as I know Aggregate Initialization also works in C which does not even support constructors.
Here's a good article worth reading.

polymorphic unique_ptr copy elision

I have the following code that works on Clang 5.0 but not on Clang 3.8, with C++14 enabled:
class Base {};
class Derived : public Base {};
std::unique_ptr<Base> MakeDerived()
{
auto derived = std::make_unique<Derived>();
return derived;
}
int main()
{
auto base = MakeDerived();
std::cout << "Type: " << typeid(base).name() << '\n';
}
Live Sample Here
Is the return value technically being move-constructed in this case, due to copy elision? And if so, does that mean unique_ptr's move constructor is designed to support implicit upcasts of user class types? It's hard to tell from the cppreference documentation (#6), unless I'm supposed to assume on that documentation page that template type U is distinctly different from type T for the class itself.
The one thing that makes it obvious this works is that unique_ptr is not copy constructible, and since I don't convert it to an rvalue via std::move(), there's no other explanation as to why the code would compile other than copy elision. However because it doesn't work with clang 3.8, which accepts the -std=c++14 flag, I want to make sure the standard itself guarantees that somewhere (and if so, where) and this is just a compiler bug or lack of support issue in v3.8 of Clang.
Copy elision (before C++17) always requires the elided constructor to be actually accessible, so it cannot be the cause of your code working.
Note, however, that C++ (since C++11) has a rule (12.8/32) which boils down to the following: "when returning an object, then under certain circumstances, first try treating the object as an rvalue and only if that fails, treat it as the lvalue it is."
In C++11, these "certain circumstances" were "copy elision is possible," which requires the returned object and the return type to be the same type (modulo cv qualification).
In C++14, these "certain circumstances" were relaxed to "copy elision is possible, or the returned object is a local variable in the function."
So in C++11, the code fails, because std::unique_ptr<Derived> (the type of derived) is different from std::unique_ptr<Base> (the return type), and so std::move would have to be used.
In C++14, the code succeeds, because derived is a local in the function, and is thus treated as an rvalue first. This makes the constructor template you were referring to applicable:
std::unique_ptr<T> can be constructed from an rvalue of type std::unique_ptr<U> if U* can be converted to T*.
Your Clang 3.8 seems to be exhibiting the C++11 behaviour; perhaps it has not (yet/fully) implemented the relaxed aspects of C++14 in that version.
(And yes, indeed you're supposed to assume that the "template type" U is distinct from the class's template parameter T. That's why it was introduced in the first place: the constructor being described by #6 is a constructor template).
That return invokes unique_ptr's converting move constructor template, template<class U, class E> unique_ptr(unique_ptr<U, E>&&). There's no elision; it depends on an implicit move from derived enabled by the resolution of core issue 1579.

initializing a non-copyable member (or other object) in-place from a factory function

A class must have a valid copy or move constructor for any of this syntax to be legal:
C x = factory();
C y( factory() );
C z{ factory() };
In C++03 it was fairly common to rely on copy elision to prevent the compiler from touching the copy constructor. Every class has a valid copy constructor signature regardless of whether a definition exists.
In C++11 a non-copyable type should define C( C const & ) = delete;, rendering any reference to the function invalid regardless of use (same for non-moveable). (C++11 §8.4.3/2). GCC, for one, will complain when trying to return such an object by value. Copy elision ceases to help.
Fortunately, we also have new syntax to express intent instead of relying on a loophole. The factory function can return a braced-init-list to construct the result temporary in-place:
C factory() {
return { arg1, 2, "arg3" }; // calls C::C( whatever ), no copy
}
Edit: If there's any doubt, this return statement is parsed as follows:
6.6.3/2: "A return statement with a braced-init-list initializes the object or reference to be returned from the function by copy-list-initialization (8.5.4) from the specified initializer list."
8.5.4/1: "list-initialization in a copy-initialization context is called copy-list-initialization." ¶3: "if T is a class type, constructors are considered. The applicable constructors are enumerated and the best one is chosen through overload resolution (13.3, 13.3.1.7)."
Do not be misled by the name copy-list-initialization. 8.5:
13: The form of initialization (using parentheses or =) is generally insignificant, but does matter when the
initializer or the entity being initialized has a class type; see below. If the entity being initialized does not
have class type, the expression-list in a parenthesized initializer shall be a single expression.
14: The initialization that occurs in the form
T x = a;
as well as in argument passing, function return, throwing an exception (15.1), handling an exception (15.3), and aggregate member initialization (8.5.1) is called copy-initialization.
Both copy-initialization and its alternative, direct-initialization, always defer to list-initialization when the initializer is a braced-init-list. There is no semantic effect in adding the =, which is one reason list-initialization is informally called uniform initialization.
There are differences: direct-initialization may invoke an explicit constructor, unlike copy-initialization. Copy-initialization initializes a temporary and copies it to initialize the object, when converting.
The specification of copy-list-initialization for return { list } statements merely specifies the exact equivalent syntax to be temp T = { list };, where = denotes copy-initialization. It does not immediately imply that a copy constructor is invoked.
-- End edit.
The function result can then be received into an rvalue reference to prevent copying the temporary to a local:
C && x = factory(); // applies to other initialization syntax
The question is, how to initialize a nonstatic member from a factory function returning non-copyable, non-moveable type? The reference trick doesn't work because a reference member doesn't extend the lifetime of a temporary.
Note, I'm not considering aggregate-initialization. This is about defining a constructor.
On your main question:
The question is, how to initialize a nonstatic member from a factory function returning non-copyable, non-moveable type?
You don't.
Your problem is that you are trying to conflate two things: how the return value is generated and how the return value is used at the call site. These two things don't connect to each other. Remember: the definition of a function cannot affect how it is used (in terms of language), since that definition is not necessarily available to the compiler. Therefore, C++ does not allow the way the return value was generated to affect anything (outside of elision, which is an optimization, not a language requirement).
To put it another way, this:
C c = {...};
Is different from this:
C c = [&]() -> C {return {...};}()
You have a function which returns a type by value. It is returning a prvalue expression of type C. If you want to store this value, thus giving it a name, you have exactly two options:
Store it as a const& or &&. This will extend the lifetime of the temporary to the lifetime of the control block. You can't do that with member variables; it can only be done with automatic variables in functions.
Copy/move it into a value. You can do this with a member variable, but it obviously requires the type to be copyable or moveable.
These are the only options C++ makes available to you if you want to store a prvalue expression. So you can either make the type moveable or return a freshly allocated pointer to memory and store that instead of a value.
This limitation is a big part of the reason why moving was created in the first place: to be able to pass things by value and avoid expensive copies. The language couldn't be changed to force elision of return values. So instead, they reduced the cost in many cases.
Issues like this were among the prime motivations for the change in C++17 to allow these initializations (and exclude the copies from the language, not merely as an optimization).

Do these two C++ initializer syntaxes ever differ in semantics?

Assume that the following code is legal code that compiles properly, that T is a type name, and that x is the name of a variable.
Syntax one:
T a(x);
Syntax two:
T a = x;
Do the exact semantics of these two expressions ever differ? If so, under what circumstances?
If these two expressions ever do have different semantics I'm also really curious about which part of the standard talks about this.
Also, if there is a special case when T is the name of a scalar type (aka, int, long, double, etc...), what are the differences when T is a scalar type vs. a non-scalar type?
Yes. If the type of x is not T, then the second example expands to T a = T(x). This requires that T(T const&) is public. The first example doesn't invoke the copy constructor.
After the accessibility has been checked, the copy can be eliminated (as Tony pointed out). However, it cannot be eliminated before checking accessibility.
The difference here is between implicit and explicit construction, and there can be difference.
Imagine having a type Array with the constructor Array(size_t length), and that somewhere else, you have a function count_elements(const Array& array). The purpose of these are easily understandable, and the code seems readable enough, until you realise it will allow you to call count_elements(2000). This is not only ugly code, but will also allocate an array 2000 elements long in memory for no reason.
In addition, you may have other types that are implicitly castable to an integer, allowing you to run count_elements() on those too, giving you completely useless results at a high cost to efficiency.
What you want to do here, is declare the Array(size_t length) an explicit constructor. This will disable the implicit conversions, and Array a = 2000 will no longer be legal syntax.
This was only one example. Once you realise what the explicit keyword does, it is easy to dream up others.
From 8.5.14 (emphasis mine):
The function selected is called with the initializer expression as its argument; if the function is a constructor, the call initializes a temporary of the destination type. The result of the call (which is the temporary for the constructor case) is then used to direct-initialize, according to the rules above, the object that is the destination of the copy-initialization. In certain cases, an implementation is permitted to eliminate the copying inherent in this direct-initialization by constructing the intermediate result directly into the object being initialized; see class.temporary, class.copy.
So, whether they're equivalent is left to the implementation.
8.5.11 is also relevant, but only in confirming that there can be a difference:
-11- The form of initialization (using parentheses or =) is generally insignificant, but does matter when the entity being initialized has a class type; see below. A parenthesized initializer can be a list of expressions only when the entity being initialized has a class type.
T a(x) is direct initialization and T a = x is copy initialization.
From the standard:
8.5.11 The form of initialization (using parentheses or =) is generally insignificant, but does matter when the entity being initialized has a class type; see below. A parenthesized initializer can be a list of expressions only when the entity being initialized has a class type.
8.5.12 The initialization that occurs in argument passing, function return, throwing an exception (15.1), handling an exception (15.3), and brace-enclosed initializer lists (8.5.1) is called copy-initialization and is equivalent to the form
T x = a;
The initialization that occurs in new expressions (5.3.4), static_cast expressions (5.2.9), functional notation type conversions (5.2.3), and base and member initializers (12.6.2) is called direct-initialization and is equivalent to the form
T x(a);
The difference is that copy initialization creates a temporary object which is then used to direct-initialize. The compiler is allowed to avoid creating the temporary object:
8.5.14 ... The result of the call (which is the temporary for the constructor case) is then used to direct-initialize, according to the rules above, the object that is the destination of the copy-initialization. In certain cases, an implementation is permitted to eliminate the copying inherent in this direct-initialization by constructing the intermediate result directly into the object being initialized; see 12.2, 12.8.
Copy initialization requires a non-explicit constructor and a copy constructor to be available.
In C++, when you write this:
class A {
public:
A() { ... }
};
The compiler actually generates this, depending on what your code uses:
class A {
public:
A() { ... }
~A() { ... }
A(const A& other) {...}
A& operator=(const A& other) { ... }
};
So now you can see the different semantics of the various constructors.
A a1; // default constructor
A a2(a1); // copy constructor
a2 = a1; // copy assignment operator
The copy constructors basically copy all the non-static data. They are only generated if the resulting code is legal and sane: if the compiler sees types inside the class that he doesn't know how to copy (per normal assignment rules), then the copy constructor won't get generated. This means that if the T type doesn't support constructors, or if one of the public fields of the class is const or a reference type, for instance, the generator won't create them - and the code won't build. Templates are expanded at build time, so if the resulting code isn't buildable, it'll fail. And sometimes it fails loudly and very cryptically.
If you define a constructor (or destructor) in a class, the generator won't generate a default one. This means you can override the default generated constructors. You can make them private (they're public by default), you can override them so they do nothing (useful for saving memory and avoiding side-effects), etc.