I'm currently trying to implement a really small nested exception mechanism in my code, as std::nested_exceptions aren't available for all the compilers I must compile my code with.
I came across the following useful nesting wrapper in the gcc implementation source code:
template<typename _Except>
struct _Nested_exception : public _Except, public nested_exception
{
explicit _Nested_exception(_Except&& __ex)
: _Except(static_cast<_Except&&>(__ex))
{ }
};
which allows one to combine an exception that has been thrown with the nested_exception class in throw_with_nested, so that one can later dynamic_cast to test whether the exception we've caught is actually a nested exception or not.
What I don't understand is the static_cast here. I think I'm missing something about move-semantics, is it really needed?
Although __ex is an _Except&&, once you enter this scope __ex is an lvalue - it's a local variable. To enforce the use of the move constructor inside of _Except they do _Except(static_cast<_Except&&>(__ex)) to force __ex to become an rvalue (it's probably actually an xvalue).
Move semantics means that we know that this constructor has been called with a parameter that is a temporary (or a to-be-moved-from object). We see this in the type of __ex, which is _Except&&. However, if we were to use __ex within this constructor, it would be treated as an lvalue: it clearly has a name and we can take its address.
If you want to pass __ex to any further functions or constructors, its rvalue-ness needs to be restored. The normal way to do so is to call std::move(__ex), which casts __ex back to _Except&&. This is really just the same static_cast that you see above.
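To make the difference visible, here is a minimal sketch. Tracker, WrapByCopy, WrapByCast and WrapByMove are hypothetical names invented for this illustration (they are not from libstdc++), and the nested_exception base is left out because it is assumed to be unavailable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical payload type that records how it was constructed.
struct Tracker {
    bool moved_in;
    Tracker() : moved_in(false) {}
    Tracker(const Tracker&) : moved_in(false) {}  // copy constructor
    Tracker(Tracker&&) : moved_in(true) {}        // move constructor
};

// No cast: ex is a named variable, hence an lvalue, so the base is copied.
template <typename E>
struct WrapByCopy : E {
    explicit WrapByCopy(E&& ex) : E(ex) {}
};

// The cast used in the quoted wrapper: the base is move-constructed.
template <typename E>
struct WrapByCast : E {
    explicit WrapByCast(E&& ex) : E(static_cast<E&&>(ex)) {}
};

// std::move performs the same cast.
template <typename E>
struct WrapByMove : E {
    explicit WrapByMove(E&& ex) : E(std::move(ex)) {}
};

inline void demo_wrappers() {
    assert(!WrapByCopy<Tracker>(Tracker()).moved_in);
    assert(WrapByCast<Tracker>(Tracker()).moved_in);
    assert(WrapByMove<Tracker>(Tracker()).moved_in);
}
```

So the cast is needed if you want the move constructor of _Except to be used; dropping it still compiles but silently copies.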
After learning some Rust and its lifetime specifiers, borrowing semantics, etc., I came across a Rust sample that rejects something like the following, which C++ allows. Why?
struct S {
std::string& str;
S(std::string&& value) : str(value) {}
};
It is not allowed and would cause an error if you actually did try that.
However, the name of a variable is always an lvalue, not an rvalue. What type the variable has doesn't matter at all. You need to call std::move on it to turn it into an rvalue. That's what std::move does.
When you use an rvalue reference variable, it behaves exactly like an lvalue reference variable. Both refer directly to the bound object as an lvalue. They only differ in how they can or cannot be initialized and in how they affect overload resolution and template argument deduction.
The point is to make it explicit when you potentially move from an object. Even if value is an rvalue reference, you may still use it multiple times in the function. But, usually, once you move from the referenced object, you can't use it afterwards anymore, or at least it will lose its state. Therefore it must be clear where a potential move can happen while still allowing non-move usage. So it makes sense for the rule to force you to write std::move explicitly everywhere a move might happen.
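A small sketch of that rule, assuming two hypothetical overloads named category that report what they were given (named_use and moved_use are also made up for this example):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads that report which value category they received.
inline const char* category(std::string&)  { return "lvalue"; }
inline const char* category(std::string&&) { return "rvalue"; }

// value is declared as std::string&&, but its name is an lvalue expression.
inline const char* named_use(std::string&& value) {
    return category(value);
}

// std::move is the explicit opt-in that turns it back into an rvalue.
inline const char* moved_use(std::string&& value) {
    return category(std::move(value));
}

inline void demo_categories() {
    assert(std::string(named_use(std::string("a"))) == "lvalue");
    assert(std::string(moved_use(std::string("a"))) == "rvalue");
}
```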
If the goal is to prevent a user from constructing S from a temporary, which is reasonable, then you should delete the constructor completely:
S(std::string&& value) = delete;
(There are technically some issues with this. A better, but complex approach is to follow what std::reference_wrapper does.)
However, the user already has to think about lifetime of the argument they pass to S. So you only really catch a small subset of potential mistakes with this. There is no way to protect the user from not keeping the passed object alive long enough. If you need to ensure this through S, then S must take (shared) ownership of the object, e.g. by using std::unique_ptr or std::shared_ptr instead of a reference. There is no borrow checker in C++ like there is in Rust to verify that the borrowed/referenced object is kept alive long enough.
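A rough sketch of the two options, under some assumptions: RefS and OwningS are hypothetical names, and RefS holds a const reference (unlike the S in the question) so that deleting the rvalue overload actually changes what is accepted.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Non-owning: refuses temporaries, but the caller must still keep the string alive.
struct RefS {
    const std::string& str;
    explicit RefS(const std::string& value) : str(value) {}
    RefS(std::string&&) = delete;  // temporaries would otherwise bind to const&
};

// Owning: shares ownership, so the string lives at least as long as the OwningS.
struct OwningS {
    std::shared_ptr<std::string> str;
    explicit OwningS(std::shared_ptr<std::string> value) : str(std::move(value)) {}
};

static_assert(std::is_constructible<RefS, std::string&>::value,
              "lvalues are accepted");
static_assert(!std::is_constructible<RefS, std::string>::value,
              "non-const temporaries are rejected");

inline OwningS make_owning() {
    std::shared_ptr<std::string> local = std::make_shared<std::string>("still alive");
    return OwningS(local);  // the local handle goes away, the string does not
}

inline void demo_ownership() {
    assert(*make_owning().str == "still alive");
}
```

RefS only catches the temporary case at compile time; nothing stops a caller from passing a named string that dies too early, which is the gap the owning variant closes.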
C++ lifetime and ownership management is mostly based on convention together with some utilities like the smart pointers. There is no intrinsic core language enforcement aside from automatic storage duration, although core language features like rvalue references are designed to support the conventions, e.g. as I described above.
A followup to this post. Consider the following:
class C;
C foo();
That is a pair of valid declarations. C doesn't need to be fully defined when merely declaring a function. But if we were to add the following function:
class C;
C foo();
inline C bar() { return foo(); }
Then suddenly C needs to be a fully defined type. But with guaranteed copy elision, none of its members are required. There's no copying or even a move, the value is initialized elsewhere, and destroyed only in the context of the caller (to bar).
So why? What in the standard prohibits it?
Guaranteed copy elision has exceptions, for compatibility reasons and/or efficiency. Trivially copyable types may be copied even where copy elision would otherwise be guaranteed. You're right that if this doesn't apply, then the compiler would be able to generate correct code without knowing any details of C, not even its size. But the compiler does need to know if this applies, and for that, it still needs the type to be complete.
According to https://timsong-cpp.github.io/cppwp/class.temporary :
15.2 Temporary objects [class.temporary]
1 Temporary objects are created
[...]
(1.2) -- when needed by the implementation to pass or return an object of trivially-copyable type (see below), and
[...]
3 When an object of class type X is passed to or returned from a function, if each copy constructor, move constructor, and destructor of X is either trivial or deleted, and X has at least one non-deleted copy or move constructor, implementations are permitted to create a temporary object to hold the function parameter or result object. The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the non-deleted trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object). [ Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. -- end note ]
This has nothing to do with copy elision. foo is supposed to return a C by value. As long as you just pass around a reference or pointer to foo, it's OK. Once you try to call foo, as bar does, the size of its arguments and return value must be at hand; the only valid way to know that is to provide a full definition of the required type.
Had the signature used a reference or a pointer, all the required information would have been present and you could do without the full type definition. This approach has a name: pimpl (Pointer to IMPLementation), and it is widely used as a means of hiding details in closed-source library distributions.
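For illustration, a sketch of what does and does not compile while C is still incomplete. The names make, get_ref, get_ptr, pass_ref and pass_ptr are hypothetical, and the member initializer on x is added only to have something to check.

```cpp
#include <cassert>

class C;        // C is incomplete from here on

C  make();      // fine: declaring a by-value return needs no definition
C& get_ref();   // fine
C* get_ptr();   // fine

// Forwarding a reference or a pointer compiles while C is still incomplete.
inline C& pass_ref() { return get_ref(); }
inline C* pass_ptr() { return get_ptr(); }

// inline C bar() { return make(); }  // error: return type 'class C' is incomplete

// Once C is complete, the by-value version can be defined as well.
class C {
public:
    int x = 4;
};

C  make()    { return C(); }
C& get_ref() { static C instance; return instance; }
C* get_ptr() { return &get_ref(); }
inline C bar() { return make(); }

inline void demo_incomplete() {
    assert(&pass_ref() == pass_ptr());
    assert(bar().x == 4);
}
```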
The rule lies in [basic.lval]/9:
Unless otherwise indicated ([dcl.type.simple]), a prvalue shall always have complete type or the void type; ...
Despite the number of answers and amount of commentary posted to this thread (which has answered all of my personal questions), I have decided to post an answer 'for the rest of us'. I didn't initially understand what the OP was getting at but now I do, so I thought I'd share. If you know all this and it bores you, dear reader, then please just move on.
@xskxzr and @hvd effectively answered the question, but @hvd's post especially is in standardese and assumes that readers know how return-by-value (and by extension RVO) works, which I imagine not everybody does. I thought I did, but I was missing an important detail (which, when you think it through, is actually pretty obvious, but still, I missed it).
So this post mainly focuses on that, so that we can all see why (a) the OP wondered why there was an issue compiling bar() in the first place, and then (b) subsequently realised the reason.
So, let's look at that code again. Given this (which is legal, even with an incompletely defined type):
class C;
C foo();
Why can't the compiler compile this (I have removed the inline because it is irrelevant):
C bar() { return foo(); }
The error message from gcc being:
error: return type 'class C' is incomplete
Well, first up, the accepted answer quotes the relevant paragraph from the standard that explicitly forbids it, so no mystery there. But the OP (and indeed commenter Walter, who picked up on this straightaway) wanted to know why.
Well, at first that seemed obvious to me: space needs to be allocated for the function result by the caller, and it doesn't know how big the object is, so the compiler is in a quandary. But I was missing a trick, and that lies in the way return-by-value works.
Now for those that don't know, returning class objects by value works by the caller allocating space for the returned object on the stack and passing a pointer to it as a hidden parameter to the function being called, which then constructs the object, manipulates it, whatever.
However, this daisy-chains, so if we have the following code (where we fully define C before calling bar()):
class C
{
public:
int x;
};
C c = bar ();
c.x = 4;
then space for c is allocated before bar() is called, the address of c is passed as a hidden parameter to bar(), and then passed directly on to foo(), which finally constructs the object in the desired location. So, because bar() doesn't actually do anything with this pointer (apart from passing it along), all it cares about is the pointer itself, and not what it points to.
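The daisy-chaining can be observed with a small sketch. Chained is a hypothetical class; it has a user-provided copy constructor so that it is handed back through caller-provided memory, and it remembers the address its constructor saw. The in-place result is guaranteed from C++17 on; I'm assuming that on older standards the usual compilers elide the copies here as well.

```cpp
#include <cassert>

// Hypothetical class that records where it was constructed.
struct Chained {
    const Chained* built_at;  // address seen by the constructor
    Chained() : built_at(this) {}
    // A real copy would keep the old address, so it would show up below.
    Chained(const Chained& other) : built_at(other.built_at) {}
};

inline Chained foo() { return Chained(); }
inline Chained bar() { return foo(); }  // only passes the hidden pointer along

inline bool constructed_in_place() {
    Chained c = bar();
    return c.built_at == &c;  // true when foo() built the object directly in c
}

inline void demo_chain() {
    assert(constructed_in_place());
}
```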
Or does it? Well, actually, yes, it does.
When returning a class object by value, small objects are usually returned in a register (or a pair of registers) as an optimisation. The compiler can get away with doing this in the majority of cases where the object is small enough (more on that in a moment).
But now, bar() needs to know whether this is what foo() is going to do, and to do that it needs, for various reasons, to see the full definition of the class.
So, in summary, that's why the compiler needs a fully-defined type in order to call foo(): otherwise it won't know what foo() will be expecting, and so it doesn't know what code to generate. Not on most platforms anyway. End of story.
Notes:
1. I looked at gcc and there seem to be two (entirely logical) rules for determining whether a class object is returned in a register or pair of registers:
a) The object is 16 bytes or smaller (in a 64 bit build).
b) std::is_trivially_copyable<T>::value evaluates to true (maybe someone can find something in the standard about that).
2. In case any readers don't know, RVO relies on constructing the object in its final resting place (i.e. in the location allocated by the caller). This is because there are objects (such as some implementations of std::basic_string, I believe) that are sensitive to being moved around in memory, so you can't just construct them somewhere convenient to you and then memcpy them somewhere else.
3. If constructing the returned object in that final location is not possible (because of the way you coded the function returning the object), then RVO doesn't happen (how can it?), see live demo below (make_no_RVO()).
4. As a specific example of point 1b, if a small object contains data members that (might) point either to itself or to any of its data members, then returning it by value will get you into trouble if you don't declare it properly. Just adding an empty copy constructor will do, since then it is no longer trivially copyable. But then I guess that's true in general: don't hide important information from the compiler.
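A sketch of the trivially-copyable point, assuming two hypothetical types, Small and SelfRef. Unlike the empty copy constructor mentioned in the notes, the one here also repairs the internal pointer, which is my own addition.

```cpp
#include <cassert>
#include <type_traits>

// Plain aggregate: trivially copyable, so it may travel in registers.
struct Small {
    int x;
    int y;
};

// Holds a pointer into itself; a bitwise copy would leave ptr dangling.
struct SelfRef {
    int value;
    int* ptr;
    SelfRef() : value(0), ptr(&value) {}
    // User-provided copy constructor: the type is no longer trivially copyable.
    SelfRef(const SelfRef& other) : value(other.value), ptr(&value) {}
};

static_assert(std::is_trivially_copyable<Small>::value,
              "Small can be copied bitwise");
static_assert(!std::is_trivially_copyable<SelfRef>::value,
              "SelfRef must go through its copy constructor");

inline SelfRef make_self_ref() { return SelfRef(); }

inline bool pointer_still_valid() {
    SelfRef s = make_self_ref();
    return s.ptr == &s.value;
}

inline void demo_self_ref() {
    assert(pointer_still_valid());
}
```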
Live demo here. All comments on this post welcome, I will answer them to the best of my ability.
I understand that C++ only allows rvalues or temp objects to bind to const-references. (Or something close to that...)
For example, assuming I have the functions doStuff(SomeValue & input)
and SomeValue getNiceValue() defined:
/* These do not work */
app->doStuff(SomeValue("value1"));
app->doStuff(getNiceValue());
/* These all work, but seem awkward enough that they must be wrong. :) */
app->doStuff(*(new SomeValue("value2")));
SomeValue tmp = SomeValue("value3");
app->doStuff(tmp);
SomeValue tmp2 = getNiceValue();
app->doStuff(tmp2);
So, three questions:
Since I am not free to change the signatures of doStuff() or getNiceValue(), does this mean I must always use some sort of "name" (even if superfluous) for anything I want to pass to doStuff?
Hypothetically, if I could change the function signatures, is there a common pattern for this sort of thing?
Does the new C++11 standard change the things at all? Is there a better way with C++11?
Thank you
An obvious question in this case is why your doStuff declares its parameter as a non-const reference. If it really attempts to modify the referred object, then changing function signature to a const reference is not an option (at least not by itself).
Anyway, "rvalue-ness" is a property of an expression that generated the temporary, not a property of temporary object itself. The temporary object itself can easily be an lvalue, yet you see it as an rvalue, since the expression that produced it was an rvalue expression.
You can work around it by introducing a "rvalue-to-lvalue converter" method into your class. Like, for example
class SomeValue {
public:
SomeValue &get_lvalue() { return *this; }
...
};
and now you can bind non-const references to temporaries as
app->doStuff(SomeValue("value1").get_lvalue());
app->doStuff(getNiceValue().get_lvalue());
Admittedly, it doesn't look very elegant, but it might be seen as a good thing, since it prevents you from doing something like that inadvertently. Of course, it is your responsibility to remember that the lifetime of the temporary extends to the end of the full expression and no further.
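Put together as something compilable, it might look like this. The bodies of SomeValue, doStuff and getNiceValue are stand-ins I made up, since the question doesn't show them, and doStuff is a free function here rather than a member of app.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for the class from the question.
class SomeValue {
public:
    explicit SomeValue(const std::string& t) : text(t) {}
    SomeValue& get_lvalue() { return *this; }  // rvalue-to-lvalue converter
    std::string text;
};

// Stand-in for doStuff: modifies its argument and reports the new length.
inline std::size_t doStuff(SomeValue& input) {
    input.text += "!";
    return input.text.size();
}

inline SomeValue getNiceValue() { return SomeValue("nice"); }

inline void demo_get_lvalue() {
    // Each temporary lives until the end of its full expression.
    assert(doStuff(SomeValue("value1").get_lvalue()) == 7);
    assert(doStuff(getNiceValue().get_lvalue()) == 5);
}
```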
Alternatively, a class can overload the unary & operator (with natural semantics)
class SomeValue {
public:
SomeValue *operator &() { return this; }
...
};
which then can be used for the same purpose
app->doStuff(*&SomeValue("value1"));
app->doStuff(*&getNiceValue());
although overloading the & operator for the sole purpose of this workaround is not a good idea. It will also allow one to create pointers to temporaries.
Since I am not free to change the signatures of doStuff() or getNiceValue(), does this mean I must always use some sort of "name" (even if superfluous) for anything I want to pass to doStuff?
Pretty much yes. This signature assumes that you want to use input as an "out" parameter. So the author of doStuff believes that if the client passes an anonymous object in, that is a logical error best caught at compile time.
Hypothetically, if I could change the function signatures, is there a common pattern for this sort of thing?
In C++11 only, you could change or overload like so:
doStuff(SomeValue&& input);
Now input will only bind to an rvalue. If you've overloaded, then the original will get the lvalues and your new overload will get the rvalues.
Does the new C++11 standard change the things at all? Is there a better way with C++11?
Yes, see the rvalue reference overload above.
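As a sketch of the overload pair, with a made-up SomeValue and free functions standing in for the member function from the question:

```cpp
#include <cassert>
#include <string>

// Stand-in type; the real SomeValue is not shown in the question.
struct SomeValue {
    std::string text;
};

// The original overload keeps receiving lvalues.
inline const char* doStuff(SomeValue&)  { return "lvalue overload"; }
// The new overload receives temporaries and std::move'd objects.
inline const char* doStuff(SomeValue&&) { return "rvalue overload"; }

inline SomeValue getNiceValue() { return SomeValue{"nice"}; }

inline void demo_overloads() {
    SomeValue named{"named"};
    assert(std::string(doStuff(named)) == "lvalue overload");
    assert(std::string(doStuff(getNiceValue())) == "rvalue overload");
    assert(std::string(doStuff(SomeValue{"tmp"})) == "rvalue overload");
}
```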
std::forward is usually the way to 'convert' value category. However it is prohibited to accept rvalues when forwarding as an lvalue, for the same reasons that a reference to non-const won't bind to an rvalue. That being said, and assuming you don't want to overload doStuff (otherwise see Hinnant's answer), you can write a utility yourself:
template<typename T>
T& unsafe_lvalue(T&& ref)
{ return ref; }
And use it like so: app->doStuff(unsafe_lvalue(getNiceValue())). No intrusive modification needed.
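A self-contained sketch of that usage. Counter, bump and make_counter are hypothetical stand-ins for SomeValue, doStuff and getNiceValue.

```cpp
#include <cassert>

// The utility from above: hands back an lvalue reference to whatever it is given.
template <typename T>
T& unsafe_lvalue(T&& ref) { return ref; }

// Hypothetical stand-ins for the types and functions in the question.
struct Counter {
    int hits;
};

inline int bump(Counter& c) { return ++c.hits; }       // takes a non-const reference
inline Counter make_counter() { return Counter{41}; }  // returns a temporary

inline void demo_unsafe_lvalue() {
    // The temporary lives until the end of the full expression, so this is fine.
    assert(bump(unsafe_lvalue(make_counter())) == 42);

    // Lvalues pass through unchanged.
    Counter named = {0};
    assert(&unsafe_lvalue(named) == &named);
}
```

The "unsafe" in the name is earned: storing the returned reference past the end of the full expression leaves it dangling.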
You must always use a name for values you pass to doStuff. The reasons for this are covered in detail at "How come a non-const reference cannot bind to a temporary object?". The short summary is that passing a reference implies that doStuff can change the value that it references, and that changing the value of a reference to a temporary is something that the compiler should not let you do.
I'd avoid the first solution, because it allocates memory on the heap that is never freed.
The common pattern for solving this is to change doStuff's signature to take a const reference.
Unfortunately, I think the answer is that you have to have a named object to pass into doStuff. I don't think there is a C++11 feature that gives you flexibility in this area, nor have I heard of any design pattern for this type of situation.
If this is something you expect to encounter often in your program, I would write an interface more suited to the needs of the current app. If it's just a one off, I would tend to just create a temp object to store the result (as you have done).
In C++11, value parameters (and other values) enjoy implicit move when returned:
A func(A a) {
return a; // uses A::A(A&&) if it exists
}
At least in MSVC 2010, rvalue reference parameters need std::move:
A func(A && a) {
return a; // uses A::A(A const&) even if A::A(A&&) exists
}
I would imagine that inside functions, an rvalue reference and a value behave similarly, with the only difference being that in the case of values, the function itself is responsible for destruction, while for rvalue references, the responsibility is outside.
What is the motivation for treating them differently in the standard?
The standardization committee expended great effort in creating wording so that moves would only ever happen in exactly two circumstances:
When it is clearly safe to do so.
When the user explicitly asks (via std::move or a similar cast).
A value parameter will unquestionably be destroyed at the end of the function. Therefore, returning it by move is clearly safe; it can't be touched by other code after the return (not unless you're deliberately trying to break things, in which case you probably triggered undefined behavior). Therefore, it can be moved from in the return.
A && variable could be referring to a temporary. But it could be referring to an lvalue (a named variable). It is therefore not clearly safe to move from it; the original variable could be lurking around. And since you didn't explicitly ask to move from it (ie: you didn't call std::move in this function), no movement can take place.
The only time a && is implicitly moved from (i.e., without std::move) is when it is the result of a function call. std::move<T> returns a T&&, and it is legal for that return value to invoke the move constructor, because it is an unnamed return value rather than a named variable.
Now it is very difficult to call A func(A &&a) with an lvalue without calling std::move (or an equivalent cast). So technically, it should be fine for parameters of && type to be implicitly moved from. But the standards committee wanted moves to be explicit for && types, just to make sure that movement didn't implicitly happen within the scope of this function. That is, it can't use outside-of-function knowledge about where the && comes from.
In general, you should only take parameters by && in two cases: either you're writing a move constructor (or move assignment operator, but even that can be done by value), or you're writing a forwarding function. There may be a few other cases, but you shouldn't take && to a type unless you have something special in mind. If A is a moveable type, then just take it by value.
This was fixed for C++20 by P0527 and P1825. The only way to have a function parameter bind to an rvalue reference is for the source to either be a temporary or for the caller to explicitly cast a non-temporary to an rvalue (for instance, with std::move). Therefore, this "mandatory optimization" was deemed safe.
In your first case, the compiler knows that a is going away and nothing will be able to cling on to it: clearly, this object can be moved from and if it is not it will be destroyed. In the second case, the rvalue reference indicates that it is permissible to move from the object and the caller doesn't expect the object to stay around. However, it is the function's choice whether it takes advantage of this permission or not and there may be reasons why the function sometimes wants to move from the argument and sometimes it doesn't want to. If the compiler were given the liberty to move off this object, there would be no way to prevent the compiler from doing so. However, using std::move(a) there is already a way to indicate that it is desired to move from the object.
The general rule in the standard is that the compiler only ever implicitly moves objects which are known to go away. When an rvalue reference comes in, the compiler doesn't really know that the object is about to go away: if it was explicitly std::move()ed, it actually stays around.
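A sketch that counts copies and moves. The counters in A, and the names by_value and by_rref, are made up for this example. The && version spells out std::move so that it behaves the same before and after the C++20 change.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how it has been copied and moved.
struct A {
    int copies;
    int moves;
    A() : copies(0), moves(0) {}
    A(const A& o) : copies(o.copies + 1), moves(o.moves) {}
    A(A&& o) : copies(o.copies), moves(o.moves + 1) {}
};

// Value parameter: a is known to go away, so the return moves implicitly.
inline A by_value(A a) { return a; }

// Rvalue reference parameter: ask for the move explicitly.
inline A by_rref(A&& a) { return std::move(a); }

inline void demo_returns() {
    A from_value = by_value(A());
    assert(from_value.copies == 0);
    assert(from_value.moves >= 1);

    A from_rref = by_rref(A());
    assert(from_rref.copies == 0);
    assert(from_rref.moves >= 1);

    A named;
    A from_named = by_value(named);  // one copy into the parameter, then a move out
    assert(from_named.copies == 1);
    assert(from_named.moves >= 1);
}
```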
First: where are std::move and std::forward defined? I know what they do, but I can't find proof that any standard header is required to include them. In gcc44 sometimes std::move is available, and sometimes it's not, so a definitive include directive would be useful.
When implementing move semantics, the source is presumably left in an undefined state. Should this state necessarily be a valid state for the object? Obviously, you need to be able to call the object's destructor, and be able to assign to it by whatever means the class exposes. But should other operations be valid? I suppose what I'm asking is, if your class guarantees certain invariants, should you strive to enforce those invariants when the user has said they don't care about them anymore?
Next: when you don't care about move semantics, are there any limitations that would cause a non-const reference to be preferred over an rvalue reference when dealing with function parameters? void function(T&); over void function(T&&); From a caller's perspective, being able to pass functions temporary values is occasionally useful, so it seems as though one should grant that option whenever it is feasible to do so. And rvalue references are themselves lvalues, so you can't inadvertently call a move-constructor instead of a copy-constructor, or something like that. I don't see a downside, but I'm sure there is one.
Which brings me to my final question. You still can not bind temporaries to non-const references. But you can bind them to non-const rvalue references. And you can then pass along that reference as a non-const reference in another function.
void function1(int& r) { r++; }
void function2(int&& r) { function1(r); }
int main() {
function1(5); //bad
function2(5); //good
}
Besides the fact that it doesn't do anything, is there anything wrong with that code? My gut says of course not, since changing rvalue references is kind of the whole point to their existence. And if the passed value is legitimately const, the compiler will catch it and yell at you. But by all appearances, this is a runaround of a mechanism that was presumably put in place for a reason, so I'd just like confirmation that I'm not doing anything foolish.
First: where are std::move and std::forward defined?
See 20.3 Utility components, <utility>.
When implementing move semantics, the source is presumably left in an undefined state. Should this state necessarily be a valid state for the object?
Obviously, the object should still be destructible. But further than that, I think it's a good idea for it to still be assignable. The Standard says for objects that satisfy "MoveConstructible" and "MoveAssignable":
[ Note: rv remains a valid object. Its state is unspecified. — end note ]
This would mean, I think, that the object can still participate in any operation that doesn't state any precondition. This includes CopyConstructible, CopyAssignable, Destructible and other things. Notice that this won't require anything for your own objects from a core language perspective. The requirements only take place once you touch Standard library components that state these requirements.
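With a standard container, that reading could be sketched as follows. steal is a hypothetical helper, and the example only touches operations on the moved-from vector that have no preconditions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that takes over the contents of its argument.
inline std::vector<std::string> steal(std::vector<std::string>&& source) {
    return std::move(source);
}

inline void demo_moved_from() {
    std::vector<std::string> names;
    names.push_back("a");
    names.push_back("b");

    std::vector<std::string> taken = steal(std::move(names));
    assert(taken.size() == 2);

    // names is valid but unspecified: do not rely on its contents,
    // but operations without preconditions are fine.
    names.clear();
    names.push_back("c");
    assert(names.size() == 1);
    assert(names[0] == "c");

    names = taken;  // assigning to it is fine as well
    assert(names.size() == 2);
}
```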
Next: when you don't care about move semantics, are there any limitations that would cause a non-const reference to be preferred over an rvalue reference when dealing with function parameters?
This, unfortunately, crucially depends on whether the parameter is in a function template and uses a template parameter:
void f(int const&); // takes all lvalues and const rvalues
void f(int&&); // can only accept nonconst rvalues
However for a function template
template<typename T> void f(T const&);
template<typename T> void f(T&&);
You can't say that, because the second template will, after being called with an lvalue, have as parameter of the synthesized declaration the type U& for nonconst lvalues (and be a better match), and U const& for const lvalues (and be ambiguous). To my knowledge, there is no partial ordering rule to disambiguate that second ambiguity. However, this is already known.
-- Edit --
Despite that issue report, I don't think that the two templates are ambiguous. Partial ordering will make the first template more specialized, because after taking away the reference modifiers and the const, we will find that both types are the same, and then notice that the first template had a reference to const. The Standard says (14.9.2.4)
If, for a given type, deduction succeeds in both directions (i.e., the types are identical after the transformations above) and if the type from the argument template is more cv-qualified than the type from the parameter template (as described above) that type is considered to be more specialized than the other.
If for each type being considered a given template is at least as specialized for all types and more specialized for some set of types and the other template is not more specialized for any types or is not at least as specialized for any types, then the given template is more specialized than the other template.
This makes the T const& template the winner of partial ordering (and GCC is indeed correct to choose it).
-- Edit End --
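The three cases can be checked with a small sketch. Here the two templates return a string naming themselves, which is my addition; the originals return void.

```cpp
#include <cassert>
#include <string>

// The two overloads from above, reporting which one was selected.
template <typename T> const char* f(T const&) { return "T const&"; }
template <typename T> const char* f(T&&)      { return "T&&"; }

inline void demo_partial_ordering() {
    int i = 0;
    const int ci = 0;

    // Non-const lvalue: T&& deduces T = int&, giving int&, the better match.
    assert(std::string(f(i)) == "T&&");

    // Const lvalue: both yield int const&; partial ordering picks T const&.
    assert(std::string(f(ci)) == "T const&");

    // Rvalue: binding int&& to an rvalue beats binding int const&.
    assert(std::string(f(5)) == "T&&");
}
```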
Which brings me to my final question. You still can not bind temporaries to non-const references. But you can bind them to non-const rvalue references.
This is nicely explained in this article. The second call using function2 only takes nonconst rvalues. The rest of the program won't notice if they are modified, because they won't be able to access those rvalues afterwards anymore! And the 5 you pass is not a class type, so a hidden temporary is created and then passed to the int&& rvalue reference. The code calling function2 won't be able to access that hidden object here, so it won't notice any change.
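A sketch of both situations, reusing function1 and function2 from the question; demo_hidden_temporary is a name I made up.

```cpp
#include <cassert>
#include <utility>

inline void function1(int& r) { r++; }
inline void function2(int&& r) { function1(r); }  // r is a named lvalue in here

inline int demo_hidden_temporary() {
    // function1(5);  // error: a non-const lvalue reference cannot bind to an rvalue
    function2(5);     // fine: a hidden temporary int is incremented and discarded

    int x = 5;
    function2(std::move(x));  // the reference binds to x itself, no temporary
    return x;                 // x has been incremented to 6
}

inline void check_hidden_temporary() {
    assert(demo_hidden_temporary() == 6);
}
```

In the first call nobody can observe the increment; in the second the caller opted in with std::move and can see it.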
A different situation is if you do this one:
SomeComplexObject o;
function2(move(o));
You have explicitly requested that o is moved, so it will be modified according to its move specification. However moving is a logically non-modifying operation (see the article). This means whether you move or not shouldn't be observable from the calling code:
SomeComplexObject o;
moveit(o); // #1
o = foo;
If you erase the line that moves, behavior will still be the same, because it's overwritten anyway. This however means that code that uses the value of o after it has been moved from is bad, because it breaks this implicit contract between moveit and the calling code. Thus, the Standard makes no specification about the concrete value of a moved-from container.
where are std::move and std::forward defined?
std::move and std::forward are declared in <utility>. See the synopsis at the beginning of section 20.3[utility].
When implementing move semantics, the source is presumably left in an undefined state.
It of course depends on how you implement the move-constructor and move-assignment operator. If you want to use your objects in standard containers, however, you have to follow the MoveConstructible and MoveAssignable concepts, which say that the object remains valid but is left in an unspecified state, i.e. you can definitely destroy it.
Both are provided by <utility>.
Here is the article I read about rvalues.
I can't help you with rest, sorry.