In C++03, an expression is either an rvalue or an lvalue.
In C++11, an expression can be an:
rvalue
lvalue
xvalue
glvalue
prvalue
Two categories have become five categories.
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I guess this document might serve as a not so short introduction: n3055
The whole massacre began with move semantics. Once we have expressions that can be moved and not copied, the previously easy-to-grasp rules suddenly had to distinguish which expressions can be moved, and in which direction.
From what I guess based on the draft, the r/l value distinction stays the same, only in the context of moving things get messy.
Are they needed? Probably not if we wish to forfeit the new features. But to allow better optimization we should probably embrace them.
Quoting n3055:
An lvalue (so-called, historically,
because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [Example: If E is an
expression of pointer type, then *E
is an lvalue expression referring to
the object or function to which E
points. As another example, the
result of calling a function whose
return type is an lvalue reference is
an lvalue.]
An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may
be moved, for example). An xvalue is
the result of certain kinds of
expressions involving rvalue
references. [Example: The
result of calling a function whose
return type is an rvalue reference is
an xvalue.]
A glvalue (“generalized” lvalue) is an lvalue
or an xvalue.
An rvalue (so-called,
historically, because rvalues could
appear on the right-hand side of an
assignment expression) is an xvalue,
a temporary object or
subobject thereof, or a value that is
not associated with an object.
A
prvalue (“pure” rvalue) is an rvalue
that is not an xvalue. [Example: The
result of calling a function whose
return type is not a reference is a
prvalue]
The document in question is a great reference for this question, because it shows the exact changes in the standard that have happened as a result of the introduction of the new nomenclature.
What are these new categories of expressions?
The FCD (n3092) has an excellent description:
— An lvalue (so called, historically, because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [ Example: If E is an
expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E
points. As another example, the result
of calling a function whose return
type is an lvalue reference is an
lvalue. —end example ]
— An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may be
moved, for example). An xvalue is the
result of certain kinds of expressions
involving rvalue references (8.3.2). [
Example: The result of calling a
function whose return type is an
rvalue reference is an xvalue. —end
example ]
— A glvalue (“generalized”
lvalue) is an lvalue or an xvalue.
—
An rvalue (so called, historically,
because rvalues could appear on the
right-hand side of an assignment
expressions) is an xvalue, a temporary
object (12.2) or subobject thereof, or
a value that is not associated with an
object.
— A prvalue (“pure” rvalue) is
an rvalue that is not an xvalue. [
Example: The result of calling a
function whose return type is not a
reference is a prvalue. The value of a
literal such as 12, 7.3e5, or true is
also a prvalue. —end example ]
Every
expression belongs to exactly one of
the fundamental classifications in
this taxonomy: lvalue, xvalue, or
prvalue. This property of an
expression is called its value
category. [ Note: The discussion of
each built-in operator in Clause 5
indicates the category of the value it
yields and the value categories of the
operands it expects. For example, the
built-in assignment operators expect
that the left operand is an lvalue and
that the right operand is a prvalue
and yield an lvalue as the result.
User-defined operators are functions,
and the categories of values they
expect and yield are determined by
their parameter and return types. —end
note ]
I suggest you read the entire section 3.10 Lvalues and rvalues though.
How do these new categories relate to the existing rvalue and lvalue categories?
Again:
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
The semantics of rvalues has evolved particularly with the introduction of move semantics.
Why are these new categories needed?
So that move construction/assignment could be defined and supported.
I'll start with your last question:
Why are these new categories needed?
The C++ standard contains many rules that deal with the value category of an expression. Some rules make a distinction between lvalue and rvalue. For example, when it comes to overload resolution. Other rules make a distinction between glvalue and prvalue. For example, you can have a glvalue with an incomplete or abstract type but there is no prvalue with an incomplete or abstract type. Before we had this terminology the rules that actually need to distinguish between glvalue/prvalue referred to lvalue/rvalue and they were either unintentionally wrong or contained lots of explaining and exceptions to the rule a la "...unless the rvalue is due to unnamed rvalue reference...". So, it seems like a good idea to just give the concepts of glvalues and prvalues their own name.
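The glvalue/prvalue point about abstract types can be checked with a few type traits. This is only a minimal sketch: Shape, Triangle and sides_through_glvalues are hypothetical names invented for the illustration, not anything from the standard.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical abstract class, used only for illustration.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

// glvalues of abstract type exist: an lvalue and an xvalue can both
// refer to the Shape subobject of some complete object.
static_assert(std::is_same<decltype((std::declval<Shape&>())), Shape&>::value,
              "lvalue of type Shape");
static_assert(std::is_same<decltype((std::declval<Shape&&>())), Shape&&>::value,
              "xvalue of type Shape");

// A prvalue of type Shape would need a Shape object of its own,
// which an abstract class cannot provide.
static_assert(!std::is_constructible<Shape>::value, "no prvalue of type Shape");

int sides_through_glvalues() {
    Triangle t;
    Shape& as_lvalue = t;                             // lvalue of abstract type
    int via_lvalue = as_lvalue.sides();
    int via_xvalue = static_cast<Shape&&>(t).sides(); // xvalue of abstract type
    assert(via_lvalue == via_xvalue);
    return via_lvalue + via_xvalue;
}
```

Both glvalue forms dispatch to Triangle::sides, so the function returns 3 + 3.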
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
We still have the terms lvalue and rvalue that are compatible with C++98. We just divided the rvalues into two subgroups, xvalues and prvalues, and we refer to lvalues and xvalues as glvalues. Xvalues are a new kind of value category for unnamed rvalue references. Every expression is one of these three: lvalue, xvalue, prvalue. A Venn diagram would look like this:
    ______ ______
   /      X      \
  /      / \      \
 |   l  | x |  pr  |
  \      \ /      /
   \______X______/
       gl    r
Examples with functions:
int prvalue();
int& lvalue();
int&& xvalue();
But also don't forget that named rvalue references are lvalues:
void foo(int&& t) {
// t is initialized with an rvalue expression
// but is actually an lvalue expression itself
}
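To see what the three function signatures and the named-rvalue-reference rule mean in practice, one can let overload resolution report which reference kind it picked. A small sketch, assuming a hypothetical overload pair called which and helper names chosen only for this example:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: reports which reference kind was selected.
std::string which(int&)  { return "lvalue"; }
std::string which(int&&) { return "rvalue"; }

int   prvalue_fn() { return 1; }
int&  lvalue_fn()  { static int n = 2; return n; }
int&& xvalue_fn()  { static int n = 3; return std::move(n); }

// t is declared as int&&, but the expression `t` names a variable,
// so it is an lvalue and selects the int& overload.
std::string through_named(int&& t) { return which(t); }

// std::move(t) is an unnamed rvalue reference (an xvalue),
// so the int&& overload is selected.
std::string through_moved(int&& t) { return which(std::move(t)); }

void check_overloads() {
    assert(which(prvalue_fn()) == "rvalue"); // prvalue binds to int&&
    assert(which(lvalue_fn())  == "lvalue"); // lvalue binds to int&
    assert(which(xvalue_fn())  == "rvalue"); // xvalue binds to int&&
    assert(through_named(5) == "lvalue");
    assert(through_moved(5) == "rvalue");
}
```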
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I don't feel that the other answers (good though many of them are) really capture the answer to this particular question. Yes, these categories and such exist to allow move semantics, but the complexity exists for one reason. This is the one inviolate rule of moving stuff in C++11:
Thou shalt move only when it is unquestionably safe to do so.
That is why these categories exist: to be able to talk about values where it is safe to move from them, and to talk about values where it is not.
In the earliest version of r-value references, movement happened easily. Too easily. Easily enough that there was a lot of potential for implicitly moving things when the user didn't really mean to.
Here are the circumstances under which it is safe to move something:
When it's a temporary or subobject thereof. (prvalue)
When the user has explicitly said to move it.
If you do this:
SomeType &&Func() { ... }
SomeType &&val = Func();
SomeType otherVal{val};
What does this do? In older versions of the spec, before the 5 values came in, this would provoke a move. Of course it does. You passed an rvalue reference to the constructor, and thus it binds to the constructor that takes an rvalue reference. That's obvious.
There's just one problem with this; you didn't ask to move it. Oh, you might say that the && should have been a clue, but that doesn't change the fact that it broke the rule. val isn't a temporary because temporaries don't have names. You may have extended the lifetime of the temporary, but that means it isn't temporary; it's just like any other stack variable.
If it's not a temporary, and you didn't ask to move it, then moving is wrong.
The obvious solution is to make val an lvalue. This means that you can't move from it. OK, fine; it's named, so it's an lvalue.
Once you do that, you can no longer say that SomeType&& means the same thing everywhere. You've now made a distinction between named rvalue references and unnamed rvalue references. Well, named rvalue references are lvalues; that was our solution above. So what do we call unnamed rvalue references (the return value from Func above)?
It's not an lvalue, because you can't move from an lvalue. And we need to be able to move by returning a &&; how else could you explicitly say to move something? That is what std::move returns, after all. It's not an rvalue (old-style), because it can be on the left side of an equation (things are actually a bit more complicated, see this question and the comments below). It is neither an lvalue nor an rvalue; it's a new kind of thing.
What we have is a value that you can treat as an lvalue, except that it is implicitly moveable from. We call it an xvalue.
Note that xvalues are what makes us gain the other two categories of values:
A prvalue is really just the new name for the previous type of rvalue, i.e. they're the rvalues that aren't xvalues.
Glvalues are the union of xvalues and lvalues in one group, because they do share a lot of properties in common.
So really, it all comes down to xvalues and the need to restrict movement to exactly and only certain places. Those places are defined by the rvalue category; prvalues are the implicit moves, and xvalues are the explicit moves (std::move returns an xvalue).
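The rule can be observed directly with a type that counts how it was constructed. The following is a sketch under assumptions: Tracked and pass_through are hypothetical stand-ins for SomeType and Func from the example above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records how each object was constructed.
struct Tracked {
    int copies;
    int moves;
    Tracked() : copies(0), moves(0) {}
    Tracked(const Tracked& o) : copies(o.copies + 1), moves(o.moves) {}
    Tracked(Tracked&& o) : copies(o.copies), moves(o.moves + 1) {}
};

// Stand-in for Func: returns an unnamed rvalue reference.
Tracked&& pass_through(Tracked& t) { return std::move(t); }

int moves_seen() {
    Tracked source;
    Tracked&& val = pass_through(source);    // val is a named rvalue reference

    Tracked copied{val};                     // `val` is an lvalue: copy constructor
    assert(copied.copies == 1 && copied.moves == 0);

    Tracked moved{std::move(val)};           // explicit request: move constructor
    assert(moved.copies == 0 && moved.moves == 1);

    Tracked from_call{pass_through(source)}; // xvalue straight from the call: move
    assert(from_call.copies == 0 && from_call.moves == 1);

    return copied.moves + moved.moves + from_call.moves;
}
```

Only the two expressions that are xvalues trigger the move constructor; the named reference copies.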
IMHO, the best explanation of the meaning comes from Stroustrup; also take into account the examples by Dániel Sándor and Mohan:
Stroustrup:
Now I was seriously worried. Clearly we were headed for an impasse or
a mess or both. I spent the lunchtime doing an analysis to see which
of the properties (of values) were independent. There were only two
independent properties:
has identity – i.e. an address, a pointer, the user can determine whether two copies are identical, etc.
can be moved from – i.e. we are allowed to leave the source of a "copy" in some indeterminate, but valid state
This led me to the conclusion that there are exactly three kinds of
values (using the regex notational trick of using a capital letter to
indicate a negative – I was in a hurry):
iM: has identity and cannot be moved from
im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
Im: does not have identity and can be moved from.
The fourth possibility, IM, (doesn’t have identity and cannot be moved) is not
useful in C++ (or, I think) in any other language.
In addition to these three fundamental classifications of values, we
have two obvious generalizations that correspond to the two
independent properties:
i: has identity
m: can be moved from
This led me to put this diagram on the board:
Naming
I observed that we had only limited freedom to name: The two points to
the left (labeled iM and i) are what people with more or less
formality have called lvalues and the two points on the right
(labeled m and Im) are what people with more or less formality
have called rvalues. This must be reflected in our naming. That is,
the left "leg" of the W should have names related to lvalue and the
right "leg" of the W should have names related to rvalue. I note
that this whole discussion/problem arise from the introduction of
rvalue references and move semantics. These notions simply don’t exist
in Strachey’s world consisting of just rvalues and lvalues. Someone
observed that the ideas that
Every value is either an lvalue or an rvalue
An lvalue is not an rvalue and an rvalue is not an lvalue
are deeply embedded in our consciousness, very useful properties, and
traces of this dichotomy can be found all over the draft standard. We
all agreed that we ought to preserve those properties (and make them
precise). This further constrained our naming choices. I observed that
the standard library wording uses rvalue to mean m (the
generalization), so that to preserve the expectation and text of the
standard library the right-hand bottom point of the W should be named
rvalue.
This led to a focused discussion of naming. First, we needed to decide
on lvalue. Should lvalue mean iM or the generalization i? Led
by Doug Gregor, we listed the places in the core language wording
where the word lvalue was qualified to mean the one or the other. A
list was made and in most cases and in the most tricky/brittle text
lvalue currently means iM. This is the classical meaning of lvalue
because "in the old days" nothing was moved; move is a novel notion
in C++0x. Also, naming the topleft point of the W lvalue gives us
the property that every value is an lvalue or an rvalue, but not both.
So, the top left point of the W is lvalue and the bottom right point
is rvalue. What does that make the bottom left and top right points?
The bottom left point is a generalization of the classical lvalue,
allowing for move. So it is a generalized lvalue. We named it
glvalue. You can quibble about the abbreviation, but (I think) not
with the logic. We assumed that in serious use generalized lvalue
would somehow be abbreviated anyway, so we had better do it
immediately (or risk confusion). The top right point of the W is less
general than the bottom right (now, as ever, called rvalue). That
point represent the original pure notion of an object you can move
from because it cannot be referred to again (except by a destructor).
I liked the phrase specialized rvalue in contrast to generalized
lvalue but pure rvalue abbreviated to prvalue won out (and
probably rightly so). So, the left leg of the W is lvalue and
glvalue and the right leg is prvalue and rvalue. Incidentally,
every value is either a glvalue or a prvalue, but not both.
This leaves the top middle of the W: im; that is, values that have
identity and can be moved. We really don’t have anything that guides
us to a good name for those esoteric beasts. They are important to
people working with the (draft) standard text, but are unlikely to
become a household name. We didn’t find any real constraints on the
naming to guide us, so we picked ‘x’ for the center, the unknown, the
strange, the xpert only, or even x-rated.
INTRODUCTION
ISO C++11 (officially ISO/IEC 14882:2011) is the 2011 revision of the standard of the C++ programming language. It introduced some new features and concepts, for example:
rvalue references
xvalue, glvalue, prvalue expression value categories
move semantics
To understand the new expression value categories, we have to be aware that there are both rvalue and lvalue references.
It helps to know that rvalues can bind to non-const rvalue references, but not to non-const lvalue references.
int& r_i=7; // compile error
int&& rr_i=7; // OK
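The binding rules behind these two lines can also be checked at compile time. A minimal sketch using std::is_constructible, where an argument type of int models an rvalue and int& models an lvalue:

```cpp
#include <type_traits>

// is_constructible<T, Arg> asks whether `T t(std::declval<Arg>());` compiles.
// An Arg of `int` models an rvalue, an Arg of `int&` models an lvalue.
static_assert(!std::is_constructible<int&, int>::value,
              "a non-const lvalue reference rejects rvalues");
static_assert(std::is_constructible<int&&, int>::value,
              "an rvalue reference accepts rvalues");
static_assert(std::is_constructible<const int&, int>::value,
              "a const lvalue reference accepts rvalues too");
static_assert(std::is_constructible<int&, int&>::value,
              "a non-const lvalue reference accepts lvalues");
static_assert(!std::is_constructible<int&&, int&>::value,
              "an rvalue reference rejects lvalues");
```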
We can gain some intuition of the concepts of value categories if we quote the subsection titled Lvalues and rvalues from the working draft N3337 (the draft closest to the published ISO C++11 standard).
3.10 Lvalues and rvalues [basic.lval]
1 Expressions are categorized according to the taxonomy in Figure 1.
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function
or an object. [ Example: If E is an expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function
whose return type is an lvalue reference is an lvalue. —end example ]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for
example). An xvalue is the result of certain kinds of expressions
involving rvalue references (8.3.2). [ Example: The result of calling
a function whose return type is an rvalue reference is an xvalue. —end
example ]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a
temporary object (12.2) or subobject thereof, or a value that is not
associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a
reference is a prvalue. The value of a literal such as 12, 7.3e5, or
true is also a prvalue. —end example ]
Every expression belongs to exactly one of the fundamental
classifications in this taxonomy: lvalue, xvalue, or prvalue. This
property of an expression is called its value category.
However, I am not sure that this subsection is enough to understand the concepts clearly, because "usually" is not really general, "near the end of its lifetime" is not really concrete, "involving rvalue references" is not really clear, and "Example: The result of calling a function whose return type is an rvalue reference is an xvalue." sounds like a snake biting its own tail.
PRIMARY VALUE CATEGORIES
Every expression belongs to exactly one primary value category. These value categories are lvalue, xvalue and prvalue categories.
lvalues
The expression E belongs to the lvalue category if and only if E refers to an entity that ALREADY has had an identity (address, name or alias) that makes it accessible outside of E.
#include <iostream>
int i=7;
const int& f(){
return i;
}
int main()
{
std::cout<<&"www"<<std::endl; // The expression "www" in this row is an lvalue expression, because string literals are arrays and every array has an address.
i; // The expression i in this row is an lvalue expression, because it refers to the same entity ...
i; // ... as the entity the expression i in this row refers to.
int* p_i=new int(7);
*p_i; // The expression *p_i in this row is an lvalue expression, because it refers to the same entity ...
*p_i; // ... as the entity the expression *p_i in this row refers to.
const int& r_I=7;
r_I; // The expression r_I in this row is an lvalue expression, because it refers to the same entity ...
r_I; // ... as the entity the expression r_I in this row refers to.
f(); // The expression f() in this row is an lvalue expression, because it refers to the same entity ...
i; // ... as the entity the expression f() in this row refers to.
return 0;
}
xvalues
The expression E belongs to the xvalue category if and only if it is
— the result of calling a function, whether implicitly or explicitly, whose return type is an rvalue reference to the type of object being returned, or
int&& f(){
return 3;
}
int main()
{
f(); // The expression f() belongs to the xvalue category, because f() return type is an rvalue reference to object type.
return 0;
}
— a cast to an rvalue reference to object type, or
#include <utility>
int main()
{
static_cast<int&&>(7); // The expression static_cast<int&&>(7) belongs to the xvalue category, because it is a cast to an rvalue reference to object type.
std::move(7); // std::move(7) is equivalent to static_cast<int&&>(7).
return 0;
}
— a class member access expression designating a non-static data member of non-reference type in which the object expression is an xvalue, or
struct As
{
int i;
};
As&& f(){
return As();
}
int main()
{
f().i; // The expression f().i belongs to the xvalue category, because As::i is a non-static data member of non-reference type, and the subexpression f() belongs to the xvalue category.
return 0;
}
— a pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
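The last bullet has no example of its own, so here is a small sketch of it. The struct As mirrors the one used above; read_through_xvalue is a hypothetical helper name introduced only for this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct As { int i; };

// The first operand is an xvalue and the second is a pointer to data member:
// the whole .* expression is an xvalue, so decltype((...)) yields int&&.
static_assert(std::is_same<decltype((std::declval<As&&>().*(&As::i))), int&&>::value,
              "xvalue object expression gives an xvalue");

// With an lvalue as the first operand the result is an lvalue instead.
static_assert(std::is_same<decltype((std::declval<As&>().*(&As::i))), int&>::value,
              "lvalue object expression gives an lvalue");

int read_through_xvalue() {
    As a{42};
    int As::* pm = &As::i;
    int&& r = std::move(a).*pm; // an rvalue reference can bind to the xvalue
    assert(&r == &a.i);         // it still designates the member of a
    return r;
}
```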
Note that the effect of the rules above is that named rvalue references to objects are treated as lvalues and unnamed rvalue references to objects are treated as xvalues; rvalue references to functions are treated as lvalues whether named or not.
#include <utility>
struct As
{
int i;
};
As&& f(){
return As();
}
int main()
{
f(); // The expression f() belongs to the xvalue category, because it refers to an unnamed rvalue reference to object.
As&& rr_a=As();
rr_a; // The expression rr_a belongs to the lvalue category, because it refers to a named rvalue reference to object.
std::move(f); // The expression std::move(f) belongs to the lvalue category, because it refers to an unnamed rvalue reference to function.
return 0;
}
prvalues
The expression E belongs to the prvalue category if and only if E belongs neither to the lvalue nor to the xvalue category.
struct As
{
void f(){
this; // The expression this is a prvalue expression. Note, that the expression this is not a variable.
}
};
As f(){
return As();
}
int main()
{
f(); // The expression f() belongs to the prvalue category, because it belongs neither to the lvalue nor to the xvalue category.
return 0;
}
MIXED VALUE CATEGORIES
There are two further important mixed value categories. These value categories are rvalue and glvalue categories.
rvalues
The expression E belongs to the rvalue category if and only if E belongs to the xvalue category, or to the prvalue category.
Note that this definition means that the expression E belongs to the rvalue category if and only if E refers to an entity that has not had any identity that makes it accessible outside of E YET.
glvalues
The expression E belongs to the glvalue category if and only if E belongs to the lvalue category, or to the xvalue category.
A PRACTICAL RULE
Scott Meyers has published a very useful rule of thumb to distinguish rvalues from lvalues.
If you can take the address of an expression, the expression is an lvalue.
If the type of an expression is an lvalue reference (e.g., T& or const T&, etc.), that expression is an lvalue.
Otherwise, the expression is an rvalue. Conceptually (and typically also in fact), rvalues correspond to temporary objects, such
as those returned from functions or created through implicit type
conversions. Most literal values (e.g., 10 and 5.3) are also rvalues.
I have struggled with this for a long time, until I came across the cppreference.com explanation of the value categories.
It is actually rather simple, but I find that it is often explained in a way that's hard to memorize. Here it is explained very schematically. I'll quote some parts of the page:
Primary categories
The primary value categories correspond to two properties of expressions:
has identity: it's possible to determine whether the expression refers to the same entity as another expression, such as by comparing addresses of the objects or the functions they identify (obtained directly or indirectly);
can be moved from: move constructor, move assignment operator, or another function overload that implements move semantics can bind to the expression.
Expressions that:
have identity and cannot be moved from are called lvalue expressions;
have identity and can be moved from are called xvalue expressions;
do not have identity and can be moved from are called prvalue expressions;
do not have identity and cannot be moved from are not used.
lvalue
An lvalue ("left value") expression is an expression that has identity and cannot be moved from.
rvalue (until C++11), prvalue (since C++11)
A prvalue ("pure rvalue") expression is an expression that does not have identity and can be moved from.
xvalue
An xvalue ("expiring value") expression is an expression that has identity and can be moved from.
glvalue
A glvalue ("generalized lvalue") expression is an expression that is either an lvalue or an xvalue. It has identity. It may or may not be moved from.
rvalue (since C++11)
An rvalue ("right value") expression is an expression that is either a prvalue or an xvalue. It can be moved from. It may or may not have identity.
So let's put that into a table:
                          Can be moved from (= rvalue)   Cannot be moved from
Has identity (= glvalue)  xvalue                         lvalue
No identity               prvalue                        not used
C++03's categories are too restricted to capture the introduction of rvalue references correctly into expression attributes.
With the introduction of them, it was said that an unnamed rvalue reference evaluates to an rvalue, such that overload resolution would prefer rvalue reference bindings, which would make it select move constructors over copy constructors. But it was found that this causes problems all around, for example with Dynamic Types and with qualifications.
To show this, consider
int const&& f();
int main() {
int &&i = f(); // disgusting!
}
On pre-xvalue drafts, this was allowed, because in C++03, rvalues of non-class types are never cv-qualified. But it is intended that const applies in the rvalue-reference case, because here we do refer to objects (= memory!), and dropping const from non-class rvalues is mainly for the reason that there is no object around.
The issue for dynamic types is of similar nature. In C++03, rvalues of class type have a known dynamic type - it's the static type of that expression. Because to have it another way, you need references or dereferences, which evaluate to an lvalue. That isn't true with unnamed rvalue references, yet they can show polymorphic behavior. So to solve it,
unnamed rvalue references become xvalues. They can be qualified and potentially have their dynamic type different. They do, like intended, prefer rvalue references during overloading, and won't bind to non-const lvalue references.
What previously was an rvalue (literals, objects created by casts to non-reference types) now becomes a prvalue. They have the same preference as xvalues during overloading.
What previously was an lvalue stays an lvalue.
And two groupings are done to capture those that can be qualified and can have different dynamic types (glvalues) and those where overloading prefers rvalue reference binding (rvalues).
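Both properties attributed to xvalues here, keeping their cv-qualification and having a dynamic type that differs from the static type, can be checked in a few lines. This is a sketch with hypothetical names (qualified_xvalue, Base, Derived) chosen for the illustration:

```cpp
#include <cassert>
#include <type_traits>

// Declared only; it is used exclusively in unevaluated contexts.
const int&& qualified_xvalue();

// The xvalue keeps its const, so a plain int&& cannot bind to it.
static_assert(std::is_same<decltype(qualified_xvalue()), const int&&>::value,
              "the xvalue is const-qualified");
static_assert(!std::is_constructible<int&&, const int&&>::value,
              "int&& cannot drop the const");
static_assert(std::is_constructible<const int&&, const int&&>::value,
              "const int&& binds fine");

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};
struct Derived : Base {
    int id() const override { return 2; }
};

// xvalue: static type Base, dynamic type Derived, so Derived::id runs.
int id_via_xvalue() {
    Derived d;
    return static_cast<Base&&>(d).id();
}

// prvalue: a fresh Base object is created, so its dynamic type is Base.
int id_via_prvalue() {
    Derived d;
    return static_cast<Base>(d).id();
}
```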
As the previous answers exhaustively covered the theory behind the value categories, there is just another thing I'd like to add: you can actually play with it and test it.
For some hands-on experimentation with the value categories, you can make use of the decltype specifier. Its behavior explicitly distinguishes between the three primary value categories (xvalue, lvalue, and prvalue).
Using the preprocessor saves us some typing ...
Primary categories:
#include <type_traits>
#define IS_XVALUE(X) std::is_rvalue_reference<decltype((X))>::value
#define IS_LVALUE(X) std::is_lvalue_reference<decltype((X))>::value
#define IS_PRVALUE(X) !std::is_reference<decltype((X))>::value
Mixed categories:
#define IS_GLVALUE(X) (IS_LVALUE(X) || IS_XVALUE(X))
#define IS_RVALUE(X) (IS_PRVALUE(X) || IS_XVALUE(X))
Now we can reproduce (almost) all the examples from cppreference on value category.
Here are some examples with C++17 (for terse static_assert):
#include <exception>
#include <string>
#include <utility>
void doesNothing(){}
struct S
{
int x{0};
};
int x = 1;
int y = 2;
S s;
static_assert(IS_LVALUE(x));
static_assert(IS_LVALUE(x+=y));
static_assert(IS_LVALUE("Hello world!"));
static_assert(IS_LVALUE(++x));
static_assert(IS_PRVALUE(1));
static_assert(IS_PRVALUE(x++));
static_assert(IS_PRVALUE(static_cast<double>(x)));
static_assert(IS_PRVALUE(std::string{}));
static_assert(IS_PRVALUE(throw std::exception()));
static_assert(IS_PRVALUE(doesNothing()));
static_assert(IS_XVALUE(std::move(s)));
// The next one doesn't work in gcc 8.2 but in gcc 9.1. Clang 7.0.0 and msvc 19.16 are doing fine.
static_assert(IS_XVALUE(S().x));
The mixed categories are kind of boring once you've figured out the primary category.
For some more examples (and experimentation), check out the following link on compiler explorer. Don't bother reading the assembly, though. I added a lot of compilers just to make sure it works across all the common compilers.
These are terms that the C++ committee used to define move semantics in C++11. Here's the story.
I find it difficult to understand the terms given their precise definitions, the long lists of rules or this popular diagram:
It's easier on a Venn diagram with typical examples:
Basically:
every expression is either lvalue or rvalue
lvalue must be copied, because it has identity, so can be used later
rvalue can be moved, either because it's a temporary (prvalue) or explicitly moved (xvalue)
Now, the good question is that if we have two orthogonal properties ("has identity" and "can be moved"), what's the fourth category to complete lvalue, xvalue and prvalue?
That would be an expression that has no identity (hence cannot be accessed later) and cannot be moved (one needs to copy its value). This is simply not useful, so it hasn't been named.
How do these new categories relate to the existing rvalue and lvalue categories?
A C++03 lvalue is still a C++11 lvalue, whereas a C++03 rvalue is called a prvalue in C++11.
One addendum to the excellent answers above, on a point that confused me even after I had read Stroustrup and thought I understood the rvalue/lvalue distinction. When you see
int&& a = 3,
it's very tempting to read the int&& as a type and conclude that a is an rvalue. It's not:
int&& a = 3;
int&& c = a; //error: cannot bind 'int' lvalue to 'int&&'
int& b = a; //compiles
a has a name and is ipso facto an lvalue. The && in the declaration tells you what a is allowed to bind to; it does not make the expression a an rvalue.
This matters particularly for T&& type arguments in constructors. If you write
Foo::Foo(T&& _t) : t{_t} {}
you will copy _t into t. You need
Foo::Foo(T&& _t) : t{std::move(_t)} {} if you want to move. Would that my compiler warned me when I left out the move!
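The difference between the two constructors is easy to observe with a std::vector member, because a moved vector takes over the source's buffer while a copied one allocates its own. A sketch, with CopiesArg and MovesArg as hypothetical stand-ins for the two versions of Foo:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the two versions of Foo shown above.
struct CopiesArg {
    std::vector<int> t;
    explicit CopiesArg(std::vector<int>&& _t) : t{_t} {}            // _t is an lvalue: copy
};

struct MovesArg {
    std::vector<int> t;
    explicit MovesArg(std::vector<int>&& _t) : t{std::move(_t)} {}  // xvalue: move
};

bool member_init_copies_then_moves() {
    std::vector<int> a(3, 7);
    std::vector<int> b(3, 7);
    const int* a_data = a.data();
    const int* b_data = b.data();

    CopiesArg c(std::move(a)); // the caller asked for a move, the constructor copied
    MovesArg  m(std::move(b)); // the buffer of b is handed over to m.t

    bool copied = (c.t.data() != a_data) && (a.size() == 3); // a is untouched
    bool moved  = (m.t.data() == b_data);                    // same buffer, new owner
    assert(copied && moved);
    return copied && moved;
}
```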
This is a Venn diagram I made for a highly visual C++ book I'm writing, which I will soon be publishing on Leanpub during development.
The other answers go into more detail with words, and show similar diagrams. But hopefully this presentation of the information is fairly complete and useful for referencing, in addition.
The main takeaway for me on this topic is that expressions have these two properties: identity and movability. The first deals with the "solidness" with which something exists. That's important because the C++ abstract machine is allowed to and encouraged to aggressively change and shrink your code through optimizations, and that means that things without identity might only ever exist in the mind of the compiler or in a register for a brief moment before getting trampled on. But a piece of data like that is also guaranteed not to cause issues if you recycle its innards since there's no way to try to use it. And thus, move semantics were invented to allow us to capture references to temporaries, upgrading them to lvalues and extending their lifetime.
Move semantics originally were about not just throwing away temporaries wastefully, but instead giving them away so they can be consumed by another.
When you give your cornbread away, the person you give it to now owns it. They consume it. You should not attempt to eat or digest said cornbread once you've given it away. Maybe that cornbread was headed for the trash anyway, but now it's headed for their bellies. It's not yours anymore.
In C++ land, the idea of "consuming" a resource means that resource is now owned by us and so we should do any cleanup necessary, and ensure the object isn't accessed elsewhere. Oftentimes, that means borrowing the guts to create new objects. I call that "donating organs". Usually, we are talking about pointers or references contained by the object, or something like that, and we want to keep those pointers or references around because they refer to data elsewhere in our program that is not dying.
Thus you could write a function overload that takes an rvalue reference, and if a temporary (prvalue) were passed in, that's the overload that would be called. A new lvalue would be created upon binding to the rvalue reference taken by the function, extending the life of the temporary so you could consume it within your function.
At some point, we realized that we often had lvalue non-temporary data that we were finished with in one scope, but wanted to cannibalize in another scope. But they aren't rvalues and so wouldn't bind to an rvalue reference. So we made std::move, which is just a fancy cast from lvalue to rvalue reference. Such a datum is an xvalue: a former lvalue now acting as if it were a temporary so it can also be moved from.
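That std::move is only a cast, and moves nothing by itself, can be verified in a few lines. A minimal sketch; the function name is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move on an lvalue yields the same type as the explicit cast.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move produces an rvalue reference");
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           decltype(static_cast<std::string&&>(
                               std::declval<std::string&>()))>::value,
              "same type as static_cast<std::string&&>");

bool move_alone_changes_nothing() {
    std::string s = "cornbread";
    std::string&& r = std::move(s); // only a cast; no move constructor runs
    bool same_object = (&r == &s);  // the xvalue designated s itself
    bool untouched   = (s == "cornbread");
    assert(same_object && untouched);
    return same_object && untouched;
}
```

The contents of s only change once a move constructor or move assignment operator actually receives the xvalue.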
Related
In C++03, an expression is either an rvalue or an lvalue.
In C++11, an expression can be an:
rvalue
lvalue
xvalue
glvalue
prvalue
Two categories have become five categories.
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I guess this document might serve as a not-so-short introduction: n3055
The whole massacre began with move semantics. Once we have expressions that can be moved rather than copied, the once easy-to-grasp rules suddenly had to distinguish which expressions can be moved, and in which direction.
From what I can tell from the draft, the rvalue/lvalue distinction stays the same; things only get messy in the context of moving.
Are they needed? Probably not if we wish to forfeit the new features. But to allow better optimization we should probably embrace them.
Quoting n3055:
An lvalue (so-called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue.]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references. [Example: The result of calling a function whose return type is an rvalue reference is an xvalue.]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so-called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a temporary object or subobject thereof, or a value that is not associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [Example: The result of calling a function whose return type is not a reference is a prvalue.]
The document in question is a great reference for this question, because it shows the exact changes in the standard that have happened as a result of the introduction of the new nomenclature.
What are these new categories of expressions?
The FCD (n3092) has an excellent description:
— An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [ Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue. —end example ]
— An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ]
— A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
— An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expressions) is an xvalue, a temporary object (12.2) or subobject thereof, or a value that is not associated with an object.
— A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a reference is a prvalue. The value of a literal such as 12, 7.3e5, or true is also a prvalue. —end example ]
Every expression belongs to exactly one of the fundamental classifications in this taxonomy: lvalue, xvalue, or prvalue. This property of an expression is called its value category. [ Note: The discussion of each built-in operator in Clause 5 indicates the category of the value it yields and the value categories of the operands it expects. For example, the built-in assignment operators expect that the left operand is an lvalue and that the right operand is a prvalue and yield an lvalue as the result. User-defined operators are functions, and the categories of values they expect and yield are determined by their parameter and return types. —end note ]
I suggest you read the entire section 3.10 Lvalues and rvalues though.
How do these new categories relate to the existing rvalue and lvalue categories?
Again:
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
The semantics of rvalues has evolved particularly with the introduction of move semantics.
Why are these new categories needed?
So that move construction/assignment could be defined and supported.
I'll start with your last question:
Why are these new categories needed?
The C++ standard contains many rules that deal with the value category of an expression. Some rules make a distinction between lvalue and rvalue. For example, when it comes to overload resolution. Other rules make a distinction between glvalue and prvalue. For example, you can have a glvalue with an incomplete or abstract type but there is no prvalue with an incomplete or abstract type. Before we had this terminology the rules that actually need to distinguish between glvalue/prvalue referred to lvalue/rvalue and they were either unintentionally wrong or contained lots of explaining and exceptions to the rule a la "...unless the rvalue is due to unnamed rvalue reference...". So, it seems like a good idea to just give the concepts of glvalues and prvalues their own name.
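To make the glvalue/prvalue point a bit more concrete, here is a minimal sketch. The names `Opaque`, `instance`, `Shape`, `Triangle` and `sides_of` are made up for illustration. It shows a glvalue of incomplete type and a glvalue of abstract type, while the commented-out line is the kind of prvalue that cannot exist.

```cpp
#include <cassert>
#include <type_traits>

struct Opaque;            // incomplete at this point: declared, not defined
Opaque& instance();       // calling it yields an lvalue (a glvalue) of incomplete type

// Naming the object and taking its address works without a definition of Opaque.
Opaque* address_of_instance() { return &instance(); }
static_assert(std::is_lvalue_reference<decltype((instance()))>::value,
              "glvalue of incomplete type");

// A glvalue may also have an abstract type; a prvalue may not.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};
int sides_of(const Shape& s) { return s.sides(); }  // `s` is an lvalue of abstract type
// Shape copy = Shape();  // ill-formed: would need a prvalue of abstract type

// The type is completed only here.
struct Opaque { int payload = 42; };
Opaque& instance() { static Opaque o; return o; }

void demo() {
    assert(address_of_instance()->payload == 42);
    assert(sides_of(Triangle{}) == 3);
}
```

Calling `demo()` from `main` runs both checks.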
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
We still have the terms lvalue and rvalue that are compatible with C++98. We just divided the rvalues into two subgroups, xvalues and prvalues, and we refer to lvalues and xvalues as glvalues. Xvalues are a new kind of value category for unnamed rvalue references. Every expression is one of these three: lvalue, xvalue, prvalue. A Venn diagram would look like this:
    ______ ______
   /      X      \
  /      / \      \
 |   l  | x |  pr  |
  \      \ /      /
   \______X______/
       gl    r
Examples with functions:
int prvalue();
int& lvalue();
int&& xvalue();
But also don't forget that named rvalue references are lvalues:
void foo(int&& t) {
// t is initialized with an rvalue expression
// but is actually an lvalue expression itself
}
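If you want to convince yourself of the last point, overload resolution can serve as a probe. This is only a sketch with made-up helper names (`is_lvalue_arg`, `named_param_is_lvalue`, `moved_param_is_lvalue`), and it assumes C++14 or later so that std::move is usable in constant expressions.

```cpp
#include <utility>

// Hypothetical probes: the overload that gets picked reveals the category.
constexpr bool is_lvalue_arg(int&)  { return true;  }
constexpr bool is_lvalue_arg(int&&) { return false; }

// t is declared as int&&, but the expression `t` has a name, so it is an lvalue.
constexpr bool named_param_is_lvalue(int&& t) { return is_lvalue_arg(t); }

// std::move(t) is an unnamed rvalue reference, i.e. an xvalue (an rvalue).
constexpr bool moved_param_is_lvalue(int&& t) { return is_lvalue_arg(std::move(t)); }

static_assert(named_param_is_lvalue(1), "named rvalue reference is an lvalue");
static_assert(!moved_param_is_lvalue(1), "std::move(t) is an rvalue");
```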
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I don't feel that the other answers (good though many of them are) really capture the answer to this particular question. Yes, these categories and such exist to allow move semantics, but the complexity exists for one reason. This is the one inviolate rule of moving stuff in C++11:
Thou shalt move only when it is unquestionably safe to do so.
That is why these categories exist: to be able to talk about values where it is safe to move from them, and to talk about values where it is not.
In the earliest version of r-value references, movement happened easily. Too easily. Easily enough that there was a lot of potential for implicitly moving things when the user didn't really mean to.
Here are the circumstances under which it is safe to move something:
When it's a temporary or subobject thereof. (prvalue)
When the user has explicitly said to move it.
If you do this:
SomeType &&Func() { ... }
SomeType &&val = Func();
SomeType otherVal{val};
What does this do? In older versions of the spec, before the 5 values came in, this would provoke a move. Of course it does. You passed an rvalue reference to the constructor, and thus it binds to the constructor that takes an rvalue reference. That's obvious.
There's just one problem with this; you didn't ask to move it. Oh, you might say that the && should have been a clue, but that doesn't change the fact that it broke the rule. val isn't a temporary because temporaries don't have names. You may have extended the lifetime of the temporary, but that means it isn't temporary; it's just like any other stack variable.
If it's not a temporary, and you didn't ask to move it, then moving is wrong.
The obvious solution is to make val an lvalue. This means that you can't move from it. OK, fine; it's named, so it's an lvalue.
Once you do that, you can no longer say that SomeType&& means the same thing everywhere. You've now made a distinction between named rvalue references and unnamed rvalue references. Well, named rvalue references are lvalues; that was our solution above. So what do we call unnamed rvalue references (the return value from Func above)?
It's not an lvalue, because you can't move from an lvalue. And we need to be able to move by returning a &&; how else could you explicitly say to move something? That is what std::move returns, after all. It's not an rvalue (old-style), because it can be on the left side of an equation (things are actually a bit more complicated, see this question and the comments below). It is neither an lvalue nor an rvalue; it's a new kind of thing.
What we have is a value that you can treat as an lvalue, except that it is implicitly moveable from. We call it an xvalue.
Note that xvalues are what makes us gain the other two categories of values:
A prvalue is really just the new name for the previous type of rvalue, i.e. they're the rvalues that aren't xvalues.
Glvalues are the union of xvalues and lvalues in one group, because they do share a lot of properties in common.
So really, it all comes down to xvalues and the need to restrict movement to exactly and only certain places. Those places are defined by the rvalue category; prvalues are the implicit moves, and xvalues are the explicit moves (std::move returns an xvalue).
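Here is a small sketch of the rule in action. `SomeType` is a stand-in that merely counts its copy and move constructions, and instead of calling `Func()` I bind `val` to a temporary directly so that the reference is valid. Both choices are mine and not part of the example above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for SomeType that counts how it gets constructed.
struct SomeType {
    static int copies;
    static int moves;
    SomeType() = default;
    SomeType(const SomeType&) { ++copies; }
    SomeType(SomeType&&) noexcept { ++moves; }
};
int SomeType::copies = 0;
int SomeType::moves = 0;

void demo() {
    SomeType&& val = SomeType{};     // temporary bound to a named rvalue reference
    SomeType otherVal{val};          // val is an lvalue: the copy constructor runs
    assert(SomeType::copies == 1 && SomeType::moves == 0);

    SomeType moved{std::move(val)};  // explicit request (xvalue): the move constructor runs
    assert(SomeType::copies == 1 && SomeType::moves == 1);
    (void)otherVal;
    (void)moved;
}
```

Without the std::move nothing is moved, which is exactly the "only when unquestionably safe" rule.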
IMHO, the best explanation of their meaning was given by Stroustrup; also take into account the examples by Dániel Sándor and Mohan:
Stroustrup:
Now I was seriously worried. Clearly we were headed for an impasse or
a mess or both. I spent the lunchtime doing an analysis to see which
of the properties (of values) were independent. There were only two
independent properties:
has identity – i.e. an address, a pointer, the user can determine whether two copies are identical, etc.
can be moved from – i.e. we are allowed to leave the source of a "copy" in some indeterminate, but valid state
This led me to the conclusion that there are exactly three kinds of
values (using the regex notational trick of using a capital letter to
indicate a negative – I was in a hurry):
iM: has identity and cannot be moved from
im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
Im: does not have identity and can be moved from.
The fourth possibility, IM, (doesn’t have identity and cannot be moved) is not
useful in C++ (or, I think) in any other language.
In addition to these three fundamental classifications of values, we
have two obvious generalizations that correspond to the two
independent properties:
i: has identity
m: can be moved from
This led me to put this diagram on the board:
Naming
I observed that we had only limited freedom to name: The two points to
the left (labeled iM and i) are what people with more or less
formality have called lvalues and the two points on the right
(labeled m and Im) are what people with more or less formality
have called rvalues. This must be reflected in our naming. That is,
the left "leg" of the W should have names related to lvalue and the
right "leg" of the W should have names related to rvalue. I note
that this whole discussion/problem arise from the introduction of
rvalue references and move semantics. These notions simply don’t exist
in Strachey’s world consisting of just rvalues and lvalues. Someone
observed that the ideas that
Every value is either an lvalue or an rvalue
An lvalue is not an rvalue and an rvalue is not an lvalue
are deeply embedded in our consciousness, very useful properties, and
traces of this dichotomy can be found all over the draft standard. We
all agreed that we ought to preserve those properties (and make them
precise). This further constrained our naming choices. I observed that
the standard library wording uses rvalue to mean m (the
generalization), so that to preserve the expectation and text of the
standard library the right-hand bottom point of the W should be named
rvalue.
This led to a focused discussion of naming. First, we needed to decide
on lvalue. Should lvalue mean iM or the generalization i? Led
by Doug Gregor, we listed the places in the core language wording
where the word lvalue was qualified to mean the one or the other. A
list was made and in most cases and in the most tricky/brittle text
lvalue currently means iM. This is the classical meaning of lvalue
because "in the old days" nothing was moved; move is a novel notion
in C++0x. Also, naming the topleft point of the W lvalue gives us
the property that every value is an lvalue or an rvalue, but not both.
So, the top left point of the W is lvalue and the bottom right point
is rvalue. What does that make the bottom left and top right points?
The bottom left point is a generalization of the classical lvalue,
allowing for move. So it is a generalized lvalue. We named it
glvalue. You can quibble about the abbreviation, but (I think) not
with the logic. We assumed that in serious use generalized lvalue
would somehow be abbreviated anyway, so we had better do it
immediately (or risk confusion). The top right point of the W is less
general than the bottom right (now, as ever, called rvalue). That
point represent the original pure notion of an object you can move
from because it cannot be referred to again (except by a destructor).
I liked the phrase specialized rvalue in contrast to generalized
lvalue but pure rvalue abbreviated to prvalue won out (and
probably rightly so). So, the left leg of the W is lvalue and
glvalue and the right leg is prvalue and rvalue. Incidentally,
every value is either a glvalue or a prvalue, but not both.
This leaves the top middle of the W: im; that is, values that have
identity and can be moved. We really don’t have anything that guides
us to a good name for those esoteric beasts. They are important to
people working with the (draft) standard text, but are unlikely to
become a household name. We didn’t find any real constraints on the
naming to guide us, so we picked ‘x’ for the center, the unknown, the
strange, the xpert only, or even x-rated.
INTRODUCTION
ISO C++11 (officially ISO/IEC 14882:2011) was the most recent version of the standard of the C++ programming language at the time of writing. It contains some new features and concepts, for example:
rvalue references
xvalue, glvalue, prvalue expression value categories
move semantics
To understand the new expression value categories, we have to be aware that there are both rvalue references and lvalue references.
In particular, rvalues can bind to non-const rvalue references, but not to non-const lvalue references.
int& r_i=7; // compile error
int&& rr_i=7; // OK
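The same binding rules can be checked without triggering compile errors by asking std::is_constructible. This is just a sketch; the constant names are my own. Note that `std::is_constructible<R, int>` models initialization from an rvalue of type int, and `std::is_constructible<R, int&>` models initialization from an lvalue.

```cpp
#include <type_traits>

// int& r = 7;          ill-formed: an rvalue cannot bind to a non-const lvalue reference
constexpr bool rvalue_to_lvalue_ref = std::is_constructible<int&, int>::value;
// const int& r = 7;    OK
constexpr bool rvalue_to_const_lvalue_ref = std::is_constructible<const int&, int>::value;
// int&& r = 7;         OK
constexpr bool rvalue_to_rvalue_ref = std::is_constructible<int&&, int>::value;
// int i; int&& r = i;  ill-formed: an lvalue cannot bind to an rvalue reference
constexpr bool lvalue_to_rvalue_ref = std::is_constructible<int&&, int&>::value;

static_assert(!rvalue_to_lvalue_ref, "");
static_assert(rvalue_to_const_lvalue_ref, "");
static_assert(rvalue_to_rvalue_ref, "");
static_assert(!lvalue_to_rvalue_ref, "");
```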
We can gain some intuition of the concepts of value categories if we quote the subsection titled Lvalues and rvalues from the working draft N3337 (the most similar draft to the published ISOC++11 standard).
3.10 Lvalues and rvalues [basic.lval]
1 Expressions are categorized according to the taxonomy in Figure 1.
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [ Example: If E is an expression of pointer type, then *E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function whose return type is an lvalue reference is an lvalue. —end example ]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a temporary object (12.2) or subobject thereof, or a value that is not associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a reference is a prvalue. The value of a literal such as 12, 7.3e5, or true is also a prvalue. —end example ]
Every expression belongs to exactly one of the fundamental classifications in this taxonomy: lvalue, xvalue, or prvalue. This property of an expression is called its value category.
But I am not sure that this subsection is enough to understand the concepts clearly, because "usually" is not really general, "near the end of its lifetime" is not really concrete, "involving rvalue references" is not really clear, and "Example: The result of calling a function whose return type is an rvalue reference is an xvalue." sounds like a snake biting its own tail.
PRIMARY VALUE CATEGORIES
Every expression belongs to exactly one primary value category. These value categories are lvalue, xvalue and prvalue categories.
lvalues
The expression E belongs to the lvalue category if and only if E refers to an entity that ALREADY has had an identity (address, name or alias) that makes it accessible outside of E.
#include <iostream>
int i=7;
const int& f(){
    return i;
}
int main()
{
    std::cout<<&"www"<<std::endl; // The expression "www" in this row is an lvalue expression, because string literals are arrays and every array has an address.
    i; // The expression i in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression i in this row refers to.
    int* p_i=new int(7);
    *p_i; // The expression *p_i in this row is an lvalue expression, because it refers to the same entity ...
    *p_i; // ... as the entity the expression *p_i in this row refers to.
    const int& r_I=7;
    r_I; // The expression r_I in this row is an lvalue expression, because it refers to the same entity ...
    r_I; // ... as the entity the expression r_I in this row refers to.
    f(); // The expression f() in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression f() in this row refers to.
    return 0;
}
xvalues
The expression E belongs to the xvalue category if and only if it is
— the result of calling a function, whether implicitly or explicitly, whose return type is an rvalue reference to the type of object being returned, or
int&& f(){
    return 3; // Returns a dangling reference; harmless here only because the result is never read.
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because the return type of f is an rvalue reference to object type.
    return 0;
}
— a cast to an rvalue reference to object type, or
#include <utility> // std::move
int main()
{
    static_cast<int&&>(7); // The expression static_cast<int&&>(7) belongs to the xvalue category, because it is a cast to an rvalue reference to object type.
    std::move(7); // std::move(7) is equivalent to static_cast<int&&>(7).
    return 0;
}
— a class member access expression designating a non-static data member of non-reference type in which the object expression is an xvalue, or
struct As
{
    int i;
};
As&& f(){
    return As(); // Returns a dangling reference; harmless here only because the result is never read.
}
int main()
{
    f().i; // The expression f().i belongs to the xvalue category, because As::i is a non-static data member of non-reference type, and the subexpression f() belongs to the xvalue category.
    return 0;
}
— a pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
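The pointer-to-member case has no example above, so here is a minimal sketch of one. The variable names `a` and `pm` are mine, and the check relies on `decltype((e))` yielding `T&&` for an xvalue and `T&` for an lvalue.

```cpp
#include <type_traits>
#include <utility>

struct As
{
    int i;
};

As a{7};
int As::* pm = &As::i;  // pointer to the data member As::i

// First operand is an xvalue, second is a pointer to data member: the result is an xvalue.
static_assert(std::is_same<decltype((std::move(a).*pm)), int&&>::value, "xvalue");

// With an lvalue as the first operand, the result is an lvalue instead.
static_assert(std::is_same<decltype((a.*pm)), int&>::value, "lvalue");
```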
Note that the effect of the rules above is that named rvalue references to objects are treated as lvalues and unnamed rvalue references to objects are treated as xvalues; rvalue references to functions are treated as lvalues whether named or not.
struct As
{
    int i;
};
As&& f(){
    return As(); // Returns a dangling reference; harmless here only because the result is never read.
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because it refers to an unnamed rvalue reference to object.
    As&& rr_a=As();
    rr_a; // The expression rr_a belongs to the lvalue category, because it refers to a named rvalue reference to object.
    static_cast<As&&(&&)()>(f); // This cast expression belongs to the lvalue category, because it is an (unnamed) rvalue reference to function.
    return 0;
}
prvalues
The expression E belongs to the prvalue category if and only if E belongs neither to the lvalue nor to the xvalue category.
struct As
{
    void f(){
        this; // The expression this is a prvalue expression. Note, that the expression this is not a variable.
    }
};
As f(){
    return As();
}
int main()
{
    f(); // The expression f() belongs to the prvalue category, because it belongs neither to the lvalue nor to the xvalue category.
    return 0;
}
MIXED VALUE CATEGORIES
There are two further important mixed value categories. These value categories are rvalue and glvalue categories.
rvalues
The expression E belongs to the rvalue category if and only if E belongs to the xvalue category, or to the prvalue category.
Note that this definition means that the expression E belongs to the rvalue category if and only if E refers to an entity that has not had any identity that makes it accessible outside of E YET.
glvalues
The expression E belongs to the glvalue category if and only if E belongs to the lvalue category, or to the xvalue category.
A PRACTICAL RULE
Scott Meyers has published a very useful rule of thumb to distinguish rvalues from lvalues.
If you can take the address of an expression, the expression is an lvalue.
If the type of an expression is an lvalue reference (e.g., T& or const T&, etc.), that expression is an lvalue.
Otherwise, the expression is an rvalue. Conceptually (and typically also in fact), rvalues correspond to temporary objects, such
as those returned from functions or created through implicit type
conversions. Most literal values (e.g., 10 and 5.3) are also rvalues.
I have struggled with this for a long time, until I came across the cppreference.com explanation of the value categories.
It is actually rather simple, but I find that it is often explained in a way that's hard to memorize. Here it is explained very schematically. I'll quote some parts of the page:
Primary categories
The primary value categories correspond to two properties of expressions:
has identity: it's possible to determine whether the expression refers to the same entity as another expression, such as by comparing addresses of the objects or the functions they identify (obtained directly or indirectly);
can be moved from: move constructor, move assignment operator, or another function overload that implements move semantics can bind to the expression.
Expressions that:
have identity and cannot be moved from are called lvalue expressions;
have identity and can be moved from are called xvalue expressions;
do not have identity and can be moved from are called prvalue expressions;
do not have identity and cannot be moved from are not used.
lvalue
An lvalue ("left value") expression is an expression that has identity and cannot be moved from.
rvalue (until C++11), prvalue (since C++11)
A prvalue ("pure rvalue") expression is an expression that does not have identity and can be moved from.
xvalue
An xvalue ("expiring value") expression is an expression that has identity and can be moved from.
glvalue
A glvalue ("generalized lvalue") expression is an expression that is either an lvalue or an xvalue. It has identity. It may or may not be moved from.
rvalue (since C++11)
An rvalue ("right value") expression is an expression that is either a prvalue or an xvalue. It can be moved from. It may or may not have identity.
So let's put that into a table:
                         | Can be moved from (= rvalue) | Cannot be moved from
Has identity (= glvalue) | xvalue                       | lvalue
No identity              | prvalue                      | not used
C++03's categories are too restricted to capture the introduction of rvalue references correctly into expression attributes.
When rvalue references were first introduced, it was said that an unnamed rvalue reference evaluates to an rvalue, so that overload resolution would prefer rvalue reference bindings and thus select move constructors over copy constructors. But it was found that this causes problems all around, for example with dynamic types and with cv-qualification.
To show this, consider
int const&& f();
int main() {
int &&i = f(); // disgusting!
}
On pre-xvalue drafts, this was allowed, because in C++03, rvalues of non-class types are never cv-qualified. But it is intended that const applies in the rvalue-reference case, because here we do refer to objects (= memory!), and dropping const from non-class rvalues is mainly for the reason that there is no object around.
The issue for dynamic types is of similar nature. In C++03, rvalues of class type have a known dynamic type - it's the static type of that expression. Because to have it another way, you need references or dereferences, which evaluate to an lvalue. That isn't true with unnamed rvalue references, yet they can show polymorphic behavior. So to solve it,
unnamed rvalue references become xvalues. They can be qualified and potentially have their dynamic type different. They do, like intended, prefer rvalue references during overloading, and won't bind to non-const lvalue references.
What previously was an rvalue (literals, objects created by casts to non-reference types) now becomes a prvalue. They have the same preference as xvalues during overloading.
What previously was an lvalue stays an lvalue.
And two groupings are done to capture those that can be qualified and can have different dynamic types (glvalues) and those where overloading prefers rvalue reference binding (rvalues).
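Both points (qualification and dynamic type) can be observed with current compilers. The following is only a sketch; `prvalue_source`, `xvalue_source`, `Base`, `Derived` and `name_via_xvalue` are names I made up for the illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Qualification: a non-class prvalue is never cv-qualified, an xvalue can be.
const int   prvalue_source();  // declared only, used in unevaluated operands
const int&& xvalue_source();

static_assert(std::is_same<decltype(prvalue_source()), int>::value,
              "const is dropped from a non-class prvalue");
static_assert(std::is_same<decltype(xvalue_source()), const int&&>::value,
              "const is kept on the xvalue");
// So `int&& i = xvalue_source();` is rejected, because it would drop const.
static_assert(!std::is_constructible<int&&, const int&&>::value, "");

// Dynamic type: an xvalue of static type Base may refer to a Derived object.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

std::string name_via_xvalue(Derived& d) {
    return static_cast<Base&&>(d).name();  // virtual dispatch through an xvalue
}

void demo() {
    Derived d;
    assert(name_via_xvalue(d) == "Derived");
}
```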
As the previous answers exhaustively covered the theory behind the value categories, there is just another thing I'd like to add: you can actually play with it and test it.
For some hands-on experimentation with the value categories, you can make use of the decltype specifier. Its behavior explicitly distinguishes between the three primary value categories (xvalue, lvalue, and prvalue).
Using the preprocessor saves us some typing ...
Primary categories:
#define IS_XVALUE(X) std::is_rvalue_reference<decltype((X))>::value
#define IS_LVALUE(X) std::is_lvalue_reference<decltype((X))>::value
#define IS_PRVALUE(X) !std::is_reference<decltype((X))>::value
Mixed categories:
#define IS_GLVALUE(X) (IS_LVALUE(X) || IS_XVALUE(X))
#define IS_RVALUE(X) (IS_PRVALUE(X) || IS_XVALUE(X))
Now we can reproduce (almost) all the examples from cppreference on value category.
Here are some examples with C++17 (for terse static_assert):
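// Headers needed by the macros above and by the examples below.
#include <exception>
#include <string>
#include <type_traits>
#include <utility>
void doesNothing(){}
struct S
{
    int x{0};
};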
int x = 1;
int y = 2;
S s;
static_assert(IS_LVALUE(x));
static_assert(IS_LVALUE(x+=y));
static_assert(IS_LVALUE("Hello world!"));
static_assert(IS_LVALUE(++x));
static_assert(IS_PRVALUE(1));
static_assert(IS_PRVALUE(x++));
static_assert(IS_PRVALUE(static_cast<double>(x)));
static_assert(IS_PRVALUE(std::string{}));
static_assert(IS_PRVALUE(throw std::exception()));
static_assert(IS_PRVALUE(doesNothing()));
static_assert(IS_XVALUE(std::move(s)));
// The next one fails with gcc 8.2 but works with gcc 9.1. Clang 7.0.0 and msvc 19.16 are doing fine.
static_assert(IS_XVALUE(S().x));
The mixed categories are kind of boring once you have figured out the primary category.
For some more examples (and experimentation), check out the following link on compiler explorer. Don't bother reading the assembly, though. I added a lot of compilers just to make sure it works across all the common compilers.
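If you prefer a single query over three boolean macros, the same decltype trick can be wrapped in a small trait. This is a sketch of one possible way to do it; `Category`, `category_of`, `VALUE_CATEGORY` and `to_string` are my own names and not anything from the standard library.

```cpp
#include <type_traits>
#include <utility>

enum class Category { lvalue, xvalue, prvalue };

// decltype((e)) is T& for an lvalue, T&& for an xvalue and plain T for a prvalue.
template <class T> struct category_of      : std::integral_constant<Category, Category::prvalue> {};
template <class T> struct category_of<T&>  : std::integral_constant<Category, Category::lvalue>  {};
template <class T> struct category_of<T&&> : std::integral_constant<Category, Category::xvalue>  {};

#define VALUE_CATEGORY(e) (category_of<decltype((e))>::value)

// Human-readable name, e.g. for printing.
constexpr const char* to_string(Category c) {
    return c == Category::lvalue ? "lvalue"
         : c == Category::xvalue ? "xvalue"
         : "prvalue";
}

int n = 0;
static_assert(VALUE_CATEGORY(n) == Category::lvalue, "");
static_assert(VALUE_CATEGORY(std::move(n)) == Category::xvalue, "");
static_assert(VALUE_CATEGORY(n + 1) == Category::prvalue, "");
```

Something like `std::cout << to_string(VALUE_CATEGORY(expr))` then prints the primary category of `expr`.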
These are terms that the C++ committee used to define move semantics in C++11. Here's the story.
I find it difficult to understand the terms given their precise definitions, the long lists of rules or this popular diagram:
It's easier on a Venn diagram with typical examples:
Basically:
every expression is either lvalue or rvalue
lvalue must be copied, because it has identity, so can be used later
rvalue can be moved, either because it's a temporary (prvalue) or explicitly moved (xvalue)
Now, the good question is that if we have two orthogonal properties ("has identity" and "can be moved"), what's the fourth category to complete lvalue, xvalue and prvalue?
That would be an expression that has no identity (hence cannot be accessed later) and cannot be moved (one needs to copy its value). This is simply not useful, so it hasn't been named.
How do these new categories relate to the existing rvalue and lvalue categories?
A C++03 lvalue is still a C++11 lvalue, whereas a C++03 rvalue is called a prvalue in C++11.
One addendum to the excellent answers above, on a point that confused me even after I had read Stroustrup and thought I understood the rvalue/lvalue distinction. When you see
int&& a = 3,
it's very tempting to read the int&& as a type and conclude that a is an rvalue. It's not:
int&& a = 3;
int&& c = a; //error: cannot bind 'int' lvalue to 'int&&'
int& b = a; //compiles
a has a name and is ipso facto an lvalue. The declared type of a is indeed int&&, but that only governs what a is allowed to bind to; the expression a itself is an lvalue of type int.
This matters particularly for T&& type arguments in constructors. If you write
Foo::Foo(T&& _t) : t{_t} {}
you will copy _t into t. You need
Foo::Foo(T&& _t) : t{std::move(_t)} {}
if you want to move. Would that my compiler warned me when I left out the move!
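The difference is observable at run time. In this sketch `CopyingFoo` and `MovingFoo` are hypothetical classes that mirror the two constructors, with `std::vector<int>` chosen as T because its move constructor is guaranteed to leave the source empty.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical classes mirroring the two constructors discussed above.
struct CopyingFoo {
    std::vector<int> t;
    explicit CopyingFoo(std::vector<int>&& _t) : t(_t) {}             // _t is an lvalue: copies
};
struct MovingFoo {
    std::vector<int> t;
    explicit MovingFoo(std::vector<int>&& _t) : t(std::move(_t)) {}   // xvalue: moves
};

void demo() {
    std::vector<int> a{1, 2, 3};
    CopyingFoo c(std::move(a));
    assert(a.size() == 3);    // a was only copied from, despite the std::move at the call site
    assert(c.t.size() == 3);

    std::vector<int> b{1, 2, 3};
    MovingFoo m(std::move(b));
    assert(b.empty());        // the vector move constructor leaves its source empty
    assert(m.t.size() == 3);
}
```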
This is a Venn diagram I made for a highly visual C++ book I'm writing, which I will be publishing on leanpub during development soon.
The other answers go into more detail with words, and show similar diagrams. But hopefully this presentation of the information is fairly complete and useful for referencing, in addition.
The main takeaway for me on this topic is that expressions have these two properties: identity and movability. The first deals with the "solidness" with which something exists. That's important because a C++ compiler is allowed and encouraged to aggressively transform and shrink your code through optimizations, and that means that things without identity might only ever exist in the mind of the compiler or in a register for a brief moment before getting trampled on. But a piece of data like that is also guaranteed not to cause issues if you recycle its innards, since there's no way to use it afterwards. And thus move semantics were invented: they let us bind references to temporaries, giving us a named handle (an lvalue) through which we can take over their contents while they are still alive.
Move semantics originally were about not just throwing away temporaries wastefully, but instead giving them away so they can be consumed by another.
When you give your cornbread away, the person you give it to now owns it. They consume it. You should not attempt to eat or digest said cornbread once you've given it away. Maybe that cornbread was headed for the trash anyway, but now it's headed for their bellies. It's not yours anymore.
In C++ land, the idea of "consuming" a resource means that resource is now owned by us and so we should do any cleanup necessary, and ensure the object isn't accessed elsewhere. Often times, that means borrowing the guts to create new objects. I call that "donating organs". Usually, we are talking about pointers or references contained by the object, or something like that, and we want to keep those pointers or references around because they refer to data elsewhere in our program that is not dying.
Thus you could write a function overload that takes an rvalue reference, and if a temporary (prvalue) were passed in, that's the overload that would be called. Inside the function, the rvalue reference parameter is itself a named lvalue referring to the temporary, which stays alive until the end of the full-expression containing the call, so you can consume it within your function.
At some point, we realized that we often had lvalue non-temporary data that we were finished with in one scope, but wanted to cannibalize in another scope. But they aren't rvalues and so wouldn't bind to an rvalue reference. So we made std::move, which is just a fancy cast from lvalue to rvalue reference. Such a datum is an xvalue: a former lvalue now acting as if it were a temporary so it can also be moved from.
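To show how little magic there is in that cast, here is a sketch of a hand-rolled equivalent plus a consuming function. `my_move` and `consume` are names I made up, and the code assumes C++14 for std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical re-implementation: essentially the cast that std::move performs.
template <class T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

static_assert(std::is_same<decltype(my_move(std::declval<int&>())), int&&>::value,
              "an lvalue goes in, an xvalue comes out");

// Takes over the pointee ("consumes" the argument).
int consume(std::unique_ptr<int>&& p) {
    std::unique_ptr<int> owner(std::move(p));  // p is named, hence an lvalue: cast again
    return *owner;
}

void demo() {
    assert(consume(std::make_unique<int>(1)) == 1);  // a prvalue binds directly

    auto keep = std::make_unique<int>(2);
    // consume(keep);                      // would not compile: keep is an lvalue
    assert(consume(my_move(keep)) == 2);   // explicit cast to an xvalue
    assert(keep == nullptr);               // a moved-from unique_ptr is null
}
```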
In C++03, an expression is either an rvalue or an lvalue.
In C++11, an expression can be an:
rvalue
lvalue
xvalue
glvalue
prvalue
Two categories have become five categories.
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I guess this document might serve as a not-so-short introduction: n3055
The whole massacre began with move semantics. Once we have expressions that can be moved rather than copied, the formerly easy-to-grasp rules suddenly needed to distinguish expressions that can be moved from, and in which direction.
From what I can tell from the draft, the rvalue/lvalue distinction stays the same; things only get messy in the context of moving.
Are they needed? Probably not if we wish to forfeit the new features. But to allow better optimization we should probably embrace them.
Quoting n3055:
An lvalue (so-called, historically,
because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [Example: If E is an
expression of pointer type, then *E
is an lvalue expression referring to
the object or function to which E
points. As another example, the
result of calling a function whose
return type is an lvalue reference is
an lvalue.]
An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may
be moved, for example). An xvalue is
the result of certain kinds of
expressions involving rvalue
references. [Example: The
result of calling a function whose
return type is an rvalue reference is
an xvalue.]
A glvalue (“generalized” lvalue) is an lvalue
or an xvalue.
An rvalue (so-called,
historically, because rvalues could
appear on the right-hand side of an
assignment expression) is an xvalue,
a temporary object or
subobject thereof, or a value that is
not associated with an object.
A
prvalue (“pure” rvalue) is an rvalue
that is not an xvalue. [Example: The
result of calling a function whose
return type is not a reference is a
prvalue]
The document in question is a great reference for this question, because it shows the exact changes in the standard that have happened as a result of the introduction of the new nomenclature.
What are these new categories of expressions?
The FCD (n3092) has an excellent description:
— An lvalue (so called, historically, because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [ Example: If E is an
expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E
points. As another example, the result
of calling a function whose return
type is an lvalue reference is an
lvalue. —end example ]
— An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may be
moved, for example). An xvalue is the
result of certain kinds of expressions
involving rvalue references (8.3.2). [
Example: The result of calling a
function whose return type is an
rvalue reference is an xvalue. —end
example ]
— A glvalue (“generalized”
lvalue) is an lvalue or an xvalue.
— An rvalue (so called, historically,
because rvalues could appear on the
right-hand side of an assignment
expressions) is an xvalue, a temporary
object (12.2) or subobject thereof, or
a value that is not associated with an
object.
— A prvalue (“pure” rvalue) is
an rvalue that is not an xvalue. [
Example: The result of calling a
function whose return type is not a
reference is a prvalue. The value of a
literal such as 12, 7.3e5, or true is
also a prvalue. —end example ]
Every
expression belongs to exactly one of
the fundamental classifications in
this taxonomy: lvalue, xvalue, or
prvalue. This property of an
expression is called its value
category. [ Note: The discussion of
each built-in operator in Clause 5
indicates the category of the value it
yields and the value categories of the
operands it expects. For example, the
built-in assignment operators expect
that the left operand is an lvalue and
that the right operand is a prvalue
and yield an lvalue as the result.
User-defined operators are functions,
and the categories of values they
expect and yield are determined by
their parameter and return types. —end
note ]
I suggest you read the entire section 3.10 Lvalues and rvalues though.
How do these new categories relate to the existing rvalue and lvalue categories?
Again:
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
The semantics of rvalues have evolved, particularly with the introduction of move semantics.
Why are these new categories needed?
So that move construction/assignment could be defined and supported.
I'll start with your last question:
Why are these new categories needed?
The C++ standard contains many rules that deal with the value category of an expression. Some rules make a distinction between lvalue and rvalue. For example, when it comes to overload resolution. Other rules make a distinction between glvalue and prvalue. For example, you can have a glvalue with an incomplete or abstract type but there is no prvalue with an incomplete or abstract type. Before we had this terminology the rules that actually need to distinguish between glvalue/prvalue referred to lvalue/rvalue and they were either unintentionally wrong or contained lots of explaining and exceptions to the rule a la "...unless the rvalue is due to unnamed rvalue reference...". So, it seems like a good idea to just give the concepts of glvalues and prvalues their own name.
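As a rough illustration of the glvalue/prvalue rule about abstract types, here is a small sketch. Shape, Square, as_lvalue and as_xvalue are names I made up for the example.

```cpp
#include <type_traits>
#include <utility>

struct Shape {                        // hypothetical abstract base class
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

// Both results are glvalues whose static type is the abstract class Shape.
Shape&  as_lvalue(Square& s) { return s; }             // call yields an lvalue
Shape&& as_xvalue(Square& s) { return std::move(s); }  // call yields an xvalue

// Shape as_prvalue(Square& s) { return s; }  // rejected: no prvalues of abstract type

static_assert(std::is_lvalue_reference<
                  decltype((as_lvalue(std::declval<Square&>())))>::value,
              "lvalue of abstract type is fine");
static_assert(std::is_rvalue_reference<
                  decltype((as_xvalue(std::declval<Square&>())))>::value,
              "xvalue of abstract type is fine");
```

The commented-out function is the interesting part: uncommenting it makes the compiler complain, because returning by value would have to produce a prvalue of type Shape.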
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
We still have the terms lvalue and rvalue that are compatible with C++98. We just divided the rvalues into two subgroups, xvalues and prvalues, and we refer to lvalues and xvalues as glvalues. Xvalues are a new kind of value category for unnamed rvalue references. Every expression is one of these three: lvalue, xvalue, prvalue. A Venn diagram would look like this:
    ______ ______
   /      X      \
  /      / \      \
 |   l  | x |  pr  |
  \      \ /      /
   \______X______/
       gl    r
Examples with functions:
int prvalue();
int& lvalue();
int&& xvalue();
But also don't forget that named rvalue references are lvalues:
void foo(int&& t) {
    // t is initialized with an rvalue expression
    // but is actually an lvalue expression itself
}
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I don't feel that the other answers (good though many of them are) really capture the answer to this particular question. Yes, these categories and such exist to allow move semantics, but the complexity exists for one reason. This is the one inviolate rule of moving stuff in C++11:
Thou shalt move only when it is unquestionably safe to do so.
That is why these categories exist: to be able to talk about values where it is safe to move from them, and to talk about values where it is not.
In the earliest version of r-value references, movement happened easily. Too easily. Easily enough that there was a lot of potential for implicitly moving things when the user didn't really mean to.
Here are the circumstances under which it is safe to move something:
When it's a temporary or subobject thereof. (prvalue)
When the user has explicitly said to move it.
If you do this:
SomeType &&Func() { ... }
SomeType &&val = Func();
SomeType otherVal{val};
What does this do? In older versions of the spec, before the 5 values came in, this would provoke a move. Of course it does. You passed an rvalue reference to the constructor, and thus it binds to the constructor that takes an rvalue reference. That's obvious.
There's just one problem with this; you didn't ask to move it. Oh, you might say that the && should have been a clue, but that doesn't change the fact that it broke the rule. val isn't a temporary because temporaries don't have names. You may have extended the lifetime of the temporary, but that means it isn't temporary; it's just like any other stack variable.
If it's not a temporary, and you didn't ask to move it, then moving is wrong.
The obvious solution is to make val an lvalue. This means that you can't move from it. OK, fine; it's named, so it's an lvalue.
Once you do that, you can no longer say that SomeType&& means the same thing everywhere. You've now made a distinction between named rvalue references and unnamed rvalue references. Well, named rvalue references are lvalues; that was our solution above. So what do we call unnamed rvalue references (the return value from Func above)?
It's not an lvalue, because you can't move from an lvalue. And we need to be able to move by returning a &&; how else could you explicitly say to move something? That is what std::move returns, after all. It's not an rvalue (old-style), because it can be on the left side of an equation (things are actually a bit more complicated, see this question and the comments below). It is neither an lvalue nor an rvalue; it's a new kind of thing.
What we have is a value that you can treat as an lvalue, except that it is implicitly moveable from. We call it an xvalue.
Note that xvalues are what makes us gain the other two categories of values:
A prvalue is really just the new name for the previous type of rvalue, i.e. they're the rvalues that aren't xvalues.
Glvalues are the union of xvalues and lvalues in one group, because they do share a lot of properties in common.
So really, it all comes down to xvalues and the need to restrict movement to exactly and only certain places. Those places are defined by the rvalue category; prvalues are the implicit moves, and xvalues are the explicit moves (std::move returns an xvalue).
IMHO, the best explanation of what these terms mean comes from Stroustrup; see also the examples by Dániel Sándor and Mohan:
Stroustrup:
Now I was seriously worried. Clearly we were headed for an impasse or
a mess or both. I spent the lunchtime doing an analysis to see which
of the properties (of values) were independent. There were only two
independent properties:
has identity – i.e. and address, a pointer, the user can determine whether two copies are identical, etc.
can be moved from – i.e. we are allowed to leave to source of a "copy" in some indeterminate, but valid state
This led me to the conclusion that there are exactly three kinds of
values (using the regex notational trick of using a capital letter to
indicate a negative – I was in a hurry):
iM: has identity and cannot be moved from
im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
Im: does not have identity and can be moved from.
The fourth possibility, IM, (doesn’t have identity and cannot be moved) is not
useful in C++ (or, I think) in any other language.
In addition to these three fundamental classifications of values, we
have two obvious generalizations that correspond to the two
independent properties:
i: has identity
m: can be moved from
This led me to put this diagram on the board:
Naming
I observed that we had only limited freedom to name: The two points to
the left (labeled iM and i) are what people with more or less
formality have called lvalues and the two points on the right
(labeled m and Im) are what people with more or less formality
have called rvalues. This must be reflected in our naming. That is,
the left "leg" of the W should have names related to lvalue and the
right "leg" of the W should have names related to rvalue. I note
that this whole discussion/problem arise from the introduction of
rvalue references and move semantics. These notions simply don’t exist
in Strachey’s world consisting of just rvalues and lvalues. Someone
observed that the ideas that
Every value is either an lvalue or an rvalue
An lvalue is not an rvalue and an rvalue is not an lvalue
are deeply embedded in our consciousness, very useful properties, and
traces of this dichotomy can be found all over the draft standard. We
all agreed that we ought to preserve those properties (and make them
precise). This further constrained our naming choices. I observed that
the standard library wording uses rvalue to mean m (the
generalization), so that to preserve the expectation and text of the
standard library the right-hand bottom point of the W should be named
rvalue.
This led to a focused discussion of naming. First, we needed to decide
on lvalue. Should lvalue mean iM or the generalization i? Led
by Doug Gregor, we listed the places in the core language wording
where the word lvalue was qualified to mean the one or the other. A
list was made and in most cases and in the most tricky/brittle text
lvalue currently means iM. This is the classical meaning of lvalue
because "in the old days" nothing was moved; move is a novel notion
in C++0x. Also, naming the topleft point of the W lvalue gives us
the property that every value is an lvalue or an rvalue, but not both.
So, the top left point of the W is lvalue and the bottom right point
is rvalue. What does that make the bottom left and top right points?
The bottom left point is a generalization of the classical lvalue,
allowing for move. So it is a generalized lvalue. We named it
glvalue. You can quibble about the abbreviation, but (I think) not
with the logic. We assumed that in serious use generalized lvalue
would somehow be abbreviated anyway, so we had better do it
immediately (or risk confusion). The top right point of the W is less
general than the bottom right (now, as ever, called rvalue). That
point represent the original pure notion of an object you can move
from because it cannot be referred to again (except by a destructor).
I liked the phrase specialized rvalue in contrast to generalized
lvalue but pure rvalue abbreviated to prvalue won out (and
probably rightly so). So, the left leg of the W is lvalue and
glvalue and the right leg is prvalue and rvalue. Incidentally,
every value is either a glvalue or a prvalue, but not both.
This leaves the top middle of the W: im; that is, values that have
identity and can be moved. We really don’t have anything that guides
us to a good name for those esoteric beasts. They are important to
people working with the (draft) standard text, but are unlikely to
become a household name. We didn’t find any real constraints on the
naming to guide us, so we picked ‘x’ for the center, the unknown, the
strange, the xpert only, or even x-rated.
INTRODUCTION
ISO C++11 (officially ISO/IEC 14882:2011) was the most recent version of the standard of the C++ programming language when this answer was written. It contains some new features and concepts, for example:
rvalue references
xvalue, glvalue, prvalue expression value categories
move semantics
If we would like to understand the concepts of the new expression value categories, we have to be aware that there are both rvalue references and lvalue references.
It helps to know that rvalues can bind to non-const rvalue references, but not to non-const lvalue references.
int& r_i=7; // compile error
int&& rr_i=7; // OK
We can gain some intuition of the concepts of value categories if we quote the subsection titled Lvalues and rvalues from the working draft N3337 (the most similar draft to the published ISOC++11 standard).
3.10 Lvalues and rvalues [basic.lval]
1 Expressions are categorized according to the taxonomy in Figure 1.
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function
or an object. [ Example: If E is an expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function
whose return type is an lvalue reference is an lvalue. —end example ]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for
example). An xvalue is the result of certain kinds of expressions
involving rvalue references (8.3.2). [ Example: The result of calling
a function whose return type is an rvalue reference is an xvalue. —end
example ]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a
temporary object (12.2) or subobject thereof, or a value that is not
associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a
reference is a prvalue. The value of a literal such as 12, 7.3e5, or
true is also a prvalue. —end example ]
Every expression belongs to exactly one of the fundamental
classifications in this taxonomy: lvalue, xvalue, or prvalue. This
property of an expression is called its value category.
But I am not quite sure that this subsection is enough to understand the concepts clearly, because "usually" is not really general, "near the end of its lifetime" is not really concrete, "involving rvalue references" is not really clear, and "Example: The result of calling a function whose return type is an rvalue reference is an xvalue." sounds like a snake biting its own tail.
PRIMARY VALUE CATEGORIES
Every expression belongs to exactly one primary value category. These value categories are lvalue, xvalue and prvalue categories.
lvalues
The expression E belongs to the lvalue category if and only if E refers to an entity that ALREADY has had an identity (address, name or alias) that makes it accessible outside of E.
#include <iostream>
int i=7;
const int& f(){
    return i;
}
int main()
{
    std::cout<<&"www"<<std::endl; // The expression "www" in this row is an lvalue expression, because string literals are arrays and every array has an address.
    i; // The expression i in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression i in this row refers to.
    int* p_i=new int(7);
    *p_i; // The expression *p_i in this row is an lvalue expression, because it refers to the same entity ...
    *p_i; // ... as the entity the expression *p_i in this row refers to.
    const int& r_I=7;
    r_I; // The expression r_I in this row is an lvalue expression, because it refers to the same entity ...
    r_I; // ... as the entity the expression r_I in this row refers to.
    f(); // The expression f() in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression f() in this row refers to.
    return 0;
}
xvalues
The expression E belongs to the xvalue category if and only if it is
— the result of calling a function, whether implicitly or explicitly, whose return type is an rvalue reference to the type of object being returned, or
int&& f(){
    return 3; // Returns a dangling reference; acceptable here only because the result is never read.
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because f() return type is an rvalue reference to object type.
    return 0;
}
— a cast to an rvalue reference to object type, or
#include <utility>
int main()
{
    static_cast<int&&>(7); // The expression static_cast<int&&>(7) belongs to the xvalue category, because it is a cast to an rvalue reference to object type.
    std::move(7); // std::move(7) is equivalent to static_cast<int&&>(7).
    return 0;
}
— a class member access expression designating a non-static data member of non-reference type in which the object expression is an xvalue, or
struct As
{
    int i;
};
As&& f(){
    return As();
}
int main()
{
    f().i; // The expression f().i belongs to the xvalue category, because As::i is a non-static data member of non-reference type, and the subexpression f() belongs to the xvalue category.
    return 0;
}
— a pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
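The last bullet has no example, so here is a minimal sketch of it. The declarations f, g and pm are only illustrative, and the category is checked with decltype instead of being stated in a comment.

```cpp
#include <type_traits>

struct As
{
    int i;
};

As&& f();  // declared only: calling it yields an xvalue
As&  g();  // declared only: calling it yields an lvalue

constexpr int As::* pm = &As::i;  // pointer to the data member As::i

// First operand is an xvalue, second is a pointer to data member: the result is an xvalue.
static_assert(std::is_rvalue_reference<decltype((f().*pm))>::value,
              "f().*pm is an xvalue");
// For comparison: with an lvalue as first operand the result is an lvalue.
static_assert(std::is_lvalue_reference<decltype((g().*pm))>::value,
              "g().*pm is an lvalue");
```

Since f and g appear only inside decltype, they never need a definition.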
Note that the effect of the rules above is that named rvalue references to objects are treated as lvalues and unnamed rvalue references to objects are treated as xvalues; rvalue references to functions are treated as lvalues whether named or not.
#include <utility>
struct As
{
    int i;
};
As&& f(){
    return As();
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because it refers to an unnamed rvalue reference to object.
    As&& rr_a=As();
    rr_a; // The expression rr_a belongs to the lvalue category, because it refers to a named rvalue reference to object.
    std::move(f); // The expression std::move(f) belongs to the lvalue category, because it is an unnamed rvalue reference to function.
    return 0;
}
prvalues
The expression E belongs to the prvalue category if and only if E belongs neither to the lvalue nor to the xvalue category.
struct As
{
    void f(){
        this; // The expression this is a prvalue expression. Note, that the expression this is not a variable.
    }
};
As f(){
    return As();
}
int main()
{
    f(); // The expression f() belongs to the prvalue category, because it belongs neither to the lvalue nor to the xvalue category.
    return 0;
}
MIXED VALUE CATEGORIES
There are two further important mixed value categories. These value categories are rvalue and glvalue categories.
rvalues
The expression E belongs to the rvalue category if and only if E belongs to the xvalue category, or to the prvalue category.
Note that this definition means that the expression E belongs to the rvalue category if and only if E refers to an entity that has not had any identity that makes it accessible outside of E YET.
glvalues
The expression E belongs to the glvalue category if and only if E belongs to the lvalue category, or to the xvalue category.
A PRACTICAL RULE
Scott Meyers has published a very useful rule of thumb to distinguish rvalues from lvalues.
If you can take the address of an expression, the expression is an lvalue.
If the type of an expression is an lvalue reference (e.g., T& or const T&, etc.), that expression is an lvalue.
Otherwise, the expression is an rvalue. Conceptually (and typically also in fact), rvalues correspond to temporary objects, such
as those returned from functions or created through implicit type
conversions. Most literal values (e.g., 10 and 5.3) are also rvalues.
I have struggled with this for a long time, until I came across the cppreference.com explanation of the value categories.
It is actually rather simple, but I find that it is often explained in a way that's hard to memorize. Here it is explained very schematically. I'll quote some parts of the page:
Primary categories
The primary value categories correspond to two properties of expressions:
has identity: it's possible to determine whether the expression refers to the same entity as another expression, such as by comparing addresses of the objects or the functions they identify (obtained directly or indirectly);
can be moved from: move constructor, move assignment operator, or another function overload that implements move semantics can bind to the expression.
Expressions that:
have identity and cannot be moved from are called lvalue expressions;
have identity and can be moved from are called xvalue expressions;
do not have identity and can be moved from are called prvalue expressions;
do not have identity and cannot be moved from are not used.
lvalue
An lvalue ("left value") expression is an expression that has identity and cannot be moved from.
rvalue (until C++11), prvalue (since C++11)
A prvalue ("pure rvalue") expression is an expression that does not have identity and can be moved from.
xvalue
An xvalue ("expiring value") expression is an expression that has identity and can be moved from.
glvalue
A glvalue ("generalized lvalue") expression is an expression that is either an lvalue or an xvalue. It has identity. It may or may not be moved from.
rvalue (since C++11)
An rvalue ("right value") expression is an expression that is either a prvalue or an xvalue. It can be moved from. It may or may not have identity.
So let's put that into a table:
                            Can be moved from (= rvalue)    Cannot be moved from
Has identity (= glvalue)    xvalue                          lvalue
No identity                 prvalue                         not used
C++03's categories are too restricted to capture the introduction of rvalue references correctly into expression attributes.
With the introduction of them, it was said that an unnamed rvalue reference evaluates to an rvalue, such that overload resolution would prefer rvalue reference bindings, which would make it select move constructors over copy constructors. But it was found that this causes problems all around, for example with Dynamic Types and with qualifications.
To show this, consider
int const&& f();
int main() {
int &&i = f(); // disgusting!
}
On pre-xvalue drafts, this was allowed, because in C++03, rvalues of non-class types are never cv-qualified. But it is intended that const applies in the rvalue-reference case, because here we do refer to objects (= memory!), and dropping const from non-class rvalues is mainly for the reason that there is no object around.
The issue for dynamic types is of similar nature. In C++03, rvalues of class type have a known dynamic type - it's the static type of that expression. Because to have it another way, you need references or dereferences, which evaluate to an lvalue. That isn't true with unnamed rvalue references, yet they can show polymorphic behavior. So to solve it,
unnamed rvalue references become xvalues. They can be qualified and potentially have their dynamic type different. They do, like intended, prefer rvalue references during overloading, and won't bind to non-const lvalue references.
What previously was an rvalue (literals, objects created by casts to non-reference types) now becomes a prvalue. They have the same preference as xvalues during overloading.
What previously was an lvalue stays an lvalue.
And two groupings are done to capture those that can be qualified and can have different dynamic types (glvalues) and those where overloading prefers rvalue reference binding (rvalues).
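The qualification part can be checked with the standard type traits. This is just a sketch of the binding rules described above: std::is_convertible<From, To> is true when a function returning To could return std::declval<From>(), so for reference types it tests reference binding.

```cpp
#include <type_traits>

// A const xvalue keeps its const: it cannot bind to a plain int&&
// (this is the "disgusting" initialization from the example above, now rejected).
static_assert(!std::is_convertible<const int&&, int&&>::value,
              "const xvalue does not bind to int&&");
// It does bind when the reference preserves the qualification.
static_assert(std::is_convertible<const int&&, const int&&>::value,
              "const xvalue binds to const int&&");
// xvalues do not bind to non-const lvalue references ...
static_assert(!std::is_convertible<int&&, int&>::value,
              "xvalue does not bind to int&");
// ... but, like other rvalues, they bind to const lvalue references.
static_assert(std::is_convertible<int&&, const int&>::value,
              "xvalue binds to const int&");
```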
As the previous answers exhaustively covered the theory behind the value categories, there is just another thing I'd like to add: you can actually play with it and test it.
For some hands-on experimentation with the value categories, you can make use of the decltype specifier. Its behavior explicitly distinguishes between the three primary value categories (xvalue, lvalue, and prvalue).
Using the preprocessor saves us some typing ...
Primary categories:
#define IS_XVALUE(X) std::is_rvalue_reference<decltype((X))>::value
#define IS_LVALUE(X) std::is_lvalue_reference<decltype((X))>::value
#define IS_PRVALUE(X) !std::is_reference<decltype((X))>::value
Mixed categories:
#define IS_GLVALUE(X) (IS_LVALUE(X) || IS_XVALUE(X))
#define IS_RVALUE(X) (IS_PRVALUE(X) || IS_XVALUE(X))
Now we can reproduce (almost) all the examples from cppreference on value category.
Here are some examples with C++17 (for terse static_assert):
#include <exception>
#include <string>
#include <type_traits>
#include <utility>
void doesNothing(){}
struct S
{
    int x{0};
};
int x = 1;
int y = 2;
S s;
static_assert(IS_LVALUE(x));
static_assert(IS_LVALUE(x+=y));
static_assert(IS_LVALUE("Hello world!"));
static_assert(IS_LVALUE(++x));
static_assert(IS_PRVALUE(1));
static_assert(IS_PRVALUE(x++));
static_assert(IS_PRVALUE(static_cast<double>(x)));
static_assert(IS_PRVALUE(std::string{}));
static_assert(IS_PRVALUE(throw std::exception()));
static_assert(IS_PRVALUE(doesNothing()));
static_assert(IS_XVALUE(std::move(s)));
// The next one doesn't work in gcc 8.2 but in gcc 9.1. Clang 7.0.0 and msvc 19.16 are doing fine.
static_assert(IS_XVALUE(S().x));
The mixed categories are kind of boring once you have figured out the primary category.
For some more examples (and experimentation), check out the following link on compiler explorer. Don't bother reading the assembly, though. I added a lot of compilers just to make sure it works across all the common compilers.
These are terms that the C++ committee used to define move semantics in C++11. Here's the story.
I find it difficult to understand the terms given their precise definitions, the long lists of rules or this popular diagram:
It's easier on a Venn diagram with typical examples:
Basically:
every expression is either lvalue or rvalue
lvalue must be copied, because it has identity, so can be used later
rvalue can be moved, either because it's a temporary (prvalue) or explicitly moved (xvalue)
Now, the good question is that if we have two orthogonal properties ("has identity" and "can be moved"), what's the fourth category to complete lvalue, xvalue and prvalue?
That would be an expression that has no identity (hence cannot be accessed later) and cannot be moved (one needs to copy its value). This is simply not useful, so it hasn't been named.
How do these new categories relate to the existing rvalue and lvalue categories?
A C++03 lvalue is still a C++11 lvalue, whereas a C++03 rvalue is called a prvalue in C++11.
One addendum to the excellent answers above, on a point that confused me even after I had read Stroustrup and thought I understood the rvalue/lvalue distinction. When you see
int&& a = 3,
it's very tempting to read the int&& as a type and conclude that a is an rvalue. It's not:
int&& a = 3;
int&& c = a; //error: cannot bind 'int' lvalue to 'int&&'
int& b = a; //compiles
a has a name and is ipso facto an lvalue. The declared type of a is int&&, but as an expression a has type int and is an lvalue; the && in the declaration mostly tells you what a is allowed to bind to.
This matters particularly for T&& type arguments in constructors. If you write
Foo::Foo(T&& _t) : t{_t} {}
you will copy _t into t. You need
Foo::Foo(T&& _t) : t{std::move(_t)} {}
if you want to move. Would that my compiler warned me when I left out the move!
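If you want to see the difference rather than take my word for it, here is a small sketch. Probe, CopyingFoo and MovingFoo are made-up names, and the code assumes C++14 or later so that std::move can be used in a constant expression. Probe simply counts how often it was copied or moved on the way into the member.

```cpp
#include <utility>

// Hypothetical helper that records how it was constructed.
struct Probe {
    int copies = 0;
    int moves = 0;
    constexpr Probe() = default;
    constexpr Probe(const Probe& other) : copies(other.copies + 1), moves(other.moves) {}
    constexpr Probe(Probe&& other) : copies(other.copies), moves(other.moves + 1) {}
};

struct CopyingFoo {
    Probe t;
    // _t is a named rvalue reference, hence an lvalue: this copies.
    constexpr explicit CopyingFoo(Probe&& _t) : t{_t} {}
};

struct MovingFoo {
    Probe t;
    // std::move(_t) is an xvalue: this moves.
    constexpr explicit MovingFoo(Probe&& _t) : t{std::move(_t)} {}
};

static_assert(CopyingFoo{Probe{}}.t.copies == 1, "without std::move the member is copied");
static_assert(CopyingFoo{Probe{}}.t.moves == 0, "and not moved");
static_assert(MovingFoo{Probe{}}.t.moves == 1, "with std::move the member is moved");
static_assert(MovingFoo{Probe{}}.t.copies == 0, "and not copied");
```

Both constructors accept only rvalues, yet only the second one actually moves.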
This is a Venn diagram I made for a highly visual C++ book I'm writing, which I will be publishing on leanpub during development soon.
The other answers go into more detail with words, and show similar diagrams. But hopefully this presentation of the information is fairly complete and useful for referencing, in addition.
The main takeaway for me on this topic is that expressions have these two properties: identity and movability. The first deals with the the "solidness" with which something exists. That's important because the C++ abstract machine is allowed to and encouraged to aggressively change and shrink your code through optimizations, and that means that things without identity might only ever exist in the mind of the compiler or in a register for a brief moment before getting trampled on. But a piece of data like that is also guaranteed not to cause issues if you recycle it's innards since there's no way to try to use it. And thus, move semantics were invented to allow us to capture references to temporaries, upgrading them to lvalues and extending their lifetime.
Move semantics originally were about not just throwing away temporaries wastefully, but instead giving them away so they can consumed by another.
When you give your cornbread away, the person you give it to now owns it. They consume it. You should not attempt to eat or digest said cornbread once you've given it away. Maybe that cornbread was headed for the trash anyway, but now it's headed for their bellies. It's not yours anymore.
In C++ land, the idea of "consuming" a resource means that resource is now owned by us and so we should do any cleanup necessary, and ensure the object isn't accessed elsewhere. Often times, that means borrowing the guts to create new objects. I call that "donating organs". Usually, we are talking about pointers or references contained by the object, or something like that, and we want to keep those pointers or references around because they refer to data elsewhere in our program that is not dying.
Thus you could write a function overload that takes an rvalue reference, and if a temporary (prvalue) were passed in, that's the overload that would be called. A new lvalue would be created upon binding to the rvalue reference taken by the function, extending the life of the temporary so you could consume it within your function.
At some point, we realized that we often had lvalue non-temporary data that we were finished with in one scope, but wanted to cannibalize in another scope. But they aren't rvalues and so wouldn't bind to an rvalue reference. So we made std::move, which is just a fancy cast from lvalue to rvalue reference. Such a datum is an xvalue: a former lvalue now acting as if it were a temporary so it can also be moved from.
In C++03, an expression is either an rvalue or an lvalue.
In C++11, an expression can be an:
rvalue
lvalue
xvalue
glvalue
prvalue
Two categories have become five categories.
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I guess this document might serve as a not so short introduction: n3055
The whole massacre began with move semantics. Once we have expressions that can be moved rather than copied, the previously easy-to-grasp rules suddenly demanded a distinction between expressions that can be moved, and in which direction.
From what I can guess based on the draft, the r/l value distinction stays the same; things only get messy in the context of moving.
Are they needed? Probably not if we wish to forfeit the new features. But to allow better optimization we should probably embrace them.
Quoting n3055:
An lvalue (so-called, historically,
because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [Example: If E is an
expression of pointer type, then *E
is an lvalue expression referring to
the object or function to which E
points. As another example, the
result of calling a function whose
return type is an lvalue reference is
an lvalue.]
An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may
be moved, for example). An xvalue is
the result of certain kinds of
expressions involving rvalue
references. [Example: The
result of calling a function whose
return type is an rvalue reference is
an xvalue.]
A glvalue (“generalized” lvalue) is an lvalue
or an xvalue.
An rvalue (so-called,
historically, because rvalues could
appear on the right-hand side of an
assignment expression) is an xvalue,
a temporary object or
subobject thereof, or a value that is
not associated with an object.
A
prvalue (“pure” rvalue) is an rvalue
that is not an xvalue. [Example: The
result of calling a function whose
return type is not a reference is a
prvalue]
The document in question is a great reference for this question, because it shows the exact changes in the standard that have happened as a result of the introduction of the new nomenclature.
What are these new categories of expressions?
The FCD (n3092) has an excellent description:
— An lvalue (so called, historically, because lvalues could appear on the
left-hand side of an assignment
expression) designates a function or
an object. [ Example: If E is an
expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E
points. As another example, the result
of calling a function whose return
type is an lvalue reference is an
lvalue. —end example ]
— An xvalue (an
“eXpiring” value) also refers to an
object, usually near the end of its
lifetime (so that its resources may be
moved, for example). An xvalue is the
result of certain kinds of expressions
involving rvalue references (8.3.2). [
Example: The result of calling a
function whose return type is an
rvalue reference is an xvalue. —end
example ]
— A glvalue (“generalized”
lvalue) is an lvalue or an xvalue.
—
An rvalue (so called, historically,
because rvalues could appear on the
right-hand side of an assignment
expressions) is an xvalue, a temporary
object (12.2) or subobject thereof, or
a value that is not associated with an
object.
— A prvalue (“pure” rvalue) is
an rvalue that is not an xvalue. [
Example: The result of calling a
function whose return type is not a
reference is a prvalue. The value of a
literal such as 12, 7.3e5, or true is
also a prvalue. —end example ]
Every
expression belongs to exactly one of
the fundamental classifications in
this taxonomy: lvalue, xvalue, or
prvalue. This property of an
expression is called its value
category. [ Note: The discussion of
each built-in operator in Clause 5
indicates the category of the value it
yields and the value categories of the
operands it expects. For example, the
built-in assignment operators expect
that the left operand is an lvalue and
that the right operand is a prvalue
and yield an lvalue as the result.
User-defined operators are functions,
and the categories of values they
expect and yield are determined by
their parameter and return types. —end
note ]
I suggest you read the entire section 3.10 Lvalues and rvalues though.
How do these new categories relate to the existing rvalue and lvalue categories?
Again:
Are the rvalue and lvalue categories in C++0x the same as they are in C++03?
The semantics of rvalues have evolved, particularly with the introduction of move semantics.
Why are these new categories needed?
So that move construction/assignment could be defined and supported.
I'll start with your last question:
Why are these new categories needed?
The C++ standard contains many rules that deal with the value category of an expression. Some rules make a distinction between lvalue and rvalue. For example, when it comes to overload resolution. Other rules make a distinction between glvalue and prvalue. For example, you can have a glvalue with an incomplete or abstract type but there is no prvalue with an incomplete or abstract type. Before we had this terminology the rules that actually need to distinguish between glvalue/prvalue referred to lvalue/rvalue and they were either unintentionally wrong or contained lots of explaining and exceptions to the rule a la "...unless the rvalue is due to unnamed rvalue reference...". So, it seems like a good idea to just give the concepts of glvalues and prvalues their own name.
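As a small illustration of the glvalue/prvalue point about abstract types, here is a sketch. Shape, Square and area_of are hypothetical names invented for this example.

```cpp
#include <type_traits>

// Abstract type: it has a pure virtual function, so no object of exactly
// this type (and therefore no prvalue of this type) can exist.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

// Inside the function, the expression `s` is an lvalue (hence a glvalue)
// whose static type is the abstract type Shape. That is perfectly fine.
double area_of(const Shape& s) {
    static_assert(std::is_lvalue_reference<decltype((s))>::value,
                  "s is an lvalue expression");
    return s.area();
}

// By contrast, a function returning Shape by value could not be defined or
// called, because that would require a prvalue of abstract type.
```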
What are these new categories of expressions?
How do these new categories relate to the existing rvalue and lvalue categories?
We still have the terms lvalue and rvalue that are compatible with C++98. We just divided the rvalues into two subgroups, xvalues and prvalues, and we refer to lvalues and xvalues as glvalues. Xvalues are a new kind of value category for unnamed rvalue references. Every expression is one of these three: lvalue, xvalue, prvalue. A Venn diagram would look like this:
    ______ ______
   /      X      \
  /      / \      \
 |   l  | x |  pr  |
  \      \ /      /
   \______X______/
       gl    r
Examples with functions:
int prvalue();
int& lvalue();
int&& xvalue();
But also don't forget that named rvalue references are lvalues:
void foo(int&& t) {
  // t is initialized with an rvalue expression
  // but is actually an lvalue expression itself
}
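One way to see this in action is to let overload resolution report what each expression binds to. This is only a sketch: binds_to, named_parameter, moved_parameter and global_value are names I made up, while prvalue, lvalue and xvalue mirror the declarations above.

```cpp
#include <string>
#include <utility>

int global_value = 0;

int   prvalue() { return 1; }
int&  lvalue()  { return global_value; }
int&& xvalue()  { return std::move(global_value); }

// Overload resolution tells lvalues and rvalues apart.
std::string binds_to(int&)  { return "lvalue reference"; }
std::string binds_to(int&&) { return "rvalue reference"; }

// t is declared as int&&, but the expression `t` has a name: it is an lvalue.
std::string named_parameter(int&& t) { return binds_to(t); }

// std::move(t) turns it back into an xvalue.
std::string moved_parameter(int&& t) { return binds_to(std::move(t)); }
```

Here binds_to(prvalue()) and binds_to(xvalue()) both pick the int&& overload, binds_to(lvalue()) picks int&, and named_parameter(5) picks int& even though its argument was an rvalue.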
Why are these new categories needed? Are the WG21 gods just trying to confuse us mere mortals?
I don't feel that the other answers (good though many of them are) really capture the answer to this particular question. Yes, these categories and such exist to allow move semantics, but the complexity exists for one reason. This is the one inviolate rule of moving stuff in C++11:
Thou shalt move only when it is unquestionably safe to do so.
That is why these categories exist: to be able to talk about values where it is safe to move from them, and to talk about values where it is not.
In the earliest version of r-value references, movement happened easily. Too easily. Easily enough that there was a lot of potential for implicitly moving things when the user didn't really mean to.
Here are the circumstances under which it is safe to move something:
When it's a temporary or subobject thereof. (prvalue)
When the user has explicitly said to move it.
If you do this:
SomeType &&Func() { ... }
SomeType &&val = Func();
SomeType otherVal{val};
What does this do? In older versions of the spec, before the 5 values came in, this would provoke a move. Of course it does. You passed an rvalue reference to the constructor, and thus it binds to the constructor that takes an rvalue reference. That's obvious.
There's just one problem with this; you didn't ask to move it. Oh, you might say that the && should have been a clue, but that doesn't change the fact that it broke the rule. val isn't a temporary because temporaries don't have names. You may have extended the lifetime of the temporary, but that means it isn't temporary; it's just like any other stack variable.
If it's not a temporary, and you didn't ask to move it, then moving is wrong.
The obvious solution is to make val an lvalue. This means that you can't move from it. OK, fine; it's named, so it's an lvalue.
Once you do that, you can no longer say that SomeType&& means the same thing everywhere. You've now made a distinction between named rvalue references and unnamed rvalue references. Well, named rvalue references are lvalues; that was our solution above. So what do we call unnamed rvalue references (the return value from Func above)?
It's not an lvalue, because you can't move from an lvalue. And we need to be able to move by returning a &&; how else could you explicitly say to move something? That is what std::move returns, after all. It's not an rvalue (old-style), because it can be on the left side of an equation (things are actually a bit more complicated, see this question and the comments below). It is neither an lvalue nor an rvalue; it's a new kind of thing.
What we have is a value that you can treat as an lvalue, except that it is implicitly moveable from. We call it an xvalue.
Note that xvalues are what makes us gain the other two categories of values:
A prvalue is really just the new name for the previous type of rvalue, i.e. they're the rvalues that aren't xvalues.
Glvalues are the union of xvalues and lvalues in one group, because they do share a lot of properties in common.
So really, it all comes down to xvalues and the need to restrict movement to exactly and only certain places. Those places are defined by the rvalue category; prvalues are the implicit moves, and xvalues are the explicit moves (std::move returns an xvalue).
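The val example from above can be tried out with an instrumented type. This is a sketch under assumptions: the copied/moved flags are my addition, and Func here takes a storage parameter (the original Func body was elided) so that it has something valid to return a reference to.

```cpp
#include <utility>

// Illustrative stand-in for SomeType that records how it was constructed.
struct SomeType {
    bool copied = false;
    bool moved = false;
    SomeType() = default;
    SomeType(const SomeType&) : copied(true) {}
    SomeType(SomeType&&) noexcept : moved(true) {}
};

// Assumed shape of Func: returns an unnamed rvalue reference (an xvalue).
SomeType&& Func(SomeType& storage) { return std::move(storage); }

// A named rvalue reference is an lvalue, so this copies.
bool named_reference_copies() {
    SomeType storage;
    SomeType&& val = Func(storage);
    SomeType otherVal{val};
    return otherVal.copied && !otherVal.moved;
}

// Asking explicitly with std::move gives an xvalue, so this moves.
bool explicit_move_moves() {
    SomeType storage;
    SomeType&& val = Func(storage);
    SomeType otherVal{std::move(val)};
    return otherVal.moved && !otherVal.copied;
}

// The unnamed result of Func is itself an xvalue, so this moves too.
bool unnamed_result_moves() {
    SomeType storage;
    SomeType otherVal{Func(storage)};
    return otherVal.moved && !otherVal.copied;
}
```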
IMHO, the best explanation of their meaning was given by Stroustrup; also take into account the examples of Dániel Sándor and Mohan:
Stroustrup:
Now I was seriously worried. Clearly we were headed for an impasse or
a mess or both. I spent the lunchtime doing an analysis to see which
of the properties (of values) were independent. There were only two
independent properties:
has identity – i.e. and address, a pointer, the user can determine whether two copies are identical, etc.
can be moved from – i.e. we are allowed to leave to source of a "copy" in some indeterminate, but valid state
This led me to the conclusion that there are exactly three kinds of
values (using the regex notational trick of using a capital letter to
indicate a negative – I was in a hurry):
iM: has identity and cannot be moved from
im: has identity and can be moved from (e.g. the result of casting an lvalue to a rvalue reference)
Im: does not have identity and can be moved from.
The fourth possibility, IM, (doesn’t have identity and cannot be moved) is not
useful in C++ (or, I think) in any other language.
In addition to these three fundamental classifications of values, we
have two obvious generalizations that correspond to the two
independent properties:
i: has identity
m: can be moved from
This led me to put this diagram on the board:
Naming
I observed that we had only limited freedom to name: The two points to
the left (labeled iM and i) are what people with more or less
formality have called lvalues and the two points on the right
(labeled m and Im) are what people with more or less formality
have called rvalues. This must be reflected in our naming. That is,
the left "leg" of the W should have names related to lvalue and the
right "leg" of the W should have names related to rvalue. I note
that this whole discussion/problem arise from the introduction of
rvalue references and move semantics. These notions simply don’t exist
in Strachey’s world consisting of just rvalues and lvalues. Someone
observed that the ideas that
Every value is either an lvalue or an rvalue
An lvalue is not an rvalue and an rvalue is not an lvalue
are deeply embedded in our consciousness, very useful properties, and
traces of this dichotomy can be found all over the draft standard. We
all agreed that we ought to preserve those properties (and make them
precise). This further constrained our naming choices. I observed that
the standard library wording uses rvalue to mean m (the
generalization), so that to preserve the expectation and text of the
standard library the right-hand bottom point of the W should be named
rvalue.
This led to a focused discussion of naming. First, we needed to decide
on lvalue. Should lvalue mean iM or the generalization i? Led
by Doug Gregor, we listed the places in the core language wording
where the word lvalue was qualified to mean the one or the other. A
list was made and in most cases and in the most tricky/brittle text
lvalue currently means iM. This is the classical meaning of lvalue
because "in the old days" nothing was moved; move is a novel notion
in C++0x. Also, naming the topleft point of the W lvalue gives us
the property that every value is an lvalue or an rvalue, but not both.
So, the top left point of the W is lvalue and the bottom right point
is rvalue. What does that make the bottom left and top right points?
The bottom left point is a generalization of the classical lvalue,
allowing for move. So it is a generalized lvalue. We named it
glvalue. You can quibble about the abbreviation, but (I think) not
with the logic. We assumed that in serious use generalized lvalue
would somehow be abbreviated anyway, so we had better do it
immediately (or risk confusion). The top right point of the W is less
general than the bottom right (now, as ever, called rvalue). That
point represent the original pure notion of an object you can move
from because it cannot be referred to again (except by a destructor).
I liked the phrase specialized rvalue in contrast to generalized
lvalue but pure rvalue abbreviated to prvalue won out (and
probably rightly so). So, the left leg of the W is lvalue and
glvalue and the right leg is prvalue and rvalue. Incidentally,
every value is either a glvalue or a prvalue, but not both.
This leaves the top middle of the W: im; that is, values that have
identity and can be moved. We really don’t have anything that guides
us to a good name for those esoteric beasts. They are important to
people working with the (draft) standard text, but are unlikely to
become a household name. We didn’t find any real constraints on the
naming to guide us, so we picked ‘x’ for the center, the unknown, the
strange, the xpert only, or even x-rated.
INTRODUCTION
ISOC++11 (officially ISO/IEC 14882:2011) is the most recent version of the standard of the C++ programming language. It contains some new features, and concepts, for example:
rvalue references
xvalue, glvalue, prvalue expression value categories
move semantics
If we would like to understand the concepts of the new expression value categories, we have to be aware that there are rvalue and lvalue references.
It helps to know that rvalues can bind to non-const rvalue references, but not to non-const lvalue references.
int& r_i=7; // compile error
int&& rr_i=7; // OK
We can gain some intuition of the concepts of value categories if we quote the subsection titled Lvalues and rvalues from the working draft N3337 (the most similar draft to the published ISOC++11 standard).
3.10 Lvalues and rvalues [basic.lval]
1 Expressions are categorized according to the taxonomy in Figure 1.
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function
or an object. [ Example: If E is an expression of pointer type, then
*E is an lvalue expression referring to the object or function to which E points. As another example, the result of calling a function
whose return type is an lvalue reference is an lvalue. —end example ]
An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for
example). An xvalue is the result of certain kinds of expressions
involving rvalue references (8.3.2). [ Example: The result of calling
a function whose return type is an rvalue reference is an xvalue. —end
example ]
A glvalue (“generalized” lvalue) is an lvalue or an xvalue.
An rvalue (so called, historically, because rvalues could appear on the right-hand side of an assignment expression) is an xvalue, a
temporary object (12.2) or subobject thereof, or a value that is not
associated with an object.
A prvalue (“pure” rvalue) is an rvalue that is not an xvalue. [ Example: The result of calling a function whose return type is not a
reference is a prvalue. The value of a literal such as 12, 7.3e5, or
true is also a prvalue. —end example ]
Every expression belongs to exactly one of the fundamental
classifications in this taxonomy: lvalue, xvalue, or prvalue. This
property of an expression is called its value category.
But I am not quite sure that this subsection is enough to understand the concepts clearly, because "usually" is not really general, "near the end of its lifetime" is not really concrete, "involving rvalue references" is not really clear, and "Example: The result of calling a function whose return type is an rvalue reference is an xvalue." sounds like a snake biting its own tail.
PRIMARY VALUE CATEGORIES
Every expression belongs to exactly one primary value category. These value categories are lvalue, xvalue and prvalue categories.
lvalues
The expression E belongs to the lvalue category if and only if E refers to an entity that ALREADY has had an identity (address, name or alias) that makes it accessible outside of E.
#include <iostream>
int i=7;
const int& f(){
    return i;
}
int main()
{
    std::cout<<&"www"<<std::endl; // The expression "www" in this row is an lvalue expression, because string literals are arrays and every array has an address.
    i; // The expression i in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression i in this row refers to.
    int* p_i=new int(7);
    *p_i; // The expression *p_i in this row is an lvalue expression, because it refers to the same entity ...
    *p_i; // ... as the entity the expression *p_i in this row refers to.
    const int& r_I=7;
    r_I; // The expression r_I in this row is an lvalue expression, because it refers to the same entity ...
    r_I; // ... as the entity the expression r_I in this row refers to.
    f(); // The expression f() in this row is an lvalue expression, because it refers to the same entity ...
    i; // ... as the entity the expression f() in this row refers to.
    return 0;
}
xvalues
The expression E belongs to the xvalue category if and only if it is
— the result of calling a function, whether implicitly or explicitly, whose return type is an rvalue reference to the type of object being returned, or
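int&& f(){
    return 3; // The returned reference dangles; that is harmless here only because the result is never read.
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because the return type of f is an rvalue reference to object type.
    return 0;
}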
— a cast to an rvalue reference to object type, or
#include <utility>
int main()
{
    static_cast<int&&>(7); // The expression static_cast<int&&>(7) belongs to the xvalue category, because it is a cast to an rvalue reference to object type.
    std::move(7); // std::move(7) is equivalent to static_cast<int&&>(7).
    return 0;
}
— a class member access expression designating a non-static data member of non-reference type in which the object expression is an xvalue, or
struct As
{
    int i;
};
As&& f(){
    return As(); // The returned reference dangles; that is harmless here only because the result is never read.
}
int main()
{
    f().i; // The expression f().i belongs to the xvalue category, because As::i is a non-static data member of non-reference type, and the subexpression f() belongs to the xvalue category.
    return 0;
}
— a pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
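The last bullet can be checked the same way as the others. The sketch below is mine, not from the standard text: a_object and pointer_to_i are made-up names, and it relies on decltype((expr)) yielding T&& for an xvalue and T& for an lvalue.

```cpp
#include <type_traits>
#include <utility>

struct As
{
    int i;
};

As a_object{7};
int As::* const pointer_to_i = &As::i;  // pointer to data member

// First operand is an xvalue, second is a pointer to data member: xvalue.
constexpr bool xvalue_operand_gives_xvalue =
    std::is_rvalue_reference<decltype((std::move(a_object).*pointer_to_i))>::value;

// For comparison: with an lvalue as first operand the result is an lvalue.
constexpr bool lvalue_operand_gives_lvalue =
    std::is_lvalue_reference<decltype((a_object.*pointer_to_i))>::value;

// Reading an int through the xvalue does not change a_object.
int read_through_pointer_to_member() {
    return std::move(a_object).*pointer_to_i;
}
```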
Note that the effect of the rules above is that named rvalue references to objects are treated as lvalues and unnamed rvalue references to objects are treated as xvalues; rvalue references to functions are treated as lvalues whether named or not.
#include <utility>
struct As
{
    int i;
};
As&& f(){
    return As(); // The returned reference dangles; that is harmless here only because the result is never read.
}
int main()
{
    f(); // The expression f() belongs to the xvalue category, because it refers to an unnamed rvalue reference to object.
    As&& rr_a=As();
    rr_a; // The expression rr_a belongs to the lvalue category, because it refers to a named rvalue reference to object.
    std::move(f); // The expression std::move(f) belongs to the lvalue category, because it refers to an (unnamed) rvalue reference to function.
    return 0;
}
prvalues
The expression E belongs to the prvalue category if and only if E belongs neither to the lvalue nor to the xvalue category.
struct As
{
    void f(){
        this; // The expression this is a prvalue expression. Note, that the expression this is not a variable.
    }
};
As f(){
    return As();
}
int main()
{
    f(); // The expression f() belongs to the prvalue category, because it belongs neither to the lvalue nor to the xvalue category.
    return 0;
}
MIXED VALUE CATEGORIES
There are two further important mixed value categories. These value categories are rvalue and glvalue categories.
rvalues
The expression E belongs to the rvalue category if and only if E belongs to the xvalue category, or to the prvalue category.
Note that this definition means that the expression E belongs to the rvalue category if and only if E refers to an entity that has not had any identity that makes it accessible outside of E YET.
glvalues
The expression E belongs to the glvalue category if and only if E belongs to the lvalue category, or to the xvalue category.
A PRACTICAL RULE
Scott Meyers has published a very useful rule of thumb to distinguish rvalues from lvalues.
If you can take the address of an expression, the expression is an lvalue.
If the type of an expression is an lvalue reference (e.g., T& or const T&, etc.), that expression is an lvalue.
Otherwise, the expression is an rvalue. Conceptually (and typically also in fact), rvalues correspond to temporary objects, such
as those returned from functions or created through implicit type
conversions. Most literal values (e.g., 10 and 5.3) are also rvalues.
I have struggled with this for a long time, until I came across the cppreference.com explanation of the value categories.
It is actually rather simple, but I find that it is often explained in a way that's hard to memorize. Here it is explained very schematically. I'll quote some parts of the page:
Primary categories
The primary value categories correspond to two properties of expressions:
has identity: it's possible to determine whether the expression refers to the same entity as another expression, such as by comparing addresses of the objects or the functions they identify (obtained directly or indirectly);
can be moved from: move constructor, move assignment operator, or another function overload that implements move semantics can bind to the expression.
Expressions that:
have identity and cannot be moved from are called lvalue expressions;
have identity and can be moved from are called xvalue expressions;
do not have identity and can be moved from are called prvalue expressions;
do not have identity and cannot be moved from are not used.
lvalue
An lvalue ("left value") expression is an expression that has identity and cannot be moved from.
rvalue (until C++11), prvalue (since C++11)
A prvalue ("pure rvalue") expression is an expression that does not have identity and can be moved from.
xvalue
An xvalue ("expiring value") expression is an expression that has identity and can be moved from.
glvalue
A glvalue ("generalized lvalue") expression is an expression that is either an lvalue or an xvalue. It has identity. It may or may not be moved from.
rvalue (since C++11)
An rvalue ("right value") expression is an expression that is either a prvalue or an xvalue. It can be moved from. It may or may not have identity.
So let's put that into a table:
                           Can be moved from (= rvalue)   Cannot be moved from
Has identity (= glvalue)   xvalue                         lvalue
No identity                prvalue                        not used
C++03's categories are too restricted to capture the introduction of rvalue references correctly into expression attributes.
With the introduction of them, it was said that an unnamed rvalue reference evaluates to an rvalue, such that overload resolution would prefer rvalue reference bindings, which would make it select move constructors over copy constructors. But it was found that this causes problems all around, for example with Dynamic Types and with qualifications.
To show this, consider
int const&& f();
int main() {
  int &&i = f(); // disgusting!
}
On pre-xvalue drafts, this was allowed, because in C++03, rvalues of non-class types are never cv-qualified. But it is intended that const applies in the rvalue-reference case, because here we do refer to objects (= memory!), and dropping const from non-class rvalues is mainly for the reason that there is no object around.
The issue for dynamic types is of similar nature. In C++03, rvalues of class type have a known dynamic type - it's the static type of that expression. Because to have it another way, you need references or dereferences, which evaluate to an lvalue. That isn't true with unnamed rvalue references, yet they can show polymorphic behavior. So to solve it,
unnamed rvalue references become xvalues. They can be qualified and potentially have their dynamic type different. They do, like intended, prefer rvalue references during overloading, and won't bind to non-const lvalue references.
What previously was an rvalue (literals, objects created by casts to non-reference types) now becomes a prvalue. They have the same preference as xvalues during overloading.
What previously was an lvalue stays an lvalue.
And two groupings are done to capture those that can be qualified and can have different dynamic types (glvalues) and those where overloading prefers rvalue reference binding (rvalues).
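Both points can be sketched in a few lines. Base, Derived, name_via_xvalue, name_via_prvalue and f_const are names invented for this illustration; f_const plays the role of the f from the example above and is only declared, never called.

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct Base {
    virtual std::string name() const { return "Base"; }
    virtual ~Base() = default;
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// An xvalue of static type Base whose dynamic type is Derived:
// the virtual call dispatches on the dynamic type.
std::string name_via_xvalue(Derived& d) {
    return static_cast<Base&&>(d).name();
}

// A prvalue of type Base is a new Base object (the Derived part is sliced
// off), so static and dynamic type coincide.
std::string name_via_prvalue(const Derived& d) {
    return Base(d).name();
}

// cv-qualification survives on xvalues, even for non-class types ...
const int&& f_const();  // used in unevaluated contexts only
static_assert(std::is_same<decltype(f_const()), const int&&>::value,
              "the xvalue keeps its const");
// ... so initializing an int&& from it is no longer allowed.
static_assert(!std::is_convertible<const int&&, int&&>::value,
              "const cannot be dropped silently");
```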
As the previous answers exhaustively covered the theory behind the value categories, there is just another thing I'd like to add: you can actually play with it and test it.
For some hands-on experimentation with the value categories, you can make use of the decltype specifier. Its behavior explicitly distinguishes between the three primary value categories (xvalue, lvalue, and prvalue).
Using the preprocessor saves us some typing ...
Primary categories:
#include <type_traits>
#define IS_XVALUE(X) (std::is_rvalue_reference<decltype((X))>::value)
#define IS_LVALUE(X) (std::is_lvalue_reference<decltype((X))>::value)
#define IS_PRVALUE(X) (!std::is_reference<decltype((X))>::value)
Mixed categories:
#define IS_GLVALUE(X) (IS_LVALUE(X) || IS_XVALUE(X))
#define IS_RVALUE(X) (IS_PRVALUE(X) || IS_XVALUE(X))
Now we can reproduce (almost) all the examples from cppreference on value category.
Here are some examples with C++17 (for terse static_assert):
#include <exception>
#include <string>
#include <utility>
void doesNothing(){}
struct S
{
    int x{0};
};
int x = 1;
int y = 2;
S s;
static_assert(IS_LVALUE(x));
static_assert(IS_LVALUE(x+=y));
static_assert(IS_LVALUE("Hello world!"));
static_assert(IS_LVALUE(++x));
static_assert(IS_PRVALUE(1));
static_assert(IS_PRVALUE(x++));
static_assert(IS_PRVALUE(static_cast<double>(x)));
static_assert(IS_PRVALUE(std::string{}));
static_assert(IS_PRVALUE(throw std::exception()));
static_assert(IS_PRVALUE(doesNothing()));
static_assert(IS_XVALUE(std::move(s)));
// The next one doesn't work in gcc 8.2 but in gcc 9.1. Clang 7.0.0 and msvc 19.16 are doing fine.
static_assert(IS_XVALUE(S().x));
The mixed categories are kind of boring once you've figured out the primary category.
For some more examples (and experimentation), check out the following link on compiler explorer. Don't bother reading the assembly, though. I added a lot of compilers just to make sure it works across all the common compilers.
These are terms that the C++ committee used to define move semantics in C++11. Here's the story.
I find it difficult to understand the terms given their precise definitions, the long lists of rules or this popular diagram:
It's easier on a Venn diagram with typical examples:
Basically:
every expression is either lvalue or rvalue
lvalue must be copied, because it has identity, so can be used later
rvalue can be moved, either because it's a temporary (prvalue) or explicitly moved (xvalue)
Now, the good question is that if we have two orthogonal properties ("has identity" and "can be moved"), what's the fourth category to complete lvalue, xvalue and prvalue?
That would be an expression that has no identity (hence cannot be accessed later) and cannot be moved (one needs to copy its value). This is simply not useful, so it hasn't been named.
How do these new categories relate to the existing rvalue and lvalue categories?
A C++03 lvalue is still a C++11 lvalue, whereas a C++03 rvalue is called a prvalue in C++11.
One addendum to the excellent answers above, on a point that confused me even after I had read Stroustrup and thought I understood the rvalue/lvalue distinction. When you see
int&& a = 3,
it's very tempting to read the int&& as a type and conclude that a is an rvalue. It's not:
int&& a = 3;
int&& c = a; //error: cannot bind 'int' lvalue to 'int&&'
int& b = a; //compiles
a has a name and is ipso facto an lvalue. The declared type of a is int&&, but that only tells you what a is allowed to bind to; the expression a itself is an lvalue of type int.
This matters particularly for T&& type arguments in constructors. If you write
Foo::Foo(T&& _t) : t{_t} {}
you will copy _t into t. You need
Foo::Foo(T&& _t) : t{std::move(_t)} {}
if you want to move. Would that my compiler warned me when I left out the move!
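If you want to convince yourself, count the copies and moves. In this sketch Counted is a hypothetical stand-in for T, and FooCopy and FooMove are the two constructor variants from above split into separate classes so they can be compared side by side.

```cpp
#include <utility>

// Hypothetical member type that counts how it gets constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// _t is a named rvalue reference, i.e. an lvalue: the member is copied.
struct FooCopy {
    Counted t;
    explicit FooCopy(Counted&& _t) : t{_t} {}
};

// std::move(_t) is an xvalue: the member is moved.
struct FooMove {
    Counted t;
    explicit FooMove(Counted&& _t) : t{std::move(_t)} {}
};
```

Constructing a FooCopy from a temporary bumps Counted::copies, while constructing a FooMove from a temporary bumps Counted::moves instead.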
This is a Venn diagram I made for a highly visual C++ book I'm writing, which I will soon be publishing on Leanpub while it is in development.
The other answers go into more detail with words, and show similar diagrams. But hopefully this presentation of the information is fairly complete and useful for referencing, in addition.
I was reading Thomas Becker's article on rvalue references and their use. In it he defines what he calls the if-it-has-a-name rule:
Things that are declared as rvalue reference can be lvalues or
rvalues. The distinguishing criterion is: if it has a name, then it is
an lvalue. Otherwise, it is an rvalue.
This sounds very reasonable to me. It also clearly identifies the rvalueness of an rvalue reference.
My questions are:
Do you agree with this rule? If not, can you give an example where this rule can be violated?
If there are no violations of this rule, can we use it to define the rvalueness/lvalueness of an expression?
This is one of the most common "rules of thumb" used to explain what is the difference between lvalues and rvalues.
The situation in C++ is much more complex than that, so this can't be anything but a rule of thumb. I'll try to summarize a couple of concepts and make it clear why this issue is so complex in the C++ world. First, let's recap a bit of what happened once upon a time.
At the beginning there was C
First, what did "lvalue" and "rvalue" originally mean in the world of programming languages in general?
In a simpler language like C or Pascal, the terms used to refer to what could be placed at the Left or at the Right of an assignment operator.
In a language like Pascal where the assignment is not an expression but only a statement, the difference is pretty clear and it's defined in grammatical terms. An lvalue is a name of a variable, or a subscript of an array.
That's because only these two things could stand at the left of an assignment:
i := 42; (* ok *)
a[i] := 42; (* ok *)
42 := 42; (* no sense *)
In C, the same difference applies, and it is still pretty much grammatical in the sense that you could look at a line of code and tell if an expression would produce an lvalue or an rvalue.
i = 42; // ok, a variable
*p = 42; // ok, a pointer dereference
a[i] = 42; // ok, a subscript (which is a pointer dereference anyway)
s->var = 42; // ok, a struct member access
So what changed in C++?
Little languages grow up
In C++ things become much more complex and the difference is not grammatical anymore but involves the type checking process, for two reasons:
Anything can appear on the left of an assignment, as long as its type has a suitable overload of operator=
References
So this means that in C++ you can't say if an expression will produce an lvalue only by looking at its grammatical structure. For example:
f() = g();
is a statement that would make no sense in C but can be perfectly legal in C++ if, for example, f() returns a reference. That's how expressions like v[i] = j work for std::vector: operator[] returns a reference to the element, so you can assign to it.
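As a small sketch of a call expression on the left of an assignment, assume two hypothetical functions: `slot()`, which returns a reference to a static int, and `source()`, which returns a plain value.

```cpp
#include <cassert>

// Hypothetical: returns a reference, so the call expression is an lvalue.
int& slot() {
    static int storage = 0;
    return storage;
}

// Hypothetical: returns by value, so the call expression is an rvalue.
int source() { return 7; }

int demo_assign_through_call() {
    slot() = source();   // legal: assigns through the returned reference
    // source() = 1;     // ill-formed: source() is not an lvalue
    assert(slot() == 7);
    return slot();
}
```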
So what's the point of having a distinction between lvalues and rvalues anymore? The distinction is still relevant for basic types of course, but also to decide what can be bound to a non-const reference.
That's because you don't want to have legal code like:
int &x = 42;
x = 0; // Have we changed the meaning of a natural number??
So the language specifies carefully what is an lvalue and what isn't, and then says that only lvalues can be bound to non-const references. So the above code is not legal because an integer literal is not an lvalue so a non-const reference cannot be bound to it.
Note that const references are different, since they can bind to literals and temporaries (and local references even extend the lifetime of those temporaries):
int const&x = 42; // It's ok
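A short sketch of both bindings, assuming a hypothetical helper `make_greeting()` that returns a std::string by value:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper returning a temporary std::string.
std::string make_greeting() { return "hello"; }

int demo_const_ref() {
    // int& bad = 42;                        // ill-formed: 42 is not an lvalue
    const int& x = 42;                       // binds to a temporary int
    const std::string& s = make_greeting();  // temporary lives as long as s
    assert(x == 42);
    return x + static_cast<int>(s.size());   // s is still valid here
}
```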
So far we've only touched on what already happened in C++98. The rules were already more complex than "if it has a name it's an lvalue", since you have to consider references: an expression returning a non-const reference is still considered an lvalue.
Also, other rules of thumb mentioned here already don't work in all cases. For example, "if you can take its address, it's an lvalue". If by "taking the address" you mean "applying operator&", then it might work, but don't trick yourself into thinking that you can never come to have the address of a temporary: the this pointer inside a temporary's member function, for example, will point to it.
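To illustrate, here is a sketch with a hypothetical type `Probe` whose member function reports its own address as an integer, so nothing dangling is ever dereferenced:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type whose member function exposes its own address.
struct Probe {
    std::uintptr_t address() const {
        return reinterpret_cast<std::uintptr_t>(this);
    }
};

std::uintptr_t demo_temporary_address() {
    // &Probe();   // ill-formed: built-in & needs an lvalue operand
    std::uintptr_t where = Probe().address();  // `this` pointed at the temporary
    assert(where != 0);
    return where;
}
```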
What changed in C++11
C++11 adds more complexity to the mix by introducing the concept of an rvalue reference, that is, a reference that can be bound to an rvalue even if non-const. The fact that it can only be bound to an rvalue makes it both safe and useful. I don't think it's necessary to explain why rvalue references are useful, so let's move on.
The point here is that now we have a lot more cases to consider. So what is an rvalue now? The Standard actually distinguishes between different kinds of rvalues to be able to correctly state the behavior of rvalue references, overload resolution and template argument deduction in the presence of rvalue references. So we have terms like xvalue and prvalue, which make things more complex.
What about our rules of thumb?
So "everything that has a name is an lvalue" can still be true, but it certainly isn't true that every lvalue has a name. A call to a function returning a non-const lvalue reference is an lvalue. A call to a function returning something by value creates a temporary and is an rvalue, and so is a call to a function returning an rvalue reference.
What about "temporaries are rvalues"? It's true, but non-temporaries can also be made into rvalues simply by casting the type (as std::move does).
So I think that all these rules are useful if we keep in mind what they are: rules of thumb.
They'll always have some corner case where they don't apply, because to specify exactly what is an rvalue and what isn't, we can't avoid using the exact terms and rules used in the standard. That's what they were written for!
While the rule covers the majority of cases, I can't agree with it in general:
The dereferencing of an anonymous pointer does not have a name, yet it is an lvalue:
foo(*new X); // Not allowed if foo expects an rvalue reference (example of the article)
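A sketch of the same point that avoids leaking the object. Here `foo` is given two hypothetical overloads (the article's version has only the rvalue-reference one) so the chosen binding can be observed:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

struct X {};

// Hypothetical overload pair reporting which reference kind was bound.
std::string foo(X&)  { return "lvalue"; }
std::string foo(X&&) { return "rvalue"; }

// Dereferencing an unnamed X* gives X& under decltype((...)): an lvalue.
static_assert(std::is_lvalue_reference<decltype((*std::declval<X*>()))>::value,
              "*pointer is an lvalue even when the pointer has no name");

void demo_unnamed_lvalue() {
    std::unique_ptr<X> owner(new X);        // owns the X so nothing leaks
    assert(foo(*owner.get()) == "lvalue");  // unnamed, yet an lvalue
    assert(foo(X()) == "rvalue");           // temporary: an rvalue
    // With only the X&& overload, foo(*owner.get()) would not compile.
}
```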
Based on the standard, and taking into account the special case of temporary objects being rvalues, I'd suggest updating the second sentence of the rule:
" ... The criterion is: if it designates a function or an object
which is not of temporary nature, then it's an lvalue. ... ".
Question 1: That rule is strictly referring to classifying expressions of rvalue reference type, not expressions in general. I almost agree with it in this context ('almost' because there's a bit more to it, see the quote below). The precise wording is in a note in the Standard [Clause 5 paragraph 7]:
In general, the effect of this rule is that named rvalue references
are treated as lvalues and unnamed rvalue references to objects are
treated as xvalues; rvalue references to functions are treated as
lvalues whether named or not.
(emphases mine, for obvious reasons)
Question 2: As you can see from the other answers and comments (some nice examples in there), there are issues with general, concise statements about the value category of an expression. Here's the way I think about it.
We need to look at the problem from the other side: instead of trying to specify what expressions are lvalues, list the kinds that are rvalues; lvalues are everything else.
First, a couple of definitions to keep things clear:
An object means a region of storage for data, not a function and not a reference (it's the definition in the Standard).
When I say an expression generates something, I mean it doesn't just name it or refer to it, but actually constructs and returns it as the result of a combination of operators, function calls (possibly constructor calls) or casts (possibly implicit casts).
Now, based primarily on [3.10] (but also quite a few other places in the Standard), an expression is an rvalue if and only if it is one of the following:
1. a value that is not associated with an object (like this, or literals like 7, not string ones);
2. an expression that generates an object by value, a.k.a. a temporary object;
3. an expression that generates an rvalue reference to an object;
4. recursively, one of the following expressions using an rvalue:
   - x.y, where x is an rvalue and y is a non-static member object;
   - x.*y, where x is an rvalue and y is a pointer to a member object;
   - x[y], where either x or y is an rvalue of array type (using the built-in [] operator).
That's it.
Well, technically, the following special cases are also rvalues, but I don't think they're relevant in practice:
5. a function call returning void, a cast to void, or a throw (obviously not lvalues, I'm not sure why I'd ever be interested in their value category in practice);
6. one of obj.mf, ptr->mf, obj.*pmf, or ptr->*pmf (mf is a non-static member function, pmf is a pointer to member function); here we're talking strictly about these forms, not the function call expressions that can be built with them, and you really can't do anything with these but make a function call, which is a different expression altogether (to which we need to apply the rules above).
And that's really it. Everything else is an lvalue. I find it easy enough to reason about expressions this way, as all categories above are easily recognizable. For example, it's easy to look at an expression, rule out the cases above, and decide it's an lvalue. Even for category 4, which has a longer description, the expressions are easily recognizable (I tried hard to make it a one-liner, but ultimately failed).
Expressions involving operators can be lvalues or rvalues depending on the exact operator being used. Built-in operators specify what happens in each case, but user-defined operator functions can change the rules. When determining the value category of an expression, both the structure of the expression and the types involved matter.
Notes:
Regarding category 1:
this in the example refers to this the pointer value, not *this.
String literals are lvalues because they're arrays of static storage duration, so they don't fit in category 1 (they're associated with objects).
Some examples related to categories 2 and 3:
Given the declaration int& f(int), the expression f(7) doesn't generate an object by value, so it doesn't fit in category 2; it does generate a reference, but it's not an rvalue reference, so category 3 doesn't apply either; the expression is an lvalue.
Given the declaration int&& f(int), the expression f(7) generates an rvalue reference; category 3 applies here, so the expression is an rvalue.
Given the declaration int f(int), the expression f(7) generates an object by value; category 2 applies here, the expression is an rvalue.
For casts, we can apply the same reasoning as for the three bullets above.
Given the declaration int&& a, using the expression a doesn't generate an rvalue reference; it just uses an identifier of reference type. Category 3 doesn't apply, the expression is an lvalue.
Lambda expressions generate closure objects by value - they are in category 2.
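These examples can be checked mechanically. The sketch below relies on the fact that decltype((expr)) yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The `category` template, the `VALUE_CATEGORY` macro and the function names are hypothetical helpers, not anything from the Standard.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Map the type produced by decltype((expr)) to a category name.
template <class T> struct category      { static const char* name() { return "prvalue"; } };
template <class T> struct category<T&>  { static const char* name() { return "lvalue"; } };
template <class T> struct category<T&&> { static const char* name() { return "xvalue"; } };

// The extra parentheses make decltype look at the expression, not the entity.
#define VALUE_CATEGORY(expr) (std::string(category<decltype((expr))>::name()))

// Declarations only; decltype never evaluates the calls.
int&  by_lref(int);
int&& by_rref(int);
int   by_value(int);

void demo_categories() {
    int storage = 0;
    int&& a = std::move(storage);                      // named rvalue reference

    assert(VALUE_CATEGORY(by_lref(7))   == "lvalue");  // neither 2 nor 3
    assert(VALUE_CATEGORY(by_rref(7))   == "xvalue");  // category 3
    assert(VALUE_CATEGORY(by_value(7))  == "prvalue"); // category 2
    assert(VALUE_CATEGORY(a)            == "lvalue");  // just names a reference
    assert(VALUE_CATEGORY(std::move(a)) == "xvalue");  // generates an rvalue ref
    assert(VALUE_CATEGORY(7)            == "prvalue"); // category 1
    assert(VALUE_CATEGORY("seven")      == "lvalue");  // string literal
}
```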
Some examples related to category 4:
x->y is translated to (*x).y. *x is an lvalue (it doesn't fit in any of the categories above). So, if y is a non-static member object, x->y is an lvalue (it doesn't fit in category 4 because of *x and it doesn't fit in 6 because that one only talks about member functions).
In x.y, if y is a static member, then category 4 doesn't apply. Such an expression is always an lvalue, even if x is an rvalue (6 doesn't apply either, because it talks about non-static member functions).
In x.y, if y is of type T& or T&&, then it's not a member object (remember, objects, not references, not functions), so category 4 doesn't apply. Such an expression is always an lvalue, even if x is an rvalue and even if y is an rvalue reference.
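The member-access cases above can be sketched with compile-time checks. The struct `S` is hypothetical, and std::declval<S>() is used as a convenient rvalue of type S (strictly an xvalue) that is only valid in unevaluated contexts such as decltype.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type with one member of each kind discussed above.
struct S {
    int m;          // non-static member object
    static int sm;  // static member (declared only; never odr-used here)
    int& ref;       // reference member: not a member object
};

// rvalue.m with a non-static member object: an xvalue (category 4).
static_assert(std::is_same<decltype((std::declval<S>().m)), int&&>::value,
              "member object of an rvalue is an xvalue");

// Static member: always an lvalue, even through an rvalue.
static_assert(std::is_same<decltype((std::declval<S>().sm)), int&>::value,
              "static member is an lvalue");

// Reference member: always an lvalue, even through an rvalue.
static_assert(std::is_same<decltype((std::declval<S>().ref)), int&>::value,
              "reference member is an lvalue");

// x->y is (*x).y, and *x is an lvalue, so the result is an lvalue.
static_assert(std::is_same<decltype((std::declval<S*>()->m)), int&>::value,
              "x->y is an lvalue");
```

If all four assertions compile, the compiler agrees with the classification given in the notes.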
Category 4 used to be a bit different in C++11, but I believe this wording is correct for C++14. (If you insist on knowing, the result of subscripting into an rvalue array used to be an lvalue in C++11, but is an xvalue in C++14 - issue 1213.)
Further separating rvalues into xvalues and prvalues is relatively straightforward for C++14: categories 1, 2, 5 and 6 are prvalues, 3 and 4 are xvalues. Things were slightly different for C++11: category 4 was split between prvalues, xvalues and lvalues (changed as noted above, and also as part of the resolution of issue 616). This can be important, as it can affect the type you get back from decltype, for example.
All references are to N4140, the last C++14 draft before publication.
I first found the last two special rvalue cases here (everything's also in the Standard, of course, but harder to find). Note that not everything on that page is accurate for C++14. It also contains a very nice summary on the rationale behind the primary value categories (at the top).
It took me quite some time to understand the difference between an rvalue and a temporary object. But now the final committee draft states on page 75:
An rvalue [...] is an xvalue, a temporary object or subobject thereof, or a value that is not associated with an object.
I can't believe my eyes. This must be an error, right?
To clarify, here is how I understand the terms:
#include <iostream>
#include <string>

void foo(std::string&& str)
{
    std::cout << str << std::endl;
}

int main()
{
    foo(std::string("hello"));
}
In this program, there are two expressions that denote the same temporary object: the prvalue std::string("hello") and the lvalue str. Expressions are not objects, but their evaluation might yield one. Specifically, the evaluation of a prvalue yields a temporary object, but a prvalue IS NOT a temporary object. Does anyone agree with me or have I gone insane? :)
Yes, I agree with you. This should be fixed, in my opinion, and several people I deeply respect have raised the exact same question.
This isn't as complicated as it sounds. I am referring to the now finalized standard ISO/IEC 14882-2011. Page 78 says:
An xvalue (an “eXpiring” value) also refers to an object, usually near
the end of its lifetime (so that its resources may be moved, for
example). An xvalue is the result of certain kinds of expressions
involving rvalue references (8.3.2).
The bold above has been added by me. The standard further says:
An rvalue (so called, historically, because rvalues could appear on
the right-hand side of an assignment expression) is an xvalue, a
temporary object (12.2) or subobject thereof, or a value that is not
associated with an object.
So you only get an xvalue when you are playing around with 'certain kinds of expressions involving rvalue references'. Otherwise your temporary objects are just that - temporary objects.