I use boost::variant a lot and am quite familiar with it. boost::variant does not restrict the bounded types in any way; in particular, they may be references:
#include <boost/variant.hpp>
#include <cassert>
int main() {
int x = 3;
boost::variant<int&, char&> v(x); // v can hold references
boost::get<int>(v) = 4; // manipulate x through v
assert(x == 4);
}
I have a real use-case for using a variant of references as a view of some other data.
I was then surprised to find that std::variant does not allow references as bounded types: std::variant<int&, char&> does not compile, and it says here explicitly:
A variant is not permitted to hold references, arrays, or the type void.
I wonder why this is not allowed; I don't see a technical reason. I know that the implementations of std::variant and boost::variant are different, so maybe it has to do with that? Or did the authors think it is unsafe?
PS: I cannot really work around the limitation of std::variant using std::reference_wrapper, because the reference wrapper does not allow assignment from the base type.
#include <variant>
#include <cassert>
#include <functional>
int main() {
using int_ref = std::reference_wrapper<int>;
int x = 3;
std::variant<int_ref> v(std::ref(x)); // v can hold references
static_cast<int&>(std::get<int_ref>(v)) = 4; // manipulate x through v, extra cast needed
assert(x == 4);
}
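For what it's worth, the cast can be avoided with reference_wrapper's get() member, and std::visit can write through whichever alternative is active. This is only a sketch assuming C++17; the alias ref_variant and the helper function names are made up for illustration and are not part of the question.

```cpp
#include <cassert>
#include <functional>
#include <variant>

// Hypothetical alias: a variant that views either an int or a char.
using ref_variant =
    std::variant<std::reference_wrapper<int>, std::reference_wrapper<char>>;

// Zeroes the viewed object, whichever alternative is currently active.
inline void reset_viewed(ref_variant v) {
    std::visit([](auto ref) { ref.get() = 0; }, v);
}

// Writes through the int alternative using get() instead of a static_cast.
inline int write_through_int() {
    int x = 3;
    ref_variant v = std::ref(x);
    std::get<std::reference_wrapper<int>>(v).get() = 4;
    return x;  // x was modified through v
}

// Resets a char through the generic visitor.
inline char reset_char() {
    char c = 'a';
    reset_viewed(std::ref(c));
    return c;
}
```

This still does not give assignment from the base type directly on the variant, so it only softens the limitation rather than removing it.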
Fundamentally, the reason that optional and variant don't allow reference types is that there's disagreement on what assignment (and, to a lesser extent, comparison) should do for such cases. optional is easier than variant to show in examples, so I'll stick with that:
int i = 4, j = 5;
std::optional<int&> o = i;
o = j; // (*)
The marked line can be interpreted in one of three ways:
1. Rebind o, such that &*o == &j. As a result of this line, the values of i and j themselves remain unchanged.
2. Assign through o, such that &*o == &i is still true but now i == 5.
3. Disallow assignment entirely.
Assign-through is the behavior you get by just pushing = through to T's =, rebind is a more sound implementation and is what you really want (see also this question, as well as a Matt Calabrese talk on Reference Types).
A different way of explaining the difference between (1) and (2) is how we might implement both externally:
// rebind
o.emplace(j);
// assign through
if (o) {
*o = j;
} else {
o.emplace(j);
}
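To make the two behaviours concrete with something that compiles today, here is a sketch that uses std::optional<std::reference_wrapper<int>> as a stand-in for the hypothetical optional<int&>. The alias opt_int_ref and the function names are assumptions for illustration; C++17 is assumed.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Stand-in for optional<int&>, which std::optional does not allow.
using opt_int_ref = std::optional<std::reference_wrapper<int>>;

// Rebind: afterwards o refers to target; no int value is modified.
inline void rebind(opt_int_ref& o, int& target) { o.emplace(target); }

// Assign through: if o is engaged, copy the value into the referenced int;
// otherwise there is nothing to assign to, so bind for the first time.
inline void assign_through(opt_int_ref& o, int& source) {
    if (o) {
        o->get() = source;
    } else {
        o.emplace(source);
    }
}

inline bool rebind_keeps_values() {
    int i = 4, j = 5;
    opt_int_ref o = std::ref(i);
    rebind(o, j);
    return &o->get() == &j && i == 4 && j == 5;
}

inline bool assign_through_copies_value() {
    int i = 4, j = 5;
    opt_int_ref o = std::ref(i);
    assign_through(o, j);
    return &o->get() == &i && i == 5 && j == 5;
}
```

Note that assign_through has to branch on whether o is engaged, which is the inconsistency the Boost rationale below is concerned with.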
The Boost.Optional documentation provides this rationale:
Rebinding semantics for the assignment of initialized optional references has been chosen to provide consistency among initialization states even at the expense of lack of consistency with the semantics of bare C++ references. It is true that optional<U> strives to behave as much as possible as U does whenever it is initialized; but in the case when U is T&, doing so would result in inconsistent behavior w.r.t to the lvalue initialization state.
Imagine optional<T&> forwarding assignment to the referenced object (thus changing the referenced object value but not rebinding), and consider the following code:
optional<int&> a = get();
int x = 1 ;
int& rx = x ;
optional<int&> b(rx);
a = b ;
What does the assignment do?
If a is uninitialized, the answer is clear: it binds to x (we now have another reference to x). But what if a is already initialized? it would change the value of the referenced object (whatever that is); which is inconsistent with the other possible case.
If optional<T&> would assign just like T& does, you would never be able to use Optional's assignment without explicitly handling the previous initialization state unless your code is capable of functioning whether after the assignment, a aliases the same object as b or not.
That is, you would have to discriminate in order to be consistent.
If in your code rebinding to another object is not an option, then it is very likely that binding for the first time isn't either. In such case, assignment to an uninitialized optional<T&> shall be prohibited. It is quite possible that in such a scenario it is a precondition that the lvalue must be already initialized. If it isn't, then binding for the first time is OK while rebinding is not which is IMO very unlikely. In such a scenario, you can assign the value itself directly, as in:
assert(!!opt);
*opt=value;
Lack of agreement on what that line should do meant it was easier to just disallow references entirely, so that most of the value of optional and variant can at least make it for C++17 and start being useful. References could always be added later - or so the argument went.
The fundamental reason is that a reference must be bound to something when it is created.
Unions naturally do not - can not, even - set all their fields simultaneously and therefore simply cannot contain references, from the C++ standard:
If a union contains a non-static data member of reference type the program is ill-formed.
std::variant is a union with extra data denoting the type currently assigned to the union, so the above statement implicitly holds true for std::variant as well. Even if it were to be implemented as a straight class rather than a union, we'd be back to square one and have an uninitialised reference when a different field was in use.
Of course we can get around this by faking references using pointers, but this is what std::reference_wrapper takes care of.
Take this for example:
const Integer operator+(const Integer& left, const Integer& right) {
return Integer(left.i + right.i);
}
(taken from page 496 of Thinking in C++)
What is the part after the return statement? A cast (to make the result of the sum an Integer) or a call to the constructor of the class Integer? Or maybe something else that I'm not aware of?
This is the constructor:
Integer(long ll = 0) : i(ll) {}
Edit:
i is a long int.
Casting means "changing an entity of one data type into another". That said, you can consider Integer() as a cast from long to Integer, as the two types are related and the operation translates into "build an object of type B, starting with an object of type A".
With this syntax, there is no protection against misuse, i.e. if the constructor takes only one parameter, the parameter might not be used to build an object directly representing the first (e.g. each QWidget takes a pointer to the parent, but it is not representing its parent, obviously), and you cannot do anything to prevent this. You could block implicit initialization by marking single-parameter constructor as explicit, but nothing more.
The syntax for old-style casts and constructors with only one parameter is exactly the same, and that's the reason why a new syntax was created for the first: use new style (explicit) C++ syntax for casts, that is, const_cast, dynamic_cast, static_cast or reinterpret_cast as appropriate.
In the very words of Bjarne Stroustrup, this verbose casting syntax was introduced to make clear when a cast is taking place. Note that having four forms also allows for proper differentiation of the programmer's intent.
Finally, int() and such are considered old-style for plain types (int, long, etc.) and newvar = (T)oldvar form exists only because of C compatibility constraint.
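As a small illustration of why the two readings coincide for a class type: with a single-argument constructor, the functional notation, the C-style cast, static_cast and an implicit conversion all end up calling the same constructor. The class below mirrors the one from the question; value() and the helper functions are added here purely for the demonstration.

```cpp
#include <cassert>

class Integer {
    long i;
public:
    Integer(long ll = 0) : i(ll) {}        // converting constructor
    long value() const { return i; }       // accessor added for the demo
};

// Each helper builds an Integer from a long using a different notation.
inline long via_functional_notation() { return Integer(5L).value(); }
inline long via_c_style_cast() { return ((Integer)5L).value(); }
inline long via_static_cast() { return static_cast<Integer>(5L).value(); }
inline long via_implicit_conversion() { Integer n = 5L; return n.value(); }
```

All four produce an Integer holding 5; only the notation, and how loudly it announces the conversion, differs.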
It's a constructor call.
Objects in C++ can be created in two ways:
Integer* i = new Integer(args); //A Pointer i , pointing to the object created at a memory location.
or
Integer i = Integer(args); //Object i
Your case is the second one, except that the constructed object is not assigned to a variable named i; it is returned as is.
Moreover,
A cast written as (DataType) value is unambiguously a cast.
In the case of DataType(value), if DataType is a primitive type it is a cast, but if it is a non-primitive type it is a constructor call.
Say the object is
class A {
public:
void Silly() {
this = 0x12341234; // the line in question: rejected by the compiler
}
};
I know I will get compiler error ' "this" is not a lvalue.' But then it is not a temporary either. So what is the hypothetical declaration of "this" ?
Compiler : GCC 4.2 compiler on mac.
For some class X, this has the type X* (as if it were declared X* this;), but you're not allowed to assign to it, so even though its type isn't actually X* const, it behaves almost as if it were, as far as preventing assignment goes. Officially, it's a prvalue, which is the same category as something like an integer literal, so trying to assign to it is roughly equivalent to trying to assign a different value to 'a' or 10.
Note that in early C++, this was an lvalue -- assigning to this was allowed -- you did that to handle the memory allocation for an object, vaguely similar to overloading new and delete for the class (which wasn't supported yet at that time).
It is impossible to provide a "declaration" for this. There's no way to "declare" an rvalue in C++. And this is an rvalue, as you already know.
Lvalueness and rvalueness are the properties of expressions that produce these values, not the properties of declarations or objects. In that regard, one can even argue that it is impossible to declare an lvalue either. You declare an object. Lvalue is what is produced when you use the name of that object as an expression. In that sense both "to declare an rvalue" and "to declare an lvalue" are oxymoron expressions.
Your question also seems to suggest that the properties of "being an lvalue" and "being a temporary" are somehow complementary, i.e. everything is supposedly an lvalue or a temporary. In reality, the property of "being a temporary" has no business being here. All expressions are either lvalues or rvalues. And this happens to be an rvalue.
Temporaries, on the other hand, can be perceived as rvalues or as lvalues, depending on how you access the temporary.
P.S. Note, BTW, that in C++ (as opposed to C) ordinary functions are lvalues.
For one thing, this is not a variable - it's a keyword. When used as an rvalue, its type is A * or A const *. In modern C++, assigning to this is prohibited. You cannot take the address of this, either. In other words, it's not a valid lvalue.
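The types mentioned above can be checked with decltype. This is just a sketch; the member function names are invented for the demonstration.

```cpp
#include <cassert>
#include <type_traits>

struct A {
    // In a non-const member function, this has type A*.
    bool in_non_const_member() {
        return std::is_same<decltype(this), A*>::value;
    }
    // In a const member function, this has type const A*.
    bool in_const_member() const {
        return std::is_same<decltype(this), const A*>::value;
    }
    // Copying this into a named pointer yields an lvalue that can be
    // reassigned; this itself is not affected.
    A* reseat_copy(A* other) {
        A* p = this;
        p = other;
        return p;
    }
};
```

Note that neither type is A* const: there is no top-level const, and assignment is rejected because this is a prvalue, not because of its type.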
To answer the second part, "why is this not an lvalue", I'm speculating as to the committee's actual motivation, but advantages include:
assigning to this doesn't make much logical sense, so there's no particular need for it to appear on the left-hand-side of assignments. Making it an rvalue emphasises that this doesn't make much sense by forbidding it, and means that the standard doesn't have to define what happens if you do it.
making it an rvalue prevents you taking a pointer to it, which in turn relieves the implementation of any need to furnish it with an address, just like a register-modified automatic variable. It could for example devote a register in non-static member functions to storing this. If you take a const reference, then unless the use permits cunning optimization it needs to be copied somewhere that has an address, but at least it needn't be the same address if you do it twice in quick succession, as it would need to be if this were a declared variable.
You get a compiler error because this behaves like a const pointer to the current instance (Whatever* const in the example below). You can't assign to it, although you can use it to modify members, call member functions, and invoke operators in non-const member functions. Also note that because this refers to an instance, static member functions do not have a this pointer.
Hypothetical:
class Whatever
{
// your error because this is Whatever* const this;
void DoWhatever(const Whatever& obj) { this = &obj; }
// this is ok
void DoWhatever(const Whatever& obj) { *this = obj; }
// error because this is now: const Whatever* const this;
void DoWhatever(const Whatever& obj) const { *this = obj; }
// error because this doesn't exist in this scope
static void DoWhatever(const Whatever& obj) { *this = obj; }
};
I've got a C++ data-structure that is a required "scratchpad" for other computations. It's not long-lived, and it's not frequently used so not performance critical. However, it includes a random number generator amongst other updatable tracking fields, and while the actual value of the generator isn't important, it is important that the value is updated rather than copied and reused. This means that in general, objects of this class are passed by reference.
If an instance is only needed once, the most natural approach is to construct it wherever needed (perhaps using a factory method or a constructor), and then pass the scratchpad to the consuming method. Consumers' method signatures use pass by reference since they don't know this is the only use, but factory methods and constructors return by value - and you can't pass unnamed temporaries by non-const reference.
Is there a way to avoid clogging the code with nasty temporary variables? I'd like to avoid things like the following:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
xyz.initialize_computation(useless_temp);
I could make the scratchpad intrinsically mutable and just label all parameters const &, but that doesn't strike me as best-practice since it's misleading, and I can't do this for classes I don't fully control. Passing by rvalue reference would require adding overloads to all consumers of scratchpad, which kind of defeats the purpose - having clear and concise code.
Given the fact that performance is not critical (but code size and readability are), what's the best-practice approach to passing in such a scratchpad? Using C++0x features is OK if required but preferably C++03-only features should suffice.
Edit: To be clear, using a temporary is doable, it's just unfortunate clutter in code I'd like to avoid. If you never give the temporary a name, it's clearly only used once, and the fewer lines of code to read, the better. Also, in constructors' initializers, it's impossible to declare temporaries.
While it is not okay to pass rvalues to functions accepting non-const references, it is okay to call member functions on rvalues, but the member function does not know how it was called. If you return a reference to the current object, you can convert rvalues to lvalues:
class scratchpad_t
{
// ...
public:
scratchpad_t& self()
{
return *this;
}
};
void foo(scratchpad_t& r)
{
}
int main()
{
foo(scratchpad_t().self());
}
Note how the call to self() yields an lvalue expression even though scratchpad_t() is an rvalue.
Please correct me if I'm wrong, but Rvalue reference parameters don't accept lvalue references so using them would require adding overloads to all consumers of scratchpad, which is also unfortunate.
Well, you could use templates...
template <typename Scratch> void foo(Scratch&& scratchpad)
{
// ...
}
If you call foo with an rvalue parameter, Scratch will be deduced to scratchpad_t, and thus Scratch&& will be scratchpad_t&&.
And if you call foo with an lvalue parameter, Scratch will be deduced to scratchpad_t&, and because of reference collapsing rules, Scratch&& will also be scratchpad_t&.
Note that the formal parameter scratchpad is a name and thus an lvalue, no matter if its type is an lvalue reference or an rvalue reference. If you want to pass scratchpad on to other functions, you don't need the template trick for those functions anymore, just use an lvalue reference parameter.
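Here is a sketch of the deduction rules just described. scratchpad_t, consume and the checking helpers are placeholders invented for the example; only the shape of foo comes from the answer.

```cpp
#include <cassert>
#include <type_traits>

struct scratchpad_t { int uses = 0; };   // placeholder scratchpad

// An ordinary consumer that takes a non-const lvalue reference.
inline void consume(scratchpad_t& s) { ++s.uses; }

template <typename Scratch>
int foo(Scratch&& scratchpad) {
    // scratchpad is a named parameter, hence an lvalue, so it can be
    // passed on to functions taking scratchpad_t& without further tricks.
    consume(scratchpad);
    consume(scratchpad);
    return scratchpad.uses;
}

// Reports whether Scratch was deduced as an lvalue reference type.
template <typename Scratch>
bool deduced_lvalue_reference(Scratch&&) {
    return std::is_lvalue_reference<Scratch>::value;
}

inline bool rvalue_case() {
    // Scratch = scratchpad_t, so Scratch&& is scratchpad_t&&.
    return !deduced_lvalue_reference(scratchpad_t()) &&
           foo(scratchpad_t()) == 2;
}

inline bool lvalue_case() {
    scratchpad_t s;
    // Scratch = scratchpad_t&, so Scratch&& collapses to scratchpad_t&.
    return deduced_lvalue_reference(s) && foo(s) == 2 && s.uses == 2;
}
```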
By the way, you do realize that the temporary scratchpad involved in xyz.initialize_computation(scratchpad_t(1, 2, 3)); will be destroyed as soon as initialize_computation is done, right? Storing the reference inside the xyz object for later use would be an extremely bad idea.
self() doesn't need to be a member method, it can be a templated function
Yes, that is also possible, although I would rename it to make the intention clearer:
template <typename T>
T& as_lvalue(T&& x)
{
return x;
}
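A usage sketch for as_lvalue follows. scratchpad_t and bump are made up for the example. The explicit static_cast is my addition: as far as I know, the implicit-move rules in C++23 make a plain return x; ill-formed here when x is an rvalue reference, so the cast keeps the helper portable across standard versions.

```cpp
#include <cassert>

struct scratchpad_t { int counter = 0; };   // placeholder scratchpad

template <typename T>
T& as_lvalue(T&& x) {
    // Explicit cast instead of "return x;" (see the note above).
    return static_cast<T&>(x);
}

// A consumer that insists on a non-const lvalue reference.
inline int bump(scratchpad_t& s) { return ++s.counter; }

// The temporary lives until the end of the full expression, which is
// long enough for bump to use it.
inline int bump_temporary() { return bump(as_lvalue(scratchpad_t())); }

// Lvalues pass through unchanged, so both calls update the same object.
inline int bump_named_twice() {
    scratchpad_t s;
    bump(as_lvalue(s));
    return bump(s);
}
```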
Is the problem just that this:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
is ugly? If so, then why not change it to this?:
auto useless_temp = factory(rng_parm);
Personally, I would rather see const_cast than mutable. When I see mutable, I'm assuming someone's doing logical const-ness, and don't think much of it. const_cast however raises red flags, as code like this should.
One option would be to use something like shared_ptr (auto_ptr would work too depending on what factory is doing) and pass it by value, which avoids the copy cost and maintains only a single instance, yet can be passed in from your factory method.
If you allocate the object in the heap you might be able to convert the code to something like:
std::auto_ptr<scratch_t> create_scratch();
foo( *create_scratch() );
The factory creates and returns an auto_ptr instead of an object on the stack. The returned auto_ptr temporary will take ownership of the object, but you are allowed to call non-const methods on a temporary and you can dereference the pointer to get a real reference. At the end of the full expression the smart pointer will be destroyed and the memory freed. If you need to pass the same scratch_t to different functions in a row you can just capture the smart pointer:
std::auto_ptr<scratch_t> s( create_scratch() );
foo( *s );
bar( *s );
This can be replaced with std::unique_ptr in the upcoming standard.
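For reference, a sketch of the same pattern with std::unique_ptr (C++11). scratch_t, foo and bar are placeholders with invented bodies; only the overall shape follows the answer.

```cpp
#include <cassert>
#include <memory>

struct scratch_t { int draws = 0; };   // placeholder scratchpad

// Factory returning ownership of a heap-allocated scratchpad.
inline std::unique_ptr<scratch_t> create_scratch() {
    return std::unique_ptr<scratch_t>(new scratch_t());
}

// Consumers take a non-const reference, as in the question.
inline int foo(scratch_t& s) { return ++s.draws; }
inline int bar(scratch_t& s) { return ++s.draws; }

// One-shot use: the temporary unique_ptr is destroyed (and the scratchpad
// freed) at the end of the full expression containing the call.
inline int one_shot() { return foo(*create_scratch()); }

// Reuse: keep the smart pointer alive across several calls.
inline int reused() {
    std::unique_ptr<scratch_t> s = create_scratch();
    foo(*s);
    return bar(*s);
}
```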
I marked FredOverflow's response as the answer for his suggestion to use a method to simply return a non-const reference; this works in C++03. That solution requires a member method per scratchpad-like type, but in C++0x we can also write that method more generally for any type:
template <typename T> T & temp(T && temporary_value) {return temporary_value;}
This function simply forwards normal lvalue references, and converts rvalue references into lvalue references. Of course, doing this returns a modifiable value whose result is ignored - which happens to be exactly what I want, but may seem odd in some contexts.
Where I work, people mostly think that objects are best initialised using C++-style construction (with parentheses), whereas primitive types should be initialised with the = operator:
std::string strFoo( "Foo" );
int nBar = 5;
Nobody seems to be able to explain why they prefer things this way, though. I can see that std::string strFoo = "Foo"; would be inefficient because it would involve an extra copy, but what's wrong with just banishing the = operator altogether and using parentheses everywhere?
Is it a common convention? What's the thinking behind it?
Initializing a variable with the = operator and initializing it with a constructor call are semantically the same; it's just a question of style. I prefer the = operator, since it reads more naturally.
Using the = operator usually does not generate an extra copy - it just calls the normal constructor. Note, however, that with non-primitive types, this is only for initializations that occur at the same time as the declarations. Compare:
std::string strFooA("Foo"); // Calls std::string(const char*) constructor
std::string strFooB = "Foo"; // Calls std::string(const char*) constructor
// This is a valid (and standard) compiler optimization.
std::string strFooC; // Calls std::string() default constructor
strFooC = "Foo"; // Calls std::string::operator = (const char*)
When you have non-trivial default constructors, the latter construction can be slightly less efficient.
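To see which functions actually run, here is a sketch with a small tracing class. Traced and its how member are invented for the illustration; std::string itself is not instrumented.

```cpp
#include <cassert>
#include <string>

// Records which special member functions were used to build the object.
struct Traced {
    std::string how;
    Traced() : how("default") {}
    Traced(const char*) : how("converting") {}
    Traced& operator=(const char*) { how += "+assign"; return *this; }
};

// Direct-initialization: one call to the converting constructor.
inline std::string direct_init() { Traced t("Foo"); return t.how; }

// Copy-initialization: still the converting constructor; no assignment
// operator is involved even though an = sign appears.
inline std::string copy_init() { Traced t = "Foo"; return t.how; }

// Declaration followed by assignment: default constructor, then operator=.
inline std::string default_then_assign() {
    Traced t;
    t = "Foo";
    return t.how;
}
```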
The C++ standard, section 8.5, paragraph 14 states:
Otherwise (i.e., for the remaining copy-initialization cases), a temporary is created. User-defined conversion sequences that can convert from the source type to the destination type or a derived class thereof are enumerated (13.3.1.4), and the best one is chosen through overload resolution (13.3). The user-defined conversion so selected is called to convert the initializer expression into a temporary, whose type is the type returned by the call of the user-defined conversion function, with the cv-qualifiers of the destination type. If the conversion cannot be done or is ambiguous, the initialization is ill-formed. The object being initialized is then direct-initialized from the temporary according to the rules above. In certain cases, an implementation is permitted to eliminate the temporary by initializing the object directly; see 12.2.
Part of section 12.2 states:
Even when the creation of the temporary object is avoided, all the semantic restrictions must be respected as if the temporary object was created. [Example: even if the copy constructor is not called, all the semantic restrictions, such as accessibility (11), shall be satisfied. ]
I just felt the need for another silly litb post.
string str1 = "foo";
is called copy-initialization, because what the compiler does, if it doesn't elide any temporaries, is:
string str1(string("foo"));
besides checking that the conversion constructor used is implicit. In fact, all implicit conversions are defined by the standard in terms of copy initialization. It is said that an implicit conversion from type U to type T is valid, if
T t = u; // u of type U
is valid.
In contrast,
string str1("foo");
is doing exactly what is written, and is called direct initialization. It also works with explicit constructors.
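The practical difference shows up with explicit constructors. A sketch, with Implicit and Explicit as made-up types: std::is_convertible tests exactly the T t = u; form described above, while std::is_constructible tests the T t(u); form.

```cpp
#include <cassert>
#include <type_traits>

struct Implicit { Implicit(int) {} };            // usable for copy-initialization
struct Explicit { explicit Explicit(int) {} };   // direct-initialization only

// "Implicit t = 1;" compiles.
inline bool implicit_allows_copy_init() {
    return std::is_convertible<int, Implicit>::value;
}

// "Explicit t = 1;" does not compile.
inline bool explicit_allows_copy_init() {
    return std::is_convertible<int, Explicit>::value;
}

// "Explicit t(1);" compiles.
inline bool explicit_allows_direct_init() {
    return std::is_constructible<Explicit, int>::value;
}
```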
By the way, you can disable eliding of temporaries by using -fno-elide-constructors:
-fno-elide-constructors
The C++ standard allows an implementation to omit creating a temporary which
is only used to initialize another object of the same type. Specifying this
option disables that optimization, and forces G++ to call the copy constructor
in all cases.
The Standard says there is practically no difference between
T a = u;
and
T a(u);
if T and the type of u are primitive types. So you may use both forms. I think that it's just the style of it that makes people use the first form rather than the second.
Some people may use the first in some situation, because they want to disambiguate the declaration:
T u(v(a));
might look to someone like a definition of a variable u that is initialized using a temporary of type v, which gets a constructor argument called a. But in fact, what the compiler does with that is this:
T u(v a);
It creates a function declaration that takes an argument of type v, with a parameter called a. So people do
T u = v(a);
to disambiguate that, even though they could have done
T u((v(a)));
too, because there are never parentheses around function parameters, the compiler would read it as a variable definition instead of a function declaration too :)
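The three spellings can be compared in a sketch like the following. The types T and v reuse the names from the text but their members are invented, and some compilers will emit a warning about the vexing parse for the first declaration.

```cpp
#include <cassert>
#include <type_traits>

struct v { int n; explicit v(int x) : n(x) {} };
struct T { int n; T(const v& src) : n(src.n) {} };

// Parsed as the declaration of a function u taking a parameter a of type v
// and returning T. No object is created here.
T u(v(a));
static_assert(std::is_function<decltype(u)>::value,
              "u is a function, not an object of type T");

inline int define_variables(int a) {
    T u1 = v(a);     // copy-initialization: unambiguously a variable
    T u2((v(a)));    // extra parentheses: also a variable
    return u1.n + u2.n;
}
```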
Unless you've proven that it matters with respect to performance, I wouldn't worry about an extra copy using the assignment operator in your example (std::string foo = "Foo";). I'd be pretty surprised if that copy even exists once you look at the optimized code; I believe it will actually call the appropriate parameterized constructor.
In answer to your question, yes, I'd say that it's a pretty common convention. Classically, people have used assignment to initialize built-in types, and there isn't a compelling reason to change the tradition. Readability and habit are perfectly valid reasons for this convention given how little impact it has on the ultimate code.
You will probably find that code such as
std::string strFoo = "Foo";
will avoid doing an extra copy and compiles to the same code (a call of a single-argument constructor) as the one with parentheses.
On the other hand, there are cases where one must use parentheses, such as a constructor member initialisation list.
I think the use of = or parentheses to construct local variables is largely a matter of personal choice.
Well, who knows what they think, but I also prefer the = for primitive types, mainly because they are not objects, and because that's the "usual" way to initialize them.
But then just to confuse you even more you initialize primitives in the initialization list using object syntax.
foo::foo()
:anInt(0)
,aFloat(0.0)
{
}
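A slightly fuller sketch of why the initialization list matters: const and reference members cannot be assigned in the constructor body, so they have to be initialized there. The class layout and the counter member are assumptions added for the example.

```cpp
#include <cassert>

class foo {
    const int anInt;   // const member: must be initialized, cannot be assigned
    float aFloat;
    int& counter;      // reference member: must be bound in the list
public:
    explicit foo(int& c) : anInt(0), aFloat(0.0f), counter(c) { ++counter; }
    int an_int() const { return anInt; }
    float a_float() const { return aFloat; }
};

// Constructs two objects sharing one counter and returns the counter value
// (the other terms contribute 0 when the members were initialized as written).
inline int construct_twice() {
    int count = 0;
    foo first(count);
    foo second(count);
    return count + first.an_int() + (second.a_float() == 0.0f ? 0 : 100);
}
```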
It's an issue of style. Even the statement that "std::string strFoo = "Foo"; would be inefficient because it would involve an extra copy" is not correct. This "extra copy" is removed by the compiler.
I believe it is more of a habit: very few objects can be initialized using =, and string is one of them. It's also a way of doing what you said: "using parentheses everywhere (that the language allows you to use them)".
One argument that one could make for:
std::string foo("bar");
Is that it keeps things the same even if the argument count changes, i.e.:
std::string foo("bar", 2);
Doesn't work with a '=' sign.
Another thing is that for many objects a '=' feels unnatural. For example, say you have an Array class where the argument gives the length:
Array arr = 5;
Doesn't feel good, since we don't construct an Array with the value 5, but with length 5:
Array arr(5);
feels more natural, since you are constructing an object with the given parameter, not just copying a value.