Is using "operator &" on a reference a portable C++ construct? - c++

Suppose I have:
void function1( Type* object ); //whatever implementation
void function2( Type& object )
{
function1( &object );
}
Supposing Type doesn't have an overloaded operator&(), will this construct (using operator & on a reference) obtain the actual address of the object (a variable of type Type) on all reasonably standard-compliant C++ compilers?

Yes. At the very beginning of evaluating any expression, a reference is replaced by the object it refers to, as defined in 5 [expr]/6 of the Standard, so the & operator doesn't see any difference:
If an expression initially has the type "reference to T" (8.3.2, 8.5.3), the type is adjusted to "T" prior to any further analysis, the expression designates the object or function denoted by the reference, and the expression is an lvalue.
This makes it so that any operator that operates on an expression "sees through" the reference.
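A minimal sketch of this "see-through" behaviour, assuming a hypothetical Type with no overloaded operator&. The names address_via_reference and check_address_of_reference are made up for illustration:

```cpp
#include <cassert>
#include <type_traits>

struct Type { int value = 0; };   // hypothetical type, no overloaded operator&

// Returns the address obtained by applying & to a reference parameter.
Type* address_via_reference(Type& object) {
    // The expression `object` has type Type (not Type&), so &object is a Type*.
    static_assert(std::is_same<decltype(&object), Type*>::value,
                  "& applied to a reference yields a pointer to the referent");
    return &object;
}

// Returns true when the address taken through the reference matches
// the address taken directly from the variable.
bool check_address_of_reference() {
    Type t;
    return address_via_reference(t) == &t;
}
```

Calling check_address_of_reference() should return true on any conforming compiler, since both expressions designate the same object.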

Yes, it takes the address of the object referred to. Once you have an initialised reference, ALL operations on it are performed on the referred to object.
This is actually a fairly frequently used trope:
struct A {
A( X & x ) : myx( &x ) {}
X * myx;
};
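The idiom above relies on the type not overloading unary operator&. If it might, std::addressof from <memory> (C++11) obtains the real address regardless. A small sketch, where Odd and Holder are hypothetical types invented for the example:

```cpp
#include <cassert>
#include <memory>   // std::addressof

// Hypothetical type whose unary operator& is overloaded to return nullptr.
struct Odd {
    Odd* operator&() { return nullptr; }
};

struct Holder {
    // std::addressof ignores the overloaded operator&.
    explicit Holder(Odd& x) : ptr(std::addressof(x)) {}
    Odd* ptr;
};

bool addressof_bypasses_overload() {
    Odd o;
    Holder h(o);
    return (&o == nullptr)                  // the overload is used here
        && (h.ptr != nullptr)               // the real address was stored
        && (h.ptr == std::addressof(o));
}
```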

Yes, in 15 characters or more.

Related

Why do we return *this in assignment operator and generally (and not &this) when we want to return a reference to the object?

I'm learning C++ and pointers and I thought I understood pointers until I saw this.
On one side the asterisk (*) operator is dereferencing, which means it returns the value at the address the pointer is pointing to, and the ampersand (&) operator is the opposite, and returns the address of where the value is stored in memory.
Reading now about assignment overloading, it says "we return *this because we want to return a reference to the object". Though from what I read *this actually returns the value of this, and actually &this logically should be returned if we want to return a reference to the object.
How does this add up? I guess I'm missing something here because I didn't find this question asked elsewhere, but the explanation seems like the complete opposite of what should be, regarding the logic of * to dereference, & get a reference.
For example here:
struct A {
A& operator=(const A&) {
cout << "A::operator=(const A&)" << endl;
return *this;
}
};
this is a pointer that keeps the address of the current object. So dereferencing the pointer like *this you will get the lvalue of the current object itself. And the return type of the copy assignment operator of the presented class is A&. So returning the expression *this you are returning a reference to the current object.
According to the C++ 17 Standard (8.1.2 This)
1 The keyword this names a pointer to the object for which a
non-static member function (12.2.2.1) is invoked or a non-static data
member’s initializer (12.2) is evaluated.
Consider the following code snippet as a simplified example.
int x = 10;
int *this_x = &x;
Now to return a reference to the object you need to use the expression *this_x as for example
std::cout << *this_x << '\n';
& has multiple meanings depending on the context. In C, used alone, it can either be the bitwise AND operator or the address of something referenced by a symbol.
In C++, after a type name, it also means that what follows is a reference to an object of this type.
This means that if you write:
int a = 0;
int & b = a;
… b will become de facto an alias of a.
In your example, operator= is declared to return a reference to an object of type A (not a pointer to it). Callers see an A&, and what is actually returned is an existing object: the instance on which this member function was called.
Yes, *this is (the value of?) the current object. But the pointer to the current object is this, not &this.
&this, if it were legal, would be a pointer-to-pointer to the current object. But it's illegal, since this (the pointer itself) is a prvalue expression rather than a named object, and you can't take the address of a prvalue with &.
It would make more sense to ask why we don't do return this;.
The answer is: forming a pointer requires &, but forming a reference doesn't. Compare:
int x = 42;
int *ptr = &x;
int &ref = x;
So, similarly:
int *f1() { return &x; }
int &f2() { return x; }
A simple mnemonic you can use is that the * and & operators match the type syntax of the thing you're converting from, not the thing you're converting to:
* converts a foo* to a foo&
& converts a foo& to a foo*
In expressions, there's no meaningful difference between foo and foo&, so I could have said that * converts foo* to foo, but the version above is easier to remember.
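The mnemonic can be checked with decltype. This is only an illustrative sketch; the function name is invented:

```cpp
#include <cassert>
#include <type_traits>

bool star_and_ampersand_are_inverse() {
    int x = 42;
    int* ptr = &x;    // & : from the object (foo&) to a pointer (foo*)
    int& ref = *ptr;  // * : from a pointer (foo*) back to the object (foo&)

    // *ptr is an lvalue of type int, which decltype reports as int&.
    static_assert(std::is_same<decltype(*ptr), int&>::value, "");
    // &ref is a prvalue of type int*.
    static_assert(std::is_same<decltype(&ref), int*>::value, "");

    return &ref == ptr && &*ptr == ptr && *&x == 42;
}
```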
C++ inherited its type syntax from C, and C type syntax named types after the expression syntax for using them, not the syntax for creating them. Arrays are written foo x[...] because you use them by accessing an element, and pointers are written foo *x because you use them by dereferencing them. Pointers to arrays are written foo (*x)[...] because you use them by dereferencing them and then accessing an element, while arrays of pointers are written foo *x[...] because you use them by accessing an element and then dereferencing it. People don't like the syntax, but it's consistent.
References were added later, and break the consistency, because there isn't any syntax for using a reference that differs from using the referenced object "directly". As a result, you shouldn't try to make sense of the type syntax for references. It just is.
The reason this is a pointer is also purely historical: this was added to C++ before references were. But since it is a pointer, and you need a reference, you have to use * to get rid of the *.
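To make the point about *this concrete, here is a small sketch with a hypothetical struct A that carries a value member and a self() helper (neither is in the question's code). It checks that the reference returned by operator= designates the very object that was assigned to:

```cpp
#include <cassert>

struct A {
    int value = 0;
    A& operator=(const A& other) {
        value = other.value;
        return *this;           // *this is an lvalue naming the current object
    }
    A* self() { return this; }  // `this` itself is already the pointer
};

bool assignment_returns_same_object() {
    A a, b, c;
    c.value = 7;
    a = b = c;                  // chaining works because each step yields an A&
    A& result = (a = c);
    return &result == &a && a.self() == &a && a.value == 7 && b.value == 7;
}
```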

C++ Type and Value Category for Expression and Variable

From this link, it says that
Objects, references, functions including function template specializations, and expressions have a property called type
So given the following:
int &&rf_int = 10;
I can say that variable rf_int is of compound type rvalue reference to int.
But when talking about value category, it specifically says that
Each expression has some non-reference type
and
Each C++ expression (an operator with its operands, a literal, a variable name, etc.)
Based on the above two statement, rf_int can be treated as an expression and expression has non-reference type.
Now I am really confused. Does rf_int have a reference type or not? Do we have to provide context when talking about the type of a name, be it a variable or an expression?
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with type int), or a variable (thus it is an lvalue with type rvalue reference to int)?
EDIT: A comment here got me wonder about this issue.
It confused me first too but let me clear out the ambiguity in a simple way.
EXPRESSION is something that can be evaluated and must evaluate to a non-reference type right ?
( Yes, of course duh !! )
Now we also know a variable name is an l-value EXPRESSION.
( Dude get to the point already and stop making the expression word bold )
Okay now here is the catch, when we say a variable we mean a place in memory. Now will we call a place in memory an expression ? No, definitely not that is just completely absurd.
Expression is a generic term that we identify by defining some rules and anything that fall under those rules is an expression. It is necessary to define it this way to make sense of the code during compiler construction. One of that rule is that anything that evaluates to a value is an expression. Since from coding perspective using a variable name means that you wish the actual value is used when the code is compiled, so we call that variable name an expression.
So when they say a variable is an expression they don't mean the variable as in place in memory but the variable NAME from coding perspective. But using the term "variable name" to differentiate from the actual variable (place in memory) is just absurd. So saying "variable is an expression" is just fine as long as you think it from coding perspective.
Now to answer this first :
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with
type int), or a variable (thus it is an lvalue with type rvalue
reference to int)?
A single variable is also an expression. So this question becomes invalid.
Now to answer this :
Based on the above two statement, rf_int can be treated as an
expression and expression has non-reference type.
Now I am really confused. Does rf_int have a reference type or not?
What if I say rf_int is an r-value reference and rf_int is also an l-value EXPRESSION.
( Oh brother, this guy and his obsession with expression )
It is true because if you do the following it will work.
int &&rf_int = 10; // rf_int is an r-value reference
int &x = rf_int; // x is an l-value reference and l-value reference can be initialized with l-value expression
cout << x; //Output will be 10
Now is rf_int an expression or a r-value reference, what will it be ? The answer is both. It depends on from which perspective you are thinking.
In other words what I'm trying to say is that if we think rf_int as a variable (some place in memory) then surely it has the type of r-value reference but since rf_int is also a variable name and from coding perspective it is an expression and more precisely an l-value expression and whenever you use this variable for the sake of evaluation you will get the value 10 which is an int so we are bound to say that rf_int type as an expression is an int which is a non-reference type.
If you think from the compiler's perspective for a moment, which line of code evaluates to a reference? None, right? You can try searching, and if you find any, do let me know as well. But the point here is that the type of an expression doesn't mean the type of the variable. It means the type of the value that you get after evaluating the expression.
Hope I have clarified your question. If I missed something do let me know.
Does rf_int have a reference type or not?
The entity (or variable) with the name rf_int has type int&& (a reference type) because of the way it is declared, but the expression rf_int has type int (a non-reference type) per [expr]/5:
If an expression initially has the type “reference to T” ([dcl.ref],
[dcl.init.ref]), the type is adjusted to T prior to any further
analysis. The expression designates the object or function denoted by
the reference, and the expression is an lvalue or an xvalue, depending
on the expression. [ Note: Before the lifetime of the reference
has started or after it has ended, the behavior is undefined (see
[basic.life]).  — end note ]
Do we have to provide context when talking about the type of a name,
be it a variable or an expression?
Yes, we do. rf_int can be said to have different types depending on whether it refers to the entity or the expression.
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with
type int), or a variable (thus it is an lvalue with type rvalue
reference to int)?
It is considered an expression, which is an lvalue of type int. (Note that value category is a property of expressions. It is not correct to say a variable is an lvalue.)
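The entity/expression distinction can be observed directly with decltype, which reports the declared type for an unparenthesized name and the expression's type and value category for a parenthesized one. A short sketch; the two SomeFunc overloads are hypothetical and exist only to show which one overload resolution picks:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical overloads that report which kind of reference was bound.
inline const char* SomeFunc(int&)  { return "lvalue"; }
inline const char* SomeFunc(int&&) { return "rvalue"; }

bool entity_vs_expression() {
    int&& rf_int = 10;

    // decltype(name) reports the declared type of the entity.
    static_assert(std::is_same<decltype(rf_int), int&&>::value, "");
    // decltype((name)) reports the expression: an lvalue of type int,
    // which decltype spells as int&.
    static_assert(std::is_same<decltype((rf_int)), int&>::value, "");

    // Overload resolution sees the expression, so the int& overload is chosen.
    return SomeFunc(rf_int)[0] == 'l';
}
```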
Based on this each function call is an expression.
Each argument passed to the function is also an expression.
Therefore, when you make a call to SomeFunc(rf_int); you create an expression from the single rf_int variable.
ISO/IEC 14882 (c++14 standard) states that "The expression designates the object or function denoted by the reference, and the expression is an lvalue or an xvalue, depending on the expression."
Thus, the rf_int expression will be of type int.
This comment (you mention it before) is about type conversion error.
To illustrate my explanation, I have prepared a more complex, but (hopefully) more understandable example.
This example is about why we need the rvalue reference type and how to use it correctly.
class Obj {
int* pvalue;
public:
Obj(int m) {
pvalue = new int[100];
for(int i = 0; i < 100; i++)
pvalue[i] = m + i;
}
Obj(const Obj& o) { // copy constructor
pvalue = new int[100];
for(int i = 0; i < 100; i++)
pvalue[i] = o.pvalue[i];
}
Obj(Obj&& o) { // move constructor
pvalue = o.pvalue; // take over the source's array
o.pvalue = nullptr; // leave the source empty
}
// ...
};
Example 1
Obj obj1(3);
Obj obj2(obj1); // copy constructor
Obj obj3(std::move(obj1)); // move constructor
Obj obj4(Obj(3)); // move constructor (usually elided; guaranteed elided since C++17)
Line #1 - obj1 created.
Line #2 - obj2 created as a copy of obj1
Line #3 - obj3 created by moving the value of obj1 into obj3; obj1 loses its value
Line #4 - temporary object created by Obj(3); obj4 created by moving the value of the temporary object into obj4 (compilers usually elide this move, and since C++17 obj4 is constructed directly); the temporary object loses its value
I think we have no real need of int&& but Obj&& can be very useful. Especially in the line #4.
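For reference, a self-contained variant of the class that also releases its memory might look like the sketch below. Buffer is a hypothetical stand-in for Obj; the destructor, the data() accessor, and the deleted assignment operator are additions made for the example, not part of the original class:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {                       // hypothetical stand-in for Obj
    static const std::size_t kSize = 100;
    int* data_;
public:
    explicit Buffer(int m) : data_(new int[kSize]) {
        for (std::size_t i = 0; i < kSize; ++i) data_[i] = m + static_cast<int>(i);
    }
    Buffer(const Buffer& o) : data_(new int[kSize]) {   // deep copy
        for (std::size_t i = 0; i < kSize; ++i) data_[i] = o.data_[i];
    }
    Buffer(Buffer&& o) noexcept : data_(o.data_) {      // steal the array
        o.data_ = nullptr;
    }
    Buffer& operator=(const Buffer&) = delete;  // assignment omitted for brevity
    ~Buffer() { delete[] data_; }               // delete[] on nullptr is a no-op
    const int* data() const { return data_; }
};

bool copy_duplicates_move_steals() {
    Buffer a(3);
    const int* original = a.data();
    Buffer b(a);                 // copy: b gets its own array
    Buffer c(std::move(a));      // move: c takes over a's array
    return b.data() != original && b.data()[0] == 3
        && c.data() == original && a.data() == nullptr;
}
```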
Example 2
with my (simplified) interpretation of compiler errors
void SomeFunc0(Obj arg) {};
void SomeFunc1(Obj& arg) {};
void SomeFunc2(Obj&& arg) {};
int main()
{
Obj obj1(3); // object
Obj& obj2 = obj1; // reference to the object
Obj&& obj3 = Obj(3); // reference to the temporary object with extended lifetime
SomeFunc0(obj1); // ok - new object created from Obj
SomeFunc0(obj2); // ok - new object created from Obj&
SomeFunc0(obj3); // ok - new object created from Obj&&
SomeFunc0(Obj(3)); // ok - new object created from temporary object
SomeFunc0(std::move(obj1)); // ok - new object created by moving from obj1
SomeFunc0(std::move(obj2)); // ok - new object created by moving from obj1 (via obj2)
SomeFunc0(std::move(obj3)); // ok - new object created by moving from the temporary bound to obj3
SomeFunc1(obj1); // ok - reference to obj1 passed
SomeFunc1(obj2); // ok - reference to obj1 passed
SomeFunc1(obj3); // ok - reference to temp. obj. passed
SomeFunc1(Obj(3)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj1)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj2)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj3)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc2(obj1); // error - rvalue reference cannot bind to an lvalue
SomeFunc2(obj2); // error - rvalue reference cannot bind to an lvalue
SomeFunc2(obj3); // error - the expression obj3 is an lvalue, even though the variable is an Obj&&
SomeFunc2(Obj(3)); // ok
SomeFunc2(std::move(obj1)); // ok
SomeFunc2(std::move(obj2)); // ok
SomeFunc2(std::move(obj3)); // ok
return 0;
}
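The ok/error pattern in Example 2 can be double-checked at compile time with std::is_convertible, which asks whether an argument of a given type and value category can initialize a parameter of a given type. A sketch, with Widget as a hypothetical stand-in for Obj:

```cpp
#include <cassert>
#include <type_traits>

struct Widget {};   // hypothetical stand-in for Obj

// From = Widget& models an lvalue argument, From = Widget models an rvalue.
static_assert( std::is_convertible<Widget&, Widget&>::value,       "lvalue binds to T&");
static_assert(!std::is_convertible<Widget,  Widget&>::value,       "rvalue does not bind to T&");
static_assert(!std::is_convertible<Widget&, Widget&&>::value,      "lvalue does not bind to T&&");
static_assert( std::is_convertible<Widget,  Widget&&>::value,      "rvalue binds to T&&");
static_assert( std::is_convertible<Widget&, const Widget&>::value, "lvalue binds to const T&");
static_assert( std::is_convertible<Widget,  const Widget&>::value, "rvalue binds to const T&");

// Passing by value accepts both; this returns true for the rvalue case.
bool by_value_accepts_rvalue() {
    return std::is_convertible<Widget, Widget>::value;
}
```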

error: invalid initialization of non-const reference of type ‘int&’ from an rvalue of type ‘int’

Wrong form:
int &z = 12;
Correct form:
int y;
int &r = y;
Question:
Why is the first code wrong? What is the "meaning" of the error in the title?
C++03 3.10/1 says: "Every expression is either an lvalue or an rvalue." It's important to remember that lvalueness versus rvalueness is a property of expressions, not of objects.
Lvalues name objects that persist beyond a single expression. For example, obj , *ptr , ptr[index] , and ++x are all lvalues.
Rvalues are temporaries that evaporate at the end of the full-expression in which they live ("at the semicolon"). For example, 1729 , x + y , std::string("meow") , and x++ are all rvalues.
The address-of operator requires that its "operand shall be an lvalue". If we can take the address of an expression, the expression is an lvalue; otherwise it's an rvalue.
&obj; // valid
&12; //invalid
int &z = 12;
On the right hand side, a temporary object of type int is created from the integral literal 12, but the temporary cannot be bound to non-const reference. Hence the error. It is same as:
int &z = int(12); //still same error
Why does a temporary get created? Because a reference has to refer to an object in memory, and for an object to exist, it has to be created first. Since the object is unnamed, it is a temporary object. This explanation also makes it clear why the second case is fine.
A temporary object can be bound to const reference, which means, you can do this:
const int &z = 12; //ok
C++11 and Rvalue Reference:
For the sake of the completeness, I would like to add that C++11 has introduced rvalue-reference, which can bind to temporary object. So in C++11, you can write this:
int && z = 12; //C++11 and later
Note that there is && instead of &. Also note that const is not needed anymore, even though the object which z binds to is a temporary object created out of the integral literal 12.
Since C++11 introduced the rvalue reference, int& is now called an lvalue reference.
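The three binding cases can be put side by side. A minimal sketch (the function name is invented), with the ill-formed line left commented out so the rest compiles:

```cpp
#include <cassert>

int bind_to_literal() {
    const int& z = 12;   // OK: const lvalue reference binds to a temporary int
    int&& r = 12;        // OK since C++11: rvalue reference binds to a temporary
    r += 1;              // the temporary bound to r is modifiable
    // int& bad = 12;    // ill-formed: non-const lvalue reference to an rvalue
    return z + r;        // 12 + 13
}
```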
12 is a compile-time constant which can not be changed unlike the data referenced by int&. What you can do is
const int& z = 12;
Non-const and const reference binding follow different rules
These are the rules of the C++ language:
an expression consisting of a literal number (12) is an "rvalue"
it is not permitted to bind a non-const reference to an rvalue: int &ri = 12; is ill-formed
it is permitted to bind a const reference to an rvalue: in this case, an unnamed object is created by the compiler; this object will persist as long as the reference itself exists.
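The last rule, lifetime extension, can be observed with a class that counts destructor calls. This is a sketch only; Tracer is a hypothetical helper written for this example:

```cpp
#include <cassert>

struct Tracer {                      // hypothetical helper that counts destructions
    static int destroyed;
    ~Tracer() { ++destroyed; }
};
int Tracer::destroyed = 0;

bool temporary_lives_as_long_as_reference() {
    Tracer::destroyed = 0;
    bool alive_inside_scope = false;
    {
        const Tracer& ref = Tracer();   // temporary bound to a const reference
        (void)ref;
        alive_inside_scope = (Tracer::destroyed == 0);  // not destroyed yet
    }                                   // the reference goes out of scope here
    return alive_inside_scope && Tracer::destroyed == 1;
}
```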
You have to understand that these are C++ rules. They just are.
It is easy to invent a different language, say C++', with slightly different rules. In C++', it would be permitted to create a non-const reference with a rvalue. There is nothing inconsistent or impossible here.
But it would allow some risky code where the programmer might not get what he intended, and C++ designers rightly decided to avoid that risk.
References are "hidden pointers" (non-null) to things which can change (lvalues). You cannot define them to a constant. It should be a "variable" thing.
EDIT::
I am thinking of
int &x = y;
as almost equivalent of
int* __px = &y;
#define x (*__px)
where __px is a fresh name, and the #define x works only inside the block containing the declaration of x reference.

Constant Reference Wrapper

I came across the following piece of code in the book on data structures by Mark Allen Weiss.
template <class Object>
class Cref
{
public:
Cref ( ) : obj ( NULL ) { }
explicit Cref( const Object & x ) : obj ( &x ) { }
const Object & get( ) const
{
if ( isNull( ) )
throw NullPointerException( ) ;
else
return *obj;
}
bool isNull( ) const
{ return obj == NULL; }
private:
const Object *obj;
};
So the point here is to assign null/initialize a constant reference. But I am not sure I understand the following:
1. We initialize a constant reference with another constant reference x. But why is it again done as obj(&x) ? the & in const Object & x is different from the &x in obj(&x) ? I see this but not very clear why it should be so. Pls explain.
2. The get method() - We try to return a const reference of the private member obj of this class. It is already a const reference. Why return *obj and not just obj ?
3. Why explicit keyword ? What might happen if an implicit type conversion takes place ? Can someone provide a scenario for this ?
Thanks
The member obj is of type const Object*, but the constructor takes a reference. Therefore to get a pointer, the address-of operator, &, has to be applied. And the member is a pointer because it can be NULL (set in the default constructor), and references never can be NULL.
The private member is not a const reference, but a pointer to const. It is dereferenced to get a reference.
In this specific case, I cannot see any negative effect of a potential implicit conversion either.
1) &x in obj(&x) is used as the address-of operator.
2) No, it is a pointer. Its type is const Object *, not const Object &.
3) To prevent casting from incompatible pointer types which might have their own type-cast operator.
C++ has three flavors of null available:
NULL - the macro inherited from C; in C++11 and later prefer nullptr and use NULL mainly when checking the return values of C functions which return pointers
0 - the integer literal zero is a null pointer constant and converts to a null pointer, but it is easily confused with an integer
nullptr - as of C++11, this is the preferred way of writing and testing for null. Furthermore, its type, std::nullptr_t, makes it a type-safe null.
1) The token & has three meanings:
Unary address-of operator: Take the address of any lvalue expression, giving a pointer to that object.
Reference sigil: In a declaration, means a reference type.
Binary bitwise AND operator - not used here
So it's important to know whether you're looking at a declaration or an expression.
explicit Cref( const Object & x )
Here the sigil appears in the declaration of a function parameter, meaning the type of parameter x is a reference to an Object which is const.
: obj ( &x ) {}
Here the operator is used in a member initializer expression. Member obj is initialized to be a pointer to x.
2) Since member obj is actually a pointer, the dereference operator unary * is needed to get a reference.
3) In general, it's a good idea to use explicit on any (non-copy) constructor which can take exactly one argument. It's just unfortunate that the language doesn't default to explicit and make you use some sort of "implicit" keyword instead when you mean to. In this case, here's one rather bad thing that could happen if the constructor were implicit:
Object create_obj();
void setup_ref(Cref<Object>& ref) {
ref = create_obj();
}
No compiler error or warning, but that setup_ref function stores a pointer to a temporary object, which is invalid by the time the function returns!
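A compilable sketch of the class, assuming C++11, with nullptr in place of NULL and std::runtime_error standing in for the book's NullPointerException (which is not defined in the excerpt). The static_asserts show the effect of explicit: an Object can still be used to construct a Cref directly, but it no longer converts implicitly.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

template <class Object>
class Cref {
public:
    Cref() : obj(nullptr) {}
    explicit Cref(const Object& x) : obj(&x) {}     // & here is address-of
    const Object& get() const {
        if (isNull())
            throw std::runtime_error("null Cref");  // stand-in for NullPointerException
        return *obj;                                // * turns the pointer into a reference
    }
    bool isNull() const { return obj == nullptr; }
private:
    const Object* obj;
};

// With `explicit`, an Object is not implicitly convertible to Cref<Object>,
// so an assignment like `ref = create_obj();` is rejected by the compiler.
static_assert(!std::is_convertible<int, Cref<int> >::value, "no implicit conversion");
static_assert(std::is_constructible<Cref<int>, const int&>::value, "direct construction ok");

bool cref_behaves() {
    int value = 5;
    Cref<int> empty;
    Cref<int> bound(value);
    bool threw = false;
    try { empty.get(); } catch (const std::runtime_error&) { threw = true; }
    return empty.isNull() && !bound.isNull() && &bound.get() == &value && threw;
}
```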
1) The type of obj is not a reference; obj is a pointer to const, and a pointer must be initialized with an address, so the address-of operator, &, must be applied.
2) The reference the get() method returns is not a reference to the class member obj, but a reference to the object which obj points to, so you have to use * to dereference it.
3) The keyword explicit means we reject the implicit conversions, that is, we tell the compiler, don't make implicit conversions with this constructor for us.
Example:
class Cref<Object> A;
Object B;
A=B;
This compiles without explicit: since a Cref object is needed on the right-hand side, the compiler automatically creates one with the constructor Cref( const Object & ). But if you add explicit, the compiler won't make that conversion, and there will be a compile error.

What's the lifetime of the object returned by typeid operator?

If I call typeid and retrieve the address of returned type_info:
const type_info* info = &( typeid( Something ) );
what's the lifetime of the object returned by typeid and how long will the pointer to that object remain valid?
However the implementation implements them, the results of typeid expressions are lvalues and the lifetime of the objects that those lvalues refer to must last until the end of the program.
From ISO/IEC 14882:2003 5.2.8 [expr.typeid]:
The result of a typeid expression is an lvalue [...] The lifetime of the object referred to by the lvalue extends to the end of the program.
From 5.2.8/1 of the C++ 2003 standard:
The result of a typeid expression is an lvalue of static type const
std::type_info (18.5.1) and dynamic type const std::type_info or const
name where name is an implementation-defined class derived from
std::type_info which preserves the behavior described in 18.5.1.
The lifetime of the object referred to by the lvalue extends to the
end of the program. Whether or not the destructor is called for the
type_info object at the end of the program is unspecified.
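Its lifetime is the duration of the program. In practice, no matter how many times you write typeid(x), you will usually get the same type_info object every time for the same type, although the standard only guarantees that the results compare equal, not that they are the same object.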
That is,
T x, y;
const type_info & xinfo = typeid(x);
const type_info & yinfo = typeid(y);
The references xinfo and yinfo typically refer to the same object (the standard does not strictly require it). Try printing the addresses to check:
cout << &xinfo << endl; //printing the address
cout << &yinfo << endl; //printing the address
Output:
0x80489c0
0x80489c0
Note: on your run, the address might differ from the one above, but on typical implementations both lines print the same address.
Demo : http://www.ideone.com/jO4CO
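A sketch of keeping a type_info pointer beyond the call that obtained it, under the assumption of a C++11 compiler (for std::type_index and hash_code). The struct T and the function names are hypothetical. The comparison uses operator== rather than addresses, since equality is what the standard guarantees:

```cpp
#include <cassert>
#include <typeinfo>
#include <typeindex>

struct T {};   // hypothetical type

// Keeps a pointer to a type_info object; it stays valid until the program ends.
const std::type_info* remember_type() {
    T x;
    return &typeid(x);
}

bool type_info_outlives_call() {
    const std::type_info* info = remember_type();
    T y;
    // Compare with operator== (or std::type_index), not by address.
    return *info == typeid(y)
        && std::type_index(*info) == std::type_index(typeid(T))
        && info->hash_code() == typeid(T).hash_code();
}
```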