const_cast<char *>(char* const) not lvalue? - c++

When compiling the code below, I am getting an error on line 3 about the result of const_cast not being an lvalue. Is this only a problem because I used gcc 7.x (even though it is supposed to be fully C++17 compliant)? Or is this indeed invalid code according to the standard?
The code below is a minimal example that triggers the error. Tried gcc 7.1, 7.4, and also https://www.onlinegdb.com/online_c++_compiler and got the same error.
char* const a = "xyz";
char* b;
const_cast<char*>(a) = b; // not lvalue error
The precise error gcc gives is: "error: lvalue required as left operand of assignment".
NOTE (forgot to add): the example has nothing to do with actual code I would ever write. It is an example I came about which (I presume) was created to test how well people understand the standard. So I am only interested in precisely what I asked in the question, i.e., whether this is valid code or not (and why). Thx!

So I am only interested in precisely what I asked in the question, i.e., whether this is valid code or not
It's not. The result of const_cast is a glvalue (lvalue or xvalue) only when casting to a reference type.
[expr.const.cast] (emphasis mine)
1 The result of the expression const_cast<T>(v) is of type
T. If T is an lvalue reference to object type, the result is an
lvalue; if T is an rvalue reference to object type, the result is an
xvalue; otherwise, the result is a prvalue and the
lvalue-to-rvalue, array-to-pointer, and function-to-pointer standard
conversions are performed on the expression v. Conversions that can be
performed explicitly using const_cast are listed below. No other
conversion shall be performed explicitly using const_cast.
You don't cast to a reference type, so the result is a prvalue; not something you may assign to. And don't go casting to a reference type either; attempting to modify an object declared as const gives undefined behavior. Your program will be another sort of invalid then.
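The three cases in the quoted paragraph can be checked at compile time. The following is only a sketch: buffer, a and the three constant names are made up for illustration, and a is bound to a writable buffer (not a string literal) so that the declaration itself is well-formed. It relies on decltype(e) giving T for a prvalue, T& for an lvalue and T&& for an xvalue when e is not a plain name.

```cpp
#include <type_traits>

// Hypothetical setup: a const pointer to a writable buffer.
char buffer[] = "xyz";
char* const a = buffer;

// Cast to a non-reference type: the result is a prvalue (decltype gives char*).
constexpr bool cast_to_pointer_is_prvalue =
    std::is_same<decltype(const_cast<char*>(a)), char*>::value;

// Cast to an lvalue reference type: the result is an lvalue (decltype gives char*&).
constexpr bool cast_to_lvalue_ref_is_lvalue =
    std::is_same<decltype(const_cast<char*&>(a)), char*&>::value;

// Cast to an rvalue reference type: the result is an xvalue (decltype gives char*&&).
constexpr bool cast_to_rvalue_ref_is_xvalue =
    std::is_same<decltype(const_cast<char*&&>(a)), char*&&>::value;

static_assert(cast_to_pointer_is_prvalue, "const_cast<char*>(a) is a prvalue");
static_assert(cast_to_lvalue_ref_is_lvalue, "const_cast<char*&>(a) is an lvalue");
static_assert(cast_to_rvalue_ref_is_xvalue, "const_cast<char*&&>(a) is an xvalue");
```

Nothing here writes through the casts; the checks only inspect the value category, so no const object is modified.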

First, char* const a = "xyz"; is illegal. A string literal has the type const char[N], and assigning it to a char* const drops the constness of the characters, which an implicit conversion is not allowed to do.
Now let's pretend that it's fine and look at
const_cast<char*>(a) = b
This has two issues. The first is that const_cast<char*>(a) results in an rvalue. For non-class types you cannot assign to rvalues. You would need const_cast<char*&>(a) in order to have an lvalue to assign to, and that brings up the next problem: you can't assign to an object that is const. Stripping away the const using const_cast doesn't fix the issue. It is still not allowed per [dcl.type.cv]/4:
Any attempt to modify ([expr.ass], [expr.post.incr], [expr.pre.incr]) a const object ([basic.type.qualifier]) during its lifetime ([basic.life]) results in undefined behavior.
Even with the proper cast, the underlying object is still const so you violate the above clause and have undefined behavior.
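For contrast, here is a sketch of the situation where casting away const and then assigning is fine: the object being modified was never declared const, only the reference used to reach it is const-qualified. The function name and the two local buffers are made up for illustration. The static_assert at the top also confirms the literal's type mentioned at the start of this answer.

```cpp
#include <type_traits>

// A string literal is an lvalue of type const char[N]; N counts the '\0'.
static_assert(std::is_same<decltype("xyz"), const char (&)[4]>::value,
              "\"xyz\" has type const char[4]");

// Hypothetical helper: returns the first character p points at after rebinding.
char rebind_through_const_view() {
    static char first[] = "xyz";
    static char second[] = "abc";

    char* p = first;                    // p itself is NOT a const object
    char* const& view = p;              // const-qualified view of p
    const_cast<char*&>(view) = second;  // OK: the object modified is p
    return p[0];                        // p now points at "abc"
}
```

If p had been declared char* const p, the same assignment would compile but would fall under the clause quoted above.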

The type char * const a defines a pointer variable a, which cannot be changed, but points to characters that can be changed. This is not a common use to make the pointer constant.
The error is telling you that you cannot update the value of a - it's not an lvalue, and I don't believe that const_cast gets around that in this case.
Could you possibly mean const char *a, which allows the pointer itself to be changed, but not the things pointed to?
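The difference between the two declarations can be sketched like this (the function and variable names are invented for the example, and the commented-out lines are the ones a compiler should reject):

```cpp
#include <type_traits>

static_assert(std::is_const<char* const>::value, "the pointer itself is const");
static_assert(!std::is_const<const char*>::value, "only the pointee is const");

bool pointer_constness_demo() {
    static char text[] = "xyz";
    static const char other[] = "abc";

    char* const fixed = text;    // cannot be reseated, characters are writable
    fixed[0] = 'X';
    // fixed = text + 1;         // error: fixed is a const pointer

    const char* movable = text;  // can be reseated, characters are read-only
    movable = other;
    // movable[0] = 'A';         // error: pointee is const through this pointer

    return text[0] == 'X' && movable[0] == 'a';
}
```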

An "lvalue" is a syntactic construct, meaning a kind of expression that can appear on the left of an assignment. You can assign to a variable, or an array component, or a field, but it's a syntax error to write an assignment to other kinds of expression such as x + y = 7; or f(x) = 5;. A function call such as const_cast<char*>(a) is not a kind of expression which can be assigned to.
It would be syntactically valid to write a = const_cast<char*>(b);, where the function call appears on the right of the assignment.

Related

Why can't casting an address to int* be an lvalue but casting to a struct pointer can?

I suspect this is true for all primitive types in C/C++.
For example, if you do this:
((unsigned int*)0x1234) = 1234;
The compiler will not let it pass. Whereas if you do this
((data_t*)0x1234 )->s = 1234;
where data_t is a struct, the compiler allows it.
This seems to be the case for at least two compilers I experimented on, one ARM GCC, one TDM-GCC.
Why is this?
The first code snippet doesn't work because the left hand side is not an lvalue. It is only a pointer value, and pointers by themselves are not lvalues.
The second code snippet works because a pointer is being dereferenced, and a dereferenced pointer is an lvalue. It may not be immediately clear from the syntax this is the case, so let's rewrite this:
((data_t*)0x1234 )->s = 1234;
As:
(*(data_t*)0x1234).s = 1234;
Now we can see that the value which is cast to a pointer is dereferenced to an lvalue of struct type, and a member of that struct is subsequently accessed and assigned to.
This is described in section 6.5.2.3p4 of the C standard regarding the -> operator:
A postfix expression followed by the -> operator and an identifier
designates a member of a structure or union object. The value
is that of the named member of the object to which the first
expression points, and is an lvalue. If the first expression is a
pointer to a qualified type, the result has the so-qualified
version of the type of the designated member.
Regarding the first snippet, section 6.5.4p5 regarding the typecast operator states:
Preceding an expression by a parenthesized type name converts
the value of the expression to the named type. This construction is
called a cast. 104) A cast that specifies no conversion has
no effect on the type or value of an expression.
Where footnote 104 states:
A cast does not yield an lvalue. Thus, a cast to a qualified
type has the same effect as a cast to the unqualified version
of the type.
So this describes why the first snippet won't compile but the second snippet will.
However, treating an arbitrary value as a pointer and dereferencing it is implementation defined behavior at best, and most likely undefined behavior.
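The same distinction can be shown in C++ without touching an arbitrary address. This is only a sketch: data_t is given a made-up one-member layout, and the pointer comes from a real local object rather than from 0x1234.

```cpp
#include <type_traits>
#include <utility>

struct data_t { unsigned s; };  // hypothetical layout for illustration

// p->s is an lvalue even when the pointer expression p is not one:
// decltype((expr)) yields a T& exactly when expr is an lvalue.
static_assert(
    std::is_same<decltype((std::declval<data_t*>()->s)), unsigned&>::value,
    "member access through a pointer yields an lvalue");

unsigned write_through_cast_pointer() {
    data_t object{0};
    void* raw = &object;

    // The cast result is a prvalue pointer, but ->s designates an lvalue.
    static_cast<data_t*>(raw)->s = 1234;

    // static_cast<data_t*>(raw) = nullptr;  // error: cast result is not an lvalue
    return object.s;
}
```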
Your examples are:
((unsigned int*)0x1234) = 1234;
((data_t*)0x1234 )->s = 1234;
Neither ((unsigned int*)0x1234) nor ((data_t*)0x1234 ) is an lvalue, and you can't assign to either of them.
More generally, the prefix of -> doesn't have to be an lvalue. But prefix->member is always an lvalue, whether prefix is or not. Similarly, *p is an lvalue whether p is an lvalue or not.

Accessing field of NULL pointer to a struct works if address-of is used [duplicate]

Code sample:
struct name
{
int a, b;
};
int main()
{
&(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null", however C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however it doesn't say the same for -> and also it does not define a -> b as being (*a).b ; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
However I feel that the bolded text is defective and should read lvalue that potentially designates an object , as per 6.3.2.1/1 (definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined
however the & operator does evaluate its operand. (It doesn't access the stored value but that is different).
This long chain of reasoning seems to suggest that the code causes UB however it is fairly tenuous and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since you could not find a path in which there would be no UB. IMHO the root cause is that at a moment you apply the -> operator on an expression that does not point to an object.
From a compiler point of view, assuming the compiler programmer was not overcomplicated, it is clear that the expression returns the same value as offsetof(name, b) would, and I'm pretty sure that provided it is compiled without error any existing compiler will give that result.
As written, we could not blame a compiler that would note that in the inner part you use operator -> on an expression that cannot point to an object (since it is null) and issue a warning or an error.
My conclusion is that until there is a special paragraph saying that provided it is only to take its address it is legal to dereference a null pointer, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object and not defined (=undefined) otherwise. In general you shouldn't search more in the term undefined, it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to such situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is a slackness that is introduced to help compiler builders to deal with things. They may define a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code or similar for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type "pointer to type", the result has type "type". If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is
always true that if E is a function designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of
an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined, but the question is whether the same is true for &(*E).m, where E is a null pointer and its type is a struct that has a member m?
C Standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. C Standard is correct to keep it undefined, and provides a macro offsetof that handles the problem internally.
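A minimal sketch of the offsetof route, written as C++ using <cstddef>. The struct mirrors the one in the question; the constant and function names are made up. The helper compares the macro's result with address arithmetic on a real object, so no null pointer is involved anywhere.

```cpp
#include <cstddef>

struct name { int a, b; };

// offsetof is provided by the implementation, which may use whatever
// internal mechanism it likes to compute the value.
constexpr std::size_t offset_of_a = offsetof(name, a);
constexpr std::size_t offset_of_b = offsetof(name, b);

bool member_address_matches_offset() {
    name object{1, 2};
    const char* base = reinterpret_cast<const char*>(&object);
    const char* member = reinterpret_cast<const char*>(&object.b);
    return static_cast<std::size_t>(member - base) == offset_of_b;
}
```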
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that an integer constant expression with the value 0 is converted to a null pointer constant.
But the value of a null pointer constant is not defined as 0. The value is implementation defined.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer will have a value where all bits are set and using member access on that value will result in an overflow, which is undefined behavior.
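Whatever bit pattern an implementation picks, the guarantees are phrased in terms of conversions and comparisons, not representation. A small sketch in C++ (the function and variable names are invented for the example):

```cpp
#include <cstddef>

bool null_pointer_comparisons() {
    int* from_literal = 0;        // the literal 0 is a null pointer constant
    int* from_nullptr = nullptr;  // C++ keyword, also a null pointer constant
    int* from_macro = NULL;       // implementation-defined null pointer constant

    // All three are null pointers of type int* and must compare equal,
    // regardless of how a null pointer is represented in memory.
    return from_literal == from_nullptr &&
           from_macro == from_nullptr &&
           !from_nullptr;
}
```

The check deliberately says nothing about the stored bits; inspecting those (for example with memcmp against zero bytes) is where portability would end.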
Another problem is how you evaluate &(*E).m: do the brackets apply, and is * evaluated first? Keeping it undefined solves this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.96) If the first expression is a pointer to
a qualified type, the result has the so-qualified version of the type of the designated
member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type.
Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
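By way of contrast, the one integer/pointer round trip that can be relied on starts from a valid pointer. The sketch below is C++ and assumes the implementation provides std::uintptr_t, which is optional but widely available; the function name is made up.

```cpp
#include <cstdint>

bool pointer_round_trips_through_integer() {
    int value = 42;

    // Pointer to integer: the numeric result is implementation-defined.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(&value);

    // Integer back to the same pointer type: yields the original pointer.
    int* back = reinterpret_cast<int*>(bits);

    return back == &value && *back == 42;
}
```

The numeric value of bits is deliberately not inspected, since nothing portable can be said about it.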
Nothing in the C standard would impose any requirements on what a system could do with the expression. It would, when the standard was written, have been perfectly reasonable for it to cause the following sequence of events at runtime:
Code loads a null pointer into the addressing unit
Code asks the addressing unit to add the offset of field b.
The addressing unit triggers a trap when attempting to add an integer to a null pointer (which should for robustness be a run-time trap, even though many systems don't catch it)
The system starts executing essentially random code after being dispatched through a trap vector that was never set because code to set it would have been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as being a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard which would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0 which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel in the address 0x4. So you could use a method like this to call kernel functions.
In fact, the same approach is used on the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standard point of view, the expression is legal in C (C++ is a different beast since you can overload the & operator).

Is dereferencing invalid pointers legal if no lvalue-to-rvalue conversion occurs

Try as I might, the closest answer I've seen is this, with two completely opposing answers(!)
The question is simple, is this legal?
auto p = reinterpret_cast<int*>(0xbadface);
*p; // legal?
My take on the matter
Casting integer to pointer: no restrictions on what may be cast
Indirection: only states the result is an lvalue.
Lifetimes: only states what can't be done on objects, there is no object here
Expression statements: *p is a discarded value expression
Discarded value expressions: no lvalue-to-rvalue conversion occurs
Undefined-ness of lvalues: aka strict aliasing rule, only if the lvalue is converted to an rvalue
So I conclude there is nothing explicitly saying this is undefined behaviour. Yet I distinctly remember that some platforms trap on indirection for invalid pointers. What went wrong with my reasoning?
[basic.compound] says:
Every value of pointer type is one of the following:
a pointer to an object or function (the pointer is said to point to the object or function), or
a pointer past the end of an object ([expr.add]), or
the null pointer value ([conv.ptr]) for that type, or
an invalid pointer value.
By the process of elimination we can deduce that p is an invalid pointer value.
[basic.stc] says:
Indirection through an invalid pointer value and passing an invalid
pointer value to a deallocation function have undefined behavior. Any
other use of an invalid pointer value has implementation-defined
behavior.
As indirection operator is said to perform indirection by [expr.unary.op], I would say, that expression *p causes UB no matter if the result is used or not.
... some platforms trap on indirection for invalid pointers.
Most platforms trap on invalid address access. This does not contradict the issue in any way. The question of what happens in *p; boils down to whether an attempt to actually fetch at an invalid address takes place or not.
The question of fetching is very similar to the core issue 232 (indirection through a null pointer). As you have already pointed out, *p; is a discarded value expression, and as such no lvalue-to-rvalue conversion ("fetching") takes place:
Tom Plum:
...it is only the act of "fetching", of lvalue-to-rvalue conversion, that triggers the ill-formed or undefined behavior.
And subsequently:
Notes from the October 2003 meeting:
We agreed that the approach in the standard seems okay: p = 0; *p; is
not inherently an error. An lvalue-to-rvalue conversion would give it
undefined behavior.
As to whether or not reinterpret_cast<int*>(0xbadface) produces a valid pointer, indeed in implementations with strict pointer safety, it wouldn't be a safely-derived pointer, and as such is invalid and any use of it is UB.
But in case of relaxed pointer safety the resulting pointer is valid (otherwise it would be impossible to use pointers returned from binary libraries and components written in C or other languages).
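One thing both readings agree on is the static side: the type and value category of *p can be inspected without evaluating anything, because the operand of decltype is unevaluated. A sketch (the constant name is made up):

```cpp
#include <type_traits>
#include <utility>

// decltype does not evaluate its operand, so no indirection happens here,
// whatever pointer value would be involved at run time.
constexpr bool indirection_names_an_lvalue =
    std::is_same<decltype(*std::declval<int*>()), int&>::value;

static_assert(indirection_names_an_lvalue,
              "unary * on an int* yields an lvalue of type int");
```

This says nothing about whether the evaluated statement *p; is defined for an invalid pointer; that is exactly the point on which the answers above disagree.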

c++ access static members using null pointer

Recently tried the following program and it compiles, runs fine and produces expected output instead of any runtime error.
#include <iostream>
class demo
{
public:
static void fun()
{
std::cout<<"fun() is called\n";
}
static int a;
};
int demo::a=9;
int main()
{
demo* d=nullptr;
d->fun();
std::cout<<d->a;
return 0;
}
If an uninitialized pointer is used to access class and/or struct members, the behaviour is undefined. So why is it allowed to access static members through a null pointer? Is there any harm in my program?
TL;DR: Your example is well-defined. Merely dereferencing a null pointer is not invoking UB.
There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB.
The only questionable thing that happens in your example is the evaluation of the object expression. In particular, d->a is equivalent to (*d).a according to [expr.ref]/2:
The expression E1->E2 is converted to the equivalent form
(*(E1)).E2; the remainder of 5.2.5 will address only the first
option (dot).
*d is just evaluated:
The postfix expression before the dot or arrow is evaluated;65 the
result of that evaluation, together with the id-expression, determines
the result of the entire postfix expression.
65) If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary
to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
Let's extract the critical part of the code. Consider the expression statement
*d;
In this statement, *d is a discarded value expression according to [stmt.expr]. So *d is solely evaluated (footnote 1), just as in d->a.
Hence if *d; is valid, or in other words the evaluation of the expression *d, so is your example.
Does indirection through null pointers inherently result in undefined behavior?
There is the open CWG issue #232, created over fifteen years ago, which concerns this exact question. A very important argument is raised. The report starts with
At least a couple of places in the IS state that indirection through a
null pointer produces undefined behavior: 1.9 [intro.execution]
paragraph 4 gives "dereferencing the null pointer" as an example of
undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses
this supposedly undefined behavior as justification for the
nonexistence of "null references."
Note that the example mentioned was changed to cover modifications of const objects instead, and the note in [dcl.ref] - while still existing - is not normative. The normative passage was removed to avoid commitment.
However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary
"*" operator, does not say that the behavior is undefined if the
operand is a null pointer, as one might expect. Furthermore, at least
one passage gives dereferencing a null pointer well-defined behavior:
5.2.8 [expr.typeid] paragraph 2 says
If the lvalue expression is obtained by applying the unary * operator
to a pointer and the pointer is a null pointer value (4.10
[conv.ptr]), the typeid expression throws the bad_typeid exception
(18.7.3 [bad.typeid]).
This is inconsistent and should be cleaned up.
The last point is especially important. The quote in [expr.typeid] still exists and appertains to glvalues of polymorphic class type, which is the case in the following example:
#include <iostream>
#include <typeinfo>
int main() try {
// Polymorphic type
class A
{
virtual ~A(){}
};
typeid( *((A*)0) );
}
catch (const std::bad_typeid&)
{
std::cerr << "bad_exception\n";
}
The behavior of this program is well-defined (an exception will be thrown and caught), and the expression *((A*)0) is evaluated as it isn't part of an unevaluated operand. Now if indirection through null pointers induced UB, then the expression written as
*((A*)0);
would be doing just that, inducing UB, which seems nonsensical when compared to the typeid scenario. If the above expression is merely evaluated as every discarded-value expression is (footnote 1), where is the crucial difference that makes the evaluation in the second snippet UB? There is no existing implementation that analyzes the typeid-operand, finds the innermost, corresponding dereference and surrounds its operand with a check - there would be a performance loss, too.
A note in that issue then ends the short discussion with:
We agreed that the approach in the standard seems okay: p = 0; *p;
is not inherently an error. An lvalue-to-rvalue conversion would give
it undefined behavior.
I.e. the committee agreed upon this. Although the proposed resolution of this report, which introduced so-called "empty lvalues", was never adopted…
However, “not modifiable” is a compile-time concept, while in fact
this deals with runtime values and thus should produce undefined
behavior instead. Also, there are other contexts in which lvalues can
occur, such as the left operand of . or .*, which should also be
restricted. Additional drafting is required.
…that does not affect the rationale. Then again, it should be noted that this issue even precedes C++03, which makes it less convincing while we approach C++17.
CWG-issue #315 seems to cover your case as well:
Another instance to consider is that of invoking a member function
from a null pointer:
struct A { void f () { } };
int main ()
{
A* ap = 0;
ap->f ();
}
[…]
Rationale (October 2003):
We agreed the example should be allowed. p->f() is rewritten as
(*p).f() according to 5.2.5 [expr.ref]. *p is not an error when
p is null unless the lvalue is converted to an rvalue (4.1
[conv.lval]), which it isn't here.
According to this rationale, indirection through a null pointer per se does not invoke UB without further lvalue-to-rvalue conversions (=accesses to stored value), reference bindings, value computations or the like. (Nota bene: Calling a non-static member function with a null pointer should invoke UB, albeit merely hazily disallowed by [class.mfct.non-static]/2. The rationale is outdated in this respect.)
I.e. a mere evaluation of *d does not suffice to invoke UB. The identity of the object is not required, and neither is its previously stored value. On the other hand, e.g.
*p = 123;
is undefined since there is a value computation of the left operand, [expr.ass]/1:
In all cases, the assignment is sequenced after the value computation
of the right and left operands
Because the left operand is expected to be a glvalue, the identity of the object referred to by that glvalue must be determined as mentioned by the definition of evaluation of an expression in [intro.execution]/12, which is impossible (and thus leads to UB).
Footnote 1, [expr]/11:
In some contexts, an expression only appears for its side effects.
Such an expression is called a discarded-value expression. The
expression is evaluated and its value is discarded. […]. The lvalue-to-rvalue conversion (4.1) is
applied if and only if the expression is a glvalue of
volatile-qualified type and […]
From the C++ Draft Standard N3337:
9.4 Static members
2 A static member s of class X may be referred to using the qualified-id expression X::s; it is not necessary to use the class member access syntax (5.2.5) to refer to a static member. A static member may be referred
to using the class member access syntax, in which case the object expression is evaluated.
And in the section about object expression...
5.2.5 Class member access
4 If E2 is declared to have type “reference to T,” then E1.E2 is an lvalue; the type of E1.E2 is T. Otherwise,
one of the following rules applies.
— If E2 is a static data member and the type of E2 is T, then E1.E2 is an lvalue; the expression designates the named member of the class. The type of E1.E2 is T.
Based on the last paragraph of the standard, the expressions:
d->fun();
std::cout << d->a;
work because they both designate the named member of the class regardless of the value of d.
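The first sentence of the quoted 9.4/2 also gives a way to sidestep the whole debate: with the qualified-id form there is no object expression, so no pointer is evaluated at all. A sketch adapted from the question's class (fun returns a string here instead of printing, purely to keep the example easy to check; read_static_without_object is a made-up helper):

```cpp
#include <string>

class demo {
public:
    static std::string fun() { return "fun() is called"; }
    static int a;
};

int demo::a = 9;

int read_static_without_object() {
    // Qualified-id access: demo::a names the static member directly,
    // with no demo* (null or otherwise) involved.
    return demo::a;
}
```

Calling demo::fun() works the same way and does not depend on how the null-pointer question is resolved.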
runs fine and produces expected output instead of any runtime error.
That's a basic assumption error. What you are doing is undefined behavior, which means that your claim for any kind of "expected output" is faulty.
Addendum: Note that, while there is a CWG defect (#315) report that is closed as "in agreement" of not making the above UB, it relies on the positive closing of another CWG defect (#232) that is still active, and hence none of it is added to the standard.
Let me quote a part of a comment from James McNellis to an answer to a similar Stack Overflow question:
I don't think CWG defect 315 is as "closed" as its presence on the "closed issues" page implies. The rationale says that it should be allowed because "*p is not an error when p is null unless the lvalue is converted to an rvalue." However, that relies on the concept of an "empty lvalue," which is part of the proposed resolution to CWG defect 232, but which has not been adopted.
The expressions d->fun() and d->a both cause evaluation of *d ([expr.ref]/2).
The complete definition of the unary * operator from [expr.unary.op]/1 is:
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.
For the expression d there is no "object or function to which the expression points" . Therefore this paragraph does not define the behaviour of *d.
Hence the code is undefined by omission, since the behaviour of evaluating *d is not defined anywhere in the Standard.
What you are seeing here is what I would consider an ill-conceived and unfortunate design choice in the specification of the C++ language and many other languages that belong to the same general family of programming languages.
These languages allow you to refer to static members of a class using a reference to an instance of the class. The actual value of the instance reference is of course ignored, since no instance is required to access static members.
So, in d->fun(); the compiler uses the d pointer only during compilation to figure out that you are referring to a member of the demo class, and then it ignores it. No code is emitted by the compiler to dereference the pointer, so the fact that it is going to be NULL during runtime does not matter.
So, what you see happening is in perfect accordance to the specification of the language, and in my opinion the specification suffers in this respect, because it allows an illogical thing to happen: to use an instance reference to refer to a static member.
P.S. Most compilers in most languages are actually capable of issuing warnings for that kind of stuff. I do not know about your compiler, but you might want to check, because the fact that you received no warning for doing what you did might mean that you do not have enough warnings enabled.

Dereferencing a returned reference

Given:
int& foo(); // don't care what the reference is to
int intVal;
In the following two cases the right hand side is the same function call
int& intRef = foo();
intVal = foo(); // a reference is returned... a value is assigned.
In the second case how is the returned reference "converted" into a value?
Is it done by the assignment operator for the int?
At the language level there's no such concept as "dereferencing a reference". A reference implements the concept of an lvalue. A variable and a reference are basically the same thing. The only difference between a variable and a reference is that the variable is bound to its location in storage automatically, by the compiler, while a reference is generally bound through user action at run time.
In your example, there's no conceptual difference between intRef and intVal. Both are lvalues of type int. And at the conceptual level both are accessed through the same mechanism. You can even think of all variables in your program as references, which were implicitly pre-bound for you by the compiler. This is basically what Bjarne Stroustrup means in TC++PL when he says (not verbatim) that one can think of references as just alternative names for existing variables.
The only moment when the difference between the two is perceptible is when you create these entities and initialize them. Initialization of a reference is an act of binding it to some location in storage. Initialization of a variable is an act of copying the initial value into the existing storage.
But once a reference is initialized, it acts as an ordinary variable: an act of reading/writing a reference is an act of reading/writing the storage location it is bound to. Taking the address of a reference evaluates to the address of the storage location it is bound to. And so on.
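This can be observed directly. In the sketch below, the body of foo is an assumption (the question leaves it open): it returns a reference to a function-local static int. The helper name is also made up.

```cpp
// Hypothetical definition of foo(): hands out a reference to one fixed int.
int& foo() {
    static int storage = 0;
    return storage;
}

bool reference_aliases_object() {
    foo() = 7;              // the call expression is an lvalue; write through it

    int& intRef = foo();    // binds intRef to the same storage
    int intVal = foo();     // copies the current value (7) into intVal

    intRef = 10;            // writes to the storage, not to intVal

    return &intRef == &foo()  // same address: the reference is an alias
        && intVal == 7        // the copy is unaffected
        && foo() == 10;       // the write went through to the storage
}
```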
It is not a secret that in many cases a reference is implemented internally as a pointer in disguise, i.e. as an invisible pointer that is implicitly dereferenced for you every time you access it. In such cases (when it is really implemented through a pointer) the dereference is done, again, every time you access it. So, it is not the assignment operator that does it, as you ask in your question. It is the very fact that you mentioned the name of that reference in your code that causes the invisible pointer to get dereferenced.
However, an entity that implements "alternative name for existing variable" does not necessarily require storage for itself, i.e. in a compiled language it is not required to be represented by anything material, like a hidden pointer. This is why the language standard states in 8.3.2 that "It is unspecified whether or not a reference requires storage".
foo is returning some reference to an object of type "int". We won't care about where that "int" came from and we'll just assume it exists.
The first line, int& intRef = foo(), creates intRef which also refers to exactly the same object of type "int" as is referenced by the return value of foo.
In the second line, intVal = foo(), the value of intVal is replaced by the value of the object referred to by the returned reference.
In response to your comments:
You seem to be getting very confused between pointers and references. References are just like aliases for an object. Doing anything to a reference will actually affect the object it refers to.
There is no such thing as dereferencing a reference. You can only dereference pointers. Dereferencing is the act of using the unary * operator to get the object pointed at by a pointer. For example, if you have an int* p, you can do *p to get the object that it points at. This is dereferencing p.
The only time you can do * on a reference is if the object it refers to is a pointer (or if it overloads operator*). In your case, since foo returns an int&, we can't dereference it. The expression *foo() just won't compile. That's because the return value of foo has type "int" which is not a pointer and doesn't overload operator*.
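To make that concrete, here is a minimal sketch of the one case where * does apply to a returned reference: when the referred-to object is itself a pointer. The names target, ptr, bar and read_through_reference_to_pointer are invented for this example and are not part of the question.

```cpp
int  target = 3;
int* ptr    = &target;

// Hypothetical function returning a reference to a pointer (int*&).
int*& bar() { return ptr; }

int read_through_reference_to_pointer() {
    // OK: bar() denotes an object of type int*, so unary * dereferences
    // that pointer. The reference itself is not what gets dereferenced.
    return *bar();
}

// By contrast, with `int& foo();` the expression `*foo()` is ill-formed,
// because foo() denotes an int, which is not a pointer and has no operator*.
```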
For all intents and purposes, you can treat the reference returned from foo as simply being the object it refers to. Assigning this value to intVal is really no different to assigning x to intVal in the following code:
int intVal;
int x = 5;
intVal = x;
As I'm sure you understand, intVal is given the value of x. This is defined simply by the standard:
In simple assignment (=), the value of the expression replaces that of the object referred to by the left operand.
No conversion needs to occur at all because both sides of the operator are the same type.
This is really no different to your situation. You just have:
intVal = some_ref_to_int;
Where some_ref_to_int is the expression foo(). The fact that it's a reference doesn't matter. intVal receives the value of the object that the reference denotes.
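A small sketch of the difference between the two lines in the question, under the assumption that foo returns a reference to a function-local static (the question leaves this open). The helper name copy_then_modify is made up for illustration.

```cpp
// Hypothetical definition of foo(); the question does not say what it refers to.
int& foo() {
    static int storage = 5;
    return storage;
}

int copy_then_modify() {
    foo() = 5;            // foo() is an lvalue, so it can be assigned to (reset for repeatability)

    int& intRef = foo();  // binds intRef to the object; no value is copied
    int intVal = 0;
    intVal = foo();       // copies the current value of that object into intVal

    intRef = 7;           // modifies the object that foo() refers to...
    return intVal;        // ...but intVal keeps its own copy of the old value
}
```

After the call, intVal still holds 5 while foo() yields 7, which shows that the assignment copied a value and left no lasting connection to the referenced object.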
Assigning to intVal is an assignment-expression defined in 5.17 [expr.ass] in the standard. The grammar rules for an assignment-expression are quite complicated, depending on several other grammar rules, but basically you need a modifiable lvalue on the left hand side of the = operator, and a prvalue expression on the right hand side.
In the case of
intVal = foo();
the expression on the RHS is an lvalue of type int, so the built-in lvalue-to-rvalue conversion takes place. This is barely a conversion, in that the value doesn't change and neither does the type (except that for non-class types cv-qualifiers are removed, so if the lvalue is type const int the prvalue will be type int). [conv.lval] says
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. [...] If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T. [...] the value contained in the object indicated by the glvalue is the prvalue result.
So the prvalue has type int and the same value as foo() i.e. the same value as the variable the returned reference is bound to.
The rules of assignment expressions say:
In simple assignment (=), the value of the expression replaces that of the object referred to by the left operand.
So the value of intVal will be replaced by the value of the prvalue. The rules continue:
If the left operand is not of class type, the expression is implicitly converted (Clause 4) to the cv-unqualified type of the left operand.
So because int is not a class type (and therefore has no overloaded operator=; it just uses the built-in assignment operator), the assignment will convert the RHS to int, which is the type it already has in your case.
So the value of intVal gets set to the value of the prvalue, which we said is the value of the glvalue expression foo(), i.e. the value of the variable the reference is bound to.
Note that the lvalue-to-rvalue conversion has nothing to do with the RHS being a reference. The same thing happens here:
int val = 0;
intVal = val;
val is an lvalue of type int so it's converted to a prvalue of type int and the value of intVal is set to the value of that prvalue.
The rules are expressed in terms of an expression's "value category" (i.e. lvalue or rvalue) not whether it's a reference or not. Any "dereferencing" of a reference that's needed is done implicitly and invisibly by the compiler in order to implement the required behaviour.
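If you want to check the value categories yourself, one possible way is decltype with an extra pair of parentheses, which yields T& for an lvalue expression of type T and plain T for a prvalue. This is only a sketch: the constant names are invented for the example, and foo is declared but never defined, which is fine here because it is only used inside unevaluated decltype operands.

```cpp
#include <type_traits>

int& foo();   // declaration only; never called, only inspected via decltype
int val = 0;

// decltype((expr)) is T& when expr is an lvalue and T when it is a prvalue.
constexpr bool call_is_lvalue = std::is_same<decltype((foo())), int&>::value;
constexpr bool name_is_lvalue = std::is_same<decltype((val)), int&>::value;
constexpr bool sum_is_prvalue = std::is_same<decltype((val + 0)), int>::value;

static_assert(call_is_lvalue, "foo() is an lvalue of type int");
static_assert(name_is_lvalue, "val is an lvalue of type int");
static_assert(sum_is_prvalue, "val + 0 is a prvalue of type int");
```

Both foo() and val come out as lvalues of type int, which is why they go through exactly the same lvalue-to-rvalue conversion on the right hand side of the assignment.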