C++ references have two properties:
They always point to the same object.
They can not be 0.
Pointers are the opposite:
They can point to different objects.
They can be 0.
Why is there no "non-nullable, reseatable reference or pointer" in C++? I can't think of a good reason why references shouldn't be reseatable.
Edit:
The question comes up often because I usually use references when I want to make sure that an "association" (I'm avoiding the words "reference" or "pointer" here) is never invalid.
I don't think I ever thought "great that this ref always refers to the same object". If references were reseatable, one could still get the current behavior like this:
int i = 3;
int& const j = i;
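Some compilers already accept this, but the const is meaningless (strictly, the standard only allows a cv-qualified reference when the qualifier comes in through a typedef or template argument, and then ignores it).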
I restate my question like this: "What was the rationale behind the 'a reference is the object' design? Why was it considered useful to have references always be the same object, instead of only when declared as const?"
Cheers, Felix
The reason that C++ does not allow you to rebind references is given in Stroustrup's "Design and Evolution of C++" :
It is not possible to change what a reference refers to after initialization. That is, once a C++ reference is initialized it cannot be made to refer to a different object later; it cannot be re-bound. I had in the past been bitten by Algol68 references where r1=r2 can either assign through r1 to the object referred to or assign a new reference value to r1 (re-binding r1) depending on the type of r2. I wanted to avoid such problems in C++.
In C++, it is often said that "the reference is the object". In one sense, it is true: though references are handled as pointers when the source code is compiled, the reference is intended to signify an object that is not copied when a function is called. Since references are not directly addressable (for example, references have no address of their own; & returns the address of the object), it would not semantically make sense to reassign them. Moreover, C++ already has pointers, which handle the semantics of re-setting.
Because then you'd have no non-reseatable type which can not be 0, unless you included 3 types of references/pointers, which would just complicate the language for very little gain. (And then why not add the 4th type too? A non-reseatable reference which can be 0?)
A better question may be, why would you want references to be reseatable? If they were, that would make them less useful in a lot of situations. It would make it harder for the compiler to do alias analysis.
It seems that the main reason references in Java or C# are reseatable is because they do the work of pointers. They point to objects. They are not aliases for an object.
What should the effect of the following be?
int i = 42;
int& j = i;
j = 43;
In C++ today, with non-reseatable references, it is simple. j is an alias for i, and i ends up with the value 43.
If references had been reseatable, then the third line would bind the reference j to a different value. It would no longer alias i, but instead the integer literal 43 (which isn't valid, of course). Or perhaps a simpler (or at least syntactically valid) example:
int i = 42;
int k = 43;
int& j = i;
j = k;
With reseatable references, j would point to k after evaluating this code.
With C++'s non-reseatable references, j still points to i, and i is assigned the value 43.
Making references reseatable changes the semantics of the language. The reference can no longer be an alias for another variable. Instead it becomes a separate type of value, with its own assignment operator. And then one of the most common usages of references would be impossible. And nothing would be gained in exchange. The newly gained functionality for references already existed in the form of pointers. So now we'd have two ways to do the same thing, and no way to do what references in the current C++ language do.
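The difference between assigning through a reference and reseating a pointer can be checked directly. This is only a minimal sketch; the function names assign_through_reference and reseat_pointer are made up for illustration and do not come from the answer above.

```cpp
#include <cassert>

// Hypothetical helper: runs the snippet above and returns the final value of i.
int assign_through_reference() {
    int i = 42;
    int k = 43;
    int& j = i;         // j is another name for i
    j = k;              // copies k's value into i; j is not rebound
    assert(&j == &i);   // taking j's address still yields i's address
    return i;
}

// The pointer counterpart: plain assignment reseats the pointer instead.
int reseat_pointer() {
    int i = 42;
    int k = 43;
    int* p = &i;
    p = &k;             // p now points at k; i is untouched
    assert(p == &k);
    return i;
}
```

assign_through_reference() returns 43 because the assignment went through to i, while reseat_pointer() returns 42 because only the pointer changed.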
A reference is not a pointer, it may be implemented as a pointer in the background, but its core concept is not equivalent to a pointer. A reference should be looked at like it *is* the object it is referring to. Therefore you cannot change it, and it cannot be NULL.
A pointer is simply a variable that holds a memory address. The pointer itself has a memory address of its own, and inside that memory address it holds another memory address that it is said to point to. A reference is not the same, it does not have an address of its own, and hence it cannot be changed to "hold" another address.
I think the parashift C++ FAQ on references says it best:
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object. It is not a pointer to the object, nor a copy of the object. It is the object.
and again in FAQ 8.5 :
Unlike a pointer, once a reference is bound to an object, it can not be "reseated" to another object. The reference itself isn't an object (it has no identity; taking the address of a reference gives you the address of the referent; remember: the reference is its referent).
A reseatable reference would be functionally identical to a pointer.
Concerning nullability: you cannot guarantee that such a "reseatable reference" is non-NULL at compile time, so any such test would have to take place at runtime. You could achieve this yourself by writing a smart pointer-style class template that throws an exception when initialised or assigned NULL:
struct null_pointer_exception { ... };
template<typename T>
struct non_null_pointer {
// No default ctor as it could only sensibly produce a NULL pointer
non_null_pointer(T* p) : _p(p) { die_if_null(); }
non_null_pointer(non_null_pointer const& nnp) : _p(nnp._p) {}
non_null_pointer& operator=(T* p) { _p = p; die_if_null(); return *this; }
non_null_pointer& operator=(non_null_pointer const& nnp) { _p = nnp._p; return *this; }
T& operator*() { return *_p; }
T const& operator*() const { return *_p; }
T* operator->() { return _p; }
// Allow implicit conversion to T* for convenience
operator T*() const { return _p; }
// You also need to implement operators for +, -, +=, -=, ++, --
private:
T* _p;
void die_if_null() const {
if (!_p) { throw null_pointer_exception(); }
}
};
This might be useful on occasion -- a function taking a non_null_pointer<int> parameter certainly communicates more information to the caller than does a function taking int*.
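For comparison, the standard library (since C++11) already ships something close to a "non-nullable, reseatable reference": std::reference_wrapper from <functional>. This is a different technique from the hand-written class above, shown only as a sketch; the function name rebind_with_reference_wrapper is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical demo: returns the final values of a and b.
std::pair<int, int> rebind_with_reference_wrapper() {
    int a = 10;
    int b = 20;
    std::reference_wrapper<int> r = std::ref(a);  // must be bound to an object
    r.get() = 11;                 // writes through to a
    r = std::ref(b);              // rebinds r to b; a is left alone
    r.get() += 1;                 // writes through to b
    assert(&r.get() == &b);       // r now refers to b
    return std::make_pair(a, b);
}
```

The price is the explicit .get() (or an implicit conversion to int&) whenever the referent is needed, which is exactly the extra syntax that distinguishes "assign through" from "rebind".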
Interestingly, many answers here are a bit fuzzy or even beside the point (e.g. it's not because references cannot be zero or similar; in fact, you can easily construct an example where a reference is zero).
The real reason why re-setting a reference is not possible is rather simple.
Pointers enable you to do two things: To change the value behind the pointer (either through the -> or the * operator), and to change the pointer itself (direct assign =). Example:
int a;
int * p = &a;
Changing the value requires dereferencing: *p = 42;
Changing the pointer: p = 0;
References allow you to only change the value. Why? Since there is no other syntax to express the re-set. Example:
int a = 10;
int b = 20;
int & r = a;
r = b; // re-set r to b, or set a to 20?
In other words, it would be ambiguous if you were allowed to re-set a reference. It makes even more sense when passing by reference:
void foo(int & r)
{
int b = 20;
r = b; // re-set r to b? or set a to 20?
}
int main()
{
int a = 10;
foo(a);
}
Hope that helps :-)
It would probably have been less confusing to name C++ references "aliases"? As others have mentioned, references in C++ should be thought of as the variable they refer to, not as a pointer/reference to the variable. As such, I can't think of a good reason they should be resettable.
When dealing with pointers, it often makes sense to allow null as a value (and otherwise, you probably want a reference instead). If you specifically want to disallow holding null, you could always code your own smart pointer type ;)
C++ references can sometimes be forced to be 0 with some compilers (it's just a bad idea to do so, and it violates the standard).
int &x = *((int*)0); // Undefined behavior, but some compilers accept it
EDIT: according to various people who know the standard much better than myself, the above code produces "undefined behavior". In at least some versions of GCC and Visual Studio, I've seen this do the expected thing: the equivalent of setting a pointer to NULL (and causes a NULL pointer exception when accessed).
You can't rebind a reference like this (the last line assigns 42 to theInt instead):
int theInt = 0;
int& refToTheInt = theInt;
int otherInt = 42;
refToTheInt = otherInt;
...for the same reason why secondInt and firstInt don't have the same value here:
int firstInt = 1;
int secondInt = 2;
secondInt = firstInt;
firstInt = 3;
assert( firstInt != secondInt );
This is not actually an answer, but a workaround for this limitation.
Basically, when you try to "rebind" a reference you are actually trying to use the same name to refer to a new value in the following context. In C++, this can be achieved by introducing a block scope.
In jalf's example
int i = 42;
int k = 43;
int& j = i;
//change i, or change j?
j = k;
if you want to change i, write it as above. However, if you want to change the meaning of j to mean k, you can do this:
int i = 42;
int k = 43;
int& j = i;
//change i, or change j?
//change j!
{
int& j = k;
//do whatever with j's new meaning
}
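A runnable version of the block-scope trick might look like the sketch below. The function name shadow_in_block and the values written are made up for illustration; note that many compilers warn about the shadowed name.

```cpp
#include <cassert>
#include <utility>

// Hypothetical demo: returns the final values of i and k.
std::pair<int, int> shadow_in_block() {
    int i = 42;
    int k = 43;
    int& j = i;
    j = 1;               // outer j: writes to i
    {
        int& j = k;      // a new reference named j, bound to k, hides the outer j
        j = 2;           // inner j: writes to k
        assert(&j == &k);
    }
    j = j + 10;          // outer j is visible again: writes to i
    assert(&j == &i);
    return std::make_pair(i, k);
}
```

Nothing is rebound here: two distinct references share a name, and the inner one simply ceases to exist at the closing brace.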
I would imagine that it is related to optimization.
Static optimization is much easier when you can know unambiguously what bit of memory a variable means. Pointers break this condition, and reseatable references would too.
Because sometimes things should not be re-pointable. (E.g., the reference to a Singleton.)
Because it's great in a function to know that your argument can't be null.
But mostly, because it allows us to have something that really is a pointer, but which acts like a local value object. C++ tries hard, to quote Stroustrup, to make class instances "do as the ints do". Passing an int by value is cheap, because an int fits into a machine register. Classes are often bigger than ints, and passing them by value has significant overhead.
Being able to pass a pointer (which is often the size of an int, or maybe two ints) that "looks like" a value object allows us to write cleaner code, without the "implementation detail" of dereferences. And, along with operator overloading, it allows us to write classes that use syntax similar to the syntax used with ints. In particular, it allows us to write template classes with syntax that can be equally applied to primitives (like ints) and classes (like a Complex number class).
And, with operator overloading especially, there are places where we should return an object, but again, it's much cheaper to return a pointer. Once again, returning a reference is our "out".
And pointers are hard. Not for you, maybe, and not to anyone that realizes a pointer is just the value of a memory address. But recalling my CS 101 class, they tripped up a number of students.
char* p = s; *p = *s; *p++ = *s++; i = ++*p;
can be confusing.
Heck, after 40 years of C, people still can't even agree if a pointer declaration should be:
char* p;
or
char *p;
I always wondered why they didn't make a reference assignment operator (say :=) for this.
Just to get on someone's nerves I wrote some code to change the target of a reference in a structure.
No, I do not recommend repeating my trick. It will break if ported to a sufficiently different architecture.
The fact that references in C++ are not nullable is a side-effect of them being just an alias.
I agree with the accepted answer.
But for constness, they behave much like pointers though.
struct A{
int y;
int& x;
A():y(0),x(y){}
};
int main(){
A a;
const A& ar=a;
ar.x++;
}
works.
See
Design reasons for the behavior of reference members of classes passed by const reference
There's a workaround if you want a member variable that's a reference and you want to be able to rebind it. While I find it useful and reliable, note that it uses some (very weak) assumptions on memory layout. It's up to you to decide whether it's within your coding standards.
#include <iostream>
#include <new> // placement new
struct Field_a_t
{
int& a_;
Field_a_t(int& a)
: a_(a) {}
Field_a_t& operator=(int& a)
{
// a_.~int(); // do this if you have a non-trivial destructor
new(this)Field_a_t(a);
return *this;
}
};
struct MyType : Field_a_t
{
char c_;
MyType(int& a, char c)
: Field_a_t(a)
, c_(c) {}
};
int main()
{
int i = 1;
int j = 2;
MyType x(i, 'x');
std::cout << x.a_;
x.a_ = 3;
std::cout << i;
((Field_a_t&)x) = j;
std::cout << x.a_;
x.a_ = 4;
std::cout << j;
}
This is not very efficient as you need a separate type for each reassignable reference field and make them base classes; also, there's a weak assumption here that a class having a single reference type won't have a __vfptr or any other type_id-related field that could potentially destroy runtime bindings of MyType. All the compilers I know satisfy that condition (and it would make little sense not doing so).
Being half serious: IMHO to make them little more different from pointers ;) You know that you can write:
MyClass & c = *new MyClass();
If you could also later write:
c = *new MyClass("other");
would it make sense to have any references alongside with pointers?
MyClass * a = new MyClass();
MyClass & b = *new MyClass();
a = new MyClass("other");
b = *new MyClass("another");
I was reading "Beginning C++ Through Game Programming, Fourth Edition" when I found a section about passing a reference as an argument to a function that says:
"Pass a reference only when you want to alter the value
of the argument variable. However, you should try to avoid changing
argument variables whenever possible."
Basically it says that I should avoid doing something like this:
void swap(int& x, int& y)
{
int temp = x;
x = y;
y = temp;
}
But, why should I avoid doing this?
I find it very useful to create functions that change variables, because these functions keep my code organized: I don't have to write every variable change in the main() function, and I don't have to write the same code repeatedly.
What alternative is there if I should not do this?
In the case of swap, pass by reference should be used, because multiple variables are being changed in the function and the expected behavior of the function is obvious. If you were only changing one variable, or have a function that should be returning a result, then it is better to return a value for the most part (with some exceptions). For example, suppose you had a function sum():
int sum(int a, int b)
{
return a + b;
}
Usage of this function might be:
int x = sum(1, 2); // Obvious and to the point
Since there is no need to change any of the variables, one would use pass by value here (Pass by const reference doesn't matter here, since POD types are pretty small). If you used pass by reference here, then the function would have changed to:
void sum(int a, int b, int& sum)
{
sum = a + b;
}
Usage of the function above might be:
int s;
sum(1, 2, s); // Obscure, unclear
Disadvantages of this are that it is unclear what the intent of the function is, and now the result of the function cannot be passed to another function in one line. So pass by reference should only be used when absolutely necessary.
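The composability point can be seen by summing three numbers both ways. This is only a sketch: sum_out, total_by_value and total_by_out_param are hypothetical names introduced here (sum_out is the out-parameter variant renamed so both versions can coexist).

```cpp
#include <cassert>

int sum(int a, int b) { return a + b; }

// Hypothetical name for the out-parameter variant shown above.
void sum_out(int a, int b, int& result) { result = a + b; }

// Returning a value lets calls nest in a single expression.
int total_by_value() {
    return sum(sum(1, 2), 3);
}

// The out-parameter version needs a named temporary for every step.
int total_by_out_param() {
    int partial = 0;
    sum_out(1, 2, partial);
    int total = 0;
    sum_out(partial, 3, total);
    return total;
}
```

Both functions compute 6; the difference is purely in how much bookkeeping the caller has to write.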
On a side note, using a const reference for passing large objects to functions is generally recommended, since it avoids a copy.
I got this question: "The function below may result in a run-time error. Why?"
The code is:
int& sub(int& a , int& b){
int c = a - b ;
return c ;
}
How can I write code in main so that there will be a run-time error?
Thanks!
As it's undefined behaviour, there is no guaranteed portable error.
But here is an example, betting on the fact that nested calls will cause the referred-to result to be overwritten.
int a = 5, b=4, c=2;
int r = msub(a, msub(b,c));
cout << "Should be 3: "<<r<<endl;
// output depends on compiler. I received 0, so incorrect!
Here is the online demo.
Needless to say, such errors are extremely nasty! What happens here?
The compiler first calls msub(b,c); the result is a reference to the former local variable on the stack. At that moment, there's a high probability that the computed value of 2 is still there, even though the local variable doesn't exist anymore.
Then the compiler calls msub(a, ...) using this reference. But this call will change the stack, overwriting the value that was referred to.
So there's no segfault, no horror (in this simple compiler-specific case), but the computed value is completely inaccurate.
I've tried to describe the general principle of the problem:
This function ALWAYS has undefined behavior once its result is used; it's just that you could get lucky and find your temp variable untouched on return (and you usually will find this code executing properly if you use it "simply," which I will explain below). Let's consider what you're doing in this function.
int& sub(int& a , int& b)
This code means take the reference to the integer a and the reference to the integer b, return a reference to some other integer.
int c = a - b;
Take the value of variable a minus the value of variable b, store this in the temporary stack variable c.
return c;
Return a reference to the temporary stack variable c.
I hope that you see what the problem here is. The variable c is a local variable, and when the function returns, it will go out of scope. So consider the following two invocations of the function:
int some_other_function(int& x, int& y)
{
int a,b,c;
a = b = c= -1;
return x+y;
}
....
int a, b, c, d;
....
a = sub(b,c); //First call
d = some_other_function(sub(b,c),a); //Second call
...
The first call will likely perform as expected, because the temp variable stored on the stack inside sub will likely be intact, there are no function calls between when sub finishes execution and when the value of the reference returned by sub is assigned to the variable a, meaning nothing changes the stack.
In the second case, however, the sub() call will perform as expected, and the reference it returns is passed to some_other_function(). But when some_other_function is called, it will trample on the same memory space used by sub(), meaning that it will almost certainly corrupt the temp memory location whose reference you took. The second call will fail, as long as sub(b,c) does not resolve to -1.
The easiest fix (and no doubt the fix you want, at least for this problem) is as follows:
int sub(int& a , int& b){
int c = a - b ;
return c ;
}
This will actually return a brand new integer not stored on the stack of sub(). Now, I will assume that you're asking this question because you're concerned about what happens with larger data structures, and you're trying to prevent excess memory copies when passing data by value instead of by reference. I say this because passing an integer by reference saves no memory space; your function would actually be more efficient if it looked like this:
int sub(int a , int b){
return a - b ;
}
But if you are trying to do this with a large data structure and attempting to eliminate excess copies by passing by reference instead of by value, you would want this:
class my_class;
my_class* sub(my_class& a, my_class& b)
{
my_class* c = new my_class(a-b);
return c;
}
And yes, you'd pretty much have to deal with dynamic memory allocation....this is the price of persistent data not stored on the stack.
You're returning a reference to a local variable, which is an error. The local variable goes out of scope as soon as the function exits.
One of the answers above mentions:
Now, I will assume that you're asking this question because you're concerned what happens with larger data structures, and you're trying to prevent excess memory copies
While this is indeed true, this will not be an issue in many cases, as most compilers will make use of return value optimization (RVO) / copy elision. Moreover, in modern C++ (C++11 and later) we enjoy move semantics. That means that if you have a type T for which move construction is implemented and subtraction is defined, you can write:
T sub(const T& a, const T& b)
{
T c = a - b; // creates a new object of type T
return c; // the move is implicit
}
You can then call the function like this:
// a, b already exists and are of type T
// c is move constructed as the right-hand side is a non-named r-value
T c = sub(a, b);
So even if T is an expensive-to-copy type, move semantics can help keep performance up without having to use pointers and references. Bjarne Stroustrup says in his book "The C++ Programming Language",
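One way to convince yourself that returning by value does not copy is to count copy-constructor calls. The sketch below uses a made-up type Big (standing in for T) and a hypothetical helper count_copies_in_sub; it assumes both operands hold the same number of elements.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical expensive-to-copy type that counts its copy constructions.
struct Big {
    std::vector<int> data;
    static int copies;

    explicit Big(std::vector<int> d) : data(std::move(d)) {}
    Big(const Big& other) : data(other.data) { ++copies; }
    Big(Big&& other) noexcept : data(std::move(other.data)) {}
};
int Big::copies = 0;

// Element-wise subtraction; assumes a and b have the same size.
Big operator-(const Big& a, const Big& b) {
    std::vector<int> out(a.data.size());
    for (std::size_t n = 0; n < out.size(); ++n) {
        out[n] = a.data[n] - b.data[n];
    }
    return Big(std::move(out));
}

Big sub(const Big& a, const Big& b) {
    Big c = a - b;  // new object, built from a temporary (elided or moved)
    return c;       // local returned by value: elided or moved, never copied
}

// Returns how many times Big's copy constructor ran during one call to sub.
int count_copies_in_sub() {
    Big a(std::vector<int>{5, 7, 9});
    Big b(std::vector<int>{1, 2, 3});
    Big::copies = 0;
    Big c = sub(a, b);
    assert(c.data[0] == 4 && c.data[1] == 5 && c.data[2] == 6);
    return Big::copies;
}
```

Whether the compiler elides the move or performs it, the copy constructor is not selected for either step, so the counter stays at zero.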
Unfortunately, overuse of new (and of pointers and references) seems to be an increasing problem.
If I have a C++ function declaration:
int func(const vector<int> a)
Would it always be beneficial to replace it with
int func(const vector<int> &a)
since the latter does not need to make a copy of a to pass into the function?
In general, yes. You should always pass large objects by reference (or pass a pointer to them, especially if you are using C).
In terms of efficiency like you're thinking, almost always yes. There are times where (purportedly) this may be slower, typically with types that are fundamental or small:
// copy x? fits in register: fast
void foo(const int x);
// reference x? requires dereferencing on typical implementations: slow
void foo(const int& x);
But with inlining this doesn't matter anyway, plus you can just type it by-value yourself; this only matters with generic template functions.
However it's important to note that your transformation may not always be valid, namely because your function gets its own copy of the data. Consider this simpler example:
void foo(const int x, int& y)
{
y += x;
y += x;
}
int v = 1;
foo(v, v); // results in v == 3
Make your transformation and you get:
void foo(const int& x, int& y)
{
y += x;
y += x;
}
int v = 1;
foo(v, v); // results in v == 4
Because even though you cannot write to x, it can be written to through other means. This is called aliasing. While probably not a concern with the example you've given (though global variables could still alias!), just be wary of the difference in principle.
Lastly, if you're going to make your own copy anyway, just do it in the parameter list; the compiler can optimize that for you, especially with C++11's rvalue references/move semantics.
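As a sketch of "do the copy in the parameter list": a function that needs its own modifiable copy can simply take the vector by value. The function name sorted_copy and the helper sorted_copy_leaves_input_alone are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Takes its argument by value: the caller's vector is copied (or moved, if
// the caller passes an rvalue) into `values`, which we are free to modify.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;  // a by-value parameter is moved out on return, not copied
}

// Hypothetical check that the caller's vector is left untouched.
bool sorted_copy_leaves_input_alone() {
    std::vector<int> input{3, 1, 2};
    std::vector<int> result = sorted_copy(input);
    return input == std::vector<int>{3, 1, 2} &&
           result == std::vector<int>{1, 2, 3};
}
```

A caller that no longer needs its vector can write sorted_copy(std::move(input)) and avoid the copy altogether, which a const-reference signature could not offer.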
Mostly it would be more efficient -- but if it happens that func needs to make its own copy of the vector and modify it destructively while it does whatever it does anyway, then you might as well save a few lines and let the language make the copy for you implicitly as a pass-by-value parameter. It is conceivable that the compiler might then be able to figure out that the copying can be omitted if the caller is not actually using its copy of the vector afterwards.
In short, yes. Since you can't modify a anyway, all your function body could do is make another copy, which you can just as well make from a const-reference.
Some reasons I can imagine the pass by value could be more efficient:
It can be better parallelized, because there's no aliasing: the original can change without affecting the value inside the function.
Better cache locality
Correct. Passing a reference will avoid a copy. You should make use of references when there's a copy involved and you don't actually need one. (Either because you don't intend to modify the value, in which case operating on the original is fine and you'd use a const reference, or because you do want to modify the original rather than a copy of it, in which case you'd use a non-const reference.)
This isn't limited to function arguments of course. For example, look at this function:
std::string foo();
Most people would use that function in this way:
std::string result = foo();
However, if you're not modifying result, this is way better:
const std::string& result = foo();
No copy is being made. Also, unlike a pointer, a const reference bound directly to the temporary returned by foo() extends that temporary's lifetime to the lifetime of the reference, so it will not go out of scope early (a pointer to a temporary is dangerous, while a const reference bound directly to one is safe).
The C++11 standard solves this problem with move semantics, but most existing code doesn't make use of this new feature yet, so using references wherever possible is a good habit to get into.
Also, note that you have to be careful about temporary lifetimes when binding temporaries to references, e.g.:
const int& f(const int& x)
{ return x; }
const int& y = f(23);
int z = y; /* OOPS */
The point being that the lifetime of the temporary int with value 23 doesn't extend beyond the end of the expression binding f(23) to y, so the attempt to assign y to z results in undefined behavior (due to the dangling reference).
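The difference between binding a reference directly to a temporary and binding it through a function can be observed with a type that counts live instances. This is a sketch; Tracker, make, pass_through and the two alive_when_... functions are all hypothetical names.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances currently exist.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    Tracker(const Tracker&) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

Tracker make() { return Tracker(); }

// Same shape as f above: hands back the reference it was given.
const Tracker& pass_through(const Tracker& t) { return t; }

// Bound directly: the temporary lives as long as the reference r.
int alive_when_bound_directly() {
    const Tracker& r = make();
    return Tracker::alive;   // the temporary still exists here
}

// Bound through a function: the temporary dies at the end of the
// declaration statement, so r dangles and must not be read.
int alive_when_bound_through_function() {
    const Tracker& r = pass_through(make());
    return Tracker::alive;   // the temporary is already gone
}
```

Lifetime extension only applies when the reference is initialised with the temporary itself; once the temporary has been passed through another reference, the compiler no longer connects the two. Compilers may warn about the unused or dangling r here.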
Note that when you're dealing with small POD types (Plain Old Data), like int or char, you don't win anything by avoiding a copy. Usually a reference is just as big as an int or long int (usually as big as a pointer), so passing an int by reference costs about the same as copying the int itself.
C++ references have two properties:
They always point to the same object.
They can not be 0.
Pointers are the opposite:
They can point to different objects.
They can be 0.
Why is there no "non-nullable, reseatable reference or pointer" in C++? I can't think of a good reason why references shouldn't be reseatable.
Edit:
The question comes up often because I usually use references when I want to make sure that an "association" (I'm avoiding the words "reference" or "pointer" here) is never invalid.
I don't think I ever thought "great that this ref always refers to the same object". If references were reseatable, one could still get the current behavior like this:
int i = 3;
int& const j = i;
This is already legal C++, but meaningless.
I restate my question like this: "What was the rationale behind the 'a reference is the object' design? Why was it considered useful to have references always be the same object, instead of only when declared as const?"
Cheers, Felix
The reason that C++ does not allow you to rebind references is given in Stroustrup's "Design and Evolution of C++" :
It is not possible to change what a reference refers to after initialization. That is, once a C++ reference is initialized it cannot be made to refer to a different object later; it cannot be re-bound. I had in the past been bitten by Algol68 references where r1=r2 can either assign through r1 to the object referred to or assign a new reference value to r1 (re-binding r1) depending on the type of r2. I wanted to avoid such problems in C++.
In C++, it is often said that "the reference is the object". In one sense, it is true: though references are handled as pointers when the source code is compiled, the reference is intended to signify an object that is not copied when a function is called. Since references are not directly addressable (for example, references have no address, & returns the address of the object), it would not semantically make sense to reassign them. Moreover, C++ already has pointers, which handles the semantics of re-setting.
Because then you'd have no reseatable type which can not be 0. Unless, you included 3 types of references/pointers. Which would just complicate the language for very little gain (And then why not add the 4th type too? Non-reseatable reference which can be 0?)
A better question may be, why would you want references to be reseatable? If they were, that would make them less useful in a lot of situations. It would make it harder for the compiler to do alias analysis.
It seems that the main reason references in Java or C# are reseatable is because they do the work of pointers. They point to objects. They are not aliases for an object.
What should the effect of the following be?
int i = 42;
int& j = i;
j = 43;
In C++ today, with non-reseatable references, it is simple. j is an alias for i, and i ends up with the value 43.
If references had been reseatable, then the third line would bind the reference j to a different value. It would no longer alias i, but instead the integer literal 43 (which isn't valid, of course). Or perhaps a simpler (or at least syntactically valid) example:
int i = 42;
int k = 43;
int& j = i;
j = k;
With reseatable references. j would point to k after evaluating this code.
With C++'s non-reseatable references, j still points to i, and i is assigned the value 43.
Making references reseatable changes the semantics of the language. The reference can no longer be an alias for another variable. Instead it becomes a separate type of value, with its own assignment operator. And then one of the most common usages of references would be impossible. And nothing would be gained in exchange. The newly gained functionality for references already existed in the form of pointers. So now we'd have two ways to do the same thing, and no way to do what references in the current C++ language do.
A reference is not a pointer, it may be implemented as a pointer in the background, but its core concept is not equivalent to a pointer. A reference should be looked at like it *is* the object it is referring to. Therefore you cannot change it, and it cannot be NULL.
A pointer is simply a variable that holds a memory address. The pointer itself has a memory address of its own, and inside that memory address it holds another memory address that it is said to point to. A reference is not the same, it does not have an address of its own, and hence it cannot be changed to "hold" another address.
I think the parashift C++ FAQ on references says it best:
Important note: Even though a
reference is often implemented using
an address in the underlying assembly
language, please do not think of a
reference as a funny looking pointer
to an object. A reference is the
object. It is not a pointer to the
object, nor a copy of the object. It
is the object.
and again in FAQ 8.5 :
Unlike a pointer, once a reference is
bound to an object, it can not be
"reseated" to another object. The
reference itself isn't an object (it
has no identity; taking the address of
a reference gives you the address of
the referent; remember: the reference
is its referent).
A reseatable reference would be functionally identical to a pointer.
Concerning nullability: you cannot guarantee that such a "reseatable reference" is non-NULL at compile time, so any such test would have to take place at runtime. You could achieve this yourself by writing a smart pointer-style class template that throws an exception when initialised or assigned NULL:
struct null_pointer_exception { ... };
template<typename T>
struct non_null_pointer {
// No default ctor as it could only sensibly produce a NULL pointer
non_null_pointer(T* p) : _p(p) { die_if_null(); }
non_null_pointer(non_null_pointer const& nnp) : _p(nnp._p) {}
non_null_pointer& operator=(T* p) { _p = p; die_if_null(); }
non_null_pointer& operator=(non_null_pointer const& nnp) { _p = nnp._p; }
T& operator*() { return *_p; }
T const& operator*() const { return *_p; }
T* operator->() { return _p; }
// Allow implicit conversion to T* for convenience
operator T*() const { return _p; }
// You also need to implement operators for +, -, +=, -=, ++, --
private:
T* _p;
void die_if_null() const {
if (!_p) { throw null_pointer_exception(); }
}
};
This might be useful on occasion -- a function taking a non_null_pointer<int> parameter certainly communicates more information to the caller than does a function taking int*.
Intrestingly, many answers here are a bit fuzzy or even beside the point (e.g. it's not because references cannot be zero or similar, in fact, you can easily construct an example where a reference is zero).
The real reason why re-setting a reference is not possible is rather simple.
Pointers enable you to do two things: To change the value behind the pointer (either through the -> or the * operator), and to change the pointer itself (direct assign =). Example:
int a;
int * p = &a;
Changing the value requires dereferencing: *p = 42;
Changing the pointer: p = 0;
References allow you to only change the value. Why? Since there is no other syntax to express the re-set. Example:
int a = 10;
int b = 20;
int & r = a;
r = b; // re-set r to b, or set a to 20?
In other words, it would be ambiguous if you were allowed to re-set a reference. It makes even more sense when passing by reference:
void foo(int & r)
{
int b = 20;
r = b; // re-set r to b? or set a to 20?
}
int main()
{
int a = 10;
foo(a);
}
Hope that helps :-)
It would probably have been less confusing to name C++ references "aliases"? As others have mentioned, references in C++ should be thought of as the variable they refer to, not as a pointer/reference to the variable. As such, I can't think of a good reason they should be resettable.
When dealing with pointers, it often makes sense to allow null as a value (and otherwise, you probably want a reference instead). If you specifically want to disallow holding null, you could always code your own smart pointer type ;)
C++ references can sometimes be forced to be 0 with some compilers (it's just a bad idea to do so*, and it violates the standard*).
int &x = *((int*)0); // Illegal but some compilers accept it
EDIT: according to various people who know the standard much better than myself, the above code produces "undefined behavior". In at least some versions of GCC and Visual Studio, I've seen this do the expected thing: the equivalent of setting a pointer to NULL (and causes a NULL pointer exception when accessed).
You can't rebind a reference like this (the last line assigns 42 to theInt instead):
int theInt = 0;
int& refToTheInt = theInt;
int otherInt = 42;
refToTheInt = otherInt;
...for the same reason why secondInt and firstInt don't have the same value here:
int firstInt = 1;
int secondInt = 2;
secondInt = firstInt;
firstInt = 3;
assert( firstInt != secondInt );
This is not actually an answer, but a workaround for this limitation.
Basically, when you try to "rebind" a reference you are actually trying to use the same name to refer to a new value in the following context. In C++, this can be achieved by introducing a block scope.
In jalf's example
int i = 42;
int k = 43;
int& j = i;
//change i, or change j?
j = k;
if you want to change i, write it as above. However, if you want to change the meaning of j to mean k, you can do this:
int i = 42;
int k = 43;
int& j = i;
//change i, or change j?
//change j!
{
int& j = k;
//do whatever with j's new meaning
}
I would imagine that it is related to optimization.
Static optimization is much easier when you can know unambiguously what bit of memory a variable means. Pointers break this condition, and re-settable references would too.
Because sometimes things should not be re-pointable. (E.g., the reference to a Singleton.)
Because it's great in a function to know that your argument can't be null.
But mostly, because it allows us to have something that really is a pointer, but which acts like a local value object. C++ tries hard, to quote Stroustrup, to make class instances "do as the ints do". Passing an int by value is cheap, because an int fits into a machine register. Classes are often bigger than ints, and passing them by value has significant overhead.
Being able to pass a pointer (which is often the size of an int, or maybe two ints) that "looks like" a value object allows us to write cleaner code, without the "implementation detail" of dereferences. And, along with operator overloading, it allows us to write classes that use syntax similar to the syntax used with ints. In particular, it allows us to write template classes with syntax that can be equally applied to primitives (like ints) and classes (like a Complex number class).
And, with operator overloading especially, there are places where we should return an object, but again, it's much cheaper to return a pointer. Once again, returning a reference is our "out".
And pointers are hard. Not for you, maybe, and not to anyone that realizes a pointer is just the value of a memory address. But recalling my CS 101 class, they tripped up a number of students.
char* p = s; *p = *s; *p++ = *s++; i = ++*p;
can be confusing.
Heck, after 40 years of C, people still can't even agree if a pointer declaration should be:
char* p;
or
char *p;
I always wondered why they didn't make a reference assignment operator (say :=) for this.
Just to get on someone's nerves I wrote some code to change the target of a reference in a structure.
No, I do not recommend repeating my trick. It will break if ported to a sufficiently different architecture.
The fact that references in C++ are not nullable is a side-effect of them being just an alias.
I agree with the accepted answer.
But for constness, they behave much like pointers though.
struct A{
int y;
int& x;
A():y(0),x(y){}
};
int main(){
A a;
const A& ar=a;
ar.x++;
}
works.
See
Design reasons for the behavior of reference members of classes passed by const reference
There's a workaround if you want a member variable that's a reference and you want to be able to rebind it. While I find it useful and reliable, note that it uses some (very weak) assumptions on memory layout. It's up to you to decide whether it's within your coding standards.
#include <iostream>
#include <new> // placement new
struct Field_a_t
{
int& a_;
Field_a_t(int& a)
: a_(a) {}
Field_a_t& operator=(int& a)
{
// a_.~int(); // do this if you have a non-trivial destructor
new(this)Field_a_t(a); // re-construct in place so a_ binds to the new int
return *this;
}
};
struct MyType : Field_a_t
{
char c_;
MyType(int& a, char c)
: Field_a_t(a)
, c_(c) {}
};
int main()
{
int i = 1;
int j = 2;
MyType x(i, 'x');
std::cout << x.a_;
x.a_ = 3;
std::cout << i;
((Field_a_t&)x) = j;
std::cout << x.a_;
x.a_ = 4;
std::cout << j;
}
This is not very efficient as you need a separate type for each reassignable reference field and make them base classes; also, there's a weak assumption here that a class having a single reference type won't have a __vfptr or any other type_id-related field that could potentially destroy runtime bindings of MyType. All the compilers I know satisfy that condition (and it would make little sense not doing so).
Being half serious: IMHO to make them a little more different from pointers ;) You know that you can write:
MyClass & c = *new MyClass();
If you could also later write:
c = *new MyClass("other");
would it make sense to have any references alongside with pointers?
MyClass * a = new MyClass();
MyClass & b = *new MyClass();
a = new MyClass("other");
b = *new MyClass("another");