What is the difference between a pointer variable and a reference variable?
A pointer can be re-assigned:
int x = 5;
int y = 6;
int *p;
p = &x;
p = &y;
*p = 10;
assert(x == 5);
assert(y == 10);
A reference cannot be re-bound, and must be bound at initialization:
int x = 5;
int y = 6;
int &q; // error
int &r = x;
A pointer variable has its own identity: a distinct, visible memory address that can be taken with the unary & operator and a certain amount of space that can be measured with the sizeof operator. Using those operators on a reference returns a value corresponding to whatever the reference is bound to; the reference’s own address and size are invisible. Since the reference assumes the identity of the original variable in this way, it is convenient to think of a reference as another name for the same variable.
int x = 0;
int &r = x;
int *p = &x;
int *p2 = &r;
assert(p == p2); // &x == &r
assert(&p != &p2);
You can have arbitrarily nested pointers to pointers offering extra levels of indirection. References only offer one level of indirection.
int x = 0;
int y = 0;
int *p = &x;
int *q = &y;
int **pp = &p;
**pp = 2;
pp = &q; // *pp is now q
**pp = 4;
assert(y == 4);
assert(x == 2);
A pointer can be assigned nullptr, whereas a reference must be bound to an existing object. If you try hard enough, you can bind a reference to nullptr, but this is undefined and will not behave consistently.
/* the code below is undefined; your compiler may optimise it
* differently, emit warnings, or outright refuse to compile it */
int &r = *static_cast<int *>(nullptr);
// prints "null" under GCC 10
std::cout
<< (&r != nullptr
? "not null" : "null")
<< std::endl;
bool f(int &r) { return &r != nullptr; }
// prints "not null" under GCC 10
std::cout
<< (f(*static_cast<int *>(nullptr))
? "not null" : "null")
<< std::endl;
You can, however, have a reference to a pointer whose value is nullptr.
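As a small sketch of that last point, a reference can alias a pointer variable that happens to hold nullptr. The reference itself is perfectly valid; only the pointer it names is null. The helper names is_null and point_at are made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: the parameter is a reference to a pointer, so it
// aliases the caller's pointer variable, which may legitimately be null.
inline bool is_null(int *const &ptr_ref) { return ptr_ref == nullptr; }

// Hypothetical helper: reseats the caller's pointer through the reference.
inline void point_at(int *&ptr_ref, int &target) { ptr_ref = &target; }
```

Calling is_null(p) with int *p = nullptr; is well-defined and returns true, and point_at(p, v) then makes p point at v.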
Pointers can iterate over an array: you can use ++ to advance to the next element and + 4 to reach the fifth element, regardless of the size of the object the pointer points to.
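A minimal sketch of pointer iteration, assuming plain built-in arrays. The function names sum_with_pointer and element_at are invented for this example. Note that the same ++ and + work for int and double alike, because the step is scaled by the element size.

```cpp
#include <cassert>
#include <cstddef>

// Walks a pointer across `count` ints and adds them up.
inline long sum_with_pointer(const int *first, std::size_t count) {
    long total = 0;
    for (const int *it = first; it != first + count; ++it) {
        total += *it;  // dereference to read the current element
    }
    return total;
}

// Returns the element `offset` positions past `first`; with offset 4
// this is the fifth element, whatever sizeof(double) happens to be.
inline double element_at(const double *first, std::size_t offset) {
    return *(first + offset);
}
```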
A pointer needs to be dereferenced with * to access the memory location it points to, whereas a reference can be used directly. A pointer to a class/struct uses -> to access its members, whereas a reference uses the . operator.
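To make the syntax difference concrete, here is a small sketch with a made-up Point struct. Both functions read the same members; only the access syntax differs.

```cpp
#include <cassert>

struct Point {  // hypothetical type for illustration
    int x;
    int y;
};

// Through a pointer: p->x, or equivalently (*p).y.
inline int sum_via_pointer(const Point *p) { return p->x + (*p).y; }

// Through a reference: plain member access with the dot operator.
inline int sum_via_reference(const Point &r) { return r.x + r.y; }
```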
References cannot be put into an array, whereas pointers can be (Mentioned by user #litb)
Const references can be bound to temporaries. Pointers cannot (not without some indirection):
const int &x = int(12); // legal C++
int *y = &int(12); // illegal to take the address of a temporary.
This makes const & more convenient to use in argument lists and so forth.
What's a C++ reference (for C programmers)
A reference can be thought of as a constant pointer (not to be confused with a pointer to a constant value!) with automatic indirection, i.e. the compiler will apply the * operator for you.
All references must be initialized or compilation will fail, and a valid program always binds them to an actual object rather than to null. It's neither possible to get the address of a reference (the address operator returns the address of the referenced value instead) nor is it possible to do arithmetic on references.
C programmers might dislike C++ references as it will no longer be obvious when indirection happens or if an argument gets passed by value or by pointer without looking at function signatures.
C++ programmers might dislike using pointers as they are considered unsafe - although references aren't really any safer than constant pointers except in the most trivial cases - lack the convenience of automatic indirection and carry a different semantic connotation.
Consider the following statement from the C++ FAQ:
Even though a reference is often implemented using an address in the
underlying assembly language, please do not think of a reference as a
funny looking pointer to an object. A reference is the object. It is
not a pointer to the object, nor a copy of the object. It is the
object.
But if a reference really were the object, how could there be dangling references? In unmanaged languages, it's impossible for references to be any 'safer' than pointers - there generally just isn't a way to reliably alias values across scope boundaries!
Why I consider C++ references useful
Coming from a C background, C++ references may look like a somewhat silly concept, but one should still use them instead of pointers where possible: Automatic indirection is convenient, and references become especially useful when dealing with RAII - but not because of any perceived safety advantage, but rather because they make writing idiomatic code less awkward.
RAII is one of the central concepts of C++, but it interacts non-trivially with copying semantics. Passing objects by reference avoids these issues as no copying is involved. If references were not present in the language, you'd have to use pointers instead, which are more cumbersome to use, thus violating the language design principle that the best-practice solution should be easier than the alternatives.
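As a rough illustration of that interaction, the sketch below uses a hypothetical non-copyable RAII class named Resource. Because copying is deleted, the only convenient way to hand it to a function is by reference (or by pointer, with the extra syntax that implies).

```cpp
#include <cassert>

// Hypothetical RAII type: it bumps a counter on construction and
// decrements it on destruction. Copying is disabled, since two owners
// releasing the same resource would be a bug.
class Resource {
public:
    explicit Resource(int &live_count) : live_count_(live_count) { ++live_count_; }
    ~Resource() { --live_count_; }
    Resource(const Resource &) = delete;
    Resource &operator=(const Resource &) = delete;
    int live() const { return live_count_; }

private:
    int &live_count_;
};

// Taking the parameter by reference involves no copy, so this compiles
// even though Resource cannot be copied.
inline int inspect(const Resource &r) { return r.live(); }
```

Changing inspect to take Resource by value would fail to compile when called with an existing Resource object, which is exactly the copying issue the paragraph above describes.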
If you want to be really pedantic, there is one thing you can do with a reference that you can't do with a pointer: extend the lifetime of a temporary object. In C++ if you bind a const reference to a temporary object, the lifetime of that object becomes the lifetime of the reference.
std::string s1 = "123";
std::string s2 = "456";
std::string s3_copy = s1 + s2;
const std::string& s3_reference = s1 + s2;
In this example s3_copy copies the temporary object that is a result of the concatenation. Whereas s3_reference in essence becomes the temporary object. It's really a reference to a temporary object that now has the same lifetime as the reference.
If you try this without the const it should fail to compile. You cannot bind a non-const reference to a temporary object, nor can you take its address for that matter.
Apart from syntactic sugar, a reference is a const pointer (not pointer to a const). You must establish what it refers to when you declare the reference variable, and you cannot change it later.
Update: now that I think about it some more, there is an important difference.
A const pointer's target can be replaced by taking its address and using a const cast.
A reference's target cannot be replaced in any way short of UB.
This should permit the compiler to do more optimization on a reference.
Contrary to popular opinion, it is possible to have a reference that is NULL.
int * p = NULL;
int & r = *p;
r = 1; // crash! (if you're lucky)
Granted, it is much harder to do with a reference - but if you manage it, you'll tear your hair out trying to find it. References are not inherently safe in C++!
Technically this is an invalid reference, not a null reference. C++ doesn't support null references as a concept as you might find in other languages. There are other kinds of invalid references as well. Any invalid reference raises the spectre of undefined behavior, just as using an invalid pointer would.
The actual error is in the dereferencing of the NULL pointer, prior to the assignment to a reference. But I'm not aware of any compilers that will generate any errors on that condition - the error propagates to a point further along in the code. That's what makes this problem so insidious. Most of the time, if you dereference a NULL pointer, you crash right at that spot and it doesn't take much debugging to figure it out.
My example above is short and contrived. Here's a more real-world example.
class MyClass
{
...
virtual void DoSomething(int,int,int,int,int);
};
void Foo(const MyClass & bar)
{
...
bar.DoSomething(i1,i2,i3,i4,i5); // crash occurs here due to memory access violation - obvious why?
}
MyClass * GetInstance()
{
if (somecondition)
return NULL;
...
}
MyClass * p = GetInstance();
Foo(*p);
I want to reiterate that the only way to get a null reference is through malformed code, and once you have it you're getting undefined behavior. It never makes sense to check for a null reference; for example you can try if(&bar==NULL)... but the compiler might optimize the statement out of existence! A valid reference can never be NULL so from the compiler's view the comparison is always false, and it is free to eliminate the if clause as dead code - this is the essence of undefined behavior.
The proper way to stay out of trouble is to avoid dereferencing a NULL pointer to create a reference. Here's an automated way to accomplish this.
template<typename T>
T& deref(T* p)
{
if (p == NULL)
throw std::invalid_argument(std::string("NULL reference"));
return *p;
}
MyClass * p = GetInstance();
Foo(deref(p));
For an older look at this problem from someone with better writing skills, see Null References from Jim Hyslop and Herb Sutter.
For another example of the dangers of dereferencing a null pointer see Exposing undefined behavior when trying to port code to another platform by Raymond Chen.
You forgot the most important part:
member-access with pointers uses ->
member-access with references uses .
foo.bar is clearly superior to foo->bar in the same way that vi is clearly superior to Emacs :-)
References are very similar to pointers, but they are specifically crafted to be helpful to optimizing compilers.
References are designed such that it is substantially easier for the compiler to trace which reference aliases which variables. Two major features are very important: no "reference arithmetic" and no reassigning of references. These allow the compiler to figure out which references alias which variables at compile time.
References are allowed to refer to variables which do not have memory addresses, such as those the compiler chooses to put into registers. If you take the address of a local variable, it is very hard for the compiler to put it in a register.
As an example:
void maybeModify(int& x); // may modify x in some way
void hurtTheCompilersOptimizer(short size, int array[])
{
// This function is designed to do something particularly troublesome
// for optimizers. It will constantly call maybeModify on array[0] while
// adding array[1] to array[2]..array[size-1]. There's no real reason to
// do this, other than to demonstrate the power of references.
for (int i = 2; i < (int)size; i++) {
maybeModify(array[0]);
array[i] += array[1];
}
}
An optimizing compiler may realize that we are accessing array[0] and array[1] quite a bunch. It would love to optimize the algorithm to:
void hurtTheCompilersOptimizer(short size, int array[])
{
    // Do the same thing as above, but instead of accessing array[1]
    // all the time, access it once and keep the result in a register,
    // which is much faster to do arithmetic with.
    int a0 = array[0]; // imagine a0 and a1 living in registers
    int a1 = array[1]; // access array[1] once
    for (int i = 2; i < (int)size; i++) {
        maybeModify(a0); // Give maybeModify a reference to a register
        array[i] += a1; // Use the saved register value over and over
    }
    array[0] = a0; // Store the modified array[0] back into the array
}
To make such an optimization, it needs to prove that nothing can change array[1] during the call. This is rather easy to do. i is never less than 2, so array[i] can never refer to array[1]. maybeModify() is given a0 as a reference (aliasing array[0]). Because there is no "reference" arithmetic, the compiler just has to prove that maybeModify never gets the address of x, and it has proven that nothing changes array[1].
It also has to prove that there is no way a future call could read or write array[0] while we have a temporary register copy of it in a0. This is often trivial to prove, because in many cases it is obvious that the reference is never stored in a permanent structure like a class instance.
Now do the same thing with pointers
void maybeModify(int* x); // May modify x in some way
void hurtTheCompilersOptimizer(short size, int array[])
{
// Same operation, only now with pointers, making the
// optimization trickier.
for (int i = 2; i < (int)size; i++) {
maybeModify(&(array[0]));
array[i] += array[1];
}
}
The behavior is the same; only now it is much harder to prove that maybeModify does not ever modify array[1], because we already gave it a pointer; the cat is out of the bag. Now it has to do the much more difficult proof: a static analysis of maybeModify to prove it never writes to &x + 1. It also has to prove that it never saves off a pointer that can refer to array[0], which is just as tricky.
Modern compilers are getting better and better at static analysis, but it is always nice to help them out and use references.
Of course, barring such clever optimizations, compilers will indeed turn references into pointers when needed.
EDIT: Five years after posting this answer, I found an actual technical difference where references are different than just a different way of looking at the same addressing concept. References can modify the lifespan of temporary objects in a way that pointers cannot.
F createF(int argument);
void extending()
{
const F& ref = createF(5);
std::cout << ref.getArgument() << std::endl;
}
Normally temporary objects such as the one created by the call to createF(5) are destroyed at the end of the expression. However, by binding that object to a reference, ref, C++ will extend the lifespan of that temporary object until ref goes out of scope.
Actually, a reference is not really like a pointer.
A compiler keeps "references" to variables, associating a name with a memory address; that's its job to translate any variable name to a memory address when compiling.
When you create a reference, you only tell the compiler that you assign another name to the pointer variable; that's why references cannot "point to null", because a variable cannot be, and not be.
Pointers are variables; they contain the address of some other variable, or can be null. The important thing is that a pointer has a value, while a reference only has a variable that it is referencing.
Now some explanation of real code:
int a = 0;
int& b = a;
Here you are not creating another variable that points to a; you are just adding another name to the memory content holding the value of a. This memory now has two names, a and b, and it can be addressed using either name.
void increment(int& n)
{
n = n + 1;
}
int a = 0;
increment(a);
When calling a function, the compiler usually generates memory spaces for the arguments to be copied to. The function signature defines the spaces that should be created and gives the name that should be used for these spaces. Declaring a parameter as a reference just tells the compiler to use the input variable memory space instead of allocating a new memory space during the method call. It may seem strange to say that your function will be directly manipulating a variable declared in the calling scope, but remember that when executing compiled code, there is no more scope; there is just plain flat memory, and your function code could manipulate any variables.
Now there may be some cases where your compiler may not be able to know the reference when compiling, like when using an extern variable. So a reference may or may not be implemented as a pointer in the underlying code. But in the examples I gave you, it will most likely not be implemented with a pointer.
A reference can never be NULL.
There is a semantic difference that may appear esoteric if you are not familiar with studying computer languages in an abstract or even academic fashion.
At the highest level, the idea of references is that they are transparent "aliases". Your computer may use an address to make them work, but you're not supposed to worry about that: you're supposed to think of them as "just another name" for an existing object, and the syntax reflects that. They are stricter than pointers, so your compiler can more reliably warn you when you are about to create a dangling reference than when you are about to create a dangling pointer.
Beyond that, there are of course some practical differences between pointers and references. The syntax to use them is obviously different, and you cannot "re-seat" references, have references to nothingness, or have pointers to references.
While both references and pointers are used to indirectly access another value, there are two important differences between references and pointers. The first is that a reference always refers to an object: It is an error to define a reference without initializing it. The behavior of assignment is the second important difference: Assigning to a reference changes the object to which the reference is bound; it does not rebind the reference to another object. Once initialized, a reference always refers to the same underlying object.
Consider these two program fragments. In the first, we assign one pointer to another:
int ival = 1024, ival2 = 2048;
int *pi = &ival, *pi2 = &ival2;
pi = pi2; // pi now points to ival2
After the assignment, ival, the object previously addressed by pi, remains unchanged. The assignment changes the value of pi, making it point to a different object. Now consider a similar program that assigns two references:
int &ri = ival, &ri2 = ival2;
ri = ri2; // assigns ival2 to ival
This assignment changes ival, the value referenced by ri, and not the reference itself. After the assignment, the two references still refer to their original objects, and the value of those objects is now the same as well.
A reference is an alias for another variable whereas a pointer holds the memory address of a variable. References are generally used as function parameters so that the passed object is not the copy but the object itself.
void fun(int &a, int &b); // A common usage of references.
int a = 0;
int &b = a; // b is an alias for a. Not so common to use.
The direct answer
What is a reference in C++? Some specific instance of type that is not an object type.
What is a pointer in C++? Some specific instance of type that is an object type.
From the ISO C++ definition of object type:
An object type is a (possibly cv-qualified) type that is not a function type, not a reference type, and not cv void.
It may be important to know, object type is a top-level category of the type universe in C++. Reference is also a top-level category. But pointer is not.
Pointers and references are mentioned together in the context of compound types. This is basically due to the nature of the declarator syntax inherited from (and extended beyond) C, which has no references. (Besides, there is more than one kind of reference declarator since C++11, while pointers are still "unityped": &+&& vs. *.) So drafting a language specification by "extension", in a style similar to C's, is somewhat reasonable in this context. (I would still argue that the declarator syntax wastes a lot of syntactic expressiveness and frustrates both human users and implementations. Thus, none of it deserves to be built into a new language design. This is a totally different topic about PL design, though.)
Otherwise, it is insignificant that pointers and references can be grouped together as a specific sort of type. They simply share too few common properties besides the syntactic similarity, so there is no need to put them together in most cases.
Note that the statements above only mention "pointers" and "references" as types. There are some interesting questions about their instances (like variables), and there are also many misconceptions.
The differences of the top-level categories can already reveal many concrete differences not tied to pointers directly:
Object types can have top-level cv qualifiers. References cannot.
Variables of object types occupy storage as per the abstract machine semantics. References do not necessarily occupy storage (see the section about misconceptions below for details).
...
A few more special rules on references:
Compound declarators are more restrictive on references.
References can collapse.
Special rules on && parameters (as the "forwarding references") based on reference collapsing during template parameter deduction allow "perfect forwarding" of parameters.
References have special rules in initialization. The lifetime of a variable declared with a reference type can differ from that of ordinary objects via lifetime extension.
BTW, a few other contexts, like initialization involving std::initializer_list, follow similar rules of reference lifetime extension. It is another can of worms.
...
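Reference collapsing and forwarding references from the list above can be sketched as follows. The overload set category and the wrapper forward_to_category are hypothetical names chosen for this example; std::forward is the standard utility from <utility>.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overloads that report which value category they received.
inline std::string category(const std::string &) { return "lvalue"; }
inline std::string category(std::string &&) { return "rvalue"; }

// T&& is a forwarding reference here. For an lvalue argument, T deduces
// to std::string&, and std::string& && collapses to std::string&. For an
// rvalue argument, T deduces to std::string and the parameter is
// std::string&&. std::forward<T> restores the original category.
template <typename T>
std::string forward_to_category(T &&arg) {
    return category(std::forward<T>(arg));
}
```

Without std::forward, arg would always be treated as an lvalue inside the wrapper, because a named variable is an lvalue even when its type is an rvalue reference.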
The misconceptions
Syntactic sugar
I know references are syntactic sugar, so code is easier to read and write.
Technically, this is plain wrong. References are not syntactic sugar for any other feature in C++, because they cannot be exactly replaced by other features without semantic differences.
(Similarly, lambda-expressions are not syntactic sugar for any other feature in C++, because they cannot be precisely simulated: "unspecified" properties such as the declaration order of the captured variables may matter, since the initialization order of such variables can be significant.)
C++ has only a few kinds of syntactic sugar in this strict sense. One instance (inherited from C) is the built-in (non-overloaded) operator [], which is defined to have exactly the same semantics as a specific combination of the built-in unary * and binary + operators.
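The operator [] case can be checked directly. This is only a sketch for built-in arrays and pointers; the function name is made up.

```cpp
#include <cassert>

// For built-in arrays and pointers, a[i] is defined as *(a + i), so both
// the values and the addresses must agree.
inline bool subscript_matches_pointer_form(const int *a, int i) {
    return a[i] == *(a + i) && &a[i] == a + i;
}
```

Because binary + is commutative here, the odd-looking i[a] is also accepted for built-in types and means the same thing.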
Storage
So, a pointer and a reference both use the same amount of memory.
The statement above is simply wrong. To avoid such misconceptions, look at the ISO C++ rules instead:
From [intro.object]/1:
... An object occupies a region of storage in its period of construction, throughout its lifetime, and in its period of destruction. ...
From [dcl.ref]/4:
It is unspecified whether or not a reference requires storage.
Note these are semantic properties.
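One observable consequence, sketched below with a hypothetical 64-byte struct named Big: sizeof and unary & applied to a reference report on the referenced object, so the program cannot measure whatever storage (if any) the reference itself uses.

```cpp
#include <cassert>

struct Big {  // hypothetical type, 64 bytes of payload
    char bytes[64];
};

// sizeof applied to a reference type yields the size of the referenced
// type, not the size of a pointer.
static_assert(sizeof(Big &) == sizeof(Big), "sizeof sees through the reference");
static_assert(sizeof(char &) == 1, "not the size of a pointer");

// Taking the address of a reference yields the address of the referent.
inline bool same_address(const Big &r, const Big *p) { return &r == p; }
```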
Pragmatics
Even though pointers do not belong together with references in the sense of the language design, there are still arguments that make the choice between them debatable in some other contexts, for example, when choosing parameter types.
But this is not the whole story. I mean, there are more things to consider than pointers vs. references.
If you don't have to stick to such over-specific choices, in most cases the answer is short: you do not need pointers, so you don't use them. Pointers are usually bad enough because they imply too many things you don't expect, and they rely on too many implicit assumptions that undermine the maintainability and (even) portability of the code. Unnecessarily relying on pointers is definitely bad style and should be avoided in modern C++. Reconsider your purpose and you will find that the pointer is the feature of last resort in most cases.
Sometimes the language rules explicitly require specific types to be used. If you want to use these features, obey the rules.
Copy constructors require specific types of cv-& reference type as the 1st parameter type. (And usually it should be const qualified.)
Move constructors require specific types of cv-&& reference type as the 1st parameter type. (And usually there should be no qualifiers.)
Specific overloads of operators require reference or non reference types. For example:
Overloaded operator= as special member functions requires reference types similar to 1st parameter of copy/move constructors.
Postfix ++ requires dummy int.
...
If you know pass-by-value (i.e. using non-reference types) is sufficient, use it directly, particularly when using an implementation supporting C++17 mandated copy elision. (Warning: However, to exhaustively reason about the necessity can be very complicated.)
If you want to operate some handles with ownership, use smart pointers like unique_ptr and shared_ptr (or even with homebrew ones by yourself if you require them to be opaque), rather than raw pointers.
If you are doing some iterations over a range, use iterators (or some ranges which are not provided by the standard library yet), rather than raw pointers unless you are convinced raw pointers will do better (e.g. for less header dependencies) in very specific cases.
If you know pass-by-value is sufficient and you want some explicit nullable semantics, use wrapper like std::optional, rather than raw pointers.
If you know pass-by-value is not ideal for the reasons above, and you don't want nullable semantics, use {lvalue, rvalue, forwarding}-references.
Even when you do want semantics like a traditional pointer, there is often something more appropriate, like observer_ptr in the Library Fundamentals TS.
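Two of the guidelines above (smart pointers for ownership, references for non-null non-owning access) might look like this in practice. Widget, make_widget and name_length are hypothetical names; the sketch sticks to C++11 library features.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

struct Widget {  // hypothetical type for illustration
    std::string name;
};

// Ownership is expressed in the return type: the caller owns the Widget
// and it is destroyed automatically when the unique_ptr goes away.
inline std::unique_ptr<Widget> make_widget(const std::string &name) {
    return std::unique_ptr<Widget>(new Widget{name});
}

// Non-owning access that must not be null: take a reference.
inline std::size_t name_length(const Widget &w) { return w.name.size(); }
```

A caller would write auto w = make_widget("gear"); and then name_length(*w), with no raw owning pointer appearing in user code.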
The only exceptions cannot be worked around in the current language:
When you are implementing smart pointers above, you may have to deal with raw pointers.
Specific language-interoperation routines require pointers, like operator new. (However, cv-void* is still quite different from, and safer than, ordinary object pointers, because it rules out unexpected pointer arithmetic unless you are relying on some non-conforming extension on void* like GNU's.)
Function pointers can be converted from lambda expressions without captures, while function references cannot. You have to use function pointers in non-generic code for such cases, even if you deliberately do not want nullable values.
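The last exception in the list can be sketched like this. IntFn, apply_twice and make_incrementer are illustrative names only; the implicit conversion from a captureless lambda to a function pointer is standard since C++11.

```cpp
#include <cassert>

using IntFn = int (*)(int);  // plain function pointer type

// Non-generic code that accepts a callback as a function pointer.
inline int apply_twice(IntFn fn, int value) { return fn(fn(value)); }

// A lambda without captures converts implicitly to a function pointer.
inline IntFn make_incrementer() {
    return [](int x) { return x + 1; };
}
```

A lambda that captures anything would not convert, and there is no corresponding conversion to a function reference type such as int (&)(int).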
So, in practice, the answer is obvious: when in doubt, avoid pointers. You have to use pointers only when there are very explicit reasons that nothing else is more appropriate. Apart from the few exceptional cases mentioned above, such choices are almost always not purely C++-specific (but likely to be language-implementation-specific). Such instances can be:
You have to serve to old-style (C) APIs.
You have to meet the ABI requirements of specific C++ implementations.
You have to interoperate at runtime with different language implementations (including various assemblies, language runtime and FFI of some high-level client languages) based on assumptions of specific implementations.
You have to improve efficiency of the translation (compilation & linking) in some extreme cases.
You have to avoid symbol bloat in some extreme cases.
Language neutrality caveats
If you come to see the question via some Google search result (not specific to C++), this is very likely to be the wrong place.
References in C++ are quite "odd", as they are essentially not first-class: they are treated as the objects or functions being referred to, so they have no chance to support some first-class operations, like being the left operand of the member access operator independently of the type of the referred object. Other languages may or may not have similar restrictions on their references.
References in C++ will likely not preserve their meaning across different languages. For example, references in general do not imply non-null properties on values like they do in C++, so such assumptions may not work in some other languages (and you will find counterexamples quite easily, e.g. Java, C#, ...).
There can still be some common properties among references in different programming languages in general, but let's leave it for some other questions in SO.
(A side note: the question may be significant earlier than any "C-like" languages are involved, like ALGOL 68 vs. PL/I.)
It doesn't matter how much space it takes up since you can't actually see any side effect (without executing code) of whatever space it would take up.
On the other hand, one major difference between references and pointers is that temporaries assigned to const references live until the const reference goes out of scope.
For example:
class scope_test
{
public:
~scope_test() { printf("scope_test done!\n"); }
};
...
{
const scope_test &test= scope_test();
printf("in scope\n");
}
will print:
in scope
scope_test done!
This is the language mechanism that allows ScopeGuard to work.
This is based on the tutorial. What is written makes it more clear:
>>> The address that locates a variable within memory is
what we call a reference to that variable. (5th paragraph at page 63)
>>> The variable that stores the reference to another
variable is what we call a pointer. (3rd paragraph at page 64)
Simply to remember that,
>>> reference stands for memory location
>>> pointer is a reference container (Maybe because we will use it for
several times, it is better to remember that reference.)
What's more, as almost any pointer tutorial explains, a pointer is an object that supports pointer arithmetic, which makes a pointer similar to an array.
Look at the following statement,
int Tom(0);
int & alias_Tom = Tom;
alias_Tom can be understood as an alias of the variable Tom (unlike typedef, which creates an alias of a type). It is also OK to forget that the terminology for such a statement is "creating a reference to Tom".
A reference is not another name given to some memory. It's an immutable pointer that is automatically dereferenced on usage. Basically it boils down to:
int& j = i;
It internally becomes
int* const j = &i;
A reference to a pointer is possible in C++, but the reverse is not: a pointer to a reference isn't possible. A reference to a pointer provides a cleaner syntax to modify the pointer.
Look at this example:
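#include <iostream>
using namespace std;
// Swaps two strings by swapping the pointers themselves
void swap(const char * &str1, const char * &str2)
{
    const char *temp = str1;
    str1 = str2;
    str2 = temp;
}
int main()
{
    // String literals are const in C++, so the pointers must be const char *
    const char *str1 = "Hi";
    const char *str2 = "Hello";
    swap(str1, str2);
    cout<<"str1 is "<<str1<<endl;
    cout<<"str2 is "<<str2<<endl;
    return 0;
}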
And consider the C version of the above program. In C you have to use pointer to pointer (multiple indirection), and it leads to confusion and the program may look complicated.
#include<stdio.h>
/* Swaps strings by swapping pointers */
void swap1(char **str1_ptr, char **str2_ptr)
{
char *temp = *str1_ptr;
*str1_ptr = *str2_ptr;
*str2_ptr = temp;
}
int main()
{
char *str1 = "Hi";
char *str2 = "Hello";
swap1(&str1, &str2);
printf("str1 is %s, str2 is %s", str1, str2);
return 0;
}
Visit the following for more information about reference to pointer:
C++: Reference to Pointer
Pointer-to-Pointer and Reference-to-Pointer
As I said, a pointer to a reference isn't possible. Try the following program:
#include <iostream>
using namespace std;
int main()
{
int x = 10;
int *ptr = &x;
int &*ptr1 = ptr; // error: a pointer to a reference cannot be declared
}
There is one fundamental difference between pointers and references that I didn't see anyone mention: references enable pass-by-reference semantics in function arguments. Pointers, although it is not visible at first, do not: they only provide pass-by-value semantics. This has been very nicely described in this article.
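A short sketch of what that means in practice (all three function names are invented here): a pointer parameter is itself a copy, so reseating it inside the function is invisible to the caller, while a reference parameter aliases the caller's variable.

```cpp
#include <cassert>

// The pointer is passed by value: assigning to `p` changes only the
// local copy, so the caller's pointer is untouched.
inline void reseat_by_value(int *p, int *other) { p = other; }

// The pointer is passed by reference: assigning to `p` changes the
// caller's pointer variable.
inline void reseat_by_reference(int *&p, int *other) { p = other; }

// Writing through the copied pointer still reaches the pointee, which is
// why pointers can look like pass-by-reference at first glance.
inline void set_through_pointer(int *p, int value) { *p = value; }
```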
Regards,
&rzej
I use references unless I need either of these:
Null pointers can be used as a sentinel value, often a cheap way to avoid function overloading or use of a bool.
You can do arithmetic on a pointer. For example, p += offset;
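Both uses can be sketched in a few lines. The functions find_first and divide are hypothetical; the first returns nullptr as a "not found" sentinel and relies on pointer arithmetic, the second uses a null default to make an output parameter optional instead of adding an overload or a bool flag.

```cpp
#include <cassert>
#include <cstddef>

// Returns a pointer to the first match, or nullptr if there is none.
inline const int *find_first(const int *first, std::size_t count, int wanted) {
    for (const int *it = first; it != first + count; ++it) {
        if (*it == wanted) return it;
    }
    return nullptr;  // sentinel: nothing found
}

// `remainder` is optional: callers pass nullptr (the default) to skip it.
inline int divide(int a, int b, int *remainder = nullptr) {
    if (remainder != nullptr) *remainder = a % b;
    return a / b;
}
```

Neither signature can be expressed with a plain reference, since a reference has no "nothing here" state.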
At the risk of adding to the confusion, I want to throw in some input. I'm sure it mostly depends on how the compiler implements references, but in the case of gcc, the idea that a reference can only point to a variable on the stack is not actually correct. Take this for example:
#include <iostream>
#include <string>
int main(int argc, char** argv) {
// Create a string on the heap
std::string *str_ptr = new std::string("THIS IS A STRING");
// Dereference the string on the heap, and assign it to the reference
std::string &str_ref = *str_ptr;
// Not even a compiler warning! At least with gcc
// Now let's try to print its value!
std::cout << str_ref << std::endl;
// It works! Now let's print and compare actual memory addresses
std::cout << str_ptr << " : " << &str_ref << std::endl;
// Exactly the same, now remember to free the memory on the heap
delete str_ptr;
}
Which outputs this:
THIS IS A STRING
0xbb2070 : 0xbb2070
If you notice even the memory addresses are exactly the same, meaning the reference is successfully pointing to a variable on the heap! Now if you really want to get freaky, this also works:
#include <iostream>
#include <string>
int main(int argc, char** argv) {
// Dereference the result of new immediately and bind it to the reference
std::string &str_ref = *(new std::string("THIS IS A STRING"));
// Once again, it works! (at least in gcc)
std::cout << str_ref;
// It prints fine, but we have no pointer to the heap allocation, right? So how do we free the space we just carelessly allocated?
delete &str_ref;
/* And it works, because we take the address that the reference
stores and delete it, which is all a pointer does; we just have to specify
the address with '&', whereas a pointer does that implicitly. This is like
calling delete &(*str_ptr); (which also compiles and runs fine). */
}
Which outputs this:
THIS IS A STRING
Therefore a reference IS a pointer under the hood: both just store a memory address, and where that address points is irrelevant. What do you think would happen if I called std::cout << str_ref; AFTER calling delete &str_ref? It compiles fine, but the behavior is undefined, typically showing up as a segmentation fault at runtime, because the reference no longer refers to a valid object. We essentially have a dangling reference that still exists (until it falls out of scope) but is useless.
In other words, a reference is nothing but a pointer with the pointer mechanics abstracted away, making it safer and easier to use (no accidental pointer math, no mixing up '.' and '->', etc.), assuming you don't try any nonsense like my examples above ;)
Now, regardless of how a compiler handles references, there will always be some kind of pointer under the hood unless the reference is optimised away, because a reference must refer to a specific object at a specific memory address for it to work as expected. There is no getting around this (hence the term 'reference').
The only major rule to remember with references is that they must be initialized at the point of declaration. The one exception is a reference that is a class member: it must be initialized in the constructor's member initializer list, because once the containing object has been constructed it is too late to bind it.
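The constructor case mentioned above can be sketched as follows, assuming the reference is a class member. Counter and bump_twice are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical class used only for illustration.
class Counter {
public:
    // The reference member must be bound in the member initializer list;
    // assigning inside the constructor body would be too late.
    explicit Counter(int &target) : target_(target) {}
    void bump() { ++target_; }  // modifies the referenced int
private:
    int &target_;
};

// Binds a Counter to a local variable and increments it twice.
int bump_twice(int start) {
    int value = start;
    Counter c(value);
    c.bump();
    c.bump();
    return value;
}
```

Omitting target_ from the initializer list is a compile error, which is how the language enforces the rule.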
Remember, my examples above are just that: examples demonstrating what a reference is. You would never want to use a reference in those ways! For proper usage of a reference, there are plenty of answers here already that hit the nail on the head.
Another difference is that you can have pointers to a void type (and it means pointer to anything) but references to void are forbidden.
int a;
void * p = &a; // ok
void & p = a; // forbidden
I can't say I'm really happy with this particular difference. I would much prefer that it be allowed, meaning a reference to anything with an address, with otherwise the same behavior as other references. It would make it possible to define equivalents of C library functions like memcpy using references.
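Since void& is not available, a function template taking T& can cover part of that use case. The following is only a sketch of the idea, not a standard facility; copy_object and Point are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the bytes of `src` into `dst`.
template <class T>
void copy_object(T &dst, const T &src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

struct Point { int x; int y; };  // hypothetical example type
```

Unlike a void* interface, the template keeps the static type, so the size is deduced and mismatched argument types are rejected at compile time.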
Also, a reference that is a parameter to a function that is inlined may be handled differently than a pointer.
void increment(int *ptrint) { (*ptrint)++; }
void increment(int &refint) { refint++; }
void incptrtest()
{
int testptr=0;
increment(&testptr);
}
void increftest()
{
int testref=0;
increment(testref);
}
When inlining the pointer version, many compilers will actually force a write to memory (we are taking the address explicitly). However, they will leave the reference in a register, which is more optimal.
Of course, for functions that are not inlined, the pointer and the reference generate the same code, and it is generally better to pass built-in types by value than by reference if they are not modified and returned by the function.
Another interesting use of references is to supply a default argument of a user-defined type:
class UDT
{
public:
UDT() : val_d(33) {};
UDT(int val) : val_d(val) {};
virtual ~UDT() {};
private:
int val_d;
};
class UDT_Derived : public UDT
{
public:
UDT_Derived() : UDT() {};
virtual ~UDT_Derived() {};
};
class Behavior
{
public:
Behavior(
const UDT &udt = UDT()
) {};
};
int main()
{
Behavior b; // take default
UDT u(88);
Behavior c(u);
UDT_Derived ud;
Behavior d(ud);
return 0;
}
The default flavor uses the 'bind const reference to a temporary' aspect of references.
This program might help in understanding the answer to the question. It is a simple program with a reference "j" and a pointer "ptr", both referring to the variable "x".
#include<iostream>
using namespace std;
int main()
{
int *ptr=0, x=9; // pointer and variable declaration
ptr=&x; // pointer to variable "x"
int & j=x; // reference declaration; reference to variable "x"
cout << "x=" << x << endl;
cout << "&x=" << &x << endl;
cout << "j=" << j << endl;
cout << "&j=" << &j << endl;
cout << "*ptr=" << *ptr << endl;
cout << "ptr=" << ptr << endl;
cout << "&ptr=" << &ptr << endl;
return 0;
}
Run the program and have a look at the output and you'll understand.
Also, spare 10 minutes and watch this video: https://www.youtube.com/watch?v=rlJrrGV0iOg
I feel like there is yet another point that hasn't been covered here.
Unlike pointers, references are syntactically equivalent to the object they refer to, i.e. any operation that can be applied to an object works for a reference, and with the exact same syntax (the exception is of course the initialization).
While this may appear superficial, I believe this property is crucial for a number of C++ features, for example:
Templates. Since template parameters are duck-typed, the syntactic properties of a type are all that matter, so often the same template can be used with both T and T& (or std::reference_wrapper<T>, which still relies on an implicit conversion to T&).
Templates that cover both T& and T&& are even more common.
Lvalues. Consider the statement str[0] = 'X'; Without references it would only work for c-strings (char* str). Returning the character by reference allows user-defined classes to have the same notation.
Copy constructors. Syntactically it makes sense to pass objects to copy constructors, and not pointers to objects. But there is just no way for a copy constructor to take an object by value - it would result in a recursive call to the same copy constructor. This leaves references as the only option here.
Operator overloads. With references it is possible to introduce indirection to an operator call - say, operator+(const T& a, const T& b) while retaining the same infix notation. This also works for regular overloaded functions.
These points empower a considerable part of C++ and the standard library so this is quite a major property of references.
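The lvalue and operator-overload points can be sketched with a small class. SmallText is a hypothetical type invented for this illustration, with a fixed capacity chosen arbitrarily.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size text buffer, for illustration only.
class SmallText {
public:
    explicit SmallText(const char *s) {
        std::size_t i = 0;
        for (; i < kCapacity && s[i] != '\0'; ++i) data_[i] = s[i];
        size_ = i;
    }
    // Returning char& makes `text[0] = 'X'` assign into the buffer.
    char &operator[](std::size_t i) { return data_[i]; }
    const char &operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    static constexpr std::size_t kCapacity = 16;
    char data_[kCapacity] = {};
    std::size_t size_ = 0;
};

// Operands are taken by const reference, yet callers still write a == b.
bool operator==(const SmallText &a, const SmallText &b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) return false;
    return true;
}
```

With a pointer-based design the call sites would need *text.at(0) = 'X' and equals(&a, &b), losing the notation that built-in types have.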
A reference is a const pointer. int * const a = &b is the same as int& a = b. This is why there is no such thing as a const reference, because it is already const, whereas a reference to const corresponds to const int * const a. When you compile using -O0, the compiler will place the address of b on the stack in both situations, and as a member of a class, it will also be present in the object on the stack/heap exactly as if you had declared a const pointer. With -Ofast, the compiler is free to optimise this out: a const pointer and a reference are both optimised away.
Unlike a const pointer, there is no way to take the address of the reference itself, as it will be interpreted as the address of the variable it references. Because of this, on -Ofast, the const pointer representing the reference (the address of the variable being referenced) will always be optimised off the stack, but if the program absolutely needs the address of an actual const pointer (the address of the pointer itself, not the address it points to) i.e. you print the address of the const pointer, then the const pointer will be placed on the stack so that it has an address.
Otherwise it is identical, for example when you print the address it points to:
// Program 1: pointer
#include <iostream>
int main() {
int a = 1;
int* b = &a;
std::cout << b;
}
// Program 2: reference (compiled separately)
#include <iostream>
int main() {
int a = 1;
int& b = a;
std::cout << &b;
}
Both programs produce the same assembly output:
-Ofast:
main:
sub rsp, 24
mov edi, OFFSET FLAT:_ZSt4cout
lea rsi, [rsp+12]
mov DWORD PTR [rsp+12], 1
call std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<void const*>(void const*)
xor eax, eax
add rsp, 24
ret
--------------------------------------------------------------------
-O0:
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-12], 1
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov rsi, rax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(void const*)
mov eax, 0
leave
ret
The pointer has been optimised off the stack, and the pointer isn't even dereferenced on -Ofast in both cases, instead it uses a compile time value.
As members of an object they are identical on -O0 through -Ofast.
#include <iostream>
int b=1;
struct A {int* i=&b; int& j=b;};
A a;
int main() {
std::cout << &a.j << &a.i;
}
The address of b is stored twice in the object.
a:
.quad b
.quad b
mov rax, QWORD PTR a[rip+8] //&a.j
mov esi, OFFSET FLAT:a //&a.i
When you pass by reference at -O0, you pass the address of the referenced variable, so it is identical to passing by pointer (that is, passing the address the const pointer contains). At -Ofast, the compiler optimises this away when the call can be inlined, because the dynamic scope is known. In the out-of-line function definition, however, the parameter is always dereferenced as a pointer (the function expects the address of the referenced variable), because the function may be called from another translation unit where the dynamic scope is unknown to the compiler. The exception is a function declared static: it cannot be used outside the translation unit, so the compiler can pass the argument by value as long as the function does not modify it through the reference. If the function does modify it, the address of the referenced variable is passed, and at -Ofast this happens in a register, kept off the stack, if the calling convention has enough volatile registers.
There is a very important non-technical difference between pointers and references: An argument passed to a function by pointer is much more visible than an argument passed to a function by non-const reference. For example:
void fn1(std::string s);
void fn2(const std::string& s);
void fn3(std::string& s);
void fn4(std::string* s);
void bar() {
std::string x;
fn1(x); // Cannot modify x
fn2(x); // Cannot modify x (without const_cast)
fn3(x); // CAN modify x!
fn4(&x); // Can modify x (but is obvious about it)
}
Back in C, a call that looks like fn(x) can only be passed by value, so it definitely cannot modify x; to modify an argument you would need to pass a pointer fn(&x). So if an argument wasn't preceded by an & you knew it would not be modified. (The converse, & means modified, was not true because you would sometimes have to pass large read-only structures by const pointer.)
Some argue that this is such a useful feature when reading code, that pointer parameters should always be used for modifiable parameters rather than non-const references, even if the function never expects a nullptr. That is, those people argue that function signatures like fn3() above should not be allowed. Google's C++ style guidelines are an example of this.
Maybe some metaphors will help:
In the context of your desktop screenspace -
A reference requires you to specify an actual window.
A pointer requires the location of a piece of space on screen that you assure it will contain zero or more instances of that window type.
Difference between pointer and reference
A pointer can be initialized to 0 and a reference cannot. A reference must refer to an object, but a pointer can be the null pointer:
int* p = 0;
But we can’t write int& p = 0; or int& p = 5;.
To do it properly, we must first declare and define an object, and then we can make a reference to that object, so the correct version of the previous code is:
int x = 0;
int y = 5;
int& p = x;
int& p1 = y;
Another important point is that a pointer can be declared without initialization, whereas a reference must always be bound to a variable or object when it is declared. Using an uninitialized pointer is risky, though, so we generally check whether the pointer actually points to something. With a reference no such check is necessary, because binding it to an object at declaration is mandatory.
Another difference is that a pointer can be made to point to another object, whereas a reference always refers to the same object. Take this example:
int a = 6, b = 5;
int& rf = a;
cout << rf << endl; // Prints 6, because rf refers to a.
rf = b;
cout << a << endl; // Prints 5: the value of b is stored at the address of a, overwriting the former value of a.
Another point: a class template such as an STL container returns a reference, not a pointer, from operator[], which makes it easy to read or assign a value:
std::vector<int> v(10); // Initialize a vector with 10 elements
v[5] = 5; // Writes the value 5 into the 6th element of the vector. If operator[] returned a pointer instead of a reference, we would have to write *v[5] = 5; with a reference, plain assignment with "=" overwrites the element.
Some key pertinent details about references and pointers
Pointers
Pointer variables are declared using the unary suffix declarator operator *
Pointer objects are assigned an address value, for example, by assignment to an array object, the address of an object using the & unary prefix operator, or assignment to the value of another pointer object
A pointer can be reassigned any number of times, pointing to different objects
A pointer is a variable that holds the assigned address. It takes up storage in memory equal to the size of the address for the target machine architecture
A pointer can be mathematically manipulated, for instance, by the increment or addition operators. Hence, one can iterate with a pointer, etc.
To get or set the contents of the object referred to by a pointer, one must use the unary prefix operator * to dereference it
References
References must be initialized when they are declared.
References are declared using the unary suffix declarator operator &.
When initializing a reference, one uses the name of the object to which they will refer directly, without the need for the unary prefix operator &
Once initialized, references cannot be pointed to something else by assignment or arithmetical manipulation
There is no need to dereference the reference to get or set the contents of the object it refers to
Assignment operations on the reference manipulate the contents of the object it points to (after initialization), not the reference itself (does not change where it points to)
Arithmetic operations on the reference manipulate the contents of the object it points to, not the reference itself (does not change where it points to)
In pretty much all implementations, the reference is actually stored as an address in memory of the referred to object. Hence, it takes up storage in memory equal to the size of the address for the target machine architecture just like a pointer object
Even though pointers and references are implemented in much the same way "under-the-hood," the compiler treats them differently, resulting in all the differences described above.
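One of the compiler-level differences summarised above can be checked directly: sizeof applied to a reference reports the size of the referred-to type, while sizeof applied to a pointer reports the size of the pointer. Big and the two function names are hypothetical, introduced only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

struct Big { char bytes[64]; };  // hypothetical type, larger than a typical pointer

// sizeof applied to a reference yields the size of the referenced type.
std::size_t size_via_reference(Big &r) { return sizeof(r); }

// sizeof applied to a pointer yields the size of the pointer itself.
std::size_t size_via_pointer(Big *p) { return sizeof(p); }
```

This holds even when the reference parameter is passed as an address internally, because the storage of the reference itself is not observable from the language.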
Article
A recent article I wrote that goes into much greater detail than I can show here and should be very helpful for this question, especially about how things happen in memory:
Arrays, Pointers and References Under the Hood In-Depth Article
After reading about possible ways of rebinding a reference in C++, which should be illegal, I found a particularly ugly way of doing it. The reason I think the reference really gets rebound is because it does not modify the original referenced value, but the memory of the reference itself. After some more researching, I found a reference is not guaranteed to have memory, but when it does have, we can try to use the code:
#include <cstddef> // offsetof
#include <iostream>
using namespace std;
template<class T>
class Reference
{
public:
T &r;
Reference(T &r) : r(r) {}
};
int main(void)
{
int five = 5, six = 6;
Reference<int> reference(five);
cout << "reference value is " << reference.r << " at memory " << &reference.r << endl;
// Used offsetof macro for simplicity, even though its support is conditional in C++ as warned by GCC. Anyway, the macro can be hard-coded
*(reinterpret_cast<int**>(reinterpret_cast<char*>(&reference) + offsetof(Reference<int>, r))) = &six;
cout << "reference value changed to " << reference.r << " at memory " << &reference.r << endl;
// The value of five still exists in memory and remains untouched
cout << "five value is still " << five << " at memory " << &five << endl;
}
A sample output using GCC 8.1, but also tested in MSVC, is:
reference value is 5 at memory 0x7ffd1b4eb6b8
reference value changed to 6 at memory 0x7ffd1b4eb6bc
five value is still 5 at memory 0x7ffd1b4eb6b8
The questions are:
Is the method above considered undefined behavior? Why?
Can we technically say the reference gets rebound, even though it should be illegal?
In a practical situation, when the code has already worked using a specific compiler in a specific machine, is the code above portable (guaranteed to work in every operational system and every processor), assuming we use the same compiler version?
Above code has undefined behavior. The result of your reinterpret_cast<int**>(…) does not actually point to an object of type int*, yet you dereference and overwrite the stored value of the hypothetical int* object at that location, violating at least the strict aliasing rule in the process [basic.lval]/11. In reality, there is not even an object of any type at that location (references are not objects)…
Exactly one reference is being bound in your code and that happens when the constructor of Reference initializes the member r. At no point is a reference being rebound to another object. This simply appears to work due to the fact that the compiler happens to implement your reference member via a field that stores the address of the object the reference is referring to, which happens to be located at the location your invalid pointer happens to point to…
Apart from that, I would have my doubts whether it's even legal to use offsetof on a reference member to begin with. Even if it is, that part of your code would at best be conditionally-supported with effectively implementation-defined behavior [support.types.layout]/1, since your class Reference is not a standard-layout class [class.prop]/3.1 (it has a member of reference type).
Since your code has undefined behavior, it cannot possibly be portable…
As shown in the other answer, your code has UB. A reference cannot be re-bound. This is by language design, and no matter what kind of casting trickery you try, you cannot get around that: you will still end up with UB.
But you can have re-binding reference semantics with std::reference_wrapper:
int a = 24;
int b = 11;
auto r = std::ref(a); // bind r to a
r.get() = 5; // a is changed to 5
r = b; // re-bind r to b
r.get() = 13; // b is changed to 13
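Because std::reference_wrapper is an ordinary copyable object, it can also be stored in containers, which plain references cannot. A minimal sketch, where add_to_all is a hypothetical helper name:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: adds `delta` to every int referred to by `refs`.
void add_to_all(std::vector<std::reference_wrapper<int>> &refs, int delta) {
    for (int &value : refs) {  // each wrapper converts implicitly to int&
        value += delta;
    }
}
```

The vector holds wrappers, not copies of the ints, so the original variables are the ones modified.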
References can be rebound legally, if you jump through the right hoops:
#include <new>
#include <cassert>
struct ref {
int& value;
};
void test() {
int x = 1, y = 2;
ref r{x};
assert(&r.value == &x);
// overwrite the memory of r with a new ref referring to y.
ref* rebound_r_ptr = std::launder(new (&r) ref{y});
// rebound_r_ptr points to r, but you really have to use it.
// using r directly could give old value.
assert(&rebound_r_ptr->value == &y);
}
Edit: godbolt link. You can tell that it works because the function always returns 1.
In C it's possible to change a const through a pointer, like so:
//mainc.c
#include <stdio.h>
int main(int argc, char** argv) {
const int i = 5;
const int *cpi = &i;
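printf(" 5:\n");
printf("%p\n", (void*)&i); // %p is the correct conversion for addresses
printf("%d\n", i);
printf("%p\n", (void*)cpi);
printf("%d\n", *cpi);
*((int*)cpi) = 8; // undefined behavior: i was defined const
printf(" 8?:\n");
printf("%p\n", (void*)&i);
printf("%d\n", i);
printf("%p\n", (void*)cpi);
printf("%d\n", *cpi);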
}
The constant is changed as can be seen in the output:
If we try the same in C++:
//main.cpp
#include <iostream>
using std::cout;
using std::endl;
int main(int argc, char** argv) {
const int i = 5;
const int *cpi = &i;
cout << " 5:" << '\n';
cout << &i << '\n';
cout << i << '\n';
cout << cpi << '\n';
cout << *cpi << '\n';
*((int*)cpi) = 8;
cout << " 8?:" << '\n';
cout << &i << '\n';
cout << i << '\n';
cout << cpi << '\n';
cout << *cpi << '\n';
int* addr = (int*)0x28ff24;
cout << *addr << '\n';
}
The result is not so clear:
From the output it looks like i is still 5 and is still located at 0x28ff24, so the const is unchanged. But at the same time cpi is also 0x28ff24 (the same as &i), yet the value it points to is 8 (not 5).
Can someone please explain what kind of magic is happening here?
Explained here: https://stackoverflow.com/a/41098196/2277240
The behaviour on casting away const from a variable (even via a pointer or a reference in C++) that was originally declared as const, and then subsequently attempting to change the variable through that pointer or reference, is undefined.
So changing i if it's declared as const int i = 5; is undefined behaviour: the output you are observing is a manifestation of that.
It is undefined behavior as per C11 6.7.3/6:
If an attempt is made to modify an object defined with a
const-qualified type through use of an lvalue with non-const-qualified
type, the behavior is undefined.
(C++ will have a similar normative text.)
And since it is undefined behavior, anything can happen. Including: weird output, program crashes, "seems to work fine" (this build).
The rule of const_cast<Type *>() or a C-style cast (Type *):
The conversion removes const from the pointer's type, NOT from the pointed-to value (object) itself.
const Type i = 1;
// p is a variable, i is an object
const Type * p = &i; // i is const --- const is a property of i; you can't remove it
(Type *)p; // removes the const from the type of p, not the const of i ---- the result is a pointer to non-const, but i is ALWAYS const!
Now if you try to change the value of i through p, it's undefined behavior because i is ALWAYS const.
When to use this kind of conversion?
1) If you can make sure that the pointed value is NOT const.
e.g.
int j = 1;
const int *p = &j;
*(int *)p = 2; // You can change the value of j because j is NOT const
2) The pointed value is const but you ONLY read it and NEVER change it.
If you really need to change a const value, please redesign your code to avoid this kind of case.
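The two acceptable cases can be sketched with const_cast, the C++ spelling of the same conversion. All function names here are hypothetical; legacy_length stands in for an older API that forgot the const qualifier.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy function: the parameter is not const-qualified,
// but the function only reads through it.
std::size_t legacy_length(char *s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Case 2: the pointee may be const, so it is only read, never written.
std::size_t length_of(const char *s) {
    return legacy_length(const_cast<char *>(s));
}

// Case 1: the underlying object is known to be non-const, so writing is fine.
void set_through_const_view(const int *view, int value) {
    *const_cast<int *>(view) = value;  // valid only if *view is not a const object
}
```

Calling set_through_const_view with the address of an object defined as const int would be the undefined behavior discussed above.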
So after some thinking I guess I know what happens here. Though it is architecture/implementation dependent since it is undefined behaviour as Marian pointed out. My setup is mingw 5.x 32bit on windows 7 64 bit in case someone is interested.
C++ consts act like #defines: g++ replaces every use of i with its value in the compiled code (since i is a const), but it also writes 5 (the value of i) to some address in memory to provide access to i via a pointer (a dummy pointer), and it replaces all the occurrences of &i with that address (that is not exactly what the compiler does, but you know what I mean).
In C, consts are treated mostly like ordinary variables, the only difference being that the compiler doesn't allow changing them directly.
That's why Bjarne Stroustrup says in his book that you don't need #defines in c++.
Here comes the proof:
It's a violation of the strict aliasing rule (the compiler assumes that two pointers of different types never reference the same memory location) combined with compiler optimization (the compiler is not performing the second memory access to read i but uses the previous variable).
EDIT (as suggested inside the comments):
From the working draft of the ISO C++ standard (N3376):
"If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined [...]
— a cv-qualified version of the dynamic type of the
object, [...]
— a type that is the signed or unsigned type
corresponding to a cv-qualified version of the dynamic type of the
object, [...]
— a type that is a (possibly cv-qualified) base class
type of the dynamic type of the object,"
As far as I understand it, this specifies that a possibly cv-qualified type can be used as an alias, but not that a non-cv-qualified type can be used as an alias for a cv-qualified type.
It would be more fruitful to ask what one specific compiler with certain flags set does with that code than what “C” or “C++” does, because neither C nor C++ will do anything consistently with code like that. It’s undefined behavior. Anything could happen.
It would, for example, be entirely legal to stick const variables in a read-only page of memory that will cause a hardware fault if the program attempts to write to it. Or to fail silently if you try writing to it. Or to turn a dereferenced int* cast from a const int* into a temporary copy that can be modified without affecting the original. Or to modify every reference to that variable after the reassignment. Or to refactor the code on the assumption that a const variable can’t change so that the operations happen in a different order, and you end up modifying the variable before you think you did or not modifying it after. Or to make i an alias for other references to the constant 1 and modify those, too, elsewhere in the program. Or to break a program invariant that makes the program bug out in totally unpredictable ways. Or to print an error message and stop compiling if it catches a bug like that. Or for the behavior to depend on the phase of the moon. Or anything else.
There are combinations of compilers and flags and targets that will do those things, with the possible exception of the phase-of-the-moon bug. The funniest variant I’ve heard of, though, is that in some versions of Fortran, you could set the constant 1 equal to -1, and all loops would run backwards.
Writing production code like this is a terrible idea, because your compiler almost certainly makes no guarantees what this code will do in your next build.
The short answer is that C++ 'const' declaration rules allow it to use the constant value directly in places where C would have to dereference the variable. I.e, C++ compiles the statement
cout << i << '\n';
as if what was actually written was
cout << 5 << '\n';
All of the other non-pointer values are the results of dereferencing pointers.
So I am having a discussion with a friend about references and pointers.
What we got talking about is "you can take the address of a pointer but you can't take the address of a reference".
And I disagree on that point. Let's take an example:
int x = 0;
int &xRef = x;
cout << &xRef << &x <<endl;
This example prints the same address twice, but nevertheless, am I not taking the address of xRef by writing &xRef? Couldn't you argue that we have two variables with the same address, so even though I am taking the address of the reference, it is still the address of the reference (even though that is also the address of x)?
C++ Standard n3337 § 8.3.2/4
It is unspecified whether or not a reference requires storage (3.7).
So it is unspecified whether a reference has storage. Most probably it does not; it is just an alias. When you use it in code, the compiler performs whatever operations are needed, which may resemble pointer operations.
The unary operator & returns the address of the designated object. A reference is not an object; it is a reference to an object. So this statement
cout << &xRef << &x <<endl;
outputs in both cases the address of the designated object, that is, of x. Even though the compiler can allocate memory for a reference, the reference itself has no address. That is, you cannot apply operator & to get its address. It is the object (or function) referred to by the reference that has an address.
You can think of references as aliases for objects. So, in your example, int &xRef = x declares xRef, which is another name for x. Hence you're printing the address of the same object twice.
By the time you apply the address-of operator in your example, there is no distinction between the reference and the real object anymore. The reference is the object.
The rule instead applies at the moment you try to declare a pointer to a reference. Try it:
int x = 0;
int &*ptr = &x;
Result with MSVC 2013:
error C2528: 'ptr' : pointer to reference is illegal
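The same rule can be observed through the standard type traits. This is only an illustrative sketch; RefToPtr and address_via_reference are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// A reference to a pointer is a valid type...
using RefToPtr = int *&;
static_assert(std::is_reference<RefToPtr>::value, "int*& is a reference");

// ...while std::add_pointer applied to int& drops the reference and
// yields int*, because int&* cannot be formed.
static_assert(std::is_same<std::add_pointer<int &>::type, int *>::value,
              "add_pointer<int&> is int*");

// Hypothetical helper: taking the address through a reference
// gives a pointer to the referenced object.
int *address_via_reference(int &r) { return &r; }
```

So the closest thing to a "pointer to a reference" is a plain pointer to the object the reference names.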
Just for completeness' sake: there is a way to produce a pointer to a reference.
auto foo(int &x){
return [&]{std::cout << x;};
}
What the compiler is allowed to do here is not to capture a reference to x but to capture the stack pointer instead. Based on the stack pointer, the compiler knows the offsets to the various parameters of foo and may save some memory for the lambda. However, after the lambda has been returned, the parameters of foo are gone and the lambda references objects that no longer exist. Therefore it is forbidden / UB to do this.
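When the returned callable has to outlive the function call, capturing by copy sidesteps the lifetime question entirely. A minimal sketch, assuming a copy of the value is acceptable; make_getter is a hypothetical name:

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: captures a copy of x, so the returned callable
// stays valid after make_getter returns.
std::function<int()> make_getter(int x) {
    return [x] { return x; };
}
```

The lambda owns its own int, so nothing in it refers to the parameters of the function that created it.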
A pointer needs to be dereferenced with * to access the memory location it points to, whereas a reference can be used directly. A pointer to a class/struct uses -> to access its members, whereas a reference uses a . (dot).
References cannot be put into an array, whereas pointers can be (Mentioned by user #litb)
Const references can be bound to temporaries. Pointers cannot (not without some indirection):
const int &x = int(12); // legal C++
int *y = &int(12); // illegal to take the address of a temporary.
This makes const & more convenient to use in argument lists and so forth.
What's a C++ reference (for C programmers)
A reference can be thought of as a constant pointer (not to be confused with a pointer to a constant value!) with automatic indirection, ie the compiler will apply the * operator for you.
All references must be initialized with a non-null value or compilation will fail. It is not possible to get the address of a reference (the address operator returns the address of the referenced value instead), nor is it possible to do arithmetic on references.
C programmers might dislike C++ references as it will no longer be obvious when indirection happens or if an argument gets passed by value or by pointer without looking at function signatures.
C++ programmers might dislike using pointers as they are considered unsafe - although references aren't really any safer than constant pointers except in the most trivial cases - lack the convenience of automatic indirection and carry a different semantic connotation.
Consider the following statement from the C++ FAQ:
Even though a reference is often implemented using an address in the
underlying assembly language, please do not think of a reference as a
funny looking pointer to an object. A reference is the object. It is
not a pointer to the object, nor a copy of the object. It is the
object.
But if a reference really were the object, how could there be dangling references? In unmanaged languages, it's impossible for references to be any 'safer' than pointers - there generally just isn't a way to reliably alias values across scope boundaries!
Why I consider C++ references useful
Coming from a C background, C++ references may look like a somewhat silly concept, but one should still use them instead of pointers where possible: automatic indirection is convenient, and references become especially useful when dealing with RAII, not because of any perceived safety advantage, but because they make writing idiomatic code less awkward.
RAII is one of the central concepts of C++, but it interacts non-trivially with copying semantics. Passing objects by reference avoids these issues as no copying is involved. If references were not present in the language, you'd have to use pointers instead, which are more cumbersome to use, thus violating the language design principle that the best-practice solution should be easier than the alternatives.
If you want to be really pedantic, there is one thing you can do with a reference that you can't do with a pointer: extend the lifetime of a temporary object. In C++ if you bind a const reference to a temporary object, the lifetime of that object becomes the lifetime of the reference.
std::string s1 = "123";
std::string s2 = "456";
std::string s3_copy = s1 + s2;
const std::string& s3_reference = s1 + s2;
In this example, s3_copy copies the temporary object that results from the concatenation, whereas s3_reference in essence becomes the temporary object. It's really a reference to a temporary object that now has the same lifetime as the reference.
If you try this without the const it should fail to compile. You cannot bind a non-const reference to a temporary object, nor can you take its address for that matter.
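One way to check the binding rule without writing code that fails to compile is to ask the type system. The sketch below uses std::is_constructible for that, and extended_length is a made-up helper that relies on the extended lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// A const lvalue reference can be initialized from a temporary std::string...
static_assert(std::is_constructible<const std::string&, std::string>::value,
              "const std::string& binds to a temporary");
// ...but a non-const lvalue reference cannot.
static_assert(!std::is_constructible<std::string&, std::string>::value,
              "std::string& does not bind to a temporary");

std::size_t extended_length() {
    std::string s1 = "123";
    std::string s2 = "456";
    const std::string& joined = s1 + s2;  // the temporary lives as long as joined
    assert(joined == "123456");           // still valid on a later line
    return joined.size();
}
```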
Apart from syntactic sugar, a reference is a const pointer (not pointer to a const). You must establish what it refers to when you declare the reference variable, and you cannot change it later.
Update: now that I think about it some more, there is an important difference.
A const pointer's target can be replaced by taking its address and using a const cast.
A reference's target cannot be replaced in any way short of UB.
This should permit the compiler to do more optimization on a reference.
Contrary to popular opinion, it is possible to have a reference that is NULL.
int * p = NULL;
int & r = *p;
r = 1; // crash! (if you're lucky)
Granted, it is much harder to do with a reference - but if you manage it, you'll tear your hair out trying to find it. References are not inherently safe in C++!
Technically this is an invalid reference, not a null reference. C++ doesn't support null references as a concept as you might find in other languages. There are other kinds of invalid references as well. Any invalid reference raises the spectre of undefined behavior, just as using an invalid pointer would.
The actual error is in the dereferencing of the NULL pointer, prior to the assignment to a reference. But I'm not aware of any compilers that will generate any errors on that condition - the error propagates to a point further along in the code. That's what makes this problem so insidious. Most of the time, if you dereference a NULL pointer, you crash right at that spot and it doesn't take much debugging to figure it out.
My example above is short and contrived. Here's a more real-world example.
class MyClass
{
...
virtual void DoSomething(int,int,int,int,int);
};
void Foo(const MyClass & bar)
{
...
bar.DoSomething(i1,i2,i3,i4,i5); // crash occurs here due to memory access violation - obvious why?
}
MyClass * GetInstance()
{
if (somecondition)
return NULL;
...
}
MyClass * p = GetInstance();
Foo(*p);
I want to reiterate that the only way to get a null reference is through malformed code, and once you have it you're getting undefined behavior. It never makes sense to check for a null reference; for example you can try if(&bar==NULL)... but the compiler might optimize the statement out of existence! A valid reference can never be NULL so from the compiler's view the comparison is always false, and it is free to eliminate the if clause as dead code - this is the essence of undefined behavior.
The proper way to stay out of trouble is to avoid dereferencing a NULL pointer to create a reference. Here's an automated way to accomplish this.
template<typename T>
T& deref(T* p)
{
if (p == NULL)
throw std::invalid_argument(std::string("NULL reference"));
return *p;
}
MyClass * p = GetInstance();
Foo(deref(p));
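Here is a self-contained variant of the same idea that can be compiled and run. Widget and read_id are hypothetical stand-ins for MyClass and Foo, which are elided above, and nullptr is used instead of NULL.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for MyClass.
struct Widget {
    int id;
};

template <typename T>
T& deref(T* p) {
    if (p == nullptr) {
        throw std::invalid_argument("deref: null pointer");
    }
    return *p;
}

// Hypothetical stand-in for Foo.
int read_id(const Widget& w) { return w.id; }

// Returns true when deref rejects a null pointer before any reference is formed.
bool null_is_rejected() {
    Widget* missing = nullptr;
    try {
        read_id(deref(missing));
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

int valid_is_forwarded() {
    Widget w{42};
    assert(&deref(&w) == &w);  // the reference refers to the original object
    return read_id(deref(&w));
}
```

The failure now surfaces at the call site as an exception instead of as a crash somewhere inside the callee.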
For an older look at this problem from someone with better writing skills, see Null References from Jim Hyslop and Herb Sutter.
For another example of the dangers of dereferencing a null pointer see Exposing undefined behavior when trying to port code to another platform by Raymond Chen.
You forgot the most important part:
member-access with pointers uses ->
member-access with references uses .
foo.bar is clearly superior to foo->bar in the same way that vi is clearly superior to Emacs :-)
References are very similar to pointers, but they are specifically crafted to be helpful to optimizing compilers.
References are designed such that it is substantially easier for the compiler to trace which reference aliases which variables. Two major features are very important: no "reference arithmetic" and no reassigning of references. These allow the compiler to figure out which references alias which variables at compile time.
References are allowed to refer to variables which do not have memory addresses, such as those the compiler chooses to put into registers. If you take the address of a local variable, it is very hard for the compiler to put it in a register.
As an example:
void maybeModify(int& x); // may modify x in some way
void hurtTheCompilersOptimizer(short size, int array[])
{
// This function is designed to do something particularly troublesome
// for optimizers. It will constantly call maybeModify on array[0] while
// adding array[1] to array[2]..array[size-1]. There's no real reason to
// do this, other than to demonstrate the power of references.
for (int i = 2; i < (int)size; i++) {
maybeModify(array[0]);
array[i] += array[1];
}
}
An optimizing compiler may realize that we are accessing array[0] and array[1] quite often. It would love to optimize the algorithm to:
void hurtTheCompilersOptimizer(short size, int array[])
{
// Do the same thing as above, but instead of accessing array[1]
// all the time, access it once and store the result in a register,
// which is much faster to do arithmetic with.
register int a0 = array[0];
register int a1 = array[1]; // access array[1] once
for (int i = 2; i < (int)size; i++) {
maybeModify(a0); // Give maybeModify a reference to a register
array[i] += a1; // Use the saved register value over and over
}
array[0] = a0; // Store the modified array[0] back into the array
}
To make such an optimization, it needs to prove that nothing can change array[1] during the call. This is rather easy to do. i is never less than 2, so array[i] can never refer to array[1]. maybeModify() is given a0 as a reference (aliasing array[0]). Because there is no "reference" arithmetic, the compiler just has to prove that maybeModify never gets the address of x, and it has proven that nothing changes array[1].
It also has to prove that there is no way a future call could read or write array[0] while we have a temporary register copy of it in a0. This is often trivial to prove, because in many cases it is obvious that the reference is never stored in a permanent structure like a class instance.
Now do the same thing with pointers
void maybeModify(int* x); // May modify x in some way
void hurtTheCompilersOptimizer(short size, int array[])
{
// Same operation, only now with pointers, making the
// optimization trickier.
for (int i = 2; i < (int)size; i++) {
maybeModify(&(array[0]));
array[i] += array[1];
}
}
The behavior is the same; only now it is much harder to prove that maybeModify does not ever modify array[1], because we already gave it a pointer; the cat is out of the bag. Now it has to do the much more difficult proof: a static analysis of maybeModify to prove it never writes to x + 1. It also has to prove that it never saves off a pointer that can refer to array[0], which is just as tricky.
Modern compilers are getting better and better at static analysis, but it is always nice to help them out and use references.
Of course, barring such clever optimizations, compilers will indeed turn references into pointers when needed.
EDIT: Five years after posting this answer, I found an actual technical difference where references are different than just a different way of looking at the same addressing concept. References can modify the lifespan of temporary objects in a way that pointers cannot.
F createF(int argument);
void extending()
{
const F& ref = createF(5);
std::cout << ref.getArgument() << std::endl;
}
Normally temporary objects such as the one created by the call to createF(5) are destroyed at the end of the expression. However, by binding that object to a reference, ref, C++ will extend the lifespan of that temporary object until ref goes out of scope.
Actually, a reference is not really like a pointer.
A compiler keeps "references" to variables, associating a name with a memory address; it is the compiler's job to translate any variable name to a memory address when compiling.
When you create a reference, you only tell the compiler that you assign another name to the pointer variable; that's why references cannot "point to null", because a variable cannot be, and not be.
Pointers are variables; they contain the address of some other variable, or can be null. The important thing is that a pointer has a value, while a reference only has a variable that it is referencing.
Now some explanation of real code:
int a = 0;
int& b = a;
Here you are not creating another variable that points to a; you are just adding another name to the memory content holding the value of a. This memory now has two names, a and b, and it can be addressed using either name.
void increment(int& n)
{
n = n + 1;
}
int a;
increment(a);
When calling a function, the compiler usually generates memory spaces for the arguments to be copied to. The function signature defines the spaces that should be created and gives the name that should be used for these spaces. Declaring a parameter as a reference just tells the compiler to use the input variable memory space instead of allocating a new memory space during the method call. It may seem strange to say that your function will be directly manipulating a variable declared in the calling scope, but remember that when executing compiled code, there is no more scope; there is just plain flat memory, and your function code could manipulate any variables.
Now there may be some cases where your compiler may not be able to know the reference when compiling, like when using an extern variable. So a reference may or may not be implemented as a pointer in the underlying code. But in the examples I gave you, it will most likely not be implemented with a pointer.
A reference can never be NULL.
There is a semantic difference that may appear esoteric if you are not familiar with studying computer languages in an abstract or even academic fashion.
At the highest level, the idea of references is that they are transparent "aliases". Your computer may use an address to make them work, but you're not supposed to worry about that: you're supposed to think of them as "just another name" for an existing object, and the syntax reflects that. They are stricter than pointers, so your compiler can more reliably warn you when you are about to create a dangling reference than when you are about to create a dangling pointer.
Beyond that, there are of course some practical differences between pointers and references. The syntax to use them is obviously different, and you cannot "re-seat" references, have references to nothingness, or have pointers to references.
While both references and pointers are used to indirectly access another value, there are two important differences between references and pointers. The first is that a reference always refers to an object: It is an error to define a reference without initializing it. The behavior of assignment is the second important difference: Assigning to a reference changes the object to which the reference is bound; it does not rebind the reference to another object. Once initialized, a reference always refers to the same underlying object.
Consider these two program fragments. In the first, we assign one pointer to another:
int ival = 1024, ival2 = 2048;
int *pi = &ival, *pi2 = &ival2;
pi = pi2; // pi now points to ival2
After the assignment, ival, the object addressed by pi, remains unchanged. The assignment changes the value of pi, making it point to a different object. Now consider a similar program that assigns two references:
int &ri = ival, &ri2 = ival2;
ri = ri2; // assigns ival2 to ival
This assignment changes ival, the value referenced by ri, and not the reference itself. After the assignment, the two references still refer to their original objects, and the value of those objects is now the same as well.
A reference is an alias for another variable whereas a pointer holds the memory address of a variable. References are generally used as function parameters so that the passed object is not the copy but the object itself.
void fun(int &a, int &b); // A common usage of references.
int a = 0;
int &b = a; // b is an alias for a. Not so common to use.
The direct answer
What is a reference in C++? Some specific instance of type that is not an object type.
What is a pointer in C++? Some specific instance of type that is an object type.
From the ISO C++ definition of object type:
An object type is a (possibly cv-qualified) type that is not a function type, not a reference type, and not cv void.
It may be important to know that object type is a top-level category of the type universe in C++. Reference is also a top-level category, but pointer is not.
Pointers and references are mentioned together in the context of compound types. This is basically due to the nature of the declarator syntax inherited (and extended) from C, which has no references. (Besides, there has been more than one kind of reference declarator since C++11, while pointers are still "unityped": & and && vs. *.) So drafting a language specification by "extension" in a style similar to C is somewhat reasonable in this context. (I would still argue that the declarator syntax wastes a lot of syntactic expressiveness and frustrates both human users and implementations. Thus, none of them is qualified to be built into a new language design. This is a totally different topic about PL design, though.)
Otherwise, it is insignificant that pointers can be grouped together with references as a specific sort of type. They simply share too few common properties besides the syntactic similarity, so there is no need to put them together in most cases.
Note that the statements above only mention "pointers" and "references" as types. There are some interesting questions about their instances (like variables). There are also many misconceptions.
The differences of the top-level categories can already reveal many concrete differences not tied to pointers directly:
Object types can have top-level cv qualifiers. References cannot.
Variables of object types occupy storage as per the abstract machine semantics. References do not necessarily occupy storage (see the section about misconceptions below for details).
...
A few more special rules on references:
Compound declarators are more restrictive on references.
References can collapse.
Special rules on && parameters (as the "forwarding references") based on reference collapsing during template parameter deduction allow "perfect forwarding" of parameters.
References have special rules in initialization. The lifetime of a variable declared with a reference type can differ from that of ordinary objects via extension.
BTW, a few other contexts, such as initialization involving std::initializer_list, follow similar rules of reference lifetime extension. It is another can of worms.
...
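The collapsing and forwarding rules listed above can be sketched roughly as follows. LRef, RRef, category and the call_with_* helpers are names made up for this example, not standard facilities.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

template <typename T> using LRef = T&;
template <typename T> using RRef = T&&;

// Reference collapsing: only && applied to && yields an rvalue reference.
static_assert(std::is_same<LRef<int&>, int&>::value, "& on & gives &");
static_assert(std::is_same<LRef<int&&>, int&>::value, "& on && gives &");
static_assert(std::is_same<RRef<int&>, int&>::value, "&& on & gives &");
static_assert(std::is_same<RRef<int&&>, int&&>::value, "&& on && gives &&");

std::string category(const std::string&) { return "lvalue"; }
std::string category(std::string&&) { return "rvalue"; }

// T&& on a deduced template parameter is a forwarding reference:
// T deduces to std::string& for lvalues and to std::string for rvalues.
template <typename T>
std::string forward_to_category(T&& value) {
    return category(std::forward<T>(value));
}

std::string call_with_lvalue() {
    std::string s = "named";
    std::string result = forward_to_category(s);
    assert(s == "named");  // an lvalue argument is not moved from
    return result;
}

std::string call_with_rvalue() {
    return forward_to_category(std::string("temporary"));
}
```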
The misconceptions
Syntactic sugar
I know references are syntactic sugar, so code is easier to read and write.
Technically, this is plain wrong. References are not syntactic sugar of any other features in C++, because they cannot be exactly replaced by other features without any semantic differences.
(Similarly, lambda-expressions are not syntactic sugar for any other feature in C++, because they cannot be precisely simulated: "unspecified" properties such as the declaration order of the captured variables may matter, since the initialization order of such variables can be significant.)
C++ has only a few kinds of syntactic sugar in this strict sense. One instance is (inherited from C) the built-in (non-overloaded) operator [], which is defined to have exactly the same semantic properties as a specific combination of the built-in unary * and binary + operators.
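A quick sketch of that equivalence for a built-in array (the function name is only illustrative):

```cpp
#include <cassert>
#include <cstddef>

bool subscript_matches_pointer_arithmetic() {
    int data[] = {10, 20, 30, 40};
    for (std::size_t i = 0; i < 4; ++i) {
        // For built-in arrays and pointers, data[i] means *(data + i).
        if (data[i] != *(data + i) || &data[i] != data + i) {
            return false;
        }
    }
    assert(2[data] == data[2]);  // addition commutes, so this odd spelling is legal too
    return true;
}
```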
Storage
So, a pointer and a reference both use the same amount of memory.
The statement above is simply wrong. To avoid such misconceptions, look at the ISO C++ rules instead:
From [intro.object]/1:
... An object occupies a region of storage in its period of construction, throughout its lifetime, and in its period of destruction. ...
From [dcl.ref]/4:
It is unspecified whether or not a reference requires storage.
Note these are semantic properties.
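A small sketch of why the storage of a reference cannot be observed from within the language: sizeof and & both see through the reference to the referred object. Big is a hypothetical type chosen to be obviously larger than any pointer.

```cpp
#include <cassert>

struct Big {
    char bytes[64];
};

// sizeof applied to a reference type yields the size of the referenced type,
// so it reveals nothing about any storage the reference itself might use.
static_assert(sizeof(Big&) == sizeof(Big), "sizeof sees through references");
static_assert(sizeof(char&) == 1, "sizeof(char&) is sizeof(char)");

bool reference_has_no_separate_address() {
    Big object = {};
    Big& alias = object;
    Big* pointer = &object;
    // A pointer is an object with its own address, distinct from its target's.
    assert(static_cast<void*>(&pointer) != static_cast<void*>(&object));
    // &alias is simply the address of object.
    return &alias == &object && sizeof alias == sizeof object;
}
```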
Pragmatics
Even though pointers are not qualified enough to be put together with references in the sense of language design, there are still arguments that make the choice between them debatable in some other contexts, for example when choosing parameter types.
But this is not the whole story. I mean, there are more things than pointers vs. references that you have to consider.
If you don't have to stick to such over-specific choices, in most cases the answer is short: you have no need to use pointers, so you don't. Pointers are usually bad enough because they imply too many things you don't expect, and they rely on too many implicit assumptions that undermine the maintainability and (even) portability of the code. Unnecessarily relying on pointers is definitely bad style and should be avoided in modern C++. Reconsider your purpose and you will usually find that a pointer is the feature of last resort.
Sometimes the language rules explicitly require specific types to be used. If you want to use these features, obey the rules.
Copy constructors require specific types of cv-& reference type as the 1st parameter type. (And usually it should be const qualified.)
Move constructors require specific types of cv-&& reference type as the 1st parameter type. (And usually there should be no qualifiers.)
Specific overloads of operators require reference or non reference types. For example:
Overloaded operator= as special member functions requires reference types similar to 1st parameter of copy/move constructors.
Postfix ++ requires dummy int.
...
If you know pass-by-value (i.e. using non-reference types) is sufficient, use it directly, particularly when using an implementation supporting C++17 mandated copy elision. (Warning: However, to exhaustively reason about the necessity can be very complicated.)
If you want to operate on handles with ownership, use smart pointers like unique_ptr and shared_ptr (or even homebrew ones if you require them to be opaque), rather than raw pointers.
If you are iterating over a range, use iterators (or ranges, which the standard library provides since C++20), rather than raw pointers, unless you are convinced raw pointers will do better (e.g. for fewer header dependencies) in very specific cases.
If you know pass-by-value is sufficient and you want explicit nullable semantics, use a wrapper like std::optional, rather than raw pointers.
If you know pass-by-value is not ideal for the reasons above, and you don't want nullable semantics, use {lvalue, rvalue, forwarding}-references.
Even when you do want semantics like a traditional pointer, there is often something more appropriate, like observer_ptr in the Library Fundamentals TS.
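As a rough illustration of two of these guidelines (a reference for a non-owning, non-null parameter and unique_ptr for an owning handle), consider the sketch below. Document and Archive are hypothetical types invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical resource type used only for illustration.
struct Document {
    std::string title;
};

// Non-owning and never null: a reference says so in the signature.
std::size_t title_length(const Document& doc) { return doc.title.size(); }

// Owning handle: the unique_ptr parameter documents the ownership transfer.
class Archive {
public:
    void store(std::unique_ptr<Document> doc) {
        assert(doc != nullptr);
        stored_ = std::move(doc);
    }
    bool has_document() const { return stored_ != nullptr; }
    const Document& document() const { return *stored_; }

private:
    std::unique_ptr<Document> stored_;
};

std::size_t archive_demo() {
    std::unique_ptr<Document> doc(new Document{"References"});
    Archive archive;
    archive.store(std::move(doc));  // ownership moves into the archive
    assert(doc == nullptr);         // a moved-from unique_ptr is empty
    return archive.has_document() ? title_length(archive.document()) : 0;
}
```

No raw pointer appears in any signature, and nobody has to remember to call delete.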
The only exceptions that cannot be worked around in the current language:
When you are implementing the smart pointers above, you may have to deal with raw pointers.
Specific language-interoperation routines require pointers, like operator new. (However, cv-void* is still quite different from, and safer than, ordinary object pointers because it rules out unexpected pointer arithmetic, unless you are relying on some non-conforming extension on void* like GNU's.)
Function pointers can be converted from lambda expressions without captures, while function references cannot. You have to use function pointers in non-generic code for such cases, even if you deliberately do not want nullable values.
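The lambda case looks roughly like this in practice. IntTransform and apply_twice are names made up for the sketch.

```cpp
#include <cassert>

using IntTransform = int (*)(int);

int apply_twice(IntTransform f, int x) {
    assert(f != nullptr);  // a function pointer can be null, so it is checked here
    return f(f(x));
}

int lambda_as_function_pointer() {
    // A lambda without captures converts implicitly to a function pointer.
    IntTransform add_three = [](int v) { return v + 3; };
    return apply_twice(add_three, 1);
}
```

A lambda that captures anything has no such conversion, so it could not be passed to apply_twice.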
So, in practice, the answer is obvious: when in doubt, avoid pointers. You have to use pointers only when there are very explicit reasons that nothing else is more appropriate. Except for the few exceptional cases mentioned above, such choices are almost always not purely C++-specific (but likely to be language-implementation-specific). Such instances can be:
You have to interface with old-style (C) APIs.
You have to meet the ABI requirements of specific C++ implementations.
You have to interoperate at runtime with different language implementations (including various assemblies, language runtime and FFI of some high-level client languages) based on assumptions of specific implementations.
You have to improve efficiency of the translation (compilation & linking) in some extreme cases.
You have to avoid symbol bloat in some extreme cases.
Language neutrality caveats
If you come to see the question via some Google search result (not specific to C++), this is very likely to be the wrong place.
References in C++ are quite "odd", as they are essentially not first-class: they are treated as the objects or functions being referred to, so they have no chance to support some first-class operations, like being the left operand of the member access operator independently of the type of the referred object. Other languages may or may not have similar restrictions on their references.
References in C++ will likely not preserve their meaning across different languages. For example, references in general do not imply non-null properties on values like they do in C++, so such assumptions may not work in some other languages (and you will find counterexamples quite easily, e.g. Java, C#, ...).
There can still be some common properties among references in different programming languages in general, but let's leave it for some other questions in SO.
(A side note: the question may be significant earlier than any "C-like" languages are involved, like ALGOL 68 vs. PL/I.)
It doesn't matter how much space it takes up since you can't actually see any side effect (without executing code) of whatever space it would take up.
On the other hand, one major difference between references and pointers is that temporaries assigned to const references live until the const reference goes out of scope.
For example:
class scope_test
{
public:
~scope_test() { printf("scope_test done!\n"); }
};
...
{
const scope_test &test= scope_test();
printf("in scope\n");
}
will print:
in scope
scope_test done!
This is the language mechanism that allows ScopeGuard to work.
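The same behavior can be checked programmatically instead of by reading printed output. In this sketch, Tracer is a hypothetical type that appends to a log string when it is destroyed.

```cpp
#include <cassert>
#include <string>

// Hypothetical tracer: records its own destruction in a caller-supplied log.
struct Tracer {
    explicit Tracer(std::string& log) : log_(log) {}
    ~Tracer() { log_ += "destroyed;"; }
    std::string& log_;
};

std::string lifetime_extension_log() {
    std::string log;
    {
        const Tracer& tracer = Tracer(log);  // the temporary lives until the closing brace
        (void)tracer;
        log += "in scope;";
        assert(log == "in scope;");          // nothing has been destroyed yet
    }
    log += "after scope;";
    return log;
}
```

The destructor entry lands between the other two, which shows the temporary was destroyed when the reference went out of scope and not at the end of the initializing statement.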
This is based on the tutorial. What is written makes it more clear:
>>> The address that locates a variable within memory is
what we call a reference to that variable. (5th paragraph at page 63)
>>> The variable that stores the reference to another
variable is what we call a pointer. (3rd paragraph at page 64)
Simply to remember that,
>>> reference stands for memory location
>>> pointer is a reference container (Maybe because we will use it for
several times, it is better to remember that reference.)
What's more, as almost any pointer tutorial will tell you, a pointer is an object that supports pointer arithmetic, which makes a pointer similar to an array.
Look at the following statement,
int Tom(0);
int & alias_Tom = Tom;
alias_Tom can be understood as an alias of the variable Tom (unlike typedef, which creates an alias of a type). It is also OK to forget that the terminology for such a statement is creating a reference to Tom.
A reference is not another name given to some memory. It's an immutable pointer that is automatically dereferenced on usage. Basically it boils down to:
int& j = i;
It internally becomes
int* const j = &i;
A reference to a pointer is possible in C++, but the reverse is not: a pointer to a reference isn't possible. A reference to a pointer provides a cleaner syntax to modify the pointer.
Look at this example:
#include<iostream>
using namespace std;
void swap(const char * &str1, const char * &str2)
{
const char *temp = str1;
str1 = str2;
str2 = temp;
}
int main()
{
const char *str1 = "Hi"; // string literals are const in C++
const char *str2 = "Hello";
swap(str1, str2);
cout<<"str1 is "<<str1<<endl;
cout<<"str2 is "<<str2<<endl;
return 0;
}
And consider the C version of the above program. In C you have to use pointer to pointer (multiple indirection), and it leads to confusion and the program may look complicated.
#include<stdio.h>
/* Swaps strings by swapping pointers */
void swap1(char **str1_ptr, char **str2_ptr)
{
char *temp = *str1_ptr;
*str1_ptr = *str2_ptr;
*str2_ptr = temp;
}
int main()
{
char *str1 = "Hi";
char *str2 = "Hello";
swap1(&str1, &str2);
printf("str1 is %s, str2 is %s", str1, str2);
return 0;
}
Visit the following for more information about reference to pointer:
C++: Reference to Pointer
Pointer-to-Pointer and Reference-to-Pointer
As I said, a pointer to a reference isn't possible. Try the following program:
#include <iostream>
using namespace std;
int main()
{
int x = 10;
int *ptr = &x;
int &*ptr1 = ptr; // error: a pointer to a reference cannot be declared
}
There is one fundamental difference between pointers and references that I didn't see anyone mention: references enable pass-by-reference semantics in function arguments. Pointers, although it is not visible at first, do not: they only provide pass-by-value semantics. This has been very nicely described in this article.
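A minimal sketch of that distinction, with function names invented for the example: the pointer argument is itself copied, so only a reference to the pointer lets the callee change what the caller's pointer points to.

```cpp
#include <cassert>

// The pointer itself is copied: reseating the copy is invisible to the caller.
void reseat_copy(int* p, int* target) {
    p = target;
    (void)p;
}

// The pointer is passed by reference: the caller's pointer is reseated.
void reseat_original(int*& p, int* target) { p = target; }

// The pointee can be modified either way.
void set_through_pointer(int* p, int v) {
    assert(p != nullptr);
    *p = v;
}
void set_through_reference(int& r, int v) { r = v; }

bool pass_by_value_vs_reference() {
    int a = 1, b = 2;
    int* p = &a;

    reseat_copy(p, &b);
    bool unchanged = (p == &a);  // the caller's p still points to a

    reseat_original(p, &b);
    bool reseated = (p == &b);   // now the caller's p points to b

    set_through_pointer(&a, 10);
    set_through_reference(b, 20);
    return unchanged && reseated && a == 10 && b == 20;
}
```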
Regards,
&rzej
I use references unless I need either of these:
Null pointers can be used as a sentinel value, often a cheap way to avoid function overloading or use of a bool.
You can do arithmetic on a pointer. For example, p += offset;
At the risk of adding to the confusion, I want to throw in some input. I'm sure it mostly depends on how the compiler implements references, but in the case of gcc, the idea that a reference can only point to a variable on the stack is not actually correct. Take this for example:
#include <iostream>
#include <string>
int main(int argc, char** argv) {
// Create a string on the heap
std::string *str_ptr = new std::string("THIS IS A STRING");
// Dereference the string on the heap, and assign it to the reference
std::string &str_ref = *str_ptr;
// Not even a compiler warning! At least with gcc
// Now let's try to print its value!
std::cout << str_ref << std::endl;
// It works! Now let's print and compare actual memory addresses
std::cout << str_ptr << " : " << &str_ref << std::endl;
// Exactly the same, now remember to free the memory on the heap
delete str_ptr;
}
Which outputs this:
THIS IS A STRING
0xbb2070 : 0xbb2070
If you notice even the memory addresses are exactly the same, meaning the reference is successfully pointing to a variable on the heap! Now if you really want to get freaky, this also works:
int main(int argc, char** argv) {
// In the new expression itself, immediately dereference the result and bind it to the reference
std::string &str_ref = *(new std::string("THIS IS A STRING"));
// Once again, it works! (at least in gcc)
std::cout << str_ref;
// Once again it prints fine, however we have no pointer to the heap allocation, right? So how do we free the space we just ignorantly created?
delete &str_ref;
/*And, it works, because we are taking the memory address that the reference is
storing, and deleting it, which is all a pointer is doing, just we have to specify
the address with '&' whereas a pointer does that implicitly, this is sort of like
calling delete &(*str_ptr); (which also compiles and runs fine).*/
}
Which outputs this:
THIS IS A STRING
Therefore a reference IS a pointer under the hood: both just store a memory address, and where the address points is irrelevant. What do you think would happen if I called std::cout << str_ref; AFTER calling delete &str_ref? It compiles fine, but the behavior is undefined (in practice, often a segmentation fault at runtime) because the reference no longer refers to a valid object. We essentially have a broken reference that still exists (until it falls out of scope) but is useless.
In other words, a reference is nothing but a pointer that has the pointer mechanics abstracted away, making it safer and easier to use (no accidental pointer math, no mixing up '.' and '->', etc.), assuming you don't try any nonsense like my examples above ;)
Now regardless of how a compiler handles references, it will always have some kind of pointer under the hood, because a reference must refer to a specific variable at a specific memory address for it to work as expected, there is no getting around this (hence the term 'reference').
The only major rule that's important to remember with references is that they must be initialized at the point of declaration. The exception is a reference declared as a class member: in that case it must be initialized in the constructor's initializer list, because once the containing object is constructed it's too late to bind it.
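For the class-member case, a sketch might look like this. Counter is a hypothetical class, and the static_assert shows one consequence of the member not being rebindable.

```cpp
#include <cassert>
#include <type_traits>

class Counter {
public:
    // A reference member has to be bound in the member initializer list;
    // assigning to it in the constructor body would be too late.
    explicit Counter(int& target) : target_(target) {}
    void increment() { ++target_; }

private:
    int& target_;
};

// Because the member cannot be rebound, copy assignment is implicitly deleted.
static_assert(!std::is_copy_assignable<Counter>::value,
              "a class with a reference member is not copy-assignable");

int count_three() {
    int total = 0;
    Counter counter(total);
    counter.increment();
    assert(total == 1);  // the write went through the reference to the caller's int
    counter.increment();
    counter.increment();
    return total;
}
```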
Remember, my examples above are just that, examples demonstrating what a reference is, you would never want to use a reference in those ways! For proper usage of a reference there are plenty of answers on here already that hit the nail on the head
Another difference is that you can have pointers to a void type (and it means pointer to anything) but references to void are forbidden.
int a;
void * p = &a; // ok
void & p = a; // forbidden
I can't say I'm really happy with this particular difference. I would much prefer that it be allowed, with the meaning "reference to anything with an address" and otherwise the same behavior as other references. It would allow defining equivalents of C library functions like memcpy using references.
Also, a reference that is a parameter to a function that is inlined may be handled differently than a pointer.
void increment(int *ptrint) { (*ptrint)++; }
void increment(int &refint) { refint++; }
void incptrtest()
{
int testptr=0;
increment(&testptr);
}
void increftest()
{
int testref=0;
increment(testref);
}
Many compilers, when inlining the pointer version, will actually force a write to memory (we are taking the address explicitly). However, they will leave the reference in a register, which is more optimal.
Of course, for functions that are not inlined, the pointer and the reference generate the same code, and it's always better to pass intrinsic types by value than by reference if they are not modified and returned by the function.
Another interesting use of references is to supply a default argument of a user-defined type:
class UDT
{
public:
UDT() : val_d(33) {};
UDT(int val) : val_d(val) {};
virtual ~UDT() {};
private:
int val_d;
};
class UDT_Derived : public UDT
{
public:
UDT_Derived() : UDT() {};
virtual ~UDT_Derived() {};
};
class Behavior
{
public:
Behavior(
const UDT &udt = UDT()
) {};
};
int main()
{
Behavior b; // take default
UDT u(88);
Behavior c(u);
UDT_Derived ud;
Behavior d(ud);
return 0;
}
The default flavor uses the 'bind const reference to a temporary' aspect of references.
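The "bind const reference to a temporary" behavior can also be seen in isolation. The sketch below is an illustration under assumptions, not part of the original answer: make_greeting, length_of and demo_lifetime_extension are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns a temporary std::string by value.
std::string make_greeting() { return "hello"; }

// A const reference parameter accepts temporaries and named objects alike,
// so a temporary can serve as the default argument.
std::size_t length_of(const std::string &s = std::string("default")) {
    return s.size();
}

std::size_t demo_lifetime_extension() {
    // Binding a temporary to a local const reference extends the
    // temporary's lifetime to that of the reference.
    const std::string &greeting = make_greeting();
    assert(greeting == "hello");
    return greeting.size();
}
```

A non-const lvalue reference (std::string &) would not accept either temporary, which is why the default argument in the example above is declared const.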
This program might help in understanding the answer to the question. It is a simple program with a reference "j" and a pointer "ptr", both referring to the variable "x".
#include<iostream>
using namespace std;
int main()
{
int *ptr=0, x=9; // pointer and variable declaration
ptr=&x; // pointer to variable "x"
int & j=x; // reference declaration; reference to variable "x"
cout << "x=" << x << endl;
cout << "&x=" << &x << endl;
cout << "j=" << j << endl;
cout << "&j=" << &j << endl;
cout << "*ptr=" << *ptr << endl;
cout << "ptr=" << ptr << endl;
cout << "&ptr=" << &ptr << endl;
cin.get(); // wait for Enter before exiting
}
Run the program and have a look at the output and you'll understand.
Also, spare 10 minutes and watch this video: https://www.youtube.com/watch?v=rlJrrGV0iOg
I feel like there is yet another point that hasn't been covered here.
Unlike pointers, references are syntactically equivalent to the object they refer to, i.e. any operation that can be applied to an object also works for a reference, with exactly the same syntax (the exception is, of course, initialization).
While this may appear superficial, I believe this property is crucial for a number of C++ features, for example:
Templates. Since template parameters are duck-typed, the syntactic properties of a type are all that matter, so often the same template can be used with both T and T& (or std::reference_wrapper<T>, which still relies on an implicit conversion to T&).
Templates that cover both T& and T&& are even more common.
Lvalues. Consider the statement str[0] = 'X'; Without references it would only work for c-strings (char* str). Returning the character by reference allows user-defined classes to have the same notation.
Copy constructors. Syntactically it makes sense to pass objects to copy constructors, and not pointers to objects. But there is just no way for a copy constructor to take an object by value - it would result in a recursive call to the same copy constructor. This leaves references as the only option here.
Operator overloads. With references it is possible to introduce indirection to an operator call - say, operator+(const T& a, const T& b) while retaining the same infix notation. This also works for regular overloaded functions.
These points underpin a considerable part of C++ and the standard library, so this is quite a major property of references.
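A few of the points above can be sketched in code. This is a rough illustration only: Buffer, Vec2, increment and demo_syntax are hypothetical names chosen for the example, not anything from the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size buffer, for illustration only.
class Buffer {
public:
    // Returning char& makes buf[i] an lvalue, so buf[0] = 'X' works
    // with the same notation as a plain char array.
    char &operator[](std::size_t i) { return data_[i]; }
    const char &operator[](std::size_t i) const { return data_[i]; }

private:
    char data_[8] = {};
};

struct Vec2 {
    int x;
    int y;
};

// Operands are passed by const reference (no copies are made), yet the
// call site still reads a + b.
Vec2 operator+(const Vec2 &a, const Vec2 &b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// The template body uses plain object syntax; no dereferencing is needed.
template <typename T>
void increment(T &value) { ++value; }

void demo_syntax() {
    Buffer buf;
    buf[0] = 'X';            // assignment through the returned reference
    assert(buf[0] == 'X');

    Vec2 sum = Vec2{1, 2} + Vec2{3, 4};
    assert(sum.x == 4 && sum.y == 6);

    int n = 0;
    increment(n);            // no &n at the call site
    assert(n == 1);
}
```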
A reference behaves like a const pointer: int * const a = &b is the same as int& a = b. This is why there is no such thing as a const reference (it is already const), whereas a reference to const corresponds to const int * const a. When you compile with -O0, the compiler places the address of b on the stack in both cases, and as a class member it is also present in the object on the stack/heap, exactly as if you had declared a const pointer. With -Ofast, the compiler is free to optimise this out: a const pointer and a reference are both optimised away.
Unlike a const pointer, there is no way to take the address of the reference itself, because the expression is interpreted as the address of the variable it refers to. Because of this, on -Ofast the const pointer that represents the reference (holding the address of the referenced variable) can always be optimised off the stack. If the program really needs the address of an actual const pointer (the address of the pointer itself, not the address it points to), for example because you print it, then the const pointer is placed on the stack so that it has an address.
Otherwise the two are identical, for example when you print the address being pointed to:
#include <iostream>
// Program 1: pointer
int main() {
int a =1;
int* b = &a;
std::cout << b ;
}
// Program 2 (compiled separately): reference
int main() {
int a =1;
int& b = a;
std::cout << &b ;
}
Both produce the same assembly output:
-Ofast:
main:
sub rsp, 24
mov edi, OFFSET FLAT:_ZSt4cout
lea rsi, [rsp+12]
mov DWORD PTR [rsp+12], 1
call std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<void const*>(void const*)
xor eax, eax
add rsp, 24
ret
--------------------------------------------------------------------
-O0:
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-12], 1
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov rsi, rax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(void const*)
mov eax, 0
leave
ret
The pointer has been optimised off the stack, and on -Ofast it is not even dereferenced in either case; a compile-time value is used instead.
As members of an object they are identical on -O0 through -Ofast.
#include <iostream>
int b=1;
struct A {int* i=&b; int& j=b;};
A a;
int main() {
std::cout << &a.j << &a.i;
}
The address of b is stored twice in the object.
a:
.quad b
.quad b
mov rax, QWORD PTR a[rip+8] //&a.j
mov esi, OFFSET FLAT:a //&a.i
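The layout above suggests that a reference member occupies the same space as a pointer member. A small sketch to check this on your own toolchain follows; HoldsPointer, HoldsReference and the helper functions are hypothetical names, and the size equality is an observation about common ABIs, not something the C++ standard guarantees.

```cpp
#include <cassert>

int target = 1;

struct HoldsPointer   { int *p = &target; };
struct HoldsReference { int &r = target; };

// sizeof applied to a reference type yields the size of the referred-to type.
static_assert(sizeof(int &) == sizeof(int), "sizeof(T&) is sizeof(T)");

// A class containing a reference member still needs room to store it.
// Assumption: true on common ABIs, but implementation-defined.
bool same_footprint() {
    return sizeof(HoldsReference) == sizeof(HoldsPointer);
}

void check_access() {
    HoldsPointer hp;
    HoldsReference hr;
    assert(hp.p == &hr.r);  // both members lead to the same object
}
```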
When you pass by reference on -O0, you pass the address of the referenced variable, so it is identical to passing by pointer, i.e. passing the address that the const pointer contains. On -Ofast, the compiler optimises this out at the call site if the function can be inlined, because the dynamic scope is known. In the function definition itself, however, the parameter is always dereferenced as a pointer (expecting the address of the referenced variable), because the function may be called from another translation unit where the dynamic scope is unknown to the compiler. The exception is a function declared static: it cannot be used outside the translation unit, so the compiler can pass the argument by value as long as the function does not modify it through the reference. If it is modified, the address of the referenced variable is passed, and on -Ofast it is passed in a register and kept off the stack if the calling convention has enough volatile registers.
There is a very important non-technical difference between pointers and references: An argument passed to a function by pointer is much more visible than an argument passed to a function by non-const reference. For example:
void fn1(std::string s);
void fn2(const std::string& s);
void fn3(std::string& s);
void fn4(std::string* s);
void bar() {
std::string x;
fn1(x); // Cannot modify x
fn2(x); // Cannot modify x (without const_cast)
fn3(x); // CAN modify x!
fn4(&x); // Can modify x (but is obvious about it)
}
Back in C, a call that looks like fn(x) can only be passed by value, so it definitely cannot modify x; to modify an argument you would need to pass a pointer fn(&x). So if an argument wasn't preceded by an & you knew it would not be modified. (The converse, & means modified, was not true because you would sometimes have to pass large read-only structures by const pointer.)
Some argue that this is such a useful feature when reading code that pointer parameters should always be used for modifiable parameters rather than non-const references, even if the function never expects a nullptr. That is, those people argue that function signatures like fn3() above should not be allowed. Older versions of Google's C++ style guide are an example of this.
Maybe some metaphors will help.
In the context of your desktop screen space:
A reference requires you to specify an actual window.
A pointer requires the location of a piece of space on screen that you promise will contain zero or more instances of that window type.
Difference between pointer and reference
A pointer can be initialized to 0, but a reference cannot. A reference must always refer to an object, whereas a pointer can be the null pointer:
int* p = 0;
But we can't write int& p = 0; or int& p = 5;.
To do it properly, we must first declare and define an object, and then we can make a reference to that object. The correct version of the previous code is therefore:
int x = 0;
int y = 5;
int& p = x;
int& p1 = y;
Another important point is that we can declare a pointer without initializing it, whereas a reference must always be bound to a variable or object when it is declared. Using such a pointer is risky, however, so we generally check whether the pointer is null before using it. With a reference no such check is necessary, because we already know that binding it to an object at declaration is mandatory.
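The difference in checking can be sketched as follows. This is a minimal illustration; value_or_default, value_of and demo_null_check are hypothetical names introduced for the example.

```cpp
#include <cassert>

// A pointer parameter may be null, so the callee checks before dereferencing.
int value_or_default(const int *p, int fallback) {
    return p != nullptr ? *p : fallback;
}

// A reference parameter is expected to be bound to a valid object,
// so there is nothing to check.
int value_of(const int &r) {
    return r;
}

void demo_null_check() {
    int x = 42;
    int *unset = nullptr;  // a pointer may legitimately point to nothing
    assert(value_or_default(unset, -1) == -1);
    assert(value_or_default(&x, -1) == 42);
    assert(value_of(x) == 42);
}
```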
Another difference is that a pointer can be made to point to another object, whereas a reference always refers to the same object. Take this example:
int a = 6, b = 5;
int& rf = a;
cout << rf << endl; // Prints 6, because rf refers to a.
rf = b;
cout << a << endl; // Prints 5: the value of b is copied into a, overwriting a's former value; rf still refers to a.
Another point: in a class template such as an STL container, operator [] returns a reference, not a pointer, which makes it easy to read or assign a new value:
std::vector<int> v(10); // Initialize a vector with 10 elements
v[5] = 5; // Write the value 5 into the 6th element of the vector. If operator [] returned a pointer instead of a reference, we would have to write *v[5] = 5; because it returns a reference, we can overwrite the element with a plain assignment "="
Some key pertinent details about references and pointers
Pointers
Pointer variables are declared by placing the * declarator operator after the type, as in int *p
Pointer objects are assigned an address value, for example by assigning an array (which decays to the address of its first element), the address of an object obtained with the unary prefix operator &, or the value of another pointer object
A pointer can be reassigned any number of times, pointing to different objects
A pointer is a variable that holds the assigned address. It takes up storage in memory equal to the size of the address for the target machine architecture
A pointer can be mathematically manipulated, for instance, by the increment or addition operators. Hence, one can iterate with a pointer, etc.
To get or set the contents of the object referred to by a pointer, one must use the unary prefix operator * to dereference it
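The pointer properties listed above can be sketched in a few lines. This is a minimal illustration; sum and demo_pointer_basics are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Sums `count` ints starting at `first` by walking a pointer over them.
int sum(const int *first, std::size_t count) {
    int total = 0;
    for (const int *p = first; p != first + count; ++p) {
        total += *p;  // dereference to read the element
    }
    return total;
}

void demo_pointer_basics() {
    int values[4] = {1, 2, 3, 4};
    int *p = values;        // the array decays to a pointer to its first element
    assert(*p == 1);
    p = &values[2];         // reassigned to point at a different object
    *p = 30;                // write through the pointer
    assert(values[2] == 30);
    assert(*(p + 1) == 4);  // arithmetic moves in units of sizeof(int)
    assert(sum(values, 4) == 37);
}
```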
References
References must be initialized when they are declared.
References are declared by placing the & declarator operator after the type, as in int &r
When initializing a reference, one uses the name of the object it will refer to directly, without the unary prefix operator &
Once initialized, a reference cannot be made to refer to something else by assignment or arithmetic manipulation
There is no need to dereference the reference to get or set the contents of the object it refers to
Assignment operations on the reference manipulate the contents of the object it points to (after initialization), not the reference itself (does not change where it points to)
Arithmetic operations on the reference manipulate the contents of the object it points to, not the reference itself (does not change where it points to)
In pretty much all implementations, a reference that is not optimized away is stored as the address of the referred-to object. Hence, it takes up storage equal to the size of an address on the target machine architecture, just like a pointer object
Even though pointers and references are implemented in much the same way "under-the-hood," the compiler treats them differently, resulting in all the differences described above.
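The reference properties listed above can be sketched the same way. This is a minimal illustration; after_reference_ops is a hypothetical name.

```cpp
#include <cassert>

// Applies assignment and arithmetic through a reference and returns
// the final value of the referred-to variable.
int after_reference_ops() {
    int a = 6;
    int b = 5;
    int &r = a;        // bound at initialization, no & on the initializer
    r = b;             // copies b's value into a; r still refers to a
    assert(a == 5);
    assert(&r == &a);  // the binding has not changed
    ++r;               // arithmetic modifies a, not the binding
    assert(a == 6);
    assert(b == 5);    // b was only read, never modified
    r += 10;
    return a;
}
```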
Article
A recent article I wrote that goes into much greater detail than I can show here and should be very helpful for this question, especially about how things happen in memory:
Arrays, Pointers and References Under the Hood In-Depth Article