Here is the way I understand * and & symbols in C and C++.
In C, * serves two purposes. First, it can be used to declare a pointer variable, like so: int* pointerVariable
It can also be used as a dereference operator, like so: *pointerVariable, which returns the value saved at that address. It knows how to interpret the bytes at that address based on the data type we declared the pointer to point to. In our case that is int*, so it reads the bytes saved at that address and returns a whole number.
We also have the address-of operator in C, like so: &someVariable, which returns the address of the bytes saved under the name someVariable.
However, in C++ (not in C), we can also use & in the declaration of a reference, like so: int& someReference. This turns someReference into a reference, which means that whatever variable we bind it to, it will automatically take the address of that variable and hold it.
Do I get this correctly?
Yes, but it is better to think about pointers and references in terms of what you want to do.
References are very useful for all those cases where you need to refer to some object without copying it. References are simple: a reference must be bound to an object when it is created, and there is no change in syntax when you use the object.
Pointers are for the rest of the cases. Pointers allow you to work with addresses (pointer arithmetic), require explicit syntax to refer to the object behind them (*, &, -> operators), are nullable (NULL, nullptr), can be modified, etc.
In summary, references are simpler and easier to reason about. Use pointers when a reference does not cut it.
General Syntax for defining a pointer:
data-type * pointer-name = &variable-name
The data-type of the pointer must be the same as that of the variable to which it is pointing.
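A pointer of type void * is the exception: it can hold the address of an object of any data type, but it must be converted back to the right pointer type before it is dereferenced.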
General Syntax for defining a reference variable:
data-type & reference-name = variable-name
The data-type of the reference variable must be the same as that of the variable of which it is an alias.
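As a rough illustration of the two declaration forms above, here is a minimal sketch. The function names read_int_through_void and bump_through_alias are made up for this example and are not part of any library.

```cpp
#include <cassert>

// Hypothetical helper: reads an int through a void*.
// A void* can hold the address of any object type, but it carries no
// type information, so it is cast back to int* before dereferencing.
int read_int_through_void(void* raw) {
    assert(raw != nullptr);
    int* typed = static_cast<int*>(raw);
    return *typed;
}

// Hypothetical helper: declares a pointer and a reference to the same int.
int bump_through_alias() {
    int value = 41;
    int* pointer = &value;  // data-type * pointer-name = &variable-name
    int& alias = value;     // data-type & reference-name = variable-name
    alias += 1;             // modifies value itself, no * needed
    return read_int_through_void(pointer);
}
```

Calling bump_through_alias() should give 42, because alias and *pointer both name the same int.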
Let's look at each one of them. For the purpose of explanation, I will go with a simple swap program in both C and C++.
Swapping two variables by passing pointers in C (C's way of simulating pass by reference)
#include <stdio.h>
void swap(int *,int *); //Function prototype
int main()
{
int a = 10;
int b = 20;
printf("Before Swap: a=%d, b=%d\n",a,b);
swap(&a,&b); //Addresses of a and b are passed, so swap can modify them
printf("After Swap: a=%d, b=%d\n",a,b);
return 0;
}
void swap(int *ptra,int *ptrb)
{
int temp = *ptra;
*ptra = *ptrb;
*ptrb = temp;
}
In the code above we have declared and initialized variables a and b to 10 and 20 respectively.
We then pass the addresses of a and b to the swap function by using the address-of (&) operator. This operator gives the address of the variable.
These passed arguments are assigned to the respective formal parameters, which in this case are the int pointers ptra and ptrb.
To swap the variables, we first need to temporarily store the value of one of them. For this, we store the value pointed to by ptra in a variable temp. This is done by first dereferencing the pointer with the dereference (*) operator and then assigning the result to temp. The dereference (*) operator is used to access the value stored in the memory location pointed to by a pointer.
Once the value pointed to by ptra is saved, we can assign that location a new value, in this case the value of b (again with the help of the dereference (*) operator). The location pointed to by ptrb is then assigned the value saved in temp (the original value of a). This swaps the values of a and b by writing directly to the memory locations of those variables.
Note: We can use the dereference (*) operator and the address-of (&) operator together like this: *&a. They cancel each other out, resulting in just a.
We can write a similar program in C++ by using pointers as well, but the language supports another kind of variable known as the reference variable. It provides an alias (an alternative name) for a previously defined variable.
Swapping two variables by the call by reference in C++
#include <iostream>
using namespace std;
void swap(int &,int &); //Function prototype
int main()
{
int a = 10;
int b = 20;
cout << "Before Swap: a= " << a << " b= " << b << endl;
swap(a,b);
cout << "After Swap: a= " << a << " b= " << b << endl;
return 0;
}
void swap(int &refa,int &refb)
{
int temp = refa;
refa = refb;
refb = temp;
}
In the code above, when we pass the variables a and b to the function swap, the parameters refa and refb become references to a and b inside swap. It's like giving each variable another name.
Now we can swap the variables directly through the reference variables, without the dereference (*) operator.
The rest of the logic remains the same.
So before we get into the differences between pointers and references, I feel like we need to talk a little bit about declaration syntax, partly to explain why pointer and reference declarations are written that way and partly because the way many C++ programmers write pointer and reference declarations misrepresents that syntax (get comfortable, this is going to take a while).
In both C and C++, declarations are composed of a sequence of declaration specifiers followed by a sequence of declarators [1]. In a declaration like
static unsigned long int a[10], *p, f(void);
the declaration specifiers are static unsigned long int and the declarators are a[10], *p, and f(void).
Array-ness, pointer-ness, function-ness, and in C++ reference-ness are all specified as part of the declarator, not the declaration specifiers. This means when you write something like
int* p;
it’s parsed as
int (*p);
Since the unary * operator is a unique token, the compiler doesn't need whitespace to distinguish it from the int type specifier or the p identifier. You can write it as int *p;, int* p;, int * p;, or even int*p;
It also means that in a declaration like
int* p, q;
only p is declared as a pointer - q is a regular int.
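One way to check this claim is to ask the compiler for the declared types. The sketch below assumes a C++11 or later compiler and uses decltype and std::is_same from the standard <type_traits> header.

```cpp
#include <type_traits>

int* p, q;  // the * belongs to the declarator p only

// These checks are evaluated at compile time; a wrong claim would not compile.
static_assert(std::is_same<decltype(p), int*>::value, "p is a pointer to int");
static_assert(std::is_same<decltype(q), int>::value, "q is a plain int");
```

If q were also a pointer, the second static_assert would stop the build.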
The idea is that the declaration of a variable closely matches its use in the code ("declaration mimics use"). If you have a pointer to int named p and you want to access the pointed-to value, you use the * operator to dereference it:
printf( "%d\n", *p );
The expression *p has type int, so the declaration of p is written
int *p;
This tells us that the variable p has type "pointer to int" because the combination of p and the unary operator * give us an expression of type int. Most C programmers will write the pointer declaration as shown above, with the * visibly grouped with p.
Now, Bjarne and the couple of generations of C++ programmers who followed thought it was more important to emphasize the pointer-ness of p rather than the int-ness of *p, so they introduced the
int* p;
convention. However, this convention falls down for anything but a simple pointer (or pointer to pointer). It doesn't work for pointers to arrays:
int (*a)[N];
or pointers to functions
int (*f)(void);
or arrays of pointers to functions
int (*p[N])(void);
etc. Declaring an array of pointers as
int* a[N];
just indicates confused thinking. Since [] and () are postfix, you cannot associate the array-ness or function-ness with the declaration specifiers by writing
int[N] a;
int(void) f;
like you can with the unary * operator, but the unary * operator is bound to the declarator in exactly the same way as the [] and () operators are. [2]
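To make the compound declarators above concrete, here is a small sketch that declares each of them and then uses it the way it was declared. The names values, one, two, table, and sum_through_array_pointer are invented for this example, and N is fixed at 3 purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 3;

int values[N] = {1, 2, 3};
int one() { return 1; }
int two() { return 2; }

int (*a)[N] = &values;           // a is a pointer to an array of N ints
int (*f)() = one;                // f is a pointer to a function returning int
int (*table[2])() = {one, two};  // table is an array of 2 pointers to functions

// Sums the array through the pointer to array: (*a) is the array itself,
// so (*a)[i] mirrors the declaration int (*a)[N].
int sum_through_array_pointer() {
    assert(a != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += (*a)[i];
    }
    return total;
}
```

In each case the expression used to reach the int, such as (*a)[i], (*f)(), or (*table[i])(), has the same shape as the declarator.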
C++ references break the rule about "declaration mimics use" hard. In a non-declaration statement, an expression &x always yields a pointer type. If x has type int, &x has type int *. So & has a completely different meaning in a declaration than in an expression.
So that's syntax, let's talk about pointers vs. references.
A pointer is just an address value (although with additional type information). You can do (some) arithmetic on pointers, you can initialize them to arbitrary values (or NULL), you can apply the [] subscript operator to them as though they were an array (indeed, the array subscript operation is defined in terms of pointer operations). A pointer is not required to be valid (that is, contain the address of an object during that object's lifetime) when it's first created.
A reference is another name for an object or function, not just that object's or function's address (this is why you don't use the * operator when working with references). You can't do pointer arithmetic on references, you can't assign arbitrary values to a reference, etc. When instantiated, a reference must refer to a valid object or function. How exactly references are represented internally isn't specified.
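A short sketch of the two points above: subscripting is pointer arithmetic in disguise, while a reference is used with plain object syntax. The helper names element_at and set_to_zero are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// p[i] is defined as *(p + i), so both spellings reach the same element.
int element_at(const int* p, std::size_t i) {
    assert(p != nullptr);
    return *(p + i);  // same as p[i]
}

// A reference is just another name for the caller's object: no * is needed.
void set_to_zero(int& target) {
    target = 0;
}
```

For an array int data[3] = {10, 20, 30}, element_at(data, 2) and data[2] name the same element, and set_to_zero(data[1]) changes the array in place.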
[1] This is the C terminology - the C++ terminology is a little different.
[2] In case it isn't clear by now I consider the T* p; idiom to be poor practice and responsible for no small amount of confusion about pointer declaration syntax; however, since that's how the C++ community has decided to do things, that's how I write my C++ code. I don't like it and it makes me itch, but it's not worth the heartburn to argue over it or to have inconsistently formatted code.
Simple answer:
Reference variables are an alias to the data passed to them, another label.
int var = 0;
int& refVar = var;
In practical terms, var and refVar are the same object.
It's worth noting that a reference to data on the heap cannot be passed to delete directly, because it is an alias of the data, not of the pointer:
int* var = new int{0};
int& refVar = *var;
delete refVar; // error: refVar is an int, and delete needs a pointer
whereas a reference to the pointer itself can be used to deallocate (delete) the data, because it is an alias of the pointer:
int* var = new int{0};
int*& refVar = var;
delete refVar; // good
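If all you have is the reference to the data, the memory can still be released by taking the reference's address, since that yields the original pointer value. This is a sketch to show the mechanics rather than a recommended style; the function name is made up.

```cpp
#include <cassert>

// Returns the value read through the reference before the int is freed.
int read_then_free_through_reference() {
    int* var = new int{42};
    int& refVar = *var;      // alias of the heap int, not of the pointer
    assert(&refVar == var);  // the alias's address is the pointer's value
    int seen = refVar;
    delete &refVar;          // delete needs a pointer operand, so take the address
    return seen;
}
```

After the delete, both var and refVar are dangling and must not be used again.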
I am learning some C++, and I came across pointers and addresses. However, none of the materials I found gives a good explanation of when to use a pointer and when to use an address. As I understand it, when I use a pointer, I POINT to the address in memory where some variable is stored. So for example:
int x = 5;
int k* = &x;
Which will mean that:
k represents x
When I change k, I also change value of x.
My question is as follows: when should I use pointer, and when should I use address? When I declare a function, should I use pointer, or address as a variable?
You have a typo there. It should be:
int x = 5;
int *k = &x;
It should be read as: k points to x. Or if you insist on the "represent" word: *k represents x.
The & operator takes any variable as argument and returns its address (pointer). The * gets a pointer (address) as argument and returns the value stored there. As such they are opposite operations: &*k is the same as k and likewise *&x is just like x. You can look at these expressions this way:
x //an integer
k //pointer to integer
&x //pointer to x
*k //integer pointed to by k
*&x //integer pointed to by a pointer to x, that is x
&*k //pointer to the integer pointed to by k, that is k
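The identities in this list can be checked directly. The sketch below uses assert for the checks; the function name is hypothetical.

```cpp
#include <cassert>

// Verifies that & and * undo each other, then writes through *&x.
int write_through_cancelled_operators() {
    int x = 5;
    int* k = &x;
    assert(&*k == k);  // pointer to what k points to, that is k
    assert(*&x == x);  // integer pointed to by a pointer to x, that is x
    *&x = 9;           // *&x names x itself, so this assigns to x
    assert(*k == 9);   // k still points to x and sees the new value
    return x;
}
```

The function returns 9, the value written through *&x.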
Note that operator & can only be applied to things that have an address, such as variables. Expressions in general produce values that do not have an address you can take (temporaries exist, but that is another matter). For example, this expression is invalid:
&(x + 1) //error!
The funniest thing with these operators is how a pointer is declared:
int *k;
You might think that it would be better written as int &k, but that's not how declarators are read in C++. Instead, that declaration should be read as:
int (*k);
that is, *k is an integer, so it follows that k is a pointer to an integer.
Now, the declaration with initialization is weirder:
int *k = &x;
should actually be read as:
int (*k);
k = &x;
That is, you declare *k as being an integer, thus, k is a pointer-to-integer. And then you initialize k (the pointer) with the address of x.
Actually you can create pointers of any type, even pointers to pointers, pointers to pointers to pointers... but note that this syntax is illegal:
int x;
int **p = &&x; //error, &x is not a variable (and && is read as a single token)
but this is valid:
int x;
int *k = &x;
int **p = &k;
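To see what the valid pointer-to-pointer version does, here is a small sketch that writes to x through p. The function name is made up for illustration.

```cpp
#include <cassert>

// Reaches x through two levels of indirection.
int write_through_double_pointer() {
    int x = 1;
    int* k = &x;      // k holds the address of x
    int** p = &k;     // p holds the address of k (k is a variable, so &k is fine)
    assert(*p == k);  // one dereference yields the int*
    **p = 7;          // two dereferences reach x
    return x;
}
```

The function returns 7, showing that **p and x are the same object.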
If a variable stores an address of a memory location, it is considered a pointer. So an int * for example stores the address of an int-variable. By using the dereference-operator *, you can access the memory location at the address of the pointer and assign to it.
To get the address of a variable, you use the &-operator. This is what your example does. To assign to the memory location where x is stored, you would use the dereference operator again, like this: *k = 0;
Note that the dereference operator and the * used to express a pointer type are two different things. * and & are each other's inverse operators (overloading aside), while int * is a type.
In C++ in particular, if the & is used together with a type, the type of the variable is a reference type, e.g. const string &s. Just like the pointer type, this is different from the address-of operator. Reference types do not need to be dereferenced using *, but will directly modify the memory location they reference.
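The * in a declaration (int *k) declares a pointer, a variable that points to another variable; as a unary operator (*k) it accesses the variable being pointed to.
The address-of operator (&) is used whenever you need the memory address of a variable, for example to store it in a pointer.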
Do int* var and &var both store addresses? Is the only difference that you have to dereference an int* to get the value back, whereas references already do that for you? I'm having trouble understanding these thoroughly.
And why is it that when you have a function that accepts an int* parameter, you can pass values in using &?
They are different ways of expressing what probably eventually boils down to the same thing. They are both constructs that have been invented by the language's designers, meaning your compiler's authors must implement them in whatever manner they see fit for the underlying machine.
However, just because they may represent the same thing on the machine doesn't mean that they are equivalent. Pointers allow the concept of pointing-to-nothing-ness (NULL pointers) and also allow one to perform arithmetic operations to obtain a portion of memory indexed off of a starting position... like so:
int *x = new int [10];
*(x+2) = 5; //set the 3rd element of the array pointed to by 'x' to 5
is perfectly sensible.
References have no notions of such things, i.e. one can do
int *x = new int[10];
*(x+2) = 5;
int &y = *(x+2);
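but arithmetic on the reference does not move it to another element; it just operates on the value it refers to: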
int *x = new int[10];
*(x+2) = 5;
int &y = *(x+2);
y = y + 5;//this just changes the value of x[2]
which means it's more difficult to write off the end of a struct because of bad pointer math, so they are safer provided they've been initialized to something that makes sense (i.e. not returned from a function where they are declared on the stack or to an array element that doesn't exist)
int &dontdoit() {
//don't do this!
int x = 7;
int &y = x;
return y;
}
I think this is perfectly legal and safe in that you won't be corrupting memory, but it's not recommended, as you're mixing idioms and the caller has no obvious way to free the allocated memory (short of writing delete on the address of the returned reference):
int &dontdothiseither() {
int *x = new int;
int &y = *x;
return y;
}
Also, you can set a pointer as many times as you like but not a reference.
int x[2];
int *y = x;//works
y = y+1; //works, now points to x[1];
int &z = x[0]; //works
z = x[1];//nope! This just sets x[0] to be the value in x[1]
int* is a variable whose value is the address of some int. &var is not a reference. The unary & operator simply returns the address of var. That's why if you have a function that takes a parameter of type int*, you use &var at the calling site.
It's a bit confusing since C++ uses & to mean both "address of" and "reference", but the context is what makes them different.
int a = 5;
int& ref = a; // Now ref and a both mean the same thing.
vs
int b = 6;
int* ptr = &b; // Now ptr POINTS to b, but they are not the same thing.
no no no!
A reference becomes a nickname for that variable, whereas a pointer stores the variable's address.
suppose we have a variable of type int :
int x;
int &y=x; // now y is another name for x
int *p=&x; // here p is a pointer, which points to x
Yes, in practice pointers and references both store addresses, and they typically compile to exactly the same code. The main differences are that a reference cannot be null, whereas a pointer can (hence the well-known "null pointer"), and that a reference cannot be rebound after it is initialized. They are also accessed differently by the programmer, using -> for pointers and . for references, but that is of little significance.
You can view a reference as a new "name" for a variable, while a pointer is a variable storing an address.
In practice, a reference might be implemented with pointers (so, it might store an address), but you do not need to worry about that.
For example:
int i;
int *pointer = &i; //Holds address of i
int& reference = i; //reference is a new name of i
int *pointer2 = &reference; //pointer2 holds the address of i
Yes, internally passing a pointer to T and passing a reference to T are the same in any known sensible implementation, references being syntactic sugar for pointers which are always dereferenced and cannot be NULL.
Implementations are not obligated to make sense though.
After optimization, the same is true even for references used inside a function.
Still, they lead to different method / object signatures.
Also, they have different semantic load for the reader.
The & character is used for many different purposes in C++. Two of them look like they might be similar, but they're not.
The first is to create a reference:
int a = 1;
int &b = a;
Now a and b both refer to the same variable, and a change made to one will be reflected in the other.
The other usage is to create a pointer:
int *p = &a;
This has nothing to do with references at all. The & is being used as an address-of operator, taking the variable and creating a pointer that points to it.
In a book I'm reading about C++ (C++ for Dummies) there is a section that says the following:
int nVar = 10;
int* pVar = &nVar;
const int* pcVar = pVar; // this is legal
int* pVar2 = pcVar; // this is not
The book then goes on to explain:
The assignment pcVar = pVar; is okay -- this is adding the const
restriction. The final assignment in the snippet is not allowed since
it attempts to remove the const-ness of pcVar
My question is why is the last line not "legal". I don't understand how that impedes on the "const-ness" of pcVar. Thanks.
const int *pcVar = pVar;
int *pVar2 = pcVar;
If pcVar is const int *, that implies that the int it points to may be const. (It isn't in this case, but it may be.) If you could assign it to pVar2, which is a non-const int *, pVar2 would allow the int it points to to be modified.
So if pcVar actually pointed to a const int and you were allowed to copy its value into an int *, then that int * pointer (pVar2 in this case) would let you modify the const int by dereferencing it, and modifying a const object is undefined behavior. That is why the compiler rejects the conversion.
All the compiler knows is that pcVar is a const int*. That is, it points to a const int. Just because you made it point at a non-const int doesn't matter. For all the compiler knows, the pointer value could have changed at some point to point at a truly const int. Therefore, the compiler won't let you convert from a const int* back to a int* because it would be lying about the constness of the object it was pointing at.
For a simpler example, consider:
const int x = 0;
const int* pc = &x;
int* p = pc; // Illegal
Here, x truly is a const int. If you could do that third line, you could then access the const int object through p (by doing *p) and modify it. That would be bad - x is const for a reason.
However, in the example you gave, since you know that the original object was non-const, you could use const_cast to force the compiler into trusting you:
int* pVar2 = const_cast<int*>(pcVar);
Note that this is only valid if you know for certain that the object is non-const.
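Here is a minimal sketch of the valid case, where the underlying object is known to be non-const. The function name is hypothetical; the variable names follow the question.

```cpp
#include <cassert>

// Writes through a pointer obtained with const_cast.
int write_after_const_cast() {
    int nVar = 10;                         // the object itself is NOT const
    const int* pcVar = &nVar;              // adding const is always allowed
    int* pVar2 = const_cast<int*>(pcVar);  // removing const needs an explicit cast
    assert(pVar2 == &nVar);                // same address, different pointer type
    *pVar2 = 20;                           // well-defined only because nVar is non-const
    return nVar;
}
```

The function returns 20. Had nVar been declared const int, the write through pVar2 would be undefined behavior even though the cast itself compiles.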
It's just saying you can't create a non-const pointer from one that was const (at least, not without a const_cast).
The idea behind const is to have objects that cannot be modified by accident. Getting rid of const through a simple assignment would be quite dangerous, and would allow things like this:
void function(int* m) {
*m = 20;
}
int main() {
const int x = 10;
//Oops! x isn't constant inside function any more, and is now 20!
function(&x);
}
Also, please check out The Definitive C++ Book and Guide List, it has lots of great references (C++ for dummies doesn't quite make the cut).
Implicitly converting a pointer-to-const into a pointer-to-non-const is illegal. The reason is that if you tell the compiler that one location's value is const and then use another pointer to modify that value, you have violated the const contract you made through the first pointer.
pcVar stays the same, but pVar2 points to a non-const int. const can be added but not taken away. The compiler does not look at whether the original nVar is non-const, only at the attempt to assign a pointer-to-const to a pointer-to-non-const. Otherwise you could get around the const and change the value.
int * pVar = &nVar;
*pVar = 4; // is legal
const int* pcVar = pVar; // this is legal
*pcVar = 3; // this is not legal, we said the value was const thus it can not be changed
int* pVar2 = pcVar; // this is not legal because...
*pVar2 = 3; // ...this would have the same effect as *pcVar = 3
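The "const can be added but not taken away" rule can also be stated as a compile-time check. This sketch assumes a C++11 or later compiler and uses std::is_convertible from <type_traits>; the names can_add_const and can_drop_const are invented here.

```cpp
#include <type_traits>

// Is there an implicit conversion from the first pointer type to the second?
constexpr bool can_add_const = std::is_convertible<int*, const int*>::value;
constexpr bool can_drop_const = std::is_convertible<const int*, int*>::value;

static_assert(can_add_const, "int* converts implicitly to const int*");
static_assert(!can_drop_const, "const int* does not convert implicitly to int*");
```

Both assertions hold regardless of what any particular pointer happens to point at, which matches the point that the compiler only looks at the types.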
The second line
int pVar = &nVar;
is an error.
The g++ compiler says:
error: invalid conversion from ‘int*’ to ‘int’
I have a fairly good understanding of the dereferencing operator, the address of operator, and pointers in general.
I however get confused when I see stuff such as this:
int* returnA() {
int *j = &a;
return j;
}
int* returnB() {
return &b;
}
int& returnC() {
return c;
}
int& returnC2() {
int *d = &c;
return *d;
}
In returnA() I'm asking to return a pointer; just to clarify this works because j is a pointer?
In returnB() I'm asking to return a pointer; since a pointer points to an address, the reason why returnB() works is because I'm returning &b?
In returnC() I'm asking for an address of int to be returned. When I return c is the & operator automatically "appended" to c?
In returnC2() I'm asking again for an address of int to be returned. Does *d work because pointers point to an address?
Assume a, b, c are initialized as integers as Global.
Can someone validate if I am correct with all four of my questions?
Although Peter answered your question, one thing that's clearly confusing you is the symbols * and &. The tough part about getting your head around these is that they both have two different meanings that have to do with indirection (even excluding the third meanings of * for multiplication and & for bitwise-and).
* when used as part of a type indicates that the type is a pointer: int is a type, so int* is a pointer-to-int type, and int** is a pointer-to-pointer-to-int type.
& when used as part of a type indicates that the type is a reference. int is a type, so int& is a reference-to-int (there is no such thing as reference-to-reference). References and pointers are used for similar things, but they are quite different and not interchangeable. A reference is best thought of as an alias, or alternate name, for an existing variable. If x is an int, then you can simply assign int& y = x to create a new name y for x. Afterwards, x and y can be used interchangeably to refer to the same integer. The two main implications of this are that references cannot be NULL (since there must be an original variable to reference), and that you don't need to use any special operator to get at the original value (because it's just an alternate name, not a pointer). References also cannot be reassigned.
* when used as a unary operator performs an operation called dereference (which has nothing to do with reference types!). This operation is only meaningful on pointers. When you dereference a pointer, you get back what it points to. So, if p is a pointer-to-int, *p is the int being pointed to.
& when used as a unary operator performs an operation called address-of. That's pretty self-explanatory; if x is a variable, then &x is the address of x. The address of a variable can be assigned to a pointer to the type of that variable. So, if x is an int, then &x can be assigned to a pointer of type int*, and that pointer will point to x. E.g. if you assign int* p = &x, then *p can be used to retrieve the value of x.
So remember, the type suffix & is for references, and has nothing to do with the unary operator &, which has to do with getting addresses for use with pointers. The two uses are completely unrelated. And * as a type suffix declares a pointer, while * as a unary operator performs an action on pointers.
In returnA() I'm asking to return a pointer; just to clarify this works because j is a pointer?
Yes, int *j = &a initializes j to point to a. Then you return the value of j, that is the address of a.
In returnB() I'm asking to return a pointer; since a pointer points to an address, the reason why returnB() works is because I'm returning &b?
Yes. Here the same thing happens as above, just in a single step. &b gives the address of b.
In returnC() I'm asking for an address of int to be returned. When I return c is the & operator automatically appended?
No, it is a reference to an int which is returned. A reference is not an address the same way as a pointer is - it is just an alternative name for a variable. Therefore you don't need to apply the & operator to get a reference of a variable.
In returnC2() I'm asking again for an address of int to be returned. Does *d work because pointers point to an address?
Again, it is a reference to an int which is returned. *d refers to the original variable c (whatever that may be), pointed to by d. And this can implicitly be turned into a reference, just as in returnC.
Pointers do not in general point to an address (although they can - e.g. int** is a pointer to pointer to int). Pointers are an address of something. When you declare the pointer like something*, that something is the thing your pointer points to. So in my above example, int** declares a pointer to an int*, which happens to be a pointer itself.
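Putting the four functions together, and assuming a, b, and c are globals as the question states, the sketch below writes through each returned pointer or reference. The initial values and the helper write_through_all are made up for this example.

```cpp
#include <cassert>

int a = 1, b = 2, c = 3;  // globals, as the question assumes

int* returnA() { int* j = &a; return j; }    // returns the address of a
int* returnB() { return &b; }                // returns the address of b
int& returnC() { return c; }                 // the returned reference is bound to c
int& returnC2() { int* d = &c; return *d; }  // *d is c itself

// Writes through every returned pointer/reference and sums the globals.
int write_through_all() {
    *returnA() = 10;            // pointers need an explicit dereference
    *returnB() = 20;
    returnC() = 30;             // references are used like the object itself
    assert(&returnC2() == &c);  // the reference's address is c's address
    returnC2() += 1;
    return a + b + c;
}
```

After write_through_all() the globals hold 10, 20, and 31, so the function returns 61.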
Tyler, that was a very helpful explanation. I did some experiments using the Visual Studio debugger to clarify this difference even further:
int sample = 90;
int& alias = sample;
int* pointerToSample = &sample;
Name               Value                          Type
&alias             0x0112fc1c {90}                int *
&sample            0x0112fc1c {90}                int *
pointerToSample    0x0112fc1c {90}                int *
*pointerToSample   90                             int
alias              90                             int &
&pointerToSample   0x0112fc04 {0x0112fc1c {90}}   int * *
Memory Layout
  pointerToSample              sample / alias
 ________________            ________________
|   0x0112fc1c   |   ....   |       90       |
|________________|          |________________|
   [0x0112fc04]                [0x0112fc1c]
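The same observations can be reproduced without a debugger. In this sketch the checks are written as asserts; the function name is hypothetical and the variable names follow the experiment above.

```cpp
#include <cassert>

// Writes through the alias and reads the result back through the pointer.
int write_through_alias_read_through_pointer() {
    int sample = 90;
    int& alias = sample;
    int* pointerToSample = &sample;
    assert(&alias == &sample);           // the alias has no address of its own
    assert(pointerToSample == &sample);  // the pointer stores sample's address
    // The pointer is a separate object living at a different address.
    assert(static_cast<void*>(&pointerToSample) != static_cast<void*>(&sample));
    alias = 91;                          // changes sample
    return *pointerToSample;
}
```

The function returns 91, since alias, sample, and *pointerToSample all name the same int.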
In returnC() and returnC2() you are not asking to return the address.
Both these functions return references to objects.
A reference is not the address of anything; it is an alternative name for something. The compiler may or may not use an address to represent it, depending on the situation (it may, for example, know to keep the object in a register).
All you know is that a reference refers to a specific object.
While a reference itself is not an object just an alternative name.
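Your examples are fine as long as a, b, and c are globals (or otherwise outlive the function calls), as you state. If they were local variables, all of these functions would produce undefined behavior at run time, because you would be returning pointers or references to items that disappear after execution leaves the function.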
Let me clarify:
int * returnA()
{
static int a; // The static keyword keeps the variable from disappearing.
int * j = 0; // Declare a pointer to an int and initialize to location 0.
j = &a; // j now points to a.
return j; // return the location of the static variable (evil).
}
If a were a local (automatic) variable in your function, j would be assigned to point to a's temporary location. Upon exit of the function a would disappear, but its former location would still be returned via j. Since a would no longer exist at the location pointed to by j, accessing *j would be undefined behavior.
Local (automatic) variables must not be accessed via a reference or pointer after their function has returned. Such code may compile, but it produces undefined behavior.
Being pedantic, the pointers returned should be declared as pointing to constant data. The references returned should be const:
const char * Hello()
{
static const char text[] = "Hello";
return text;
}
The above function returns a pointer to constant data. Other code can access (read) the static data but cannot modify it.
const unsigned int& Counter()
{
static unsigned int value = 0;
value = value + 1;
return value;
}
In the above function, value is initialized to zero on the first entry. Every call to this function, including the first, increments value by one. The function returns a reference to a constant value. This means that other functions can use the value (from afar) as if it was a variable (without having to dereference a pointer).
In my thinking, a pointer is used for an optional parameter or object. A reference is passed when the object must exist. Inside the function, a referenced parameter means that the value exists, however a pointer must be checked for null before dereferencing it. Also, with a reference, there is more guarantee that the target object is valid. A pointer could point to an invalid address (not null) and cause undefined behavior.
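The guideline above can be sketched as two function signatures. Both function names are invented for this example, and std::string is used only as a convenient parameter type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer parameter: the caller may pass nullptr to mean "no string",
// so the function checks for null before dereferencing.
std::size_t length_or_zero(const std::string* maybe_text) {
    if (maybe_text == nullptr) {
        return 0;
    }
    return maybe_text->size();
}

// Reference parameter: the caller must supply an existing object,
// so there is nothing to check and the object syntax is used directly.
std::size_t length_of(const std::string& text) {
    return text.size();
}
```

With std::string s = "abc", both length_of(s) and length_or_zero(&s) give 3, while length_or_zero(nullptr) gives 0. There is no way to call length_of without an object.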
Semantically, references do act as addresses. However, syntactically, they are the compiler's job, not yours, and you can treat a reference as if it is the original object it points to, including binding other references to it and having them refer to the original object too. Say goodbye to pointer arithmetic in this case.
The downside of that is that you can't change what they refer to: they are bound when they are initialized.