C++ - Understanding References and Memory Addresses

I have read some of the technical differences between references and memory addresses here, however I am trying to find a more abstract way to understand them. Consider the code:
char foo = 'a';
char& bar = foo;
char& bar2 = *(char*)(&foo);
cout << bar << endl;
cout << bar2 << endl;
The output in both cases is 'a'. Is it then correct to conclude from this that a reference (bar2) is simply a memory address (&foo) with an associated type (char)? Or does this explanation fall apart?

The most succinct definition of a reference in C++ is that:
It declares a named variable as a reference, that is, an alias to an already-existing object or function.
Its value is the same as the object it is an alias of.
Its address (obtained by the & operator) is the same as the address of the object it is an alias of.

Any time you have a value (i.e. an object that is the result of evaluating an expression) in C++, you can bind that value to a reference variable. Evaluating the reference later results in an lvalue that is precisely the object which was bound to the reference. For example:
int a = 10;
a; // the value of the expression is immediately discarded
int & r = a; // this time the value is bound to a reference variable
r; // this is the same value as a
r = 20;
Evaluating both a and r produces an lvalue of type int, which is the variable a.
Another example:
Foo f();
f(); // a discarded prvalue of type Foo
Foo && r = f(); // this time the value of f() is bound to r
r.do_stuff();
This time, each evaluation of f() produces a distinct value of type Foo (a "temporary object"). The first one is immediately discarded; the second one is bound to the reference r. Evaluating r produces an lvalue of type Foo, namely the temporary object returned from the second function call.
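A minimal sketch of the two calls described above, assuming a hypothetical Foo that counts its live instances (the counter alive and the helper alive_while_bound are illustrative names, not part of the original answer). It suggests that the discarded temporary is destroyed immediately, while the one bound to r stays alive for as long as r is in scope:

```cpp
#include <cassert>

// Hypothetical Foo for illustration: tracks how many instances currently exist.
struct Foo {
    static int alive;
    Foo() { ++alive; }
    Foo(const Foo&) { ++alive; }
    ~Foo() { --alive; }
    int do_stuff() const { return 42; }
};
int Foo::alive = 0;

Foo f() { return Foo(); }

// Returns the number of live Foo objects observed while r is bound.
int alive_while_bound() {
    f();                          // discarded prvalue: destroyed at the end of this statement
    assert(Foo::alive == 0);

    Foo&& r = f();                // the temporary's lifetime is extended to match r's
    const int seen = Foo::alive;  // the temporary bound to r is still alive here
    assert(r.do_stuff() == 42);   // r is used like any named Foo object
    return seen;
}                                 // r goes out of scope, the temporary is destroyed
```

Calling alive_while_bound() should yield 1, and Foo::alive should be back to 0 afterwards.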

bar2 is identical to bar, since *(char*)(&foo) (value at the address of foo, interpreted as char) is the same as foo itself.
The important thing to understand here is that a reference behaves much like a pointer that is automatically dereferenced and that cannot be re-pointed after initialization. In other words, it is an alias for a variable or memory location.
Consider this:
char *c = new char('x');
char &ref = *c;
ref = 'y';
assert(*c == 'y'); //the memory at c was changed through ref
assert(&ref == c); //the address of ref is the same as c, because ref is an alias for the memory at c
char c2 = 'x';
char &ref2 = c2;
char *p = &c2;
c2 = 'y';
assert(p == &ref2);
assert(ref2 == 'y' && *p == 'y');

Related

C++: What is a de-reference actually doing?

I am reading through Stroustrup's 4th edition : The C++ Programming Language. I have a python/java background so the first 4 chapters are fine so far.
In Chapter 3 I saw:
complex& operator+=(complex z) { re+=z.re , im+=z.im; return *this; }
That began a day long attempt to write this question:
First I figured out that it is returning a reference to the object and not a copy. As I was able to confirm in this question.
And I was able to understand the difference between returning a reference into a reference variable vs. a regular variable from this question
And I did my own trial
class Test {
public:
Test():x{5}{}
int x;
void setX(int a) {x = a;}
Test& operator+=(Test z) {x+=z.x; return *this;}
// the keyword this is a pointer
Test* getTest() {return this;}
// but I can return the reference by *this
Test& getTest1() {return *this;}
// or I can return a copy
Test getTest2() {return *this;}
};
That led me to question why it is called de-reference, so I did this trial
int x = 8;
int* p = &x;
int y = *p;
int& z = *p;
x++; // let's have some fun
std::cout << y << std::endl;
std::cout << z << std::endl;
As expected y = 8 and z = 9, so how did the de-reference return the address in one case, and the value in the other? More importantly how is C++ making that distinction?
It's exactly like in your Test class functions.
int y = *p;
int& z = *p;
y is a copy of what p points to.
z is a reference to (not an address) what p points to. So changing z changes *p and vice-versa. But changing y has no effect on *p.
As expected y = 8 and z = 9, so how did the de-reference return the address in one case, and the value in the other? More importantly how is C++ making that distinction?
The de-reference returned the actual thing referenced in both cases. So there is no distinction for C++ to make. The difference is in what was done with the result of the dereference.
If you do int j = <something>; then the result of the something is used to initialize j. Since j is an integer, the <something> must be an integer value.
If you do int &j = <something>; then the result of the something is still used to initialize j. But now, j is a reference to an integer, and the <something> must be an actual integer object (an lvalue), not just an integer value.
So, what *this does is the same in both cases. How you use a value doesn't affect how that value is computed. But how you use it does affect what happens when you use it. And these two pieces of code use the dereferenced object differently. In one case, its value is taken. In the other case, a reference is bound to it.
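As a small sketch of that point, the hypothetical helper copy_vs_bind below (a name introduced only for this illustration) evaluates *p twice in exactly the same way. Only the declaration on the left decides whether the result is copied or bound:

```cpp
#include <cassert>
#include <utility>

// Returns {y, z} as observed after x has been incremented.
std::pair<int, int> copy_vs_bind() {
    int x = 8;
    int* p = &x;

    int y = *p;   // the value of the object *p designates is copied into a new int
    int& z = *p;  // z is bound to the object *p designates, which is x itself

    ++x;          // changes x, and therefore what z refers to; y is unaffected

    assert(&z == &x);  // z has the same address as x
    assert(&y != &x);  // y is a separate object
    return {y, z};
}
```

With the values from the question, the returned pair should hold 8 for y and 9 for z.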
It's possible to consider a pointer int* p as pointing to an address where data of type int resides. When you de-reference this, the system retrieves the value at that memory address (the address is the actual value of p itself). In the case of int y = *p; you put a copy of that int value on the stack as the locator value y.
On the other hand, de-referencing on the left-hand side in *p = 13; means you are replacing the int value *p stored at the memory address denoted by the value of p with the right-hand-side value 13.
The reference int& z in int& z = *p; is not a copy of the int value pointed to by p but rather another name for whatever object lives at the memory address held by p (that address being the actual value of p itself).
This doesn't make much difference in your contrived case, but consider e.g. a Foo class with a Foo::incrementCount() method:
Foo* p = new Foo();
p->incrementCount();
Foo& ref = *p;
ref.incrementCount();
The same method for the same instance will be called twice. In contrast, Foo foo = *p will actually copy the entire Foo instance, creating a separate copy on the stack. Thus, calling foo.incrementCount() won't affect the separate object still pointed to by p.
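A runnable sketch of this scenario, under the assumption that Foo simply wraps an integer counter (the count() accessor and the function name are made up for the example, and an automatic object stands in for new Foo() so nothing needs deleting):

```cpp
#include <cassert>

// Hypothetical Foo: only incrementCount() is named in the text above.
class Foo {
public:
    void incrementCount() { ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

// Returns the count of the original object after all calls.
int count_after_reference_and_copy() {
    Foo original;
    Foo* p = &original;

    p->incrementCount();     // original: 1
    Foo& ref = *p;           // ref is another name for original
    ref.incrementCount();    // original: 2

    Foo copy = *p;           // a separate object, initialized from original (count 2)
    copy.incrementCount();   // copy: 3, original still 2
    assert(copy.count() == 3);

    return p->count();
}
```

The function should return 2: both calls through p and ref hit the same object, while the call on copy does not.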

Universal reference deduction if the argument is address

Quick query regarding universal references:
Let's say I have this code
int q = 10;
auto && wtf = &q;
This compiles fine, but I have no idea what's happening under the hood. Is it taking a reference to an address? Isn't that a pointer's job?
I tried to work out what auto&&'s type would be by writing:
int & test = &q; // error
int && test = &q; // error too
So what does it become? I need clarification on what's happening, and on the purpose of binding a universal reference to the result of &. I am doing this because I am trying to understand std::bind, since it can take addresses or pointers (a pointer's value being the address of the object it points to).
When you write &q you create a temporary value.
An rvalue [...] is an xvalue, a temporary object (12.2) or subobject thereof, or a value that is not associated with an object.
[§ 3.10]
Those are best bound to rvalue references, so
auto && wtf = &q;
becomes an rvalue reference (&& and && stays &&) to the type of &q. This isn't int, it's int *. That's why your manual attempt failed.
If you instead bind to an lvalue, like a local variable, then you get an lvalue reference:
int * qptr = &q;
auto && wtf2 = qptr;
// auto becomes (int *)&
// & combined with && becomes &
The whole thing can be seen in action here.
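One way to check these deductions without running anything is to let the compiler confirm them. The sketch below wraps the two declarations in a hypothetical function, deduction_matches, and uses static_assert with decltype, so a wrong guess about the type would fail to compile:

```cpp
#include <cassert>
#include <type_traits>

bool deduction_matches() {
    int q = 10;

    auto&& wtf = &q;     // &q is a prvalue of type int*, so auto deduces int*
    static_assert(std::is_same<decltype(wtf), int*&&>::value,
                  "binding to a prvalue yields an rvalue reference to int*");

    int* qptr = &q;
    auto&& wtf2 = qptr;  // qptr is an lvalue, so auto deduces int*& and & && collapses to &
    static_assert(std::is_same<decltype(wtf2), int*&>::value,
                  "binding to an lvalue yields an lvalue reference to int*");

    assert(&wtf2 == &qptr);            // wtf2 is an alias for the pointer variable qptr
    return wtf == &q && wtf2 == &q;    // both pointers hold the address of q
}
```

Both references refer to pointers, not to q itself, which is why int& and int&& could not bind to &q.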

declaration of reference and pointer in c++

For example, suppose F is a reference to an integer, and the reference is not permitted to refer to a new object once it has been bound to one.
Can I write the declaration as: const int & F?
I am confused about references and pointers, because they both represent the address of something, but we usually write reference parameters as const & F. I understand that this avoids a copy and prevents the function from changing the argument, but does it mean anything else? And why do we need "const" after a function declaration like int F(int z) const;? Does this const make the return type const, or everything in the function const?
One more example,
void F(int* p)
{
p+=3;
}
int z=8;
F(&z);
std::cout<<z<<std::endl;
What is the output for z, since z is a reference and I pass it as a pointer that points to an integer? Does increasing p by 3 just make the address different without changing the value?
Just a first pass at some answers - if anything is unclear please comment and I'll try to elaborate.
int a = 3;
declares an integer, a, with the initial value 3, but you are allowed to change it. For example, later you can do
a = 5; // (*)
and a will have the value 5. If you want to prevent this, you can instead write
const int a = 3;
which will make the assignment (*) illegal - the compiler will issue an error.
If you create a reference to an integer, you are basically creating an alias:
int& b = a;
Despite appearances, this does not create a new integer b. Instead, it declares b as an alias for a. If a had the value 3 before, so will b; if you write b = 6 and print the value of a, you will get 6 as well. Just as for a, you can make the assignment b = 6 illegal by declaring the reference as const:
const int& b = a;
means that b is still an alias for a, but it will not be used to assign a different value to a. It will only be used to read the value of a. Note that a itself still may or may not be constant - if you declared it as non-const you can still write a = 6 and b will also be 6.
As for the question about the pointers: the snippet
void F(int* p) {
p += 3;
}
int z = 8;
F(&z);
does not do what you expected. You pass the address of z into the function F, so inside F, the pointer p will point to z. However, what you are doing then is adding 3 to the value of p, i.e. to the address that p holds. So you will change the pointer to point at some (semi)random memory address. Luckily, it's just a copy, and it will be discarded. What you probably wanted to do is increment the value of the integer that p points to, which would be *p += 3. You could have prevented this mistake by making the argument an int* const, meaning: the value of p (i.e. the address pointed to) cannot be changed, but the value it points to (i.e. the value of z, in this case) can. This would have made *p += 3 legal but not the "erroneous" (unintended) p += 3. Other versions would be const int* p, which would make p += 3 legal but not *p += 3, and const int* const, which would have allowed neither.
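The const placements described above can be sketched as follows. The function names add_three, read_only and run_add_three are assumptions made for this illustration, and the lines that would be rejected by the compiler are left commented out:

```cpp
#include <cassert>

// p itself is const: it cannot be re-pointed, but the int it points to can change.
void add_three(int* const p) {
    // p += 3;   // would not compile: p is a const pointer
    *p += 3;     // fine: modifies the int that p points to
}

// The pointee is const when accessed through p: reading is fine, writing is not.
int read_only(const int* p) {
    // *p += 3;  // would not compile: *p is const here
    return *p;
}

int run_add_three() {
    int z = 8;
    add_three(&z);                // z becomes 11
    assert(read_only(&z) == 11);
    return z;
}
```

Here run_add_three() should return 11, which is presumably what the original F was meant to achieve.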
Actually, the way you have written F is dangerous: suppose that you expand the function and later you write (correctly) *p += 3. You think that you are updating the value of z whose address you passed in, while actually you are updating the value of a more-or-less random memory address. In fact, when I tried compiling the following:
// WARNING WARNING WARNING
// DANGEROUS CODE - This will probably produce a segfault - don't run it!
void F(int* p) {
p += 3; // I thought I wrote *p += 3
// ... Lots of other code in between, I forgot I accidentally changed p
*p += 3; // NOOOOOOOOOOO!
}
int main()
{
int z=8;
F(&z);
std::cout << z;
return 0;
}
I got a segmentation fault, because I'm writing at an address where I haven't allocated a variable (for all I know I could have just screwed up my boot sector).
Finally, about const after a function declaration: it makes this a pointer to const. Conceptually, the compiler treats this as const A* instead of just A*. It states your intention that the function will not change the state of the object, which usually means it won't change any of the (internal) variables. For example, it would make the following code illegal:
class A {
int a;
public:
void f() const {
a = 3; // f is const, so it cannot change a!
}
};
A a;
a.f();
Of course, if the function returns something, this value can have its own type, for example
void f();
int f();
int& f();
const int f();
const int& f();
are functions that return nothing, a (copy of an) integer, a (reference to an) integer, a constant (copy of an) integer, and a constant reference to an integer. If in addition f is guaranteed not to change any class fields, you can also add const after the parentheses:
void f() const;
int f() const;
int& f() const;
const int f() const;
const int& f() const;
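To make the effect of the trailing const concrete, here is a small sketch using a hypothetical Counter class (all names below are invented for the example). Only the member functions marked const can be called through a const reference, and a returned const int& still tracks the member it refers to:

```cpp
#include <cassert>

// Hypothetical class for illustration.
class Counter {
public:
    int get() const { return value_; }          // promises not to modify the object
    const int& peek() const { return value_; }  // read-only reference to the member
    void set(int v) { value_ = v; }             // non-const: modifies the object
private:
    int value_ = 0;
};

int read_through_const(const Counter& c) {
    // c.set(1);   // would not compile: set() is not a const member function
    return c.get();
}

int const_member_demo() {
    Counter c;
    c.set(7);
    const int& view = c.peek();  // refers to the member, not to a copy
    c.set(9);
    assert(view == 9);           // the reference sees the updated value
    return read_through_const(c);
}
```

const_member_demo() should return 9. The const on get() and peek() says nothing about the return type; that is controlled separately by writing int or const int&.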
The way I remember the difference between references and pointers is that a reference must exist and the reference cannot change.
A pointer can be changed, and usually needs to be checked against NULL or tested to verify it points to a valid object.
Also, an object passed by reference can be treated syntactically as if it were declared in the function. Pointers must use dereferencing syntax.
Hope that helps.
You are confusing things.
First of all int z=8; F(&z); here z IS NOT a reference.
So let me start with the basics:
When found in a type declaration, the symbol & denotes a reference, but as a unary operator in an expression, the symbol & means address of.
Similarly, in a type declaration * has the meaning of declaring a pointer; as a unary operator in an expression it is the dereferencing operator, denoting that you use the value at an address.
For instance:
int *p : p is a pointer to int.
x = *p : x is assigned the value found at address p.
int &r = a : r is a reference to int, and r refers to the variable a.
p = &a : p is assigned the address of variable a.
Another question you have: the const at the end of a function, like int f(int x) const. This can be used only on non-static class methods and specifies that the function does not modify the object. It has nothing to do with the return value.

c++ references appear reassigned when documentation suggests otherwise

According to this question you can't change what a reference refers to. Likewise the C++ Primer 5th Edition states
Once we have defined a reference, there is no way to make that reference
refer to a different object. When we use a reference, we always get the object to
which the reference was initially bound.
However the following code compiles and prints the value 4 which looks to me like the reference was changed?? Please elaborate if this is or is not so.
int a = 2;
int b = 4;
int &ref = a;
ref = b;
cout << ref;
You are not reassigning a reference. A reference acts as an alias for a variable. In this case, ref is an alias for a, so
ref = b;
is the equivalent of
a = b;
You can easily check that by printing out the value of a:
std::cout << a << std::endl; // prints 4
You can understand how references work by comparing their behavior to that of a pointer. A pointer can be thought of as the name of the address of a variable; however a reference is just the name of the variable itself--it is an alias. An alias, once set, can never be changed whereas you can assign a pointer a new address if you want. So you have:
int main(void)
{
int a = 2;
int b = 4;
int* ptr_a = &a;
int& ref_a = a;
ptr_a = &b; //Ok, assign ptr_a a new address
ref_a = &b; //Error--invalid conversion. References are not addresses.
&ref_a = &b; //Error--the result of the `&` operator is not an lvalue, i.e. you can't assign to it.
return 0;
}
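The contrast between assigning through a reference and re-pointing a pointer can also be checked at run time. In this sketch (the function name is an assumption for illustration), b is changed after both assignments to show which of ref and ptr follows it:

```cpp
#include <cassert>

// Returns the final value of a.
int assign_through_reference() {
    int a = 2;
    int b = 4;
    int& ref = a;
    int* ptr = &a;

    ref = b;    // equivalent to a = b: copies the value 4 into a
    ptr = &b;   // re-points ptr: it now holds the address of b

    b = 10;     // only the pointer observes this change
    assert(ref == 4);     // ref still names a, which kept the copied value
    assert(*ptr == 10);   // ptr follows b
    assert(&ref == &a);   // the reference was never rebound
    return a;
}
```

The function should return 4, matching the printed value in the question.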

Why can I assign a new value to a reference, and how can I make a reference refer to something else?

I have couple of questions related to usage of references in C++.
In the code shown below, how does it work and not give an error at line q = "World";?
#include <iostream>
using namespace std;
int main()
{
char *p = "Hello";
char* &q = p;
cout <<p <<' '<<q <<"\n";
q = "World"; //Why is there no error on this line
cout <<p <<' '<<q <<"\n";
}
How can a reference q be reinitialized to something else?
Isn't the string literal, p = "Hello", a constant or in read-only space? So if we do,
q = "World";
wouldn't the string at p which is supposed to be constant be changed?
I have read that C++ reference variables cannot be reinitialized or reassigned, since they are stored 'internally' as constant pointers, so a compiler would give an error.
But how could a reference variable actually be reassigned?
int i;
int &j = i;
int k;
j = k; //This should be fine, but how we reassign to something else to make compiler flag an error?
I am trying to get hold of this reference, and in that maybe missed some key things related, so these questions.
So any pointers to clear this up, would be useful.
a) It cannot. The line you quote doesn't change the reference q, it changes p.
b) No, the literal is constant, but p is a pointer which points at a literal.
The pointer can be changed; what is being pointed to cannot.
q = "World"; makes the pointer p point to something else.
You seem to think that this code
int i;
int &j = i;
int k;
j = k;
is reassigning a reference, but it isn't.
It's assigning the value of k to i, j still refers to i.
I would guess that this is your major misunderstanding.
An important detail about references that I think you're missing is that once the reference is bound to an object, you can never reassign it. From that point forward, any time you use the reference, it's indistinguishable from using the object it refers to. As an example, in your first piece of code, when you write
q = "World";
Since q is a reference bound to p, this is equivalent to writing
p = "World";
Which just changes where p is pointing, not the contents of the string it's pointing at. (This also explains why it doesn't crash!)
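A short sketch of the same situation follows. It uses const char* rather than char*, on the assumption of a C++11 or later compiler where a string literal no longer converts to char*, and the function name is made up for the example:

```cpp
#include <cassert>
#include <cstring>

bool reseat_pointer_through_reference() {
    const char* hello = "Hello";  // keeps a handle on the first literal
    const char* p = hello;
    const char*& q = p;           // q is an alias for the pointer p, not for the string

    q = "World";                  // same as p = "World": p now points at another literal

    assert(&q == &p);             // q still refers to p
    // p points at "World", and the characters of "Hello" were never touched.
    return std::strcmp(p, "World") == 0 && std::strcmp(hello, "Hello") == 0;
}
```

The function should return true: the assignment changes which literal p points at, and neither literal is modified.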
As for your second question, references cannot be reassigned once bound to an object. If you need to have a reference that can change its referent, you should be using a pointer instead.
Hope this helps!
It is to be noted that since C++20, it is possible to change the reference held by a reference variable inside a class, using placement new, like in the following example taken from this SO post:
struct C {
int& i; // <= a reference field
void foo(const C& other) {
if ( this != &other ) {
this->~C();
new (this) C(other); // valid since C++20 even on a class
// with a reference field
}
}
};
int main() {
int a = 3, b = 5;
C c1 {.i = a};
C c2 {.i = b};
c1.foo(c2); // the inner reference field i inside c1
// was referring to a and now refers to b!
}
Code: http://coliru.stacked-crooked.com/a/4674071ea82ba31b
a) How can a reference q be reinitialized to something else?
It cannot be!
A reference variable remains an alias for the object it was initialized with at the time of creation.
b) Isn't the string literal, p = "Hello", a constant/in read-only space? So if we do,
The literal is constant, but it is not what changes.
char* &q = p;
Here q is a reference to the pointer p, which has type char*. The string is constant but the pointer is not: it can be pointed at another string. The reference is an alias for this pointer, not for the string literal, so the assignment is valid.
c) Second question I have is I have read about C++ reference type variables as they cannot be reinitialized/reassigned, since they are stored 'internally' as constant pointers. So a compiler would give a error.
int i;
int &j = i;
int k;
j = k; //This should be fine, but how we reassign to something else to make compiler flag an error
This does not reassign the reference. It changes the value of the variable for which the reference is an alias.
In this case it sets the value of i to the value of k.
Treat a reference as an alias name and I hope the world of references will be much easier to understand.
int p; // Declares p as an integer; Defines p & allocates space
int &q = p ; // Declares a Reference. Though they are symbolically 2 variables,
// they essentially refer to same name and same memory location.
So, p = 5 and q = 5 will be all the same.
In your example,
char *p = "Hello"; // Declares your pointer to "Hello". p has its own existence.
char* &q = p; // This now creates a reference (alias) to p with name q.
So all in all, p and q are names of the same entity/object (memory).
So, if you assign something to q, it is reflected in p too, because it is the same as an assignment to p.
So q = "World" means p too now points to "World", i.e. the memory location which p and q both refer to holds the address of the first character of "World".
I hope the second question need not be answered if you understand the notion of a reference as an alias.