Related
I did find some questions already on StackOverflow with similar title, but when I read the answers, they were focusing on different parts of the question, which were really specific (e.g. STL/containers).
Could someone please show me why you must use pointers/references for implementing polymorphism? I can understand pointers may help, but surely references only differentiate between pass-by-value and pass-by-reference?
Surely as long as you allocate memory on the heap, so that you can have dynamic binding, that would have been enough. Obviously not.
"Surely so long as you allocate memory on the heap" - where the memory is allocated has nothing to do with it. It's all about the semantics. Take, for instance:
Derived d;
Base* b = &d;
d is on the stack (automatic memory), but polymorphism will still work on b.
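For example, here is a minimal self-contained version of that; Base, Derived and speak are invented purely for the illustration:

#include <iostream>

struct Base {
    virtual void speak() const { std::cout << "Base\n"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    void speak() const override { std::cout << "Derived\n"; }
};

int main()
{
    Derived d;          // automatic storage, no heap involved
    Base* b = &d;
    Base& r = d;
    b->speak();         // prints "Derived": resolved at run-time
    r.speak();          // prints "Derived": references dispatch the same way
}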
If you don't have a base class pointer or reference to a derived class, polymorphism doesn't work because you no longer have a derived class. Take
Base c = Derived();
The c object isn't a Derived, but a Base, because of slicing. So, technically, polymorphism still works, it's just that you no longer have a Derived object to talk about.
Now take
Base* c = new Derived();
c just points to some place in memory, and you don't really care whether that's actually a Base or a Derived, but the call to a virtual method will be resolved dynamically.
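Putting the two cases side by side (Base and Derived here are assumed to have a virtual member function f):

#include <iostream>

struct Base {
    virtual const char* f() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    const char* f() const override { return "Derived"; }
};

int main()
{
    Base c1 = Derived();            // sliced: c1 is a plain Base
    std::cout << c1.f() << '\n';    // prints "Base"

    Base* c2 = new Derived();       // only a pointer is copied, no slicing
    std::cout << c2->f() << '\n';   // prints "Derived": resolved dynamically
    delete c2;
}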
In C++, an object always has a fixed type and size known at compile-time and (if it can and does have its address taken) always exists at a fixed address for the duration of its lifetime. These are features inherited from C which help make both languages suitable for low-level systems programming. (All of this is subject to the as-if rule, though: a conforming compiler is free to do whatever it pleases with code as long as it can be proven to have no detectable effect on any behavior of a conforming program that is guaranteed by the standard.)
A virtual function in C++ is defined (more or less, no need for extreme language lawyering) as executing based on the run-time type of an object; when called directly on an object this will always be the compile-time type of the object, so there is no polymorphism when a virtual function is called this way.
Note that this didn't necessarily have to be the case: object types with virtual functions are usually implemented in C++ with a per-object pointer to a table of virtual functions which is unique to each type. If so inclined, a compiler for some hypothetical variant of C++ could implement assignment on objects (such as Base b; b = Derived()) as copying both the contents of the object and the virtual table pointer along with it, which would easily work if both Base and Derived were the same size. In the case that the two were not the same size, the compiler could even insert code that pauses the program for an arbitrary amount of time in order to rearrange memory in the program and update all possible references to that memory in a way that could be proven to have no detectable effect on the semantics of the program, terminating the program if no such rearrangement could be found: this would be very inefficient, though, and could not be guaranteed to ever halt, obviously not desirable features for an assignment operator to have.
So in lieu of the above, polymorphism in C++ is accomplished by allowing references and pointers to objects to reference and point to objects of their declared compile-time types and any subtypes thereof. When a virtual function is called through a reference or pointer, and the compiler cannot prove that the object referenced or pointed to is of a run-time type with a specific known implementation of that virtual function, the compiler inserts code which looks up the correct virtual function to call at run-time. It did not have to be this way, either: references and pointers could have been defined as being non-polymorphic (disallowing them to reference or point to subtypes of their declared types), forcing the programmer to come up with alternative ways of implementing polymorphism. The latter is clearly possible since it's done all the time in C, but at that point there's not much reason to have a new language at all.
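To illustrate that last point, here is a rough sketch of the kind of hand-rolled, function-pointer-based dispatch a C-style programmer might write; all the names here are invented for the example:

#include <iostream>

// A hand-rolled "vtable": one function pointer per "virtual" operation.
struct ShapeVTable {
    double (*area)(const void* self);
};

struct Circle {
    const ShapeVTable* vptr;    // each object carries a pointer to its type's table
    double radius;
};

double circle_area(const void* self)
{
    const Circle* c = static_cast<const Circle*>(self);
    return 3.14159 * c->radius * c->radius;
}

const ShapeVTable circle_vtable = { &circle_area };

// "Polymorphic" call site: it only needs the object and its table,
// not the concrete type.
double area_of(const void* obj, const ShapeVTable* vt)
{
    return vt->area(obj);
}

int main()
{
    Circle c{ &circle_vtable, 2.0 };
    std::cout << area_of(&c, c.vptr) << '\n';
}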
In sum, the semantics of C++ are designed in such a way to allow the high-level abstraction and encapsulation of object-oriented polymorphism while still retaining features (like low-level access and explicit management of memory) which allow it to be suitable for low-level development. You could easily design a language that had some other semantics, but it would not be C++ and would have different benefits and drawbacks.
I found it helpful to understand that a copy constructor is invoked when initializing like this:
class Base { };
class Derived : public Base { };
Derived x; /* Derived type object created */
Base y = x; /* Copy is made (using Base's copy constructor), so y really is of type Base. Copy can cause "slicing" btw. */
Since y is an actual object of class Base, rather than the original one, functions called on it are Base's functions.
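A small runnable variation of the same thing, with a printing copy constructor added purely to make the mechanism visible:

#include <iostream>

struct Base {
    Base() = default;
    Base(const Base&) { std::cout << "Base copy constructor\n"; }
    virtual const char* name() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    const char* name() const override { return "Derived"; }
};

int main()
{
    Derived x;
    Base y = x;                     // prints "Base copy constructor"
    std::cout << y.name() << '\n';  // prints "Base": y really is a Base
}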
Consider little-endian architectures: values are stored low-order bytes first. So, for any given unsigned integer, the values 0-255 are stored in the first byte of the value. Accessing the low 8 bits of any value simply requires a pointer to its address.
So we could implement uint8 as a class. We know that an instance of uint8 is ... one byte. If we derive from it and produce uint16, uint32, etc., the interface remains the same for purposes of abstraction, but the most important change is the size of the concrete instances of the object.
Of course, if we implemented uint8 and char, the sizes may be the same; likewise sint8.
However, operator= of uint8 and uint16 are going to move different quantities of data.
In order to create a Polymorphic function we must either be able to:
a/ receive the argument by value by copying the data into a new location of the correct size and layout,
b/ take a pointer to the object's location,
c/ take a reference to the object instance,
We can use templates to achieve a, so polymorphism can work without pointers and references, but if we are not counting templates, then let's consider what happens if we implement uint128 and pass it to a function expecting uint8? Answer: 8 bits get copied instead of 128.
So what if we made our polymorphic function accept uint128 and we passed it a uint8? If the uint8 we were copying was unfortunately located, our function would attempt to copy 16 bytes (128 bits), 15 of which might lie outside our accessible memory -> crash.
Consider the following:
#include <cstdint>

struct A {
    int x;
    A(int x_ = 0) : x(x_) {}
};

A fn(A a)
{
    return a;
}

struct B : public A {
    uint64_t a, b, c;
    B(int x_, uint64_t a_, uint64_t b_, uint64_t c_)
        : A(x_), a(a_), b(b_), c(c_) {}
};

B b1 { 10, 1, 2, 3 };
A b2 = fn(b1);
// b2.x == 10, but what happened to a, b and c?
At the time fn was compiled, there was no knowledge of B. However, B is derived from A, so polymorphism should allow us to call fn with a B. But the object it returns should be an A comprising a single int.
If we pass an instance of B to this function, what we get back should be just a { int x; } with no a, b, c.
This is "slicing".
Even with pointers and references we don't avoid this for free. Consider:
std::vector<A*> vec;
Elements of this vector could be pointers to A or something derived from A. The language generally solves this through the use of the "vtable", a small addition to the object's instance which identifies the type and provides function pointers for virtual functions. You can think of it as something like:
template<class T>
struct PolymorphicObject {
    typename T::vtable* __vtptr;  // pointer to the class's table of virtual functions
    T __instance;
};
Rather than every object having its own distinct vtable, classes have them, and object instances merely point to the relevant vtable.
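That per-object pointer is easy to observe through sizeof; the exact numbers are implementation-specific, but a typical 64-bit compiler reports something like 4 for a plain struct and 16 for one with a virtual function:

#include <iostream>

struct Plain   { int i; };                         // no virtual functions
struct Virtual { int i; virtual void f() {} };     // gains a hidden vtable pointer

int main()
{
    // Typically prints something like "4 16" on a 64-bit implementation:
    // the difference is the per-object vtable pointer (plus alignment padding).
    std::cout << sizeof(Plain) << ' ' << sizeof(Virtual) << '\n';
}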
The problem now is not slicing but type correctness:
struct A { virtual const char* fn() { return "A"; } };
struct B : public A { virtual const char* fn() { return "B"; } };
#include <iostream>
#include <cstring>
int main()
{
A* a = new A();
B* b = new B();
memcpy(a, b, sizeof(A));
std::cout << "sizeof A = " << sizeof(A)
<< " a->fn(): " << a->fn() << '\n';
}
http://ideone.com/G62Cn0
sizeof A = 4 a->fn(): B
What we should have done is use *a = *b (i.e. a->operator=(*b))
http://ideone.com/Vym3Lp
but again, this is copying an A to an A and so slicing would occur:
struct A { int i; A(int i_) : i(i_) {} virtual const char* fn() { return "A"; } };
struct B : public A {
int j;
B(int i_) : A(i_), j(i_ + 10) {}
virtual const char* fn() { return "B"; }
};
#include <iostream>
#include <cstring>
int main()
{
A* a = new A(1);
B* b = new B(2);
*a = *b; // aka a->operator=(static_cast<const A&>(*b));
std::cout << "sizeof A = " << sizeof(A)
<< ", a->i = " << a->i << ", a->fn(): " << a->fn() << '\n';
}
http://ideone.com/DHGwun
(i is copied, but B's j is lost)
The conclusion here is that pointers/references are required because the original instance carries its type information and its full set of members with it, and a copy into a base-class object cannot preserve them.
But also that polymorphism is not perfectly solved within C++, and one must be cognizant of the obligation to provide or block operations which could produce slicing.
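One way to block slicing, for example, is to make the base class's copy operations protected (or delete them) so that only derived classes can copy their base subobject. A minimal sketch, with invented Animal/Dog classes:

struct Animal {
    virtual const char* noise() const { return "..."; }
    virtual ~Animal() = default;

protected:
    // Derived classes may still copy their Animal subobject,
    // but code outside the hierarchy cannot slice an Animal out of a Dog.
    Animal() = default;
    Animal(const Animal&) = default;
    Animal& operator=(const Animal&) = default;
};

struct Dog : Animal {
    const char* noise() const override { return "woof"; }
};

int main()
{
    Dog d1;
    Dog d2 = d1;      // fine: copying the whole Dog is still allowed
    // Animal a = d1; // error: Animal's copy constructor is protected,
                      // so accidental slicing does not compile
    (void)d2;
}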
You need pointers or references because, for the kind of polymorphism you are interested in (*), the dynamic type must be able to differ from the static type; in other words, the true type of the object can differ from its declared type. In C++ that happens only with pointers or references.
(*) Genericity, the type of polymorphism provided by templates, needs neither pointers nor references.
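For completeness, a tiny sketch of that template-based (compile-time) polymorphism, which needs neither pointers nor references to a common base class:

#include <iostream>
#include <string>

// No common base class: each type just has to support operator+.
// The right "+" is chosen at compile time for each instantiation.
template <typename T>
T twice(T value)
{
    return value + value;
}

int main()
{
    std::cout << twice(21) << '\n';                 // prints 42
    std::cout << twice(std::string("ab")) << '\n';  // prints abab
}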
When an object is passed by value, it's typically put on the stack. Putting something on the stack requires knowledge of just how big it is. When using polymorphism, you know that the incoming object implements a particular set of features, but you usually have no idea the size of the object (nor should you, necessarily, that's part of the benefit). Thus, you can't put it on the stack. You do, however, always know the size of a pointer.
Now, not everything goes on the stack, and there are other extenuating circumstances. In the case of virtual methods, the pointed-to object carries a pointer to its vtable(s), which indicates where the methods are. This allows the generated code to find and call the functions, regardless of what object it's working with.
Another cause is that very often the object is implemented outside of the calling library, and allocated with a completely different (and possibly incompatible) memory manager. It could also have members that can't be copied, or would cause problems if they were copied with a different manager. There could be side-effects to copying and all sorts of other complications.
The result is that the pointer is the only bit of information on the object that you really properly understand, and provides enough information to figure out where the other bits you need are.
I have been told that references, when they are data members of classes, occupy memory since they will be transformed into constant pointers by the compiler. Why is that? Why does the compiler (I know that it is implementation-specific in general) make a reference a pointer when it is part of a class, as opposed to when it is a local variable?
So in this code:
class A{
public:
A(int &refval):m_ref(refval){};
private:
int &m_ref;
};
m_ref will be treated as a constant pointer (i.e. it does occupy memory).
However, in this code:
void func(int &a){
int &a_ref = a;
}
the compiler just replaces the reference with the actual variable (i.e. it does not occupy memory).
So to simplify a little, my question basically is: what makes it more meaningful to turn references into constant pointers when they are data members than when they are local variables?
The C++ standard only defines the semantics of a reference, not how they are actually implemented. So all answers to this question are compiler-specific. A (silly, but compliant) compiler might choose to store all references on the hard-disk. It's just that it proved to be the most convenient/efficient to store a reference as a constant pointer for class members, and replace the occurrence of the reference with the actual thing where possible.
As an example for a situation where it is impossible for the compiler to decide at compile time to which object a reference is bound, consider this:
#include <iostream>
bool func() {
int i;
std::cin >> i;
return i > 5;
}
int main() {
int a = 3, b = 4;
int& r = func() ? a : b;
std::cout << r;
}
So in general a program has to store some information about references at runtime, and sometimes, for special cases, it can prove at compile time what a reference is bound to.
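On typical implementations (the standard doesn't mandate any particular layout) you can observe this with sizeof: a class whose only data member is a reference usually has the size of a pointer:

#include <iostream>

struct Empty   { };
struct WithRef { int& r; explicit WithRef(int& i) : r(i) {} };

int main()
{
    // Typically prints something like "1 8" on a 64-bit implementation:
    // the reference member is stored like a (constant) pointer.
    std::cout << sizeof(Empty) << ' ' << sizeof(WithRef) << '\n';
}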
The reference (or pointer) has to be stored in memory somewhere, so why not store it along with the rest of the class?
Even in your example, the parameter a (int &a) is stored in memory (probably on the stack); a_ref doesn't use any more memory because it's just an alias, but there is still memory used by a.
Imagine that a class is just a user defined data type. You need to have something which can lead you to the actual thing that you are referencing.
Using the actual value in the second case is more about the compiler and its work to optimize your code.
A reference is meant to be an alias for some variable, so why should this alias use memory when it can be optimized away and the variable taken directly from the stack?
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value. And I feel dirty converting back to a reference; it just seems wrong.
Is it?
To clarify...
MyType *pObj = ...
MyType &obj = *pObj;
Isn't this 'dirty', since you can (even if only in theory since you'd check it first) dereference a NULL pointer?
EDIT: Oh, and you don't know if the objects were dynamically created or not.
Ensure that the pointer is not NULL before you try to convert the pointer to a reference, and that the object will remain in scope as long as your reference does (or remain allocated, in reference to the heap), and you'll be okay, and morally clean :)
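A minimal sketch of that check, reusing the question's MyType and pObj (use_it is just a placeholder for whatever needs the reference):

struct MyType { int value = 0; };

void use_it(MyType& obj) { obj.value = 42; }   // placeholder for real work

void call_through_pointer(MyType* pObj)
{
    if (pObj != nullptr) {        // check before binding the reference
        MyType& obj = *pObj;      // fine: pObj points to a live object
        use_it(obj);              // obj stays valid as long as *pObj does
    }
}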
Initialising a reference with a dereferenced pointer is absolutely fine, nothing wrong with it whatsoever. If p is a pointer, and if dereferencing it is valid (so it's not null, for instance), then *p is the object it points to. You can bind a reference to that object just like you bind a reference to any object. Obviously, you must make sure the reference doesn't outlive the object (like any reference).
So for example, suppose that I am passed a pointer to an array of objects. It could just as well be an iterator pair, or a vector of objects, or a map of objects, but I'll use an array for simplicity. Each object has a function, order, returning an integer. I am to call the bar function once on each object, in order of increasing order value:
#include <vector>
#include <algorithm>

void bar(Foo &f) {
// does something
}
bool by_order(Foo *lhs, Foo *rhs) {
return lhs->order() < rhs->order();
}
void call_bar_in_order(Foo *array, int count) {
std::vector<Foo*> vec(count); // vector of pointers
for (int i = 0; i < count; ++i) vec[i] = &(array[i]);
std::sort(vec.begin(), vec.end(), by_order);
for (int i = 0; i < count; ++i) bar(*vec[i]);
}
The reference that my example has initialized is a function parameter rather than a variable directly, but I could just have validly done:
for (int i = 0; i < count; ++i) {
Foo &f = *vec[i];
bar(f);
}
Obviously a vector<Foo> would be incorrect, since then I would be calling bar on a copy of each object in order, not on each object in order. bar takes a non-const reference, so quite aside from performance or anything else, that clearly would be wrong if bar modifies the input.
A vector of smart pointers, or a boost pointer vector, would also be wrong, since I don't own the objects in the array and certainly must not free them. Sorting the original array might also be disallowed, or for that matter impossible if it's a map rather than an array.
No. How else could you implement operator=? You have to dereference this in order to return a reference to yourself.
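For example, the canonical copy-assignment operator, sketched here for a made-up Widget class, ends with exactly that dereference:

class Widget {
    int data = 0;
public:
    Widget& operator=(const Widget& other)
    {
        data = other.data;
        return *this;   // dereference 'this' to hand back a reference to ourselves
    }
};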
Note though that I'd still store the items in the STL container by value -- unless your object is huge, overhead of heap allocations is going to mean you're using more storage, and are less efficient, than you would be if you just stored the item by value.
My answer doesn't directly address your initial concern, but it appears you encounter this problem because you have an STL container that stores pointer types.
Boost provides the ptr_container library to address these types of situations. For instance, a ptr_vector internally stores pointers to types, but returns references through its interface. Note that this implies that the container owns the pointer to the instance and will manage its deletion.
Here is a quick example to demonstrate this notion.
#include <string>
#include <boost/ptr_container/ptr_vector.hpp>
void foo()
{
boost::ptr_vector<std::string> strings;
strings.push_back(new std::string("hello world!"));
strings.push_back(new std::string());
const std::string& helloWorld(strings[0]);
std::string& empty(strings[1]);
}
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value.
Just to be clear: STL containers were designed to support certain semantics ("value semantics"), such as "items in the container can be copied around." Since references aren't rebindable, they don't support value semantics (i.e., try creating a std::vector<int&> or std::list<double&>). You are correct that you cannot put references in STL containers.
Generally, if you're using references instead of plain objects you're either using base classes and want to avoid slicing, or you're trying to avoid copying. And, yes, this means that if you want to store the items in an STL container, then you're going to need to use pointers to avoid slicing and/or copying.
And, yes, the following is legit (although in this case, not very useful):
#include <iostream>
#include <vector>
// note signature, inside this function, i is an int&
// normally I would pass a const reference, but you can't add
// a "const* int" to a "std::vector<int*>"
void add_to_vector(std::vector<int*>& v, int& i)
{
v.push_back(&i);
}
int main()
{
int x = 5;
std::vector<int*> pointers_to_ints;
// x is passed by reference
// NOTE: this line could have simply been "pointers_to_ints.push_back(&x)"
// I simply wanted to demonstrate (in the body of add_to_vector) that
// taking the address of a reference returns the address of the object the
// reference refers to.
add_to_vector(pointers_to_ints, x);
// get the pointer to x out of the container
int* pointer_to_x = pointers_to_ints[0];
// dereference the pointer and initialize a reference with it
int& ref_to_x = *pointer_to_x;
// use the reference to change the original value (in this case, to change x)
ref_to_x = 42;
// show that x changed
std::cout << x << '\n';
}
Oh, and you don't know if the objects were dynamically created or not.
That's not important. In the above sample, x is on the stack and we store a pointer to x in pointers_to_ints. Sure, pointers_to_ints uses a dynamically-allocated array internally (and delete[]s that array when the vector goes out of scope), but that array holds the pointers, not the pointed-to things. When pointers_to_ints falls out of scope, the internal int*[] is delete[]-ed, but the int*s are not deleted.
This, in fact, makes using pointers with STL containers hard, because the STL containers won't manage the lifetime of the pointed-to objects. You may want to look at Boost's pointer containers library. Otherwise, you'll either (1) want to use STL containers of smart pointers (like boost::shared_ptr, which is legal for STL containers) or (2) manage the lifetime of the pointed-to objects some other way. You may already be doing (2).
If you want the container to actually contain objects that are dynamically allocated, you shouldn't be using raw pointers. Use unique_ptr or whatever similar type is appropriate.
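For instance, a sketch with std::unique_ptr owning the dynamically allocated elements (Foo stands in for your own type):

#include <memory>
#include <vector>

struct Foo { int value = 0; };

int main()
{
    std::vector<std::unique_ptr<Foo>> foos;
    foos.push_back(std::make_unique<Foo>());   // the vector owns the Foo
    Foo& ref = *foos[0];                       // handing out references is still fine
    ref.value = 7;
}   // the Foo is deleted automatically when the vector is destroyed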
There's nothing wrong with it, but please be aware that on machine-code level a reference is usually the same as a pointer. So, usually the pointer isn't really dereferenced (no memory access) when assigned to a reference.
So in real life the reference can be 0 and the crash occurs when the reference is used - which can happen much later than its assignment.
Of course what happens exactly heavily depends on compiler version and hardware platform as well as compiler options and the exact usage of the reference.
Officially the behaviour of dereferencing a 0-Pointer is undefined and thus anything can happen. This anything includes that it may crash immediately, but also that it may crash much later or never.
So always make sure that you never assign a 0-pointer to a reference - bugs like this are very hard to find.
In C++, if I write:
int i = 0;
int& r = i;
then are i and r exactly equivalent?
That means that r is another name for i. They will both refer to the same variable. This means that if you write (after your code):
r = 5;
then i will be 5.
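Spelled out as a complete little program:

#include <iostream>

int main()
{
    int i = 0;
    int& r = i;
    r = 5;
    std::cout << i << '\n';           // prints 5: r and i are the same object
    std::cout << (&r == &i) << '\n';  // prints 1: they share one address
}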
References are slightly different, but for most intents and purposes a reference is used identically to the original variable once it has been declared.
There is slightly different behavior from a reference, let me try to explain.
In your example 'i' represents a piece of memory. 'i' owns that piece of memory -- the compiler reserves it when 'i' is declared, and it is no longer valid (and in the case of a class it is destroyed) when 'i' goes out of scope.
However 'r' does not own its own piece of memory; it represents the same piece of memory as 'i'. No memory is reserved for it when it is declared, and when it goes out of scope it does not cause the memory to be invalid, nor will it call the destructor if 'r' were a class type. If 'i' somehow goes out of scope and is destroyed while 'r' is not, 'r' will no longer represent a valid piece of memory.
For example:
class foo
{
public:
int& r;
foo(int& i) : r(i) {};
};
void bar()
{
foo* pFoo;
if(true)
{
int i=0;
pFoo = new foo(i);
}
pFoo->r=1; // r no longer refers to valid memory
}
This may seem contrived, but with an object factory pattern you could easily end up with something similar if you were careless.
I prefer to think of references as being most similar to pointers during creation and destruction, and most similar to a normal variable type during usage.
There are other minor gotchas with references, but IMO this is the big one.
A reference is an alias of an object, i.e. an alternate name for an object. Read this article for more information - http://www.parashift.com/c++-faq-lite/references.html
A reference is an alias for an existing object.
Yep - a reference should be thought of as an alias for a variable, which is why you can't reassign them like you can reassign pointers (and also means that, even in the case of a non-optimizing compiler, you won't take up any additional storage space).
When used outside of function arguments, references are mostly useful to serve as shorthands for very->deeply->nested.structures->and.fields :)
C++ references differ from pointers in several essential ways:
It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.
Once a reference is created, it cannot later be made to reference another object; it cannot be reseated. This is often done with pointers.
References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global references must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
From Here.
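The "cannot be reseated" point is the one that most often trips people up; assignment through a reference modifies the referred-to object rather than rebinding the reference:

#include <iostream>

int main()
{
    int a = 1, b = 2;

    int& r = a;
    r = b;                            // does NOT rebind r to b; it assigns b's value to a
    std::cout << a << '\n';           // prints 2; r still refers to a

    int* p = &a;
    p = &b;                           // a pointer, by contrast, can be reseated
    std::cout << (p == &b) << '\n';   // prints 1
}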
The syntax int &r = i; creates another name, r, for the variable i; hence we say that r is a reference to i. If you access the value of r, then r == 0 (since i is 0). Remember, a reference is really a direct connection, as it is just another name for the same memory location.
You are writing definitions here, with initializations. That means that you're refering to code like this:
void foo() {
int i = 0;
int& r = i;
}
but not
class bar {
int m_i;
int& m_r;
bar() : m_i(0), m_r(m_i) { }
};
The distinction matters. For instance, you can talk of the effects that m_i and m_r have on sizeof(bar) but there's no equivalent sizeof(foo).
Now, when it comes to using i and r, you can distinguish a few different situations:
Reading, i.e. int anotherInt = r;
Writing, i.e. r = 5
Passing to a function taking an int, i.e. void baz(int); baz(r);
Passing to a function taking an int&, i.e. void baz(int&); baz(r);
Template argument deduction, i.e. template<typename T> void baz(T); baz(r);
As the argument of sizeof, i.e. sizeof(r)
In these cases, they're identical. But there is one very important distinction:
std::string s = std::string("hello");
std::string const& cr = std::string("world");
The reference extends the lifetime of the temporary it's bound to, but the first line makes a copy of it.
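If you want to observe that lifetime extension directly, a small tracer type with a noisy destructor (invented just for this demonstration) shows the temporary outliving the end of the statement:

#include <iostream>

struct Tracer {
    ~Tracer() { std::cout << "Tracer destroyed\n"; }
};

int main()
{
    const Tracer& r = Tracer();      // the temporary's lifetime is extended to match r
    std::cout << "end of main\n";    // printed BEFORE "Tracer destroyed"
}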