Why must a reference variable be initialised when it is declared? - c++

This is a pretty simple question, but I had a doubt and thought I would ask everyone.
As we know, we can declare a reference as
int bar;
int &foo = bar;
My question is: what is the reason behind this initialisation? Why is it required?
Also, why don't I need to initialise pointers when I declare them?
int bar;
int *p;
p = &bar;

A reference, by definition, must refer to a valid object or POD type. It's not allowed to be uninitialized, referring to nothing in particular. Also, once initialized it can't be changed to refer to something else. Thus the only place it makes sense to initialize it is in the declaration (or if it's a member variable, the initializer list of the class constructor).
Other languages allow null references and reassigning references, but that's not the way they work in C++.

While pointers can be NULL (i.e., point to nothing), a reference must always point to something; it has no NULL state. Thus, it can't be created without being initialized.
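A minimal sketch of the two places a reference can be bound, as described above: at its declaration, or in a constructor's member initializer list. The names Holder and demo_reference_init are made up for illustration, and nullptr assumes C++11 or later (NULL works the same way here).

```cpp
#include <cassert>

// Hypothetical class, for illustration only: a reference member has to be
// bound in the constructor's member initializer list.
struct Holder {
    int& ref;
    explicit Holder(int& target) : ref(target) {}
};

// Returns the final value of bar after writing to it through two references.
int demo_reference_init() {
    int bar = 1;
    int& foo = bar;    // must be bound here; "int& foo;" alone does not compile
    int* p = nullptr;  // a pointer may start out pointing at nothing
    p = &bar;          // ...and be pointed at an object later

    foo = 2;           // writes to bar
    Holder h(bar);
    h.ref += 40;       // also writes to bar
    assert(p == &foo); // taking the address of foo yields the address of bar
    return bar;
}
```

Calling demo_reference_init() returns 42, because both foo and h.ref are just other names for bar.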

Related

Everything in c++ by default is passed by value

In C++, are all types passed by value unless it comes with a & or * symbol?
For example in Java, passing an array as a function argument would be by default passing by reference. Does C++ give you more control over this?
EDIT: Thanks for all your responses, I think I understand the whole pass-by-value thing more clearly. For anyone who is still confused about how Java passes by value (a copy of the object reference), this answer really cleared it up for me.
In C++, are all types passed by value unless it comes with a & or *
symbol?
No. If you pass something as a * parameter (that is, a pointer to it), the pointer is still passed by value: a copy of the pointer is made. Both the original and the copy point to the same memory, though. It is a similar concept in C#, and I believe in Java too; you just don't write * there.
That is why if you make changes to the outer objects using this pointer (e.g. using dereferencing), changes will be visible in original object too.
But if you just assign a new value to the pointer, nothing happens to the outer object, e.g.
void foo(int* ptr)
{
// ...
// Below, nothing happens to original object to which ptr was
// pointing, before function call, just ptr - the copy of original pointer -
// now points to a different object
ptr = &someObj;
// ...
}
For example in Java, passing an array as a function argument would be
by default passing by reference. Does C++ give you more control over
this?
In C++ or C, if you pass an array (e.g. int arr[]), what is passed is treated as a pointer to the first element of the array. Hence, what I said above holds true in this case too.
About & you are correct. You can even apply & to pointers (e.g., int *&), in which case now, the pointer indeed gets passed by reference - there is no copy made.
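The cases above can be put side by side in a small sketch. All function names and the global other are hypothetical, chosen only to make the difference between copying a pointer and referencing it visible.

```cpp
#include <cassert>

static int other = 99; // hypothetical global, only here to give pointers a second target

// The pointer is copied: reseating ptr changes only the local copy.
void reseat_copy(int* ptr) { ptr = &other; }

// The pointer is passed by reference: reseating ptr changes the caller's pointer.
void reseat_original(int*& ptr) { ptr = &other; }

// Writing through the copied pointer still reaches the caller's object.
void write_through(int* ptr) { *ptr = 7; }

// An array argument is treated as a pointer to its first element.
void zero_first(int arr[]) { arr[0] = 0; }
```

With int x = 1; int* p = &x;, a call to reseat_copy(p) leaves p pointing at x, write_through(p) sets x to 7, and reseat_original(p) makes p point at other.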
Probably tangential to your question, but I often take another direction to understand what happens when you call a function in C++.
The difference between
void foo(Bar bar); // [1]
void foo(Bar& bar); // [2]
void foo(Bar* bar); // [3]
is that the body in [1] will receive a copy of the original bar (we call this by value, but I prefer to think of it as my own copy).
The body of [2] will be working with the exact same bar object; no copies. Whether we can modify that bar object depends on whether the argument was Bar& bar (as illustrated) or const Bar& bar. Notice that in a well-formed program, [2] will always receive an object (no null references; let's leave dangling references aside).
The body of [3] will receive a copy of the pointer to the original bar. Whether or not I can modify the pointer and/or the object being pointed depends on whether the argument was const Bar* bar, const Bar* const bar, Bar* const bar, or Bar* bar (yes, really). The pointer may or may not be null.
The reason why I make this mental distinction is because a copy of the object may or may not have reference semantics. For example: a copy of an instance of this class:
struct Foo {
std::shared_ptr<FooImpl> m_pimpl;
};
would, by default, have the same "contents" as the original one (a new shared pointer pointing to the same FooImpl object). This, of course, depends on how the programmer designed the class.
For that reason I prefer to think of [1] as "takes a copy of bar", and if I need to know whether such a copy is what I want and need, I go and study the class directly to understand what that particular class means by copy.
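As a rough sketch of that point: the function below takes its own copy of a Foo, yet a change made through the copy is visible in the caller's object, because both copies share one FooImpl. The contents of FooImpl and the function set_value are assumptions added for illustration; default member initializers need C++11.

```cpp
#include <cassert>
#include <memory>

struct FooImpl { int value = 0; }; // hypothetical implementation type

struct Foo {
    std::shared_ptr<FooImpl> m_pimpl = std::make_shared<FooImpl>();
};

// Takes its own copy of foo (case [1]), but the copy shares the same FooImpl.
void set_value(Foo foo, int v) { foo.m_pimpl->value = v; }
```

After Foo f; set_value(f, 5);, f.m_pimpl->value is 5 even though f was passed by value.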

Binary trees in C++ using references

I wish to implement a binary tree using references instead of using pointers (which is generally what you tend to find in every book and every website on the internet). I tried the following code:
class tree_node {
private:
tree_node& left;
tree_node& right;
data_type data;
public:
void set_left(tree_node&);
// ... other functions here
};
void tree_node::set_left(tree_node& new_left) {
this->left = new_left;
}
I get the following error:
error C2582: 'operator =' function is unavailable in 'tree_node'.
I know I can easily implement it using pointers but I would like to keep my solution elegant and free of pointers. Can you tell me where I am going wrong?
You can't change the object that a reference refers to (1); once you initialize a reference, it always refers to the object with which it was initialized.
You should use pointers. There is nothing wrong with using pointers for this (it's clean using pointers as well because parent nodes own their children, so cleanup and destruction is easy!)
(1) Well, you could explicitly call the object's destructor and then use placement new in the assignment operator implementation, but that's just a mess!
You cannot reseat references. What you're trying to do can't be done without a huge amount of bending (you'd essentially destroy a node and create a new one each time you want to modify it).
There's a good reason why all those other people use pointers.
References aren't just pointers with shorter syntax. They're another name for the actual object they refer to, even when used as the lhs of an assignment.
int i = 3;
int j = 4;
int &ref = i;
ref = j;
std::cout << i << "\n"; // prints 4: i itself has been modified,
// because semantically ref *is* i
That is, ref = j has the same effect as i = j, or the same effect as *ptr = j, if you had first done int *ptr = &i;. It means, "copy the contents of the object j, into whatever object ref refers to".
For the full lifetime of ref, it will always refer to i. It cannot be made to refer to any other int, that is to say it cannot be "re-seated".
The same is true of reference data members, it's just that their lifetime is different from automatic variables.
So, when you write this->left = new_left, what that means is, "copy the contents of the object new_left into whatever object this->left refers to". Which (a) isn't what you mean, since you were hoping to re-seat this->left, and (b) even if it was what you meant, it's impossible, since the object this->left refers to has reference members which themselves cannot be reseated.
It's (b) that causes the compiler error you see, although (a) is why you should use pointers for this.
References in C++ don't work the same as references in other languages. Once a reference is set, at construction time, it can't be changed to anything else.
My recommendation is to use boost's shared_ptr class instead of a reference. This will free you of the concern for managing the pointer's deallocation. You may also be interested in Boost's graph library.
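For comparison, here is a minimal sketch of the pointer-based node the answers recommend. It swaps in std::unique_ptr (C++11) instead of the boost shared_ptr mentioned above, on the assumption that each parent exclusively owns its children; the accessors and the int alias for data_type are also assumptions, since the question does not show them.

```cpp
#include <cassert>
#include <memory>
#include <utility>

using data_type = int; // assumption: the question does not say what data_type is

class tree_node {
public:
    explicit tree_node(data_type d) : data(d) {}

    // Takes ownership of the new child; any previous child is destroyed.
    void set_left(std::unique_ptr<tree_node> new_left) { left = std::move(new_left); }
    void set_right(std::unique_ptr<tree_node> new_right) { right = std::move(new_right); }

    // Non-owning access; returns nullptr when there is no child.
    tree_node* get_left() const { return left.get(); }
    tree_node* get_right() const { return right.get(); }
    data_type get_data() const { return data; }

private:
    std::unique_ptr<tree_node> left;
    std::unique_ptr<tree_node> right;
    data_type data;
};
```

Unlike a reference member, left can be reseated as often as needed, and destroying the root cleans up the whole subtree without any explicit delete.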

What is the difference between references and normal variable handles in C++?

In C++, if I write:
int i = 0;
int& r = i;
then are i and r exactly equivalent?
That means that r is another name for i. They will both refer to the same variable. This means that if you write (after your code):
r = 5;
then i will be 5.
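A small sketch of what "another name" means in practice; the function name reference_is_alias is made up for illustration.

```cpp
#include <cassert>

// Returns true if r behaves as another name for i in each check below.
bool reference_is_alias() {
    int i = 0;
    int& r = i;
    r = 5;                                     // assigns to i
    bool same_value = (i == 5);
    bool same_address = (&r == &i);            // taking r's address yields i's address
    bool same_size = (sizeof(r) == sizeof(i)); // sizeof(r) is the size of the referred-to type
    return same_value && same_address && same_size;
}
```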
A reference is slightly different from the variable itself, but for most intents and purposes it is used identically once it has been declared.
The differences in behavior are subtle, so let me try to explain.
In your example 'i' represents a piece of memory. 'i' owns that piece of memory -- the compiler reserves it when 'i' is declared, and it is no longer valid (and in the case of a class it is destroyed) when 'i' goes out of scope.
However, 'r' does not own its own piece of memory; it represents the same piece of memory as 'i'. No memory is reserved for it when it is declared, and when it goes out of scope it does not cause the memory to be invalid, nor will it call the destructor if 'r' refers to a class object. If 'i' somehow goes out of scope and is destroyed while 'r' is not, 'r' will no longer represent a valid piece of memory.
For example:
class foo
{
public:
int& r;
foo(int& i) : r(i) {}
};
void bar()
{
foo* pFoo;
if(true)
{
int i=0;
pFoo = new foo(i);
}
pFoo->r=1; // r no longer refers to valid memory
}
This may seem contrived, but with an object factory pattern you could easily end up with something similar if you were careless.
I prefer to think of references as being most similar to pointers during creation and destruction, and most similar to a normal variable type during usage.
There are other minor gotchas with references, but IMO this is the big one.
A reference is an alias for an object, i.e. an alternate name for it. Read this article for more information: http://www.parashift.com/c++-faq-lite/references.html
A reference is an alias for an existing object.
Yep - a reference should be thought of as an alias for a variable, which is why you can't reassign them like you can reassign pointers (and also means that, even in the case of a non-optimizing compiler, you won't take up any additional storage space).
When used outside of function arguments, references are mostly useful to serve as shorthands for very->deeply->nested.structures->and.fields :)
C++ references differ from pointers in several essential ways:
It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.
Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
From Here.
The syntax int &r=i; creates another name, r, for the variable i; hence we say that r is a reference to i. If you read the value of r, you get 0, the value of i. Remember that a reference is a direct connection: it is just another name for the same memory location.
You are writing definitions here, with initializations. That means that you're referring to code like this:
void foo() {
int i = 0;
int& r = i;
}
but not
class bar {
int m_i;
int& m_r;
bar() : m_i(0), m_r(m_i) { }
};
The distinction matters. For instance, you can talk of the effects that m_i and m_r have on sizeof(bar) but there's no equivalent sizeof(foo).
Now, when it comes to using i and r, you can distinguish a few different situations:
Reading, i.e. int anotherInt = r;
Writing, i.e. r = 5
Passing to a function taking an int, i.e. void baz(int); baz(r);
Passing to a function taking an int&, i.e. void baz(int&); baz(r);
Template argument deduction, i.e. template<typename T> void baz(T); baz(r);
As the argument of sizeof, i.e. sizeof(r)
In these cases, they're identical. But there is one very important distinction:
std::string s = std::string("hello");
std::string const& cr = std::string("world");
The reference extends the lifetime of the temporary it's bound to, whereas the first line makes a copy.
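The lifetime extension can be observed with a small sketch. Tracker is a hypothetical helper type, not part of the answer; it only counts how many of its instances are currently alive.

```cpp
#include <cassert>

// Hypothetical helper type that counts how many instances are currently alive.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    Tracker(const Tracker&) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// Returns the number of live Trackers while a const reference is bound to a temporary.
int alive_while_bound() {
    const Tracker& cr = Tracker(); // the temporary lives as long as cr does
    (void)cr;                      // silence "unused variable" warnings
    return Tracker::alive;         // read before cr, and with it the temporary, is destroyed
}
```

alive_while_bound() returns 1, and Tracker::alive is back to 0 once the function has returned.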

What does `*&` in a function declaration mean?

I wrote a function along the lines of this:
void myFunc(myStruct *&out) {
out = new myStruct;
out->field1 = 1;
out->field2 = 2;
}
Now in a calling function, I might write something like this:
myStruct *data;
myFunc(data);
which will fill all the fields in data. If I omit the '&' in the declaration, this will not work. (Or rather, it will work only locally in the function but won't change anything in the caller)
Could someone explain to me what this '*&' actually does? It looks weird and I just can't make much sense of it.
The & symbol in a C++ variable declaration means it's a reference.
It happens to be a reference to a pointer, which explains the semantics you're seeing; the called function can change the pointer in the calling context, since it has a reference to it.
So, to reiterate, the "operative symbol" here is not *&, that combination in itself doesn't mean a whole lot. The * is part of the type myStruct *, i.e. "pointer to myStruct", and the & makes it a reference, so you'd read it as "out is a reference to a pointer to myStruct".
The original programmer could have helped, in my opinion, by writing it as:
void myFunc(myStruct * &out)
or even (not my personal style, but of course still valid):
void myFunc(myStruct* &out)
Of course, there are many other opinions about style. :)
In C++, & in a parameter declaration means call by reference (C has no references); you allow the function to change the variable.
In this case your variable is a pointer to myStruct type. In this case the function allocates a new memory block and assigns this to your pointer 'data'.
In the past (say K&R) this had to be done by passing a pointer, in this case a pointer-to-pointer or **. The reference operator allows for more readable code, and stronger type checking.
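To make the comparison concrete, here is a sketch of both spellings. The function names are hypothetical, and the two int fields are an assumption based on how the question uses field1 and field2.

```cpp
#include <cassert>

struct myStruct { int field1; int field2; }; // field types assumed to be int

// C style: the caller passes the address of its pointer.
void fill_via_pointer(myStruct** out) {
    *out = new myStruct;
    (*out)->field1 = 1;
    (*out)->field2 = 2;
}

// C++ style: the caller's pointer is bound to the reference directly.
void fill_via_reference(myStruct*& out) {
    out = new myStruct;
    out->field1 = 1;
    out->field2 = 2;
}
```

The call sites differ only in syntax: fill_via_pointer(&data) versus fill_via_reference(data). In both cases the caller's pointer ends up pointing at the new object, and the caller is responsible for deleting it.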
It may be worthwhile to explain why it's not &*, but the other way around. The reason is, the declarations are built recursively, and so a reference to a pointer builds up like
& out // reference to ...
* (& out) // reference to pointer
The parentheses are dropped since they are redundant, but they may help you see the pattern. (To see why they are redundant, imagine how the thing looks in expressions, and you will notice that first the address is taken, and then dereferenced - that's the order we want and that the parentheses won't change). If you change the order, you would get
* out // pointer to ...
& (* out) // pointer to reference
Pointer to reference isn't legal. That's why the order is *&, which means "reference to pointer".
This looks like you are re-implementing a constructor!
Why not just create the appropriate constructor?
Note in C++ a struct is just like a class (it can have a constructor).
struct myStruct
{
int field1;
int field2;
myStruct()
:field1(1)
,field2(2)
{}
};
myStruct* data1 = new myStruct;
// or preferably use a smart pointer (std::auto_ptr was removed in C++17;
// std::unique_ptr from <memory> replaces it since C++11)
std::unique_ptr<myStruct> data2(new myStruct);
// or a normal object
myStruct data3;
In C++ it's a reference to a pointer, sort of equivalent to a pointer to pointer in C, so the argument of the function is assignable.
Like others have said, the & means you're taking a reference to the actual variable into the function as opposed to a copy of it. This means any modifications made to the variable in the function affect the original variable. This can get especially confusing when you're passing a pointer, which is already a reference to something else. In the case that your function signature looked like this
void myFunc(myStruct *out);
What would happen is that your function would be passed a copy of the pointer to work with. That means the pointer would point at the same thing, but would be a different variable. Here, any modifications made to *out (ie what out points at) would be permanent, but changes made to out (the pointer itself) would only apply inside of myFunc. With the signature like this
void myFunc(myStruct *&out);
You're declaring that the function will take a reference to the original pointer. Now any changes made to the pointer variable out will affect the original pointer that was passed in.
That being said, the line
out = new myStruct;
is modifying the pointer variable out and not *out. Whatever out used to point at is still alive and well, but now a new instance of myStruct has been created on the heap, and out has been modified to point at it.
As with most data types in C++, you can read it right-to-left and it'll make sense.
myStruct *&out
out is a reference (&) to a pointer (*) to a myStruct object. It must be a reference because you want to change what out points at (in this case, a new myStruct).
MyClass *&MyObject
Here MyObject is a reference to a pointer to MyClass. So calling myFunction(MyClass *&MyObject) is call by reference: we can change MyObject, which is a reference to a pointer. But if we use myFunction(MyClass *MyObject), we can't change the caller's MyObject, because it is call by value. The address is just copied into a temporary variable, so we can change the value MyObject points to, but not MyObject itself.
So in this case the writer first assigns a new value to out, and that is why call by reference is necessary.

Must I use pointers for my C++ class fields?

After reading a question on the difference between pointers and references, I decided that I'd like to use references instead of pointers for my class fields. However it seems that this is not possible, because they cannot be declared uninitialized (right?).
In the particular scenario I'm working on right now, I don't want to use normal variables (what's the correct term for them by the way?) because they're automatically initialized when I declare them.
In my snippet, bar1 is automatically instantiated with the default constructor (which isn't what I want), &bar2 causes a compiler error because you can't use uninitialized references (correct?), and *bar3 is happy as larry because pointers can be declared uninitialized (by the way, is it best practice to set this to NULL?).
class Foo
{
public:
Bar bar1;
Bar &bar2;
Bar *bar3;
};
It looks like I have to use pointers in this scenario, is this true? Also, what's the best way of using the variable? The -> syntax is a bit cumbersome... Tough luck? What about smart pointers, etc? Is this relevant?
Update 1:
After attempting to implement a reference variable field in my class and initializing it in the constructor, why might I receive the following error?
../src/textures/VTexture.cpp: In constructor ‘vimrid::textures::VTexture::VTexture()’:
../src/textures/VTexture.cpp:19: error: uninitialized reference member ‘vimrid::textures::VTexture::image’
Here's the real code:
// VTexture.h
class VTexture
{
public:
VTexture(vimrid::imaging::ImageMatrix &rImage);
private:
vimrid::imaging::ImageMatrix &image;
};
// VTexture.cpp
VTexture::VTexture(ImageMatrix &rImage)
: image(rImage)
{
}
I've also tried doing this in the header, but no luck (I get the same error).
// VTexture.h
class VTexture
{
public:
VTexture(vimrid::imaging::ImageMatrix &rImage) : image(rImage) { }
};
Update 2:
Fred Larson - Yes! There is a default constructor; I neglected it because I thought it wasn't relevant to the problem (how foolish of me). After removing the default constructor I caused a compiler error because the class is used with a std::vector which requires there to be a default constructor. So it looks like I must use a default constructor, and therefore must use a pointer. Shame... or is it? :)
Answer to Question 1:
However it seems that this is not possible, because they [references] cannot be declared uninitialized (right?).
Right.
Answer to Question 2:
In my snippet, bar1 is automatically
instantiated with the default
constructor (which isn't what I want),
&bar2 causes a compiler error because
you can't use uninitialized references
(correct?),
You initialize references of your class in your constructor's initializer list:
class Foo
{
public:
Foo(Bar &rBar) : bar2(rBar), bar3(NULL)
{
}
Bar bar1;
Bar &bar2;
Bar *bar3;
};
Answer to Question 3:
In the particular scenario I'm working
on right now, I don't want to use
normal variables (what's the correct
term for them by the way?)
There is no single correct name for them; "plain object" or "value member" is usually understood. In most discussions (except this one) you can just say pointers, and everything you need to discuss will also apply to references. You initialize non-pointer, non-reference members in the same way, via the initializer list.
class Foo
{
public:
Foo() : x(0), y(4)
{
}
int x, y;
};
Answer to Question 4:
pointers can be declared uninitialized
(by the way, is it best practice to
set this to NULL?).
They can be declared uninitialized yes. It is better to initialize them to NULL because then you can check if they are valid.
int *p = NULL;
//...
//Later in code
if(p)
{
//Do something with p
}
Answer to Question 5:
It looks like I have to use pointers
in this scenario, is this true? Also,
what's the best way of using the
variable?
You can use either pointers or references, but references cannot be re-assigned and references cannot be NULL. A pointer is just like any other variable, like an int, but it holds a memory address. A reference is an aliased name for another variable.
A pointer has its own memory address, whereas a reference should be seen as sharing the address of the variable it references.
With a reference, after it is initialized and declared, you use it just like you would have used the variable it references. There is no special syntax.
With a pointer, to access the value at the address it holds, you have to dereference the pointer. You do this by putting a * before it.
int x=0;
int *p = &x;//p holds the address of x
int &r(x);//r is a reference to x
//From this point *p == r == x
*p = 3;//change x to 3
r = 4;//change x to 4
//Up until now
int y=0;
p = &y;//p now holds the address of y instead.
Answer to Question 6:
What about smart pointers, etc? Is
this relevant?
Smart pointers (See boost::shared_ptr) are used so that when you allocate on the heap, you do not need to manually free your memory. None of the examples I gave above allocated on the heap. Here is an example where the use of smart pointers would have helped.
void createANewFooAndCallOneOfItsMethods(Bar &bar)
{
Foo *p = new Foo(bar);
p->f();
//The memory for p is never freed here, but if you would have used a smart pointer then it would have been freed here.
}
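A sketch of the smart-pointer version of that function. It uses std::unique_ptr from C++11 rather than boost::shared_ptr, which is a substitution on my part; the Foo and Bar definitions and the live_count counter are simplified stand-ins added so the cleanup can be observed.

```cpp
#include <cassert>
#include <memory>

struct Bar {};

// Simplified stand-in for the Foo above; live_count is an added, hypothetical counter.
struct Foo {
    static int live_count;
    explicit Foo(Bar& bar) : m_bar(bar) { ++live_count; }
    ~Foo() { --live_count; }
    void f() {}
    Bar& m_bar;
};
int Foo::live_count = 0;

void createANewFooAndCallOneOfItsMethods(Bar& bar) {
    std::unique_ptr<Foo> p(new Foo(bar));
    p->f();
    // p goes out of scope here and deletes the Foo it owns.
}
```

After the call returns, Foo::live_count is back to 0, so nothing has leaked.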
Answer to Question 7:
Update 1:
After attempting to implement a
reference variable field in my class
and initializing it in the
constructor, why might I receive the
following error?
The problem is that you didn't specify an initializer list. See my answer to question 2 above. Everything after the colon :
class VTexture
{
public:
VTexture(vimrid::imaging::ImageMatrix &rImage)
: image(rImage)
{
}
private:
vimrid::imaging::ImageMatrix &image;
};
They can be initialized. You just have to use the member initializer list.
Foo::Foo(...) : bar1(...), bar2(...), bar3(...)
{
// Whatever
}
It's a good idea to initialize all of your member variables this way. Otherwise, for other than primitive types, C++ will initialize them with a default constructor anyway. Assigning them within the braces is actually reassigning them, not initializing them.
Also, keep in mind that the member initializer list specifies HOW to initialize the member variables, NOT THE ORDER. Members are initialized in the order in which they are declared, not in the order of the initializers.
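The ordering rule can be checked with a small sketch. The helper note and the struct Ordered are hypothetical; the initializers are deliberately listed in the opposite order to the declarations, which many compilers warn about (for example with -Wreorder).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends a tag to a log and returns a value to initialize with.
inline int note(std::string& log, char tag) {
    log += tag;
    return 0;
}

struct Ordered {
    int first;  // declared first, so initialized first
    int second; // declared second, so initialized second
    // The initializer list names second before first, but that does not
    // change the order in which the members are initialized.
    explicit Ordered(std::string& log)
        : second(note(log, 'b')), first(note(log, 'a')) {}
};
```

Constructing an Ordered with an empty log leaves "ab" in the log: declaration order wins.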
Use the null object design pattern
I'm using ints but it would be the same with any type.
//header file
class Foo
{
public:
Foo( void );
Foo( int& i );
private:
int& m_i;
};
//source file
static int s_null_Foo_m_i;
Foo::Foo( void ) :
m_i(s_null_Foo_m_i)
{ }
Foo::Foo( int& i ) :
m_i(i)
{ }
Now you have to make sure that Foo makes sense when default constructed. You can even detect when Foo has been default constructed.
bool Foo::default_constructed( void )
{
return &m_i == &s_null_Foo_m_i;
}
I absolutely agree with the sentiment: always prefer references over pointers. There are two notable cases where you can't get away with a reference member:
Null has a meaningful value.
This can be avoided with the null object design pattern.
The class has to be assignable.
The compiler will not generate an assignment operator for classes that have a reference member. You can define one yourself, but you will not be able to change where the reference is bound.
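The assignability point can be checked at compile time. This is a sketch with two hypothetical types, using std::is_copy_assignable from <type_traits> (C++11).

```cpp
#include <cassert>
#include <type_traits>

struct WithReference { int& ref; }; // hypothetical example types
struct WithPointer   { int* ptr; };

// The implicit copy assignment operator is deleted when a member is a reference.
static_assert(!std::is_copy_assignable<WithReference>::value,
              "a reference member blocks the implicit operator=");
static_assert(std::is_copy_assignable<WithPointer>::value,
              "a pointer member does not");

// Copy construction still works: the new object binds to the same int.
inline bool copy_shares_target() {
    int x = 1;
    WithReference a{x};
    WithReference b(a);
    b.ref = 2; // writes to x through the copy
    return x == 2 && &a.ref == &b.ref;
}
```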
There is also a side effect to consider when you choose between declaring a Bar and a Bar *:
class Foo
{
public:
Bar bar1; // Here you create a dependency on the definition of Bar, so the
          // header file for Bar always needs to be included.
Bar &bar2;
Bar *bar3; // Here you create a pointer, and a forward declaration is enough;
           // you don't have to include the header file for Bar, which is preferred.
};
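A single-file sketch of the forward-declaration point. In a real project the definition of Bar would live in its own header; the constructor and the value member are assumptions added so the example can be exercised.

```cpp
#include <cassert>

class Bar; // forward declaration only; Bar is still an incomplete type here

class Foo {
public:
    Foo(Bar& r, Bar* p) : bar2(r), bar3(p) {}
    Bar& bar2; // fine with an incomplete type
    Bar* bar3; // fine with an incomplete type
    // Bar bar1; // would not compile here: the full definition of Bar is needed
};

// Stand-in for the contents of Bar's own header.
class Bar {
public:
    int value = 0;
};
```

Only code that actually uses the members of Bar, such as f.bar2.value = 3, needs to see the full definition.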
Using references just because the -> syntax is cumbersome isn't the best reason... References have one great advantage over pointers in that nulls aren't possible without casting trickery, but they also have disadvantages in initialization and carry the risk of accidentally binding to temporaries which then go out of scope (for instance, after an implicit conversion).
Yes, smart pointers such as the boost ones are almost always the right answer for handling composite members, and occasionally for associated members (shared_ptr).
class Foo {
public:
Bar bar1;
Bar &bar2;
Bar *bar3;
// member bar2 must have an initializer in the constructor
Foo(Bar& _bar2) : bar1(), bar2(_bar2), bar3(new Bar()) {}
~Foo() {delete bar3;}
};
Note that bar2 isn't just initialized in the ctor; it's initialized with a bar object that's passed in as a reference parameter. That object and the bar2 field will be bound together for the life of the new Foo object. That is usually a very bad idea, because it's hard to ensure that the lifetimes of the two objects will be well coordinated (i.e., that you will never dispose of the passed-in bar object before disposing of the Foo object.)
This is why it's greatly preferred to use either instance variables (as in bar1) or pointers to objects allocated on the heap (as in bar3.)