member variable as a reference - c++

What is the advantage of declaring a member variable as a reference?
I saw people doing that, and can't understand why.

One useful case is when you don't have access to the constructor of that object, yet don't want to work with indirection through a pointer. For example, if a class A does not have a public constructor and your class wants to accept an A instance in its constructor, you would want to store an A&. This also guarantees that the reference is initialized.

A member reference is useful when you need to have access to another object, without copying it.
Unlike a pointer, a reference cannot be changed (accidentally) so it always refers to the same object.

Generally speaking, types with unusual assignment semantics like std::auto_ptr<> and C++ references make it easier to shoot yourself in the foot (or to shoot off the whole leg).
When a reference is used as a member, the compiler cannot generate a sensible operator=: references cannot be reassigned to refer to another object, so the implicit copy assignment operator is deleted. A hand-written one can only assign to the object referenced, which is a very surprising thing for callers. In other words, having a reference member most of the time makes the class non-assignable.
One can avoid this surprising behaviour by using plain pointers.
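To make the assignment point concrete, here is a minimal sketch. The names Counter, RefWatcher and PtrWatcher are made up for illustration; the only claim is what the type traits report for a reference member versus a pointer member.

```cpp
#include <cassert>
#include <type_traits>

struct Counter { int value = 0; };

// Holds a reference: the binding is fixed at construction.
struct RefWatcher {
    explicit RefWatcher(Counter& c) : counter(c) {}
    Counter& counter;
};

// Holds a pointer: it can be re-pointed, so default assignment works.
struct PtrWatcher {
    explicit PtrWatcher(Counter& c) : counter(&c) {}
    Counter* counter;
};

// The implicitly declared copy assignment of RefWatcher is deleted.
static_assert(!std::is_copy_assignable<RefWatcher>::value,
              "reference member blocks operator=");
static_assert(std::is_copy_assignable<PtrWatcher>::value,
              "pointer member allows operator=");
// Copy construction still works with a reference member.
static_assert(std::is_copy_constructible<RefWatcher>::value,
              "copy construction is fine");
```

Assigning one PtrWatcher to another simply makes both watch the same Counter, which is usually what people expect from assignment.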

Related

Declare reference member variable that you won't have at instantiation

Coming from Java, C++ is breaking my brain.
I need a class to hold a reference to a variable that's defined in the main scope because I need to modify that variable, but I won't be able to instantiate that class until some inner loop, and I also won't have the reference until then. This causes no end of challenges to my Java brain:
I'm used to declaring a variable to establish its scope, well in advance of knowing the actual value that will go in that variable. For example, creating a variable that will hold an object in my main scope like MyClass test; but C++ can't abide a vacuum and will use the default constructor to actually instantiate it right then and there.
Further, given that I want to pass a reference later on to that object (class), if I want the reference to be held as a member variable, it seems that the member variable must be initialized when it's declared. I can't just declare the member variable in my class definition and then use some MyClass::init(int &myreference){} later on to assign the reference when I'll have it in my program flow.
So this makes what I want to do seemingly impossible - pass a reference to a variable to be held as a member variable in the class at any other time than instantiation of that class. [UPDATE, in stack-overflow-rubber-ducking I realized that in this case I CAN actually know those variables ahead of time so can side-step all this mess. But the question I think is still pertinent as I'm sure I'll run into this pattern often]
Do I have no choice but to use pointers for this? Is there some obvious technique that my hours of Google-fu have been unable to unearth?
TLDR; - how to properly use references in class member variables when you can't define them at instantiation / constructor (ie: list initialization)?
Declare reference member variable that you won't have at instantiation
All references must be initialised. If you don't have anything to initialise it to, then you cannot have a reference.
The type that you seem to be looking for is a pointer. Like references, pointers are a form of indirection, but unlike references, pointers can be default initialised, they have a null state, and they can be made to point to an object after their initialisation.
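A rough sketch of that pointer approach, assuming a hypothetical Accumulator class with an init() function like the one the question describes. The pointer is non-owning, so the caller is still responsible for keeping the int alive.

```cpp
#include <cassert>

// Hypothetical class: the target is not known at construction time.
class Accumulator {
public:
    Accumulator() = default;                  // target starts out null
    void init(int& target) { target_ = &target; }
    bool ready() const { return target_ != nullptr; }
    void add(int amount) {
        assert(target_ != nullptr && "init() must be called first");
        *target_ += amount;
    }
private:
    int* target_ = nullptr;                   // non-owning; caller keeps the int alive
};
```

The object can be declared in the outer scope, and init() can be called later from the inner loop once the variable to modify is known.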
Important note: Unlike Java references, C++ references and pointers do not generally extend the lifetime of the object that they refer to. It's very easy to unknowingly keep referring to an object outside of its lifetime, and attempting to access through such invalid reference will result in undefined behaviour. As such, if you do store a reference or a pointer to an object (that was provided as an argument) in a member, then you should make that absolutely clear to the caller who provides the object, so that they can ensure the correct lifetime. For example, you could name the class as something like reference_wrapper (which incidentally is a class that exists in the standard library).
In order to have semantics similar to Java references, you need to have shared ownership such that each owner extends the lifetime of the referred object. In C++, that can be achieved with a shared pointer (std::shared_ptr).
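For comparison, a small sketch of the shared-ownership variant. Config, Worker and make_worker are invented names; the point is only that the object stays alive as long as some std::shared_ptr still refers to it.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Config { int level = 1; };

// Hypothetical holder that shares ownership of a Config.
class Worker {
public:
    void attach(std::shared_ptr<Config> cfg) { cfg_ = std::move(cfg); }
    int level() const { return cfg_ ? cfg_->level : 0; }
private:
    std::shared_ptr<Config> cfg_;             // keeps the Config alive
};

// Attaches a Config created in an inner scope; it outlives that scope.
inline Worker make_worker() {
    Worker w;
    {
        auto cfg = std::make_shared<Config>();
        cfg->level = 7;
        w.attach(cfg);
    }   // cfg goes out of scope, but the Worker still owns the object
    return w;
}
```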
Note, however, that it's generally best not to think in Java and then translate your Java thoughts into C++; it's better to learn to think in C++. Shared ownership is convenient, but it has a cost and you have to consider whether you can afford it. A Java programmer must "unlearn" Java before they can write good C++. You can also substitute C++ and Java with most other programming languages and the same will apply.
it seems that the member variable must be initialized when it's declared.
Member variables aren't directly initialised when they are declared. If you provide an initialiser in a member declaration, that is a default member initialiser which will be used if you don't provide an initialiser for that member in the member initialiser list of a constructor.
You can initialise a member reference to refer to an object provided as an argument in a (member initialiser list of a) constructor, but indeed not after the class instance has been initialised.
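A short sketch of both kinds of initialiser, using hypothetical Settings and Player types. The default member initialiser applies only when a constructor does not initialise that member itself, and the reference member is bound in the member initialiser list.

```cpp
#include <cassert>

struct Settings {
    Settings() = default;                     // uses the default member initialiser
    explicit Settings(int v) : volume(v) {}   // initialiser list takes precedence
    int volume = 5;                           // default member initialiser
};

class Player {
public:
    // The reference can only be bound here, in the member initialiser list.
    explicit Player(Settings& s) : settings_(s) {}
    void louder() { ++settings_.volume; }
private:
    Settings& settings_;                      // refers to the caller's Settings
};
```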
Reference member variables are even more problematic beyond the lifetime challenges that both references and pointers have. Since references can neither be made to refer to other objects nor be default initialised, such a member necessarily makes the class non-"regular", i.e. the class won't behave the way fundamental types do. This makes such classes less intuitive to use.
TL;DR:
Java idioms don't work in C++.
Java references are very different from C++ references.
If you think that you need a reference member, then take a step back and consider another idea. First thing to consider: Instead of referring to an object stored elsewhere, could the object be stored inside the class? Is the class needed in the first place?

What's wrong with having a reference member in C++?

While reading "C++ Coding Standards: 101 Rules, Guidelines, and Best Practices" I came across the following:
Note that using a reference or auto_ptr member is almost always wrong.
However, the text doesn't elaborate on why this should be wrong. So what is so wrong about having a class with reference members?
I think the text is telling you to avoid embedding in a class anything whose existence is out of the control of that class. References and auto pointers may refer to already deleted objects.
One problem might be that references are immutable: once set, they can't be changed. Another problem might be dangling references, where you have a reference to a now destroyed object.
A reference member can never be changed, rendering the class non-assignable. The same goes for members of const object type. Also, the common practice of setting an invalidated pointer to nullptr cannot be applied to a reference.
Simply use a pointer instead. There is also std::reference_wrapper for the cases where a bare pointer really won't do.
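A minimal sketch of the std::reference_wrapper option, with a hypothetical Gauge class. Unlike a plain reference member, the wrapper can be rebound and leaves the class assignable, while still never being null.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical class storing a rebindable, never-null handle to an int.
class Gauge {
public:
    explicit Gauge(int& source) : source_(source) {}
    // Rebinds the wrapper to another int; the ints themselves are untouched.
    void rebind(int& source) { source_ = std::ref(source); }
    int read() const { return source_.get(); }
private:
    std::reference_wrapper<int> source_;
};

static_assert(std::is_copy_assignable<Gauge>::value, "still assignable");
```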
auto_ptr is long obsolete (deprecated in C++11 and removed in C++17), so that simply shouldn't come up. It had wonky copy semantics.

C++: Why can a statically created variable be passed to a function expecting a reference?

I've been programming in C++ for a while but certainly wouldn't call myself an expert. This question isn't being asked to solve a practical problem that I have, it's more about understanding what C++ is doing.
Imagine I have a function that expects a single parameter:
void doSomething(SomeClass& ref)
{
    // do something interesting
}
(Note: the parameter is a reference to SomeClass) Then I call the function like this:
int main(int argc, char *argv[])
{
    SomeClass a;
    doSomething(a);
}
Why is this legal C++? The function is expecting a reference to SomeClass, but I'm passing it a statically allocated variable of type SomeClass. A reference is like a pointer no? If we were to replace the reference with a pointer the compiler complains. Why is the reference different to a pointer in this way, what's going on behind the scenes?
Sorry if this is a stupid question, it's just been bugging me!
I think you'd understand this better if you stopped thinking of references as being similar to pointers. I would say there are two reasons why this comparison is made:
References allow you to pass objects into functions to allow them to be modified. This was a popular use case of pointers in C.
The implementation of pointers and references is usually pretty much the same once it's compiled.
But they are different things. You could think about references as a way of giving a new name to an object. int& x = y; says that I want to give a new name to the object I currently refer to as y. This new name is x. Both of those identifiers, x and y, refer to the same object now.
This is why you pass the object itself as a reference. You are saying that you want the function to have its own identifier to refer to the object that you are passing. If you don't put the ampersand in the parameter list, then the object will be copied into the function. This copying is often unnecessary.
Your code is incorrect - SomeClass a(); is a declaration of a function a returning a SomeClass instance, not the definition of an object, so doSomething(a) would not compile.
Assuming you meant SomeClass a;:
A reference is quite similar to a pointer in most practical ways - the main difference is that you cannot legally have a reference to NULL, whereas you can have a pointer to NULL. As you've noticed, the syntax for pointers and references is different - you can't pass a pointer where a reference is expected.
If you think of a reference as a "pointer that can't be null and can't be made to point elsewhere" you're pretty much covered. You're passing something which refers to your local a instance - if doSomething modifies its parameter then it's really directly modifying your local a.
SomeClass a();
This is a function signature, not an object.
It should be
SomeClass a; // a is an object
Then your code is valid.
Why is this legal C++?
(assuming you fixed the previous point)
The C++ standard says that if your function parameter is a non-const reference, then you must provide an object that has a name (an lvalue). So here, it's legal. If it were a const reference, you could even provide a temporary (an rvalue, which has no name).
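A quick sketch of that rule, using two made-up helper functions. The commented-out call is the one the compiler rejects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Non-const reference: needs a named object (an lvalue).
inline void append_bang(std::string& s) { s += "!"; }

// Const reference: also accepts temporaries (rvalues).
inline std::size_t length_of(const std::string& s) { return s.size(); }

// append_bang(std::string("hi"));  // would not compile: a temporary cannot
//                                  // bind to a non-const lvalue reference
```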
The function is expecting a reference to SomeClass, but I'm passing it a statically allocated variable of type SomeClass.
It's expecting a reference to an non-const instance of SomeClass, that is what you did provide.
That instance is not static, it's just allocated on the stack. How an object is allocated has nothing to do with the way it can be manipulated; only its scope does. The way the object is allocated (on the stack like here, or on the heap by using new/delete) only determines the lifetime of the object. Even a static object could be passed to your function, as long as it's not const.
I think you're mixing some language concepts here...
A reference is like a pointer no?
No.
A reference is a "nickname" of an object. No more, no less.
Okay, in practice it is usually implemented as a pointer with special rules, but that's not true in every use: the compiler is free to implement it in whatever way it wants.
In the case of a function parameter, it's often implemented as a pointer. But you don't even have to know about it.
For you, it's just the nickname of an object.
If we were to replace the reference with a pointer the compiler complains. Why is the reference different to a pointer in this way, what's going on behind the scenes?
I guess your first error did make things fuzzy for you?
You're not passing it "a statically allocated variable of type SomeClass", you're passing it a reference to a SomeClass object you created on the stack in main().
Your function
void doSomething(SomeClass& ref)
Causes a reference to a in main to be passed in. That's what & after the type in the parameter list does.
If you left out the &, then SomeClass's (a's) copy constructor would be called, and your function would get a new, local copy of the a object in main(); in that case anything you did to ref in the function wouldn't be seen back in main()
Perhaps your confusion arises from the fact that if you have two functions:
void doThingA(int a) {
    a=23;
}
void doThingB(int &a) {
    a=23;
}
The calls to them look the same, but are in fact very different:
int a=10;
doThingA(a);
doThingB(a);
The first case, doThingA(int), creates a completely new variable with the value 10, assigns 23 to it and returns. The original variable in the caller remains unchanged. In the second case, doThingB(int&), where the variable is passed by reference, a new name is created with the same address as the variable passed in. This is what people mean when they say passing by reference is like passing by pointer. Because both names share the same address (occupy the same memory location), when doThingB(int&) changes the value passed in, the variable in the caller is also changed.
I like to think of it as passing pointer without all that annoying pointer syntax. Having said that though, I find functions that modify variables passed by reference to be confusing, and I almost never do it. I would either pass by const reference
void doThingB(const int &a);
or, if I want to modify the value, explicitly pass a pointer.
void doThingB(int *a);
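Putting the styles from this answer side by side, as a sketch with made-up function names. Note how only the pointer version shows at the call site that the argument may be modified.

```cpp
#include <cassert>

inline void by_value(int a)       { a = 23; }          // changes a private copy
inline void by_reference(int& a)  { a = 23; }          // changes the caller's variable
inline void by_pointer(int* a)    { if (a) *a = 23; }  // caller writes &x; null is possible
inline int  read_only(const int& a) { return a; }      // cannot modify the argument
```

The calls by_value(x) and by_reference(x) look identical, whereas by_pointer(&x) makes the possible modification visible to the reader of the calling code.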
A reference is not a pointer. You simply pass the argument as if "by value" and use it. Under the hood a pointer will usually be used, but that is just an implementation detail.
A reference is nothing at all like a pointer, it is an alias - a new name - for some other object. That is one reason for having both!
Consider this:
Someclass a;
Someclass& b = a;
Someclass& c = a;
Here we first create an object a, and then we say that b and c are other names for the same object. Nothing new is created, just two additional names.
When b or c is a parameter to a function, the alias for a is made available inside the function, and you can use it to refer to the actual object.
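One way to convince yourself of the alias idea is to compare addresses. This is only a sketch; Someclass is given a hypothetical value member here, and same_object and set_value are invented helpers.

```cpp
#include <cassert>

struct Someclass { int value = 0; };  // 'value' is assumed for illustration

// Returns true if the parameter is the very same object as 'original'.
inline bool same_object(const Someclass& param, const Someclass& original) {
    return &param == &original;
}

// Modifies the caller's object through the alias.
inline void set_value(Someclass& target, int v) { target.value = v; }
```

Taking the address of a reference yields the address of the object it names, so &b, &c and &a all compare equal.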
It is that simple! You don't have to jump through any hoops with &, *, or -> like when using pointers.
Though the question has been already answered adequately, I can't resist sharing few more words on the related language feature of "References in C++".
As C programmers, we have two options available when passing variables to functions:
Pass the value of the variable (creating a new copy)
Pass the pointer to the variable.
When it comes to C++, we are usually dealing with objects. Copying such objects on each function call that needs to work on that object is not recommended due to space (and also speed) considerations. There are benefits involved with passing the variable address (via the pointer approach), and though we can make the pointer 'const' to avoid any changes through the pointer, the syntax with pointers is rather clumsy (miss the dereference operator at a place or two and end up spending hours debugging!).
C++, in providing 'references', packages the best of both options:
The reference can be understood to be as good as passing the address
The syntax to use the reference is the same as working on the variable itself.
The reference would always point to 'something'. Hence no 'null-pointer' exceptions.
Additionally, if we make the reference 'const', we disallow any changes to the original variable.

Operator & and * at function prototype in class

I'm having a problem with a class like this:
class Sprite {
    ...
    bool checkCollision(Sprite &spr);
    ...
};
So, if I have that class, I can do this:
ball.checkCollision(bar1);
But if I change the class to this:
class Sprite {
    ...
    bool checkCollision(Sprite* spr);
    ...
};
I have to do this:
ball.checkCollision(&bar1);
So, what's the difference? Is one way better than the other?
Thank you.
In both cases you are actually passing the address of bar1 (and you're not copying the value), since both pointers (Sprite *) and references (Sprite &) have reference semantics, in the first case explicit (you have to explicitly dereference the pointer to manipulate the pointed object, and you have to explicitly pass the address of the object to a pointer parameter), in the second case implicit (when you manipulate a reference it's as if you're manipulating the object itself, so they have value syntax, and the caller's code doesn't explicitly pass a pointer using the & operator).
So, the big difference between pointers and references is on what you can do on the pointer/reference variable: pointer variables themselves can be modified, so they may be changed to point to something else, can be NULLed, incremented, decremented, etc, so there's a strong separation between activities on the pointer (that you access directly with the variable name) and on the object that it points to (that you access with the * operator - or, if you want to access to the members, with the -> shortcut).
References, instead, aim to be just an alias to the object they point to, and do not allow changes to the reference itself: you initialize them with the object they refer to, and then they act as if they were such object for their whole life.
In general, in C++ references are preferred over pointers, for the motivations I said and for some other that you can find in the appropriate section of C++ FAQ.
In terms of performance, they should be the same, because a reference is actually a pointer in disguise; still, there may be some corner case in which the compiler may optimize more when the code uses a reference instead of a pointer, because references are guaranteed not to change the address they hide (i.e., from the beginning to the end of their life they always point to the same object), so in some strange case you may gain something in performance using references, but, again, the point of using references is about good programming style and readability, not performance.
A reference cannot be null. A pointer can.
If you don't want to allow passing null pointers into your function then use a reference.
With the pointer you need to specifically let the compiler know you want to pass the address of the object; with a reference, the compiler already knows you want the pointer. Both are OK, it's a matter of taste. I personally don't like references because I like to see what's going on, but that's just me.
They both do the (essentially) same thing - they pass an object to a function by reference so that only the address of the object is copied. This is efficient and means the function can change the object.
In the simple case you give they are equivalent.
The main difference is that the reference cannot be null, so you don't have to test for null in the function - but you also cannot pass "no object" in cases where that would be valid.
Some people also dislike the pass by reference version because it is not obvious in the calling code that the object you pass in might be modified. Some coding standards recommend you only pass const references to functions.
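To illustrate the null difference, here is a sketch of both signatures on a simplified Sprite. The x and width members and the 1-D overlap test are assumptions made only for this example, not the asker's real class.

```cpp
#include <cassert>

struct Sprite {
    int x = 0;      // hypothetical 1-D position, for illustration only
    int width = 1;  // hypothetical extent

    // Reference parameter: the caller must supply a real Sprite.
    bool checkCollision(const Sprite& other) const {
        return x < other.x + other.width && other.x < x + width;
    }
    // Pointer parameter: "no sprite" is expressible, so it has to be checked.
    bool checkCollision(const Sprite* other) const {
        return other != nullptr && checkCollision(*other);
    }
};
```

The reference overload is called as ball.checkCollision(bar1) and the pointer overload as ball.checkCollision(&bar1); only the latter has to decide what a null argument means.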

Should I prefer pointers or references in member data?

This is a simplified example to illustrate the question:
class A {};
class B
{
public:
    B(A& a) : a(a) {}
    A& a;
};
class C
{
public:
    C() : b(a) {}
    A a;
    B b;
};
So B is responsible for updating a part of C. I ran the code through lint and it whinged about the reference member: lint#1725.
This talks about taking care over default copy and assignments which is fair enough, but default copy and assignment is also bad with pointers, so there's little advantage there.
I always try to use references where I can since naked pointers introduce uncertainty about who is responsible for deleting that pointer. I prefer to embed objects by value but if I need a pointer, I use auto_ptr in the member data of the class that owns the pointer, and pass the object around as a reference.
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My own rule of thumb :
Use a reference member when you want the life of your object to be dependent on the life of other objects: it's an explicit way to say that you don't allow the object to be alive without a valid instance of another class, because there is no assignment and the reference must be initialized via the constructor. It's a good way to design your class without assuming anything about whether its instance is a member of another class or not. You only assume that their lives are directly linked to other instances. It allows you to change later how you use your class instance (with new, as a local instance, as a class member, generated by a memory pool in a manager, etc.)
Use a pointer in other cases: when you want the member to be changed later, use a pointer, or a pointer to const to be sure to only read the pointed-to instance. If that type is supposed to be copyable, you cannot use references anyway. Sometimes you also need to initialize the member after a special function call (init() for example) and then you simply have no choice but to use a pointer. BUT: use asserts in all your member functions to quickly detect a wrong pointer state!
In cases where you want the object lifetime to be dependent on an external object's lifetime, and you also need that type to be copyable, then use pointer members but a reference argument in the constructor. That way you are indicating on construction that the lifetime of this object depends on the argument's lifetime, BUT the implementation uses pointers so it is still copyable. As long as these members are only changed by copy, and your type doesn't have a default constructor, the type should fulfil both goals.
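The "pointer member, reference argument" idea could look roughly like this. Engine and Tachometer are hypothetical names chosen for the sketch.

```cpp
#include <cassert>
#include <type_traits>

struct Engine { int rpm = 0; };

// Hypothetical class that depends on an Engine it does not own.
class Tachometer {
public:
    // Reference argument: the caller cannot pass "nothing".
    explicit Tachometer(Engine& e) : engine_(&e) {}
    int read() const { return engine_->rpm; }
private:
    Engine* engine_;  // pointer member keeps the class copyable and assignable
};

static_assert(!std::is_default_constructible<Tachometer>::value,
              "no empty state");
static_assert(std::is_copy_assignable<Tachometer>::value, "copyable");
```

Since the pointer is only ever set from a reference or copied from another Tachometer, it cannot become null through this interface, although it can still dangle if the Engine is destroyed first.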
Avoid reference members, because they restrict what the implementation of a class can do (including, as you mention, preventing the implementation of an assignment operator) and provide no benefits to what the class can provide.
Example problems:
you are forced to initialise the reference in each constructor's initialiser list: there's no way to factor out this initialisation into another function (before C++11, anyway; since C++11 there are delegating constructors)
the reference cannot be rebound or be null. This can be an advantage, but if the code ever needs changing to allow rebinding or for the member to be null, all uses of the member need to change
unlike pointer members, references can't easily be replaced by smart pointers or iterators as refactoring might require
Whenever a reference is used it looks like value type (. operator etc), but behaves like a pointer (can dangle) - so e.g. Google Style Guide discourages it
Objects should rarely allow assignment and other operations like comparison. If you consider some business model with objects like 'Department', 'Employee', 'Director', it is hard to imagine a case where one employee would be assigned to another.
So for business objects it is very good to describe one-to-one and one-to-many relationships as references and not pointers.
And probably it is OK to describe one-or-zero relationship as a pointer.
So no 'we can't assign' then factor.
A lot of programmers are just used to pointers, and that's why they will find any argument to avoid using references.
Having a pointer as a member will force you or a member of your team to check the pointer again and again before use, with a "just in case" comment. If a pointer can be zero then the pointer is probably being used as a kind of flag, which is bad, as every object should play its own role.
Use references when you can, and pointers when you have to.
In a few important cases, assignability is simply not needed. These are often lightweight algorithm wrappers that facilitate calculation without leaving the scope. Such objects are prime candidates for reference members since you can be sure that they always hold a valid reference and never need to be copied.
In such cases, make sure to make the assignment operator (and often also the copy constructor) non-usable (by inheriting from boost::noncopyable or declaring them private).
However, as user pts already commented, the same is not true for most other objects. Here, using reference members can be a huge problem and should generally be avoided.
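A sketch of such a lightweight wrapper, using C++11 deleted functions in place of boost::noncopyable or private declarations. Summer is a made-up name, and the caller is assumed to keep the vector alive for as long as the wrapper is used.

```cpp
#include <cassert>
#include <numeric>
#include <type_traits>
#include <vector>

// Hypothetical short-lived wrapper over data owned by the caller.
class Summer {
public:
    // Assumption: 'data' outlives this object; passing a temporary would dangle.
    explicit Summer(const std::vector<int>& data) : data_(data) {}
    Summer(const Summer&) = delete;             // not copyable
    Summer& operator=(const Summer&) = delete;  // not assignable
    int total() const { return std::accumulate(data_.begin(), data_.end(), 0); }
private:
    const std::vector<int>& data_;              // reference member, never rebound
};

static_assert(!std::is_copy_constructible<Summer>::value, "not copyable");
static_assert(!std::is_copy_assignable<Summer>::value, "not assignable");
```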
As everyone seems to be handing out general rules, I'll offer two:
Never, ever use references as class members. I have never done so in my own code (except to prove to myself that I was right in this rule) and cannot imagine a case where I would do so. The semantics are too confusing, and it's really not what references were designed for.
Always, always, use references when passing parameters to functions, except for the basic types, or when the algorithm requires a copy.
These rules are simple, and have stood me in good stead. I leave making rules on using smart pointers (but please, not auto_ptr) as class members to others.
Yes to: Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My rules of thumb for data members:
never use a reference, because it prevents assignment
if your class is responsible for deleting, use boost's scoped_ptr (which is safer than an auto_ptr)
otherwise, use a pointer or const pointer
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Yes. Readability of your code. A pointer makes it more obvious that the member is a reference (ironically :)), and not a contained object, because when you use it you have to dereference it. I know some people think that is old fashioned, but I still think that it simply prevents confusion and mistakes.
I advise against reference data members because you never know who is going to derive from your class and what they might want to do. They might not want to make use of the referenced object, but because it is a reference you have forced them to provide a valid object.
I've done this to myself enough to stop using reference data members.