Coming from Java, C++ is breaking my brain.
I need a class to hold a reference to a variable that's defined in the main scope because I need to modify that variable, but I won't be able to instantiate that class until some inner loop, and I also won't have the reference until then. This causes no end of challenges to my Java brain:
I'm used to declaring a variable to establish its scope, well in advance of knowing the actual value that will go in that variable. For example, creating a variable that will hold an object in my main scope like MyClass test; but C++ can't abide a vacuum and will use the default constructor to actually instantiate it right then and there.
Further, given that I want to pass a reference later on to that object (class), if I want the reference to be held as a member variable, it seems that the member variable must be initialized when it's declared. I can't just declare the member variable in my class definition and then use some MyClass::init(int &myreference){} later on to assign the reference when I'll have it in my program flow.
So this makes what I want to do seemingly impossible - pass a reference to a variable to be held as a member variable in the class at any other time than instantiation of that class. [UPDATE, in stack-overflow-rubber-ducking I realized that in this case I CAN actually know those variables ahead of time so can side-step all this mess. But the question I think is still pertinent as I'm sure I'll run into this pattern often]
Do I have no choice but to use pointers for this? Is there some obvious technique that my hours of Google-fu have been unable to unearth?
TLDR; - how to properly use references in class member variables when you can't define them at instantiation / constructor (ie: list initialization)?
Declare reference member variable that you won't have at instantiation
All references must be initialised. If you don't have anything to initialise it to, then you cannot have a reference.
The type that you seem to be looking for is a pointer. Like references, pointers are a form of indirection, but unlike references, pointers can be default initialised, they have a null state, and they can be made to point to an object after their initialisation.
Important note: Unlike Java references, C++ references and pointers do not generally extend the lifetime of the object that they refer to. It's very easy to unknowingly keep referring to an object outside of its lifetime, and attempting to access it through such an invalid reference will result in undefined behaviour. As such, if you do store a reference or a pointer to an object (that was provided as an argument) in a member, then you should make that absolutely clear to the caller who provides the object, so that they can ensure the correct lifetime. For example, you could name the class something like reference_wrapper (which incidentally is a class that exists in the standard library).
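As a rough sketch of the pointer approach (Counter, init, is_bound and run_example are made-up names for illustration, not anything from the question), the object can be declared early and bound to the variable later:

```cpp
#include <cassert>

// Hypothetical class: holds a non-owning pointer that is bound after construction.
class Counter {
public:
    Counter() = default;  // target_ starts out null

    // Bind to a variable. The caller must make sure it outlives this object.
    void init(int& target) { target_ = &target; }

    bool is_bound() const { return target_ != nullptr; }

    void increment() {
        assert(target_ != nullptr && "init() must be called first");
        ++*target_;
    }

private:
    int* target_ = nullptr;  // non-owning; never deleted here
};

// Mirrors the situation in the question: declare early, bind in an inner loop.
int run_example() {
    int total = 0;    // the "main scope" variable
    Counter counter;  // declared before the variable is handed over
    for (int i = 0; i < 3; ++i) {
        if (!counter.is_bound()) counter.init(total);
        counter.increment();
    }
    return total;
}
```

Taking the argument of init by reference but storing a pointer means the caller cannot pass null by accident, while the member can still start out unbound.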
In order to have semantics similar to Java references, you need to have shared ownership such that each owner extends the lifetime of the referred object. In C++, that can be achieved with a shared pointer (std::shared_ptr).
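A minimal sketch of shared ownership, assuming a hypothetical Holder class (the names are mine, chosen only for illustration):

```cpp
#include <memory>
#include <utility>

// Hypothetical holder that shares ownership of an int.
class Holder {
public:
    void init(std::shared_ptr<int> value) { value_ = std::move(value); }
    void add(int n) { if (value_) *value_ += n; }
    int get() const { return value_ ? *value_ : 0; }

private:
    std::shared_ptr<int> value_;  // keeps the int alive while Holder exists
};

int shared_example() {
    Holder holder;
    {
        auto value = std::make_shared<int>(40);
        holder.init(value);
    }  // the local shared_ptr is destroyed here, but holder still owns the int
    holder.add(2);
    return holder.get();
}
```

Note that this only works if the object was created as a shared_ptr in the first place; it cannot keep an ordinary local variable alive.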
Note, however, that it's generally best not to think in Java and then translate your Java thoughts into C++; it's better to learn to think in C++. Shared ownership is convenient, but it has a cost and you have to consider whether you can afford it. A Java programmer must "unlearn" Java before they can write good C++. You can also substitute most other programming languages for C++ and Java and the same will apply.
it seems that the member variable must be initialized when it's declared.
Member variables aren't directly initialised when they are declared. If you provide an initialiser in a member declaration, that is a default member initialiser which will be used if you don't provide an initialiser for that member in the member initialiser list of a constructor.
You can initialise a member reference to refer to an object provided as an argument in a (member initialiser list of a) constructor, but indeed not after the class instance has been initialised.
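To illustrate both points, here is a small sketch with hypothetical types (Config and Doubler are invented for this example). The first shows a default member initialiser being overridden by a constructor's initialiser list, the second a reference member bound in the initialiser list:

```cpp
// Default member initialiser versus member initialiser list.
struct Config {
    int retries = 3;                        // default member initialiser
    Config() = default;                     // uses retries = 3
    explicit Config(int r) : retries(r) {}  // the initialiser list takes precedence
};

// A reference member can only be bound in a constructor's initialiser list.
class Doubler {
public:
    explicit Doubler(int& target) : target_(target) {}
    void apply() { target_ *= 2; }  // modifies the caller's variable

private:
    int& target_;  // cannot be rebound after construction
};

int doubler_example() {
    int value = 21;
    Doubler d(value);
    d.apply();
    return value;
}
```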
Reference members are even more problematic, beyond the lifetime challenges that both references and pointers have. Since references can neither be rebound to other objects nor default initialised, such a member necessarily makes the class non-"regular", i.e. the class won't behave the way fundamental types do. This makes such classes less intuitive to use.
TL;DR:
Java idioms don't work in C++.
Java references are very different from C++ references.
If you think that you need a reference member, then take a step back and consider another idea. First thing to consider: Instead of referring to an object stored elsewhere, could the object be stored inside the class? Is the class needed in the first place?
Related
I've read some answers in SO and tutorials in other places that generally give the following idea:
Reference variables can be used as a second name or an alias for another variable
Is that indeed true or are there situations where it is either not possible or not convenient to do so? When is it viable and how to make it so?
The motivation for my question:
I followed this notion inside variables of a class and ended up with a problem whenever objects of those classes were copied (The references inside the copied objects would refer to the original variables). The solution I've seen so far involves specifying a custom copy constructor to change the initialization of those aliases from the default, which can be a lot of work since you can't extend the default constructor to change those specific variables and you are then required to write one for your entire class or for a nested one that wraps your aliases ( and thus also limits the names you can use).
Bottom-line, as far as I know, using reference member variables as aliases is either unsafe (it won't work as expected if your variable is copied) or not easy( you may have to write and maintain a lot of code).
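The copy problem described above can be reproduced with a small sketch (Point and its members are hypothetical names, not my real code):

```cpp
struct Point {
    int x = 0;
    int& first = x;  // intended as an alias for this object's own x
};

// Returns true if the copy's alias still refers to the original's x.
bool copy_alias_refers_to_original() {
    Point a;
    Point b = a;  // the implicit copy constructor binds b.first to a.x
    b.x = 5;
    a.x = 7;
    return &b.first == &a.x && b.first == 7;
}
```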
Having said that, my question can be split as follows:
Can member reference variables be used as aliases without all the trouble mentioned earlier?
Are there any other situations where they can be unsafe? ( besides in copy operations)
When you don't have a member reference variable, can you indeed safely use them as a second name or is there a situation where extra care should be taken?
References are safe when:
What they are referencing cannot go away before the reference goes out of scope.
That's it.
Examples of this are:
A. Parameters to a function which will not store the references anywhere.
B. Aliasing a deeply nested or computed element in a container to make code cleaner.
C. Inside a functor that has temporary lifetime (such as a custom printer object)
At almost any other time, you should be using either a copy or a shared pointer.
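Cases A and B might look like this (a sketch with made-up names; the container layout is just an assumption for the example):

```cpp
#include <map>
#include <string>
#include <vector>

// A. Reference parameter that is used during the call and not stored anywhere.
void append_sum(std::vector<int>& values) {
    int sum = 0;
    for (int v : values) sum += v;
    values.push_back(sum);
}

// B. Local alias for a nested element; the alias goes out of scope before the table.
int alias_example() {
    std::map<std::string, std::vector<int>> table;
    table["scores"] = {1, 2, 3};
    std::vector<int>& scores = table["scores"];
    append_sum(scores);
    return table["scores"].back();
}
```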
I went through the already existing thread on this topic and wasn't convinced with the explanation.
What I could pick up from there was:
When a non-static member function is declared const, the restriction is imposed on the this pointer. As static member functions do not involve the this pointer, they cannot be declared const.
Is that it? Doesn't sound too convincing to me. I mean, I'm not questioning that it's so. I just want to know the reason why.
A const non-static member function is allowed to modify local, static, and global variables; it just isn't allowed to modify members of its class through the this pointer (implicitly or explicitly). A const static member function, therefore, would be allowed to modify local, static, and global variables, just like a non-member function. This would make the const meaningless.
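A short sketch of that point, using a hypothetical Widget class and a global g_calls that exist only for this example:

```cpp
int g_calls = 0;  // hypothetical global, for illustration only

class Widget {
public:
    int value() const {
        ++g_calls;      // allowed: not a member reached through this
        // value_ = 1;  // would not compile: this points to a const Widget here
        return value_;
    }

    // No this pointer, so there is nothing for const to restrict.
    static int calls() { return g_calls; }
    // static int bad() const;  // ill-formed

private:
    int value_ = 7;
};

// Returns how many times the global was bumped through a const object.
int const_example() {
    int before = Widget::calls();
    const Widget w{};
    w.value();
    w.value();
    return Widget::calls() - before;
}
```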
If you want to write a function that isn't allowed to modify any non-local variables at all, you can declare it constexpr, although that also imposes additional restrictions.
The reason the const/non-const distinction for functions is important is that there are contexts in which it is not legal to call a non-const function. So the distinction can be used to enforce invariants.
For example, if you pass a const reference to a function and your class is properly designed, you are guaranteed that the function can't change the value of the thing the reference refers to. This allows you to avoid copies.
Also, a non-const reference can't bind to a temporary. This permits functions to signal whether they return values through references or just take a value. You will get an error at compile time if you inadvertently ignore a returned value because a temporary was created unexpectedly.
None of this would apply to static functions because there is no context in which you would be prohibited from calling them. So the entire rationale for the distinction does not exist with static functions.
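The reference rules above can be sketched as follows (Buffer, measure and reset are hypothetical names; the commented-out lines are the ones a conforming compiler rejects):

```cpp
#include <cstddef>
#include <string>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::string s) : data_(std::move(s)) {}
    std::size_t size() const { return data_.size(); }  // callable on const objects
    void clear() { data_.clear(); }                    // not callable on const objects

private:
    std::string data_;
};

// Const reference: no copy is made, and only const members can be called.
std::size_t measure(const Buffer& b) {
    // b.clear();  // would not compile
    return b.size();
}

// Non-const reference: signals that the argument will be modified.
void reset(Buffer& b) { b.clear(); }

std::size_t binding_example() {
    std::size_t n = measure(Buffer("temp"));  // fine: const& binds to a temporary
    // reset(Buffer("temp"));                 // would not compile: non-const& to a temporary
    Buffer named("hello");
    reset(named);
    return n + named.size();
}
```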
Say, I develop a complex application: Within object member functions, should I modify only those objects, that are passed to the member functions as parameters, or can I access and modify any other objects I have access to(say public or static objects)?
Technically, I know that it is possible to modify anything I have access to. I am asking about good practices.
Sometimes it is bothersome to pass everything I will access and modify as an argument, especially if I know that the member function will not be used by anybody else but me. Thanks.
Global state is never a good idea (though it is sometimes simpler, for example logging), because it introduces dependencies that are not documented in the interface and increase coupling between components. Therefore, modifying a global state (static variables for example) should be avoided at all costs. Note: global constants are perfectly okay
In C++, you have the const keyword, to document (and have the compiler enforce) what can be modified and what cannot.
A const method is a guarantee that the visible state of an object will be untouched; likewise, an argument passed by const reference, or by value, will not be touched in the caller either.
As long as it is documented, it is fine... and you should strive for having as few non-const methods in your class interface and as few non-const parameters in your methods.
If you have a class with member variables, then it is entirely acceptable to modify those member variables in a member method regardless of whether those member variables are private, protected, or public. This is what is meant by encapsulation.
In fact, modifying the variables passed into the member method is probably a bad idea; returning a new value is what you'd want, or getting a new value back from a separate member method.
What is the advantage of declaring a member variable as a reference?
I saw people doing that, and can't understand why.
One useful case is when you don't have access to the constructor of that object, yet don't want to work with indirection through a pointer. For example, if a class A does not have a public constructor and your class wants to accept an A instance in its constructor, you would want to store an A&. This also guarantees that the reference is initialized.
A member reference is useful when you need to have access to another object, without copying it.
Unlike a pointer, a reference cannot be changed (accidentally) so it always refers to the same object.
Generally speaking, types with unusual assignment semantics like std::auto_ptr<> and C++ references make it easier to shoot yourself in the foot (or to shoot off the whole leg).
When a reference is used as a member, the compiler cannot generate operator= for the class (the implicit copy assignment operator is defined as deleted), because references cannot be reassigned to refer to another object. A hand-written operator= that simply assigns the member does a very surprising thing: it assigns to the object referenced instead of rebinding the reference. In other words, having a reference member most of the time makes the class non-assignable.
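A quick way to check this, sketched with two hypothetical structs (WithRef and WithPtr) and the standard type traits:

```cpp
#include <type_traits>

struct WithRef {
    explicit WithRef(int& r) : ref(r) {}
    int& ref;
};

struct WithPtr {
    explicit WithPtr(int& r) : ptr(&r) {}
    int* ptr;
};

static_assert(std::is_copy_constructible<WithRef>::value,
              "copy construction still works with a reference member");
static_assert(!std::is_copy_assignable<WithRef>::value,
              "the implicit copy assignment operator is deleted");
static_assert(std::is_copy_assignable<WithPtr>::value,
              "a pointer member can simply be reassigned");

// Assigning through a reference writes to the referenced object; it never rebinds.
int assign_through_example() {
    int a = 1, b = 2;
    WithRef ra(a), rb(b);
    ra.ref = rb.ref;  // copies the value 2 into a; ra.ref still refers to a
    return a;
}
```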
One can avoid this surprising behaviour by using plain pointers.
This is a simplified example to illustrate the question:
class A {};
class B
{
B(A& a) : a(a) {}
A& a;
};
class C
{
C() : b(a) {}
A a;
B b;
};
So B is responsible for updating a part of C. I ran the code through lint and it whinged about the reference member: lint#1725.
This talks about taking care over default copy and assignments which is fair enough, but default copy and assignment is also bad with pointers, so there's little advantage there.
I always try to use references where I can, since naked pointers introduce uncertainty about who is responsible for deleting that pointer. I prefer to embed objects by value, but if I need a pointer, I use auto_ptr in the member data of the class that owns the pointer, and pass the object around as a reference.
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My own rule of thumb :
Use a reference member when you want the life of your object to be dependent on the life of other objects: it's an explicit way to say that you don't allow the object to be alive without a valid instance of another class, because there is no assignment and the reference must be initialized via the constructor. It's a good way to design your class without assuming anything about its instance being a member or not of another class. You only assume that their lives are directly linked to other instances. It allows you to change later how you use your class instance (with new, as a local instance, as a class member, generated by a memory pool in a manager, etc.)
Use a pointer in other cases: when you want the member to be changed later, use a pointer, or a pointer to const to be sure to only read the pointed-to instance. If that type is supposed to be copyable, you cannot use references anyway. Sometimes you also need to initialize the member after a special function call (init() for example) and then you simply have no choice but to use a pointer. BUT: use asserts in all your member functions to quickly detect wrong pointer state!
In cases where you want the object's lifetime to depend on an external object's lifetime, and you also need that type to be copyable, use pointer members but a reference argument in the constructor. That way you indicate on construction that the lifetime of this object depends on the argument's lifetime, BUT the implementation uses pointers so it is still copyable. As long as these members are only changed by copy, and your type doesn't have a default constructor, the type should fulfil both goals.
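That last rule could look roughly like this (Engine and Car are hypothetical types made up for the sketch):

```cpp
#include <type_traits>

struct Engine { int rpm = 0; };

class Car {
public:
    // Reference argument: the caller must supply a live Engine.
    explicit Car(Engine& engine) : engine_(&engine) {}

    void rev() { engine_->rpm += 1000; }

private:
    Engine* engine_;  // non-owning, never null; only changes via copy assignment
};

static_assert(std::is_copy_assignable<Car>::value, "Car stays assignable");
static_assert(!std::is_default_constructible<Car>::value,
              "a Car cannot exist without an Engine");

int car_example() {
    Engine e1, e2;
    Car a(e1), b(e2);
    a = b;    // a now refers to e2
    a.rev();  // modifies e2, not e1
    return e2.rpm - e1.rpm;
}
```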
Avoid reference members, because they restrict what the implementation of a class can do (including, as you mention, preventing the implementation of an assignment operator) and provide no benefits to what the class can provide.
Example problems:
you are forced to initialise the reference in each constructor's initialiser list: there's no way to factor out this initialisation into another function (until C++0x, anyway. Edit: C++ now has delegating constructors.)
the reference cannot be rebound or be null. This can be an advantage, but if the code ever needs changing to allow rebinding or for the member to be null, all uses of the member need to change
unlike pointer members, references can't easily be replaced by smart pointers or iterators as refactoring might require
Whenever a reference is used it looks like a value type (. operator etc.), but behaves like a pointer (it can dangle), so e.g. the Google Style Guide discourages it
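For the first point, delegating constructors (C++11) let one constructor do the reference initialisation for all the others. A minimal sketch with hypothetical Logger and Task types:

```cpp
// Hypothetical types for illustration.
struct Logger { int lines = 0; };

class Task {
public:
    // The only constructor that actually binds the reference.
    Task(Logger& log, int priority) : log_(log), priority_(priority) {}

    // Delegating constructor: reuses the initialisation above.
    explicit Task(Logger& log) : Task(log, 0) {}

    void run() { ++log_.lines; }
    int priority() const { return priority_; }

private:
    Logger& log_;
    int priority_;
};

int delegating_example() {
    Logger log;
    Task a(log);
    Task b(log, 5);
    a.run();
    b.run();
    return log.lines;  // both tasks wrote to the same Logger
}
```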
Objects should rarely allow assignment and other operations such as comparison. If you consider some business model with objects like 'Department', 'Employee', 'Director', it is hard to imagine a case where one employee would be assigned to another.
So for business objects it is very good to describe one-to-one and one-to-many relationships as references and not pointers.
And probably it is OK to describe one-or-zero relationship as a pointer.
So the 'we can't assign' factor does not apply then.
A lot of programmers are just used to pointers, and that's why they will find any argument to avoid the use of references.
Having a pointer as a member will force you or a member of your team to check the pointer again and again before use, with a "just in case" comment. If a pointer can be zero, then the pointer is probably being used as a kind of flag, which is bad, as every object has to play its own role.
Use references when you can, and pointers when you have to.
In a few important cases, assignability is simply not needed. These are often lightweight algorithm wrappers that facilitate calculation without leaving the scope. Such objects are prime candidates for reference members since you can be sure that they always hold a valid reference and never need to be copied.
In such cases, make sure to make the assignment operator (and often also the copy constructor) non-usable (by inheriting from boost::noncopyable or declaring them private).
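In C++11 and later, the same effect can be had by deleting the copy operations directly; this is my substitution for boost::noncopyable, shown here with a hypothetical Summer wrapper:

```cpp
#include <type_traits>
#include <vector>

// Hypothetical short-lived wrapper around a caller-owned vector.
class Summer {
public:
    // Caution: a const& parameter also accepts temporaries, which would dangle.
    explicit Summer(const std::vector<int>& data) : data_(data) {}

    Summer(const Summer&) = delete;             // not copyable
    Summer& operator=(const Summer&) = delete;  // not assignable

    long total() const {
        long t = 0;
        for (int v : data_) t += v;
        return t;
    }

private:
    const std::vector<int>& data_;  // valid only while the caller's vector lives
};

static_assert(!std::is_copy_constructible<Summer>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<Summer>::value, "assignment is disabled");

long summer_example() {
    std::vector<int> v{1, 2, 3, 4};
    return Summer(v).total();  // the wrapper never leaves this scope
}
```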
However, as user pts already commented, the same is not true for most other objects. Here, using reference members can be a huge problem and should generally be avoided.
As everyone seems to be handing out general rules, I'll offer two:
Never, ever use references as class members. I have never done so in my own code (except to prove to myself that I was right in this rule) and cannot imagine a case where I would do so. The semantics are too confusing, and it's really not what references were designed for.
Always, always, use references when passing parameters to functions, except for the basic types, or when the algorithm requires a copy.
These rules are simple, and have stood me in good stead. I leave making rules on using smart pointers (but please, not auto_ptr) as class members to others.
Yes to: Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My rules of thumb for data members:
never use a reference, because it prevents assignment
if your class is responsible for deleting, use boost's scoped_ptr (which is safer than an auto_ptr)
otherwise, use a pointer or const pointer
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Yes. Readability of your code. A pointer makes it more obvious that the member is a reference (ironically :)), and not a contained object, because when you use it you have to de-reference it. I know some people think that is old fashioned, but I still think that it simply prevents confusion and mistakes.
I advise against reference data members because you never know who is going to derive from your class and what they might want to do. They might not want to make use of the referenced object, but with a reference you have forced them to provide a valid object.
I've done this to myself enough to stop using reference data members.