I've read some answers in SO and tutorials in other places that generally give the following idea:
Reference variables can be used as a second name or an alias for another variable
Is that indeed true or are there situations where it is either not possible or not convenient to do so? When is it viable and how to make it so?
The motivation for my question:
I followed this notion for member variables of a class and ran into a problem whenever objects of that class were copied: the references inside the copied objects still referred to the original object's variables. The solution I've seen so far is to write a custom copy constructor that changes how those aliases are initialized. That can be a lot of work, since you can't extend the default copy constructor to handle just those members; you have to write one for the entire class, or for a nested class that wraps your aliases (which also limits the names you can use).
Bottom line, as far as I know, using reference member variables as aliases is either unsafe (it won't work as expected if the object is copied) or not easy (you may have to write and maintain a lot of code).
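To make the copy problem concrete, here is a minimal sketch. The names Aliased and Fixed are made up for illustration, and the custom copy constructor is only one possible workaround, not necessarily the best one:

```cpp
#include <cassert>

// Hypothetical class: `alias` is meant to be a second name for `value`.
struct Aliased {
    int value = 0;
    int& alias = value;  // default member initialiser binds to this object's value
};

// Same idea, but with hand-written copy operations.
struct Fixed {
    int value = 0;
    int& alias = value;

    Fixed() = default;
    // `alias` is not listed here, so it falls back to its default member
    // initialiser and binds to the new object's own `value`.
    Fixed(const Fixed& other) : value(other.value) {}
    // The implicit copy assignment is deleted because of the reference member,
    // so it has to be written by hand as well.
    Fixed& operator=(const Fixed& other) { value = other.value; return *this; }
};

// True if a copy of Aliased still refers to the original's value.
bool copy_aliases_original() {
    Aliased original;
    Aliased copy = original;  // implicit memberwise copy rebinds nothing
    return &copy.alias == &original.value;
}

// True if a copy of Fixed refers to its own value.
bool copy_aliases_itself() {
    Fixed original;
    Fixed copy = original;
    return &copy.alias == &copy.value;
}
```

Both functions return true: the implicit copy keeps pointing at the original, while the hand-written one rebinds to the copy.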
Having said that, my question can be split as follows:
Can member reference variables be used as aliases without all the trouble mentioned earlier?
Are there any other situations where they can be unsafe (besides copy operations)?
When you don't have a member reference variable, can you indeed safely use them as a second name or is there a situation where extra care should be taken?
References are safe when:
What they are referencing cannot go away before the reference goes out of scope.
That's it.
Examples of this are:
A. Parameters to a function which will not store the references anywhere.
B. Aliasing a deeply nested or computed element in a container to make code cleaner.
C. Inside a functor that has temporary lifetime (such as a custom printer object)
At almost any other time you should either be using a copy or a shared pointer.
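Cases A and B might look roughly like this. It is only a sketch; the function names and the container layout are invented for the example:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Case A: the reference is used only for the duration of the call and is
// not stored anywhere.
void add_bonus(int& score) { score += 10; }

// Case B: alias a deeply nested element to keep the code readable. The
// container is owned by the caller and outlives the alias.
int bump_first_score(std::map<std::string, std::vector<int>>& scores_by_player,
                     const std::string& player) {
    int& first = scores_by_player.at(player).at(0);  // throws if missing
    add_bonus(first);
    return first;
}
```

In both cases the referenced object clearly outlives the reference, which is the only condition that matters.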
Coming from Java, C++ is breaking my brain.
I need a class to hold a reference to a variable that's defined in the main scope because I need to modify that variable, but I won't be able to instantiate that class until some inner loop, and I also won't have the reference until then. This causes no end of challenges to my Java brain:
I'm used to declaring a variable to establish its scope, well in advance of knowing the actual value that will go in that variable. For example, creating a variable that will hold an object in my main scope like MyClass test; but C++ can't abide a vacuum and will use the default constructor to actually instantiate it right then and there.
Further, given that I want to pass a reference later on to that object (class), if I want the reference to be held as a member variable, it seems that the member variable must be initialized when it's declared. I can't just declare the member variable in my class definition and then use some MyClass::init(int &myreference){} later on to assign the reference when I'll have it in my program flow.
So this makes what I want to do seemingly impossible - pass a reference to a variable to be held as a member variable in the class at any other time than instantiation of that class. [UPDATE, in stack-overflow-rubber-ducking I realized that in this case I CAN actually know those variables ahead of time so can side-step all this mess. But the question I think is still pertinent as I'm sure I'll run into this pattern often]
Do I have no choice but to use pointers for this? Is there some obvious technique that my hours of Google-fu have been unable to unearth?
TLDR; - how to properly use references in class member variables when you can't define them at instantiation / constructor (ie: list initialization)?
Declare reference member variable that you won't have at instantiation
All references must be initialised. If you don't have anything to initialise it to, then you cannot have a reference.
The type that you seem to be looking for is a pointer. Like references, pointers are a form of indirection, but unlike references, pointers can be default initialised, they have a null state, and they can be made to point to an object after their initialisation.
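A minimal sketch of that idea, assuming a hypothetical Counter class with an init() function like the one in the question:

```cpp
#include <cassert>

// Hypothetical class: holds a non-owning pointer that is bound after
// construction.
class Counter {
public:
    // Taking a reference here means the caller cannot pass null.
    void init(int& target) { target_ = &target; }

    bool ready() const { return target_ != nullptr; }

    void increment() {
        assert(target_ != nullptr && "init() must be called first");
        ++*target_;
    }

private:
    int* target_ = nullptr;  // null until init() is called
};
```

The object can be declared early in the outer scope and bound in the inner loop once the variable is available. The caller is still responsible for keeping the target alive for as long as the Counter uses it.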
Important note: Unlike Java references, C++ references and pointers do not generally extend the lifetime of the object that they refer to. It's very easy to unknowingly keep referring to an object outside of its lifetime, and attempting to access through such invalid reference will result in undefined behaviour. As such, if you do store a reference or a pointer to an object (that was provided as an argument) in a member, then you should make that absolutely clear to the caller who provides the object, so that they can ensure the correct lifetime. For example, you could name the class as something like reference_wrapper (which incidentally is a class that exists in the standard library).
In order to have semantics similar to Java references, you need to have shared ownership such that each owner extends the lifetime of the referred object. In C++, that can be achieved with a shared pointer (std::shared_ptr).
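As a rough illustration of shared ownership (Holder and make_holder are made-up names), the object below stays alive after the local owner goes out of scope because the returned Holder still owns it:

```cpp
#include <cassert>
#include <memory>

// Hypothetical class that shares ownership of an int.
struct Holder {
    std::shared_ptr<int> value;
};

Holder make_holder() {
    auto local = std::make_shared<int>(42);
    Holder h{local};                 // second owner
    assert(local.use_count() == 2);  // `local` and `h.value`
    return h;                        // `local` is destroyed, the int survives
}
```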
Note, however, that it's generally best not to think in Java and translate your Java thoughts into C++; it's better to learn to think in C++. Shared ownership is convenient, but it has a cost, and you have to consider whether you can afford it. A Java programmer must "unlearn" Java before they can write good C++. You can also substitute most other programming languages for C++ and Java, and the same will apply.
it seems that the member variable must be initialized when it's declared.
Member variables aren't directly initialised when they are declared. If you provide an initialiser in a member declaration, that is a default member initialiser which will be used if you don't provide an initialiser for that member in the member initialiser list of a constructor.
You can initialise a member reference to refer to an object provided as an argument in a (member initialiser list of a) constructor, but indeed not after the class instance has been initialised.
Reference member variables are even more problematic, beyond the lifetime challenges that both references and pointers have. Since references can neither be made to refer to other objects nor be default initialised, such a member necessarily makes the class non-"regular", i.e. the class won't behave the way fundamental types do. This makes such classes less intuitive to use.
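The points above can be checked with type traits. This is a sketch with hypothetical types WithRef and WithPtr; the reference is bound in the constructor's member initialiser list, which is the only place it can be bound:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class with a reference member.
struct WithRef {
    explicit WithRef(int& r) : ref(r) {}  // bound here, never rebound
    int& ref;
};

// Hypothetical class with a pointer member instead.
struct WithPtr {
    int* ptr = nullptr;
};

// The reference member takes away default construction and assignment.
static_assert(!std::is_default_constructible<WithRef>::value, "no default ctor");
static_assert(!std::is_copy_assignable<WithRef>::value, "no copy assignment");
static_assert(std::is_copy_constructible<WithRef>::value, "copying still works");

// The pointer member keeps the class regular.
static_assert(std::is_default_constructible<WithPtr>::value, "default ctor");
static_assert(std::is_copy_assignable<WithPtr>::value, "copy assignment");

// Writes through the original variable are visible through the member.
int read_through(int& x) {
    WithRef w(x);
    x = 7;
    return w.ref;
}
```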
TL;DR:
Java idioms don't work in C++.
Java references are very different from C++ references.
If you think that you need a reference member, then take a step back and consider another idea. First thing to consider: Instead of referring to an object stored elsewhere, could the object be stored inside the class? Is the class needed in the first place?
I'd like to work out conventions on passing parameters to functions/methods. I know it's a common issue and it has been answered many times, but I searched a lot and found nothing that fully satisfies me.
Passing by value is obvious and I won't mention this. What I came up with is:
Passing by non-const reference means that the object is MODIFIED
Passing by const reference means that the object is USED
Passing by pointer means that a reference to the object is going to be STORED. Whether ownership is passed or not will depend on the context.
It seems to be consistent, but when I want to take a heap-allocated object and pass it to a parameter of the second kind, it'd look like this:
void use(const Object &object) { ... }
//...
Object *obj = getOrCreateObject();
use(*obj);
or
Object &obj = *getOrCreateObject();
use(obj);
Both look weird to me. What would you advise?
PS I know that one should avoid raw pointers and use smart pointers instead (easier memory management and expressiveness in ownership) and it can be the next step in refactoring the project I work on.
You can use these conventions if you like. But keep in mind that you cannot assume conventions when dealing with code written by other people. You also cannot assume that people reading your code are aware of your conventions. You should document an interface with comments when it might be ambiguous.
Passing by pointer means, that object is going to be STORED. Who's its owner will depend on the context.
I can think of only one context where the ownership of a pointer argument should transfer to the callee: Constructor of a smart pointer.
Besides possible intention of storing, a pointer argument can alternatively have the same meaning as a reference argument, with the addition that the argument is optional. You typically cannot represent an optional argument with a reference since they cannot be null - although with custom types you could use a reference to a sentinel value.
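For instance, an optional argument expressed as a pointer could look like this (make_label and its parameters are invented for the example):

```cpp
#include <cassert>
#include <string>

// Hypothetical function: `suffix` is optional, and nullptr means "no suffix".
// Nothing is stored; the pointer is only read during the call.
std::string make_label(const std::string& name,
                       const std::string* suffix = nullptr) {
    if (suffix == nullptr) {
        return name;
    }
    return name + "-" + *suffix;
}
```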
Both look weird to me. What would you advise?
Neither looks weird to me, so my advice is to get accustomed to them.
The main problem with your conventions is that you make no allowance for the possibility of interfacing to code (e.g. written by someone else) that doesn't follow your conventions.
Generally speaking, I use a different set of conventions, and rarely find a need to work around them. (The main exception will be if there is a need to use a pointer to a pointer, but I rarely need to do that directly).
Passing by non-const reference is appropriate if ANY of the following MAY be true;
The object may be changed;
The object may be passed to another function by a non-const reference [relevant when using third party code by developers who choose to omit the const - which is actually something a lot of beginners or lazy developers do];
The object may be passed to another function by a non-const pointer [relevant when using third party code by developers who choose to omit the const, or when using legacy APIs];
Non-const member functions of the object are called (regardless of whether they change the object or not) [also often a consideration when using third-party code by developers who prefer to avoid using const].
Conversely, const references may be passed if ALL of the following are true;
No non-mutable members of the object are changed;
The object is only passed to other functions by const reference, by const pointer, or by value;
Only const member functions of the object are called (even if those members are able to change mutable members).
I'll pass by value instead of by const reference in cases where the function would copy the object anyway. (e.g. I won't pass by const reference, and then construct a copy of the passed object within the function).
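A small sketch of the pass-by-value case, with a hypothetical Person class. The std::move is my addition and is not something the text above prescribes; it just avoids a second copy of the parameter:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that keeps its own copy of the name.
class Person {
public:
    // The function needs a copy anyway, so it takes the argument by value
    // and moves it into the member instead of copying it again.
    explicit Person(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```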
Passing non-const pointers is relevant if it is appropriate to pass a non-const reference but there is also a possibility of passing no object (e.g. a nullptr).
Passing const pointers is relevant if it is appropriate to pass a const reference but there is also a possibility of passing no object (e.g. a nullptr).
I would not change the convention for either of the following
Storing a reference or pointer to the object within the function for later use - it is possible to convert a pointer to a reference or vice versa. And either one can be stored (a pointer can be assigned, a reference can be used to construct an object);
Distinguishing between dynamically allocated and other objects - since I mostly either avoid using dynamic memory allocation at all (e.g. use standard containers, and pass them around by reference or simply pass iterators from them around) or - if I must use a new expression directly - store the pointer in another object that becomes responsible for deallocation (e.g. a smart pointer such as std::unique_ptr) and then pass the containing object around.
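Applied to the question's use(*obj) situation, this could be sketched as follows. Object here is a stand-in type, and std::make_unique (C++14) replaces a direct new expression:

```cpp
#include <cassert>
#include <memory>

// Stand-in for the question's Object type.
struct Object {
    int value = 0;
};

int read(const Object& object) { return object.value; }  // object is USED
void reset(Object& object) { object.value = 0; }         // object is MODIFIED

int use_heap_object() {
    // The smart pointer owns the allocation and frees it automatically.
    auto owner = std::make_unique<Object>();
    owner->value = 5;

    // The callees do not care where the object lives, so dereferencing at
    // the call site is all that is needed.
    int before = read(*owner);
    reset(*owner);
    return before - owner->value;
}
```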
In my opinion, they are the same. In the first part of your post, you are talking about the signature, but your example is about the function call.
Say, I develop a complex application: Within object member functions, should I modify only those objects, that are passed to the member functions as parameters, or can I access and modify any other objects I have access to(say public or static objects)?
Technically, I know that it is possible to modify anything I have access to. I am asking about good practices.
Sometimes it is a bother to pass everything I will access and modify as an argument, especially if I know that the member function will not be used by anybody but me. Thanks.
Global state is never a good idea (though it is sometimes simpler, for example for logging), because it introduces dependencies that are not documented in the interface and increases coupling between components. Therefore, modifying global state (static variables, for example) should be avoided at all costs. Note: global constants are perfectly okay.
In C++, you have the const keyword, to document (and have the compiler enforce) what can be modified and what cannot.
A const method is a guarantee that the visible state of an object will be untouched. Likewise, an argument passed by const reference, or by value, will not be touched either.
As long as it is documented, it is fine... and you should strive for having as few non-const methods in your class interface and as few non-const parameters in your methods.
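A short sketch of how const documents what may change, using a hypothetical Account class:

```cpp
#include <cassert>
#include <vector>

// Hypothetical class with one const and one non-const member function.
class Account {
public:
    // const: callers can rely on this not changing the visible state.
    int balance() const { return balance_; }

    // non-const: the signature itself says the object is modified.
    void deposit(int amount) { balance_ += amount; }

private:
    int balance_ = 0;
};

// Const reference parameter: only const member functions can be called on
// the elements, and the compiler enforces that.
int total_balance(const std::vector<Account>& accounts) {
    int total = 0;
    for (const Account& account : accounts) {
        total += account.balance();
    }
    return total;
}
```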
If you have a class with member variables, then it is entirely acceptable to modify those member variables in a member method regardless of whether those member variables are private, protected, or public. This is what is meant by encapsulation.
In fact, modifying the variables passed into the member method is probably a bad idea; returning a new value is what you'd want, or getting a new value back from a separate member method.
Let's say I have an object Employee_Storage that contains a database connection data member. Should this data member be stored as a pointer or as a reference?
If I store it as a reference, I don't have to do any NULL checking. (Just how important is NULL checking anyway?)
If I store it as a pointer, it's easier to setup Employee_Storage (or MockEmployee_Storage) for the purposes of testing.
Generally, I've been in the habit of always storing my data members as references. However, this makes my mock objects difficult to set up, because instead of being able to pass in NULLs (presumably inside a default constructor) I now must pass in true/mock objects.
Is there a good rule of thumb to follow, specifically with an eye towards testability?
It's only preferable to store references as data members if they're being assigned at construction, and there is truly no reason to ever change them. Since references cannot be reassigned, they are very limited.
In general, I typically store as pointers (or some form of templated smart pointer). This is much more flexible - both for testing (as you mentioned) but also just in terms of normal usage.
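Here is one way the pointer approach might look for the Employee_Storage case. The Connection interface, its query() function, the SQL text, and MockConnection are all assumptions made up for this sketch, not a real database API:

```cpp
#include <cassert>
#include <string>

// Hypothetical interface standing in for the database connection.
struct Connection {
    virtual ~Connection() = default;
    virtual std::string query(const std::string& sql) = 0;
};

// Hypothetical test double that returns a canned row.
struct MockConnection : Connection {
    std::string query(const std::string&) override { return "mock-row"; }
};

class Employee_Storage {
public:
    Employee_Storage() = default;  // tests may start without a connection
    explicit Employee_Storage(Connection& connection) : conn_(&connection) {}

    void set_connection(Connection& connection) { conn_ = &connection; }
    bool connected() const { return conn_ != nullptr; }

    std::string first_employee() {
        // The null check is the price of the extra flexibility.
        if (conn_ == nullptr) {
            return std::string();
        }
        return conn_->query("SELECT name FROM employees LIMIT 1");
    }

private:
    Connection* conn_ = nullptr;  // non-owning; the caller keeps it alive
};
```

The setter still takes a reference, so a caller cannot accidentally pass null once a connection is supplied.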
It is almost never preferable to store references as data members, and a lot of the time it is impossible. If the objects must be assignable (as they must be to be stored in a standard library container), references cannot be used. Also, references cannot be reseated, so once a reference is initialised with an object, it cannot be made to refer to a different object.
See this question Should I prefer pointers or references in member data? for a more detailed discussion of the issue.
I was trying to figure this out myself, so I might as well post it. I conclude that it doesn't seem to be a good idea to use a reference data member, because you could inadvertently create an alias when you initialize it.
#include <iostream>

class stuff
{
public:
    // A reference member must be bound in the member initialiser list.
    explicit stuff(int& a) : x(a)
    {
        // Initialising x in the constructor body won't work.
    }
    int& x; // reference data member
};

int main()
{
    int A = 100;
    stuff B(A);                // initialise B.x
    std::cout << B.x << '\n';  // outputs 100
    A = 50;                    // change A
    std::cout << B.x << '\n';  // outputs 50, so B.x is an alias of A
    return 0;
}
Given a choice, I like to use the most constrained type possible.
So if I don't need to support null objects I'd much prefer to declare a
Foo& m_foo;
member rather than a
Foo*const m_foo;
member, because the former declaration documents the fact that m_foo can't be null.
In the short term, the advantage isn't that great. But in the long run, when you come back to old code, the instant assurance that you don't have to worry about the case of m_foo being null is quite valuable.
There are other ways of achieving a similar effect. One project I worked on, where they didn't understand references, insisted that any potentially null pointer be suffixed with '00', e.g. m_foo00. Interestingly, boost::optional seems to support references, although I haven't tried it. Or you can litter your code with assertions.
Adding to this question..
Class with reference data member:
you must pass a value to the object at construction (not unexpectedly)
breaks the rule of encapsulation, as the referenced variable can be changed from outside the class, without the class object having any control over it. (I suppose the only use case could be something like this though, for some very specialized reasons.)
prevents creating assignment operator. What are you going to copy?
you need to ensure the referred variable is not destroyed while your object is alive
This is a simplified example to illustrate the question:
class A {};
class B
{
public:
    B(A& a) : a(a) {}
private:
    A& a;
};
class C
{
public:
    C() : b(a) {}
private:
    A a; // declared before b, so it is constructed first
    B b;
};
So B is responsible for updating a part of C. I ran the code through lint and it whinged about the reference member: lint#1725.
This talks about taking care over default copy and assignments which is fair enough, but default copy and assignment is also bad with pointers, so there's little advantage there.
I always try to use references where I can, since naked pointers introduce uncertainty about who is responsible for deleting that pointer. I prefer to embed objects by value, but if I need a pointer, I use auto_ptr in the member data of the class that owns the pointer, and pass the object around as a reference.
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My own rule of thumb :
Use a reference member when you want the life of your object to be dependent on the life of other objects: it's an explicit way to say that you don't allow the object to be alive without a valid instance of another class, because there is no assignment and the reference has to be initialized via the constructor. It's a good way to design your class without assuming anything about its instance being a member or not of another class. You only assume that their lives are directly linked to other instances. It allows you to change later how you use your class instance (with new, as a local instance, as a class member, generated by a memory pool in a manager, etc.)
Use a pointer in other cases: when you want the member to be changed later, use a pointer, or a pointer to const to be sure to only read the pointed-to instance. If that type is supposed to be copyable, you cannot use references anyway. Sometimes you also need to initialize the member after a special function call (init() for example) and then you simply have no choice but to use a pointer. BUT: use asserts in all your member functions to quickly detect a wrong pointer state!
In cases where you want the object lifetime to be dependent on an external object's lifetime, and you also need that type to be copyable, use pointer members but a reference argument in the constructor. That way you are indicating on construction that the lifetime of this object depends on the argument's lifetime, BUT the implementation uses pointers so the type is still copyable. As long as these members are only changed by copy, and your type doesn't have a default constructor, the type should fulfil both goals.
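The last rule could be sketched like this, with hypothetical Engine and Car types:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical external object whose lifetime the Car depends on.
struct Engine {
    int rpm = 0;
};

class Car {
public:
    // Reference argument: the caller must supply a live Engine, and the
    // signature signals that the Car depends on its lifetime.
    explicit Car(Engine& engine) : engine_(&engine) {}

    Engine& engine() const { return *engine_; }

private:
    Engine* engine_;  // pointer member: copy assignment stays available
};

static_assert(std::is_copy_assignable<Car>::value, "Car stays assignable");
static_assert(!std::is_default_constructible<Car>::value, "no unbound Car");
```

The pointer is only ever set in the constructor or by copying another Car, so it is never null in practice.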
Avoid reference members, because they restrict what the implementation of a class can do (including, as you mention, preventing the implementation of an assignment operator) and provide no benefits to what the class can provide.
Example problems:
you are forced to initialise the reference in each constructor's initialiser list: there's no way to factor out this initialisation into another function (before C++11, anyway; since C++11 there are delegating constructors)
the reference cannot be rebound or be null. This can be an advantage, but if the code ever needs changing to allow rebinding or for the member to be null, all uses of the member need to change
unlike pointer members, references can't easily be replaced by smart pointers or iterators as refactoring might require
Whenever a reference is used it looks like value type (. operator etc), but behaves like a pointer (can dangle) - so e.g. Google Style Guide discourages it
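On the first point, delegating constructors (C++11) let one constructor do the binding and the others forward to it. A minimal sketch with a made-up Logger class:

```cpp
#include <cassert>

// Hypothetical class with a reference member and two constructors.
class Logger {
public:
    // The only constructor that actually binds the reference.
    Logger(int& sink, int level) : sink_(sink), level_(level) {}

    // Delegates to the constructor above instead of repeating the binding.
    explicit Logger(int& sink) : Logger(sink, 1) {}

    void log() { sink_ += level_; }

private:
    int& sink_;
    int level_;
};
```

This removes the duplication, but the other limitations in the list still apply.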
Objects should rarely allow assignment or other operations such as comparison. If you consider some business model with objects like 'Department', 'Employee', 'Director', it is hard to imagine a case where one employee would be assigned to another.
So for business objects it is very good to describe one-to-one and one-to-many relationships as references and not pointers.
And probably it is OK to describe one-or-zero relationship as a pointer.
So the 'we can't assign' argument is not a factor.
A lot of programmers are simply used to pointers, and that's why they will find any argument to avoid using references.
Having a pointer as a member will force you or a member of your team to check the pointer again and again before use, with a "just in case" comment. If a pointer can be zero, then the pointer is probably being used as a kind of flag, which is bad, as every object has to play its own role.
Use references when you can, and pointers when you have to.
In a few important cases, assignability is simply not needed. These are often lightweight algorithm wrappers that facilitate calculation without leaving the scope. Such objects are prime candidates for reference members since you can be sure that they always hold a valid reference and never need to be copied.
In such cases, make sure to make the assignment operator (and often also the copy constructor) non-usable (by inheriting from boost::noncopyable or declaring them private).
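A sketch of such a wrapper follows. It uses C++11 deleted functions, which I am substituting here for boost::noncopyable or private declarations; Summer is a hypothetical name:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical short-lived algorithm wrapper that holds a reference.
class Summer {
public:
    // Pass a named vector that outlives the Summer; binding to a temporary
    // would leave the member dangling.
    explicit Summer(const std::vector<int>& data) : data_(data) {}

    // Copying and assignment are switched off explicitly.
    Summer(const Summer&) = delete;
    Summer& operator=(const Summer&) = delete;

    int sum() const {
        int total = 0;
        for (int x : data_) {
            total += x;
        }
        return total;
    }

private:
    const std::vector<int>& data_;
};

static_assert(!std::is_copy_constructible<Summer>::value, "not copyable");
static_assert(!std::is_copy_assignable<Summer>::value, "not assignable");
```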
However, as user pts already commented, the same is not true for most other objects. Here, using reference members can be a huge problem and should generally be avoided.
As everyone seems to be handing out general rules, I'll offer two:
Never, ever use references as class members. I have never done so in my own code (except to prove to myself that I was right in this rule) and cannot imagine a case where I would do so. The semantics are too confusing, and it's really not what references were designed for.
Always, always, use references when passing parameters to functions, except for the basic types, or when the algorithm requires a copy.
These rules are simple, and have stood me in good stead. I leave making rules on using smart pointers (but please, not auto_ptr) as class members to others.
Yes to: Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My rules of thumb for data members:
never use a reference, because it prevents assignment
if your class is responsible for deleting, use boost's scoped_ptr (which is safer than an auto_ptr)
otherwise, use a pointer or const pointer
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Yes. Readability of your code. A pointer makes it more obvious that the member is a reference (ironically :)), and not a contained object, because when you use it you have to de-reference it. I know some people think that is old fashioned, but I still think that it simply prevents confusion and mistakes.
I advise against reference data members because you never know who is going to derive from your class and what they might want to do. They might not want to make use of the referenced object, but by making it a reference you have forced them to provide a valid object.
I've done this to myself enough to stop using reference data members.