Should I prefer pointers or references in member data? - c++

This is a simplified example to illustrate the question:
class A {};
class B
{
public:
    B(A& a) : a(a) {}
private:
    A& a;
};
class C
{
public:
    C() : b(a) {}
private:
    A a;
    B b;
};
So B is responsible for updating a part of C. I ran the code through lint and it whinged about the reference member: lint#1725.
This talks about taking care over default copy and assignments which is fair enough, but default copy and assignment is also bad with pointers, so there's little advantage there.
I always try to use references where I can, since naked pointers introduce uncertainty about who is responsible for deleting that pointer. I prefer to embed objects by value, but if I need a pointer, I use auto_ptr in the member data of the class that owns the pointer, and pass the object around as a reference.
I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?

My own rule of thumb :
Use a reference member when you want the life of your object to depend on the life of other objects: it's an explicit way to say that you don't allow the object to be alive without a valid instance of another class, because there is no assignment and the reference must be initialized via the constructor. It's a good way to design your class without assuming anything about its instance being a member or not of another class. You only assume that their lives are directly linked to other instances. It allows you to change later how you use your class instance (with new, as a local instance, as a class member, generated by a memory pool in a manager, etc.)
Use a pointer in other cases: when you want the member to be changed later, use a pointer, or a pointer to const if you only need to read the pointed-to instance. If that type is supposed to be copyable, you cannot use references anyway. Sometimes you also need to initialize the member after a special function call (init() for example), and then you simply have no choice but to use a pointer. BUT: use asserts in all your member functions to quickly detect a wrong pointer state!
In cases where you want the object lifetime to be dependent on an external object's lifetime, and you also need that type to be copyable, then use pointer members but a reference argument in the constructor. That way you are indicating on construction that the lifetime of this object depends on the argument's lifetime, BUT the implementation uses pointers to still be copyable. As long as these members are only changed by copy, and your type doesn't have a default constructor, the type should fulfil both goals.
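The last rule above might be sketched roughly as follows. Engine and Car are hypothetical names invented for the illustration, and this is only one possible shape for the idea, not the answerer's own code.

```cpp
#include <cassert>

struct Engine {              // hypothetical dependency
    int rpm = 0;
};

class Car {                  // hypothetical dependent type
public:
    // A reference parameter signals that a live Engine is required.
    explicit Car(Engine& engine) : engine_(&engine) {}

    // No default constructor is declared, so engine_ is never left unset.
    // The implicit copy operations copy the pointer, so Car stays
    // copyable and assignable.
    Engine& engine() const { return *engine_; }

private:
    Engine* engine_;         // non-owning; only ever changed by copy assignment
};

// Usage sketch: assignment rebinds b to a's engine.
inline bool rebinding_works() {
    Engine e1, e2;
    Car a(e1), b(e2);
    b = a;
    assert(&a.engine() == &e1);
    return &b.engine() == &e1;
}
```

The caller still has to guarantee that the Engine outlives every Car that points at it; the pointer only removes the assignment restriction.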

Avoid reference members, because they restrict what the implementation of a class can do (including, as you mention, preventing the implementation of an assignment operator) and provide no benefits to what the class can provide.
Example problems:
you are forced to initialise the reference in each constructor's initialiser list: there's no way to factor out this initialisation into another function (before C++11, anyway; since C++11, delegating constructors allow this)
the reference cannot be rebound or be null. This can be an advantage, but if the code ever needs changing to allow rebinding or for the member to be null, all uses of the member need to change
unlike pointer members, references can't easily be replaced by smart pointers or iterators as refactoring might require
Whenever a reference is used it looks like a value type (. operator etc.), but behaves like a pointer (it can dangle), so e.g. the Google Style Guide discourages it

Objects should rarely allow assignment and other operations such as comparison. If you consider a business model with objects like 'Department', 'Employee' and 'Director', it is hard to imagine a case where one employee would be assigned to another.
So for business objects it is very good to describe one-to-one and one-to-many relationships as references and not pointers.
And it is probably OK to describe a one-or-zero relationship as a pointer.
So the 'we can't assign' factor does not apply.
A lot of programmers are simply used to pointers, and that's why they will find any argument to avoid using references.
Having a pointer as a member will force you or a member of your team to check the pointer again and again before use, with a "just in case" comment. If a pointer can be zero, then the pointer is probably being used as a kind of flag, which is bad, as every object has to play its own role.

Use references when you can, and pointers when you have to.

In a few important cases, assignability is simply not needed. These are often lightweight algorithm wrappers that facilitate calculation without leaving the scope. Such objects are prime candidates for reference members since you can be sure that they always hold a valid reference and never need to be copied.
In such cases, make sure to make the assignment operator (and often also the copy constructor) non-usable (by inheriting from boost::noncopyable or declaring them private).
However, as user pts already commented, the same is not true for most other objects. Here, using reference members can be a huge problem and should generally be avoided.
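A lightweight, non-copyable wrapper of the kind described above might look like the sketch below. It assumes C++11 and uses `= delete` in place of boost::noncopyable or private declarations; Summer and sum_of are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical short-lived helper that sums a vector it does not own.
class Summer {
public:
    explicit Summer(const std::vector<int>& values) : values_(values) {}

    // Copying and assignment are disabled explicitly (C++11 syntax).
    Summer(const Summer&) = delete;
    Summer& operator=(const Summer&) = delete;

    int total() const {
        int sum = 0;
        for (int v : values_) sum += v;
        return sum;
    }

private:
    const std::vector<int>& values_;   // must outlive the Summer
};

static_assert(!std::is_copy_constructible<Summer>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<Summer>::value, "assignment is disabled");

// The wrapper never leaves the scope of the data it refers to.
inline int sum_of(const std::vector<int>& values) {
    Summer s(values);
    return s.total();
}
```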

As everyone seems to be handing out general rules, I'll offer two:
Never, ever use references as class members. I have never done so in my own code (except to prove to myself that I was right in this rule) and cannot imagine a case where I would do so. The semantics are too confusing, and it's really not what references were designed for.
Always, always, use references when passing parameters to functions, except for the basic types, or when the algorithm requires a copy.
These rules are simple, and have stood me in good stead. I leave making rules on using smart pointers (but please, not auto_ptr) as class members to others.

Yes to: Is it true to say that an object containing a reference should not be assignable, since a reference should not be changed once initialised?
My rules of thumb for data members:
never use a reference, because it prevents assignment
if your class is responsible for deleting, use boost's scoped_ptr (which is safer than an auto_ptr)
otherwise, use a pointer or const pointer

I would generally only use a pointer in member data when the pointer could be null or could change. Are there any other reasons to prefer pointers over references for data members?
Yes. Readability of your code. A pointer makes it more obvious that the member is a reference (ironically :)), and not a contained object, because when you use it you have to de-reference it. I know some people think that is old fashioned, but I still think that it simply prevents confusion and mistakes.

I advise against reference data members because you never know who is going to derive from your class and what they might want to do. They might not want to make use of the referenced object, but because it is a reference you have forced them to provide a valid object.
I've done this to myself enough to stop using reference data members.

Related

Are there any downsides to using `std::reference_wrapper<T>` as an always-valid member variable instead of a pointer?

I am having a quite common problem. I have a class that must store a non-owning pointer to a different class object.
I know that:
The lifetime of the reference object is guaranteed to outlive the instance.
The referenced object is passed in a constructor and does not change with the exception of moves or assignment.
It is never invalid.
It is used in many methods.
It can be shared by many instances.
Think of e.g. a logger class which is not global.
These points lead me to this solution using a reference variable which guarantees validity:
struct Foo{};
struct Bar{
Bar(Foo& foo):m_foo(foo){}
Foo& m_foo;
};
The big downside is Bar is unnecessarily almost immutable - no assignment, no move.
The usual thing I did was to store Foo as a pointer instead. This solves most of the issues except that it is no longer very clear that the pointer is always valid. Furthermore it adds a small new issue that it can be invalidated in any method, which should not happen. (Making it const has the same downside as &). That makes me add assert(m_foo) to every method for peace of mind.
So, I was thinking about just storing std::reference_wrapper<Foo>. It is always valid and it keeps Bar mutable. Are there any downsides compared to a simple pointer?
I know that any method can still point it to e.g. a local variable, but let's say that does not happen because it is perhaps hard to obtain a new valid instance of Foo. At least it is harder than a simple =nullptr;
I know this approach is used for containers like std::vector so I assume it is okay, but I would like to know if there is any catch I should look for.
Since Foo is a struct you need to invoke get() every time you access any member or field of Foo. With reference or pointer you could use '.' or '->' respectively for member access. So reference_wrapper is not "transparent" in that regard. (There is currently also no way of making it "transparent" in C++, which would be nice of course).
There will be no runtime overhead but code will be congested with get() calls.
If that is not of a concern for you then there are no downsides of using reference_wrapper instead of a pointer. (In fact reference_wrapper is implemented by using a pointer member)
EDIT: also if you only need to call one or two member functions of Foo one could inherit from reference_wrapper and add a call stub. But that perhaps may be overkill ...
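To make the trade-off concrete, here is a minimal sketch of Bar holding a std::reference_wrapper<Foo>. The value field and the read() and read_after_assignment() helpers are assumptions added for the illustration; they are not part of the question.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

struct Foo {
    int value = 0;                       // hypothetical field for the demo
};

struct Bar {
    explicit Bar(Foo& foo) : m_foo(foo) {}

    // Member access has to go through get().
    int read() const { return m_foo.get().value; }

    std::reference_wrapper<Foo> m_foo;   // rebindable, cannot be null
};

static_assert(std::is_copy_assignable<Bar>::value, "Bar stays assignable");

inline int read_after_assignment() {
    Foo f1; f1.value = 1;
    Foo f2; f2.value = 2;
    Bar a(f1), b(f2);
    a = b;                               // rebinds a.m_foo to f2
    assert(f1.value == 1);               // f1 itself is untouched
    return a.read();
}
```

Unlike a plain Foo& member, assignment here reseats the wrapper instead of being unavailable, at the cost of the get() calls.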

C++:using pointer to unordered_map or just defining it as a member variable from this type in a class?

I have a problem which I cannot understand:
Let's Say I have a class System with several member fields, and one of them is of type unordered_map, so when I declare the class in the header file, I write at the beginning of the header #include <unordered_map>.
Now, I have two ways of declaring this field:
1. std::unordered_map<std::string,int> umap;
2. std::unordered_map<std::string,int>* p_umap;
Now in the constructor of the class, if I choose the first option, there is no need to initialize that field in the initializer list since the constructor of class System will call the default constructor for the field umap as part of constructing an instance of type class System.
If I choose the second option, I should initialize the field p_umap in the constructor (in the initializer list) with the operator new, and delete this dynamic allocation in the destructor.
What is the difference between these two options? If you have a class where one of its fields is of type unordered_map, how do you declare this field? As a pointer or as a variable of type unordered_map?
In a situation like the one you are describing, it seems like the first option is preferable. Most likely, in fact, the unordered map is intended to be owned by the class it is a data member of. In other words, its lifetime should not be extended beyond the lifetime of the encapsulating class, and the encapsulating class has the responsibility of creating and destroying the unordered map.
While with option 1 all this work is done automatically, with option 2 you would have to take care of it manually (and take care of correct copy-construction, copy-assignment, exception-safety, lack of memory leaks, and so on). Surely you could use smart pointers (e.g. std::unique_ptr<>) to encapsulate this responsibility into a wrapper that would take care of deleting the wrapped object when the smart pointer itself goes out of scope (this idiom is called RAII, which is an acronym for Resource Acquisition Is Initialization).
However, it seems to me like you do not really need a pointer at all here. You have an object whose lifetime is completely bounded by the lifetime of the class that contains it. In these situations, you should just not use pointers and prefer declaring the variable as:
std::unordered_map<std::string, int> umap;
Make it not a pointer until you need to make it a pointer.
Pointers are rife with user error.
For example, you forgot to mention that your class System would also need to implement
System( const System& )
and
System& operator= ( const System& )
or Bad Behavior will arise when you try to copy your object.
The difference is in how you want to be able to access umap. Pointers can allow for a bit more flexibility, but they obviously add complexity in terms of allocation (stack vs heap, destructors and such). If you use a pointer to umap, you can do some pretty convoluted stuff such as making two System's with the same umap. In the end though, go with KISS unless there's a compelling reason not to.
There is no need to define it as pointer. If you do it, you must also make sure to implement copy constructor and assignment operator, or disable them completely.
If there is no specific reason to make it a pointer (and you don't show any) just make it a normal member variable.
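As a rough sketch of the first option, the class below keeps the map as a plain member and writes no destructor, copy constructor or assignment operator at all. The set() and get() member functions are hypothetical additions for the demonstration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

class System {
public:
    void set(const std::string& key, int value) { umap[key] = value; }

    int get(const std::string& key) const {
        auto it = umap.find(key);
        return it == umap.end() ? 0 : it->second;   // 0 for a missing key (arbitrary choice)
    }

private:
    // Default-constructed with System and destroyed with it automatically.
    std::unordered_map<std::string, int> umap;
};

inline bool copies_are_independent() {
    System a;
    a.set("x", 1);
    System b = a;        // implicit copy constructor copies the whole map
    b.set("x", 2);
    return a.get("x") == 1 && b.get("x") == 2;
}
```

With a raw pointer member the same copy would share one map between a and b unless the copy operations were written by hand.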

member variable as a reference

What is the advantage of declaring a member variable as a reference?
I saw people doing that, and can't understand why.
One useful case is when you don't have access to the constructor of that object, yet don't want to work with indirection through a pointer. For example, if a class A does not have a public constructor and your class wants to accept an A instance in its constructor, you would want to store an A&. This also guarantees that the reference is initialized.
A member reference is useful when you need to have access to another object, without copying it.
Unlike a pointer, a reference cannot be changed (accidentally) so it always refers to the same object.
Generally speaking, types with unusual assignment semantics like std::auto_ptr<> and C++ references make it easier to shoot yourself in the foot (or to shoot off the whole leg).
When a reference is used as a member, assignment becomes surprising: assigning through a reference writes to the referenced object instead of reseating the reference, because references cannot be made to refer to another object. For that reason the compiler cannot generate a usable operator= for such a class (since C++11 it is defined as deleted), and a hand-written one that assigns the member overwrites the referenced object. In other words, having a reference member most of the time makes the class non-assignable.
One can avoid this surprising behaviour by using plain pointers.
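A small check of how assignment interacts with references, assuming C++11 or later. WithReference and WithPointer are hypothetical types used only for the comparison.

```cpp
#include <cassert>
#include <type_traits>

struct WithReference { int& ref; };
struct WithPointer   { int* ptr; };

// A reference member leaves copy construction intact but removes
// the implicitly generated copy assignment.
static_assert(std::is_copy_constructible<WithReference>::value, "copy ctor still works");
static_assert(!std::is_copy_assignable<WithReference>::value, "no implicit operator=");
static_assert(std::is_copy_assignable<WithPointer>::value, "pointer member keeps operator=");

// Assigning through a reference writes to the referent.
inline int assign_through_reference() {
    int x = 1, y = 2;
    int& r = x;
    r = y;                  // copies y's value into x
    assert(&r == &x);       // r still refers to x
    return x;
}
```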

When is it preferable to store data members as references instead of pointers?

Let's say I have an object Employee_Storage that contains a database connection data member. Should this data member be stored as a pointer or as a reference?
If I store it as a reference, I don't have to do any NULL checking. (Just how important is NULL checking anyway?)
If I store it as a pointer, it's easier to set up Employee_Storage (or MockEmployee_Storage) for the purposes of testing.
Generally, I've been in the habit of always storing my data members as references. However, this makes my mock objects difficult to set up, because instead of being able to pass in NULLs (presumably inside a default constructor) I now must pass in true/mock objects.
Is there a good rule of thumb to follow, specifically with an eye towards testability?
It's only preferable to store references as data members if they're being assigned at construction, and there is truly no reason to ever change them. Since references cannot be reassigned, they are very limited.
In general, I typically store as pointers (or some form of templated smart pointer). This is much more flexible - both for testing (as you mentioned) but also just in terms of normal usage.
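One way the pointer-based approach could look in a test setting is sketched below. Connection, MockConnection, set_connection(), connected() and lookup() are all hypothetical names invented here; the question does not show the real interface.

```cpp
#include <cassert>
#include <string>

struct Connection {                        // hypothetical interface
    virtual ~Connection() = default;
    virtual std::string query(const std::string& sql) = 0;
};

struct MockConnection : Connection {       // hypothetical test double
    std::string query(const std::string&) override { return "mock"; }
};

class Employee_Storage {
public:
    // A null default makes it cheap to construct the object in tests.
    explicit Employee_Storage(Connection* conn = nullptr) : conn_(conn) {}

    void set_connection(Connection* conn) { conn_ = conn; }   // can be re-seated later
    bool connected() const { return conn_ != nullptr; }

    std::string lookup(const std::string& name) {
        // The price of a nullable member: every use needs a check.
        return conn_ ? conn_->query(name) : std::string();
    }

private:
    Connection* conn_;                     // non-owning, may be null
};
```

The null check inside lookup() is exactly the cost the question mentions; a reference member would remove it but would also remove the default constructor and set_connection().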
It is almost never preferable to store references as data members, and a lot of the time it is impossible. If the objects must be assignable (as they must to be stored in a standard library container), references cannot be used. Also, references cannot be reseated, so once a reference is initialised with an object, it cannot be made to refer to a different object.
See this question Should I prefer pointers or references in member data? for a more detailed discussion of the issue.
I was trying to figure this out myself, so might as well post it. I conclude it doesn't seem to be a good idea to use reference data member because you could inadvertently create an alias when you go to initialize it.
#include <iostream>
using namespace std;
class stuff
{
public:
    explicit stuff(int& a) : x(a) // the reference must be initialized here
    {
        // assigning it in the constructor body would not work
    }
    int& x; // reference data member
};
int main()
{
    int A = 100;
    stuff B(A); // initialize B.x
    cout << B.x << endl; // outputs 100
    A = 50; // change A
    cout << B.x << endl; // outputs 50, so B.x is an alias of A
    return 0;
}
Given a choice, I like to use the most constrained type possible.
So if I don't need to support null objects I'd much prefer to declare a
Foo& m_foo;
member rather than a
Foo*const m_foo;
member, because the former declaration documents the fact that m_foo can't be null.
In the short term, the advantage isn't that great. But in the long run, when you come back to old code, the instant assurance that you don't have to worry about the case of m_foo being null is quite valuable.
There are other ways of achieving a similar effect. One project I worked on, where they didn't understand references, would insist that any potentially null pointers be suffixed '00', e.g. m_foo00. Interestingly, boost::optional seems to support references, although I haven't tried it. Or you can litter your code with assertions.
Adding to this question..
Class with reference data member:
you must pass a value to the object at construction (not unexpectedly)
breaks the rule of encapsulation, as referenced variable can be changed from outside class, without class object having any control of it. (I suppose the only use case could be something like this though, for some very specialized reasons.)
prevents creating assignment operator. What are you going to copy?
you need to ensure the referred variable is not destroyed while your object is alive

C++ - when should I use a pointer member in a class

One of the things that has been confusing for me while learning C++ (and Direct3D, but that was some time ago) is when you should use a pointer member in a class. For example, I can use a non-pointer declaration:
private:
SomeClass instance_;
Or I could use a pointer declaration
private:
SomeClass* instance_;
And then use new on it in the constructor.
I understand that if SomeClass could be derived from another class, a COM object or is an ABC then it should be a pointer. Are there any other guidelines that I should be aware of?
A pointer has following advantages:
a) You can do lazy initialization, that is, create the object only shortly before its first real usage.
b) The design: if you use pointers for members of an external class type, you can place a forward declaration above your class and thus don't need to include the headers of those types in your header. Instead, you include the third-party headers in your .cpp. That reduces compile time and prevents side effects from including too many other headers.
class ExtCamera; // forward declaration to external class type in "ExtCamera.h"
class MyCamera {
public:
MyCamera() : m_pCamera(0) { }
void init(const ExtCamera &cam);
private:
ExtCamera *m_pCamera; // do not use it in inline code inside header!
};
c) A pointer can be deleted at any time, so you have more control over the lifetime and can re-create an object, for example in case of a failure.
The advantages of using a pointer are outlined by 3DH: lazy initialization, reduction in header dependencies, and control over the lifetime of the object.
There are also disadvantages. When you have a pointer data member, you probably have to write your own copy constructor and assignment operator, to make sure that a copy of the object is created properly. Of course, you also must remember to delete the object in the destructor. Also, if you add a pointer data member to an existing class, you must remember to update the copy constructor and operator=. In short, having a pointer data member is more work for you.
Another disadvantage is really the flip side of the control over the lifetime of the object pointed to by the pointer. Non-pointer data members are destroyed automagically when the object is destroyed, meaning that you can always be sure that they exist as long as the object exists. With the pointer, you have to check for it being nullptr, meaning also that you have to make sure to set it to nullptr whenever it doesn't point to anything. Having to deal with all this may easily lead to bugs.
Finally, accessing non-pointer members is likely to be faster, because they are contiguous in memory. On the other hand, accessing pointer data member pointing to an object allocated on the heap is likely to cause a cache miss, making it slower.
There is no single answer to your question. You have to look at your design, and decide whether the advantages of pointer data members outweigh the additional headache. If reducing compile time and header dependencies is important, use the pimpl idiom. If your data member may not be necessary for your object in certain cases, use a pointer, and allocate it when needed. If these do not sound like compelling reasons, and you do not want to do extra work, then do not use a pointer.
If lazy initialization and the reduction of header dependencies are important, then you should first consider using a smart pointer, like std::unique_ptr or std::shared_ptr, instead of a raw pointer. Smart pointers save you from many of the headaches of using raw pointers described above.
Of course, there are still caveats. std::unique_ptr cleans up after itself, so you do not need to add or modify the destructor of your class. However, it is non-copyable, so having a unique pointer as a data member makes your class non-copyable as well.
With std::shared_ptr, you do not have to worry about the destructor or copying or assignment. However, the shared pointer incurs a performance penalty for reference counting.
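A minimal sketch of the smart-pointer variant of the MyCamera example, combining a forward declaration with lazy initialization. The empty ExtCamera definition is a stand-in for the real external type, and init(), ready() and lazy_init_works() are hypothetical; the header and .cpp parts are shown in one block for brevity.

```cpp
#include <cassert>
#include <memory>

class ExtCamera;   // forward declaration; the full definition would live in "ExtCamera.h"

class MyCamera {
public:
    MyCamera();
    ~MyCamera();                 // defined where ExtCamera is a complete type
    void init();                 // lazily creates the camera
    bool ready() const { return static_cast<bool>(m_camera); }

private:
    std::unique_ptr<ExtCamera> m_camera;   // owning, starts out empty
};

// --- what would normally be in MyCamera.cpp ---
class ExtCamera {};              // stand-in for the real external type

MyCamera::MyCamera() = default;
MyCamera::~MyCamera() = default; // unique_ptr deletes the camera, if any
void MyCamera::init() {
    if (!m_camera) m_camera.reset(new ExtCamera());
}

inline bool lazy_init_works() {
    MyCamera cam;
    bool before = cam.ready();
    cam.init();
    return !before && cam.ready();
}
```

The constructor and destructor are declared in the class and defined after ExtCamera is complete, because std::unique_ptr needs the complete type at the point where it may delete the object.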
Allocate it on the stack if you can, from the free-store if you have to. There is a similar question here, where you will find all the "why's".
The reason you see lots of pointer usage when it comes to games and stuff is because DirectX is a COM interface, and in honesty, most games programmers from back in the day aren't really C++ programmers, they are C-with-classes programmers, and in C pointer usage is very common.
Another reason to use pointers would be dynamic binding. If you have a base class with a virtual method and some derived classes, a member stored by value is always of the base type; you only get dynamic binding through a pointer (or a reference).
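The difference can be illustrated with a short sketch; Shape, Circle, ByValue and ByPointer are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

class ByValue {
public:
    // Copying into a Shape member slices off the derived part.
    explicit ByValue(const Shape& s) : shape_(s) {}
    std::string name() const { return shape_.name(); }
private:
    Shape shape_;
};

class ByPointer {
public:
    explicit ByPointer(std::unique_ptr<Shape> s) : shape_(std::move(s)) {}
    // The call is dispatched on the dynamic type of the pointee.
    std::string name() const { return shape_->name(); }
private:
    std::unique_ptr<Shape> shape_;
};
```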