What is the preferred way of allocating C++ class member data?

Let's say I have a class that allocates some arbitrary member data. There are two common ways that I have seen used (I know that there are others):
class A
{
public:
    A();
    ~A();
    //Accessors...
private:
    B *mB;
};
A::A()
{
    mB = new B();
}
A::~A()
{
    delete mB;
}
Versus...
class A
{
public:
    //Accessors...
private:
    B mB;
};
Assume that A itself will be allocated on the heap by consumer code.
In the general case, which method is preferred? I realize that specific situations really encourage one way or the other, but in absence of those demands, is one way preferred? What are the tradeoffs?

The second is the preferred route. Do not use new / delete unless you specifically need a variable to be on the heap or to have a lifetime longer than its container. C++ value types are easier to manage and have fewer error cases to worry about, IMHO.
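As a rough illustration of why value members are easier to manage, here is a minimal sketch. The struct B and the helper copy_is_independent are stand-ins invented for the example, not code from the question. With a value member, the compiler-generated copy constructor and destructor already do the right thing.

```cpp
#include <string>

// Hypothetical stand-in for the B in the question.
struct B {
    std::string name = "default";
};

// A holds B by value: no hand-written constructor, destructor, or copy code.
class A {
public:
    B& b() { return mB; }
    const B& b() const { return mB; }
private:
    B mB;
};

// Copies an A, changes the copy, and reports whether the original is unaffected.
bool copy_is_independent() {
    A first;
    A second = first;               // compiler-generated copy duplicates mB
    second.b().name = "changed";
    return first.b().name == "default" && second.b().name == "changed";
}
```

With the raw-pointer version, the same copy would leave two A objects deleting the same B unless a copy constructor and assignment operator were written by hand.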

It depends.
In general, if a B is large and unwieldy, it's easier to pass around a pointer to the B than the B itself. So if the B will often be dissociated from the A (for instance, if your As swap Bs), then the first way is better.
Using a pointer can also reduce dependencies. If you do it right, A.hh can get by without specifying what a B is or does (i.e. A.hh need not #include "B.hh"), so that things that depend on A.hh won't necessarily depend on B.hh.
The price of using pointers is an extra layer of machinery and the dangers of things like lost objects, double-deletion and the dereferencing of uninitialized pointers, so it shouldn't be used unless it actually gives a benefit in your situation. Some people fall in love with pointer techniques and use them everywhere; if they want to improve as programmers they have to grow out of it.

In general, prefer direct composition (the second choice). In that case there is no chance of leaking memory and the object is fully located in a contiguous memory block, allowing better cache locality.
You might use the first option if you're implementing a PIMPL, or you have a need to use one of several possible class types (via inheritance). In that case definitely use a smart pointer (boost::shared_ptr for example) to manage the memory for you.
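A minimal single-file sketch of the PIMPL case follows. It uses std::unique_ptr rather than the boost::shared_ptr mentioned above, and the member names (Impl, impl_, set_name, name) are assumptions made for the example. The comments mark where the header/source split would fall in a real project.

```cpp
#include <memory>
#include <string>

// --- what would live in A.hh ---
class A {
public:
    A();
    ~A();                               // defined where Impl is a complete type
    A(const A&) = delete;
    A& operator=(const A&) = delete;
    void set_name(const std::string& name);
    std::string name() const;
private:
    struct Impl;                        // forward declaration only
    std::unique_ptr<Impl> impl_;        // owns the hidden implementation
};

// --- what would live in A.cc ---
struct A::Impl {
    std::string name;
};

A::A() : impl_(std::make_unique<Impl>()) {}
A::~A() = default;
void A::set_name(const std::string& name) { impl_->name = name; }
std::string A::name() const { return impl_->name; }
```

Code that includes only the header part never sees the layout of Impl, so changing Impl does not force that code to recompile.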

It depends, mainly, on what you are looking for.
For simplicity's sake: don't use a pointer. Therefore the second choice.
It's easier to manage (no need to worry about memory management, deep copying, deep constness, etc...).
However you might need dynamically allocated attributes sometimes:
if you need polymorphism (otherwise the object is sliced down to its base class)
if you want to cut down your dependencies (in the header file); see the PIMPL idiom
Even in this case though, hand over the responsibility to a smart manager (smart pointer, dedicated pimpl class, etc...)
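The polymorphism case can be sketched as follows. Shape, Circle, HoldsValue and HoldsPointer are hypothetical names chosen for the illustration. A value member of the base type keeps only the base part, while a smart-pointer member keeps the dynamic type.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Value member: copying a Circle into it keeps only the Shape part.
struct HoldsValue {
    explicit HoldsValue(const Shape& s) : shape(s) {}
    Shape shape;
};

// Smart-pointer member: the dynamic type survives, and cleanup is automatic.
struct HoldsPointer {
    explicit HoldsPointer(std::unique_ptr<Shape> s) : shape(std::move(s)) {}
    std::unique_ptr<Shape> shape;
};
```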

Related

How efficient is accessing variables through a chain->of->pointers?

I had my doubts since I first saw where it leads, but now that I look at some code I have (medium-ish beginner), it strikes me as not only ugly, but potentially slow?
If I have a struct S inside a class A, called with class B (composition), and I need to do something like this:
struct S { int x[3] {1, 2, 3}; };
S *s;
A(): s {new S} {}
B(A *a) { a->s->x[1] = 4; }
How efficient is this chain: a->s->x[1]? Is this ugly and unnecessary? A potential drag? If there are even more levels in the chain, is it that much uglier? Should this be avoided? Or, if by any chance none of the previous, is it a better approach than:
S s;
B(A *a) { a->s.x[1] = 4; }
It seems slower like this, since (if I got it right) I have to make a copy of the struct, rather than working with a pointer to it. I have no idea what to think about this.

is it a better approach
In the case you just showed no, not at all.
First of all, in modern C++ you should avoid raw pointers with ownership, which means that you shouldn't call new directly. Use one of the smart pointers that fits your needs:
std::unique_ptr for sole ownership.
std::shared_ptr for multiple objects -> same resource.
I can't tell you exactly about the performance, but access through a value member s won't ever be slower than access through a pointer member s that has to be dereferenced. You should always go for the non-pointer way here.
But take another step back. You don't even need pointers here in the first place. s should just be an object like in your 2nd example and replace the pointer in B's constructor for a reference.
I have to make a copy of the struct, rather than working with a
pointer to it.
No, no copy will be made.
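A small sketch of that suggestion, filling in the class bodies that the question left out (so the exact layout of A and B here is an assumption): S is a value member, B takes A by reference, and the write lands in the caller's object.

```cpp
struct S { int x[3] {1, 2, 3}; };

class A {
public:
    S s;                                    // value member, stored inside A itself
};

class B {
public:
    explicit B(A& a) { a.s.x[1] = 4; }      // writes into the caller's A; S is not copied
};

// Builds an A, lets B modify it, and returns the element B wrote to.
int modified_element() {
    A a;
    B b(a);
    (void)b;                                // b is only needed for its constructor's effect
    return a.s.x[1];
}
```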

The real cost of using pointers to objects over many iterations is not necessarily the dereferencing of the pointer itself, but the potential cost of loading another cache line into the CPU cache. As long as the pointer points to something within a cache line that is already loaded, the cost is minimal.

Always avoid dynamic allocation with new wherever possible, as it is potentially a very expensive operation, and requires an indirection operation to access the thing you allocated. If you do use it, you should also be using smart pointers, but in your case there is absolutely no reason to do so - just have an instance of S (a value, not a pointer) inside your class.

If you consider a->s->x[1] = 4 as ugly, then it is rather because of the chain than because of the arrows, and a->s.x[1] = 4 is ugly to the same extent. In my opinion, the code exposes S more than necessary, though there may sometimes exist good reasons for doing so.
Performance is one thing that matters; others are maintainability and adaptability. A chain of member accesses usually supports the principle of information hiding to a lesser extent than designs where such chains are avoided. The involved objects (and therefore the involved code) are more tightly coupled than otherwise, and this usually comes at the cost of maintainability. Compare, for example, the Law of Demeter as a design principle aimed at better information hiding:
In particular, an object should avoid invoking methods of a member
object returned by another method. For many modern object oriented
languages that use a dot as field identifier, the law can be stated
simply as "use only one dot". That is, the code a.b.Method() breaks
the law where a.Method() does not. As an analogy, when one wants a dog
to walk, one does not command the dog's legs to walk directly; instead
one commands the dog which then commands its own legs.
Suppose, for example, that you change the size of array x from 3 to 2, then you have to review not only the code of class A, but potentially that of any other class in your program.
However, if we avoid exposing too much of component S, class A could be extended by a member function int setSAt(int x, int value), which can then also check, for example, array bounds; changing S then influences only those classes that have S as a component:
B(A *a) { a->setSAt(1,4); }
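One possible shape for such a setter is sketched below. It is a variation on the signature mentioned above (an unsigned index, a void return, and an exception on a bad index), and getSAt is an assumed companion added only so the result can be read back.

```cpp
#include <cstddef>
#include <stdexcept>

struct S { int x[3] {1, 2, 3}; };

class A {
public:
    // Writes value at index, rejecting indices outside the array.
    void setSAt(std::size_t index, int value) {
        check(index);
        s.x[index] = value;
    }
    // Hypothetical read accessor with the same bounds check.
    int getSAt(std::size_t index) const {
        check(index);
        return s.x[index];
    }
private:
    static void check(std::size_t index) {
        if (index >= sizeof(S::x) / sizeof(int)) {
            throw std::out_of_range("index outside S::x");
        }
    }
    S s;    // hidden: callers never touch S directly
};
```

If the size of x changes, only A's bounds check depends on it; callers such as B keep compiling unchanged.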

Is using pointers in C++ always bad?

I was told to avoid using pointers in C++. It seems that I can't avoid them, however, in the code I'm trying to write, or perhaps I'm missing out on other great C++ features.
I wish to create a class (class1) which contains another class (class2) as a data member. I then want class2 to know about class1 and be able to communicate with it.
I could have a reference to class1 as a member in class2 but that then means I need to provide a reference to class1 as a parameter in the constructor of class2 and use initialiser lists which I don't want. I'm trying to do this without needing the constructor to do it.
I would like for class2 to have a member function called Initialise which could take in the reference to class1, but this seems impossible without using pointers. What would people recommend here? Thanks in advance.
The code is completely simplified just to get the main idea across :
class class1;   // forward declaration so class2 can hold a class1*
class class2
{
public:
    void Initialise(class1* c1)
    {
        this->c1 = c1;
    }
private:
    class1* c1;
};
class class1
{
public:
    void InitialiseClass2()
    {
        c2.Initialise(this);
    }
private:
    class2 c2;
};

this seems impossible without using pointers
That is incorrect. Indeed, to handle a reference to some other object, take a reference into a constructor:
class class2
{
public:
class2(class1& c1)
: c1(c1)
{}
private:
class1& c1;
};
The key here is to initialise, not assign, the reference. Whether this is possible depends on whether you can get rid of your Initialise function and settle into RAII (please do!). After that, whether this is actually a good idea depends on your use case; nowadays, you can almost certainly make ownership and lifetime semantics much clearer by using one of the smart-pointer types instead — even if it's just a std::weak_ptr.
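Put together with the question's class1, the reference version might look like the sketch below. The accessors owner() and child() are assumptions added so the link can be observed, and copying is disabled because a copied class2 would still refer to the original class1.

```cpp
class class1;   // forward declaration so class2 can name the type

class class2 {
public:
    explicit class2(class1& owner) : c1(owner) {}   // initialise, not assign
    class1& owner() const { return c1; }
private:
    class1& c1;
};

class class1 {
public:
    class1() : c2(*this) {}                 // hand the enclosing object to the member
    class1(const class1&) = delete;         // a copy's c2 would refer to the wrong class1
    class1& operator=(const class1&) = delete;
    const class2& child() const { return c2; }
private:
    class2 c2;
};
```

Passing *this from the initialiser list is fine here only because class2's constructor stores the reference without using the not-yet-constructed class1.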
Anyway, speaking more generally.
Are pointers "always" bad? No, of course not. I'd almost be tempted to say that managing dynamic memory yourself is "always" bad, but I won't make a generalisation.
Should you avoid them? Yes.
The difference is that the latter is a guideline to steer you away from manual memory management, and the former is an attempted prohibition.

No, using pointers in C++ is not bad at all, and I see this anti-advice over and over again. What is bad is managing pointers by yourself, unless you are creating a pointer-managing low-level entity.
Again, I shall make a very clear distinction. Using pointers is good. Very few real C++ programs can do without USING pointers. Managing pointers is bad, unless you are working on a pointer manager.

A pointer can be nullptr whereas a reference must always be bound to something (and cannot be subsequently re-bound to something else).
That's the chief distinction and the primary consideration for your design choice.
Memory management of pointers can be delegated to std::shared_ptr and std::unique_ptr as appropriate.

Well, I have never needed two classes to hold references to each other, and for good reasons: how would you test those classes? And if you later need to change how the two classes communicate, you will probably have to change code in both. You can work around this in many ways:
You may in reality need just one class (you have split things into too many classes).
You can register an Observer for a class (using a third class; in that case you will end up with a pointer, but at least the two classes are less coupled and easier to test).
You can think (maybe) of a new interface that requires only one class to call methods on the other.
You could pass a lambda (or a functor if you do not have C++11) into one of the methods of the class, removing the need for a back reference.
You could pass a reference to the class into a method.
Maybe you have too few classes and in reality you need a third class that communicates with both.
It is possible you need a Visitor (maybe you really need multiple dispatch).
Some of the workarounds above need pointers, some do not. The choice is yours ;)
NOTE: However, what you are doing is perfectly fine to me (I see you do some trickery only in constructors, but you have probably omitted more code, in which case that can cause you trouble). In my case I "register" one class with another; then, after the constructor has been called, I have only one class calling the other and not vice versa.
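The lambda option from the list above could look roughly like this. Worker, Owner and their members are hypothetical names, and the callback is passed to the constructor rather than to a method to keep the sketch short. Worker knows nothing about Owner's type; it only holds a callable.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Worker {
public:
    explicit Worker(std::function<void(const std::string&)> notify)
        : notify_(std::move(notify)) {}
    void run() { notify_("done"); }         // reports back without knowing who listens
private:
    std::function<void(const std::string&)> notify_;
};

class Owner {
public:
    // The lambda captures this, so Owner must not be copied or moved.
    Owner() : worker_([this](const std::string& msg) { log_.push_back(msg); }) {}
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;
    void start() { worker_.run(); }
    const std::vector<std::string>& log() const { return log_; }
private:
    std::vector<std::string> log_;
    Worker worker_;
};
```

Worker can now be tested on its own by handing it any callable, which addresses the testability concern raised above.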

First of all whenever you have a circular dependency in your design think about it twice and make sure it's the way to go. Try to use the Dependency inversion principle in order to analyze and fix your dependencies.
I was told to avoid using pointers in C++. It seems that I can't avoid them, however, in the code I'm trying to write, or perhaps I'm missing out on other great C++ features.
Pointers are a powerful programming tool. Like any other feature of C++ (or of any programming language in general), they should be used when they are the right tool. C++ additionally gives you references, which are similar to pointers in usage but have a nicer syntax. They also can't be null, so a reference that has not outlived its target always refers to a valid object.
So use pointers whenever you need to, but try to avoid raw pointers and prefer a smart pointer whenever possible. This will protect you against some trivial memory leaks, but you still have to pay attention to your objects' life cycles: for each dynamically allocated object you should know clearly who creates it, and who releases its memory and when.
Pointers (and references) are very useful in general because they can be used to pass parameters to a function by reference, so you avoid passing heavy objects by value on the stack. Imagine, for example, that you have a very big array of heavy objects (whose copy constructor and assignment operator are time consuming) and you would like to sort these objects. One simple method is to sort pointers to these objects, so that instead of moving whole objects during the sort you just move the pointers, which are a very lightweight data type (basically the size of a machine address).
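The sorting idea can be sketched as follows. Heavy and sorted_view are names invented for the example, and the 4096-byte payload simply stands in for "expensive to copy". The pointers here are non-owning, so raw pointers are appropriate.

```cpp
#include <algorithm>
#include <array>
#include <vector>

struct Heavy {
    int key = 0;
    std::array<char, 4096> payload{};   // makes copying or moving expensive
};

// Returns pointers into items, ordered by key; items itself is not reordered.
std::vector<const Heavy*> sorted_view(const std::vector<Heavy>& items) {
    std::vector<const Heavy*> view;
    view.reserve(items.size());
    for (const Heavy& h : items) {
        view.push_back(&h);
    }
    std::sort(view.begin(), view.end(),
              [](const Heavy* a, const Heavy* b) { return a->key < b->key; });
    return view;
}
```

The view is only valid while items is alive and not resized, which is the lifetime question the paragraph above warns about.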

data inheritance in C++

I have two classes, one for storing base data and the other for storing additional data, as follows:
struct AnimationState {
    virtual ~AnimationState() {}
    Vector3f m_spacialData;
    float m_fTimeStamp;
};
And the derived class:
struct HermiteAnimationState : public AnimationState {
    virtual ~HermiteAnimationState() {}
    Vector3f m_tangentIn;
    Vector3f m_tangentOut;
};
My question is: how can I first create an instance of HermiteAnimationState and then upcast it to AnimationState for storing in a vector like this:
std::vector<AnimationState> m_vStates;
...
Later, I want to get the AnimationState object and downcast it to HermiteAnimationState to access the additional data (members m_tangentIn and m_tangentOut).
HermiteAnimationState* p = dynamic_cast<HermiteAnimationState*>(&m_vStates[i]);

The way polymorphism works in C++ is that if B is a base class and D is derived from B, then:
a pointer to D can be used where a pointer to B is expected
a reference to D can be used where a reference to B is expected
What you can't do in C++ is actually use a value of type D in a context where a value of type B is expected. For example, you can't store derived objects in an array of base objects. This makes sense when you consider that a derived object may have a different size from a base object.
Similarly, you can't store derived objects in a vector of base objects.
What you can do is store pointers to HermiteAnimationState in a vector of pointers to AnimationState. It's up to you how to manage the memory. For example, the following would be valid:
std::vector<AnimationState*> m_vStates;
HermiteAnimationState h_a_s;
m_vStates.push_back(&h_a_s);
...
HermiteAnimationState* p = dynamic_cast<HermiteAnimationState*>(m_vStates[i]);
Since h_a_s is a local variable, it'll be destroyed automatically at the end of its scope.
But this is probably an unworkable approach, because you probably want the objects referred to by the vector elements to persist beyond the current scope. We can use std::unique_ptr for this purpose. A std::unique_ptr owns the object it points to, and as long as it stays alive, so does that object; and it deletes the object when it is itself destroyed. So a vector of std::unique_ptr objects behaves like a vector of objects themselves in terms of memory management. Now you can do
std::vector<std::unique_ptr<AnimationState>> m_vStates;
m_vStates.emplace_back(new HermiteAnimationState);
...
HermiteAnimationState* p =
dynamic_cast<HermiteAnimationState*>(m_vStates[i].get());
(Note, however, that you can't copy this vector; you can only move it.)
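A compilable sketch of that approach follows. Vector3f is not defined in the question, so a trivial stand-in is assumed here, and count_hermite is a helper invented for the example.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Vector3f { float x = 0, y = 0, z = 0; };    // stand-in for the real vector type

struct AnimationState {
    virtual ~AnimationState() = default;
    Vector3f m_spacialData;
    float m_fTimeStamp = 0.0f;
};

struct HermiteAnimationState : AnimationState {
    Vector3f m_tangentIn;
    Vector3f m_tangentOut;
};

// Counts how many stored states are really HermiteAnimationState objects.
std::size_t count_hermite(const std::vector<std::unique_ptr<AnimationState>>& states) {
    std::size_t n = 0;
    for (const auto& s : states) {
        // dynamic_cast on a pointer yields nullptr when the dynamic type does not match.
        if (dynamic_cast<const HermiteAnimationState*>(s.get()) != nullptr) {
            ++n;
        }
    }
    return n;
}
```

Elements can be added with states.push_back(std::make_unique<HermiteAnimationState>()), which avoids a bare new expression.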

Basically, you need to use some kind of reference to the pointed object because you need dynamic polymorphism.
The simplest but most error-prone way would be to use "naked" pointers. The first problem with this is that you have to do the destruction manually: containers will destroy the pointer, not what it points to.
The safer way to do this is to use smart pointers, which perform the destruction according to a fixed rule that the smart pointer embeds in its type. The simplest one, and certainly the best choice if you are in doubt, is std::unique_ptr, which can't be copied but can be moved. The other choice, which should be thought about carefully before being used, is std::shared_ptr, which is useful if and only if you don't know when you should destroy these objects but you know it is when no system refers to them any more. Some other systems might just be observing that object, in which case use std::weak_ptr.
Now, from reading your question, I think you are certainly processing a lot of this animation data. I think there is an obvious design issue there, though I might be wrong.
However, it looks like, if you have a lot of these AnimationState objects to manage in a loop, you will get performance issues. This is a common issue in games, mainly caused by poor cache coherency.
What I would recommend in this case would be to NOT use
inheritance: it's an invitation to the CPU to jump all over the place and trigger cache misses;
dynamic_cast: it's one of the few operations that are not guaranteed to finish in a predictable time (along with new and delete, for example), which basically means that if you are in a critical loop, you can lose a lot of time in it. In some cases you can't avoid using dynamic_cast (like when doing dynamic plugins), but in most cases, using it just because you have chosen to use inheritance is just wrong. If you use inheritance, then you should use virtual calls.
However, what I suggest is even more drastic: don't use inheritance at all.
Obviously, this is only advice. If you are not doing something with a critical loop, it doesn't matter. I'm just worried because it looks like you are using inheritance for composition, which always has bad consequences for both the readability and the performance of the code.
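For the "use virtual calls rather than dynamic_cast" point, a minimal sketch might look like this. The Keyframe hierarchy and sum_at are hypothetical and are not the asker's animation classes. The caller never needs to know the concrete type.

```cpp
#include <memory>
#include <vector>

struct Keyframe {
    virtual ~Keyframe() = default;
    virtual float value_at(float t) const = 0;     // each type answers for itself
};

struct ConstantKeyframe : Keyframe {
    explicit ConstantKeyframe(float v) : value(v) {}
    float value_at(float) const override { return value; }
    float value;
};

struct LinearKeyframe : Keyframe {
    LinearKeyframe(float s, float e) : start(s), end(e) {}
    float value_at(float t) const override { return start + (end - start) * t; }
    float start;
    float end;
};

// Sums every keyframe's value at t through the virtual interface, with no casts.
float sum_at(const std::vector<std::unique_ptr<Keyframe>>& keys, float t) {
    float total = 0.0f;
    for (const auto& k : keys) {
        total += k->value_at(t);
    }
    return total;
}
```

This still pays for the pointer indirection that the answer warns about; it only removes the type tests from the loop.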

How to store class member objects in C++

I am trying to write a simple game using C++ and SDL. My question is, what is the best practice to store class member variables.
MyObject obj;
MyObject* obj;
I have read a lot about eliminating pointers as much as possible in similar questions, but I remember that a few years back some books I read used them a lot (for all non-trivial objects). Another thing is that SDL returns pointers from many of its functions, and therefore I would have to use "*" a lot when working with SDL objects.
Also am I right when I think the only way to initialize the first one using other than default constructor is through initializer list?

Generally, using value members is preferred over pointer members. However, there are some exceptions, e.g. (this list is probably incomplete and only contains the reasons I could come up with immediately):
1. When the members are huge (use sizeof(MyObject) to find out), the difference often doesn't matter for the access and stack size may be a concern.
2. When the objects come from another source, e.g., when there are factory functions creating pointers, there is often no alternative to storing the pointers.
3. If the dynamic type of the object isn't known, using a pointer is generally the only alternative. However, this shouldn't be as common as it often is.
4. When there are more complicated relations than direct ownership, e.g., if an object is shared between different objects, using a pointer is the most reasonable approach.
In all of these cases you wouldn't use a pointer directly but rather a suitable smart pointer. For example, for 1. you might want to use a std::unique_ptr<MyObject> and for 4. a std::shared_ptr<MyObject> is the best alternative. For 2. you might need to use one of these smart pointer templates combined with a suitable deleter function to deal with the appropriate clean-up (e.g. for a FILE* obtained from fopen() you'd use fclose() as a deleter function; of course, this is a made-up example, as in C++ you would use I/O streams anyway).
In general, I normally initialize my objects entirely in the member initializer list, independent of how the members are represented exactly. However, yes, if your member objects require constructor arguments, these need to be passed from a member initializer list.
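The FILE* example with fclose() as the deleter, combined with a member initializer list, can be sketched like this. FileCloser, FilePtr and Logger are names made up for the illustration.

```cpp
#include <cstdio>
#include <memory>
#include <utility>

// Deleter that hands the FILE* back to the C library.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

class Logger {
public:
    // The member is initialised, not assigned, in the initializer list.
    explicit Logger(FilePtr file) : file_(std::move(file)) {}

    // Returns false when there is no open file or the write fails.
    bool write(const char* text) {
        return file_ != nullptr && std::fputs(text, file_.get()) >= 0;
    }
private:
    FilePtr file_;      // closed automatically when Logger is destroyed
};
```

A Logger could be built from any source of FILE*, for example Logger log(FilePtr(std::fopen(path, "w"))) where path is whatever file name the program uses. The deleter is never invoked for a null pointer, so a failed fopen is harmless.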

First, I would like to say that I completely agree with Dietmar Kühl's and Mats Petersson's answers. However, you also have to take into account that SDL is a pure C library where the majority of the API functions expect C pointers to structs that can own big chunks of data. So you should not allocate them on the stack (you should use operator new to allocate them on the heap). Furthermore, because the C language does not have smart pointers, you need to use std::unique_ptr::get() to recover the C pointer that the std::unique_ptr owns before passing it to SDL API functions. This can be quite dangerous because you have to make sure that the std::unique_ptr does not go out of scope while SDL is using the C pointer (there is a similar problem with std::shared_ptr). Otherwise you will get a segfault, because the std::unique_ptr will delete the object while SDL is still using it.
Whenever you need to call pure C libraries inside a C++ program, I recommend the use of RAII. The main idea is that you create a small wrapper class that owns the C pointer and also calls the SDL API functions for you. Then you use the class destructor to delete all your C pointers.
Example:
class SDLAudioWrap {
public:
    SDLAudioWrap() { // constructor
        // allocate SDL_AudioSpec
    }
    ~SDLAudioWrap() { // destructor
        // free SDL_AudioSpec
    }
    // here you wrap all SDL API functions that involve
    // SDL_AudioSpec and that you will use in your program
    // It is quite simple
    void SDL_do_some_stuff() {
        // :: selects the original C function (a placeholder name), not this member
        ::SDL_do_some_stuff(ptr);
        // SDL_do_some_stuff(SDL_AudioSpec* ptr)
    }
private:
    SDL_AudioSpec* ptr;
};
Now your program is exception safe and you don't have the possible issue of having smart pointers deleting your C pointer while SDL is using it.
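To make the wrapper idea concrete without depending on SDL, here is a self-contained sketch against a made-up C-style API. The c_buffer_* functions and the g_live_buffers counter are invented for the example and are not part of any real library.

```cpp
#include <cstdlib>
#include <new>

// --- hypothetical C-style API standing in for a library such as SDL ---
struct c_buffer { int size; };
static int g_live_buffers = 0;      // outstanding allocations, for illustration only

c_buffer* c_buffer_create(int size) {
    auto* b = static_cast<c_buffer*>(std::malloc(sizeof(c_buffer)));
    if (b != nullptr) { b->size = size; ++g_live_buffers; }
    return b;
}
void c_buffer_destroy(c_buffer* b) {
    if (b != nullptr) { --g_live_buffers; std::free(b); }
}
int c_buffer_size(const c_buffer* b) { return b->size; }

// --- RAII wrapper: acquires in the constructor, releases in the destructor ---
class BufferWrap {
public:
    explicit BufferWrap(int size) : ptr_(c_buffer_create(size)) {
        if (ptr_ == nullptr) throw std::bad_alloc();
    }
    ~BufferWrap() { c_buffer_destroy(ptr_); }
    BufferWrap(const BufferWrap&) = delete;             // exactly one owner
    BufferWrap& operator=(const BufferWrap&) = delete;
    int size() const { return c_buffer_size(ptr_); }    // forwards to the C function
private:
    c_buffer* ptr_;
};
```

Because the raw pointer never leaves the wrapper, the C library cannot be handed a pointer that has already been freed.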
UPDATE 1: I forgot to mention that because SDL is a C library, you will need a custom deleter in order to properly manage its C structs using smart pointers.
Concrete example: GSL GNU scientific library. Integration routine requires the allocation of a struct called "gsl_integration_workspace". In this case, you can use the following code to ensure that your code is exception safe
auto deleter= [](gsl_integration_workspace* ptr) {
gsl_integration_workspace_free(ptr);
};
std::unique_ptr<gsl_integration_workspace, decltype(deleter)> ptr4 (
gsl_integration_workspace_alloc (2000), deleter);
Another reason why I prefer wrapper classes

In case of initialization, it depends on what the options are, but yes, a common way is to use an initializer list.
The "don't use pointers unless you have to" is good advice in general. Of course, there are times when you have to - for example when an object is being returned by an API!
Also, using new will waste quite a bit of memory and CPU time if MyObject is small. Each object created with new has an overhead of around 16-48 bytes in a typical modern OS, so if your object is only a couple of simple types, then you may well have more overhead than actual storage. In a larger application, this can easily add up to a huge amount. And of course, a call to new or delete will most likely take some hundreds or thousands of cycles (above and beyond the time used in the constructor). So you end up with code that runs slower and takes more memory, and of course there's always some risk that you mess up and have memory leaks, causing your program to potentially crash due to running out of memory when it's not REALLY out of memory.
And as the famous Murphy's law states, these things just have to happen at the worst possible and most annoying times: when you have just done some really good work, or when you've just succeeded at a level in a game, or something. So avoiding those risks whenever possible is definitely a good idea.

Well, creating the object is a lot better than using pointers because it's less error prone. Your code doesn't describe it well.
MyObj* foo;
foo = new MyObj;
foo->CanDoStuff(stuff);
//Later when foo is not needed
delete foo;
The other way is
MyObj foo;
foo.CanDoStuff(stuff);
less memory management but really it's up to you.

As the previous answers said, "don't use pointers unless you have to" is good advice for general programming, but there are many issues that could ultimately make you choose pointers. Furthermore, in your initial question you are not considering the option of using references. So you can face three types of member variables in a class:
MyObject obj;
MyObject* obj;
MyObject& obj;
I usually consider the reference option before the pointer one, because you don't need to worry about whether the pointer is NULL or not.
Also, as Dietmar Kühl pointed out, a good reason for selecting pointers is:
If the dynamic type of the object isn't known, using a pointer is
generally the only alternative. However, this shouldn't be as common
as it often is.
I think this point is of particular importance when you are working on a big project. If you have many classes of your own, arranged in many source files, and you use them in many parts of your code, you will end up with long compilation times. If you use plain class instances (instead of pointers or references), a simple change in one of your class header files will trigger recompilation of every class that includes the modified one. One possible solution for this issue is to use forward declarations, which work with pointers or references.

Encapsulation vs structs - is this considered bad style?

I have a bunch of classes in a CUDA project that are mostly glorified structs and are dependent on each other by composition:
class A {
public:
typedef boost::shared_ptr<A> Ptr;
A(uint n_elements) { ... // allocate element_indices };
DeviceVector<int>::iterator get_element_indices();
private:
DeviceVector<int> element_indices;
}
class B {
public:
B(uint n_elements) {
... // initialize members
};
A::Ptr get_a();
DevicePointer<int>::iterator get_other_stuff();
private:
A::Ptr a;
DeviceVector<int> other_stuff;
}
DeviceVector is just a wrapper around thrust::device_vectors and the ::iterator can be cast to a raw device pointer. This is needed, as custom kernels are called and require handles to device memory.
Now, I do care about encapsulation, but
raw pointers to the data have to be exposed, so the classes using A and B can run custom kernels on the GPU
a default constructor is not desired, device memory should be allocated automatically --> shared_ptr<T>
only very few methods on A and B are required
So, one could make life much simpler by simply using structs
struct A {
void initialize(uint n_elements);
DeviceVector<int> element_indices;
}
struct B {
void initialize(uint n_elements);
A a;
DeviceVector<int> other_stuff;
}
I'm wondering whether I'm correct that in the sense of encapsulation this is practically equivalent. If so, is there anything that is wrong with the whole concept and might bite at some point?

Make it simple. Don't introduce abstraction and encapsulation before you need it.

It is a good habit to always make your data members private. It may seem at first that your struct is tiny, has no or a couple of member functions, and needs to expose the data members. However, as your program evolves, these "structs" tend to grow and proliferate. Before you know it, all of your code depends on the internals of one of these structs, and a slight change to it will reverberate throughout your code base.
Even if you need to expose raw pointers to the data, it is still a good idea to do that through getters. You may want to change how the data is handled internally, e. g. replace a raw array with an std::vector. If your data member is private and you are using a getter, you can do that without affecting any code using your class. Furthermore, getters let you enforce const-ness, and make a particular piece of data read-only by returning a const pointer.
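A small sketch of the getter approach follows. The class Samples and the function sum are hypothetical, with std::vector standing in for whatever storage the real class uses. The const overload hands out a read-only pointer, and the storage can later change without touching callers.

```cpp
#include <cstddef>
#include <vector>

class Samples {
public:
    explicit Samples(std::size_t n) : values_(n, 0) {}

    int* data() { return values_.data(); }                // writable access
    const int* data() const { return values_.data(); }    // read-only access
    std::size_t size() const { return values_.size(); }
private:
    std::vector<int> values_;       // could be swapped for another container later
};

// Reads through the const getter; it cannot modify the samples.
int sum(const Samples& s) {
    int total = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        total += s.data()[i];
    }
    return total;
}
```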
It is a bit more work up front, but most of the time it pays off in the long run.

It's a trade off.
Using value structs can be a beautifully simple way to group a bunch of data together. They can be very kludgy if you start tacking on a lot of helper routines and rely on them beyond their intended use. Be strict with yourself about when and how to use them and they are fine. Having zero methods on these objects is a good way to make this obvious to yourself.
You may have some set of classes that you use to solve a problem; I'll call it a module. Value structs within the module are easy to reason about. Outside of the module you have to hope for good behavior. You don't have strict interfaces on them, so you have to hope the compiler will warn you about misuse.
Given that statement, I think they are more appropriate in anonymous or detail namespaces. If they end up in public interfaces, people tend to add sugar to them. Delete the sugar or refactor it into a first-class object with an interface.
I think they are more appropriate as const objects. The problem you fall into is that you are (trying to) maintain the invariants of this "object" everywhere that it's used, for its entire lifetime. If a different level of abstraction wants them with slight mutations, make a copy. The named parameter idiom is good for this.
Domain Driven Design gives a thoughtful, thorough treatment of the subject. It frames it in a more practical sense of how to understand and facilitate design.
Clean Code also discusses the topic, though from a different perspective. It is more of a morality book.
Both are awesome books and generally recommended outside of this topic.
