Smart pointers. When, where and how?

Smart pointers. When, where and how? - c++

First off, since there are different kinds of smart pointers, I'd like to focus this question on two of them: reference counted intrusive and non-intrusive smart pointers. The question is asked individualy for each pointer type.
I am not really sure how to formulate my question, so here's what I'm not asking:
I am not asking why, or when, are smart pointers needed. Neither which type of smart pointer should I use and for what.
Here is what I'm asking, and I hope it's clear enough: When dealing with "smartly-managed" objects, in which contextes should I use which pointer semantics? That is, smart pointer semantics, raw pointer semantics, something else (Such as a reference to a smart pointer)?
It's preety obvious that when I "store" a pointer to an object (object being a reference counted memory entity), such as a global pointer, or as a class member, it should be a smart pointer, so it would claim ownership, but what about other situations?
When I'm passing a pointer as a function argument, should it be a smart-pointer, a raw pointer, a reference to a smart pointer, or maybe something else? What about returned pointers? Local pointers? so on...
Ofcourse, I could use smart pointers everywhere, which is the safest option, but I am feeling that this is really unnecessary and adds overhead.

IMHO, sometimes it's better to do things faster than to have a minor performance improvement. I'm supposing you will do things faster if you always use smart pointers.
My suggestion: Use smart pointers everywhere. Then use a profiler to see if it generates considerable overhead. Where it does, change it :)

I suggest you limit the use of pointers, smart or otherwise, as much as possible. I don't know your background, but many people coming from languages like Java or C# use pointers when they should be really using values and call by reference. Use of pointers in a C++ program should be relatively rare.

If you are tyring to contrast "smart pointer semantics" and "raw pointer semantics", you are incorrectly assuming that you can group all "smart pointer semantics" together. I disagree. The difference between boost::scoped_ptr and boost::shared_ptr is of the same order of magnitude as the difference bwteeen boost::shared_ptr and T*
When you "store" a pointer to an object as a class member, you haven't really said a lot about the semantics. If the referenced object is logically a member ("owned"), then yes, you'd want a smart pointer. But one with very specific semantics: single ownership. This means no shared ownership, nor a weak pointer, to name two other common smart pointers. On the other hand, if you're storing a pointer to a error logging object, you probably want a weak pointer. This will prevent you from crashing during shutdown - the weak pointer will be NULL if the log is gone.
Similarly, when you're returning a pointer from a function, the actual semantics of the case will dictate the type of pointer you need. Not the simple fact that it's returned from a function.

My list of pointers:
normal usage: normal members and (const) references to them
sharing and keeping the object alive (owners, containers): shared_ptr
sharing, but not keeping alive (users): weak_ptr
scoped usage: scoped_ptr
other usages (output-parameter,...): raw pointer

In many situations, the use of smart pointers relate to memory management and/or exception handling. The STL auto_ptr neatly manage destruction even in complex try/catch environment. Smart pointers are useful to delegate the lifecycle of the pointed object to the smart pointer. It is generally needed whenever it is difficult to follow the paradigm "destroy your object where you have created it". A reference type smart pointer can be useful when you cannot easily manage a shared object. I prefer to solve this sort of problem with a good architecture, but there is cases where smart pointers are the best way.

In my experience, when I'm passing the object to a function that doesn't need to store a reference to the object (or otherwise, that calling the function does not affect the lifetime of the object in any way), then it's safe to pass the object by reference. This makes the code less "tied" to the smart pointer type.
On the other hand, always using the smart pointer type is "always safe", so when in doubt....

"When I'm passing a pointer as a function argument, should it be a smart-pointer, a raw
pointer, a reference to a smart pointer, or maybe something else? What about returned
pointers? Local pointe rs? so on..."
my 2p, unless you have an obvious reason to use shared_ptrs don't:
Don't pass pointers to functions, pass references to the objects
Don't return pointers from functions, pass objects as non-const references and change those objects
When creating local objects on the heap use scoped_ptr

Related

Are there some smarter smart pointers in C++?

Quite often I work with an old code where raw pointers are mixed with smart ones and I don't have the time to change all raw ones to smart.
And there can be some situations like static raw pointer pointing to an object, which can be already destructed and at first it seems like a situation to use weak_ptr to hold the reference, but there the problem arises, because the place with raw pointer does not have any information about shared_ptrs already pointing to the same object.
So:
1) Is there any smarter smart pointer that tracks all pointers (both raw and smart) to an object?
2) Is there any smarter smart pointer that at least tracks all shared_ptrs to an object?
I don't want a discussion about an implementation, if possible I want to use it as a black-box.
EDIT: I asked 2), because for example calling make_shared on an object twice, makes 2 separate shared_ptr reference counters.

1) Is there any smarter smart pointer that tracks all pointers (both raw and smart) to an object?
No. There is no such standard pointer and such pointer is not implementable in standard C++.
2) Is there any smarter smart pointer that at least tracks all shared_ptrs to an object?
It is somewhat unclear what you mean. Ownership is implicitly shared with all copies of the shared pointer. If you mean that you want taking ownership with separate shared pointers to work, there is no such smart pointer in C++. Maybe implementable with a global data structure.
On the other hand, std::enable_shared_from_this might be what you're looking for instead. It still doesn't work if you try to take shared ownership separately, but it provides a convenient way to join an existing ownership without needing access to any of the shared pointers.

You may want to program in a language not based in any way shape or form on C.
C has pointers. Smart pointers are not real pointers.
Unless you invent a different language, not compatible at any level with C, you will always have pointers and these won't be under the control of any so called "smart pointer", which is neither smart nor a pointer.

Smart pointers usage

I have a project and I want make smart pointers usage better.
The main idea is to use them when returning new object from function. The question is what smart pointer to use? auto_ptr or shared_ptr from boost? As I know, auto_ptr is slower but it can fall back to the 'pure' pointer.
And if I'll use smart pointer in place where I don't need it, would it make the perfomance slower?

What makes you think auto_ptr is slower than shared_ptr? Typically I would expect the reverse to be true, since shared_ptr needs to update the reference count.
As for which you should use, different smart pointers imply different ownership semantics. Ownership implies the responsibility to delete the object when it is no longer needed.
A raw pointer implies no ownership; a program that uses smart pointers correctly may still make use of raw pointers in a lot of places where ownership is not intended (for example, if you need to pass an optional reference to an object into a function, you would often use a raw pointer).
scoped_ptr implies single (ie, non-shared), non-transferable ownership.
auto_ptr implies single (ie, non-shared) transferable ownership. This is the smart pointer I would use to return a newly constructed object from a function (the function is transferring the object to its caller). auto_ptr suffers from the disadvantage that due to limitations of the language when auto_ptr was defined, it is difficult to use correctly (this has given it a very bad reputation, though the intended purpose of a smart pointer with single, transferable ownership semantics was and is both valid and useful).
unique_ptr has the same semantics as auto_ptr, but uses new C++0x features (rvalue references) to make it a lot safer (less prone to incorrect use) than auto_ptr. If you are developing on a platform where unique_ptr is available, then you should use it instead of auto_ptr.
shared_ptr implies shared ownership. In my opinion this is over-used. It does have many valid uses, but it should not simply be used as a default option.
I would also add that shared_ptr is often used with STL containers because the other smart pointer classes do not achieve their intended function in that context (due to copying of values internally within the container). This often leads to use of shared_ptr where shared ownership is not really the intended meaning. In these cases, I suggest (where possible) using the boost pointer-container classes (ptr_vector, ptr_map and so on), which provide the (commonly desired) semantics of a container with transferable, but single (non-shared) ownership.
You should always think about the ownership of your objects: in most cases, with a clean system design, each object has one obvious owner, and ownership does not need to be shared. This has the advantage that it is easy to see exactly where and when objects will be freed, and no reference counting overhead is needed.
[edited to note the new unique_ptr]

You probably should use shared_ptr<>. It's hard to be more specific without knowing what exactly you want to do. Best read its documentation and see if it does what you need.
The performance difference will most likely be negligible. Only in extreme cases there might me an noticeable impact, like when copying these pointers many million times each second.

I prefer shared_ptr, auto_ptr can cause a lot of trouble and its use is not too intuitive. If you expect this object to be inserted on a STL container, so you certainly want to use shared_ptr.
Abot the performance you have a cost, but this is minimal and you can ignore it most of the time.

Depends.
Shared pointers have a better use than auto_ptr which have the unusual characteristic of changing ownership on assignments.
Also auto_ptr can not be used in containers.
Also you can not use auto_ptr as return values if you do not want to transfer ownership.
Shared pointers have all the benefits of smart pointers, have overloaded the relevant operators to act like a pointer and can be used in containers.
Having said that, they are not cheap to use.
You must analyze your needs to decide if you actually gain something by avoiding the shared_pointer implementation overheads

Use only shared_ptr. With auto_ptr you can have ONLY ONE reference to your object. Also auto_ptr isn't slower it must work faster than shared_ptr.
To not ask such questions you need to know, how this smart pointers work.
auto_ptr just storing pointer to your object and destroying it in it's destructor.
The problem of auto_ptr is that when you are trying to copy it it's stopping to point to your object.
For example
auto_ptr a_ptr(new someClass);
auto_ptr another_ptr=aptr;// after this another_ptr is pointing to your class, but a_ptr isn't pointing to it anymore!
This is why I don't recomend you to use auto_ptr.
Shared pointer counting how much smart pointers are pointing to your object and destroying your object when there is no more pointers to it. That's why you can have more than 1 pointer pointing to your object.
But Shared pointer also isn't perfect. And if in your program you have cyclic graph (when you have classes A and B and A have a member shared_ptr wich points to B and B have or B's member objects have shared_ptr pointing to A) than A and B will never deleted and you will have memory lick.
To write correct code with shared_ptr you need to be careful and also use weak_ptr.
For more information look at here http://www.boost.org/doc/libs/1_45_0/libs/smart_ptr/smart_ptr.htm

Should the "this" pointer and smart pointers be mixed?

How should I avoid using the "this" pointer in conjunction with smart pointers? Are there any design patterns/general suggestions on working around this?
I'm assuming combining the two is a no-no since either:
you're passing around a native pointer to a smart pointer-managed object which defeats the point of using the smart pointers in the first place,
if you wrap the "this" pointer in a smart pointer at use, e.g. "return CSmartPtr(this);", you've effectively set up multiple smart pointers managing the same object so the first one to have a reference count of zero will destroy the object from under the other, or
if you have a member variable holding the value of CSmartPtr(this) to return in these cases, it'll ultimately be a circular reference that results in the reference count always being one.
To give a bit of context, I recently learned about the negative implications of combining STL containers with objects (repeated shallow copying, slicing when using containers of a base class, etc), so I'm replacing some usage of these in my code with smart pointers to the objects. A few objects pass around references to themselves using the "this" pointer, which is where I'm stuck.
I've found smart pointers + “this” considered harmful? asked on a somewhat similar problem, but the answer isn't useful as I'm not using Boost.
Edit: A (very contrived) example of what I'd been doing would be
...::AddToProcessingList(vector<CSmartPtr> &vecPtrs)
{
vecPtrs.push_back(CSmartPtr(this));
}

Combining the two can be done, but you always need to keep clear in your mind about the ownership issues. Generally, the rule I follow is to never convert a raw pointer to a smart pointer (with ownership), unless you are sure that you are taking ownership of the object at that point. Times when this is safe to do should be obvious, but include things like:
you just created the object (via new)
you were passed the object from some external method call where the semantic is obviously one of ownership (such as an add to a container class)
the object is being passed to another thread
As long as you follow the rule, and you don't have any ambiguous ownership situations, then there should be no arising issues.
In your examples above, I might look on them as follows:
you're passing around a native pointer to a smart pointer-managed object which defeats the point of using the smart pointers in the first place
In this case, since you are passing around the native pointer, you can assume by my rule that you are not transferring ownership so you can not convert it to a smart pointer
if you wrap the "this" pointer in a smart pointer at use, e.g. "return CSmartPtr(this);", you've effectively set up multiple smart pointers managing the same object so the first one to have a reference count of zero will destroy the object from under the other
This is obviously illegal, since you have said that the object is already owned by some other smart pointer.
if you have a member variable holding the value of CSmartPtr(this) to return in these cases, it'll ultimately be a circular reference that results in the reference count always being one.
This can actually be managed, if some external code implicitly owns the member variable - this code can call some kind of close() method at some point prior to the object being released. Obviously on reflection, that external code owns the object so it should really have a smart pointer itself.
The boost library (which I accept you have said you are not using) makes these kind of issues easier to manage, because it divides up the smart pointer library along the different types of ownership (scoped, shared, weak and so on).

Most smart pointer frameworks provide a means around this. For instance, Boost.SmartPtr provides an enable_shared_from_this<T> CRTP class which you use as a base class, and then you can make sure your shared pointer doesn't result in two pointers to the same object.

One fairly robust solution to this problem is the use of intrusive smart pointers. To instantiate RefCountedPtr<T>, T should derive from RefCount. This allows construction of RefCountedPtr<T> from this, because this->RefCount::m_count holds the single counter that determines the lifetime.
Downside: you have an unused RefCount when you put the object on the stack.

best practice when returning smart pointers

What is the best practice when returning a smart pointer, for example a boost::shared_ptr? Should I by standard return the smart pointer, or the underlying raw pointer? I come from C# so I tend to always return smart pointers, because it feels right. Like this (skipping const-correctness for shorter code):
class X
{
public:
boost::shared_ptr<Y> getInternal() {return m_internal;}
private:
boost::shared_ptr<Y> m_internal;
}
However I've seen some experienced coders returning the raw pointer, and putting the raw pointers in vectors. What is the right way to do it?

There is no "right" way. It really depends on the context.
You can internally handle memory with a smart pointer and externally give references or raw pointers. After all, the user of your interface doesn't need to know how you manage memory internally. In a synchronous context this is safe and efficient. In an asynchronous context, there are many pitfalls.
If you're unsure about what to do you can safely return smart pointers to your caller. The object will be deallocated when the references count reaches zero. Just make sure that you don't have a class that keeps smart pointers of objects for ever thus preventing the deallocation when needed.
As a last note, in C++ don't overuse dynamically allocated objects. There are many cases where you don't need a pointer and can work on references and const references. That's safer and reduces the pressure on the memory allocator.

It depends on what the meaning of the pointer is.
When returning a shared_pointer, you are syntactically saying "You will share ownership of this object", such that, if the the original container object dies before you release your pointer, that object will still exist.
Returning a raw pointer says: "You know about this object, but don't own it". It's a way of passing control, but not keeping the lifetime tied to the original owner.
(in some older c-programs, it means "It's now your problem to delete me", but I'd heavily recommend avoiding this one)
Typically, defaulting to shared saves me a lot of hassle, but it depends on your design.

I follow the following guidelines for passing pointers arguments to functions and returning pointers:
boost::shared_ptr
API and client are sharing ownership of this object. However you have to be careful to avoid circular references with shared_ptr, if the objects represent some kind of graph. I try to limit my use of shared_ptr for this reason.
boost::weak_ptr / raw pointer
API owns this object, you are allowed share it while it is valid. If there is a chance the client will live longer than the api I use a weak_ptr.
std::auto_ptr
API is creating an object but the client owns the object. This ensures that the returning code is exception safe, and clearly states that ownership is being transferred.
boost::scoped_ptr
For pointers to objects stored on the stack or as class member variables. I try to use scoped_ptr first.
Like all guidelines there will be times when the rules conflict or have to be bent, then I try to use intelligence.

I typically return "owning"/"unique" smart pointers from factories or similar to make it clear who is responsible for cleaning up.
This example https://ideone.com/qJnzva shows how to return a std::unique_ptr that will be deleted when the scope of the variable that the caller assigns the value to goes out of scope.
While it's true that the smart pointer deletes its own pointer, the lifetime of the variable holding the smart pointer is 100% controlled by the caller, so the caller decides when the pointer is deleted. However, since it's a "unique" and "owning" smart pointer, no other client can control the lifetime.

I would never return a raw pointer, instead I would return a weak_ptr to tell the user of the pointer that he doesn't have the control over the resource.
If you return a weak_ptr its very unlikely that there will be dangling pointers in the application.
If there is a performance problem I would return a reference to the object and a hasValidXObject method.

In my opinion, in C++, you should always have to justify the use of an unguarded pointer.
There could be many valid reasons: a need for very high performance, for very low memory usage, for dealing with legacy libraries, because of some issue with the underlying data structure the pointer is storing. But [dynamically allocated] pointers are somewhat 'evil', in that you have to deallocate the memory at every possible execution path and you will almost certainly forget one.

I wouldn't put raw pointers in vectors.
In case they use auto_ptr or boost::scoped_ptr, they can't use (or return) anything but raw pointers. That could explain their way of coding, i guess.

depends on your goals.
blindly returning smart ptr to internal data might not be a good idea (which is very sensitive to the task you're trying to solve) - you might be better off just offering some doX() and doY() that use the pointer internally instead.
on the other hand, if returning the smart ptr, you should also consider that you'll create no mutual circular references when objects end up unable to destroy each other (weak_ptr might be a better option in that case).
otherwise, like already mentioned above, performance/legacy code/lifetime considerations should all be taken into account.

const boost::shared_ptr &getInternal() {return m_internal;}
This avoids a copy.
Sometimes you'll like to return a reference, for example:
Y &operator*() { return *m_internal; }
const Y &operator*() const { return *m_internal; }
This is good too only if the reference will be used and discarded inmediately.
The same goes for a raw pointer.
Returning a weak_ptr is also an option.
The 4 are good depending on the goals. This question requires a more extensive discussion.

smart pointers + "this" considered harmful?

In a C++ project that uses smart pointers, such as boost::shared_ptr, what is a good design philosophy regarding use of "this"?
Consider that:
It's dangerous to store the raw pointer contained in any smart pointer for later use. You've given up control of object deletion and trust the smart pointer to do it at the right time.
Non-static class members intrinsically use a this pointer. It's a raw pointer and that can't be changed.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to make a shared pointer to my class.
Given that, when is it ever appropriate for me to explicitly use a this pointer? Are there design paradigms that can prevent bugs related to this?

Wrong question
In a C++ project that uses smart pointers
The issue has nothing to do with smart pointers actually. It is only about ownership.
Smart pointers are just tools
They change nothing WRT the concept of ownership, esp. the need to have well-defined ownership in your program, the fact that ownership can be voluntarily transferred, but cannot be taken by a client.
You must understand that smart pointers (also locks and other RAII objects) represent a value and a relationship WRT this value at the same time. A shared_ptr is a reference to an object and establishes a relationship: the object must not be destroyed before this shared_ptr, and when this shared_ptr is destroyed, if it is the last one aliasing this object, the object must be destroyed immediately. (unique_ptr can be viewed as a special case of shared_ptr where there is zero aliasing by definition, so the unique_ptr is always the last one aliasing an object.)
Why you should use smart pointers
It is recommended to use smart pointers because they express a lot with only variables and functions declarations.
Smart pointers can only express a well-defined design, they don't take away the need to define ownership. In contrast, garbage collection takes away the need to define who is responsible for memory deallocation. (But do not take away the need to define who is responsible for other resources clean-up.)
Even in non-purely functional garbage collected languages, you need to make ownership clear: you don't want to overwrite the value of an object if other components still need the old value. This is notably true in Java, where the concept of ownership of mutable data structure is extremely important in threaded programs.
What about raw pointers?
The use of a raw pointer does not mean there is no ownership. It's just not described by a variable declaration. It can be described in comments, in your design documents, etc.
That's why many C++ programmers consider that using raw pointers instead of the adequate smart pointer is inferior: because it's less expressive (I have avoided the terms "good" and "bad" on purpose). I believe the Linux kernel would be more readable with a few C++ objects to express relationships.
You can implement a specific design with or without smart pointers. The implementation that uses smart pointer appropriately will be considered superior by many C++ programmers.
Your real question
In a C++ project, what is a good design philosophy regarding use of "this"?
That's awfully vague.
It's dangerous to store the raw pointer for later use.
Why do you need to a pointer for later use?
You've given up control of object deletion and trust the responsible component to do it at the right time.
Indeed, some component is responsible for the lifetime of the variable. You cannot take the responsibility: it has to be transferred.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to use my class.
Obviously, since the caller is not informed that the function will hide a pointer and use it later without the control of the caller, you are creating bugs.
The solution is obviously to either:
transfer responsibility to handle the lifetime of the object to the function
ensure that the pointer is only saved and used under the control of the caller
Only in the first case, you might end up with a smart pointer in the class implementation.
The source of your problem
I think that your problem is that you are trying hard to complicate matters using smart pointers. Smart pointers are tools to make things easier, not harder. If smart pointers complicate your specification, then rethink your spec in term of simpler things.
Don't try to introduce smart pointers as a solution before you have a problem.
Only introduce smart pointers to solve a specific well-defined problem. Because you don't describe a specific well-defined problem, it is not possible to discuss a specific solution (involving smart pointers or not).

While i don't have a general answer or some idiom, there is boost::enable_shared_from_this . It allows you to get a shared_ptr managing an object that is already managed by shared_ptr. Since in a member function you have no reference to those managing shared_ptr's, enable_shared_ptr does allow you to get a shared_ptr instance and pass that when you need to pass the this pointer.
But this won't solve the issue of passing this from within the constructor, since at that time, no shared_ptr is managing your object yet.

One example of correct use is return *this; in functions like operator++() and operator<<().

When you are using a smart pointer class, you are right that is dangerous to directly expose "this". There are some pointer classes related to boost::shared_ptr<T> that may be of use:
boost::enable_shared_from_this<T>
Provides the ability to have an object return a shared pointer to itself that uses the same reference counting data as an existing shared pointer to the object
boost::weak_ptr<T>
Works hand-in-hand with shared pointers, but do not hold a reference to the object. If all the shared pointers go away and the object is released, a weak pointer will be able to tell that the object no longer exists and will return you NULL instead of a pointer to invalid memory. You can use weak pointers to get shared pointers to a valid reference-counted object.
Neither of these is foolproof, of course, but they'll at least make your code more stable and secure while providing appropriate access and reference counting for your objects.

If you need to use this, just use it explicitly. Smart pointers wrap only pointers of the objects they own - either exclusivelly (unique_ptr) or in a shared manner (shared_ptr).

I personally like to use the this pointer when accessing member variables of the class. For example:
void foo::bar ()
{
this->some_var += 7;
}
It's just a harmless question of style. Some people like it, somepeople don't.
But using the this pointer for any other thing is likely to cause problems. If you really need to do fancy things with it, you should really reconsider your design. I once saw some code that, in the constructor of a class, it assigned the this pointer to another pointer stored somewhere else! That's just crazy, and I can't ever think of a reason to do that. The whole code was a huge mess, by the way.
Can you tell us what exactly do you want to do with the pointer?

Another option is using intrusive smart pointers, and taking care of reference counting within the object itself, not the pointers. This requires a bit more work, but is actually more efficient and easy to control.

Another reason to pass around this is if you want to keep a central registry of all of the objects. In the constructor, an object calls a static method of the registry with this. Its useful for various publish/subscribe mechanisms, or when you don't want the registry to need knowledge of what objects/classes are in the system.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js