Smart pointers - cases where they cannot replace raw pointers

Smart pointers - cases where they cannot replace raw pointers - c++

HI,
I have this query about smart pointers.
I heard from one of my friends that smart pointers can almost always replace raw pointers.
but when i asked him what are the other cases where smart pointers cannot replace the raw pointers,i did not get the answer from him.
could anybody please tell me when and where they cannot replace raw pointers?

Passing pointers to legacy APIs.
Back-references in a reference-counted tree structure (or any cyclic situation, for that matter). This one is debatable, since you could use weak-refs.
Iterating over an array.
There are also many cases where you could use smart pointers but may not want to, e.g.:
Some small programs are designed to leak everything, because it just isn't worth the added complexity of figuring out how to clean up after yourself.
Fine-grained batch algorithms such as parsers might allocate from a pre-allocated memory pool, and then just blow away the whole pool on completion. Having smart pointers into such a pool is usually pointless.

An API that is going to be called from C, would be an obvious example.

Depends on the smart pointer you use. std::auto_ptr is not compatible with STL containers.

It's a matter of semantics:
smart pointer: you own (at least partly) the memory being pointed to, and as such are responsible for releasing it
regular pointer: you are being given a handle to an object... or not (NULL)
For example:
class FooContainer
{
public:
typedef std::vector<Foo> foos_t;
foos_t::const_iterator fooById(int id) const; // natural right ?
};
But you expose some implementation detail here, you could perfectly create your own iterator class... but iterator usually means incrementable etc... or use a pointer
class FooContainer
{
public:
const Foo* fooById(int id) const;
};
Possibly it will return NULL, which indicates a failure, or it will return a pointer to an object, for which you don't have to handle the memory.
Of course, you could also use a weak_ptr here (you get the expired method), however that would require using shared_ptr in the first place and you might not use them in your implementation.

interaction with legacy code. if the api needs a raw pointer you need to provide a raw pointer even if once its in your code you wrap it in a smart pointer.

If you have a situation where a raw pointer is cast to an intptr_t and back for some reason, it cannot be replaced by a smart pointer because the casting operation would lose any reference counting information contained in the smart pointer.

It would be quite hard to implement smart pointers if at some point you don't use plain pointers.
I suppose it would also be harder to implement certain data structures with smart pointers. E.g freeing the memory of a regular linked list is quite trivial, but it would take some thought to figure out the combination of owning and non-owning smart pointers to get the same result.

Related

Smart pointer concepts ownership and lifetime

There are two concepts (ownership, lifetime) that are important when using C++ smart pointers (unique, shared, weak). I try to understand those concepts and how they influence smart pointer (or raw pointer) usage.
I read two rules:
Always use smart pointers to manage ownership/lifetime of dynamic
objects.
Don't use smart pointers when not managing ownership/lifetime.
An example:
class Object
{
public:
Object* child(int i) { return mChildren[i]; }
// More search and access functions returning pointers here
private:
vector<Object*> mChildren;
};
I want to rewrite this using smart pointers. Lets ignore child() first. Easy game. A parent owns its children. So make mChildren a vector of unique_ptr.
According to the above rules, some people argue child(i) should continue returning a raw pointer.
But isn't this risky? Someone could do stupid things like deleting the returned object getting a hard to debug crash... which could be avoided using a weak_ptr or a shared_ptr as a return value.
Can't one say that copying a pointer always means to temporarily share the ownership and/or to assert the lifetime of the object?
Is it worth using smart pointers for children only if I do not get a safer API as well?

You could return a const std::unique_ptr<Object>& which would allow you to have same semantics of a raw pointer to call methods on it while preventing deletion.
Using std::unique_ptr with raw pointer makes sense when you know that the ownership will survive any raw pointer and you are sure that people won't try to delete the pointer directly. So that's different from using a std::weak_ptr and std::shared_ptr because they won't allow you to use dangling pointers at all.
There's always room to make something wrong, so the answer really depends on the specific situation, where this code is going to be used and such.

Writing more general pointer code

Assume that I want to write function that takes in a pointer. However I want to allow caller to use naked pointers or smart pointers - whatever they prefer. This should be good because my code should rely on pointer semantics, not how pointers are actually implemented. This is one way to do this:
template<typename MyPtr>
void doSomething(MyPtr p)
{
//store pointer for later use
this->var1 = p;
//do something here
}
Above will use duck typing and one can pass naked pointers or smart pointers. The problem occurs when passed value is base pointer and we need to see if we can cast to derived type.
template<typename BasePtr, typename DerivedPtr>
void doSomething(BasePtr b)
{
auto d = dynamic_cast<DerivedPtr>(b);
if (d) {
this->var1 = d;
//do some more things here
}
}
Above code will work for raw pointers but won't work for the smart pointers because I need to use dynamic_pointer_cast instead of dynamic_cast.
One solution to above problem is that I add new utility method, something like, universal_dynamic_cast that works both on raw pointers and smart pointers by selecting overloaded version using std::enable_if.
The questions I have are,
Is there a value in adding all these complexities so code supports raw as well as smart pointers? Or should we just use shared_ptr in our library public APIs? I know this depends on purpose of library, but what is the general feeling about using shared_ptr all over API signatures? Assume that we only have to support C++11.
Why doesn't STL has built-in pointer casts that are agnostic of whether you pass raw pointers or smart pointers? Is this intentional from STL designers or just oversight?
One other problem in above approach is loss of intellisense and bit of readability. This is the problem obviously in all duck typed code. In C++, however, we have a choice. I could have easily strongly typed my argument above like shared_ptr<MyBase> which would sacrifice flexibility for callers to pass whatever wrapped in whatever pointer but reader of my code would be more confident and can build better model on on what should be coming in. In C++ public library APIs, are there general preferences/advantages one way or another?
There is one more approach I have seen in other SO answer where the author proposed that you should just use template<typename T> and let caller decide if T is some pointer type or reference or class. This super generic approach obviously don't work if I have to call something in T because C++ requires dereferencing pointer types which means I have to probably create utility method like universal_deref using std::enable_if that applies * operator to pointer types but does nothing for plain objects. I wonder if there are any design patterns that allows this super generic approach more easily. Again, above all, is it worth going all these troubles or just keep thing simple and use shared_ptr everywhere?

To store a shared_ptr within a class has a semantic meaning. It means that the class is now claiming ownership of that object: the responsibility for its destruction. In the case of shared_ptr, you are potentially sharing that responsibility with other code.
To store a naked T*... well, that has no clear meaning. The Core C++ Guidelines tell us that naked pointers should not be used to represent object ownership, but other people do different things.
Under the core guidelines, what you are talking about is a function that may or may not claim ownership of an object, based on how the user calls it. I would say that you have a very confused interface. Ownership semantics are usually part of the fundamental structure of code. A function either takes ownership or it does not; it's not something that gets determined based on where it gets called.
However, there are times (typically for optimization reasons) where you might need this. Where you might have an object that in one instance is given ownership of memory and in another instance is not. This typically crops up with strings, where some users will allocate a string that you should clean up, and other users will get the string from static data (like a literal), so you don't clean it up.
In those cases, I would say that you should develop a smart pointer type which has this specific semantics. It can be constructed from a shared_ptr<T> or a T*. Internally, it would probably use a variant<shared_ptr<T>, T*> or a similar type if you don't have access to variant.
Then you could give it its own dynamic/static/reinterpret/const_pointer_cast functions, which would forward the operation as needed, based on the status of the internal variant.
Alternatively, shared_ptr instances can be given a deleter object that does nothing. So if your interface just uses shared_ptr, the user can choose to pass an object that it technically does not truly own.

The usual solution is
template<typename T>
void doSomething(T& p)
{
//store reference for later use
this->var1 = &p;
}
This decouples the type I use internally from the representation used by the caller. Yes, there's a lifetime issue, but that's unavoidable. I cannot enforce a lifetime policy on my caller and at the same time accept any pointer. If I want to ensure the object stays alive, I must change the interface to std::shared_ptr<T>.

I think the solution you want is to force callers of your function to pass a regular pointer rather than using a template function. Using shared_ptrs is a good practice, but provides no benefit in passing along the stack, since the object is already held in a shared pointer by the caller of your function, guaranteeing it does not get destroyed, and your function isn't really "holding on" to the object. Use shared_ptrs when storing as a member (or when instantiating the object that will become stored in a member), but not when passing as an argument. It should be a simple matter for the caller to get a raw pointer from the shared_ptr anyway.

The purpose of smart pointers
The purpose of smart pointers is to manage memory resources. When you have a smart pointer, then you usually claim unique or shared ownership. On the other hand, raw pointers just point to some memory that is managed by someone else. Having a raw pointer as a function parameter basically tells the caller of the function that the function is not caring about the memory management. It can be stack memory or heap memory. It does not matter. It only needs to outlive the lifetime of the function call.
Semantics of pointer parameters
When passing a unique_ptr to a function (by value), then your passing the responsibility to clean up memory to that function. When passing a shared_ptr or weak_ptr to a function, then that's saying "I'll possibly share memory ownership with that function or object it belongs to". That's quite different from passing a raw pointer, which implicitly mean "Here's a pointer. You can access it until you return (unless specified otherwise)".
Conclusion
If you have a function, then you usually know which kind of ownership semantics you have and 98% of the time you don't care about ownership and should just stick to raw pointers or even just references, if you know that the pointer you're passing is not a nullptr anyways. Callers that have smart pointers can use the p.get() member function or &*p, if they want to be more terse. Therefore, I would not recommend to template code to tackle your problem, since raw pointers give the caller all the flexibility you can get. Avoiding templates also allows you to put your implementation into an implementation file (and not into a header file).
To answer your concrete questions:
I don't see much value in adding this complexity. To the contrary: It complicates your code unnecessarily.
There is hardly any need for this. Even if you use std::dynamic_pointer_cast in the such, it is to maintain ownership in some way. However, adequate uses of this are rare, because most of the time just using dynamic_cast<U*>(ptr.get()) is all you need. That way you avoid the overhead of shared ownership management.
My preference would be: Use raw pointers. You get all the flexibility, intellisense and so forth and you will live happily ever after.
I would rather call this an antipattern - a pattern that should not be used. If you want to be generic, then use raw pointers (if they are nullable) or references, if the pointer parameter would never be a nullptr. This gives the caller all the flexibility while keeping the interface clean and simple.
Further reading: Herb Sutter talked about smart pointers as function parameters in his Guru of the Week #91. He explains the topic in depth there. Especially point 3 might be interesting to you.

After reviewing some more material, I've finally decided to use plain old raw pointers in my public interface. Here is the reasoning:
We shouldn't be designing interface to accommodate bad design decisions of others. The mantra of "avoid raw pointers like a plague and replace them with smart pointers everywhere" is just bad advice (also se Shutter's GoTW). Trying to support those bad decisions spreads them in to your own code.
Raw pointers explicitly sets up contract with callers that they are the one who need to worry about lifetime of inputs.
Raw pointers gives the maximum flexibility to callers who have shared_ptr, unique_ptr or just raw pointers.
Code now looks much more readable, intuitive and reasonable unlike those duck typed templates taking over everywhere.
I get my strong typing back along with intellisense and better compile time checks.
Casting up and down hierarchy is a breeze and don't have to worry about perf implications where new instance of smart pointer may get created at each cast.
While passing pointers around internally, I don't have to carefully care if the pointer would be shared_ptr or raw pointer.
Although I don't care about it, there is better pathway to support older compilers.
In short, trying to accommodate potential clients who have taken up on guidelines of never using raw pointers and replace them with smart pointers everywhere causes polluting my code with unnecessary complexity. So keep simple things simple and just use raw pointers unless you explicitly want ownership.

Implementing Containers using Smart Pointers

Ok, so everyone knows that raw pointers should be avoided like the plague and to prefer smart pointers, but does this advice apply when implementing a container? This is what I am trying to accomplish:
template<typename T> class AVLTreeNode {
public:
T data;
unique_ptr<AVLTreeNode<T>> left, right;
int height;
}
Unique_ptr can make container functions more cumbersome to write because I can't have multiple raw pointers temporarily pointing to the same object in a way that is elegant. For example:
unique_ptr<AVLTreeNode<T>> rotate_right(unique_ptr<AVLTreeNode<T>> n1)
{
unique_ptr<AVLTreeNode<T>> n2 = n1->left;
n1->left = n2->right;
n2->right = n1;
// n1 must now be referenced through the longer name n2->right from now on
n2->right->recalculate_height();
n2->recalculate_height();
return n2;
}
(It's not a big deal in this example but I can imagine how it could become a problem). Should I take problems like these as a strong hint that containers should be implemented with good old new, delete, and raw pointers? It seems like awfully a lot of trouble just to avoid writing a destructor.

I do not usually use smart pointers when implementing containers as you show. Raw pointers (imho) are not to be avoided like the plague. Use a smart pointer when you want to enforce memory ownership. But typically in a container, the container owns the memory pointed to by the pointers making up the data structure.
If in your design, an AVLTreeNode uniquely owns its left and right children and you want to express that with unique_ptr, that's fine. But if you would prefer that AVLTree owns all AVLTreeNodes, and does so with raw pointers, that is just as valid (and is the way I usually code it).
Trust me, I'm not anti-smart-pointer. I am the one who invented unique_ptr. But unique_ptr is just another tool in the tool box. Having good smart pointers in the tool box is not a cure-all, and using them blindly for everything is not a substitute for careful design.
Update to respond to comment (comment box was too small):
I use raw pointers a lot (which are rarely owning). A good sampling of my coding style exists in the open source project libc++. One can browse the source under the "Browse SVN" link.
I prefer that every allocation of a resource be deallocate-able in a destructor somewhere, because of exception safety concerns, even if the usual deallocation happens outside of a destructor. When the allocation is owned by a single pointer, a smart pointer is typically the most convenient tool in the tool box. When the allocation is owned by something larger than a pointer (e.g. a container, or a class Employee), raw pointers are often a convenient part of the data structure composing the larger object.
The most important thing is that I never allocate any resource without knowing what object owns that resource, be it smart pointer, container, or whatever.

The code you presented compiles with no problems
#include <memory>
template<typename T> class AVLTreeNode {
public:
T data;
std::unique_ptr<AVLTreeNode<T>> left, right;
int height;
};
int main()
{
AVLTreeNode<int> node;
}
test compilation: https://ideone.com/aUAHs
Personally, I've been using smart pointers for trees even when the only thing we had was std::auto_ptr
As for rotate_right, it could be implemented with a couple calls to unique_ptr::swap

Small correction: raw pointers should not be avoided like the plague (oops, not everybody knew the fact), but manual memory management should be avoided when possible (by using containers instead of dynamic array or smartpointers), so in your function, just do a get() on your unique_ptr for temporary storage.

std::shared_ptr does not have these restrictions. Especially, multiple shared_ptr-instances can reference the same object.

Herb Shutter has very clear guideline about not using shared_ptr as parameters in his GoTW series:
Guideline: Don’t pass a smart pointer as a function parameter unless
you want to use or manipulate the smart pointer itself, such as to
share or transfer ownership.
and this...
Guideline: Prefer passing objects by value, *, or &, not by smart
pointer.

Accelerated C++: Can I substitute raw pointers for smart pointers?

I love this book, sadly it does not cover smart pointers as they were not part of the standard back then. So when reading the book can I fairly substitute every mentioned pointer by a smart pointer, respectively reference?

"Smart Pointer" is a bit of a misnomer. The "smart" part is that they will do some things for you, whether or not you need, want, or even understand what those things are. And that's really important. Because sometimes you'll want to go to the store, and smart pointers will drive you to church. Smart pointers solve some very specific problems. Many would argue that if you think you need smart pointers, then you're probably solving the wrong problem. I personally try not to take sides. Instead, I use a toolbox metaphor - you need to really understand the problem you're solving, and the tools that you have at your disposal. Only then can you remotely expect to select the right tool for the job. Best of luck, and keep questioning!

Well, there are different kinds of smart pointers. For example:
You could create a scoped_ptr class, which would be useful when you're allocating for a task within a block of code, and you want the resource to be freed automatically when it runs of of scope.
Something like:
template <typename T>
class scoped_ptr
{
public:
scoped_ptr(T* p = 0) : mPtr(p) {}
~scoped_ptr() { delete mPtr; }
//...
};
Additionally you could create a shared_ptr who acts the same but keeps a ref count. Once the ref count reach 0 you deallocate.
shared_ptr would be useful for pointers stored in STL containers and the like.
So yes, you could use smart pointers for most of the purposes of your program.
But think judiciously about what kind of smart pointer you need and why.
Do not simply "find and replace" all the pointers you come across.

No.
Pointers which represent object ownership should be replaced by smart pointers.
Other pointers should be replaced by iterators (which in the simplest case is just a typedef for a raw pointer, but no one would think they need to delete).
And of course, the implementation code for smart pointers and iterators will continue to need raw pointers.

Once you've adopted boost's smart pointers, is there any case where you use raw pointers?

I'm curious as I begin to adopt more of the boost idioms and what appears to be best practices I wonder at what point does my c++ even remotely look like the c++ of yesteryear, often found in typical examples and in the minds of those who've not been introduced to "Modern C++"?

I don't use shared_ptr almost at all, because I avoid shared ownership in general. Therefore, I use something like boost::scoped_ptr to "own" an object, but all other references to it will be raw pointers. Example:
boost::scoped_ptr<SomeType> my_object(new SomeType);
some_function(my_object.get());
But some_function will deal with a raw pointer:
void some_function(SomeType* some_obj)
{
assert (some_obj);
some_obj->whatever();
}

Just a few off the top of my head:
Navigating around in memory-mapped files.
Windows API calls where you have to over-allocate (like a LPBITMAPINFOHEADER).
Any code where you're munging around in arbitrary memory (VirtualQuery() and the like).
Just about any time you're using reinterpret_cast<> on a pointer.
Any time you use placement-new.
The common thread here is "any situation in which you need to treat a piece of memory as something other than a resource over which you have allocation control".

These days I've pretty much abandoned all use of raw pointers. I've even started looking through our code base for places where raw pointers were used and switched them to a smart pointer variant. It's amazing how much code I've been able to delete by doing this simple act. There is so much code wasted on lifetime management of raw C++ pointers.
The only places where I don't use pointers is for a couple of interop scenarios with other code bases I don't have control over.

I find the primary difference between 'modern' C++ and the old* stuff is careful use of class invariants and encapsulation. Well organised code tends naturally to have fewer pointers flying around. I'm almost as nervous swimming in shared_ptrs as I would be in news and deletes.
I'm looking forward to unique_ptr in C++0x. I think that will tidy away the few (smart) pointers that do still roam the wild.
*still unfortunately very common

Certainly any time you're dealing with a legacy library or API you'll need to pass a raw pointer, although you'll probably just extract it from your smart pointer temporarily.
In fact it is always safe to pass a raw pointer to a function, as long as the function does not try to keep a copy of the pointer in a global or member variable, or try to delete it. With these restrictions in place, the function cannot affect the lifetime of the object, and the only reason for a smart pointer is to manage the object lifetime.

I still use regular pointers in resource-sensitive code or other code that needs tiny footprint, such as certain exceptions, where I cannot assume that any data is valid and must also assume that I am running out of memory too.
Managed memory is almost always superior to raw otherwise, because it means that you don't have to deal with deleting it at the right place, but still have great control over the construction and destruction points of your pointers.
Oh, and there's one other place to use raw pointers:
boost::shared_ptr<int> ptr(new int);

I still use raw pointers on devices that have memory mapped IO, such as embedded systems, where having a smart pointer doesn't really make sense because you will never need or be able to delete it.

If you have circular data structures, e.g., A points to B and B points back to A, you can't use naively use smart pointers for both A and B, since then the objects will only be freed extra work. To free the memory, you have to manually clear the smart pointers, which is about as bad as the delete the smart pointers get rid of.
You might thing this doesn't happen very often, but suppose you have Parent object that has smart pointers to a bunch of Child objects. Somewhere along the way someone needs to look up a the Parent for a Child, so they add a smart pointer member to Child that points back to the parent. Silently, memory is no longer freed.
Some care is required. Smart pointers are not equivalent to garbage collection.

I'm writing C++ that has to co-exist with Objective C (using Objective C++ to bridge).
Because C++ objects declared as part of Objective C++ classes don't have constructors or destructors called you can't really hold them there in smart pointers.
So I tend to use raw pointers, although often with boost::intrustive_ptr and an internal ref count.

Not that I would do it, but you need raw pointers to implement, say, a linked list or a graph. But it would be much smarter to use std::list<> or boost::graph<>.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js