Smart pointer concepts ownership and lifetime


There are two concepts (ownership, lifetime) that are important when using C++ smart pointers (unique, shared, weak). I am trying to understand those concepts and how they influence smart pointer (or raw pointer) usage.
I read two rules:
Always use smart pointers to manage ownership/lifetime of dynamic
objects.
Don't use smart pointers when not managing ownership/lifetime.
An example:
class Object
{
public:
Object* child(int i) { return mChildren[i]; }
// More search and access functions returning pointers here
private:
vector<Object*> mChildren;
};
I want to rewrite this using smart pointers. Let's ignore child() first. Easy game: a parent owns its children, so make mChildren a vector of unique_ptr.
According to the above rules, some people argue child(i) should continue returning a raw pointer.
But isn't this risky? Someone could do something stupid like deleting the returned object, causing a hard-to-debug crash... which could be avoided by using a weak_ptr or a shared_ptr as the return value.
Can't one say that copying a pointer always means to temporarily share the ownership and/or to assert the lifetime of the object?
Is it worth using smart pointers for children only if I do not get a safer API as well?

You could return a const std::unique_ptr<Object>& which would allow you to have same semantics of a raw pointer to call methods on it while preventing deletion.
Using std::unique_ptr together with raw pointers makes sense when you know that the owner will outlive any raw pointer and you are sure that people won't try to delete the pointer directly. That's different from using a std::weak_ptr and std::shared_ptr, because those won't allow you to use dangling pointers at all.
There's always room to make something wrong, so the answer really depends on the specific situation, where this code is going to be used and such.
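
As a minimal sketch of the "unique_ptr inside, raw pointer outside" reading of the rules in the question: the parent owns its children through std::unique_ptr, and child() hands out a non-owning raw pointer. The names addChild and childCount and the out-of-range check are assumptions added for illustration; they are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

class Object
{
public:
    // Takes ownership of the child and returns a non-owning pointer to it.
    Object* addChild(std::unique_ptr<Object> child)
    {
        assert(child != nullptr);
        mChildren.push_back(std::move(child));
        return mChildren.back().get();
    }

    // Non-owning access: the caller must not delete the result and must not
    // use it after this parent is destroyed. Returns nullptr if i is out of range.
    Object* child(std::size_t i) const
    {
        return i < mChildren.size() ? mChildren[i].get() : nullptr;
    }

    std::size_t childCount() const { return mChildren.size(); }

private:
    std::vector<std::unique_ptr<Object>> mChildren; // the parent owns its children
};
```

Nothing here stops a caller from deleting the returned pointer; the protection is the convention that a raw pointer never owns.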

Related

How to show a C++ class doesn't own a pointer member?

I have a class with a pointer member, something like:
class MyClass
{
public:
void setPointer(Pointer* ptr){_pointer = ptr;}
private:
Pointer* _pointer{nullptr};
};
MyClass doesn't own the memory to _pointer. It just has a pointer to invoke methods it requires.
I started to write ~MyClass() and fortunately realised it shouldn't delete _pointer because it doesn't own it.
What is the best way to show MyClass doesn't have ownership of that pointer?
EDIT:
Should I use a unique_ptr in the owner class and a shared_ptr in MyClass, or should they both use shared_ptr?

Future
At the moment, std::observer_ptr is in discussion to express exactly those semantics: non-owning, passive pointers. See also this thread for its purpose.
Present
In idiomatic C++ (since 2011 at the latest), it's quite uncommon to still have raw pointers that own the object they're pointing to. The reason is that new/delete come with a lot of pitfalls, while there are better alternatives such as std::unique_ptr and std::shared_ptr. Those not only bring the semantics needed, but also express the intent in code (unique or shared ownership, respectively).
Eventually it depends on code style of your environment, but I would say until a standard best-practice is established, T* is a good convention for non-owning, passive pointers.

The C++ Core Guidelines come with a support library that has an owner to mark raw pointers as owning. In that context they write:
The "raw-pointer" notation (e.g. int*) is assumed to have its most
common meaning; that is, a pointer points to an object, but does not
own it. Owners should be converted to resource handles (e.g.,
unique_ptr or vector) or marked owner.
owner<T*> // a T* that owns the object pointed/referred to; may be nullptr.
owner is used to mark owning pointers in code that cannot be upgraded
to use proper resource handles.
I read that as: A raw pointer does not need to be marked as not-owning, because nowadays raw owning pointers should be the exception. Hence, it is owning raw pointers that need to be highlighted.
Of course this only applies when you consistently avoid owning raw pointers in your code.
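
To make the owner marker concrete, here is a small sketch. It does not use the real support library: sketch::owner is a local stand-in modelled on the idea of an alias that is the same type as T*, and Widget, makeWidget and readValue are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
// Stand-in for an owner<> marker: a pure annotation, the same type as T.
template <class T, class = typename std::enable_if<std::is_pointer<T>::value>::type>
using owner = T;
}

struct Widget { int value = 0; };

// Legacy-style factory that cannot (yet) return a smart pointer:
// the owner<> marker tells the caller that it must delete the result.
sketch::owner<Widget*> makeWidget(int value)
{
    sketch::owner<Widget*> w = new Widget;
    w->value = value;
    return w;
}

// Plain T*: non-owning by convention, so this function must not delete it.
int readValue(const Widget* w)
{
    assert(w != nullptr);
    return w->value;
}
```

Because the alias changes nothing about the type, it only helps human readers and static analysis tools; the compiler enforces nothing.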

With a comment.
// A class wrapping (but not owning) a pointer.
class MyClass
{
public:
void setPointer(Pointer* ptr){_pointer = ptr;}
private:
// Not owned by the class
Pointer* _pointer{nullptr};
};
Seriously, don't be afraid to document your code with comments. That's what they're there for.

What is the best way to show MyClass doesn't have ownership of that pointer?
The lack of ownership of a resource (such as pointer to dynamic object) which you store is shown by not deallocating (deleting) the resource. Conventionally, you should never store bare identifiers (such as bare pointers) to owned resources except in a RAII wrapper (such as a smart pointer) whose sole purpose is to own and deallocate the resource.
Not taking ownership of a pointer passed to you is shown by accepting a bare resource identifier - as opposed to a RAII wrapper. This convention cannot be followed when providing a C compatible interface nor in the constructor / assignment of such RAII wrapper, which does take ownership of a bare pointer. In such non-conventional cases, the exception must be documented carefully.
The client is shown that ownership is not transferred to them by returning a bare resource identifier. This has similar exceptions with C interface as well as in the interface of the RAII wrapper, if it supports releasing ownership.
All of these can be clarified with extra documentation.
Should I use a unique_ptr in the owner class and a shared_ptr in MyClass,
No. Unique pointer means that it is the only owner of the resource. The ownership cannot be shared with other pointers. If you use a unique pointer, then you can use a bare non-owning pointer in MyClass.
or should they both use shared_ptr?
That is an option. It is safer than the unique + bare pointer, because there is no need to guarantee that the lifetime of the unique pointer is longer than the lifetime of the bare pointer. The downside is some runtime overhead.
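
The two options can be put side by side in a rough sketch. The body of Pointer, the use() method and the class name MySharingClass are assumptions made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Pointer
{
    int calls = 0;
    void invoke() { ++calls; }
};

// Option 1: non-owning observer. Whoever owns the Pointer must keep it alive
// for as long as this object uses it.
class MyClass
{
public:
    void setPointer(Pointer* ptr) { _pointer = ptr; }
    void use()
    {
        assert(_pointer != nullptr && "setPointer must be called first");
        _pointer->invoke();
    }
private:
    Pointer* _pointer{nullptr}; // not owned, never deleted here
};

// Option 2: shared ownership. The Pointer stays alive as long as any holder exists.
class MySharingClass
{
public:
    void setPointer(std::shared_ptr<Pointer> ptr) { _pointer = std::move(ptr); }
    void use() { if (_pointer) _pointer->invoke(); }
private:
    std::shared_ptr<Pointer> _pointer;
};
```

With option 1 the owner would typically hold a std::unique_ptr<Pointer> and pass owner.get() to setPointer; with option 2 every holder pays for reference counting.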

How to interpret this statement "A pointer in a function should not represent ownership"

Based on this
Stroustrup suggests that "A pointer in a function should not represent ownership"
Question> Can someone give me a slightly more detailed explanation? An illustrating example would be best.
Thank you

A pointer is "owned" by some code if that code is responsible for deleting it, or for transferring ownership to someone else. Various smart pointers implement explicit ownership models. shared_ptr represents multiple pieces of code owning a pointer. unique_ptr represents only one piece of code that owns the pointer.
What he's saying is that if a function has a naked pointer (a pointer not in a smart pointer), it should not be considered to own it. If it is to claim some ownership of this pointer, it should either have been given a smart pointer as a parameter, or it should have stored the pointer it created with new in a smart pointer.
He's saying that only smart pointers own pointers. If a function takes a naked pointer as a parameter, it does not claim ownership of that pointer. If a function returns a naked pointer, you cannot claim ownership of that pointer.
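
A short sketch of how this reads in function signatures. Job, Queue and idOf are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Job { int id; };

class Queue
{
public:
    // Taking std::unique_ptr by value: the function claims ownership.
    void submit(std::unique_ptr<Job> job)
    {
        assert(job != nullptr);
        mJobs.push_back(std::move(job));
    }

    // Returning a naked pointer: the caller may use the Job but does not own it.
    const Job* front() const { return mJobs.empty() ? nullptr : mJobs.front().get(); }

private:
    std::vector<std::unique_ptr<Job>> mJobs;
};

// Taking a naked pointer: the function only looks at the Job and never deletes it.
int idOf(const Job* job) { return job ? job->id : -1; }
```

A caller has to write q.submit(std::move(job)) to hand the object over, so the transfer of ownership is visible at the call site.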

std::unique_ptr<int> pOwner(new int(5)); // this is the owner
int* pAccess = pOwner.get(); // this is a weak accessor, it does not own anything
He's talking about the role of raw pointers in a C++11 world. Owning pointers are meant to be shared_ptr and unique_ptr (they are the owners because they are responsible for deleting the object). Raw pointers should be used to access objects which are owned by a smart pointer. In C++11 you should basically never have a reason to explicitly call delete on a raw pointer anymore.

When you create a dynamic object using new, some other object will be responsible for deleting it once it is no longer needed; that object is the dynamic object's "owner".
If you were to always refer to the object using plain pointers, then it's difficult to tell (without documentation) which of those pointers represents ownership. If you pass a pointer to a function, then does the function take ownership? Or is the caller still responsible for deleting it? If you get that wrong, then you'll either not delete it (causing a resource leak, which might degrade your program's performance and eventually stop it from working), or you might delete it too soon (which can cause all manner of bugs, often very hard to track down).
A widely-used solution for this is to have a policy of always using smart pointers (objects that look like pointers, but contain logic to manage their target's lifetime) to denote ownership, and to never delete anything to which you just have a plain pointer. Then there is never any confusion about whether or not to delete something. The standard library provides smart pointers (unique_ptr and shared_ptr) that give the common semantics of ownership by a single object, and ownership shared between multiple objects.
This is one aspect of the wider topic of resource management via RAII, which is extremely important in C++. As well as providing a clear ownership model, it is also the only sensible way to reliably prevent memory leaks when exceptions are thrown.
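
The point about exceptions can be illustrated with a small sketch. Resource, mayThrow and work are made-up names; the static counter exists only so that a leak would be observable.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Counts live Resource objects so that a leak would be observable.
struct Resource
{
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};
int Resource::live = 0;

void mayThrow(bool fail)
{
    if (fail) throw std::runtime_error("something went wrong");
}

// The unique_ptr destructor runs during stack unwinding, so the Resource is
// released on both the normal path and the exceptional path.
void work(bool fail)
{
    std::unique_ptr<Resource> r(new Resource);
    assert(Resource::live > 0);
    mayThrow(fail);
}
```

With a plain new and a delete written after the call to mayThrow, the delete would be skipped whenever the exception is thrown.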

When to use shared_ptr and when to use raw pointers?

class B;
class A
{
public:
A ()
: m_b(new B())
{
}
shared_ptr<B> GimmeB ()
{
return m_b;
}
private:
shared_ptr<B> m_b;
};
Let's say B is a class that semantically should not exist outside of the lifetime of A, i.e., it makes absolutely no sense for B to exist by itself. Should GimmeB return a shared_ptr<B> or a B*?
In general, is it good practice to completely avoid using raw pointers in C++ code, in lieu of smart pointers?
I am of the opinion that shared_ptr should only be used when there is explicit transfer or sharing of ownership, which I think is quite rare outside of cases where a function allocates some memory, populates it with some data, and returns it, and there is understanding between the caller and the callee that the former is now "responsible" for that data.

Your analysis is quite correct, I think. In this situation, I also would return a bare B*, or even a [const] B& if the object is guaranteed to never be null.
Having had some time to peruse smart pointers, I arrived at some guidelines which tell me what to do in many cases:
If you return an object whose lifetime is to be managed by the caller, return std::unique_ptr. The caller can assign it to a std::shared_ptr if it wants.
Returning std::shared_ptr is actually quite rare, and when it makes sense, it is generally obvious: you indicate to the caller that it will prolong the lifetime of the pointed-to object beyond the lifetime of the object which was originally maintaining the resource. Returning shared pointers from factories is no exception: you must do this eg. when you use std::enable_shared_from_this.
You very rarely need std::weak_ptr, except when you want to make use of the lock method. This has some uses, but they are rare. In your example, if the lifetime of the A object was not deterministic from the caller's point of view, this would have been something to consider.
If you return a reference to an existing object whose lifetime the caller cannot control, then return a bare pointer or a reference. By doing so, you tell the caller that an object exists and that she doesn't have to take care of its lifetime. You should return a reference if you don't make use of the nullptr value.
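
These guidelines might look as follows in code. This is only a sketch: makeB, share and the accessor name b() are hypothetical, and A holds its B in a std::unique_ptr here rather than the shared_ptr used in the question.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct B { int value = 0; };

// The caller manages the lifetime of the result: return std::unique_ptr.
std::unique_ptr<B> makeB(int value)
{
    std::unique_ptr<B> b(new B);
    b->value = value;
    return b;
}

// A caller that later wants shared ownership can convert what it owns.
std::shared_ptr<B> share(std::unique_ptr<B> b)
{
    assert(b != nullptr);
    return std::shared_ptr<B>(std::move(b));
}

class A
{
public:
    A() : m_b(new B) {}
    // The caller cannot control the lifetime and the object is never null:
    // return a reference.
    B& b() { return *m_b; }
    const B& b() const { return *m_b; }
private:
    std::unique_ptr<B> m_b;
};
```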

The question "when should I use shared_ptr and when should I use raw pointers?" has a very simple answer:
Use raw pointers when you do not want to have any ownership attached to the pointer. This job can also often be done with references. Raw pointers can also be used in some low level code (such as for implementing smart pointers, or implementing containers).
Use unique_ptr or scoped_ptr when you want unique ownership of the object. This is the most useful option, and should be used in most cases. Unique ownership can also be expressed by simply creating an object directly, rather than using a pointer (this is even better than using a unique_ptr, if it can be done).
Use shared_ptr or intrusive_ptr when you want shared ownership of the pointer. This can be confusing and inefficient, and is often not a good option. Shared ownership can be useful in some complex designs, but should be avoided in general, because it leads to code which is hard to understand.
shared_ptrs perform a totally different task from raw pointers, and neither shared_ptrs nor raw pointers are the best option for the majority of code.

The following is a good rule of thumb:
When there is no transfer of ownership and no shared ownership, references or plain pointers are good enough. (Plain pointers are more flexible than references.)
When there is transfer of ownership but no shared ownership then std::unique_ptr<> is a good choice. Often the case with factory functions.
When there is shared ownership, then it is a good use case for std::shared_ptr<> or boost::intrusive_ptr<>.
It is best to avoid shared ownership, partly because shared pointers are the most expensive to copy and std::shared_ptr<> takes double the storage of a plain pointer, but, most importantly, because they are conducive to poor designs where there are no clear owners, which, in turn, leads to a hairball of objects that cannot be destroyed because they hold shared pointers to each other.
The best design is where clear ownership is established and is hierarchical, so that, ideally, no smart pointers are required at all. For example, if there is a factory that creates unique objects or returns existing ones, it makes sense for the factory to own the objects it creates and just keep them by value in an associative container (such as std::unordered_map), so that it can return plain pointers or references to its users. This factory must have a lifetime that starts before its first user and ends after its last user (the hierarchical property), so that users cannot possibly have a pointer to an already destroyed object.
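
A minimal sketch of such a factory, assuming a hypothetical TextureCache that owns Texture objects by value and hands out references:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

struct Texture
{
    std::string name;
};

class TextureCache
{
public:
    // Returns the existing Texture for this name or creates it. The cache owns
    // every Texture by value; callers only borrow a reference.
    Texture& get(const std::string& name)
    {
        auto it = mTextures.find(name);
        if (it == mTextures.end())
            it = mTextures.emplace(name, Texture{name}).first;
        assert(it->second.name == name);
        return it->second;
    }

    std::size_t size() const { return mTextures.size(); }

private:
    // References to elements of std::unordered_map stay valid when other
    // elements are inserted, so handing out Texture& is safe while the cache lives.
    std::unordered_map<std::string, Texture> mTextures;
};
```

This relies on the reference stability of node-based containers. A std::vector<Texture> would not give the same guarantee, because growing the vector moves its elements.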

If you don't want the caller of GimmeB() to be able to extend the lifetime of the pointer by keeping a copy of the ptr after the instance of A dies, then you definitely should not return a shared_ptr.
If the caller is not supposed to keep the returned pointer for long periods of time, i.e. there's no risk of the instance of A's lifetime expiring before the pointer's, then a raw pointer would be better. But an even better choice is simply to use a reference, unless there's a good reason to use an actual raw pointer.
And finally in the case that the returned pointer can exist after the lifetime of the A instance has expired, but you don't want the pointer itself extend the lifetime of the B, then you can return a weak_ptr, which you can use to test whether it still exists.
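
The weak_ptr variant could be sketched like this. The helper readIfAlive and the value member are assumptions added for the example.

```cpp
#include <cassert>
#include <memory>

struct B { int value = 42; };

class A
{
public:
    A() : m_b(std::make_shared<B>()) {}

    // Hands out an observer that neither owns B nor keeps it alive.
    std::weak_ptr<B> GimmeB() const
    {
        assert(m_b != nullptr); // invariant: an A always has a B
        return m_b;
    }

private:
    std::shared_ptr<B> m_b;
};

// Returns the value if B is still alive, or -1 if its owner is gone.
int readIfAlive(const std::weak_ptr<B>& weak)
{
    // lock() yields a temporary shared_ptr, so B cannot die while we read it.
    if (std::shared_ptr<B> b = weak.lock())
        return b->value;
    return -1;
}
```

Note that the shared_ptr returned by lock() is a real owner for as long as it exists, so a caller that stores it still extends the lifetime of B.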
The bottom line is that there's usually a nicer solution than using a raw pointer.

I agree with your opinion that shared_ptr is best used when explicit sharing of resources occurs, however there are other types of smart pointers available.
In your precise case: why not return a reference ?
A pointer suggests that the data might be null, however here there will always be a B in your A, thus it will never be null. The reference asserts this behavior.
That being said, I have seen people advocating the use of shared_ptr even in non-shared environments, and giving weak_ptr handles, with the idea of "securing" the application and avoiding stale pointers. Unfortunately, since you can recover a shared_ptr from the weak_ptr (and it is the only way to actually manipulate the data), this is still shared ownership even if it was not meant to be.
Note: there is a subtle bug with shared_ptr, a copy of A will share the same B as the original by default, unless you explicitly write a copy constructor and a copy assignment operator. And of course you would not use a raw pointer in A to hold a B, would you :) ?
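
The copy behaviour from the note can be shown with two small classes. SharingA and CopyingA are hypothetical names, and the deep copy is just one possible way of giving each copy its own B.

```cpp
#include <cassert>
#include <memory>

struct B { int value = 0; };

// Default copy operations: both A objects end up pointing at the same B.
class SharingA
{
public:
    SharingA() : m_b(std::make_shared<B>()) {}
    B& b() { return *m_b; }
private:
    std::shared_ptr<B> m_b;
};

// Explicit deep copy: each A gets its own B.
class CopyingA
{
public:
    CopyingA() : m_b(std::make_shared<B>()) {}
    CopyingA(const CopyingA& other) : m_b(std::make_shared<B>(*other.m_b)) {}
    CopyingA& operator=(const CopyingA& other)
    {
        assert(other.m_b != nullptr);
        *m_b = *other.m_b; // copy the value, keep our own B
        return *this;
    }
    B& b() { return *m_b; }
private:
    std::shared_ptr<B> m_b;
};
```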
Of course, another question is whether you actually need to do so. One of the tenets of good design is encapsulation. To achieve encapsulation:
You shall not return handles to your internals (see Law of Demeter).
so perhaps the real answer to your question is that instead of giving away a reference or pointer to B, it should only be modified through A's interface.

Generally, I would avoid using raw pointers as far as possible since they have very ambiguous meaning - you might have to deallocate the pointee, but maybe not, and only human-read and -written documentation tells you what the case is. And documentation is always bad, outdated or misunderstood.
If ownership is an issue, use a smart pointer. If not, I'd use a reference if practicable.

You allocate B at construction of A.
You say B shouldn't persist outside A's lifetime.
Both of these point to B being a member of A, with A just returning a reference accessor. Are you overengineering this?

I found that the C++ Core Guidelines give some very useful hints for this question:
Whether to use a raw pointer (T*) or a smart pointer depends on who owns the object (whose responsibility it is to release the object's memory).
own :
smart pointer, owner<T*>
not own:
T*, T&, span<>
owner<> and span<> are defined in the Guidelines Support Library (GSL), for example in Microsoft's implementation.
Here are the rules of thumb:
1) never use a raw pointer (or another non-owning type) to pass ownership
2) a smart pointer should only be used when ownership semantics are intended
3) T* or owner<T*> designates an individual object (only)
4) use vector/array/span for arrays
5) To my understanding, shared_ptr is usually used when you don't know who will release the object, for example when one object is used by multiple threads

It is good practice to avoid using raw pointers, but you cannot just replace everything with shared_ptr. In the example, users of your class will assume that it's ok to extend B's lifetime beyond that of A's, and may decide to hold the returned B object for some time for their own reasons. You should return a weak_ptr, or, if B absolutely cannot exist when A is destroyed, a reference to B or simply a raw pointer.

When you say: "Let's say B is a class that semantically should not exist outside of the lifetime of A"
This tells me B should logically not exist without A, but what about physically existing?
If you can be sure no one will try using a B* after A is destroyed, then perhaps a raw pointer will be fine. Otherwise a smarter pointer may be appropriate.
When clients have a direct pointer to A you have to trust they'll handle it appropriately; not try dtoring it etc.

Smart pointers - cases where they cannot replace raw pointers

Hi,
I have this query about smart pointers.
I heard from one of my friends that smart pointers can almost always replace raw pointers.
But when I asked him in which cases smart pointers cannot replace raw pointers, I did not get an answer from him.
Could anybody please tell me when and where they cannot replace raw pointers?

Passing pointers to legacy APIs.
Back-references in a reference-counted tree structure (or any cyclic situation, for that matter). This one is debatable, since you could use weak-refs.
Iterating over an array.
There are also many cases where you could use smart pointers but may not want to, e.g.:
Some small programs are designed to leak everything, because it just isn't worth the added complexity of figuring out how to clean up after yourself.
Fine-grained batch algorithms such as parsers might allocate from a pre-allocated memory pool, and then just blow away the whole pool on completion. Having smart pointers into such a pool is usually pointless.

An API that is going to be called from C, would be an obvious example.

Depends on the smart pointer you use. std::auto_ptr is not compatible with STL containers.

It's a matter of semantics:
smart pointer: you own (at least partly) the memory being pointed to, and as such are responsible for releasing it
regular pointer: you are being given a handle to an object... or not (NULL)
For example:
class FooContainer
{
public:
typedef std::vector<Foo> foos_t;
foos_t::const_iterator fooById(int id) const; // natural right ?
};
But you expose some implementation detail here, you could perfectly create your own iterator class... but iterator usually means incrementable etc... or use a pointer
class FooContainer
{
public:
const Foo* fooById(int id) const;
};
Possibly it will return NULL, which indicates a failure, or it will return a pointer to an object, for which you don't have to handle the memory.
Of course, you could also use a weak_ptr here (you get the expired method), however that would require using shared_ptr in the first place and you might not use them in your implementation.
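
A filled-in version of the pointer-returning lookup might look like this. The add() method, the id member of Foo and the linear search are assumptions; the original only shows the declaration.

```cpp
#include <cassert>
#include <vector>

struct Foo { int id; };

class FooContainer
{
public:
    void add(Foo foo)
    {
        assert(fooById(foo.id) == nullptr); // ids are assumed to be unique
        mFoos.push_back(foo);
    }

    // Returns a non-owning pointer to the matching Foo, or nullptr if there is none.
    // The pointer is only valid until the container is modified or destroyed.
    const Foo* fooById(int id) const
    {
        for (const Foo& foo : mFoos)
            if (foo.id == id)
                return &foo;
        return nullptr;
    }

private:
    std::vector<Foo> mFoos;
};
```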

Interaction with legacy code. If the API needs a raw pointer, you need to provide a raw pointer, even if you wrap it in a smart pointer once it's in your code.

If you have a situation where a raw pointer is cast to an intptr_t and back for some reason, it cannot be replaced by a smart pointer because the casting operation would lose any reference counting information contained in the smart pointer.

It would be quite hard to implement smart pointers if at some point you don't use plain pointers.
I suppose it would also be harder to implement certain data structures with smart pointers. E.g freeing the memory of a regular linked list is quite trivial, but it would take some thought to figure out the combination of owning and non-owning smart pointers to get the same result.
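
For the linked list case, one possible combination is a std::unique_ptr per link, so that each node owns its successor. The sketch below uses a hypothetical IntList; the explicit destructor is there because the implicitly generated one would destroy the chain recursively.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

class IntList
{
    struct Node
    {
        int value;
        std::unique_ptr<Node> next; // each node owns its successor
    };

public:
    IntList() = default;
    IntList(const IntList&) = delete;
    IntList& operator=(const IntList&) = delete;

    // Unlink nodes one at a time. The implicit destructor would also free
    // everything, but recursively, which can exhaust the stack on long lists.
    ~IntList()
    {
        while (mHead)
            mHead = std::move(mHead->next);
    }

    void pushFront(int value)
    {
        std::unique_ptr<Node> node(new Node{value, std::move(mHead)});
        mHead = std::move(node);
        ++mSize;
    }

    int front() const
    {
        assert(mHead != nullptr);
        return mHead->value;
    }

    std::size_t size() const { return mSize; }

private:
    std::unique_ptr<Node> mHead;
    std::size_t mSize = 0;
};
```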

best practice when returning smart pointers

What is the best practice when returning a smart pointer, for example a boost::shared_ptr? Should I by standard return the smart pointer, or the underlying raw pointer? I come from C# so I tend to always return smart pointers, because it feels right. Like this (skipping const-correctness for shorter code):
class X
{
public:
boost::shared_ptr<Y> getInternal() {return m_internal;}
private:
boost::shared_ptr<Y> m_internal;
};
However I've seen some experienced coders returning the raw pointer, and putting the raw pointers in vectors. What is the right way to do it?

There is no "right" way. It really depends on the context.
You can internally handle memory with a smart pointer and externally give references or raw pointers. After all, the user of your interface doesn't need to know how you manage memory internally. In a synchronous context this is safe and efficient. In an asynchronous context, there are many pitfalls.
If you're unsure about what to do you can safely return smart pointers to your caller. The object will be deallocated when the references count reaches zero. Just make sure that you don't have a class that keeps smart pointers of objects for ever thus preventing the deallocation when needed.
As a last note, in C++ don't overuse dynamically allocated objects. There are many cases where you don't need a pointer and can work on references and const references. That's safer and reduces the pressure on the memory allocator.

It depends on what the meaning of the pointer is.
When returning a shared_ptr, you are in effect saying "You will share ownership of this object", such that, if the original container object dies before you release your pointer, that object will still exist.
Returning a raw pointer says: "You know about this object, but don't own it". It's a way of passing control, but not keeping the lifetime tied to the original owner.
(in some older c-programs, it means "It's now your problem to delete me", but I'd heavily recommend avoiding this one)
Typically, defaulting to shared saves me a lot of hassle, but it depends on your design.

I follow the following guidelines for passing pointers arguments to functions and returning pointers:
boost::shared_ptr
API and client are sharing ownership of this object. However you have to be careful to avoid circular references with shared_ptr, if the objects represent some kind of graph. I try to limit my use of shared_ptr for this reason.
boost::weak_ptr / raw pointer
API owns this object, you are allowed to share it while it is valid. If there is a chance the client will live longer than the API I use a weak_ptr.
std::auto_ptr
API is creating an object but the client owns the object. This ensures that the returning code is exception safe, and clearly states that ownership is being transferred.
boost::scoped_ptr
For pointers to objects stored on the stack or as class member variables. I try to use scoped_ptr first.
Like all guidelines there will be times when the rules conflict or have to be bent, then I try to use intelligence.

I typically return "owning"/"unique" smart pointers from factories or similar to make it clear who is responsible for cleaning up.
This example https://ideone.com/qJnzva shows how to return a std::unique_ptr that will be deleted when the scope of the variable that the caller assigns the value to goes out of scope.
While it's true that the smart pointer deletes its own pointer, the lifetime of the variable holding the smart pointer is 100% controlled by the caller, so the caller decides when the pointer is deleted. However, since it's a "unique" and "owning" smart pointer, no other client can control the lifetime.

I would never return a raw pointer, instead I would return a weak_ptr to tell the user of the pointer that he doesn't have the control over the resource.
If you return a weak_ptr it's very unlikely that there will be dangling pointers in the application.
If there is a performance problem I would return a reference to the object and a hasValidXObject method.

In my opinion, in C++, you should always have to justify the use of an unguarded pointer.
There could be many valid reasons: a need for very high performance, for very low memory usage, for dealing with legacy libraries, because of some issue with the underlying data structure the pointer is storing. But [dynamically allocated] pointers are somewhat 'evil', in that you have to deallocate the memory at every possible execution path and you will almost certainly forget one.

I wouldn't put raw pointers in vectors.
In case they use auto_ptr or boost::scoped_ptr, they can't use (or return) anything but raw pointers. That could explain their way of coding, I guess.

Depends on your goals.
Blindly returning a smart ptr to internal data might not be a good idea (this is very sensitive to the task you're trying to solve); you might be better off just offering some doX() and doY() that use the pointer internally instead.
On the other hand, if returning the smart ptr, you should also make sure that you create no mutual circular references where objects end up unable to destroy each other (weak_ptr might be a better option in that case).
Otherwise, as already mentioned above, performance/legacy code/lifetime considerations should all be taken into account.
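
The circular reference problem and the weak_ptr way out can be sketched as follows, using std:: rather than boost:: smart pointers. Parent, Child and makeFamily are hypothetical names; the counter only makes destruction observable.

```cpp
#include <cassert>
#include <memory>

struct Child;

struct Parent
{
    static int live;
    Parent() { ++live; }
    ~Parent() { --live; }
    std::shared_ptr<Child> child; // the parent owns the child
};
int Parent::live = 0;

struct Child
{
    std::weak_ptr<Parent> parent; // back-reference that does not own
};

std::shared_ptr<Parent> makeFamily()
{
    auto parent = std::make_shared<Parent>();
    parent->child = std::make_shared<Child>();
    parent->child->parent = parent;
    assert(parent.use_count() == 1); // weak_ptr does not add a strong reference
    return parent;
}
```

If Child::parent were a shared_ptr instead, parent and child would keep each other alive and neither destructor would ever run.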

const boost::shared_ptr<Y> &getInternal() {return m_internal;}
This avoids a copy.
Sometimes you'll like to return a reference, for example:
Y &operator*() { return *m_internal; }
const Y &operator*() const { return *m_internal; }
This is also good, but only if the reference will be used and discarded immediately.
The same goes for a raw pointer.
Returning a weak_ptr is also an option.
All four are good depending on the goals. This question requires a more extensive discussion.
