What is the canonical way to deal with shared pointers in C++ when there is a clear case to argue that "one, unique object owns the pointer"?
For example, if a shared_ptr is a member of a particular class, which is responsible for initializing the pointer, then it could be argued that this class should also have the final say on when the pointer is deleted.
In other words, it may be the case that when the owning class goes out of scope (or is itself deleted), any remaining references to the pointer no longer make sense. This may be due to related variables which were members of the destroyed class.
Here is a sketch of an example:
class Owner
{
public:
    Owner()
    {
        p.reset(malloc_object(arguments), free_object);
    }
    std::shared_ptr<type> get() { return p; }
    // seems strange because now something somewhere
    // else in the code can hold up the deletion of p
    // unless a manual destructor is written
    ~Owner()
    {
        p.reset(); // arduous
    }
    std::shared_ptr<type> p;
    int a, b, c; // some member variables which are logically attached to p
                 // such that neither a, b, c nor p make sense without each other
};
One cannot use a unique_ptr as this would not permit the pointer to be returned by the get function, unless a raw pointer is returned. (Is this an acceptable solution?)
A unique_ptr in combination with returning weak_ptr from the get function might make sense. But this is not valid C++. weak_ptr is used in conjunction with shared_ptr.
A shared_ptr with the get function returning weak_ptr is better than a raw pointer because in order to use the weak pointer, it has to be converted to a shared pointer. This will fail if the reference count is already zero and the object has been deleted.
However, using a shared_ptr defeats the point, since ideally a unique_ptr would be chosen because there can then only be one thing which "owns" the pointed-to data.
I hope the question is clear enough, it was quite difficult to explain since I can't copy the code I am working with.
It is OK to return the shared_ptr there; the consequence is that the object may still be held somewhere outside the Owner class. Note that resetting p in the destructor only drops Owner's own reference: whoever was holding a copy of that shared_ptr still holds a valid pointer, and the object stays alive until that copy is released.
Using weak_ptr with shared_ptr is also a good solution; the problem remains the same, namely that the class that best represents p is unique_ptr, as you described.
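A minimal sketch of the shared_ptr plus weak_ptr variant, assuming a hypothetical Resource type in place of `type` and make_shared in place of malloc_object/free_object. The helper observer_sees_expiry is also made up for illustration. Observers receive a weak_ptr and have to lock() it before use; once Owner is gone, the weak_ptr reports that it has expired.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for `type` in the question.
struct Resource {
    int value = 42;
};

class Owner {
public:
    Owner() : p(std::make_shared<Resource>()) {}

    // Hand out a non-owning handle. Callers only gain (temporary)
    // ownership while they hold the result of lock().
    std::weak_ptr<Resource> get() const { return p; }

private:
    std::shared_ptr<Resource> p;
};

// Returns true if the observer can use the resource while Owner lives
// and sees it expired afterwards.
bool observer_sees_expiry() {
    std::weak_ptr<Resource> w;
    bool alive_inside = false;
    {
        Owner owner;
        w = owner.get();
        assert(!w.expired());
        if (std::shared_ptr<Resource> sp = w.lock()) {  // temporary shared ownership
            alive_inside = (sp->value == 42);
        }
    }  // Owner destroyed here, and with it the last shared_ptr
    return alive_inside && w.expired();
}
```

Note that a caller who stores the locked shared_ptr can still prolong the lifetime, so this only expresses the intent of single ownership; it does not enforce it.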
The path I would choose is to hold a unique_ptr which seems more adequate and to implement the get() function like this:
type* get() { return p.get(); }
The behaviour is the same and the code is clearer since having p as unique_ptr will give clarity on how it should be used.
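As a rough sketch of that suggestion, assuming a hypothetical Resource type and new/delete based stand-ins for malloc_object and free_object (none of which come from the original code), Owner could look like this:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for `type`, malloc_object and free_object.
struct Resource { int value; };
Resource* malloc_object(int v) { return new Resource{v}; }
void free_object(Resource* r) { delete r; }

class Owner {
public:
    explicit Owner(int v) : p(malloc_object(v), &free_object) {}

    // Non-owning access: the caller must not free the pointer
    // and must not use it after Owner is destroyed.
    Resource* get() const { return p.get(); }

private:
    // For unique_ptr the deleter type is part of the pointer type.
    std::unique_ptr<Resource, void (*)(Resource*)> p;
};

int read_through_owner() {
    Owner owner(7);
    assert(owner.get() != nullptr);
    return owner.get()->value;
}  // free_object runs here, exactly once
```

No manual destructor is needed: the unique_ptr member calls the deleter when Owner is destroyed.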
Preamble
In C++11 there is the std::shared_ptr + std::weak_ptr combo. Despite being very useful, it has a nasty issue: you cannot easily construct a shared_ptr from a raw pointer. As a result of this flaw, such smart pointers usually become "viral": people start to completely avoid raw pointers and references, and use shared_ptr and weak_ptr exclusively all over the code, because there is no way to pass a raw reference into a function expecting a smart pointer.
On the other hand, there is boost::intrusive_ptr. It is equivalent to std::shared_ptr and can easily be constructed from a raw pointer, because the reference counter is contained within the object. Unfortunately, there is no weak_ptr companion to it, so there is no way to have non-owning references which you could check for being invalid. In fact, some believe that a weak companion for intrusive_ptr is impossible.
Now, there is std::enable_shared_from_this, which embeds a weak_ptr directly into your class, so that you can construct a shared_ptr from a pointer to the object. But there is a small limitation (at least one shared_ptr must exist), and it still does not allow the obvious syntax: std::shared_ptr(pObject).
Also, there is std::make_shared, which allocates the reference counters and the user's object in a single memory chunk. This is very close to the concept of intrusive_ptr, but the user's object can be destroyed independently of the reference counting block. Also, this concept has an inevitable drawback: the whole memory block (which can be large) is deallocated only when all weak_ptr-s are gone.
Question
The main question is: how to create a pair of shared_ptr/weak_ptr, which would have the benefits of both std::shared_ptr/std::weak_ptr and boost::intrusive_ptr?
In particular:
1. shared_ptr models shared ownership over the object, i.e. the object is destroyed exactly when the last shared_ptr pointing to it is destroyed.
2. weak_ptr does not model ownership over the object, and it can be used to solve the circular dependency problem.
3. weak_ptr can be checked for being valid: it is valid when there exists a shared_ptr pointing to the object.
4. shared_ptr can be constructed from a valid weak_ptr.
5. weak_ptr can be constructed from a valid raw pointer to the object. Raw pointer is valid if there exists at least one weak_ptr still pointing to that object. Constructing weak_ptr from invalid pointer results in undefined behavior.
6. The whole smart pointer system should be cast-friendly, like the above-mentioned existing systems.
It is OK for the solution to be intrusive, i.e. to ask the user to inherit once from a given base class. Holding the object's memory when the object is already destroyed is also OK. Thread safety is very good to have (unless it is too inefficient), but solutions without it are also interesting. It is OK to allocate several chunks of memory per object, though having one memory chunk per object is preferred.
Points 1-4 and 6 are already modelled by shared_ptr/weak_ptr.
Point 5 makes no sense. If lifetime is shared, then there is no valid object if a weak_ptr exists but a shared_ptr does not. Any raw pointer would be an invalid pointer. The lifetime of the object has ended. The object is no more.
A weak_ptr does not keep the object alive, it keeps the control block alive. A shared_ptr keeps both the control block and the controlled object alive.
If you don't want to "waste" memory by combining the control block with the controlled object, don't call make_shared.
If you don't want shared_ptr<X> to be passed virally into functions, don't pass it. Pass a reference or const reference to the X. You only need to mention shared_ptr in the argument list if you intend on managing the lifetime in the function. If you simply want to perform operations on what the shared_ptr is pointing at, pass *p or *p.get() and accept a [const] reference.
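To illustrate the last point, here is a small sketch; X, name_length and Registry are made-up names. A function that only uses the object takes a const reference, while a function that keeps the object alive takes a shared_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct X {
    std::string name;
};

// Only uses the object: no smart pointer in the signature.
std::size_t name_length(const X& x) { return x.name.size(); }

// Keeps the object alive beyond the call: a shared_ptr parameter is justified.
class Registry {
public:
    void remember(std::shared_ptr<X> x) { kept_ = std::move(x); }
    long owners() const { return kept_.use_count(); }
private:
    std::shared_ptr<X> kept_;
};

std::size_t demo_length() {
    std::shared_ptr<X> p = std::make_shared<X>(X{"widget"});
    assert(p != nullptr);
    return name_length(*p);  // dereference at the call site
}

long demo_owners() {
    std::shared_ptr<X> p = std::make_shared<X>(X{"widget"});
    Registry r;
    r.remember(p);           // copy: shared ownership is intended here
    return r.owners();       // counts p and the copy stored in r
}
```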
Override new on the object to allocate a control block before the instance of the object.
This is pseudo-intrusive. Conversion to and from a raw pointer is possible, because of the known offset. The object can be destroyed without a problem.
The reference counting block holds a strong and weak count, and a function object to destroy the object.
Downside: it doesn't work polymorphically very well.
Imagine we have:
struct A {int x;};
struct B {int y;};
struct C:B,A {int z;};
then we allocate a C this way.
C* c = new C{};
and store it in an A*:
A* a = c;
We then pass this to a smart-pointer-to-A. It expects the control block to be immediately before the address a points to, but because B exists before A in the inheritance graph of C, there is an instance of B there instead.
That seems less than ideal.
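The pointer adjustment can be observed directly. In this sketch (the function name is made up), two non-empty base subobjects cannot occupy the same storage, so the A* and the B* obtained from the same C cannot both equal the address the allocation returned:

```cpp
#include <cassert>

struct A { int x; };
struct B { int y; };
struct C : B, A { int z; };

// True when the A subobject and the B subobject of one C start at
// different addresses, i.e. at least one base pointer is adjusted.
bool base_subobjects_differ() {
    C c{};                 // value-initialized, so all ints are zero
    A* a = &c;             // implicit derived-to-base conversion
    B* b = &c;
    assert(a->x == 0);
    return static_cast<void*>(a) != static_cast<void*>(b);
}
```

A smart-pointer-to-A that looks for a control block at a fixed offset before its own pointer would therefore look in the wrong place for at least one of the bases.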
So we cheat. We again replace new. But it instead registers the pointer value and size with a registry somewhere. There we store the weak/strong pointer counts (etc).
We rely on a linear address space and class layout. When we have a pointer p, we simply look for whose range of address it is in. Then we know the strong/weak counts.
This one has horrible performance in general, especially multi-threaded, and relies upon undefined behavior (pointer comparisons for pointers not pointing to the same object, or less order in such cases).
In theory, it is possible to implement intrusive version of shared_ptr and weak_ptr, but it might be unsafe due to C++ language limitations.
Two reference counters (strong and weak) are stored in the base class RefCounters of the managed object. Any smart pointer (either shared or weak) contains a single pointer to the managed object. Shared pointers own the object itself, and shared + weak pointers together own the memory block of the object. So when the last shared pointer is gone, object is destroyed, but its memory block remains alive as long as there is at least one weak pointer to it. Casting pointers works as expected, given that all the involved types are still inherited from the RefCounted class.
Unfortunately, in C++ it is usually forbidden to work with members of an object after the object is destroyed, although most implementations should allow doing that without problems. More details about the legality of the approach can be found in this question.
Here is the base class required for the smart pointers to work:
struct RefCounters {
size_t strong_cnt;
size_t weak_cnt;
};
struct RefCounted : public RefCounters {
virtual ~RefCounted() {}
};
Here is a part of shared pointer definition (shows how object is destroyed and memory chunk is deallocated):
template<class T> class SharedPtr {
static_assert(std::is_base_of<RefCounted, T>::value);
T *ptr;
RefCounters *Counter() const {
RefCounters *base = ptr;
return base;
}
void DestroyObject() {
ptr->~T();
}
void DeallocateMemory() {
RefCounted *base = ptr;
operator delete(base);
}
public:
~SharedPtr() {
if (ptr) {
if (--Counter()->strong_cnt == 0) {
DestroyObject();
if (Counter()->weak_cnt == 0)
DeallocateMemory();
}
}
}
...
};
Full code with sample is available here.
I never used any kind of smart pointer, but I keep reading about them almost everywhere when the topic is pointers. I do understand that there are situations where smart pointers are much nicer to work with than raw pointers, because to some extent they manage ownership of the pointer. However, I still do not know where the line is between "I do not need smart pointers for that" and "this is a case for smart pointers".
Let's say I have the following situation:
class A {
public:
double get1(){return 1;}
double get2(){return 2;}
};
class SomeUtilityClass {
public:
SomeUtilityClass(A* a) : a(a) {}
double getResult(){return a->get1() + a->get2();}
void setA(A* a){a = a;}
private:
A* a;
};
int main(int argc, char** argv) {
A a;
SomeUtilityClass u(&a);
std::cout << u.getResult() << std::endl;
A a2;
u.setA(&a2);
std::cout << u.getResult() << std::endl;
return 0;
}
This is of course an oversimplified example. What I mean is that SomeUtilityClass is not supposed to "own" an instance of A (because it is just a utility class), thus it just holds a pointer.
Concerning the pointer, the only things that I am aware of that could go wrong are:
SomeUtilityClass can be instantiated with a null pointer
The object pointed to may be deleted/go out of scope, without the SomeUtilityClass noticing it
How could a smart pointer help to avoid these problems? What other benefits would I get by using a smart pointer in this case?
PS: I know that there are several questions on smart pointers (e.g. this one). However, I would appreciate it if you could tell me about the impact on this particular example.
This depends on how the parameter is created and stored. If you don't own the memory and it could be either statically or dynamically allocated, a raw pointer is a perfectly reasonable solution -- especially if you need to support swapping of the data as in your example. Another option would be to use std::reference_wrapper, which would get rid of your nullptr issue whilst keeping the same semantics.
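A sketch of the std::reference_wrapper variant, with the member renamed to a_ and const added to the getters for illustration:

```cpp
#include <cassert>
#include <functional>

class A {
public:
    double get1() const { return 1; }
    double get2() const { return 2; }
};

class SomeUtilityClass {
public:
    // A reference cannot be null, so no null check is needed.
    explicit SomeUtilityClass(A& a) : a_(a) {}
    double getResult() const { return a_.get().get1() + a_.get().get2(); }
    // Unlike a plain reference member, reference_wrapper can be rebound.
    void setA(A& a) { a_ = std::ref(a); }
private:
    std::reference_wrapper<A> a_;
};

double demo_rebind() {
    A first, second;
    SomeUtilityClass u(first);
    assert(u.getResult() == 3.0);
    u.setA(second);
    return u.getResult();
}
```

This removes the null case only. The wrapped reference can still dangle if the referenced A is destroyed first.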
If you are holding a pointer to some shared resource (i.e. stored in a std::shared_ptr somewhere) and want to be able to check if it has been deleted or not, you could hold a std::weak_ptr.
For the purposes of this answer I'm redefining setA as:
void setA(A* new_a){a = new_a;}
Consider:
// Using your SomeUtilityClass
int main() {
A a;
SomeUtilityClass u(&a);
// We define a new scope, just because:
{
A b;
u.setA(&b);
}
std::cout << u.getResult() << '\n';
return 0;
}
After the scope is finished, SomeUtilityClass has a dangling pointer and getResult() invokes Undefined Behaviour. Note that this can't be solved with a reference: You would still get a dangling one.
Now consider the version using a smart pointer:
class SomeUtilityClass {
public:
SomeUtilityClass(std::shared_ptr<A>& a) : a{a} {}
double getResult(){return a->get1() + a->get2();}
void setA(std::shared_ptr<A>& new_a){a = new_a;}
private:
std::shared_ptr<A> a;
};
int main() {
std::shared_ptr<A> a{new A};
SomeUtilityClass u{a};
// We define a new scope, just because:
{
std::shared_ptr<A> b{new A};
u.setA(b);
}
std::cout << u.getResult() << '\n';
return 0;
}
Because you have shared ownership, there's no way to get a dangling pointer. The memory pointed to by b will be deleted as usual, but only after u is destroyed (or its pointer is changed).
IMHO, in most cases you should be using smart pointers (Even when at first it doesn't seem to make much sense). It makes maintenance much easier. Use raw pointers only in specific code that actually needs them, and encapsulate/isolate this code as much as possible.
If SomeUtilityClass does not own the member variable a, then a smart pointer does not make sense.
You might consider a reference member, which would remove the problems of a null pointer.
The default way of expressing a non-owning pointer in C++ is weak_ptr. To use weak_ptr you need to use shared_ptr for ownership, so in your example you would use
shared_ptr<A> owner(...)
instead of
A a
Then as the private pointer member of your SomeUtilityClass you use weak pointer:
weak_ptr<A> w;
and initialise it with shared_ptr:
SomeUtilityClass(shared_ptr<A> o) : w(o) {}
However, you cannot use the weak_ptr directly, since the shared_ptr could go out of scope, leaving your weak pointer with nothing to point to. Before use you need to lock it:
shared_ptr<A> locked = w.lock();
The locked pointer will be empty if the owning pointer no longer manages an object, e.g. because it went out of scope. If it is not empty, you may use it; when it goes out of scope, it automatically releases its lock on the object.
Both shared_ptr and weak_ptr are available in standard library in C++11, and in Boost for older compilers.
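Putting the fragments together, a self-contained sketch might look like the following. The method name tryGetResult and the out-parameter are assumptions made for illustration, not part of the original class.

```cpp
#include <cassert>
#include <memory>

class A {
public:
    double get1() const { return 1; }
    double get2() const { return 2; }
};

class SomeUtilityClass {
public:
    explicit SomeUtilityClass(const std::shared_ptr<A>& o) : w(o) {}

    // Writes the result to `out` and returns true only if the A still exists.
    bool tryGetResult(double& out) const {
        if (std::shared_ptr<A> locked = w.lock()) {
            out = locked->get1() + locked->get2();
            return true;
        }
        return false;
    }
private:
    std::weak_ptr<A> w;
};

bool demo_detects_destruction() {
    std::shared_ptr<A> owner = std::make_shared<A>();
    SomeUtilityClass u(owner);
    assert(owner.use_count() == 1);     // the weak_ptr does not count as an owner
    double r = 0;
    bool ok_before = u.tryGetResult(r) && r == 3.0;
    owner.reset();                      // last owner gone, the A is destroyed
    bool ok_after = u.tryGetResult(r);  // lock() now yields an empty pointer
    return ok_before && !ok_after;
}
```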
There are different types of smart pointers. In your case, it is clear that a smart pointer is not really needed, but it may still provide some benefits.
SomeUtilityClass can be instantiated with a null pointer
This one is probably best solved with a check in the constructor, throwing an exception or indicating an error in some other way in the case when you get a NULL pointer as the argument. I can hardly imagine how a smart pointer would help, unless you use a specific smart pointer class that doesn't accept NULLs, so it does the check for you already.
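A minimal sketch of such a check, assuming std::invalid_argument as the error type and a hypothetical private helper named checked:

```cpp
#include <cassert>
#include <stdexcept>

class A {
public:
    double get1() const { return 1; }
    double get2() const { return 2; }
};

class SomeUtilityClass {
public:
    explicit SomeUtilityClass(A* a) : a_(checked(a)) {}
    void setA(A* a) { a_ = checked(a); }
    double getResult() const { return a_->get1() + a_->get2(); }
private:
    // Rejects null once, so getResult() never has to test for it.
    static A* checked(A* a) {
        if (!a) throw std::invalid_argument("SomeUtilityClass: null A*");
        return a;
    }
    A* a_;
};

bool accepts_valid_rejects_null() {
    A a;
    SomeUtilityClass ok(&a);
    assert(ok.getResult() == 3.0);
    try {
        SomeUtilityClass bad(nullptr);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```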
The object pointed to may be deleted/go out of scope, without the SomeUtilityClass noticing it
This one can actually be resolved with a special type of smart pointers, but then it is needed that the object being pointed to somehow supports notification of destruction. One such example is the QPointer class in the Qt library, which can only point to QObject instances, which notify it when deleted, so the smart pointer automatically becomes NULL when the object is deleted. There are some problems with this approach, though:
You need to check for NULLs every time you access the object through the smart pointer.
If a smart pointer points to an instance of a class, say MyClass, extending the class performing the deletion notification (QObject in the Qt case), you get strange results: it's the destructor of QObject that notifies the smart pointer, so it is possible that you access it when the MyClass destructor already began its dirty work, so the object is partially destructed, but the pointer is not NULL yet because the destruction is still in progress.
Suppose I have:
class SomeObject {
};
SomeObject& f() {
SomeObject *s = new SomeObject();
return *s;
}
// Variant 1
int main() {
SomeObject& s = f();
// Do something with s
}
// Variant 2
int main() {
SomeObject s = f();
// Do something with s
}
Is there any difference between the first variant and the second? Are there any cases where I would use one over the other?
Edit: One more question, what does s contain in both cases?
First, you never want to return a reference to an object which
was dynamically allocated in the function. This is a memory
leak waiting to happen.
Beyond that, it depends on the semantics of the object, and what
you are doing. Using the reference (variant 1) allows
modification of the object it refers to, so that some other
function will see the modified value. Declaring a value
(variant 2) means that you have your own local copy, and any
modifications, etc. will be to it, and not to the object
referred to in the function return.
Typically, if a function returns a reference to a non-const,
it's because it expects the value to be modified; a typical
example would be something like std::vector<>::operator[],
where an expression like:
v[i] = 42;
is expected to modify the element in the vector. If this is
not the case, then the function should return a value, not
a reference (and you should almost never use such a function to
initialize a local reference). And of course, this only makes
sense if you return a reference to something that is accessible
elsewhere; either a global variable or (far more likely) data
owned by the class of which the function is a member.
In the first variant you attach a reference directly to a dynamically allocated object. This is a rather unorthodox way to own dynamic memory (a pointer would be better suited for that purpose), but still it gives you the opportunity to properly deallocate that object. I.e. at the end of your first main you can do
delete &s;
In the second variant you lose the reference, i.e. you lose the only link to that dynamically allocated object. The object becomes a memory leak.
Again, owning a dynamically allocated object through a reference does not strike me as a good practice. It is usually better to use a pointer or a smart pointer for that purpose. For that reason, both of your variants are flawed, even though the first one is formally redeemable.
Variant 1 will copy only the address of the object and will be fast.
Variant 2 will copy the whole object and will be slow (and, as already pointed out, in Variant 2 you can't delete the object which you created by calling new).
For the edit: in both cases s holds the same object contents; in Variant 1 s refers to the object created in f, in Variant 2 s is a copy of it.
Neither of the two options you asked about is very good. In this particular case you should use shared_ptr or unique_ptr, or auto_ptr if you use older C++ compilers, and change the function so it returns a pointer, not a reference. Another good option is returning the object by value, especially if the object is small and cheap to construct.
Modification to return the object by value:
SomeObject f() { return SomeObject(); }
SomeObject s(f());
Simple, clean, safe - no memory leaking here.
Using unique_ptr:
SomeObject* f() { return new SomeObject(); }
unique_ptr<SomeObject> s(f());
One of the advantages of using a unique_ptr or shared_ptr here is that you can change your function f at some point to return objects of a class derived from SomeObject and none of your client code will need to be changed - just make sure the base class (SomeObject) has a virtual destructor.
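A sketch of that idea, going one step further than the snippet above by returning the unique_ptr from f directly. DerivedObject and id() are hypothetical additions used only to make the polymorphism visible.

```cpp
#include <cassert>
#include <memory>

struct SomeObject {
    virtual ~SomeObject() = default;  // required for deletion through a base pointer
    virtual int id() const { return 1; }
};

// Hypothetical derived class that f might return later on.
struct DerivedObject : SomeObject {
    int id() const override { return 2; }
};

// Returning unique_ptr puts the ownership transfer into the signature.
std::unique_ptr<SomeObject> f() {
    return std::unique_ptr<SomeObject>(new DerivedObject());
}

int demo_id() {
    std::unique_ptr<SomeObject> s = f();
    assert(s != nullptr);
    return s->id();
}  // the DerivedObject is deleted here through the virtual destructor
```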
Why the options you were considering are not very good:
Variant 1:
SomeObject& s = f();
How are you going to destroy the object? You will need the address of the object to call its destructor anyway, so at some point you would need to take the address of the object that s refers to (&s).
Variant 2. You have a leak here and not a chance to call destructor of the object returned from your function.
struct Temp
{
CString one;
CString two;
};
class Foo
{
public:
Temp obj;
void somewhere();
};
void Foo::somewhere()
{
void* pData = static_cast<void*>(&obj);
OwnMethod(pData); // void OwnMethod(void*);
}
The question is:
Should I create obj on heap or this situation isn't dangerous (passing local class objects pointer)?
If OwnMethod(pData) stores the pointer somewhere for later use, that later use is not possible anymore once the object on which Foo::somewhere() is called is destroyed.
If OwnMethod(pData) only accesses the pointed-to data, you are safe.
The member variable will last as long as the Foo object, so the pointer will be valid during the call to OwnMethod.
If that function stores a copy of the pointer somewhere, and something else uses that pointer later, then there is a danger that it might be accessed after the Foo (and therefore the pointer's target) have been destroyed. There are various ways to prevent that; as you say, one is to dynamically allocate the object, and then transfer or share ownership when it's passed to OwnMethod. Smart pointers, such as std::unique_ptr and std::shared_ptr, are a very good way to track ownership of dynamic objects.
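One way the shared-ownership option could look is sketched below. Everything here is an assumption for illustration: std::string stands in for CString, OwnMethod is given a typed shared_ptr parameter instead of void*, and Consumer is a made-up class that stores the pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Temp {  // std::string stands in for CString
    std::string one;
    std::string two;
};

// Hypothetical consumer that keeps the data for later use.
class Consumer {
public:
    void OwnMethod(std::shared_ptr<const Temp> data) { stored_ = std::move(data); }
    std::string later() const {
        return stored_ ? stored_->one + stored_->two : std::string();
    }
private:
    std::shared_ptr<const Temp> stored_;
};

class Foo {
public:
    Foo() : obj(std::make_shared<Temp>(Temp{"a", "b"})) {}
    void somewhere(Consumer& c) { c.OwnMethod(obj); }  // shares ownership
private:
    std::shared_ptr<Temp> obj;
};

std::string demo_outlives_foo() {
    Consumer c;
    assert(c.later().empty());
    {
        Foo foo;
        foo.somewhere(c);
    }  // foo is destroyed, but the Temp is kept alive by c
    return c.later();
}
```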
Wow, a lot of issues.
A complex object shouldn't be passed to anything taking a void*.
Who wrote OwnMethod?
Why doesn't it take a pointer of type Temp*?
In fact why doesn't it take a reference of type Temp&?
If OwnMethod() may be required to accept objects of several different types then it should take a base class pointer or reference and use polymorphism.
However as far as the lifetime arguments go - obj will exist as long as the wrapping class does, so if the pointer is not used beyond the scope of OwnMethod this is ok. If OwnMethod causes the pointer to be stored elsewhere beyond Foo's lifetime then you have an issue, and maybe obj should be allocated on the heap. (And it might not even be appropriate for it to be a member of Foo at all.)
void ClassName::LocalMethod( )
{
boost::shared_ptr<ClassName> classNamePtr( this );
//some operation with classNamePtr
return;
}
Here the object is getting released when it returns from LocalMethod() since classNamePtr is out of scope. Isn't the shared_ptr smart enough to know that the ClassName object is still in scope and not to delete it?
What does it mean to create a shared_ptr to an object? It means that the holder of the shared_ptr now assumes ownership over the object. Ownership means that the object will be deleted when the holder so desires. When the holder of the shared_ptr destroys its shared_ptr, that will cause the object to potentially be destroyed, assuming that there are no other shared_ptrs to that object.
When a shared_ptr is a member of a class, that means that the lifetime of the object pointed to by the shared_ptr is at least as long as the object that the shared_ptr is a member of. When a shared_ptr is on the stack, this means that the lifetime of the object that the shared_ptr is pointing to will be at least as long as the scope it was created in. Once the object falls off the stack, it may be deleted.
The only time you should ever take a pointer and wrap it into a shared_ptr is when you are allocating the object initially. Why? Because an object does not know whether it is in a shared_ptr or not. It can't know. This means that the person who creates the original shared_ptr now has the responsibility to pass it around to other people who need to share ownership of that memory. The only way shared ownership works is through the copy constructor of shared_ptr. For example:
shared_ptr<int> p1(new int(12));   // the raw-pointer constructor is explicit
shared_ptr<int> p2(p1.get());      // compiles, but creates a second, unrelated owner
shared_ptr<int> p3 = p1;
The copy constructor of shared_ptr creates shared ownership between p1 and p3. Note that p2 does not share ownership with p1. They both think they have ownership over the same memory, but that's not the same as sharing it, because each of them thinks that it has sole ownership of it.
Therefore, when the three pointers are destroyed, the following will happen. First, p3 will be destroyed. But since p3 and p1 share ownership of the integer, the integer will not be destroyed yet. Next, p2 will be destroyed. Since it thinks that it is the only holder of the integer, it will then destroy it.
At this point, p1 is pointing to deleted memory. When p1 is destroyed, it thinks that it is the only holder of the integer, so it will then destroy it. This is bad, since it was already destroyed.
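The well-defined half of this can be checked with use_count(). The sketch below uses std::shared_ptr and made-up function names. It deliberately leaves out the p2 case, since letting such a pointer be destroyed is undefined behaviour.

```cpp
#include <cassert>
#include <memory>

// Owner count seen by p1 while one proper copy exists.
long owners_after_copy() {
    std::shared_ptr<int> p1(new int(12));
    assert(p1.use_count() == 1);
    std::shared_ptr<int> p3 = p1;  // copy: both use the same control block
    return p1.use_count();
}

// Owner count seen by p1 after the copy has been destroyed.
long owners_after_scope() {
    std::shared_ptr<int> p1(new int(12));
    {
        std::shared_ptr<int> p3 = p1;
    }                              // p3 is gone, the int survives
    return p1.use_count();
}
```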
Your problem is this. You are inside an instance of a class. And you need to call some functions of yours that take a shared_ptr. But all you have is this, which is a regular pointer. What do you do?
You're going to get some examples that suggest enable_shared_from_this. But consider a more relevant question: "why do those functions take a shared_ptr as an argument?"
The type of pointer a function takes is indicative of what that function does with its argument. If a function takes a shared_ptr, that means that it needs to own the pointer. It needs to take shared ownership of the memory. So, look at your code and ask whether those functions truly need to take ownership of the memory. Are they storing the shared_ptr somewhere long-term (ie: in an object), or are they just using them for the duration of the function call?
If it's the latter, then the functions should take a naked pointer, not a shared_ptr. That way, they cannot claim ownership. Your interface is then self-documenting: the pointer type explains ownership.
However, it is possible that you could be calling functions that truly do need to take shared ownership. Then you need to use enable_shared_from_this. First, your class needs to be derived from enable_shared_from_this. Then, in the function:
void ClassName::LocalMethod()
{
boost::shared_ptr<ClassName> classNamePtr(shared_from_this());
//some operation with classNamePtr
return;
}
Note that there is a cost here. enable_shared_from_this puts a boost::weak_ptr in the class. But there is no virtual overhead or somesuch; it doesn't make the class virtual. enable_shared_from_this is a template, so you have to declare it like this:
class ClassName : public boost::enable_shared_from_this<ClassName>
Isn't the shared_ptr smart enough to know that the ClassName object is still in scope and not to delete it?
That's not how shared_ptr works. When you pass a pointer while constructing a shared_ptr, the shared_ptr will assume ownership of the pointee (in this case, *this). In other words, the shared_ptr assumes total control over the lifetime of the pointee by virtue of the fact that the shared_ptr now owns it. Because of this, the last shared_ptr owning the pointee will delete it.
If there will be no copies of classNamePtr outside of ClassName::LocalMethod(), you can pass a deleter that does nothing while constructing classNamePtr. Here's an example of a custom deleter being used to prevent a shared_ptr from deleting its pointee. Adapting the example to your situation:
struct null_deleter // Does nothing
{
void operator()(void const*) const {}
};
void ClassName::LocalMethod()
{
// Construct a shared_ptr to this, but make it so that it doesn't
// delete the pointee.
boost::shared_ptr<ClassName> classNamePtr(this, null_deleter());
// Some operation with classNamePtr
// The only shared_ptr here will go away as the stack unwinds,
// but because of the null deleter it won't delete this.
return;
}
You can also use enable_shared_from_this to obtain a shared_ptr from this. Note that the member function shared_from_this() only works if you have an existing shared_ptr already pointing to this.
class ClassName : public boost::enable_shared_from_this<ClassName>
{
public:
    void LocalMethod()
    {
        boost::shared_ptr<ClassName> classNamePtr = shared_from_this();
    }
};
// ...
// This must have been declared somewhere...
boost::shared_ptr<ClassName> p(new ClassName);
// before you call this:
p->LocalMethod();
This is the more appropriate, "official" method and it's much less hackish than the null deleter method.
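For reference, here is the same pattern as a self-contained sketch using the std:: equivalents rather than Boost (a substitution made here for illustration). LocalMethod is changed to return the owner count so that the effect is observable.

```cpp
#include <cassert>
#include <memory>

class ClassName : public std::enable_shared_from_this<ClassName> {
public:
    // Returns how many shared_ptrs own *this while the local one exists.
    long LocalMethod() {
        std::shared_ptr<ClassName> self = shared_from_this();
        return self.use_count();
    }
};

long demo_use_count() {
    // The object must already be owned by a shared_ptr before
    // shared_from_this() is called.
    std::shared_ptr<ClassName> p = std::make_shared<ClassName>();
    assert(p.use_count() == 1);
    return p->LocalMethod();  // counts p plus the local `self`
}
```

The local pointer joins the existing ownership group instead of creating a second one, so nothing is deleted when it goes out of scope.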
It could also be that you don't actually need to create a shared_ptr in the first place. What goes into the section commented //some operation with classNamePtr? There might be an even better way than the first two ways.