shared_ptr and weak_ptr differences - c++

I am reading Scott Meyers "Effective C++" book. It was mentioned that there are tr1::shared_ptr and tr1::weak_ptr act like built-in pointers, but they keep track of how many tr1::shared_ptrs point to an object.
This is known as reference counting. This works well in preventing resource leaks in acyclic data structures, but if two or more objects contain tr1::shared_ptrs such that a cycle is formed, the cycle may keep each other's reference count above zero, even when all external pointers to the cycle have been destroyed.
That's where tr1::weak_ptrs come in.
My question is how cyclic data structures make the reference count above zero. I kindly request an example C++ program. How is the problem solved by weak_ptrs? (again, with example please).

Let me repeat your question: "My question, how cyclic data structures makes reference count above zero, kindly request to show with example in C++ program. How the problem is solved by weak_ptrs again with example please."
The problem occurs with C++ code like this (conceptually):
class A { shared_ptr<B> b; ... };
class B { shared_ptr<A> a; ... };
shared_ptr<A> x(new A); // +1
x->b = new B; // +1
x->b->a = x; // +1
// Ref count of 'x' is 2.
// Ref count of 'x->b' is 1.
// When 'x' leaves the scope, there will be a memory leak:
// 2 is decremented to 1, and so both ref counts will be 1.
// (Memory is deallocated only when ref count drops to 0)
To answer the second part of your question: It is mathematically impossible for reference counting to deal with cycles. Therefore, a weak_ptr (which is basically just a stripped down version of shared_ptr) cannot be used to solve the cycle problem - the programmer is solving the cycle problem.
To solve it, the programmer needs to be aware of the ownership relationship among the objects, or needs to invent an ownership relationship if no such ownership exists naturally.
The above C++ code can be changed so that A owns B:
class A { shared_ptr<B> b; ... };
class B { weak_ptr<A> a; ... };
shared_ptr<A> x(new A); // +1
x->b = new B; // +1
x->b->a = x; // No +1 here
// Ref count of 'x' is 1.
// Ref count of 'x->b' is 1.
// When 'x' leaves the scope, its ref count will drop to 0.
// While destroying it, ref count of 'x->b' will drop to 0.
// So both A and B will be deallocated.
A crucial question is: Can weak_ptr be used in case the programmer cannot tell the ownership relationship and cannot establish any static ownership because of lack of privilege or lack of information?
The answer is: If ownership among objects is unclear, weak_ptr cannot help. If there is a cycle, the programmer has to find it and break it. An alternative remedy is to use a programming language with full garbage collection (such as: Java, C#, Go, Haskell), or to use a conservative (=imperfect) garbage collector which works with C/C++ (such as: Boehm GC).

A shared_ptr wraps a reference counting mechanism around a raw pointer. So for each instance of the shared_ptr the reference count is increased by one. If two share_ptr objects refer the each other they will never get deleted because they will never end up with a reference count of zero.
weak_ptr points to a shared_ptr but does not increase its reference count.This means that the underying object can still be deleted even though there is a weak_ptr reference to it.
The way that this works is that the weak_ptr can be use to create a shared_ptr for whenever one wants to use the underlying object. If however the object has already been deleted then an empty instance of a shared_ptr is returned. Since the reference count on the underlying object is not increased with a weak_ptr reference, a circular reference will not result in the underlying object not being deleted.

For future readers.
Just want to point out that explanation given by Atom is excellent, here is working code
#include <memory> // and others
using namespace std;
class B; // forward declaration
// for clarity, add explicit destructor to see that they are not called
class A { public: shared_ptr<B> b; ~A() {cout << "~A()" << endl; } };
class B { public: shared_ptr<A> a; ~B() {cout << "~B()" << endl; } };
shared_ptr<A> x(new A); //x->b share_ptr is default initialized
x->b = make_shared<B>(); // you can't do "= new B" on shared_ptr
x->b->a = x;
cout << x.use_count() << endl;

Weak pointers just "observe" the managed object; they don't "keep it alive" or affect its lifetime. Unlike shared_ptr, when the last weak_ptr goes out of scope or disappears, the pointed-to object can still exist because the weak_ptr does not affect the lifetime of the object - it has no ownership rights. The weak_ptr can be used to determine whether the object exists, and to provide a shared_ptr that can be used to refer to it.
The definition of weak_ptr is designed to make it relatively foolproof, so as a result there is very little you can do directly with a weak_ptr. For example, you can't dereference it; neither operator* nor operator-> is defined
for a weak_ptr. You can't access the pointer to the object with it - there is no get() function. There is a comparison function defined so that you can store weak_ptrs in an ordered container, but that's all.

All the above answer are WRONG. weak_ptr is NOT used to break cyclic references, they have another purpose.
Basically, if all shared_ptr(s) were created by make_shared() or allocate_shared() calls, you will NEVER need weak_ptr if you have no resource other than memory to manage. These functions create the shared_ptr reference counter object with the object itself, and the memory will be freed at the same time.
The only difference between weak_ptr and shared_ptr is that the weak_ptr allows the reference counter object to be kept after the actual object was freed. As a result, if you keep a lot of shared_ptr in a std::set the actual objects will occupy a lot of memory if they are big enough. This problem can be solved by using weak_ptr instead. In this case, you have to ensure the weak_ptr stored in the container is not expired before using it.

Related

A shared pointer which is conceptually owned by one, unique, object

What is the canonical way to deal with shared pointers in C++ when there is a clear case to argue that "one, unique object owns the pointer"?
For example, if a shared_ptr is a member of a particular class, which is responsible for initializing the pointer, then it could be argued that this class should also have the final say on when the pointer is deleted.
In other words, it may be the case that when the owning class goes out of scope (or is itself delete'd that any remaining references to the pointer no longer make sense. This may be due to related variables which were members of the destroyed class.
Here is a sketch of an example
class Owner
{
Owner()
{
p.reset(malloc_object(arguments), free_object);
}
std::shared_ptr<type> get() { return p; }
// seems strange because now something somewhere
// else in the code can hold up the deletion of p
// unless a manual destructor is written
~Owner()
{
p.reset(nullptr); // arduous
}
std::shared_ptr<type> p;
int a, b, c; // some member variables which are logically attached to p
// such that neither a, b, c or p make sense without each other
}
One cannot use a unique_ptr as this would not permit the pointer to be returned by the get function, unless a raw pointer is returned. (Is this is an acceptable solution?)
A unique_ptr in combination with returning weak_ptr from the get function might make sense. But this is not valid C++. weak_ptr is used in conjunction with shared_ptr.
A shared_ptr with the get function returning weak_ptr is better than a raw pointer becuase in order to use the weak pointer, it has to be converted to a shared pointer. This will fail if the reference count is already zero and the object has been deleted.
However using a shared_ptr defeats the point, since ideally a unique_ptr would be chosen because there can then only be one thing which "owns" the pointed to data.
I hope the question is clear enough, it was quite difficult to explain since I can't copy the code I am working with.
It is ok to return the shared_ptr there, what will happen is that the pointer will still be held somewhere outside the Owner class. Since your doing p.reset(nullptr); at the destructor, whoever was holding that shared_ptr will now be holding a pointer to null.
Using weak_ptr with shared_ptr is also a good solution, the problem is the same which is the fact that the best class to represent p is unique_ptr as you described.
The path I would choose is to hold a unique_ptr which seems more adequate and to implement the get() function like this:
type* get() { return p.get(); }
The behaviour is the same and the code is clearer since having p as unique_ptr will give clarity on how it should be used.

Force deleting std::shared_ptr in C++

According to the first answer in this article: Explicitly deleting a shared_ptr
Is it possible to force delete a std::shared_ptr and the object it manages like below code?
do {
ptr.reset();
} while (!ptr.unique());
ptr.reset(); // To eliminate the last reference
Technically, this should try calling std::shared_ptr::reset if the pointer has more than 1 reference count, unless it reaches to one. Any thoughts on this?
This code doesn't make any sense.
Once you reset ptr, it doesn't manage an object anymore. If ptr was the only shared_ptr sharing ownership, then you're done. If it wasn't... well, you don't have access to all those other ones. Calling reset() on a disengaged shared_ptr is effectively a noop - there's nothing more to reset.
Imagine a simple scenario:
std::shared_ptr<int> a = std::make_shared<int>(42);
std::shared_ptr<int> b = a; // a and b are sharing ownership of an int
do {
a.reset();
} while (!a.unique());
The only way to reset b is to reset b - this code will reset a only, it cannot possibly reach b.
Also note that unique() was deprecated in C++17 and is removed entirely in C++20. But if even if you use use_count() instead, once you do a.reset(), a.use_count() will be equal to 0 because a no longer points to an object.
No this is not possible (or desirable). The point of a shared pointer is that if you have one you can guarantee the object it points to (if any) will not disappear from under you until (at least) you have finished with it.
Calling ptr.reset() will only reduce the reference count by 1 - being your shared pointer's reference. It will never affect other references from other shared pointers that are sharing your object.

Why would I want to use a smart pointer in this situation?

I never used any kind of smart pointer, but I keep reading about them almost everywhere when the topic is pointers. I do understand that there are situations where smart pointers are much nicer to work with than raw pointers, because to some extend they manage ownership of the pointer. However, I still do not know, where is the line between "I do not needing smart pointers for that" and "this is a case for smart pointers".
Lets say, I have the following situation:
class A {
public:
double get1(){return 1;}
double get2(){return 2;}
};
class SomeUtilityClass {
public:
SomeUtilityClass(A* a) : a(a) {}
double getResult(){return a->get1() + a->get2();}
void setA(A* a){a = a;}
private:
A* a;
};
int main(int argc, char** argv) {
A a;
SomeUtilityClass u(&a);
std::cout << u.getResult() << std::endl;
A a2;
u.setA(&a2);
std::cout << u.getResult() << std::endl;
return 0;
}
This is of course an oversimplified example. What I mean is that SomeUtilityClass is not supposed to "own" an instance of A (because it is just a utility class), thus it just holds a pointer.
Concerning the pointer, the only thing that I am aware of that could go wrong is:
SomeUtilityClass can be instantiated with a null pointer
The object pointed to may be deleted/go out of scope, without the SomeUtilityClass noticing it
How could a smart pointer help to avoid this problem? What other benefits I would get by using a smart pointer in this case?
PS: I know that there are several question on smart pointers (e.g. this one). However, I would appreciate, if you could tell me about the impact on this particular example.
This depends on how the parameter is created and stored. If you don't own the memory and it could be either statically or dynamically allocated, a raw pointer is a perfectly reasonable solution -- especially if you need to support swapping of the data as in your example. Another option would be to use std::reference_wrapper, which would get rid of your nullptr issue whilst keeping the same semantics.
If you are holding a pointer to some shared resource (i.e. stored in a std::shared_ptr somewhere) and want to be able to check if it has been deleted or not, you could hold a std::weak_ptr.
For the purposes of this answer I'm redefining setA as:
void setA(A* new_a){a = new_a;}
Consider:
// Using your SomeUtilityClass
int main() {
A a;
SomeUtilityClass u(&a);
// We define a new scope, just because:
{
A b;
u.setA(&b);
}
std::cout << u.getResult() << '\n';
return 0;
}
After the scope is finished, SomeUtilityClass has a dangling pointer and getResult() invokes Undefined Behaviour. Note that this can't be solved with a reference: You would still get a dangling one.
Now consider the version using a smart pointer:
class SomeUtilityClass {
public:
SomeUtilityClass(std::shared_ptr<A>& a) : a{a} {}
double getResult(){return a->get1() + a->get2();}
void setA(std::shared_ptr<A>& new_a){a = new_a;}
private:
std::shared_ptr<A> a;
};
int main() {
std::shared_ptr<A> a{new A};
SomeUtilityClass u{a};
// We define a new scope, just because:
{
std::shared_ptr<A> b{new A};
u.setA(b);
}
std::cout << u.getResult() << '\n';
return 0;
}
Because you have shared ownership, there's no way to get a dangling pointer. The memory pointed to by b will be deleted as usual, but only after u is destroyed(or its pointer is changed).
IMHO, in most cases you should be using smart pointers (Even when at first it doesn't seem to make much sense). It makes maintenance much easier. Use raw pointers only in specific code that actually needs them, and encapsulate/isolate this code as much as possible.
If SomeUtilityClass does not own the member variable a, then a smart pointer does not make sense.
You might consider a reference member, which would remove the problems of a null pointer.
The default way of expressing not-owning pointer in C++ is weak_ptr. To use weak_ptr you need to use shared_ptr for ownership, so in your example you would use
shared_ptr<A> owner(...)
instead of
A a
Then as the private pointer member of your SomeUtilityClass you use weak pointer:
weak_ptr<A> w;
and initialise it with shared_ptr:
SomeUtilityClass(shared_ptr<A> o) : w(o) {}
however, you cannot use weak_ptr directly, since the shared_ptr could go out of scope and your weak pointer can no longer point to anything. Before use you need to lock it:
shared_ptr<A> locked = w.lock();
The locked pointer will be empty if the owning pointer no longer manages an object, since e.g. it went out of scope. If it is not empty, you may use it and then it will go out of scope automatically releasing the lock the object.
Both shared_ptr and weak_ptr are available in standard library in C++11, and in Boost for older compilers.
There are different types of smart pointers. In your case, it is clear that a smart pointer is not really needed, but it may still provide some benefits.
SomeUtilityClass can be instantiated with a null pointer
This one is probably best solved with a check in the constructor, throwing an exception or indicating an error in some other way in the case when you get a NULL pointer as the argument. I can hardly imagine how a smart pointer would help, unless you use a specific smart pointer class that doesn't accept NULLs, so it does the check for you already.
The object pointed to may be deleted/go out of scope, without the
SomeUtilityClass noticing it
This one can actually be resolved with a special type of smart pointers, but then it is needed that the object being pointed to somehow supports notification of destruction. One such example is the QPointer class in the Qt library, which can only point to QObject instances, which notify it when deleted, so the smart pointer automatically becomes NULL when the object is deleted. There are some problems with this approach, though:
You need to check for NULLs every time you access the object through the smart pointer.
If a smart pointer points to an instance of a class, say MyClass, extending the class performing the deletion notification (QObject in the Qt case), you get strange results: it's the destructor of QObject that notifies the smart pointer, so it is possible that you access it when the MyClass destructor already began its dirty work, so the object is partially destructed, but the pointer is not NULL yet because the destruction is still in progress.

will weak_pointer.lock() increment the ref count of original shared_ptr which was used to create weak_ptr

As per my understanding, weak pointer is used to cyclic dependency problem occurs if we use all shared_ptr objects and if there is a cyclic dependency. Weak pointers are used to break the cycles. weak pointers achieves this by using lock() which will create shared pointer.
class A { shared_ptr<B> b; ... };
class B { weak_ptr<A> a; ... };
shared_ptr<A> x(new A); // +1
x->b = new B; // +1
x->b->a = x; // No +1 here
But now assume that I created lock calling x->b->a.lock(), so ref count of x will become 2. and if x leaves the scope, still there will be memory leak right? Because I created a shared pointer using lock() and ref count became 2. Please let me know whether my understanding is correct or not.
There are two distinct reference counts involved for a shared_ptr shared object:
The number of references to the object, i.e. shared_ptr instances.
The number of references to the control block, i.e. shared_ptr and weak_ptr instances.
A weak_ptr contributes only to the latter count. When all shared_ptr instances have been destroyed, the object deleter is called, which usually is the default one that destroys the object. The control block still exists if there are weak pointers. When also all weak pointers have been destroyed, the control block is destroyed.
So (ignoring possible optimization of caching object pointer directly in each shared_ptr instance), in your case you have x pointing (hidden for you) to the control block, which has a pointer to the A instance. And you have the b member of that instance pointing to a second control block, which has a pointer to a B instance. Finally that instance has a pointer to the control block that x points to, which is circular yes, but not a circularity of ownership.

Lifetime management of encapsulated objects

What is the best approach to encapsulate objects and manage their lifetime? Example: I have a class A, that contains an object of type B and is solely responsible for it.
Solution 1, clone b object to ensure that only A is able to clean it up.
class A
{
B *b;
public:
A(B &b)
{
this->b = b.clone();
}
~A()
{
delete b; // safe
}
};
Solution 2, directly use the passed object, we risk a potential double free here.
class A
{
B *b;
public:
A(B *b)
{
this->b = b;
}
~A()
{
delete b; // unsafe
}
};
In my actual case, solution #2 would fit best. However I wonder if this is considered bad code because someone might not know about the behavior of A, even if it's documented. I can think of these scenarios:
B *myB = new B();
A *myA = new A(myB);
delete myB; // myA contains a wild pointer now
Or,
B *myB = new B();
A *firstA = new A(myB);
A *secondA = new A(myB); // bug! double assignment
delete firstA; // deletes myB, secondA contains a wild pointer now
delete secondA; // deletes myB again, double free
Can I just ignore these issues if I properly document the behavior of A? Is it enough to declare the responsibility and leave it up to the others to read the docs? How is this managed in your codebase?
I never delete anything myself unless I really have to. That leads to errors.
Smart pointers are your friend. std::auto_ptr<> is your friend when one object owns another and is responsible for deleting it when going out of scope. boost::shared_ptr<> (or, now, std::tr1::shared_ptr<>) is your friend when there's potentially more than one object attached to another object, and you want the object deleted when there's no more references to it.
So, either use your solution 1 with auto_ptr, or your solution 2 with shared_ptr.
You should define your object so that the ownership semantics are, as much as possible, defined by the interface. As David Thornley pointed out, std::auto_ptr is the smart pointer of choice to indicate transfer of ownership. Define your class like so:
class A
{
std::auto_ptr<B> b;
public:
A(std::auto_ptr<B> b)
{
this->b = b;
}
// Don't need to define this for this scenario
//~A()
//{
// delete b; // safe
//}
};
Since the contract of std::auto_ptr is that assignment = transfer of ownership, your constructor now states implicitly that an A object has ownership of the pointer to B it's passed. In fact, if a client tries to do something with a std::auto_ptr<B> that they used to construct an A after the construction, the operation will fail, as the pointer they hold will be invalid.
If you are writing code that someone else will be using later, these issues must be addressed. In this case I would go for simple reference counting (maybe with smart pointers). Consider the following example:
When an instance of the encapsulating class is assigned an object B, it calls a method to increase object's B reference counter. When the encapsulating class is destroyed, it doesn't delete B, but instead calls a method do decrease reference count. When the counter reaches zero, object B is destroyed (or destroys itself for that matter). This way multiple instances of encapsulating class can work with a single instance of object B.
More on the subject: Reference Counting.
If your object is solely responsible for the passed object then deleting it should be safe. If it is not safe than the assertion that you are solely responsible is false. So which is it? If you're interface is documented that you WILL delete the inbound object, then it is the caller responsibility to make sure you receive an object that must be deleted by you.
If you're cloning A, and both A1 and A2 retain references to B, then B's lifetime is not being controlled entirely by A. It's being shared among the various A. Cloning B ensures a one-to-one relationship between As and Bs, which will be easy to ensure lifetime consistency.
If cloning B is not an option, then you need to discard the concept that A is responsible for B's lifetime. Either another object will need to manage the various B, or you'll need to implement a method like reference counting.
For reference, when I think of the term 'Clone', it implies a deep copy, which would clone B as well. I'd expect the two As to be completely detached from each other after a clone.
I don't clone things unnecessarily or "just to be safe".
Instead I know whose responsibility it is to delete something: either via documentation, or by smart pointers ... for example, if I have a create function which instantiates something and returns a pointer to it and doesn't delete it, so that it's unclear where and by whome that thing is ever supposed to be deleted, then instead of create's returning a naked pointer I might define create's return type as returning the pointer contained within some kind of smart pointer.