Unexpected behavior of smart pointers


I am trying to get the managed object of an aliased shared pointer. My idea was to use a weak pointer. Since weak pointers don't yield any objects themselves, I thought that creating a weak pointer from a shared pointer would make the weak pointer forget the stored object of the aliased shared pointer, so locking the weak pointer would yield a shared pointer whose stored and managed pointers are equal. But the results I got confuse me. Does the weak pointer remember which shared pointer it was constructed from? And is there a way to get the managed object of an aliased shared pointer?
template<class T> struct Deleter {
void operator()(T* p) const {};
};
Deleter<T> d {};
T t1 {};
T* p1 = &t1;
T t2 {};
T* p2 = &t2;
auto sp1 = std::shared_ptr<T>(p1,d);
auto sp2 = std::shared_ptr<T>(sp1,p2);
auto wp = std::weak_ptr<T>(sp2);
std::cout << sp1.get() << " " << sp2.get() << " " << wp.lock().get() << std::endl;
produces 0x7fff5798c958 0x7fff5798c948 0x7fff5798c948

Does the weak pointer remember which shared pointer it was constructed from?
It remembers the raw pointer which was stored by the shared_ptr from which it was constructed. I think it would be very confusing if things worked the way you are expecting. For example, if a function received a shared_ptr<T> as an argument, it has no idea whether it is aliasing or not. And if that function then wanted to take a weak_ptr from that shared_ptr, it would, I think, be very surprising behavior to find that actualizing that weak_ptr to a shared_ptr gave a completely different object than the original shared_ptr from which it was obtained.
Also note that the aliasing shared_ptr can store a different type than the shared_ptr which it is aliasing. The types don't even have to be convertible to/from each other. And a weak_ptr which is constructed from an aliasing shared_ptr gets its stored type from the shared_ptr it was constructed from (or an implicitly convertible type).
The most common use case for an aliasing shared_ptr, I would suspect (even though I've never had need for one), would be to pass around a member of a shared object, and to be sure it stays alive even if the rest of the owned object is no longer needed. So the scenario which you've constructed, where the type of the aliasing shared_ptr is the same as the original, seems like it would be uncommon.
And is there a way to get the managed object of an aliased shared pointer?
As far as I'm aware, no. Perhaps there should be. If you think this feature would be useful, propose it. https://groups.google.com/a/isocpp.org/forum/?fromgroups#!forum/std-proposals
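The behavior described in this answer can be checked with a small sketch. The Outer type and the function name below are made up for illustration and are not from the question. It suggests that a weak_ptr built from an aliasing shared_ptr locks back to the stored (aliased) pointer, while owner_before reports that it still shares ownership with the original shared_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type for illustration; not part of the original question.
struct Outer {
    int inner = 42;
};

// Returns true when a weak_ptr built from an aliasing shared_ptr locks back
// to the aliased (stored) pointer and still shares ownership with the owner.
bool weak_keeps_stored_pointer() {
    auto owner = std::make_shared<Outer>();
    std::shared_ptr<int> alias(owner, &owner->inner);  // aliasing constructor
    std::weak_ptr<int> weak(alias);

    std::shared_ptr<int> locked = weak.lock();
    assert(locked.get() == alias.get());

    // Neither pointer orders before the other: they share one control block.
    bool same_owner = !locked.owner_before(owner) && !owner.owner_before(locked);
    return same_owner && locked.get() == &owner->inner;
}
```

Calling weak_keeps_stored_pointer() from main should return true on a conforming implementation.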

Related

A shared pointer which is conceptually owned by one, unique, object

What is the canonical way to deal with shared pointers in C++ when there is a clear case to argue that "one, unique object owns the pointer"?
For example, if a shared_ptr is a member of a particular class, which is responsible for initializing the pointer, then it could be argued that this class should also have the final say on when the pointer is deleted.
In other words, it may be the case that when the owning class goes out of scope (or is itself deleted), any remaining references to the pointer no longer make sense. This may be due to related variables which were members of the destroyed class.
Here is a sketch of an example
class Owner
{
Owner()
{
p.reset(malloc_object(arguments), free_object);
}
std::shared_ptr<type> get() { return p; }
// seems strange because now something somewhere
// else in the code can hold up the deletion of p
// unless a manual destructor is written
~Owner()
{
p.reset(nullptr); // arduous
}
std::shared_ptr<type> p;
int a, b, c; // some member variables which are logically attached to p
// such that neither a, b, c or p make sense without each other
}
One cannot use a unique_ptr as this would not permit the pointer to be returned by the get function, unless a raw pointer is returned. (Is this an acceptable solution?)
A unique_ptr in combination with returning weak_ptr from the get function might make sense. But this is not valid C++. weak_ptr is used in conjunction with shared_ptr.
A shared_ptr with the get function returning weak_ptr is better than a raw pointer because in order to use the weak pointer, it has to be converted to a shared pointer. This will fail if the reference count is already zero and the object has been deleted.
However using a shared_ptr defeats the point, since ideally a unique_ptr would be chosen because there can then only be one thing which "owns" the pointed to data.
I hope the question is clear enough, it was quite difficult to explain since I can't copy the code I am working with.

It is OK to return the shared_ptr there, but the pointer can then still be held somewhere outside the Owner class. Calling p.reset(nullptr); in the destructor only releases Owner's own reference: whoever else holds a copy of that shared_ptr still points to a live object, which is destroyed only when the last copy goes away.
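A small sketch to check what an outside holder observes once Owner is destroyed. Resource and the function name are assumptions made for the example, and Owner is a simplified stand-in for the class in the question. With std::shared_ptr, the outside copy is expected to stay valid and become the sole owner.

```cpp
#include <cassert>
#include <memory>

// Assumed payload type; the question does not name one.
struct Resource { int value = 5; };

// Simplified stand-in for the question's Owner.
class Owner {
public:
    Owner() : p(std::make_shared<Resource>()) {}
    ~Owner() { p.reset(); }  // drops only Owner's own reference
    std::shared_ptr<Resource> get() const { return p; }
private:
    std::shared_ptr<Resource> p;
};

// Returns the value read through an outside copy after Owner is gone.
int value_after_owner_destroyed() {
    std::shared_ptr<Resource> outside;
    {
        Owner owner;
        outside = owner.get();
    }  // Owner is destroyed here
    assert(outside != nullptr);        // the outside copy is not nulled
    assert(outside.use_count() == 1);  // it is now the sole owner
    return outside->value;
}
```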
Using weak_ptr with shared_ptr is also a good solution, but it has the same problem: as you described, the class that best represents p is unique_ptr.
The path I would choose is to hold a unique_ptr which seems more adequate and to implement the get() function like this:
type* get() { return p.get(); }
The behaviour is the same and the code is clearer since having p as unique_ptr will give clarity on how it should be used.
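Here is a minimal sketch of that approach. Resource, make_resource and free_resource are hypothetical stand-ins for the question's type, malloc_object and free_object. The unique_ptr carries the custom deleter, and get() hands out a non-owning raw pointer.

```cpp
#include <cassert>
#include <memory>

struct Resource { int value = 5; };  // assumed payload type

// Hypothetical stand-ins for the question's malloc_object / free_object.
Resource* make_resource() { return new Resource; }
void free_resource(Resource* r) { delete r; }

class Owner {
public:
    Owner() : p(make_resource(), &free_resource) {}
    // Non-owning access: callers must not delete it or keep it past Owner.
    Resource* get() const { return p.get(); }
private:
    // Sole owner; free_resource runs when Owner is destroyed.
    std::unique_ptr<Resource, void (*)(Resource*)> p;
};

int read_through_raw_pointer() {
    Owner owner;
    Resource* raw = owner.get();
    assert(raw != nullptr);
    return raw->value;
}
```

The trade-off is that nothing stops a caller from keeping the raw pointer after Owner is gone, so this only fits when callers are known not to outlive the owner.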

Design of (shared_ptr + weak_ptr) compatible with raw pointers

Preamble
In C++11 there is the std::shared_ptr + std::weak_ptr combo. Despite being very useful, it has a nasty issue: you cannot easily construct a shared_ptr from a raw pointer. As a result of this flaw, such smart pointers usually become "viral": people start to completely avoid raw pointers and references, and use shared_ptr and weak_ptr exclusively all over the code, because there is no way to pass a raw reference into a function expecting a smart pointer.
On the other hand, there is boost::intrusive_ptr. It is equivalent to std::shared_ptr and can easily be constructed from raw pointer, because reference counter is contained within the object. Unfortunately, there is no weak_ptr companion to it, so there is no way to have non-owning references which you could check for being invalid. In fact, some believe that weak companion for intrusive_ptr is impossible.
Now, there is std::enable_shared_from_this, which embeds a weak_ptr directly into your class, so that you can construct a shared_ptr from a pointer to the object. But there is a small limitation (at least one shared_ptr must exist), and it still does not allow the obvious syntax: std::shared_ptr(pObject).
Also, there is a std::make_shared, which allocates reference counters and the user's object in a single memory chunk. This is very close to the concept of intrusive_ptr, but the user's object can be destroyed independently of the reference counting block. Also, this concept has an inevitable drawback: the whole memory block (which can be large) is deallocated only when all weak_ptr-s are gone.
Question
The main question is: how to create a pair of shared_ptr/weak_ptr, which would have the benefits of both std::shared_ptr/std::weak_ptr and boost::intrusive_ptr?
In particular:
1. shared_ptr models shared ownership over the object, i.e. the object is destroyed exactly when the last shared_ptr pointing to it is destroyed.
2. weak_ptr does not model ownership over the object, and it can be used to solve the circular dependency problem.
3. weak_ptr can be checked for being valid: it is valid when there exists a shared_ptr pointing to the object.
4. shared_ptr can be constructed from a valid weak_ptr.
5. weak_ptr can be constructed from a valid raw pointer to the object. Raw pointer is valid if there exists at least one weak_ptr still pointing to that object. Constructing weak_ptr from invalid pointer results in undefined behavior.
6. The whole smart pointer system should be cast-friendly, like the above-mentioned existing systems.
It is OK for the solution to be intrusive, i.e. to ask the user to inherit once from a given base class. Holding the object's memory when the object is already destroyed is also OK. Thread safety is very good to have (unless it is too inefficient), but solutions without it are also interesting. It is OK to allocate several chunks of memory per object, though having one memory chunk per object is preferred.

Points 1-4 and 6 are already modelled by shared_ptr/weak_ptr.
Point 5 makes no sense. If lifetime is shared, then there is no valid object if a weak_ptr exists but a shared_ptr does not. Any raw pointer would be an invalid pointer. The lifetime of the object has ended. The object is no more.
A weak_ptr does not keep the object alive, it keeps the control block alive. A shared_ptr keeps both the control block and the controlled object alive.
If you don't want to "waste" memory by combining the control block with the controlled object, don't call make_shared.
If you don't want shared_ptr<X> to be passed virally into functions, don't pass it. Pass a reference or const reference to the X. You only need to mention shared_ptr in the argument list if you intend on managing the lifetime in the function. If you simply want to perform operations on what the shared_ptr is pointing at, pass *p or *p.get() and accept a [const] reference.
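A short sketch of that last point, with a made-up type X and made-up function names. The functions take plain references, so they work the same for an object held by shared_ptr and for a local object.

```cpp
#include <cassert>
#include <memory>

struct X { int value = 0; };  // hypothetical payload type

// These only operate on an X and do not care how its lifetime is managed.
void increment(X& x) { ++x.value; }
int read(const X& x) { return x.value; }

int call_with_shared_and_local() {
    auto shared = std::make_shared<X>();
    assert(shared != nullptr);
    X local;

    increment(*shared);  // dereference at the call site
    increment(local);    // the same function works for a stack object
    increment(local);

    return read(*shared) * 10 + read(local);  // 1 * 10 + 2
}
```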

Override new on the object to allocate a control block before the instance of the object.
This is pseudo-intrusive. Conversion to and from a raw pointer is possible because of the known offset. The object can be destroyed without a problem.
The reference counting block holds a strong and weak count, and a function object to destroy the object.
Downside: it doesn't work polymorphically very well.
Imagine we have:
struct A {int x;};
struct B {int y;};
struct C:B,A {int z;};
then we allocate a C this way.
C* c = new C{};
and store it in an A*:
A* a = c;
We then pass this to a smart-pointer-to-A. It expects the control block to be immediately before the address a points to, but because B exists before A in the inheritance graph of C, there is an instance of B there instead.
That seems less than ideal.
So we cheat. We again replace new. But it instead registers the pointer value and size with a registry somewhere. There we store the weak/strong pointer counts (etc).
We rely on a linear address space and class layout. When we have a pointer p, we simply look for whose range of address it is in. Then we know the strong/weak counts.
This one has horrible performance in general, especially multi-threaded, and relies upon undefined behavior (pointer comparisons for pointers not pointing to the same object, or less order in such cases).
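The multiple-inheritance problem described above can be observed with a small sketch. The function name is made up, and the exact layout is implementation-defined, so the non-zero offset is an assumption that holds on common ABIs rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; };
struct B { int y; };
struct C : B, A { int z; };

// Byte distance between the start of a C object and its A subobject.
std::ptrdiff_t offset_of_A_in_C() {
    C c{};
    A* a = &c;  // derived-to-base conversion may adjust the address
    assert(a != nullptr);
    return reinterpret_cast<const char*>(a) -
           reinterpret_cast<const char*>(&c);
}
```

Where the offset is non-zero, a control block placed immediately before the C object is not immediately before the address held in an A*, which is exactly why the fixed-offset scheme breaks down.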

In theory, it is possible to implement intrusive version of shared_ptr and weak_ptr, but it might be unsafe due to C++ language limitations.
Two reference counters (strong and weak) are stored in the base class RefCounters of the managed object. Any smart pointer (either shared or weak) contains a single pointer to the managed object. Shared pointers own the object itself, and shared + weak pointers together own the memory block of the object. So when the last shared pointer is gone, object is destroyed, but its memory block remains alive as long as there is at least one weak pointer to it. Casting pointers works as expected, given that all the involved types are still inherited from the RefCounted class.
Unfortunately, in C++ it is usually forbidden to work with members of an object after the object is destroyed, although most implementations should allow doing that without problems. More details about the legality of the approach can be found in this question.
Here is the base class required for the smart pointers to work:
struct RefCounters {
size_t strong_cnt;
size_t weak_cnt;
};
struct RefCounted : public RefCounters {
virtual ~RefCounted() {}
};
Here is a part of shared pointer definition (shows how object is destroyed and memory chunk is deallocated):
template<class T> class SharedPtr {
static_assert(std::is_base_of<RefCounted, T>::value);
T *ptr;
RefCounters *Counter() const {
RefCounters *base = ptr;
return base;
}
void DestroyObject() {
ptr->~T();
}
void DeallocateMemory() {
RefCounted *base = ptr;
operator delete(base);
}
public:
~SharedPtr() {
if (ptr) {
if (--Counter()->strong_cnt == 0) {
DestroyObject();
if (Counter()->weak_cnt == 0)
DeallocateMemory();
}
}
}
...
};
Full code with sample is available here.

Smart pointers ownership semantics and equality

I have a couple of questions about smart pointers that I never gave much thought to before.
What does it mean to own an object, to point to an object, and to manage an object in the world of smart pointers? Earlier I thought that whoever owns the object also points to it and manages it. Now I know that a smart pointer can own one object but point to another (via the aliasing constructor). Here I found a really good explanation of what owning an object means -> http://www.umich.edu/~eecs381/handouts/C++11_smart_ptrs.pdf , but I still can't tell these three terms apart.
If the pointer owns one object but points to another, which object does it manage? The one it owns, the one it points to, or both? What's the point of owning an object but not pointing to it?
When are two smart pointers equal? Can two pointers own the same object and still be different at the same time? I'm not interested in their value equality, only in ownership.
Why is ownership order important (besides using the pointers as keys in containers)? I guess this is relevant only for shared_ptr.
Everything began with trying to understand owner_before; now I'm more confused than before I began exploring this topic. :(

I think all of your confusion comes from the "aliasing constructor":
template <typename U>
shared_ptr(const shared_ptr<U>& x, element_type* p)
What's the use of this thing? Well, it's rarely used, but what it does is to "share" ownership of some object x but when you dereference it you will get p instead. That's all. It will never delete p or do anything else to it.
It might be useful if you have something like this:
struct Foo {
Bar bar;
};
struct Baz {
Baz(shared_ptr<Bar> bar) : m_bar(bar) {}
shared_ptr<Bar> m_bar;
};
int main()
{
auto foo = make_shared<Foo>();
Baz baz(shared_ptr<Bar>(foo, &foo->bar));
}
Now baz gets to manage the lifetime of foo without knowing that's what it's doing--it only cares that it manages the lifetime of a Bar, but since our bar is part of foo, we can't destroy foo without destroying bar, so we use the aliasing constructor.
But actually, we don't do that, because this use case is very rare.
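For completeness, here is a self-contained sketch of the example above. The destroyed flag and the function name are additions made only to observe when Foo is destroyed. It suggests that the aliasing pointer held by Baz keeps the whole Foo alive after the original handle is dropped.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Bar { int value = 7; };

struct Foo {
    explicit Foo(bool* flag) : destroyed(flag) {}
    ~Foo() { *destroyed = true; }
    Bar bar;
    bool* destroyed;  // demo only: set when this Foo is destroyed
};

struct Baz {
    explicit Baz(std::shared_ptr<Bar> bar) : m_bar(std::move(bar)) {}
    std::shared_ptr<Bar> m_bar;
};

bool alias_keeps_foo_alive() {
    bool foo_destroyed = false;
    {
        auto foo = std::make_shared<Foo>(&foo_destroyed);
        Baz baz(std::shared_ptr<Bar>(foo, &foo->bar));  // aliasing constructor
        foo.reset();                      // drop the original handle
        if (foo_destroyed) return false;  // Foo must still be alive via baz
        assert(baz.m_bar->value == 7);
    }  // baz is destroyed here, releasing the last owner of Foo
    return foo_destroyed;
}
```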

Old "raw" pointer:
someType* pR = new someType(param1, param2); //pR is a pointer
MyOwner.TakeOwnershipOf(pR); // Now MyOwner is the owner, **the one who ONLY should call 'delete pR;'**
Smart pointer:
std::shared_ptr<someType> sp = std::make_shared<someType>(param1, param2);
"sp" is now the owner (there's no "pR" in the code, but internally there is an equivalent raw pointer). You don't need to call "delete pR": sp is an object that internally stores the raw pointer and deletes the pointee when it is no longer needed.
sp is an object. "sp->any" behaves exactly like "pR->any". This may mislead you into thinking that sp is itself a pointer. It is not; it just happens that "->" is overloaded.
More overloaded:
sp.get() is the same as pR. *sp is the same as *pR
sp1 == sp2 is the same as pR1 == pR2 and the same as sp1.get()==sp2.get()
The good thing about shared_ptr shows when, for example, you need to store the same pointer in several containers. With raw pointers, if one container is destroyed and deletes the object, the copies stored in the other containers become invalid, and you would have to inspect those containers yourself. shared_ptr takes charge of this: you can safely destroy any copy of sp, because pR is only deleted when the last container holding a copy of sp lets go of it.

Ownership
Your confusion is because ownership is not something which the compiler can verify; you, as a programmer, need to deduce it.
We can say any object p owns q if the existence of p guarantees the existence of q (preferably without memory leaks).
The simple case is direct ownership, where deallocating p would also deallocate q, e.g. if q is a member of p, or q is explicitly deallocated with delete in p's destructor.
Smart pointers make this obvious to people. If q is stored in a std::unique_ptr member of p, we know that p owns q. You don't need to search around for the (possibly missing or duplicated) delete statement.
Ownership is also transitive, if p owns q and q owns r, then p must own r.
Aliasing
If p directly owns q, and we want to create a shared_ptr that owns q, then it must also own p. Otherwise, if p is destroyed, then q will be too, despite the existence of our shared pointer.
This is what the aliasing constructor for std::shared_ptr does (demonstrated in John's answer).
It extends q's lifetime by extending p's lifetime, so we have a pointer to q which actually owns p. We are asserting to the compiler that p in fact owns q, so the shared ptr transitively owns q.
If p doesn't own q, then your program will compile, but it is broken, just like if you manually call delete twice.
Comparisons
For standard library smart pointers, comparisons are passed to the raw pointer. So smart pointers are equal if they dereference to the same object, and comparisons are about memory location. There aren't any defined behaviors specifying memory locations, so there isn't much you can use it for other than storing in a map or set.
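Since the question mentions owner_before, here is a rough sketch contrasting the two kinds of comparison. Pair and the function name are made up for the example. Two aliasing pointers into one object compare unequal with operator== yet are equivalent under ownership ordering, so a set keyed with std::owner_less keeps only one of them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

struct Pair { int first = 1; int second = 2; };  // hypothetical type

// Counts distinct owners among pointers that alias members of one object.
std::size_t distinct_owners() {
    auto whole = std::make_shared<Pair>();
    std::shared_ptr<int> a(whole, &whole->first);
    std::shared_ptr<int> b(whole, &whole->second);

    assert(a != b);  // operator== compares stored pointers: different members
    assert(!a.owner_before(b) && !b.owner_before(a));  // same control block

    std::set<std::shared_ptr<int>, std::owner_less<std::shared_ptr<int>>> owners;
    owners.insert(a);
    owners.insert(b);  // equivalent to a under owner_less, so not inserted
    return owners.size();
}
```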

Why would I want to use a smart pointer in this situation?

I have never used any kind of smart pointer, but I keep reading about them almost everywhere when the topic is pointers. I do understand that there are situations where smart pointers are much nicer to work with than raw pointers, because to some extent they manage ownership of the pointer. However, I still do not know where the line is between "I do not need smart pointers for that" and "this is a case for smart pointers".
Let's say I have the following situation:
class A {
public:
double get1(){return 1;}
double get2(){return 2;}
};
class SomeUtilityClass {
public:
SomeUtilityClass(A* a) : a(a) {}
double getResult(){return a->get1() + a->get2();}
void setA(A* a){a = a;}
private:
A* a;
};
int main(int argc, char** argv) {
A a;
SomeUtilityClass u(&a);
std::cout << u.getResult() << std::endl;
A a2;
u.setA(&a2);
std::cout << u.getResult() << std::endl;
return 0;
}
This is of course an oversimplified example. What I mean is that SomeUtilityClass is not supposed to "own" an instance of A (because it is just a utility class), thus it just holds a pointer.
Concerning the pointer, the only thing that I am aware of that could go wrong is:
SomeUtilityClass can be instantiated with a null pointer
The object pointed to may be deleted/go out of scope, without the SomeUtilityClass noticing it
How could a smart pointer help to avoid this problem? What other benefits I would get by using a smart pointer in this case?
PS: I know that there are several question on smart pointers (e.g. this one). However, I would appreciate, if you could tell me about the impact on this particular example.

This depends on how the parameter is created and stored. If you don't own the memory and it could be either statically or dynamically allocated, a raw pointer is a perfectly reasonable solution -- especially if you need to support swapping of the data as in your example. Another option would be to use std::reference_wrapper, which would get rid of your nullptr issue whilst keeping the same semantics.
If you are holding a pointer to some shared resource (i.e. stored in a std::shared_ptr somewhere) and want to be able to check if it has been deleted or not, you could hold a std::weak_ptr.
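A sketch of the std::reference_wrapper variant, under a few assumptions. The base_ member in A and the function name are additions made only so that two instances give different results. The wrapper cannot be null and can still be rebound in setA.

```cpp
#include <cassert>
#include <functional>

class A {
public:
    explicit A(double base = 0) : base_(base) {}
    double get1() const { return base_ + 1; }
    double get2() const { return base_ + 2; }
private:
    double base_;  // demo only: lets two instances be told apart
};

class SomeUtilityClass {
public:
    explicit SomeUtilityClass(A& a) : a_(a) {}  // a reference cannot be null
    double getResult() const { return a_.get().get1() + a_.get().get2(); }
    void setA(A& a) { a_ = std::ref(a); }       // rebinds, does not copy the A
private:
    std::reference_wrapper<A> a_;
};

double result_after_rebinding() {
    A first;
    A second(10);
    SomeUtilityClass u(first);
    assert(u.getResult() == 3);
    u.setA(second);
    return u.getResult();  // 11 + 12
}
```

This removes the null case only; the referenced A can still go out of scope before the utility object does.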

For the purposes of this answer I'm redefining setA as:
void setA(A* new_a){a = new_a;}
Consider:
// Using your SomeUtilityClass
int main() {
A a;
SomeUtilityClass u(&a);
// We define a new scope, just because:
{
A b;
u.setA(&b);
}
std::cout << u.getResult() << '\n';
return 0;
}
After the scope is finished, SomeUtilityClass has a dangling pointer and getResult() invokes Undefined Behaviour. Note that this can't be solved with a reference: You would still get a dangling one.
Now consider the version using a smart pointer:
class SomeUtilityClass {
public:
SomeUtilityClass(const std::shared_ptr<A>& a) : a{a} {}
double getResult(){return a->get1() + a->get2();}
void setA(const std::shared_ptr<A>& new_a){a = new_a;}
private:
std::shared_ptr<A> a;
};
int main() {
std::shared_ptr<A> a{new A};
SomeUtilityClass u{a};
// We define a new scope, just because:
{
std::shared_ptr<A> b{new A};
u.setA(b);
}
std::cout << u.getResult() << '\n';
return 0;
}
Because you have shared ownership, there's no way to get a dangling pointer. The memory pointed to by b will be deleted as usual, but only after u is destroyed (or its pointer is changed).
IMHO, in most cases you should be using smart pointers (Even when at first it doesn't seem to make much sense). It makes maintenance much easier. Use raw pointers only in specific code that actually needs them, and encapsulate/isolate this code as much as possible.

If SomeUtilityClass does not own the member variable a, then a smart pointer does not make sense.
You might consider a reference member, which would remove the problems of a null pointer.

The default way of expressing a non-owning pointer in C++ is weak_ptr. To use weak_ptr you need to use shared_ptr for ownership, so in your example you would use
shared_ptr<A> owner(...)
instead of
A a
Then as the private pointer member of your SomeUtilityClass you use weak pointer:
weak_ptr<A> w;
and initialise it with shared_ptr:
SomeUtilityClass(shared_ptr<A> o) : w(o) {}
However, you cannot use a weak_ptr directly, since the shared_ptr could go out of scope and your weak pointer would then no longer point to anything. Before use you need to lock it:
shared_ptr<A> locked = w.lock();
The locked pointer will be empty if the owning pointer no longer manages an object, e.g. because it went out of scope. If it is not empty, you may use it, and when it goes out of scope it automatically releases its hold on the object.
Both shared_ptr and weak_ptr are available in standard library in C++11, and in Boost for older compilers.
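Putting the pieces above together, a minimal sketch could look like the following. The bool-returning getResult signature and the function name detects_expired_owner are choices made for the example, not something from the question.

```cpp
#include <cassert>
#include <memory>

class A {
public:
    double get1() const { return 1; }
    double get2() const { return 2; }
};

class SomeUtilityClass {
public:
    explicit SomeUtilityClass(const std::shared_ptr<A>& owner) : w(owner) {}

    // Writes the result to out and returns true, or returns false if the
    // A has already been destroyed.
    bool getResult(double& out) const {
        std::shared_ptr<A> locked = w.lock();
        if (!locked) return false;
        out = locked->get1() + locked->get2();
        return true;
    }
private:
    std::weak_ptr<A> w;  // non-owning: does not keep the A alive
};

bool detects_expired_owner() {
    auto owner = std::make_shared<A>();
    SomeUtilityClass u(owner);

    double value = 0;
    bool ok_while_alive = u.getResult(value) && value == 3;
    assert(ok_while_alive);

    owner.reset();  // last shared_ptr is gone, so the A is destroyed
    bool fails_after = !u.getResult(value);
    return ok_while_alive && fails_after;
}
```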

There are different types of smart pointers. In your case, it is clear that a smart pointer is not really needed, but it may still provide some benefits.
SomeUtilityClass can be instantiated with a null pointer
This one is probably best solved with a check in the constructor, throwing an exception or indicating an error in some other way in the case when you get a NULL pointer as the argument. I can hardly imagine how a smart pointer would help, unless you use a specific smart pointer class that doesn't accept NULLs, so it does the check for you already.
The object pointed to may be deleted/go out of scope, without the SomeUtilityClass noticing it
This one can actually be resolved with a special type of smart pointers, but then it is needed that the object being pointed to somehow supports notification of destruction. One such example is the QPointer class in the Qt library, which can only point to QObject instances, which notify it when deleted, so the smart pointer automatically becomes NULL when the object is deleted. There are some problems with this approach, though:
You need to check for NULLs every time you access the object through the smart pointer.
If a smart pointer points to an instance of a class, say MyClass, extending the class performing the deletion notification (QObject in the Qt case), you get strange results: it's the destructor of QObject that notifies the smart pointer, so it is possible that you access it when the MyClass destructor already began its dirty work, so the object is partially destructed, but the pointer is not NULL yet because the destruction is still in progress.

Should a pointer be the same before and after adding to a unique_ptr?

I have a std::vector of unique_ptrs and I'm happy to have them manage the life cycle of those objects.
However I require storing other pointers to those objects for convenience. I know that once unique_ptr removes something, those other pointers will dangle. But I'm more concerned about the validity of those pointers before and after unique_ptr gets them.
I do not always create the object via new within the unique_ptr itself. For example, I might pass new Something as a function argument, in which case the unique_ptr takes ownership of that pointer inside the function.
But I might also create the object with new Something before I pass it into a function that then assigns it to a unique_ptr.
Once an object is assigned to a unique_ptr I can get a pointer to it via get(). But can I always assume that this get() pointer points to the same place as the pointer initially obtained via new, if the original pointer was created before the assignment to a unique_ptr?
My assumption is Yes, and that even if the vector resizes and reallocates, the unique_ptr as well as any other pointers to the objects in memory remain the same.

Yes, a std::unique_ptr<T> holds a pointer to T, and it will not alter the value between initialization and later retrieval with get().
A common use of a unique_ptr is to assign one "parent" object ownership of a dynamically-allocated "subobject", in a similar way to:
struct A
{
B b;
};
int main()
{
A a = ...;
B* p = &a.b;
}
In the above b is a true subobject of A.
Compare this to:
struct A
{
unique_ptr<B> b{new B(...)};
};
int main()
{
A a = ...;
B* p = a.b.get();
}
In the above A and (*b) have a similar relationship to the first example, except here the B object is allocated on the heap. In both cases the destructor of A will destroy the "subobject". This "on heap" subobject structure may be preferable in some cases, for example because B is a polymorphic base type, or to make B an optional/nullable subobject of A.
The advantage of using unique_ptr over a raw pointer to manage this ownership relationship is that it will automatically destroy the subobject in A's destructor, and it will automatically move construct and move assign it as part of A.
As usual, in both cases, you must be careful that the lifetime of any raw pointer to the subobject is enclosed by the lifetime of the owning object.

Yes, you are correct, because unique_ptr does not copy the object; therefore, it has to point to the same address. However, once you give a pointer to a unique_ptr to own, you should not use that raw pointer any more, because the unique_ptr could be destroyed and deallocate the memory, and turn your raw pointer into a dangling pointer. Perhaps shared_ptr would be better for your situation.
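The assumption from the question can be checked with a small sketch. Something is given a made-up id member and the function name is invented for the example. Raw pointers obtained from new before handing the objects to unique_ptr are compared against get() after the vector has grown many times.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Something { int id = 0; };  // hypothetical element type

bool addresses_survive_reallocation() {
    std::vector<std::unique_ptr<Something>> items;
    std::vector<Something*> observers;  // non-owning convenience pointers

    for (int i = 0; i < 1000; ++i) {    // enough pushes to force regrowth
        Something* raw = new Something;
        raw->id = i;
        std::unique_ptr<Something> owned(raw);  // takes ownership right away
        assert(owned.get() == raw);             // same address as from new
        observers.push_back(raw);
        items.push_back(std::move(owned));
    }

    // Reallocation moves the unique_ptr objects, not the Somethings.
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i].get() != observers[i]) return false;
        if (observers[i]->id != static_cast<int>(i)) return false;
    }
    return true;
}
```

The observers stay valid only while the matching unique_ptr is alive; erasing an element from items leaves its observer dangling.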
