How is a unique_ptr unique? - c++

I know unique_ptrs cannot be copied, only moved, and that they have no reference counting. But we can have two smart pointers that share a resource:
Foo* f = new Foo;
auto p1 = std::unique_ptr<Foo>(f);
auto p2 = std::unique_ptr<Foo>(f);
Now both of these objects hold a pointer to *f. I also know this will eventually cause UB because of the double delete, but still: what do we really mean by a unique_ptr being "unique" if this is possible?

Besides the fact that I do not believe this is wanted or portable behaviour, I think that a unique_ptr is also a statement to other people working on the same project.
From the reference:
std::unique_ptr is a smart pointer that retains sole ownership of an
object through a pointer and destroys that object when the unique_ptr
goes out of scope. No two unique_ptr instances can manage the same
object.
As I understand this, the behaviour of the sample you showed is actually not wanted and should not be used at all.
For people without knowledge of the subject (aka programming for dummies): What the OP does is like having two girlfriends who don't know about each other. You're fine until they find out. When they do, and they definitely will, you'll probably wish you hadn't played with fire.

To understand the terminology, you have to contrast unique_ptr and shared_ptr:
the former should be solely responsible for managing the resource it points to
the latter should share this responsibility with a set of peers
Often times, you will hear the term ownership to describe the responsibility of cleaning up.
Now, like many things in C++, you can attempt to subvert the system: only the intention is described, it's up to you to uphold your end of the bargain.

Your question is akin to,
How is a crescent wrench a wrench when I can use it to drive nails into the wall?
In other words, just because you can incorrectly use a tool to do something that shouldn't be done with it, doesn't mean it can't do what it was designed to do.
A unique_ptr is unique in the sense that you won't make copies of the pointer if you use it correctly. It ensures that there's only one controlling object, and that the controlled object is destroyed properly when the container is destroyed.

This is about ownership semantics:
Sole or unique ownership (e.g. std::unique_ptr and the old friend std::auto_ptr): only one pointer at a time owns an object.
Shared ownership (e.g. std::shared_ptr, boost::intrusive_ptr, linked_ptr): many pointers share the same object.

It's unique because, when used correctly, it represents a unique ownership model - only one pointer gives access to, and controls the lifetime of, an object. Compare this to shared_ptr, which represents a shared ownership model - more than one pointer can be used to access and manage the same object.
As you point out, you can break that model by messing around with dumb pointers (either keeping hold of the one used to initialise the smart pointer, or by using get() or similar to bypass the ownership model). As always, it's up to the programmer to be careful not to do the wrong thing with dumb pointers. There is nothing a smart pointer can do to control the use of dumb pointers.
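To make the "used correctly" case concrete, here is a minimal sketch, assuming C++14 for std::make_unique. The Foo type with its alive counter and the function name transfer_ownership_demo are made up for illustration. It shows that copying is rejected at compile time and that a move leaves exactly one owner:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Foo {
    static int alive;  // number of live Foo objects, for illustration only
    Foo() { ++alive; }
    ~Foo() { --alive; }
};
int Foo::alive = 0;

// Copying is rejected at compile time; only moving is allowed.
static_assert(!std::is_copy_constructible<std::unique_ptr<Foo>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<Foo>>::value,
              "unique_ptr can be moved");

// Returns true if ownership moved cleanly from one unique_ptr to another.
bool transfer_ownership_demo() {
    auto p1 = std::make_unique<Foo>();  // no raw owning pointer is ever exposed
    Foo* observer = p1.get();           // non-owning view; must never be deleted

    std::unique_ptr<Foo> p2 = std::move(p1);  // p1 gives up ownership
    assert(p2);                               // p2 is now the single owner
    return p1 == nullptr && p2.get() == observer && Foo::alive == 1;
}
```

Because the Foo is created through make_unique, there is no stray Foo* lying around from which a second unique_ptr could be built by accident.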

Not wishing to put words in the OP's mouth, but I think the issue they may be raising might be to do with naming. Perhaps they are saying something like:
'Aaaaah! It makes no sense! The language is a work of lunacy! Lunacy, I tell you! Run! Run! Save yourselves!'.
If that's what the OP is hinting at then...
Rather than look for 'meaning' in C++ words and symbols, just try to remember their actual effects. eg 'unique' doesn't mean 'One only', even though it appears to mean exactly that, it merely has the effect of indicating that the ptr should be used in certain ways and not others. Similarly, 'Private' does not mean private, but has an effect on how something is shared. 'Static' things can move, and 'move' keeps things where they are to avoid copying them to a new location.
All you have to do is read the documentation forever, and accept the pain.
See also 'Alice Through The Looking-Glass'.

Related

OOP Design Question (MFC C++ implementation)

I have a GUI to interact with the user, but I have an OOP design problem with this.
Through a dialog the user specifies CDiscreteDistributions and they are stored in a std::vector<CDiscreteDistribution*> in the MyAppDoc class for serialization. Through another dialog the user chooses a type of CDistribution for a particular CParameter. CDiscreteDistribution, CConstantDistribution, and CContinuousDistribution inherit CDistribution, and CParameter has a polymorphic pointer to a CDistribution member variable. MyAppDoc has a container class of CParameter. Thus the CDiscreteDistributions are pointed to twice, but only exist once.
In summary, MyAppDoc has
std::vector<CDiscreteDistribution*>
CContainer which has many CParameter which have
CDistribution* which can point to one of
CDiscreteDistribution which is one of the CDiscreteDistribution*s stored above
CConstantDistribution created/destroyed by CParameter
CContinuousDistribution created/destroyed by CParameter
This design pattern is causing me various nightmares in porting the app to use shared_ptr due to double deletes and serialization (boost). Should one of the pointers to CDiscreteDistribution be a weak_ptr? If so, which object should own the pointer?
Thanks for any help!
EDIT:
I re-thought the reasoning for having std::vector<CDiscreteDistribution*> and it was just to avoid copying the vector into and out of the GUI. But the objects are quite small, and so I've broken the link between them and suffer the minor performance implications. Now MyAppDoc has:
std::vector<CDiscreteDistribution>
CContainer which has many CParameter which have
CDistribution* which can point to one of
CDiscreteDistribution created/destroyed by CParameter, copied from one of the CDiscreteDistributions stored above
CConstantDistribution created/destroyed by CParameter
CContinuousDistribution created/destroyed by CParameter
I think part of the problem was boost::serialization made two shared_ptrs for each CDiscreteDistribution that weren't aware of each other's existence. Now the only issue is backwards compatibility to files created with the previous versions.
I figure this 'solution' is actually just avoiding a proper design!
The question isn't described in enough detail to understand the full situation, complications and exact problem, but in general -
I assume you want to use shared_ptr to not have to manually delete() objects
If so, see if you can solve it by not using shared_ptr, but rather using boost::ptr_vector instead of a vector of raw pointers; the ptr_vector will then handle memory management for you.
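As a rough illustration of the "container owns the objects" idea, here is a sketch that swaps in std::vector<std::unique_ptr<T>> as a standard-library stand-in for boost::ptr_vector (it is not the same class, just the same ownership idea). Doc, add() and the alive counter are hypothetical names, and CDiscreteDistribution is reduced to an empty stand-in:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct CDiscreteDistribution {  // stand-in for the asker's class
    static int alive;           // live-object counter, for illustration only
    CDiscreteDistribution() { ++alive; }
    ~CDiscreteDistribution() { --alive; }
};
int CDiscreteDistribution::alive = 0;

class Doc {  // hypothetical stand-in for MyAppDoc
public:
    // Creates a distribution owned by the document and returns a
    // non-owning pointer that other objects may keep but must not delete.
    CDiscreteDistribution* add() {
        items_.push_back(
            std::unique_ptr<CDiscreteDistribution>(new CDiscreteDistribution()));
        return items_.back().get();
    }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::unique_ptr<CDiscreteDistribution>> items_;
};

// Returns how many distributions are still alive after the document is gone.
int doc_owns_distributions_demo() {
    {
        Doc doc;
        doc.add();
        doc.add();
        assert(CDiscreteDistribution::alive == 2);
    }  // doc's vector destroys every element exactly once
    return CDiscreteDistribution::alive;
}
```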
I'm not even sure what the shared_ptr would bring you - it's quite obvious, I'd say from my limited understanding of the situation, that the Doc owns the CDiscreteDistribution objects. Whoever owns the other two types of Distributions is responsible for deleting them; this can be done through a shared_ptr or otherwise. (you say 'locally instanced' but that doesn't mean much - are they instantiated on the heap or the stack? What is their lifetime? Why is their lifetime different from the DiscreteDistribution objects? What is 'local' - local to what?)
I agree with Roel that the question is not fully specified. But having done several extensive conversions from raw pointers to shared_ptr, I can give you a little advice.
weak_ptr should not be necessary unless you have circular dependencies. In other words, if an object A has a shared_ptr to object B and object B has a shared_ptr back to object A. In this case, it's impossible for the reference count of either pointer to go to 0, so you would either need to manually intervene to break the cycle, or use weak_ptr to designate one side as dependent.
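A minimal sketch of that cycle-breaking idea, using std::shared_ptr and std::weak_ptr (the boost versions behave the same way as far as this example goes). Node, its alive counter and build_and_release_pair are invented for illustration:

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int alive;              // live-object counter, for illustration only
    std::shared_ptr<Node> child;   // owning link: parent keeps child alive
    std::weak_ptr<Node> parent;    // non-owning back link: breaks the cycle
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds a parent/child pair that refer to each other, lets both handles go
// out of scope, and returns the number of Node objects still alive.
int build_and_release_pair() {
    {
        auto parent = std::make_shared<Node>();
        auto child = std::make_shared<Node>();
        parent->child = child;
        child->parent = parent;
        assert(parent.use_count() == 1);  // the weak back link adds no strong reference
        assert(child.use_count() == 2);   // local handle plus parent->child
    }
    return Node::alive;
}
```

If parent were declared as a shared_ptr as well, both counts would stay above zero after the scope ends and neither Node would ever be destroyed.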
I'm confused why you have issues with double deletion when using shared_ptr. One of the advantages of smart pointers in general is that you don't have to actually delete them. If you've converted all your raw pointers to shared_ptr, you shouldn't have this problem. If your CConstantDistribution and CContinuousDistribution objects are actually locals rather than allocated with new (I can't tell 100% if this is the case from your description), you can make them shared_ptr objects that are initialized in your constructor, if you can change the app's code. This would allow you to make your std::vector<CDiscreteDistribution*> a container of shared_ptr to CDiscreteDistribution instead. At that point, you shouldn't have to worry about deleting those objects at all, unless you have circular references as described above. Even if you do, you'd have converted a double-delete crash to a memory leak, which is generally less bad.
Serialization can be tough. Since you've tagged the question with MFC, I'll assume you're using MFC serialization. I generally don't have problems when wrapping everything in shared_ptr when I serialize out to a file -- I just use .get() on the smart pointer object and serialize the resulting raw pointer. When I serialize in, I read the raw pointer from the serialized file and wrap it in the shared_ptr candy coating at that point. It's a little extra code in the serialization function, but it works.
If I've guessed inaccurately on some of your situation, feel free to add a comment. I'd be happy to help further if I can.

Should I convert shared_ptr to weak_ptr when passed to a method?

There are already a couple of questions regarding this topic, but I am still not sure what to do: Our codebase uses shared_ptr in many places. I have to admit that we did not define ownership clearly when writing it.
We have some methods like
void doSomething(shared_ptr<MyClass> ptr)
{
//doSomething() is a member function of a class, but usually won't store the ptr
ptr->foo();
...
}
After having discovered the first (indirect) circular dependencies I would like to correct the mistakes in our design. But I'm not exactly sure how. Is there any benefit in changing the method from above to
void doSomething(weak_ptr<MyClass> ptr)
{
shared_ptr<MyClass> ptrShared = ptr.lock();
ptrShared->foo();
...
}
?
I am also confused because some people say (including the Google Style guide) that in the first place it's important to get ownership correct (which would probably mean introduction of many weak_ptrs, e.g. in the example with the methods above, but also for many member variables that we have). Others say (see links below) that you should use weak_ptr to break cyclic dependencies. However, detecting them is not always easy, so I wonder if I really should use shared_ptr until I run into problems (and realize them), and then fix them??
Thanks for your thoughts!
See also
shared_ptr and weak_ptr differences
boost::shared_ptr cycle break with weak_ptr
boost, shared ptr Vs weak ptr? Which to use when?
We did not define ownership clearly.
You need to clearly define who owns what. There's no other way to solve this. Arbitrarily swapping out some uses of shared_ptr with weak_ptr won't make things better.
There is no benefit in changing your design above from shared_ptr to weak_ptr. Getting ownership right is not about using weak_ptrs, it's about managing who stores the shared_ptr for any significant length of time. If I pass a shared_ptr to a method, assuming I don't store that shared_ptr into a field in my object as part of that method, I haven't changed who owns that data.
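Here is a small sketch of that distinction, assuming a simplified MyClass. The Cache class and the remember/owners names are hypothetical. A function that only uses the object during the call can take a plain reference, and only code that stores the pointer becomes an owner:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct MyClass {
    int calls = 0;
    void foo() { ++calls; }
};

// Only uses the object during the call, so a plain reference is enough.
void doSomething(MyClass& obj) { obj.foo(); }

// Hypothetical class that keeps the object alive beyond the call,
// so it takes and stores a shared_ptr.
class Cache {
public:
    void remember(std::shared_ptr<MyClass> ptr) { kept_ = std::move(ptr); }
    long owners() const { return kept_.use_count(); }

private:
    std::shared_ptr<MyClass> kept_;
};

// Returns how many times foo() ran on the shared object.
int ownership_demo() {
    auto ptr = std::make_shared<MyClass>();
    doSomething(*ptr);             // no change in ownership
    assert(ptr.use_count() == 1);

    Cache cache;
    cache.remember(ptr);           // storing the pointer adds a second owner
    assert(cache.owners() == 2);
    return ptr->calls;
}
```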
In my experience the only reason for using weak_ptr is when you absolutely must have a cycle of pointers and you need to break that cycle. But first you should consider if you can modify your design to eliminate the cycle.
I usually discourage mixing shared_ptr's and raw pointers. It inevitably happens (though it probably shouldn't) that a raw pointer needs to be passed to a function that takes a shared_ptr of that type. A weak_ptr can be safely converted to a shared_ptr; with a raw pointer you're out of luck. Even worse, a developer inexperienced with shared_ptr's may create a new shared_ptr from that raw pointer and pass it to the function, causing that pointer to be deleted when the function returns. (I actually had to fix this bug in production code, so yes it does happen :) )
It sounds like you have a design problem. shared_ptr provides
a simple to use implementation for specific design solutions,
but neither it nor anything else can replace the design. Until you
have determined what the actual lifetime of each type of object
should be, you shouldn't be using shared_ptr. Once you've done
that, most of the shared_ptr/weak_ptr issues should disappear.
If, having done that, and determined that the lifetime of some
objects does depend on that of other objects, and there are
cycles in this dependency, you have to determine (at the design
level, again) how to manage those cycles---it's quite possible,
for example, that in those cases, shared_ptr isn't the correct
solution, or that many of the pointers involved are just for
navigation, and should be raw pointers.
At any rate, the answer to your question resides at the design
level. If you have to ask it when coding, then it's time to go
back to design.
Some people are right: you should first have a very clear picture of object ownership in your project.
shared_ptrs are shared, i.e. "owned by community". That might or might not be desirable. So I would advise defining an ownership model first, then not abusing shared_ptr semantics, and using plain pointers wherever ownership should not be shared.
Using weak_ptrs would mask the problem further rather than fix it.

Does it exist: smart pointer, owned by one object allowing access

I'm wondering if anyone's run across anything that exists which would fill this need.
Object A contains an object B. It wants to provide access to that B to clients through a pointer (maybe there's the option it could be 0, or maybe the clients need to be copiable and yet hold references...whatever). Clients, let's call them object C, would normally, if we're perfect developers, be written carefully so as to not violate the lifetime semantics of any pointer to B they might have...but we're not perfect, in fact we're pretty dumb half the time.
So what we want is for object C to have a pointer to object B that is not "shared" ownership but that is smart enough to recognize a situation in which the pointer is no longer valid, such as when object A is destroyed or it destroys object B. Accessing this pointer when it's no longer valid would cause an assertion/exception/whatever.
In other words, I wish to share access to data in a safe, clear way but retain the original ownership semantics. Currently, because I've not been able to find any shared pointer in which one of the objects owns it, I've been using shared_ptr in place of having such a thing. But I want clear ownership and shared/weak pointer doesn't really provide that.
Would be nice further if this smart pointer could be attached to member variables and not just hold pointers to dynamically allocated memory regions.
If it doesn't exist I'm going to make it, so I first want to know if someone's already released something out there that does it.
And, BTW, I do realize that things like references and pointers do provide this sort of thing...I'm looking for something smarter.
boost::weak_ptr is what you are looking for. Maybe with some minor tweaks though, like prohibiting creation of shared_ptr from it. Also, this can hold anything, including pointer to memory that is not dynamically allocated.
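A minimal sketch of that arrangement, written with std::weak_ptr rather than boost::weak_ptr. A, B, observe(), drop() and read_value() are illustrative names, and throwing std::runtime_error is just one possible way to report a dead object:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct B { int value = 42; };

// A is the only long-term owner of its B.
class A {
public:
    A() : b_(std::make_shared<B>()) {}
    std::weak_ptr<B> observe() const { return b_; }  // hand out non-owning handles
    void drop() { b_.reset(); }                      // A decides when B dies

private:
    std::shared_ptr<B> b_;
};

// Client-side access: throws instead of touching a dead object.
int read_value(const std::weak_ptr<B>& handle) {
    if (auto locked = handle.lock()) {  // temporary strong reference during the access
        return locked->value;
    }
    throw std::runtime_error("B no longer exists");
}

// Returns true if access after the owner dropped B is reported as an error.
bool throws_after_owner_drops() {
    A a;
    std::weak_ptr<B> handle = a.observe();
    assert(read_value(handle) == 42);
    a.drop();
    try {
        read_value(handle);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Note that lock() does create shared ownership for the duration of the access, so a client that stores the locked shared_ptr can still extend B's lifetime. That is the "minor tweak" this answer hints at.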
The semantics you want is similar to Qt's QPointer. This is a pointer that can hold QObjects and nulls itself when the corresponding QObject is deleted (ordinarily, e.g. by operator delete).
However, a similar approach has an inherent problem: the client cannot be sure it isn't using a dangling pointer, e.g.
QPointer<T> smart_ptr = original_obj;
T* tmp = smart_ptr; // this might be a function argument etc.
... // later
delete original_obj;
... // even later
tmp->do_something(); // CRASH
This can be avoided using some "hard" references that don't allow object deletion, which is exactly what shared_ptr/weak_ptr do.
BTW, AFAIK, shared_ptr can point to member variables, except it can't manage them. That is, you must provide a custom deleter that doesn't do anything.
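For what it's worth, a no-op deleter could look roughly like this with std::shared_ptr. Widget, Holder and share_member() are invented names for the sketch:

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 7; };

struct Holder {
    Widget member;  // lifetime is tied to Holder, not to any smart pointer

    // Returns a shared_ptr that refers to the member but never deletes it.
    std::shared_ptr<Widget> share_member() {
        return std::shared_ptr<Widget>(&member, [](Widget*) { /* intentionally empty */ });
    }
};

// Returns the member's value after a shared_ptr to it has come and gone.
int no_op_deleter_demo() {
    Holder h;
    {
        std::shared_ptr<Widget> p = h.share_member();
        assert(p.get() == &h.member);
        p->value = 8;
    }  // p is destroyed here; the no-op deleter leaves h.member untouched
    return h.member.value;
}
```

Keep in mind that such a shared_ptr does not keep the Holder alive, so it can still dangle if it outlives the object that contains the member.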

How to make data ownership explicit in C++

When working with pointers and references in C++, it is sometimes difficult to see whether the pointer has ownership over the referenced data, or if it is just a temporary reference. For example:
Instance* i = new Instance();
Instance* j = i;
How can it be made clear which of the 2 pointers has ownership over the instance? In other words, how to make clear on which pointer delete has to be called?
Note: In the above example this is not hard to see, as it is a very short piece of code. However, when the pointer is duplicated and passed around a lot, this can become unclear.
You cannot determine the owner, since there is no built in mechanism to know which pointer is owning the memory the pointer points to.
If you are really concerned about this, you could always introduce your own naming convention, e.g. through some pre/post-fix to your variable names. In other words, it's your code design that can give you this information. Since you (and your coworkers) are writing the code you can always make sure that this design is enforced during implementation. This of course means that everyone has to follow these "rules".
This is one reason why a common coding convention is so important. So you can read your own and other peoples code and understand it.
Firstly, it seems unnecessarily confounding to use a reference to refer to data that must be deleted. Use a pointer instead.
Secondly, if you want to indicate ownership of an object, use a wrapper class that manages ownership. There is auto_ptr specifically for this purpose, although it has shortcomings. (These should be addressed by unique_ptr in the next version of the language, though that doesn't help you now).
Thirdly, in the simplest cases (as often as possible), don't use the heap directly. Just declare a local object, e.g.
std::vector<int> v;
This doesn't stop you transferring ownership when you need to (use swap).
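A short sketch of the swap idea; take_contents and swap_transfer_demo are made-up helper names:

```cpp
#include <cassert>
#include <vector>

// Moves the contents of `source` into the returned vector without copying elements.
std::vector<int> take_contents(std::vector<int>& source) {
    std::vector<int> taken;
    taken.swap(source);  // constant time: the two vectors exchange their buffers
    return taken;
}

// Returns true if the elements ended up in the new vector and the old one is empty.
bool swap_transfer_demo() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    std::vector<int> w = take_contents(v);
    assert(v.empty());   // v gave up its elements
    return w.size() == 2 && w[0] == 1 && w[1] == 2;
}
```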
You can use something like shared_ptr<> to explicitly share ownership. If you want to maintain a clear single owner with other non-owner pointers referring to the same object, you could use something like boost::scoped_ptr<> for the owning pointer and have a typedef for non-owning pointers:
typedef Instance* UnownedInstance_ptr; // or some better name
This would at least document intent. I don't know of a way off the top of my head to have a smart pointer type that prevents the ability to delete the contained pointer and prevent copying the pointer into another smart pointer that takes ownership (since the source doesn't have any ownership to give away), but that might be an interesting class to represent that policy.
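As a sketch of that convention, here is the same idea with std::unique_ptr standing in for boost::scoped_ptr (an assumption: it needs C++11, which the answers in this thread predate). The alias names and read_id are illustrative only:

```cpp
#include <cassert>
#include <memory>

struct Instance { int id = 1; };

using OwnedInstance = std::unique_ptr<Instance>;  // the single owner
using UnownedInstancePtr = Instance*;             // observer only: never delete through this

// Takes a non-owning pointer; the signature documents that it will not delete.
int read_id(UnownedInstancePtr observer) { return observer->id; }

int owner_and_observer_demo() {
    OwnedInstance owner(new Instance());
    UnownedInstancePtr observer = owner.get();
    assert(observer == owner.get());
    return read_id(observer);
}  // owner deletes the Instance here
```

The alias is documentation only. Nothing stops someone from calling delete on an UnownedInstancePtr, which is exactly the gap described above.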
For me I would go with the Hungarian Notation!
Joel tells you the rest ::
Making Wrong Code Look Wrong
an example in your case ::
Instance* Owner_i = new Instance();
Instance* Observer_j = Owner_i;
...
delete Observer_j; // Wrong! not an Owner.
As the others indicated - use a convention. I use raw pointers for non-owning variables, and the owner is usually wrapped into some kind of smart pointer (such as boost::scoped_ptr) or even not a pointer at all but an object created on the stack.

smart pointers + "this" considered harmful?

In a C++ project that uses smart pointers, such as boost::shared_ptr, what is a good design philosophy regarding use of "this"?
Consider that:
It's dangerous to store the raw pointer contained in any smart pointer for later use. You've given up control of object deletion and trust the smart pointer to do it at the right time.
Non-static class members intrinsically use a this pointer. It's a raw pointer and that can't be changed.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to make a shared pointer to my class.
Given that, when is it ever appropriate for me to explicitly use a this pointer? Are there design paradigms that can prevent bugs related to this?
Wrong question
In a C++ project that uses smart pointers
The issue has nothing to do with smart pointers actually. It is only about ownership.
Smart pointers are just tools
They change nothing WRT the concept of ownership, esp. the need to have well-defined ownership in your program, the fact that ownership can be voluntarily transferred, but cannot be taken by a client.
You must understand that smart pointers (also locks and other RAII objects) represent a value and a relationship WRT this value at the same time. A shared_ptr is a reference to an object and establishes a relationship: the object must not be destroyed before this shared_ptr, and when this shared_ptr is destroyed, if it is the last one aliasing this object, the object must be destroyed immediately. (unique_ptr can be viewed as a special case of shared_ptr where there is zero aliasing by definition, so the unique_ptr is always the last one aliasing an object.)
Why you should use smart pointers
It is recommended to use smart pointers because they express a lot with only variable and function declarations.
Smart pointers can only express a well-defined design, they don't take away the need to define ownership. In contrast, garbage collection takes away the need to define who is responsible for memory deallocation. (But it does not take away the need to define who is responsible for cleaning up other resources.)
Even in non-purely functional garbage collected languages, you need to make ownership clear: you don't want to overwrite the value of an object if other components still need the old value. This is notably true in Java, where the concept of ownership of mutable data structure is extremely important in threaded programs.
What about raw pointers?
The use of a raw pointer does not mean there is no ownership. It's just not described by a variable declaration. It can be described in comments, in your design documents, etc.
That's why many C++ programmers consider that using raw pointers instead of the adequate smart pointer is inferior: because it's less expressive (I have avoided the terms "good" and "bad" on purpose). I believe the Linux kernel would be more readable with a few C++ objects to express relationships.
You can implement a specific design with or without smart pointers. The implementation that uses smart pointer appropriately will be considered superior by many C++ programmers.
Your real question
In a C++ project, what is a good design philosophy regarding use of "this"?
That's awfully vague.
It's dangerous to store the raw pointer for later use.
Why do you need to store a pointer for later use?
You've given up control of object deletion and trust the responsible component to do it at the right time.
Indeed, some component is responsible for the lifetime of the variable. You cannot take the responsibility: it has to be transferred.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to use my class.
Obviously, since the caller is not informed that the function will hide a pointer and use it later without the control of the caller, you are creating bugs.
The solution is obviously to either:
transfer responsibility to handle the lifetime of the object to the function
ensure that the pointer is only saved and used under the control of the caller
Only in the first case, you might end up with a smart pointer in the class implementation.
The source of your problem
I think that your problem is that you are trying hard to complicate matters using smart pointers. Smart pointers are tools to make things easier, not harder. If smart pointers complicate your specification, then rethink your spec in terms of simpler things.
Don't try to introduce smart pointers as a solution before you have a problem.
Only introduce smart pointers to solve a specific well-defined problem. Because you don't describe a specific well-defined problem, it is not possible to discuss a specific solution (involving smart pointers or not).
While I don't have a general answer or some idiom, there is boost::enable_shared_from_this. It allows you to get a shared_ptr managing an object that is already managed by shared_ptr. Since in a member function you have no reference to those managing shared_ptr's, enable_shared_from_this does allow you to get a shared_ptr instance and pass that when you need to pass the this pointer.
But this won't solve the issue of passing this from within the constructor, since at that time, no shared_ptr is managing your object yet.
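A minimal sketch using std::enable_shared_from_this, the standard-library counterpart of the boost class (assuming C++11). Session and self() are made-up names:

```cpp
#include <cassert>
#include <memory>

class Session : public std::enable_shared_from_this<Session> {
public:
    // Hands out a shared_ptr that shares the existing reference count,
    // instead of wrapping the raw `this` pointer in a second, unrelated shared_ptr.
    std::shared_ptr<Session> self() { return shared_from_this(); }
};

// Returns the number of owners after asking the object for a pointer to itself.
long shared_from_this_demo() {
    auto first = std::make_shared<Session>();
    std::shared_ptr<Session> second = first->self();
    assert(first.get() == second.get());
    return first.use_count();  // both pointers share one control block
}
```

The object must already be owned by a shared_ptr when shared_from_this() is called. Otherwise the behaviour is undefined before C++17, and std::bad_weak_ptr is thrown from C++17 on.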
One example of correct use is return *this; in functions like operator++() and operator<<().
When you are using a smart pointer class, you are right that it is dangerous to directly expose "this". There are some pointer classes related to boost::shared_ptr<T> that may be of use:
boost::enable_shared_from_this<T>
Provides the ability to have an object return a shared pointer to itself that uses the same reference counting data as an existing shared pointer to the object
boost::weak_ptr<T>
Works hand-in-hand with shared pointers, but does not hold a reference to the object. If all the shared pointers go away and the object is released, a weak pointer will be able to tell that the object no longer exists and will return you NULL instead of a pointer to invalid memory. You can use weak pointers to get shared pointers to a valid reference-counted object.
Neither of these is foolproof, of course, but they'll at least make your code more stable and secure while providing appropriate access and reference counting for your objects.
If you need to use this, just use it explicitly. Smart pointers wrap only pointers of the objects they own - either exclusively (unique_ptr) or in a shared manner (shared_ptr).
I personally like to use the this pointer when accessing member variables of the class. For example:
void foo::bar ()
{
this->some_var += 7;
}
It's just a harmless question of style. Some people like it, some people don't.
But using the this pointer for any other thing is likely to cause problems. If you really need to do fancy things with it, you should really reconsider your design. I once saw some code that, in the constructor of a class, assigned the this pointer to another pointer stored somewhere else! That's just crazy, and I can't ever think of a reason to do that. The whole code was a huge mess, by the way.
Can you tell us what exactly do you want to do with the pointer?
Another option is using intrusive smart pointers, and taking care of reference counting within the object itself, not the pointers. This requires a bit more work, but is actually more efficient and easy to control.
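To show what "reference counting within the object itself" can mean, here is a hand-rolled, single-threaded sketch. It is not boost::intrusive_ptr, and Counted, CountedPtr, add_ref and release are all hypothetical names. Because the count lives in the object, building a second handle from the same raw pointer (or from this) is safe:

```cpp
#include <cassert>
#include <utility>

// The reference count lives inside the object itself.
class Counted {
public:
    static int alive;  // live-object counter, for illustration only
    Counted() { ++alive; }
    void add_ref() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }
    int refs() const { return refs_; }

private:
    ~Counted() { --alive; }  // private: only release() may destroy the object
    int refs_ = 0;
};
int Counted::alive = 0;

// Minimal intrusive handle: a copy adds a reference, destruction releases one.
class CountedPtr {
public:
    explicit CountedPtr(Counted* p) : p_(p) { if (p_) p_->add_ref(); }
    CountedPtr(const CountedPtr& other) : p_(other.p_) { if (p_) p_->add_ref(); }
    CountedPtr& operator=(CountedPtr other) { std::swap(p_, other.p_); return *this; }
    ~CountedPtr() { if (p_) p_->release(); }
    Counted* get() const { return p_; }

private:
    Counted* p_;
};

// Returns the number of Counted objects alive after both handles are gone.
int intrusive_demo() {
    Counted* raw = new Counted();
    {
        CountedPtr a(raw);
        CountedPtr b(raw);  // safe: both handles update the same embedded counter
        assert(raw->refs() == 2);
    }
    return Counted::alive;
}
```

A real implementation would also need an atomic counter if objects are shared between threads.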
Another reason to pass around this is if you want to keep a central registry of all of the objects. In the constructor, an object calls a static method of the registry with this. It's useful for various publish/subscribe mechanisms, or when you don't want the registry to need knowledge of what objects/classes are in the system.
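A rough sketch of such a registry, with the registry folded into the class as a static function for brevity. Tracked, registry() and live_count() are invented names. The important part is that the destructor removes this again, so the registry never holds a dangling pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class Tracked {
public:
    Tracked() { registry().insert(this); }
    ~Tracked() { registry().erase(this); }  // deregister before the object disappears

    // Copies are disabled so every registered address maps to exactly one object.
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

    static std::size_t live_count() { return registry().size(); }

private:
    // Function-local static avoids initialization-order problems.
    static std::set<Tracked*>& registry() {
        static std::set<Tracked*> instances;
        return instances;
    }
};

// Returns the number of registered objects after an inner object has been destroyed.
std::size_t registry_demo() {
    Tracked a;
    {
        Tracked b;
        assert(Tracked::live_count() == 2);
    }
    return Tracked::live_count();  // b has deregistered itself
}
```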