How to make data ownership explicit in C++

How to make data ownership explicit in C++ - c++

When working with pointers and references in C++, it is sometimes difficult to see whether the pointer has ownership over the referenced data, or if it is just a temporal reference. For example:
Instance* i = new Instance();
Instance* j = i;
How can it be made clear which of the 2 pointers has ownership over the instance? In other words, how to make clear on which pointer delete has to be called?
Note: In the above example this is not hard to see, as it is a very short piece of code. However, when the pointer is duplicated and passed around a lot, this can become unclear.

You cannot determine the owner, since there is no built in mechanism to know which pointer is owning the memory the pointer points to.
If you are really concerned about this, you could always introduce your own naming convention, e.g. through some pre/post-fix to your variable names. In other words, it's your code design that can give you this information. Since you (and your coworkers) are writing the code you can always make sure that this design is enforced during implementation. This of course means that everyone has to follow these "rules".
This is one reason why a common coding convention is so important. So you can read your own and other peoples code and understand it.

Firstly, it seems unnecessarily confounding to use a reference to refer to data that must be deleted. Use a pointer instead.
Secondly, if you want to indicate ownership of an object, use a wrapper class that manages ownership. There is auto_ptr specifically for this purpose, although it has shortcomings. (These should be addressed by unique_ptr in the next version of the language, though that doesn't help you now).
Thirdly, in the simplest cases (as often as possible), don't use the heap directly. Just declare a local object, e.g.
std::vector<int> v;
This doesn't stop you transfering ownership when you need to (use swap).

You can use something like shared_ptr<> to explicitly share ownership. If you want to maintain a clear single owner with other non-owner pointers referring to the same object, you could use something like boost::scoped_ptr<> for the owning pointer and have a typedef for non-owning pointers:
typedef Instance* UnownedInstance_ptr; // or some better name
This would at least document intent. I don't know of a way off the top of my head to have a smart pointer type that prevents the ability to delete the contained pointer and prevent copying the pointer into another smart pointer that takes ownership (since the source doesn't have any ownership to give away), but that might be an interesting class to represent that policy.

For me I would go with the Hungarian Notation!
Joel tells you the rest ::
Making Wrong Code Look Wrong
an example in your case ::
Instance* Owener_i = new Instance();
Instance* Observer_j = i;
.
.
.
.
.
delete Observer_j; // Wrong! not an Owner.

As the others indicated - use a convention. I use raw pointers for non-owning variables, and the owner is usually wrapped into some kind of smart pointer (such as boost::scoped_ptr) or even not a pointer at all but an object created on the stack.

Related

C++ pointers and memory leak

I'd like to illustrate one thing I've seen:
class Something {};
void do_something(const Something *p_sth) {}
Then:
do_something(new Something());
should cause a memory leak because when you call new you should also always call delete, right? Would this be a good solution?
void do_something(const Something *p_sth)
{
delete p_sth;
}
Or is it better to use references &? I also find out that smart pointers can solve this, so delete isn't required (it seems as a good thing but I have never used it before). I just want to know what's the best solution for this to avoid the memory leak. Thanks
*Thank you all for your answers. It helped me to clear up few things. I'm also sorry for the code I posted as it was maybe too general.

Your suggestion of assuming the ownership of the pointer and deleting the object has problems.
That behaviour is not idiomatic in c++. Programmers expect that they must delete every object that they allocate with new. If a user of your function expects that they're responsible for deleting the object whose address they pass to the function, then your solution breaks apart. Also, it prevents using your function with objects that must keep existing after the call has ended:
Something* s = new s();
do_something(s);
s.foo() // oops, object was deleted, I can't use it anymore
delete s; // oops, double delete because I'm for some reason not responsible for deleting the object that I allocated
Your solution also prevents using automatically and statically allocated objects with the function.
Something s;
do_something(&s); //oops, do_something assumes that object is dynamically allocated
All of these caveats would have to be documented to the user of the function.
A raw pointer without deleting inside the function has none of these problems. Managing the callers memory should really not be the responsibility of your function. Doing that would break the single responsibility principle. There's nothing wrong with raw pointer parameters when you don't want to transfer or share ownership. If you do want to imply changes in ownership, then use smart pointers which were designed exactly for that.
Smart pointers don't have some of the above problems, but they still prevent using the function with automatic and static objects.
A reference parameter is in many cases ideal. It communicates the caller that they're still responsible for the object. As a bonus, The lack of need for addressof operator allows slightly nicer syntax. Sure, the caller may still forget to manage their memory, but as I pointed out, that shouldn't be your responsibility.
References have one potential drawback. They can't be null. If you don't need null, then it's actually an advantage.
Which solution is ideal, depends on what you need. Following is not true for all corner cases, but should hold for most common cases:
If you want to modify the object, then pass a reference.
Unless you need null, in which case use a pointer
If you just want to read the object, then
If the object is copyable, small (size less than or equal to word), doesn't cointain pointers to dynamic objects and not polymorphic, then pass by value
Otherwise or if you don't know those things because you're writing a template, pass a const reference
Unless you need null, in which case use a pointer
If you want to If you want to transfer ownership, then use unique_ptr
If you want that ownership to be shared, then use shared_ptr

Best is to use a smart pointer
class Something {};
void do_something(std::shared_ptr<Something> p_sth)
{
...
}
That way the ownership is clear when you look at the prototype as well as you get an automatic delete when you leave scope.

I just want to know what's the best solution for this to avoid the memory leak. Thanks
The best solution to ensure there are no memory leaks would be to use std::shared_ptrwhich is a smart pointer which, as soon as it does not have any references, deletes itself. This pointer is best if you have more than on reference to the same pointer. Use std::unique_ptr if there is only one reference at a given time.
This will prevent memory leaks and I also prefer using smart pointers rather than standard pointers as they are very powerful. Also, retaining to the question:
Or is it better to use references &?
Use references wherever you need to reference the object, if you delete the reference, being a smart pointer, it will delete the pointer as well (Being there are no other references to that pointer)
I hope this answers your questions :)

Prefer to avoid the use of raw pointers.
If you do not need a heap allocation and can use a stack allocated variable, then prefer to pass by reference (or even pass by value if an appropriate move constructor is in place, but that's a topic for another day).
If you need a heap allocation (for polymorphism, dependency injection, information hiding, transfer of ownership etc.) then determine what the ownership semantics will be for that object and use the appropriate smart pointer to manage those semantics.
If all else fails and you must use a raw pointer (perhaps you are dealing with a legacy interface or a C interface, or something similar) then again, clearly define your ownership semantics and document them to make it clear who is responsible for deleting the heap allocated object, etc.
If you must have a raw pointer to a stack allocated object then document that you will not be deleting the pointer and be careful about documenting the lifetime of the pointer to avoid accessing the object after it has gone out of scope.

Cleanest solution would be to avoid the pointer completely.
And yes, a reference is a good idea.
void do_something(const Something &p_sth) {}
Then you can declare your 'something' on the Stack.
void function_that_does_something()
{
Something mySomething;
do_something( mySomething );
}
It is always best to let the compiler do the cleanup for you.

How is a unique_ptr unique?

I know unique_ptrs cannot be copied only moved and they have no reference counting. But we can have two smart pointers that share a resource:
Foo* f = new Foo;
auto p1 = std::unique_ptr<Foo>(f);
auto p2 = std::unique_ptr<Foo>(f);
Now both of these classes share a pointer to *f. Also, I know this will eventually cause UB because we will be doing double delete but still: What do we really mean by a unique_ptr being "unique" if this is possible?

Beside the fact that I do not believe that this is wanted or portable behaviour, I think that a unique_ptr is also a statement to other people working on the same project.
From the reference:
std::unique_ptr is a smart pointer that retains sole ownership of an
object through a pointer and destroys that object when the unique_ptr
goes out of scope. No two unique_ptr instances can manage the same
object.
As I understand this, the behaviour of the sample you showed is actually not wanted and should not be used at all.
For people without knowledge of the subject (aka programming for dummies): What the OP does is like having two girlfriends, not knowing of each other. You're fine until they find out. When they do, and they definitely will, you'll probably wish you wouldn't have played with the fire.

To understand the terminology, you have to contrast unique_ptr and shared_ptr:
the former should be the sole responsible for managing the resource it points to
the latter should be sharing this responsibility with a set of peers
Often times, you will hear the term ownership to describe the responsibility of cleaning up.
Now, like many things in C++, you can attempt to subvert the system: only the intention is described, it's up to you to uphold your end of the bargain.

Your question is akin to,
How is a crescent wrench a wrench when I can use it to drive nails in
to the wall?
In other words, just because you can incorrectly use a tool to do something that shouldn't be done with it, doesn't mean it can't do what it was designed to do.
A unique_ptr is unique in the sense that you won't make copies of the pointer if you use it correctly. It ensures that there's only one controlling object, and that the controlled object is destroyed properly when the container is destroyed.

This is about ownership semantics:
Sole or unique ownership (e.g. std::unique_ptr and the old friend std::auto_ptr): only one pointer at a time owns an object.
Shared ownership (e.g. std::shared_ptr, boost::intrusive_ptr, linked_ptr): many pointers share the same object.

It's unique because, when used correctly, it represents a unique ownership model - only one pointer gives access to, and controls the lifetime of, an object. Compare this to shared_ptr, which represents a shared ownership model - more than one pointer can be used to access and manage the same object.
As you point out, you can break that model by messing around with dumb pointers (either keeping hold of the one used to initialise the smart pointer, or by using get() or similar to bypass the ownership model). As always, it's up to the programmer to be careful not to do the wrong thing with dumb pointers. There is nothing a smart pointer can do to control the use of dumb pointers.

Not wishing to put words in the OP's mouth, but I think the issue they may be raising might be to do with naming. Perhaps they are saying something like:
'Aaaaah! It makes no sense! The language is a work of lunacy! Lunacy, I tell you! Run! Run! Save yourselves!'.
If that's what the OP is hinting at then...
Rather than look for 'meaning' in C++ words and symbols, just try to remember their actual effects. eg 'unique' doesn't mean 'One only', even though it appears to mean exactly that, it merely has the effect of indicating that the ptr should be used in certain ways and not others. Similarly, 'Private' does not mean private, but has an effect on how something is shared. 'Static' things can move, and 'move' keeps things where they are to avoid copying them to a new location.
All you have to do is read the documentation forever, and accept the pain.
See also 'Alice Through The Looking-Glass'.

Dealing with pointers that may not point to anything

I have a set of objects in a vector of pointers to their baseclass held inside a manager:
std::vector<object*> objectVec;
Classes may wish to spawn one of these objects using the Add() method in the manager. The problem is that they then subsequently need to set or update these objects themselves. I've decided to have Add() return a pointer to the object itself, which is stored in whatever class has decided to spawn one. The problem is dealing with the case where the object behind that pointer may have been deleted.
Add looks like this:
object* ObjectManager::Add(object* obj)
{
objectVec.push_back(obj);
return objectVec.back();
}
and used like this:
objectptr = ObjectManager::OMan()->Add(new object());
Where objectptr is a member of whatever class has called the function. So should that particular object be deleted, the pointer returned by Add would point to rubbish.
Is it my responsibility to ensure that whateverclass::objectptr is always set to NULL if this object is deleted? Or can this be dealt with using some sort of smart pointer? The problem being that I don't need to use a smart pointer to deal with the possibility of a memory leak, but to deal with the case where the stored pointer has become invalid.
Please let me know if i've been unclear, or if the question is badly formed.
Any help would be greatly appreciated.

Yes, you can store smart ptr's instead of raw ptr's in your vector. In this case if somebody releases an object, it's not deleted until the last reference is not released (the one held in vector in your case). You can use boost::shared_ptr or std::shared_ptr (C++11).
If this is not what you want, you can use boost::weak_ptr to store references in your vector. weak_ptr doesn't increment reference counter so if somebody releases an object, it's get deleted, but reference (weak_ptr) stored in your vector allows you to check this.

You likely want weak_ptr and shared_ptr. shared_ptr is a general smart pointer class. weak_ptr is an observer of shared_ptr. When all the references of the shared_ptr go away, instances of weak_ptr "become null" and are easier to deal with than a pointer to a deleted object.
These classes come with Boost.
http://www.boost.org/doc/libs/1_47_0/libs/smart_ptr/shared_ptr.htm
http://www.boost.org/doc/libs/1_47_0/libs/smart_ptr/weak_ptr.htm
And if I'm not mistaken, there are equivalents built into std namespace on compilers that implement newer C++0x standards. Visual C++ keeps has this built in.
http://blogs.msdn.com/b/vcblog/archive/2011/02/16/10128357.aspx
Oh shoot, looks like everyone else beat me to the answer...

Best is to forget this "manager" idea, but if you do or if you don't, the solution to shared ownership is the same as always, use boost::shared_ptr.
Or, with relatively new compiler, use std::shared_ptr.
Considering that with shared_ptr the ownership issue is taken care of already, then ask yourself, what is it that the "manager" manages?
Cheers & hth.,

Is it my responsibility to ensure that whateverclass::objectptr is always set to NULL if this object is deleted?
You're writing the class, so it's up to you to decide. This is a design decision and either choice is admissible, provided that you document it:
design the application
write the documentation/specification
write the code to matches the specification
Or can this be dealt with using some sort of smart pointer?
Using a smart pointer (strong or weak version) will help achieve whatever behavior you chose for the class. However, it will also strongly affect the client code. In the following code:
class Scene
{
// can't use this object in a call to `ObjectManager::Add()`,
// assuming it uses a smart pointer to deal with object lifetimes.
Object myLight;
};
The use cases for the ObjectManager class should be taken into consideration, on top of simplicity of implementation. Think "write once, use a lot".

Dangling pointers and memory leaks are two different issues, but a proper shared pointer can protect from both. For this particular case, I'd suggest boost::shared_ptr.
Use std::vector<boost::shared_ptr<BaseType>> for the vector type and also have the objects that hold the bare pointers now hold instead a boost::shared_ptr<BaseType>. This will ensure that the pointers will stay valid in the vector and in the objects as long as one of those objects still exist.
If you have differing requirements, you can use a boost::weak_ptr in one of the places holding the pointer (either the vector or the object).
Also, the object can hold a derived type instead of a base type (boost::shared_ptr<DerivedType>) and you can convert between them using boost::shared_static_cast.
Here is the documentation for all of these concepts.

Does it exist: smart pointer, owned by one object allowing access

I'm wondering if anyone's run across anything that exists which would fill this need.
Object A contains an object B. It wants to provide access to that B to clients through a pointer (maybe there's the option it could be 0, or maybe the clients need to be copiable and yet hold references...whatever). Clients, lets call them object C, would normally, if we're perfect developers, be written carefully so as to not violate the lifetime semantics of any pointer to B they might have...but we're not perfect, in fact we're pretty dumb half the time.
So what we want is for object C to have a pointer to object B that is not "shared" ownership but that is smart enough to recognize a situation in which the pointer is no longer valid, such as when object A is destroyed or it destroys object B. Accessing this pointer when it's no longer valid would cause an assertion/exception/whatever.
In other words, I wish to share access to data in a safe, clear way but retain the original ownership semantics. Currently, because I've not been able to find any shared pointer in which one of the objects owns it, I've been using shared_ptr in place of having such a thing. But I want clear owneship and shared/weak pointer doesn't really provide that.
Would be nice further if this smart pointer could be attached to member variables and not just hold pointers to dynamically allocated memory regions.
If it doesn't exist I'm going to make it, so I first want to know if someone's already released something out there that does it.
And, BTW, I do realize that things like references and pointers do provide this sort of thing...I'm looking for something smarter.

boost::weak_ptr is what you are looking for. Maybe with some minor tweaks though, like prohibiting creation of shared_ptr from it. Also, this can hold anything, including pointer to memory that is not dynamically allocated.

The semantics you want is similar to Qt's QPointer. This is a pointer that can hold QObjects and nulls itself when the corresponding QObject is deleteed (ordinarily, eg. by operator delete).
However, similar approach has inherent problems - such that the client cannot be sure he isn't using a dangling pointer. eg.
QPointer<T> smart_ptr = original_obj;
T* tmp = smart_ptr; // this might be a function argument etc.
... // later
delete original_obj;
... // even later
tmp->do_something(); // CRASH
This can be avoided using some "hard" references that don't allow object deletion, which is exactly what shared_ptr/weak_ptr do.
BTW, AFAIK, shared_ptr can point to member variables, except it can't manage them. That is, you must provide a custom deleter that doesn't do anything.

smart pointers + "this" considered harmful?

In a C++ project that uses smart pointers, such as boost::shared_ptr, what is a good design philosophy regarding use of "this"?
Consider that:
It's dangerous to store the raw pointer contained in any smart pointer for later use. You've given up control of object deletion and trust the smart pointer to do it at the right time.
Non-static class members intrinsically use a this pointer. It's a raw pointer and that can't be changed.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to make a shared pointer to my class.
Given that, when is it ever appropriate for me to explicitly use a this pointer? Are there design paradigms that can prevent bugs related to this?

Wrong question
In a C++ project that uses smart pointers
The issue has nothing to do with smart pointers actually. It is only about ownership.
Smart pointers are just tools
They change nothing WRT the concept of ownership, esp. the need to have well-defined ownership in your program, the fact that ownership can be voluntarily transferred, but cannot be taken by a client.
You must understand that smart pointers (also locks and other RAII objects) represent a value and a relationship WRT this value at the same time. A shared_ptr is a reference to an object and establishes a relationship: the object must not be destroyed before this shared_ptr, and when this shared_ptr is destroyed, if it is the last one aliasing this object, the object must be destroyed immediately. (unique_ptr can be viewed as a special case of shared_ptr where there is zero aliasing by definition, so the unique_ptr is always the last one aliasing an object.)
Why you should use smart pointers
It is recommended to use smart pointers because they express a lot with only variables and functions declarations.
Smart pointers can only express a well-defined design, they don't take away the need to define ownership. In contrast, garbage collection takes away the need to define who is responsible for memory deallocation. (But do not take away the need to define who is responsible for other resources clean-up.)
Even in non-purely functional garbage collected languages, you need to make ownership clear: you don't want to overwrite the value of an object if other components still need the old value. This is notably true in Java, where the concept of ownership of mutable data structure is extremely important in threaded programs.
What about raw pointers?
The use of a raw pointer does not mean there is no ownership. It's just not described by a variable declaration. It can be described in comments, in your design documents, etc.
That's why many C++ programmers consider that using raw pointers instead of the adequate smart pointer is inferior: because it's less expressive (I have avoided the terms "good" and "bad" on purpose). I believe the Linux kernel would be more readable with a few C++ objects to express relationships.
You can implement a specific design with or without smart pointers. The implementation that uses smart pointer appropriately will be considered superior by many C++ programmers.
Your real question
In a C++ project, what is a good design philosophy regarding use of "this"?
That's awfully vague.
It's dangerous to store the raw pointer for later use.
Why do you need to a pointer for later use?
You've given up control of object deletion and trust the responsible component to do it at the right time.
Indeed, some component is responsible for the lifetime of the variable. You cannot take the responsibility: it has to be transferred.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to use my class.
Obviously, since the caller is not informed that the function will hide a pointer and use it later without the control of the caller, you are creating bugs.
The solution is obviously to either:
transfer responsibility to handle the lifetime of the object to the function
ensure that the pointer is only saved and used under the control of the caller
Only in the first case, you might end up with a smart pointer in the class implementation.
The source of your problem
I think that your problem is that you are trying hard to complicate matters using smart pointers. Smart pointers are tools to make things easier, not harder. If smart pointers complicate your specification, then rethink your spec in term of simpler things.
Don't try to introduce smart pointers as a solution before you have a problem.
Only introduce smart pointers to solve a specific well-defined problem. Because you don't describe a specific well-defined problem, it is not possible to discuss a specific solution (involving smart pointers or not).

While i don't have a general answer or some idiom, there is boost::enable_shared_from_this . It allows you to get a shared_ptr managing an object that is already managed by shared_ptr. Since in a member function you have no reference to those managing shared_ptr's, enable_shared_ptr does allow you to get a shared_ptr instance and pass that when you need to pass the this pointer.
But this won't solve the issue of passing this from within the constructor, since at that time, no shared_ptr is managing your object yet.

One example of correct use is return *this; in functions like operator++() and operator<<().

When you are using a smart pointer class, you are right that is dangerous to directly expose "this". There are some pointer classes related to boost::shared_ptr<T> that may be of use:
boost::enable_shared_from_this<T>
Provides the ability to have an object return a shared pointer to itself that uses the same reference counting data as an existing shared pointer to the object
boost::weak_ptr<T>
Works hand-in-hand with shared pointers, but do not hold a reference to the object. If all the shared pointers go away and the object is released, a weak pointer will be able to tell that the object no longer exists and will return you NULL instead of a pointer to invalid memory. You can use weak pointers to get shared pointers to a valid reference-counted object.
Neither of these is foolproof, of course, but they'll at least make your code more stable and secure while providing appropriate access and reference counting for your objects.

If you need to use this, just use it explicitly. Smart pointers wrap only pointers of the objects they own - either exclusivelly (unique_ptr) or in a shared manner (shared_ptr).

I personally like to use the this pointer when accessing member variables of the class. For example:
void foo::bar ()
{
this->some_var += 7;
}
It's just a harmless question of style. Some people like it, somepeople don't.
But using the this pointer for any other thing is likely to cause problems. If you really need to do fancy things with it, you should really reconsider your design. I once saw some code that, in the constructor of a class, it assigned the this pointer to another pointer stored somewhere else! That's just crazy, and I can't ever think of a reason to do that. The whole code was a huge mess, by the way.
Can you tell us what exactly do you want to do with the pointer?

Another option is using intrusive smart pointers, and taking care of reference counting within the object itself, not the pointers. This requires a bit more work, but is actually more efficient and easy to control.

Another reason to pass around this is if you want to keep a central registry of all of the objects. In the constructor, an object calls a static method of the registry with this. Its useful for various publish/subscribe mechanisms, or when you don't want the registry to need knowledge of what objects/classes are in the system.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js