How do pointers work with the concepts of Object oriented programming?
As I understand it (and please recognize, I'm classified as an ID-10T), the main tenet of OOP is containment and keeping management responsibility (memory/implementation/etc.) contained within the class; but when an object's method returns a pointer it seems like we are 'popping' the object. Now, somebody might need to worry about:
Are they supposed to delete the pointer's associated object?
But what if the class still needs the object?
Can they change the object? If so, how? (I recognize const might solve this issue)
and so forth...
It seems the user of the object now needs to know much more about how the class works and what the class expects of the user. It feels like a "cat's out of the bag" scenario which seems to slap in the face of OOP.
NOTE: I notice this is a language independent question; however, I was prompted to ask the question while working in a C++ environment.
What you describe are ownership issues. These are orthogonal (i.e. independent, you can have either without the other or even both) to object orientation. You have the same issues if you do not use OOP and juggle pointers to POD structs. You don't have the issue if you use OOP but solve it somehow. You can (try to) solve it using more OOP or in another way.
They are also orthogonal to the use of pointers (unless you nit pick and extend the definition of pointer). For example, the same issues arise if two separate places hold indices into an array and mutate, resize and ultimately delete the array.
In C++, the usual solution is to select the right smart pointer type (e.g. return a shared pointer when you wish to share the object, or a unique pointer to signify exclusive ownership), along with extensive documentation. Actually, the latter is a key ingredient in any language.
One OOP-related thing you can do to help this is encapsulation (of course, you can have encapsulation just fine without OOP). For instance, don't expose the object at all, only expose methods which query the object under the hood. Or don't expose raw pointers, only expose smart pointers.
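To make the smart pointer and encapsulation suggestions above concrete, here is a minimal sketch. The names Widget and WidgetStore are made up for the illustration, and the std:: smart pointers are assumed (C++11 or later); it is one possible shape, not the only one.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical types, named only for this illustration.
struct Widget {
    std::string label;
};

class WidgetStore {
public:
    // Returning unique_ptr: the caller becomes the sole owner.
    static std::unique_ptr<Widget> make(const std::string& label) {
        return std::unique_ptr<Widget>(new Widget{label});
    }

    // Returning shared_ptr: the store and the caller co-own the object.
    std::shared_ptr<Widget> shared() const { return shared_; }

    // Encapsulation: expose a query instead of the object itself.
    std::string sharedLabel() const { return shared_->label; }

private:
    std::shared_ptr<Widget> shared_ = std::make_shared<Widget>(Widget{"default"});
};

void demoOwnershipSignatures() {
    std::unique_ptr<Widget> mine = WidgetStore::make("mine");
    assert(mine->label == "mine");

    WidgetStore store;
    std::shared_ptr<Widget> co = store.shared();
    assert(co.use_count() == 2);   // the store and `co`
    assert(store.sharedLabel() == "default");
}
```

The return type alone answers "who deletes this?" for the caller; what the caller may change still has to be documented or restricted with const.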
For starters... You can't have polymorphism without pointers or
references. In C++, traditionally, objects are copied, and have (for
the most part) automatic storage duration. But copy doesn't work with
polymorphic objects—they tend to get sliced. And OO also often
means identity, which in turn means you don't want copy. So the
solution is for the object to be dynamically allocated, and to pass
around pointers. What you do with them is part of the design:
If the object is logically part of another object, then that object is
responsible for its lifetime, and objects which receive the pointer
should take steps to ensure that they don't use it after the owning
object disappears. (Note that this is true even in languages with
garbage collection. The object won't disappear as long as you've got a
pointer to it, but once the owning object is invalid, the owned object
may become invalid as well. The fact that the garbage collector won't
recycle the memory won't guarantee that the object you point to is
usable.)
If the object is a first class entity itself, rather than being
logically part of another object, then it should probably take care of
itself. Again, other objects which may hold a pointer to it must be
informed if it ceases to exist (or becomes invalid). The use of the
Observer pattern is the usual solution. Back when I started C++, there
was a fashion for "relationship management", with some sort of
management classes where you registered relationships, and which
supposedly ensured that everything worked out OK. In practice, they
either didn't work, or didn't do any more than the simple observer
pattern, and you don't hear any more of them today.
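As a rough sketch of the observer idea described above: the object notifies anyone holding a pointer to it just before it goes away. Entity, Client and onDestroyed are hypothetical names, and the sketch assumes the client outlives the entity (a real design would also need a way to unsubscribe).

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical entity that tells interested parties when it is going away.
class Entity {
public:
    using Callback = std::function<void()>;

    void onDestroyed(Callback cb) { observers_.push_back(std::move(cb)); }

    ~Entity() {
        for (auto& cb : observers_) cb();   // notify while still valid
    }

private:
    std::vector<Callback> observers_;
};

// Hypothetical client holding a non-owning pointer to an Entity.
class Client {
public:
    explicit Client(Entity& e) : entity_(&e) {
        e.onDestroyed([this] { entity_ = nullptr; });
    }
    Client(const Client&) = delete;             // the callback captures `this`
    Client& operator=(const Client&) = delete;

    bool hasEntity() const { return entity_ != nullptr; }

private:
    Entity* entity_;
};

void demoObserver() {
    std::unique_ptr<Entity> e(new Entity);
    Client c(*e);
    assert(c.hasEntity());
    e.reset();                 // Entity destroyed, Client notified
    assert(!c.hasEntity());
}
```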
For the most part, your precise questions are part of the contract that
each class has to establish for each of its functions. For true OO
classes (entity objects), you should probably never delete them: that's
their business, not yours. But there are exceptions: if you're dealing
with transactions, for example, a deleted object cannot be rolled back,
so when an object decides to delete itself, it will usually register
this fact with the transaction manager, who will delete it as part of
the commit, once it's established that roll back won't be necessary. As
for changing the object, that's a question of the contract: in a lot of
applications, there are mapping objects, which are used to map an
external identifier of some sort to the object. With the goal, often,
of being able to modify the object.
From my understanding and experience, it generally depends on what you are trying to do, as well as on the language you are using pointers in (e.g. C++ vs Objective-C).
Usually, though, in C++ terms, I've found that it's best either to return a smart pointer (such as std::shared_ptr) by reference (perhaps even by const reference, depending on the situation), or simply to hide the pointer in the class. If it NEEDS to be accessed or used by something outside the class, use a getter method which either copies the pointer and returns the copy, or returns a reference to the pointer (granted, AFAIK ref-to-ptr is only possible in C++). If someone doesn't know that you shouldn't delete through a ref-to-ptr in most situations (in particular when deallocation is handled by the class internally), you should really think twice about whether or not they're ready to be doing C++ stuff on your team.
It's fairly common to just use public references for class members if they can be stack allocated (i.e., if they won't take up too much memory), while managing heap allocated objects internally. If you need to set the class member outside of the class, it's possible to just use a set method which takes the required value, rather than access it directly.
I'm wondering if anyone's run across anything that exists which would fill this need.
Object A contains an object B. It wants to provide access to that B to clients through a pointer (maybe there's the option it could be 0, or maybe the clients need to be copyable and yet hold references...whatever). Clients, let's call them object C, would normally, if we're perfect developers, be written carefully so as to not violate the lifetime semantics of any pointer to B they might have...but we're not perfect, in fact we're pretty dumb half the time.
So what we want is for object C to have a pointer to object B that is not "shared" ownership but that is smart enough to recognize a situation in which the pointer is no longer valid, such as when object A is destroyed or it destroys object B. Accessing this pointer when it's no longer valid would cause an assertion/exception/whatever.
In other words, I wish to share access to data in a safe, clear way but retain the original ownership semantics. Currently, because I've not been able to find any shared pointer in which one of the objects owns it, I've been using shared_ptr in place of having such a thing. But I want clear ownership and shared/weak pointer doesn't really provide that.
Would be nice further if this smart pointer could be attached to member variables and not just hold pointers to dynamically allocated memory regions.
If it doesn't exist I'm going to make it, so I first want to know if someone's already released something out there that does it.
And, BTW, I do realize that things like references and pointers do provide this sort of thing...I'm looking for something smarter.
boost::weak_ptr is what you are looking for. Maybe with some minor tweaks though, like prohibiting creation of shared_ptr from it. Also, this can hold anything, including pointer to memory that is not dynamically allocated.
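A minimal sketch of that suggestion, written with std::weak_ptr (assumed here as the standard counterpart of boost::weak_ptr). A, B, observeB and readValue are illustrative names. Note that lock() does create a temporary shared_ptr, which is the part the answer suggests restricting.

```cpp
#include <cassert>
#include <memory>

struct B { int value = 42; };

class A {
public:
    A() : b_(std::make_shared<B>()) {}

    // Clients get a non-owning handle; only A keeps the object alive.
    std::weak_ptr<B> observeB() const { return b_; }

    void dropB() { b_.reset(); }

private:
    std::shared_ptr<B> b_;
};

// Hypothetical client-side helper: returns -1 if B is gone.
int readValue(const std::weak_ptr<B>& handle) {
    if (auto locked = handle.lock()) {   // shared ownership only while in use
        return locked->value;
    }
    return -1;
}

void demoWeakHandle() {
    A a;
    std::weak_ptr<B> handle = a.observeB();
    assert(readValue(handle) == 42);
    a.dropB();                           // A destroys its B
    assert(handle.expired());
    assert(readValue(handle) == -1);
}
```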
The semantics you want is similar to Qt's QPointer. This is a pointer that can hold QObjects and nulls itself when the corresponding QObject is deleted (ordinarily, e.g. by operator delete).
However, this approach has an inherent problem: the client cannot be sure it isn't using a dangling pointer, e.g.:
QPointer<T> smart_ptr = original_obj;
T* tmp = smart_ptr; // this might be a function argument etc.
... // later
delete original_obj;
... // even later
tmp->do_something(); // CRASH
This can be avoided using some "hard" references that don't allow object deletion, which is exactly what shared_ptr/weak_ptr do.
BTW, AFAIK, shared_ptr can point to member variables, except it can't manage them. That is, you must provide a custom deleter that doesn't do anything.
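A sketch of the no-op deleter idea for a member variable, using std::shared_ptr as an assumption in place of the boost type. Owner and Part are hypothetical names. The caveat from the answer still applies: the shared_ptr does not manage the member, so a client that locks the weak pointer and keeps the result past the owner's lifetime is left with a dangling pointer.

```cpp
#include <cassert>
#include <memory>

struct Part { int value = 7; };

class Owner {
public:
    Owner() : handle_(&part_, [](Part*) { /* no-op: part_ is a member */ }) {}
    Owner(const Owner&) = delete;              // a copy would alias the original's member
    Owner& operator=(const Owner&) = delete;

    std::weak_ptr<Part> observe() const { return handle_; }

private:
    Part part_;                      // declared first, so it is destroyed last
    std::shared_ptr<Part> handle_;   // provides the control block; never deletes part_
};

void demoMemberHandle() {
    std::weak_ptr<Part> w;
    {
        Owner o;
        w = o.observe();
        assert(!w.expired());
        assert(w.lock()->value == 7);
    }
    assert(w.expired());             // the owner is gone and the handle knows it
}
```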
I have some intermittent segmentation faults in a Qt application. I think the problem is related to our (bad) use of QSharedPointer. The Qt Documentation states :
QSharedPointer::QSharedPointer ( T * ptr ) :
Creates a QSharedPointer that points to ptr. The pointer ptr becomes managed by this QSharedPointer and must not be passed to another QSharedPointer object or deleted outside this object.
I think we are doing both of the "must not"s... :/
Is there an OOP way to enforce that the pointer managed by QSharedPointer cannot be deleted or passed to another QSharedPointer?
The best solution will be to have a compiler error.
The normal pattern is to put the new statement inside the smart pointer's constructor, like this:
QSharedPointer<Obj> p (new Obj(2));
That way you never have a reference to the naked pointer itself.
If you refactor your code so that every use of the new operator is in a line like this, all your problems will be solved.
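The same pattern, sketched with std::shared_ptr rather than QSharedPointer so that the snippet does not depend on Qt (that substitution is mine, not the answer's). With std::make_shared the raw pointer never even appears in the source; makeObj is a made-up helper.

```cpp
#include <cassert>
#include <memory>

struct Obj {
    explicit Obj(int n) : n(n) {}
    int n;
};

std::shared_ptr<Obj> makeObj(int n) {
    // No named raw pointer ever exists, so nothing can delete it or
    // hand it to a second, unrelated shared pointer.
    return std::make_shared<Obj>(n);
}

void demoNoNakedPointer() {
    std::shared_ptr<Obj> p = makeObj(2);
    std::shared_ptr<Obj> q = p;      // copying a shared pointer is the safe way to share
    assert(p.use_count() == 2);
    assert(q->n == 2);
}
```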
Well, an OOP-esque way would be to create the raw pointer as a private member in a wrapper class, and only perform actions on the pointer through methods that act on the shared pointer. kind of silly though, isn't it?
Or you could make your class with the raw pointer a base class to your other classes and make the raw pointer a private member in the class. In this regard, you're more or less creating an abstract class that does nothing. Your derivative classes must instead do all the work, and since they can't access the raw pointer, compilation will fail... this doesn't stop someone from just copying the raw pointer value out of the shared pointer, though.
In the end, I think your best policy is to manually change all of the functions in question to use either a shared pointer or else a raw pointer. You can copy one shared pointer to another safely, so why not just go that way?
Edit:
I might add that regardless of whether or not you're using shared pointers, it sounds like you're having ownership issues. If a pointer was created in one scope, it should be deleted in that scope, unless the function that it is passed to contractually takes ownership of the pointer. Using a shared pointer in this scenario will only cause different bugs, eventually. It sounds like you have design issues deeper than just the sharing of pointers.
I'm not familiar with the particular Qt implementation of a shared pointer, but as a general guideline : attempting to mix raw pointers with managed pointers usually ends in blood. Once you 'trust' a shared pointer implementation in taking ownership of your dynamically allocated data, you should under no circumstances try to manage the object lifetime yourself (for example by deleting the provided pointer).
Is there an OOP way to enforce that the pointer managed by QSharedPointer cannot be deleted ?
I guess you could imagine some weird technique where the pointed type would have a private destructor and declare QSharedPointer as friend (which would effectively prevent any 'outside deletion' from compiling), but I wouldn't bet that anything good can come out of this (and note that it will make your type absolutely unusable unless new'ed and transferred to a QSharedPointer).
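For what it's worth, here is a sketch of the private destructor trick using std::shared_ptr with a nested deleter instead of a friend declaration on QSharedPointer (that variation is an assumption of mine; whether befriending QSharedPointer itself works depends on Qt internals). Managed and create are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class Managed {
public:
    // The only way to obtain an instance is already wrapped in a shared_ptr.
    static std::shared_ptr<Managed> create(int v) {
        return std::shared_ptr<Managed>(new Managed(v), Deleter{});
    }
    int value() const { return value_; }

private:
    explicit Managed(int v) : value_(v) {}
    ~Managed() = default;            // private: `delete ptr` outside will not compile

    // A nested class may access the private destructor of its enclosing class.
    struct Deleter {
        void operator()(Managed* p) const { delete p; }
    };

    int value_;
};

// Outside code can neither destroy a Managed nor create one on the stack.
static_assert(!std::is_destructible<Managed>::value,
              "Managed cannot be destroyed by outside code");

void demoPrivateDestructor() {
    std::shared_ptr<Managed> p = Managed::create(5);
    assert(p->value() == 5);
    // delete p.get();               // would be a compile error
}
```

As the answer warns, this also rules out stack and member instances of the type, which is a high price to pay.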
Is there an OOP way to enforce that the pointer managed by QSharedPointer cannot be passed to another QSharedPointer?
I can't think of any, and that is another reason why you should avoid manipulating the raw pointer once its ownership has been transferred to a QSharedPointer.
Check your code for all .data() usage and make sure what they return is neither stored nor deleted. I don't think a hard compiler error would be good, because sometimes it's okay to pass the raw pointer, e.g. to a function that doesn't store nor delete passed pointers. (Especially when using 3rd-party code, you cannot always change everything to use shared pointers, and often you want it to work with both, raw and shared ptrs).
One could mark QSharedPointer::data() as deprecated (by patching Qt), to get a compile time warning.
After reading some tutorials I came to the conclusion that one should always use pointers for objects. But I have also seen a few exceptions while reading some Qt tutorials (http://zetcode.com/gui/qt4/painting/) where a QPainter object is created on the stack. So now I am confused. When should I use pointers?
If you don't know when you should use pointers just don't use them.
It will become apparent when you need to use them, every situation is different. It is not easy to sum up concisely when they should be used. Do not get into the habit of 'always using pointers for objects', that is certainly bad advice.
Main reasons for using pointers:
control object lifetime;
can't use references (e.g. you want to store something non-copyable in vector);
you need to pass a pointer to some third-party function;
maybe some optimization reasons, but I'm not sure.
It's not clear to me if your question is ptr-to-obj vs stack-based-obj or ptr-to-obj vs reference-to-obj. There are also uses that don't fall into either category.
Regarding pointer vs stack, that seems to already be covered above. There are several reasons; the most obvious is the lifetime of the object.
Regarding vs references, always strive to use references, but there are things you can do only with ptrs, for example (there are many uses):
walking through elements in an array (e.g., marching over a standard array[])
when a called function allocates something & returns it via a ptr
Most importantly, pointers (and references, as opposed to automatic/stack-based & static objects) support polymorphism. A pointer to a base class may actually point to a derived class. This is fundamental to the OO behavior supported in C++.
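A small sketch of why polymorphism needs a pointer or a reference: passing a derived object by value copies only the base part (slicing). Shape and Circle are illustrative names.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

std::string byValue(Shape s) { return s.name(); }            // copies only the Shape part
std::string byReference(const Shape& s) { return s.name(); } // dynamic type preserved
std::string byPointer(const Shape* s) { return s->name(); }  // dynamic type preserved

void demoSlicing() {
    Circle c;
    assert(byValue(c) == "shape");       // sliced
    assert(byReference(c) == "circle");
    assert(byPointer(&c) == "circle");
}
```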
First off, the question is wrong: the dilemma is not between pointers and stack, but between heap and stack. You can have an object on the stack and pass the pointer to that object. I assume what you are really asking is whether you should declare a pointer to class or an instance of class.
The answer is that it depends on what you want to do with the object. If the object has to exist after the control leaves the function, then you have to use a pointer and create the object on heap. You will do this, for example, when your function has to return the pointer to the created object or add the object to a list that was created before calling your function.
On the other hand, if the object is local to the function, then it is better to create it on the stack. This enables the compiler to call the destructor when the control leaves the function.
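The two cases can be sketched as follows. Tracked is a made-up type that counts live instances so the lifetimes become observable, and returning std::unique_ptr (rather than a raw pointer) is my assumption about how one would hand a heap object to the caller today.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts live instances.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};
int Tracked::alive = 0;

void useLocally() {
    Tracked t;                       // automatic storage
    assert(Tracked::alive == 1);
}                                    // destructor runs here, automatically

std::unique_ptr<Tracked> createForCaller() {
    return std::unique_ptr<Tracked>(new Tracked);   // outlives this function
}

void demoLifetimes() {
    assert(Tracked::alive == 0);
    useLocally();
    assert(Tracked::alive == 0);
    {
        std::unique_ptr<Tracked> p = createForCaller();
        assert(Tracked::alive == 1); // still alive after the creating function returned
    }
    assert(Tracked::alive == 0);
}
```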
Which tutorials would those be? Actually, the rule is that you should use pointers only when you absolutely have to, which is quite rarely. You need to read a good book on C++, like Accelerated C++ by Koenig & Moo.
Edit: To clarify a bit - two instances where you would not use a pointer (string is being used here as an exemplar - same would go for any other type):
#include <string>
using std::string;

class Person {
public:
    string name; // NOT string * name;
    // ...
};

void f() {
    string value; // NOT string * value
    // use value
}
You usually have to use pointers in the following scenarios:
You need a collection of objects that belong to different classes (in most cases they will have a common base).
You need a collection of objects so large that allocating it on the stack would likely cause a stack overflow.
You need a data structure that can rearrange objects quickly - like a linked list, tree or similar.
You need some complex logic of lifetime management for your object.
You need a data structure that allows for direct navigation from object to object - like a linked list, tree or any other graph.
In addition to the points others make (esp. w.r.t. controlling the object lifetime), if you need to handle NULL objects, you should use pointers, not references. It's possible to produce a NULL reference through typecasting, but that relies on undefined behavior and is a bad idea.
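A tiny sketch of the difference; lengthOrZero and length are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A pointer parameter can express "no object"; a reference cannot.
std::size_t lengthOrZero(const std::string* maybe) {
    return maybe ? maybe->size() : 0;
}

// A reference parameter promises that the caller always supplies an object.
std::size_t length(const std::string& s) {
    return s.size();
}

void demoNullable() {
    std::string s = "abc";
    assert(lengthOrZero(nullptr) == 0);
    assert(lengthOrZero(&s) == 3);
    assert(length(s) == 3);
}
```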
Generally use pointers / references to objects when:
passing them to other methods
creating a large array (I'm not sure what the normal stack size is)
Use the stack when:
You are creating an object that lives and dies within the method
The object is the size of a CPU register or smaller
I actually use pointers in this situation:
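class Bar; // defined elsewhere
class Foo
{
public:
    Foo(Bar& bar) : bar(&bar) { }
    // Accessor renamed: a member function called Bar() would clash with the type name Bar.
    Bar& getBar() const { return *bar; }
private:
    Bar* bar; // a pointer member keeps Foo assignable
};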
Before that, I used reference members, initialized from the constructor, but the compiler has a problem creating copy constructors, assignment operators, and the lot.
Dave
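The trade-off described in the answer above can be checked with type traits. As far as I know, a reference member still allows the implicit copy constructor; it is the implicit copy assignment operator that gets deleted, because a reference cannot be reseated. WithReference and WithPointer are illustrative names.

```cpp
#include <cassert>
#include <type_traits>

struct Bar {};

struct WithReference {
    explicit WithReference(Bar& b) : bar(b) {}
    Bar& bar;   // a reference cannot be reseated
};

struct WithPointer {
    explicit WithPointer(Bar& b) : bar(&b) {}
    Bar* bar;   // a pointer can be reassigned
};

static_assert(std::is_copy_constructible<WithReference>::value,
              "copy construction still works with a reference member");
static_assert(!std::is_copy_assignable<WithReference>::value,
              "implicit copy assignment is deleted for reference members");
static_assert(std::is_copy_assignable<WithPointer>::value,
              "pointer members keep the implicit copy assignment");

void demoAssignable() {
    Bar first, second;
    WithPointer a(first), b(second);
    a = b;                           // fine: just copies the pointer
    assert(a.bar == &second);
}
```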
using pointers is connected with two orthogonal things:
1. Dynamic allocation. In general, you should allocate dynamically when the object is intended to live longer than the scope in which it's created. Such an object is a resource whose owner has to be clearly specified (most commonly some sort of smart pointer).
2. Accessing by address (regardless of how the object was created). In this context a pointer doesn't mean ownership. Such access could be needed when:
some already existing interface requires that.
association which could be null should be modeled.
copying of large objects should be avoided or copying is impossible at all, but the reference can't be used (e.g., stl collections).
#1 and #2 can occur in different configurations: for example, you can imagine a dynamically allocated object accessed by pointer, but such an object could also be passed by reference to some function. You can also get a pointer to some object which is created on the stack, etc.
Pass by value with well behaved copyable objects is the way to go for a large amount of your code.
If speed really matters, use pass by reference where you can, and finally use pointers.
If possible never use pointers. Rely on pass by reference or if you are going to return a structure or class, assume that your compiler has return value optimization. (You have to avoid conditional construction of the returned class however).
There is a reason why Java doesn't have pointers. C++ doesn't need them either. If you avoid their use you will get the added benefit of automatic object destruction when the object leaves scope. Otherwise your code will be generating memory errors of various types. Memory leaks can be very tricky to find and often occur in C++ due to unhandled exceptions.
If you must use pointers, consider some of the smart pointer classes like auto_ptr (deprecated in C++11 in favor of std::unique_ptr). Auto destruction of objects is about more than just releasing the underlying memory. There is a concept called RAII. Some objects require additional handling on destruction, e.g. releasing mutexes, closing files, etc.
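A minimal RAII sketch: the resource here is just a bool flag standing in for a mutex or file handle, and Guard and mayThrow are hypothetical names. The point is that the release happens on every exit path, including when an exception propagates.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard: acquires in the constructor, releases in the destructor.
class Guard {
public:
    explicit Guard(bool& held) : held_(held) { held_ = true; }
    ~Guard() { held_ = false; }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;

private:
    bool& held_;
};

void mayThrow(bool& held, bool fail) {
    Guard g(held);                                   // resource acquired
    if (fail) throw std::runtime_error("failure");
}                                                    // released on both exit paths

void demoRaii() {
    bool held = false;
    mayThrow(held, false);
    assert(!held);                                   // released after normal return
    try {
        mayThrow(held, true);
    } catch (const std::runtime_error&) {
        // the guard's destructor already ran during stack unwinding
    }
    assert(!held);                                   // released after the exception too
}
```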
Use pointers when you don't want your object to be destroyed when the stack frame is emptied.
Use references for passing parameters where possible.
Speaking about C++, objects created on the stack cannot be used once the program has left the scope they were created in. So generally, when you know you don't need a variable past a function or past a closing brace, you can create it on the stack.
Speaking about Qt specifically, Qt helps the programmer by handling a lot of the memory management of heap objects. For objects that are derived from QObject (almost all classes prefixed by "Q" are), constructors take an optional parameter parent. The parent then owns the object, and when the parent is deleted, all owned objects are deleted as well. In essence, the responsibility of the children's destruction is passed to the parent object. When using this mechanism, child QObjects must be created on the heap.
In short, in Qt you can easily create objects on the heap, and as long as you set a proper parent, you'll only have to worry about destroying the parent. In general C++, however, you'll need to remember to destroy heap objects, or use smart pointers.
In a C++ project that uses smart pointers, such as boost::shared_ptr, what is a good design philosophy regarding use of "this"?
Consider that:
It's dangerous to store the raw pointer contained in any smart pointer for later use. You've given up control of object deletion and trust the smart pointer to do it at the right time.
Non-static class members intrinsically use a this pointer. It's a raw pointer and that can't be changed.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to make a shared pointer to my class.
Given that, when is it ever appropriate for me to explicitly use a this pointer? Are there design paradigms that can prevent bugs related to this?
Wrong question
In a C++ project that uses smart pointers
The issue has nothing to do with smart pointers actually. It is only about ownership.
Smart pointers are just tools
They change nothing WRT the concept of ownership, esp. the need to have well-defined ownership in your program, the fact that ownership can be voluntarily transferred, but cannot be taken by a client.
You must understand that smart pointers (also locks and other RAII objects) represent a value and a relationship WRT this value at the same time. A shared_ptr is a reference to an object and establishes a relationship: the object must not be destroyed before this shared_ptr, and when this shared_ptr is destroyed, if it is the last one aliasing this object, the object must be destroyed immediately. (unique_ptr can be viewed as a special case of shared_ptr where there is zero aliasing by definition, so the unique_ptr is always the last one aliasing an object.)
Why you should use smart pointers
It is recommended to use smart pointers because they express a lot with only variables and functions declarations.
Smart pointers can only express a well-defined design, they don't take away the need to define ownership. In contrast, garbage collection takes away the need to define who is responsible for memory deallocation. (But it does not take away the need to define who is responsible for cleaning up other resources.)
Even in non-purely functional garbage collected languages, you need to make ownership clear: you don't want to overwrite the value of an object if other components still need the old value. This is notably true in Java, where the concept of ownership of mutable data structure is extremely important in threaded programs.
What about raw pointers?
The use of a raw pointer does not mean there is no ownership. It's just not described by a variable declaration. It can be described in comments, in your design documents, etc.
That's why many C++ programmers consider that using raw pointers instead of the adequate smart pointer is inferior: because it's less expressive (I have avoided the terms "good" and "bad" on purpose). I believe the Linux kernel would be more readable with a few C++ objects to express relationships.
You can implement a specific design with or without smart pointers. The implementation that uses smart pointer appropriately will be considered superior by many C++ programmers.
Your real question
In a C++ project, what is a good design philosophy regarding use of "this"?
That's awfully vague.
It's dangerous to store the raw pointer for later use.
Why do you need to store a pointer for later use?
You've given up control of object deletion and trust the responsible component to do it at the right time.
Indeed, some component is responsible for the lifetime of the variable. You cannot take the responsibility: it has to be transferred.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to use my class.
Obviously, since the caller is not informed that the function will hide a pointer and use it later without the control of the caller, you are creating bugs.
The solution is obviously to either:
transfer responsibility to handle the lifetime of the object to the function
ensure that the pointer is only saved and used under the control of the caller
Only in the first case, you might end up with a smart pointer in the class implementation.
The source of your problem
I think that your problem is that you are trying hard to complicate matters using smart pointers. Smart pointers are tools to make things easier, not harder. If smart pointers complicate your specification, then rethink your spec in terms of simpler things.
Don't try to introduce smart pointers as a solution before you have a problem.
Only introduce smart pointers to solve a specific well-defined problem. Because you don't describe a specific well-defined problem, it is not possible to discuss a specific solution (involving smart pointers or not).
While I don't have a general answer or some idiom, there is boost::enable_shared_from_this. It allows you to get a shared_ptr managing an object that is already managed by shared_ptr. Since in a member function you have no reference to those managing shared_ptr's, enable_shared_from_this does allow you to get a shared_ptr instance and pass that when you need to pass the this pointer.
But this won't solve the issue of passing this from within the constructor, since at that time, no shared_ptr is managing your object yet.
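A short sketch using std::enable_shared_from_this, which I'm assuming here as the standard counterpart of the boost class. Node and self are made-up names.

```cpp
#include <cassert>
#include <memory>

class Node : public std::enable_shared_from_this<Node> {
public:
    // Hands out a shared_ptr that shares the existing control block,
    // instead of passing the raw `this` pointer around.
    std::shared_ptr<Node> self() { return shared_from_this(); }
};

void demoSharedFromThis() {
    std::shared_ptr<Node> n = std::make_shared<Node>();
    std::shared_ptr<Node> s = n->self();
    assert(s == n);                  // same object
    assert(n.use_count() == 2);      // same reference count, not a second owner group
}
```

Calling shared_from_this() on an object that no shared_ptr manages yet (which includes calling it from the constructor) does not work: since C++17 it throws std::bad_weak_ptr, and before that it was undefined behavior.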
One example of correct use is return *this; in functions like operator++() and operator<<().
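For instance, a prefix increment on a hypothetical Counter class might look like this:

```cpp
#include <cassert>

class Counter {
public:
    Counter& operator++() {          // prefix increment
        ++value_;
        return *this;                // returns the object itself, not a copy
    }
    int value() const { return value_; }

private:
    int value_ = 0;
};

void demoReturnThis() {
    Counter c;
    ++(++c);                         // chaining works because a reference comes back
    assert(c.value() == 2);
    Counter& r = ++c;
    assert(&r == &c);
}
```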
When you are using a smart pointer class, you are right that it is dangerous to directly expose "this". There are some pointer classes related to boost::shared_ptr<T> that may be of use:
boost::enable_shared_from_this<T>
Provides the ability to have an object return a shared pointer to itself that uses the same reference counting data as an existing shared pointer to the object
boost::weak_ptr<T>
Works hand-in-hand with shared pointers, but does not hold a reference to the object. If all the shared pointers go away and the object is released, a weak pointer will be able to tell that the object no longer exists and will return you NULL instead of a pointer to invalid memory. You can use weak pointers to get shared pointers to a valid reference-counted object.
Neither of these is foolproof, of course, but they'll at least make your code more stable and secure while providing appropriate access and reference counting for your objects.
If you need to use this, just use it explicitly. Smart pointers wrap only pointers of the objects they own - either exclusively (unique_ptr) or in a shared manner (shared_ptr).
I personally like to use the this pointer when accessing member variables of the class. For example:
void foo::bar ()
{
this->some_var += 7;
}
It's just a harmless question of style. Some people like it, some people don't.
But using the this pointer for any other thing is likely to cause problems. If you really need to do fancy things with it, you should really reconsider your design. I once saw some code that, in the constructor of a class, it assigned the this pointer to another pointer stored somewhere else! That's just crazy, and I can't ever think of a reason to do that. The whole code was a huge mess, by the way.
Can you tell us what exactly do you want to do with the pointer?
Another option is using intrusive smart pointers, and taking care of reference counting within the object itself, not the pointers. This requires a bit more work, but is actually more efficient and easy to control.
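A hand-rolled sketch of the intrusive idea, to show why it plays well with this: the count lives inside the object, so wrapping the same raw pointer twice is harmless. RefCounted, IntrusivePtr and Job are hypothetical names (this is not boost::intrusive_ptr), and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: the reference count lives inside the object.
class RefCounted {
public:
    void addRef() { ++count_; }
    void release() { if (--count_ == 0) delete this; }
    int refCount() const { return count_; }

protected:
    virtual ~RefCounted() = default;   // only release() may destroy the object

private:
    int count_ = 0;
};

template <typename T>
class IntrusivePtr {
public:
    explicit IntrusivePtr(T* p = nullptr) : p_(p) { if (p_) p_->addRef(); }
    IntrusivePtr(const IntrusivePtr& o) : p_(o.p_) { if (p_) p_->addRef(); }
    IntrusivePtr& operator=(IntrusivePtr o) { std::swap(p_, o.p_); return *this; }
    ~IntrusivePtr() { if (p_) p_->release(); }

    T* operator->() const { return p_; }
    T* get() const { return p_; }

private:
    T* p_;
};

struct Job : RefCounted { int id = 1; };

void demoIntrusive() {
    Job* raw = new Job;
    IntrusivePtr<Job> a(raw);
    {
        IntrusivePtr<Job> b(raw);      // safe: both use the count stored in *raw
        assert(raw->refCount() == 2);
    }
    assert(raw->refCount() == 1);
}                                      // `a` drops the last reference; the Job deletes itself
```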
Another reason to pass around this is if you want to keep a central registry of all of the objects. In the constructor, an object calls a static method of the registry with this. It's useful for various publish/subscribe mechanisms, or when you don't want the registry to need knowledge of what objects/classes are in the system.
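A minimal sketch of such a registry, with hypothetical Registry and Component names. The registry stores non-owning pointers only, so the important detail is that each object also removes itself in its destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class Component;

// Hypothetical registry: it observes objects, it does not own them.
class Registry {
public:
    static void add(Component* c) { items().insert(c); }
    static void remove(Component* c) { items().erase(c); }
    static std::size_t size() { return items().size(); }

private:
    static std::set<Component*>& items() {
        static std::set<Component*> s;
        return s;
    }
};

class Component {
public:
    Component() { Registry::add(this); }
    Component(const Component&) { Registry::add(this); }   // copies register too
    Component& operator=(const Component&) = default;
    ~Component() { Registry::remove(this); }               // never leave a dangling entry
};

void demoRegistry() {
    assert(Registry::size() == 0);
    {
        Component a;
        Component b(a);
        assert(Registry::size() == 2);
    }
    assert(Registry::size() == 0);
}
```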