I think I have a considerable experience with normal (functional) designed patters, as described e.g. in the gang of four book, which I mainly used in java and C#. In these "managed" languages this is pretty much everything you need to know to get your work done.
However, in C++ world the developer also has the control of how all the objects get allocated, passed around and deleted. I understand the principles (I read Stroutrup among other texts), but it still takes me a lot of effort to decide which mechanism is best for a given scenario - and this is where a portfolio of memory-related design patterns would be useful.
For example, yesterday I had to create a class Results, that was a container for a few objects and a collection (std::vector in this case) of yet another type of objects. So there are a few design questions I couldn't really answer:
Should I return this class by value, or by smart pointer?
Inside the class, should the vector and the objects be normal members, or should they be stored as smart pointers again?
In the vector, should I store the objects directly, or smart pointers to them again?
What should the getters defined on my Results class return (i.e. values, references or smart pointers)?
Of course, smart pointers are cool and what not, but they create syntactic clutter and I am not convinced if using malloc for every single object is optimal approach.
I would be grateful for answers for the specific points above, but even more for some longer and more general texts on memory-related design patterns - so that I can solve the problems I will have on Mondays as well!
The answer to all of your questions ends up being one and the same: it depends on whether you need reference semantics or value semantics (with some caveats to be taken into account).
If you need reference semantics, which is what you have by default in languages like Java and C# for UDTs (User-defined Data Types) declared with the class keyword, then you will have to go for smart pointers. In this scenario you want several clients to hold safe aliases to a specific object, where the word safe encapsulates these two requirements:
Avoid dangling references, so that you won't try to access an object that doesn't exist anymore;
Avoid objects which outlive all of the references to them, so that you won't leak memory.
This is what smart pointers do. If you need reference semantics (and if your algorithms are not such to make the overhead of reference counting significant where shared ownership is needed), then you should use smart pointers.
You do need reference semantics, for instance, when you want the same object to be part of several collections. When you update the object in one collection, you want the representations of the same object in all the other collections to be consistently updated. In this case, you store smart pointers to your objects in those collections. Smart pointers encapsulate the identity of an object rather than its value.
But if you do not need to create aliases, then value semantics is probably what you should rely on. This is what you get by default in C++ when you declare an object with automatic storage (i.e. on the stack).
One thing to consider is that STL collections store values, so if you have a vector<T>, then copies of T will be stored in your vector. Always supposing that you do not need reference semantics, this might become anyway an overhead if your objects are big and expensive to copy around.
To limit the likelyhood of this scenario, C++11 comes with move operations, which make it possible to efficiently transfer objects by value when the old copy of the object is no more needed.
I will now try to use the above concepts to answer your questions more directly.
1) Should I return this class by value, or by smart pointer?
It depends on whether you need reference semantics or not. What does the function do with that object? Is the object returned by that function supposed to be shared by many clients? If so, then by smart pointer. If not, is it possible to define an efficient move operation (this is almost always the case)? If so, then by value. If not, by smart pointer.
2) Inside the class, should the vector and the objects be normal members, or should they be stored as smart pointers again?
Most likely as normal members, since vectors are usually meant to be conceptually a part of your object, and their lifetime is therefore bound to the lifetime of the object that embeds them. You rarely want reference semantics in such a scenario, but if you do, then use smart pointers.
3) In the vector, should I store the objects directly, or smart pointers to them again?
Same answer as for point 1): do you need to share those objects? Are you supposed to store aliases to those objects? Do you want changes to those objects to be seen in different parts of your code which refer those objects? If so, then use shared pointers. If not, is it possible to efficiently copy and/or move those objects? If so (most of the time), store values. If not, store smart pointers.
4) What should the getters defined on my Results class return (i.e. values, references or smart pointers)?
Same answer as for point 2): it depends on what you plan to do with the returned objects: do you want them to be shared by many parts of your code? If so, return a smart pointer. If they shall be exclusively owned by just one part, return by value, unless moving/copying those objects is too expensive or not allowed at all (quite unlikely). In that case, return a smart pointer.
As a side note, please be aware that smart pointers in C++ are a bit trickier than Java/C# references: first of all, you have two main flavors of smart pointers depending on whether shared ownership (shared_ptr) or unique ownership (unique_ptr) is desired. Secondly, you need to avoid circular references of shared_ptr, which would create islands of objects that keep each other alive even though they are no more reachable by your running code. This is the reason why weak pointers (weak_ptr) exist.
These concept naturally lead to the concept of responsibility for managing the lifetime of an object or (more generally) the management of a used resource. You might want to read about the RAII idiom for instance (Resource Acquisition Is Initialization), and about exception handling in general (writing exception-safe code is one of the main reasons why these techniques exist).
Related
Comparisons, Pros, Cons, and When to Use?
This is a spin-off from a garbage collection thread where what I thought was a simple answer generated a lot of comments about some specific smart pointer implementations so it seemed worth starting a new post.
Ultimately the question is what are the various implementations of smart pointers in C++ out there and how do they compare? Just simple pros and cons or exceptions and gotchas to something you might otherwise think should work.
I've posted some implementations that I've used or at least glossed over and considered using as an answer below and my understanding of their differences and similarities which may not be 100% accurate so feel free to fact check or correct me as needed.
The goal is to learn about some new objects and libraries or correct my usage and understanding of existing implementations already widely in use and end up with a decent reference for others.
C++03
std::auto_ptr - Perhaps one of the originals it suffered from first draft syndrome only providing limited garbage collection facilities. The first downside being that it calls delete upon destruction making them unacceptable for holding array allocated objects (new[]). It takes ownership of the pointer so two auto pointers shouldn't contain the same object. Assignment will transfer ownership and reset the rvalue auto pointer to a null pointer. Which leads to perhaps the worst drawback; they can't be used within STL containers due to the aforementioned inability to be copied. The final blow to any use case is they are slated to be deprecated in the next standard of C++.
std::auto_ptr_ref - This is not a smart pointer it's actually a design detail used in conjunction with std::auto_ptr to allow copying and assignment in certain situations. Specifically it can be used to convert a non-const std::auto_ptr to an lvalue using the Colvin-Gibbons trick also known as a move constructor to transfer ownership.
On the contrary perhaps std::auto_ptr wasn't really intended to be used as a general purpose smart pointer for automatic garbage collection. Most of my limited understanding and assumptions are based on Herb Sutter's Effective Use of auto_ptr and I do use it regularly although not always in the most optimized way.
C++11
std::unique_ptr - This is our friend who will be replacing std::auto_ptr it will be quite similar except with the key improvements to correct the weaknesses of std::auto_ptr like working with arrays, lvalue protection via private copy constructor, being usable with STL containers and algorithms, etc. Since it's performance overhead and memory footprint are limited this is an ideal candidate for replacing, or perhaps more aptly described as owning, raw pointers. As the "unique" implies there is only one owner of the pointer just like the previous std::auto_ptr.
std::shared_ptr - I believe this is based off TR1 and boost::shared_ptr but improved to include aliasing and pointer arithmetic as well. In short it wraps a reference counted smart pointer around a dynamically allocated object. As the "shared" implies the pointer can be owned by more than one shared pointer when the last reference of the last shared pointer goes out of scope then the object will be deleted appropriately. These are also thread safe and can handle incomplete types in most cases. std::make_shared can be used to efficiently construct a std::shared_ptr with one heap allocation using the default allocator.
std::weak_ptr - Likewise based off TR1 and boost::weak_ptr. This is a reference to an object owned by a std::shared_ptr and will therefore not prevent the deletion of the object if the std::shared_ptr reference count drops to zero. In order to get access to the raw pointer you'll first need to access the std::shared_ptr by calling lock which will return an empty std::shared_ptr if the owned pointer has expired and been destroyed already. This is primarily useful to avoid indefinite hanging reference counts when using multiple smart pointers.
Boost
boost::shared_ptr - Probably the easiest to use in the most varying scenarios (STL, PIMPL, RAII, etc) this is a shared referenced counted smart pointer. I've heard a few complaints about performance and overhead in some situations but I must have ignored them because I can't remember what the argument was. Apparently it was popular enough to become a pending standard C++ object and no drawbacks over the norm regarding smart pointers come to mind.
boost::weak_ptr - Much like previous description of std::weak_ptr, based on this implementation, this allows a non-owning reference to a boost::shared_ptr. You not surprisingly call lock() to access the "strong" shared pointer and must check to make sure it's valid as it could have already been destroyed. Just make sure not to store the shared pointer returned and let it go out of scope as soon as you're done with it otherwise you're right back to the cyclic reference problem where your reference counts will hang and objects will not be destroyed.
boost::scoped_ptr - This is a simple smart pointer class with little overhead probably designed for a better performing alternative to boost::shared_ptr when usable. It's comparable to std::auto_ptr especially in the fact that it can't be safely used as an element of a STL container or with multiple pointers to the same object.
boost::intrusive_ptr - I've never used this but from my understanding it's designed to be used when creating your own smart pointer compatible classes. You need to implement the reference counting yourself, you'll also need to implement a few methods if you want your class to be generic, furthermore you'd have to implement your own thread safety. On the plus side this probably gives you the most custom way of picking and choosing exactly how much or how little "smartness" you want. intrusive_ptr is typically more efficient than shared_ptr since it allows you to have a single heap allocation per object. (thanks Arvid)
boost::shared_array - This is a boost::shared_ptr for arrays. Basically new [], operator[], and of course delete [] are baked in. This can be used in STL containers and as far as I know does everything boost:shared_ptr does although you can't use boost::weak_ptr with these. You could however alternatively use a boost::shared_ptr<std::vector<>> for similar functionality and to regain the ability to use boost::weak_ptr for references.
boost::scoped_array - This is a boost::scoped_ptr for arrays. As with boost::shared_array all the necessary array goodness is baked in. This one is non-copyable and so can't be used in STL containers. I've found almost anywhere you find yourself wanting to use this you probably could just use std::vector. I've never determined which is actually faster or has less overhead but this scoped array seems far less involved than a STL vector. When you want to keep allocation on the stack consider boost::array instead.
Qt
QPointer - Introduced in Qt 4.0 this is a "weak" smart pointer which only works with QObject and derived classes, which in the Qt framework is almost everything so that's not really a limitation. However there are limitations namely that it doesn't supply a "strong" pointer and although you can check if the underlying object is valid with isNull() you could find your object being destroyed right after you pass that check especially in multi-threaded environments. Qt people consider this deprecated I believe.
QSharedDataPointer - This is a "strong" smart pointer potentially comparable to boost::intrusive_ptr although it has some built in thread safety but it does require you to include reference counting methods (ref and deref) which you can do by subclassing QSharedData. As with much of Qt the objects are best used through ample inheritance and subclassing everything seems to be the intended design.
QExplicitlySharedDataPointer - Very similar to QSharedDataPointer except it doesn't implicitly call detach(). I'd call this version 2.0 of QSharedDataPointer as that slight increase in control as to exactly when to detach after the reference count drops to zero isn't particularly worth a whole new object.
QSharedPointer - Atomic reference counting, thread safe, sharable pointer, custom deletes (array support), sounds like everything a smart pointer should be. This is what I primarily use as a smart pointer in Qt and I find it comparable with boost:shared_ptr although probably significantly more overhead like many objects in Qt.
QWeakPointer - Do you sense a reoccurring pattern? Just as std::weak_ptr and boost::weak_ptr this is used in conjunction with QSharedPointer when you need references between two smart pointers that would otherwise cause your objects to never be deleted.
QScopedPointer - This name should also look familiar and actually was in fact based on boost::scoped_ptr unlike the Qt versions of shared and weak pointers. It functions to provide a single owner smart pointer without the overhead of QSharedPointer which makes it more suitable for compatibility, exception safe code, and all the things you might use std::auto_ptr or boost::scoped_ptr for.
There is also Loki which implements policy-based smart pointers.
Other references on policy-based smart pointers, addressing the problem of the poor support of the empty base optimization along with multiple inheritance by many compilers:
Smart Pointers Reloaded
A Proposal to Add a Policy-Based Smart Pointer Framework to the Standard Library
In addition to the ones given, there are some safety oriented ones too:
SaferCPlusPlus
mse::TRefCountingPointer is a reference counting smart pointer like std::shared_ptr. The difference being that mse::TRefCountingPointer is safer, smaller and faster, but does not have any thread safety mechanism. And it comes in "not null" and "fixed" (non-retargetable) versions that can be safely assumed to always be pointing to a validly allocated object. So basically, if your target object is shared among asynchronous threads then use std::shared_ptr, otherwise mse::TRefCountingPointer is more optimal.
mse::TScopeOwnerPointer is similar to boost::scoped_ptr, but works in conjunction with mse::TScopeFixedPointer in a "strong-weak" pointer relationship kind of like std::shared_ptr and std::weak_ptr.
mse::TScopeFixedPointer points to objects that are allocated on the stack, or whose "owning" pointer is allocated on the stack. It is (intentionally) limited in it's functionality to enhance compile-time safety with no runtime cost. The point of "scope" pointers is essentially to identify a set of circumstances that are simple and deterministic enough that no (runtime) safety mechanisms are necessary.
mse::TRegisteredPointer behaves like a raw pointer, except that its value is automatically set to null_ptr when the target object is destroyed. It can be used as a general replacement for raw pointers in most situations. Like a raw pointer, it does not have any intrinsic thread safety. But in exchange it has no problem targeting objects allocated on the stack (and obtaining the corresponding performance benefit). When run-time checks are enabled, this pointer is safe from accessing invalid memory. Because mse::TRegisteredPointer has the same behavior as a raw pointer when pointing to valid objects, it can be "disabled" (automatically replaced with the corresponding raw pointer) with a compile-time directive, allowing it to be used to help catch bugs in debug/test/beta modes while incurring no overhead cost in release mode.
Here is an article describing why and how to use them. (Note, shameless plug.)
Are there any reasons to still use raw pointers (for managed resources) in C++11/14?
Should resource member variables in a class be held in their own smart pointers for automatic RAII without need for cleanup in destructor?
Is the implementation of smart pointers inlined that there is no overhead in doing so?
Are there any reasons to still use raw pointers (for managed
resources) in C++11/14?
I assume that by "managed resources" you mean "owned resources".
Yes there are reasons:
As you are inferring in your question, sometime you want to refer to an object or none, and it can change through time. You have to use a raw pointer in this case, because there is no alternative right now in this specific case. There might be later as there is a proposal about adding a "dumb" non-owning pointer which just clarify the role of the pointer (observe/refer, not own). Meanwhile the recommendation is to avoid new/delete if you can and use raw pointer only as "refers to without owning" kind of re-assignable reference.
You still need raw pointers for implementation of non-raw pointers and any low level RAII construct. Not everybody needs to work on fundamental libraries, but of course if you do, then you need basic constructs to work with. For example in my domain I often have to build up custom "object pool" systems in different ways. At some point in the implementation you have to manipulate raw memory which are not objects yet, so you need to use raw pointers to work with that.
When communicating with C interfaces, you have no other choices than to pass raw pointers to functions taking them. A lot of C++ developers have to do this, so it's just in some very specific regions of your code that you will have to use them.
Most companies using C++ don't have a "modern C++" experience of C++ and work with code that use a lot of pointers where it is not actually necessary. So most of the time when you add code to their codebase, you might be forced by the code-environment, politics, peer-pressure and conventions of the company to use pointers where in a more "modern c++" usage kind of company it would not pass peer-review. So consider the political/historic/social/knowledge-base-of-coworkers context too when choosing your techniques. Or make sure companies/projects you work in match your way of doing things (this might be harder).
Should resource member variables in a class be held in their own smart
pointers for automatic RAII without need for cleanup in destructor?
Resource member variables, in the best case, should just be members without being obvious pointers, not even smart pointers. Smart pointers are a "bridge" between code manipulating raw pointers and pure RAII style. If you have total control over some code and it's new code, you can totally avoid making any use of smart pointer in your interfaces. Maybe you will need them in your implementations though. Keep in mind that there is no actual rules, only recommendations of what you could result in if you
Is the implementation of smart pointers inlined that there is no
overhead in doing so?
Implementation of standard smart pointers is as efficient as they can be so yes most of their code is inlined. However, they are not always free, it depends on what they actually do. For example, in almost all cases, unique_ptr is exactly one raw pointer, with just additional checks around it's places of use. So it's "free". shared_ptr on the other hand have to maintain a counter of how many other shared_ptr refer to the same object. That counter can be changed on several threads doing copies of the shared_ptr, so it have to be atomic. Changing the value of atomic counters is not always free, and you should always assume that there is a higher cost than copying a raw pointer.
So "it depends".
Just:
use RAII as much as you can, without exposing any kind of pointer in your interfaces (smart or not);
use standard smart pointers in implementations if you must use owning pointers;
use raw pointers only if you need to refer to object, null, or other objects changing through time, without owning them;
avoid raw pointers in interfaces except in case of allowing passing optional objects (when nullptr is a correct argument);
You will end-up with code that, from the user's perspective, don't seem to manipulate pointers. If you have several layers of code following these rules, the code will be easier to follow and highly maintainable.
On a related note from: When to use references vs. pointers
Avoid pointers until you can't.
Also note that Sean Parents in his recent talks also consider smart pointers to be raw pointers. They can indeed be encapsulated as implementation details of value-semantic types corresponding to the actual concept being manipulated. Additionally using type-erasure techniques in implementations but never exposing them to the user helps extensibility of some library constructs.
It depends. If the object is fully owned, constructed and destructed by another object, there is a good case for using an std::unique_ptr in that other object. If you have an object which owns several such objects, all of which are dynamically allocated in the constructor, then you have to do something; if the semantics of the usual smart pointers isn't appropriate (which is often the case), then you'll have to invent something: in the case of a graph, for example, you might put the root pointer in a base class (initializing it to null), and have the destructor of the base class clean-up the graph, starting at the root.
Of course, unless your class has some sort of dynamic structure like a graph, you might ask yourself why it is using dynamic allocation to begin with. There are special cases (where, for example, the owned object is polymorphic, and its actual type depends on arguments to the constructor), but in my experience, they aren't that common. In practice, there just aren't that many cases where a smart pointer can be used in an object, much less should be used.
RAII is more than just wrapping up new and delete - a smart pointer is a form of RAII but RAII is a lot more than that. A good candidate for RAII is when you have any sort of mirroring functions: new/delete, initialise/teardown, stop/start. So your resource should still have its own RAII class - one that performs its cleanup function in its own destructor.
How do pointers work with the concepts of Object oriented programming?
As I understand it (and please recognize, I'm classified as an ID-10T), the main tenet of OOP is containment and keeping management responsibility (memory/implementation/etc.) contained within the class; but when an object's method returns a pointers it seems like we are 'popping' the object. Now, somebody might need to worry about:
Are they supposed to delete the pointer's associated object?
But what if the class still needs the object?
Can they change the object? If so, how? (I recognize const might solve this issue)
and so forth...
It seems the user of the object now needs to know much more about how the class works and what the class expects of the user. It feels like a "cat's out of the bag" scenario which seems to slap in the face of OOP.
NOTE: I notice this is a language independent question; however, I was prompted to ask the question while working in a C++ environment.
What you describe are ownership issues. These are orthogonal (i.e. independent, you can have either without the other or even both) to object orientation. You have the same issues if you do not use OOP and juggle pointers to POD structs. You don't have the issue if you use OOP but solve it somehow. You can (try to) solve it using more OOP or in another way.
They are also orthogonal to the use of pointers (unless you nit pick and extend the definition of pointer). For example, the same issues arise if two separate places hold indices into an array and mutate, resize and ultimately delete the array.
In C++, the usual solution is to select the right smart pointer type (e.g. return a shared pointer when you wish to share the object, or a unique pointer to signify exclusive ownership), along with extensive documentation. Actually, the latter is a key ingredient in any language.
One OOP-related thing you can do to help this is encapsulation (of course, you can have encaptulation just fine without OOP). For instance, don't expose the object at all, only expose methods which query the object under the hood. Or don't expose raw pointers, only expose smart pointers.
For starters... You can't have polymorphism without pointers or
references. In C++, traditionally, objects are copied, and have (for
the most part) automatic storage duration. But copy doesn't work with
polymorphic objects—they tend to get sliced. And OO also often
means identity, which in turn means you don't want copy. So the
solution is for the object to be dynamically allocated, and to pass
around pointers. What you do with them is part of the design:
If the object is logically part of another object, then that object is
responsible for its lifetime, and objects which receive the pointer
should take steps to ensure that they don't use it after the owning
object disappears. (Note that this is true even in languages with
garbage collection. The object won't disappear as long as you've got a
pointer to it, but once the owning object is invalid, the owned object
may become invalid as well. The fact that the garbage collector won't
recycle the memory won't guarantee that the object you point to is
usable.)
If the object is a first class entity itself, rather than being
logically part of another object, then it should probably take care of
itself. Again, other objects which may hold a pointer to it must be
informed if it ceases to exist (or becomes invalid). The use of the
Observer pattern is the usual solution. Back when I started C++, there
was a fashion for "relationship management", with some sort of
management classes where you registered relationships, and which
supposedly ensured that everything worked out OK. In practice, they
either didn't work, or didn't do any more than the simple observer
pattern, and you don't hear any more of them today.
For the most part, your precise questions are part of the contract that
each class has to establish for each of its functions. For true OO
classes (entity objects), you should probably never delete them: that's
there business, not yours. But there are exceptions: if you're dealing
with transactions, for example, a deleted object cannot be rolled back,
so when an object decides to delete itself, it will usually register
this fact with the transaction manager, who will delete it as part of
the commit, once it's established that roll back won't be necessary. As
for changing the object, that's a question of the contract: in a lot of
applications, there are mapping objects, which are used to map an
external identifier of some sort to the object. With the goal, often,
of being able to modify the object.
From my understanding and experience, it generally revolves around what it is that you are trying to do as well as the language using pointers (e.g. C++ vs Objective-C).
Usually, though, in C++ terms, I've found that it's best to return either a reference to a smart pointer (such as std::shared_ptr) by reference (perhaps even const reference, depending on the situation), or simply hide the pointer in the class, and if it NEEDS to be accessed or used by something outside of it, use a getter method which either copies the pointer and returns that, or returns a reference to a pointer (granted, AFAIK ref-to-ptr is only possible in C++). If someone doesn't know that you shouldn't delete a ref-to-ptr in most situations (of course, if its deallocation is handled by the class internally), you should really think twice about whether or not they're ready to be doing C++ stuff on your team.
It's fairly common to just use public references for class members if they can be stack allocated (i.e., if they won't take up too much memory), while managing heap allocated objects internally. If you need to set the class member outside of the class, it's possible to just use a set method which takes the required value, rather than access it directly.
I have a C++ application which makes extensively use of pointers to maintain quite complex data structures. The application performs mathematical simulations on huge data sets (which could take several GB of memory), and is compiled using Microsoft's Visual Studio 2010.
I am now reworking an important part of the application. To reduce errors (dangling pointers, memory leaks, ...) I would want to start using smart pointers. Sacrificing memory or performance is acceptible as long as it is limited.
In practice most of the classes are maintained in big pools (one pool per class) and although the classes can refer to each other, you could consider the pool as owner of all the instances of that class. However, if the pool decides to delete an instance, I don't want any of the other classes that still refers to the deleted instance to have a dangling pointer.
In another part I keep a collection of pointers to instances that are delivered by other modules in the application. In practice the other modules maintain ownership of the passed instance, but in some cases, modules don't want to take care of the ownership and just want to pass the instance to the collection, telling it "it's yours now, manage it".
What is the best way to start introducing smart pointers? Just replacing pointers [at random] with smart pointers doesn't seem a correct way, and probably doesn't deliver all the (or any of the) advantages of smart pointers. But what is a better method?
Which types of smart pointers should I further investigate? I sometimes use std::auto_ptr for the deallocation of locally allocated memory, but this seems to be deprected in C++0x. Is std::unique_ptr a better alternative? Or should I go straight to shared pointers or other types of smart pointers?
The question Replacing existing raw pointers with smart pointers seems similar but instead of asking how easy it is, I am asking what the best approach would be, and which kind of smart pointers are suited best.
Thanks in advance for your ideas and suggestions.
I recommend using unique_ptr when possible (this may require some program analysis) and shared_ptr when this is impossible. When in doubt, use a shared_ptr to maximize safety: when handing off control to a container, the reference count will simply go to two and then back to one and the container will eventually delete the associated object automatically. When performance becomes an issue, consider using boost::intrusive_ptr.
Here are the 3 varieties found in the new C++11 standard (unique_ptr replaces auto_ptr)
http://www.stroustrup.com/C++11FAQ.html#std-unique_ptr
http://www.stroustrup.com/C++11FAQ.html#std-shared_ptr
http://www.stroustrup.com/C++11FAQ.html#std-weak_ptr
You can read the text for each pointer and there is an explanation of when to use which in there. For local memory management unique_ptr is the choice. It is non-copyable but movable so as you move it around the receiver takes ownership of it.
Shared_ptr is used if you want to share an object instance around with no one really owning the object and to make sure it doesn't get deleted while someone still has a reference to it. Once the last user of an object destroys the shared_ptr container, the contained object will be deleted.
weak_ptr is used in conjunction with shared_ptr. It enables one to "lock" to see if the reference shared_ptr object still exists before trying to access the internal object.
In a C++ project that uses smart pointers, such as boost::shared_ptr, what is a good design philosophy regarding use of "this"?
Consider that:
It's dangerous to store the raw pointer contained in any smart pointer for later use. You've given up control of object deletion and trust the smart pointer to do it at the right time.
Non-static class members intrinsically use a this pointer. It's a raw pointer and that can't be changed.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to make a shared pointer to my class.
Given that, when is it ever appropriate for me to explicitly use a this pointer? Are there design paradigms that can prevent bugs related to this?
Wrong question
In a C++ project that uses smart pointers
The issue has nothing to do with smart pointers actually. It is only about ownership.
Smart pointers are just tools
They change nothing WRT the concept of ownership, esp. the need to have well-defined ownership in your program, the fact that ownership can be voluntarily transferred, but cannot be taken by a client.
You must understand that smart pointers (also locks and other RAII objects) represent a value and a relationship WRT this value at the same time. A shared_ptr is a reference to an object and establishes a relationship: the object must not be destroyed before this shared_ptr, and when this shared_ptr is destroyed, if it is the last one aliasing this object, the object must be destroyed immediately. (unique_ptr can be viewed as a special case of shared_ptr where there is zero aliasing by definition, so the unique_ptr is always the last one aliasing an object.)
Why you should use smart pointers
It is recommended to use smart pointers because they express a lot with only variables and functions declarations.
Smart pointers can only express a well-defined design, they don't take away the need to define ownership. In contrast, garbage collection takes away the need to define who is responsible for memory deallocation. (But do not take away the need to define who is responsible for other resources clean-up.)
Even in non-purely functional garbage collected languages, you need to make ownership clear: you don't want to overwrite the value of an object if other components still need the old value. This is notably true in Java, where the concept of ownership of mutable data structure is extremely important in threaded programs.
What about raw pointers?
The use of a raw pointer does not mean there is no ownership. It's just not described by a variable declaration. It can be described in comments, in your design documents, etc.
That's why many C++ programmers consider that using raw pointers instead of the adequate smart pointer is inferior: because it's less expressive (I have avoided the terms "good" and "bad" on purpose). I believe the Linux kernel would be more readable with a few C++ objects to express relationships.
You can implement a specific design with or without smart pointers. The implementation that uses smart pointer appropriately will be considered superior by many C++ programmers.
Your real question
In a C++ project, what is a good design philosophy regarding use of "this"?
That's awfully vague.
It's dangerous to store the raw pointer for later use.
Why do you need to a pointer for later use?
You've given up control of object deletion and trust the responsible component to do it at the right time.
Indeed, some component is responsible for the lifetime of the variable. You cannot take the responsibility: it has to be transferred.
If I ever store this in another variable or pass it to another function which could potentially store it for later or bind it in a callback, I'm creating bugs that are introduced when anyone decides to use my class.
Obviously, since the caller is not informed that the function will hide a pointer and use it later without the control of the caller, you are creating bugs.
The solution is obviously to either:
transfer responsibility to handle the lifetime of the object to the function
ensure that the pointer is only saved and used under the control of the caller
Only in the first case, you might end up with a smart pointer in the class implementation.
The source of your problem
I think that your problem is that you are trying hard to complicate matters using smart pointers. Smart pointers are tools to make things easier, not harder. If smart pointers complicate your specification, then rethink your spec in term of simpler things.
Don't try to introduce smart pointers as a solution before you have a problem.
Only introduce smart pointers to solve a specific well-defined problem. Because you don't describe a specific well-defined problem, it is not possible to discuss a specific solution (involving smart pointers or not).
While i don't have a general answer or some idiom, there is boost::enable_shared_from_this . It allows you to get a shared_ptr managing an object that is already managed by shared_ptr. Since in a member function you have no reference to those managing shared_ptr's, enable_shared_ptr does allow you to get a shared_ptr instance and pass that when you need to pass the this pointer.
But this won't solve the issue of passing this from within the constructor, since at that time, no shared_ptr is managing your object yet.
One example of correct use is return *this; in functions like operator++() and operator<<().
When you are using a smart pointer class, you are right that is dangerous to directly expose "this". There are some pointer classes related to boost::shared_ptr<T> that may be of use:
boost::enable_shared_from_this<T>
Provides the ability to have an object return a shared pointer to itself that uses the same reference counting data as an existing shared pointer to the object
boost::weak_ptr<T>
Works hand-in-hand with shared pointers, but do not hold a reference to the object. If all the shared pointers go away and the object is released, a weak pointer will be able to tell that the object no longer exists and will return you NULL instead of a pointer to invalid memory. You can use weak pointers to get shared pointers to a valid reference-counted object.
Neither of these is foolproof, of course, but they'll at least make your code more stable and secure while providing appropriate access and reference counting for your objects.
If you need to use this, just use it explicitly. Smart pointers wrap only pointers of the objects they own - either exclusivelly (unique_ptr) or in a shared manner (shared_ptr).
I personally like to use the this pointer when accessing member variables of the class. For example:
void foo::bar ()
{
this->some_var += 7;
}
It's just a harmless question of style. Some people like it, somepeople don't.
But using the this pointer for any other thing is likely to cause problems. If you really need to do fancy things with it, you should really reconsider your design. I once saw some code that, in the constructor of a class, it assigned the this pointer to another pointer stored somewhere else! That's just crazy, and I can't ever think of a reason to do that. The whole code was a huge mess, by the way.
Can you tell us what exactly do you want to do with the pointer?
Another option is using intrusive smart pointers, and taking care of reference counting within the object itself, not the pointers. This requires a bit more work, but is actually more efficient and easy to control.
Another reason to pass around this is if you want to keep a central registry of all of the objects. In the constructor, an object calls a static method of the registry with this. Its useful for various publish/subscribe mechanisms, or when you don't want the registry to need knowledge of what objects/classes are in the system.