A weak/shared pointer, detect when one user remains, boost - c++

I want a pointer where I can tell when the reference count is one. Essentially the pointer works like a weak_ptr, but the cleanup needs to be manual. That is, every so often the program goes through a loop of its pointers and checks which ones have only one reference remaining. Some it will clean, others it will retain a while longer (in case somebody needs it again).
Now, I know how to do this using a combination of a custom cleanup function and weak_ptr. I just think the same thing could be accomplished, with simpler code, if I could simply figure out when only one user of the shared_ptr remains.
I know that shared_ptr has a use_count function, but it has this ominous note in the docs: "...not necessarily efficient. Use only for debugging and testing purposes..." Naturally I'm not so keen on using something with such a warning. I don't really need the count anyway, just a way to detect when there is only one left.
Is there some boost wrapper that achieves what I want (can be in any library)? Or must I use the technique I already know of custom cleanup function combined with a weak_ptr?

You cannot in general accurately determine the number of references. But you can tell when it is exactly one - use unique().

Destructively transform your shared_ptrs into weak_ptrs and then to shared_ptrs back again, except that some of those will be null. Of course there's no telling how that fares for performance, but given the interface we have it's either that or use_count.
Could look like:
std::for_each(begin, end, [](element_type& pointer)
{
std::weak_ptr<element_type::element_type> weak = element_type(std::move(pointer));
pointer = weak.lock();
});
auto predicate = [](element_type& pointer) { return !pointer; };
container.erase(std::remove_if(begin, end, predicate), end);

When you're doing something complex that can't quite be represented with the normal shared_ptr system, you might want to consider using intrusive_ptr instead - your implementation of intrusive_ptr_release can then queue objects up for later destruction instead of deleting them immediately. Note that intrusive_ptr can't be directly used with weak_ptr, although you can construct your own variant of weak_ptr if you prefer. Keep in mind, though, that if you are using multiple threads the reference counting may get a bit tricky.
If you can't use an intrusive pointer, and it's acceptable to invalidate existing weak_ptrs when the last shared_ptr is lost, you can have the destructor for the shared_ptr put the raw pointer back into your cache (or whatever) marked for eventual cleanup. You can rewrap it in a shared_ptr the next time it is retrieved. However, again, this has the downside of losing all weak_ptrs to the object at the moment of pseudo-destruction.
If you can't use an intrusive pointer, you may be best off simply designing your own smart pointer implementation. Unfortunately, shared_ptr does not have the kind of hooks needed to implement your goals efficiently, so you may be working from scratch.

Related

used case for boost make_shared

I am walking through a cpp code and have the following questions(have almost no exposure to boost library)
bool xxxx::calcYYY ()
{
bool retStatus = false;
boost::shared_ptr<DblMatrix> price = boost::make_shared<DblMatrix>(xxx, xxx);
.....
retStatus = true;
}
return retStatus;
}
Why are local scoped pointers instantiated as shared?
There must be an additional overhead to maintain reference counting,in a high performing code.
What is the boost alternative to do this correctly here?
Why are local scoped pointers instantiated as shared?
Because they point to objects that can be shared. For example, they can be passed to functions that can extend the life of the object they refer to.
There must be an additional overhead to maintain reference counting,in a high performing code.
Possibly. It's also possible the compiler can optimize it out. There are a lot of tricks, such as passing the pointers by const reference to avoid having to change the reference count, using std::move where possible, and so on that can make the additional overhead negligible for many use cases. But, presumably, it's being used because there is some benefit. We can't tell with just the code you've shown.
What is the boost alternative to do this correctly here?
There is nothing incorrect shown.
Why are local scoped pointers instantiated as shared?
Because the author wanted to use a smart pointer that would guarantee the memory is freed even if the function exits via an early return, or an exception.
The best choice for such a usage is std::unique_ptr, but that wasn't available until C++11, and if the original code predates that it wouldn't have been an option.
The next best choice is boost::unique_ptr. That actually handles this case perfectly, but it is in general a lot less useful than std::unique_ptr because it didn't support move semantics. As such, having a general rule "just use boost::shared_ptr and stop worrying" is perfectly sensible (particularly if you are going to pass the pointer into a general purpose function which may want to do reallocation).
There was also std::auto_ptr which also handles this case perfectly, but it's going to cause anyone reading the code to go "hang on, is this OK". It is best just avoided.
There must be an additional overhead to maintain reference counting,in a high performing code.
If the pointer is allocated once, and then never used until it is released, then the cost of an atomic increment and atomic decrement are going to be lost in the noise of a call to new and delete.

Is there a way to optimize shared_ptr for the case of permanent objects?

I've got some code that is using shared_ptr quite widely as the standard way to refer to a particular type of object (let's call it T) in my app. I've tried to be careful to use make_shared and std::move and const T& where I can for efficiency. Nevertheless, my code spends a great deal of time passing shared_ptrs around (the object I'm wrapping in shared_ptr is the central object of the whole caboodle). The kicker is that pretty often the shared_ptrs are pointing to an object that is used as a marker for "no value"; this object is a global instance of a particular T subclass, and it lives forever since its refcount never goes to zero.
Using a "no value" object is nice because it responds in nice ways to various methods that get sent to these objects, behaving in the way that I want "no value" to behave. However, performance metrics indicate that a huge amount of the time in my code is spent incrementing and decrementing the refcount of that global singleton object, making new shared_ptrs that refer to it and then destroying them. To wit: for a simple test case, the execution time went from 9.33 seconds to 7.35 seconds if I stuck nullptr inside the shared_ptrs to indicate "no value", instead of making them point to the global singleton T "no value" object. That's a hugely important difference; run on much larger problems, this code will soon be used to do multi-day runs on computing clusters. So I really need that speedup. But I'd really like to have my "no value" object, too, so that I don't have to put checks for nullptr all over my code, special-casing that possibility.
So. Is there a way to have my cake and eat it too? In particular, I'm imagining that I might somehow subclass shared_ptr to make a "shared_immortal_ptr" class that I could use with the "no value" object. The subclass would act just like a normal shared_ptr, but it would simply never increment or decrement its refcount, and would skip all related bookkeeping. Is such a thing possible?
I'm also considering making an inline function that would do a get() on my shared_ptrs and would substitute a pointer to the singleton immortal object if get() returned nullptr; if I used that everywhere in my code, and never used * or -> directly on my shared_ptrs, I would be insulated, I suppose.
Or is there another good solution for this situation that hasn't occurred to me?
Galik asked the central question that comes to mind regarding your containment strategy. I'll assume you've considered that and have reason to rely on shared_ptr as a communical containment strategy for which no alternative exists.
I have to suggestions which may seem controversial. What you've defined is that you need a type of shared_ptr that never has a nullptr, but std::shared_ptr doesn't do that, and I checked various versions of the STL to confirm that the customer deleter provided is not an entry point to a solution.
So, consider either making your own smart pointer, or adopting one that you change to suit your needs. The basic idea is to establish a kind of shared_ptr which can be instructed to point it's shadow pointer to a global object it doesn't own.
You have the source to std::shared_ptr. The code is uncomfortable to read. It may be difficult to work with. It is one avenue, but of course you'd copy the source, change the namespace and implement the behavior you desire.
However, one of the first things all of us did in the middle 90's when templates were first introduced to the compilers of the epoch was to begin fashioning containers and smart pointers. Smart pointers are remarkably easy to write. They're harder to design (or were), but then you have a design to model (which you've already used).
You can implement the basic interface of shared_ptr to create a drop in replacement. If you used typedefs well, there should be a limited few places you'd have to change, but at least a search and replace would work reasonably well.
These are the two means I'm suggesting, both ending up with the same feature. Either adopt shared_ptr from the library, or make one from scratch. You'd be surprised how quickly you can fashion a replacement.
If you adopt std::shared_ptr, the main theme would be to understand how shared_ptr determines it should decrement. In most implementations shared_ptr must reference a node, which in my version it calls a control block (_Ref). The node owns the object to be deleted when the reference count reaches zero, but naturally shared_ptr skips that if _Ref is null. However, operators like -> and *, or the get function, don't bother checking _Ref, they just return the shadow, or _Ptr in my version.
Now, _Ptr will be set to nullptr (or 0 in my source) when a reset is called. Reset is called when assigning to another object or pointer, so this works even if using assignment to nullptr. The point is, that for this new type of shared_ptr you need, you could simply change the behavior such that whenever that happens (a reset to nullptr), you set _Ptr, the shadow pointer in shared_ptr, to the "no value global" object's address.
All uses of *,get or -> will return the _Ptr of that no value object, and will correctly behave when used in another assignment, or reset is called again, because those functions don't rely upon the shadow pointer to act upon the node, and since in this special condition that node (or control block) will be nullptr, the rest of shared_ptr would behave as though it was pointing to nullptr correctly - that is, not deleting the global object.
Obviously this sounds crazy to alter std::pointer to such application specific behavior, but frankly that's what performance work tends to make us do; otherwise strange things, like abandoning C++ occasionally in order to obtain the more raw speed of C, or assembler.
Modifying std::shared_ptr source, taken as a copy for this special purpose, is not what I would choose (and, factually, I've faced other versions of your situation, so I have made this choice several times over decades).
To that end, I suggest you build a policy based smart pointer. I find it odd I suggested this earlier on another post today (or yesterday, it's 1:40am).
I refer to Alexandrescu's book from 2001 (I think it was Modern C++...and some words I don't recall). In that he presented loki, which included a policy based smart pointer design, which is still published and freely available on his website.
The idea should have been incorporated into shared_ptr, in my opinion.
Policy based design is implemented as the paradigm of a template class deriving from one or more of it's parameters, like this:
template< typename T, typename B >
class TopClass : public B {};
In this way, you can provide B, from which the object is built. Now, B may have the same construction, it may also be a policy level which derives from it's second parameter (or multiple derivations, however the design works).
Layers can be combined to implement unique behaviors in various categories.
For example:
std::shared_ptr and std::weak_ptrare separate classes which interact as a family with others (the nodes or control blocks) to provide smart pointer service. However, in a design I used several times, these two were built by the same top level template class. The difference between a shared_ptr and a weak_ptr in that design was the attachment policy offered in the second parameter to the template. If the type is instantiated with the weak attachment policy as the second parameter, it's a weak pointer. If it's given a strong attachment policy, it's a smart pointer.
Once you create a policy designed template, you can introduce layers not in the original design (expanding it), or to "intercept" behavior and specialize it like the one you currently require - without corrupting the original code or design.
The smart pointer library I developed had high performance requirements, along with a number of other options including custom memory allocation and automatic locking services to make writing to smart pointers thread safe (which std::shared_ptr doesn't provide). The interface and much of the code is shared, yet several different kinds of smart pointers could be fashioned simply by selecting different policies. To change behavior, a new policy could be inserted without altering the existing code. At present, I use both std::shared_ptr (which I used when it was in boost years ago) and the MetaPtr library I developed years ago, the latter when I need high performance or flexible options, like yours.
If std::shared_ptr had been a policy based design, as loki demonstrates, you'd be able to do this with shared_ptr WITHOUT having to copy the source and move it to a new namespace.
In any event, simply creating a shared pointer which points the shadow pointer to the global object on reset to nullptr, leaving the node pointing to null, provides the behavior you described.

Correctly using smart pointers

I'm having trouble getting things organized properly with smart pointers. Almost to the point that I feel compelled to go back to using normal pointers.
I would like to make it easy to use smart pointers throughout the program without having to type shared_ptr<...> every time. One solution I think of right away is to make a template class and add a typedef sptr to it so I can do class Derived : public Object < Derived > .. and then use Derived::sptr = ... But this obviously is horrible because it does not work with another class that is then derived from Derived object.
And even doing typedef shared_ptr<..> MyObjectPtr is horrible because then it needs to be done for each kind of smart pointer for consistency's sake, or at least for unique_ptr and shared_ptr.
So what's the standard way people use smart pointers? Because frankly I'm starting to see it as being too much hassle to use them. :/
So what's the standard way people use smart pointers?
Rarely. The fact that you find it a hassle to use them is a sign that you over-use pointers. Try to refactor your code to make pointers the exception, not the rule. shared_ptr in particular has its niche, but it’s a small one: namely, when you genuinely have to share ownership of a resource between several objects. This is a rare situation.
Because frankly I'm starting to see it as being too much hassle to use them. :/
Agreed. That’s the main reason not to use pointers.
There are more ways to avoid pointers. In particular, shared_ptr really only needs to spelled out when you actually need to pass ownership. In functions which don’t deal with ownership, you wouldn’t pass a shared_ptr, or a raw pointer; you would pass a reference, and dereference the pointer upon calling the function.
And inside functions you almost never need to spell out the type; for instance, you can (and should) simply say auto x = …; instead of shared_ptr<Class> x = …; to initialise variables.
In summary, you should only need to spell out shared_ptr in very few places in your code.
I have a lot of code that creates objects dynamically. So using pointers is necessary because the number of objects is not known from the start. An object is created in one subsystem, then stored in another, then passed for further processing to the subsystem that created it. So that I guess means using shared_ptr. Good design? I don't know, but it seems most logical to ask subsystem to create a concrete object that it owns, return a pointer to an interface for that object and then pass it for further processing to another piece of code that will interact with the object through it's abstract interface.
I could return unique_ptr from factory method. But then I would run into trouble if I need to pass the object for processing multiple times. Because I would still need to know about the object after I pass it to another method and unique_ptr would mean that I lose track of the object after doing move(). Since I need to have at least two references to the object this means using shared_ptr.
I heard somewhere that most commonly used smart pointer is unique_ptr. Certainly not so in my application. I end up with using shared_ptr mush more often. Is this a sign of bad design then?

Weak reference to a scoped_ptr?

Generally I follow the Google style guide, which I feel aligns nicely with the way I see things. I also, almost exclusively, use boost::scoped_ptr so that only a single manager has ownership of a particular object. I then pass around naked pointers, the idea being that my projects are structured such that the managers of said objects are always destroyed after the objects that use them are destroyed.
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Smart_Pointers
This is all great, however I was just bitten by a nasty little memory stomp bug where the owner just so happened to be deleted before objects that were using it were deleted.
Now, before everyone jumps up and down that I'm a fool for this pattern, why don't I just use shared_ptr ? etc., consider the point that I don't want to have undefined owner semantics. Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
What would have helped me would have been a weak pointer to a scoped pointer. In effect, a scoped pointer that has a list of weak references, that are nulled out when the scoped pointer destructs. This would allow single ownership semantics, but give the using objects a chance to catch the issue I ran into.
So at the expense of an extra 'weak_refs' pointer for the scoped_ptr and an extra pointer for the 'next_weak_ptr' in the weak_ptr, it would make a neat little single owner, multiple user structure.
It could maybe even just be a debug feature, so in 'release' the whole system just turns back into a normally sized scoped_ptr and a standard single pointer for the weak reference.
So..... my QUESTIONS after all of this are:
Is there such a pointer/patten already in stl/boost that I'm
missing, or should I just roll my own?
Is there a better way, that
still meets my single ownership goal?
Cheers,
Shane
2. Is there a better way, that still meets my single ownership goal?
Do use a shared_ptr, but as a class member so that it's part of the invariant of that class and the public interface only exposes a way to obtain a weak_ptr.
Of course, pathological code can then retain their own shared_ptr from that weak_ptr for as long as they want. I don't recommend trying to protect against Machiavelli here, only against Murphy (using Sutter's words). On the other hand, if your use case is asynchronous, then the fact that locking a weak_ptr returns a shared_ptr may be a feature!
Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
A shared_ptr doesn't mean "I don't know who owns this". It means "We own this." Just because one entity does not have exclusive ownership does not mean that anyone can own it.
The purpose of a shared_ptr is to ensure that the pointer cannot be destroyed until everyone who shares it is in agreement that it ought to be destroyed.
Is there such a pointer/patten already in stl/boost that I'm missing, or should I just roll my own?
You could use a shared_ptr exactly the same way you use a scoped_ptr. Just because it can be shared doesn't mean you have to share it. That would be the easiest way to work; just make single-ownership a convention rather than an API-established rule.
However, if you need a pointer that is single-owner and yet has weak pointers, there isn't one in Boost.
I'm not sure that weak pointers would help that much. Typically, if a
component X uses another component Y, X must be informed of Y's demise,
not just to nullify the pointer, but perhaps to remove it from a list,
or to change its mode of operation so that it no longer needs the
object. Many years ago, when I first started C++, there was a flurry of
activity trying to find a good generic solution. (The problem was
called relationship management back then.) As far as I know, no good
generic solution was every found; at least, every project I've worked on
has used a hand built solution based on the Observer pattern.
I did have a ManagedPtr on my site, when it was still up, which behaved
about like what you describe. In practice, except for the particular
case which led to it, I never found a real use for it, because
notification was always needed. It's not hard to implement, however;
the managed object derives from a ManagedObject class, and gets all of
the pointers (ManagedPtr, and not raw pointers) it hands out from it.
The pointer itself is registered with the ManagedObject class, and the
destructor of the ManagedObject class visits them all, and "disconnects"
them by setting the actual pointer to null. And of course, ManagedPtr
has an isValid function so that the client code can test before
dereferencing. This works well (regardless of how the object is
managed—most of my entity objects "own" themselves, and do a
delete this is response to some specific input), except that you tend
to leak invalid ManagedPtr (e.g. whenever the client keeps the pointer
in a container of some sort, because it may have more than one), and
clients still aren't notified if they need to take some action when your
object dies.
If you happen to be using QT, QPointer is basically a weak pointer to a QObject. It connects itself to the "I just got destroyed" event in the pointed-to value, and invalidates itself automatically as needed. That's a seriously hefty library to pull in for what amounts to bug tracking, though, if you aren't already jumping through QT's hoops.

Boost shared_ptr use_count function

My application problem is the following -
I have a large structure foo. Because these are large and for memory management reasons, we do not wish to delete them when processing on the data is complete.
We are storing them in std::vector<boost::shared_ptr<foo>>.
My question is related to knowing when all processing is complete. First decision is that we do not want any of the other application code to mark a complete flag in the structure because there are multiple execution paths in the program and we cannot predict which one is the last.
So in our implementation, once processing is complete, we delete all copies of boost::shared_ptr<foo>> except for the one in the vector. This will drop the reference counter in the shared_ptr to 1. Is it practical to use shared_ptr.use_count() to see if it is equal to 1 to know when all other parts of my app are done with the data.
One additional reason I'm asking the question is that the boost documentation on the shared pointer shared_ptr recommends not using "use_count" for production code.
Edit -
What I did not say is that when we need a new foo, we will scan the vector of foo pointers looking for a foo that is not currently in use and use that foo for the next round of processing. This is why I was thinking that having the reference counter of 1 would be a safe way to ensure that this particular foo object is no longer in use.
My immediate reaction (and I'll admit, it's no more than that) is that it sounds like you're trying to get the effect of a pool allocator of some sort. You might be better off overloading operator new and operator delete to get the effect you want a bit more directly. With something like that, you can probably just use a shared_ptr like normal, and the other work you want delayed, will be handled in operator delete for that class.
That leaves a more basic question: what are you really trying to accomplish with this? From a memory management viewpoint, one common wish is to allocate memory for a large number of objects at once, and after the entire block is empty, release the whole block at once. If you're trying to do something on that order, it's almost certainly easier to accomplish by overloading new and delete than by playing games with shared_ptr's use_count.
Edit: based on your comment, overloading new and delete for class sounds like the right thing to do. If anything, integration into your existing code will probably be easier; in fact, you can often do it completely transparently.
The general idea for the allocator is pretty much the same as you've outlined in your edited question: have a structure (bitmaps and linked lists are both common) to keep track of your free objects. When new needs to allocate an object, it can scan the bit vector or look at the head of the linked list of free objects, and return its address.
This is one case that linked lists can work out quite well -- you (usually) don't have to worry about memory usage, because you store your links right in the free object, and you (virtually) never have to walk the list, because when you need to allocate an object, you just grab the first item on the list.
This sort of thing is particularly common with small objects, so you might want to look at the Modern C++ Design chapter on its small object allocator (and an article or two since then by Andrei Alexandrescu about his newer ideas of how to do that sort of thing). There's also the Boost::pool allocator, which is generally at least somewhat similar.
If you want to know whether or not the use count is 1, use the unique() member function.
I would say your application should have some method that eliminates all references to the Foo from other parts of the app, and that method should be used instead of checking use_count(). Besides, if use_count() is greater than 1, what would your program do? You shouldn't be relying on shared_ptr's features to eliminate all references, your application architecture should be able to eliminate references. As a final check before removing it from the vector, you could assert(unique()) to verify it really is being released.
I think you can use shared_ptr's custom deleter functionality to call a particular function when the last copy has been released. That way, you're not using use_count at all.
You would need to hold something other than a copy of the shared_ptr in your vector so that the shared_ptr is only tracking the outstanding processing.
Boost has several examples of custom deleters in the shared_ptr docs.
I would suggest that instead of trying to use the shared_ptr's use_count to keep track, it might be better to implement your own usage counter. this way you will have full control over this rather than using the shared_ptr's one which, as you rightly suggest, is not recommended. You can also pre-set your own counter to allow for the number of threads you know will need to act on the data, rather than relying on them all being initialised at the beginning to get their copies of the structure.