C++ Is making objects depend on others a good design? - c++

I have a basic design that consists of three classes : A Data class, A Holder class wich holds and manages multiple Data objects, and a Wrapper returned by the Holder wich contains a reference to a Data object.
The problem is that Wrapper must not outlive Holder, or it will contain a dangling reference, as Holder is responsible for deleting the Data objects. But as Wrapper is intended to have a very short lifetime (get it in a function, make some computation on its data, and let it go out of scope), this should not be a problem, but i'm not sure this is a good design.
Here are some solutions i thought about:
-Rely on the user reading the documentation, technically the same thing happens with STL iterators
-Using shared_ptr to make sure the data lasts long enought, but it feels like overkill
-Make Wrapper verify its Holder still exists each time you use it
-Any idea?
(I hope everyone can understand this, as english is not my native language)
Edit : If you want to have a less theoric approach, this all comes from a little algorithm i'm trying to write to solve Sudokus, the Holder is the grid, the Data is the content of each box (either a result or a temporary supposition), and the Wrapper is a Box class wich contains a reference to the Data, plus additional information like row and column.
I did not originally said it because i want to know what to do in a more general situation.

Only ever returning a Wrapper by value will help ensure the caller doesn't hold onto it beyond the calling scope. In conjunction with a comment or documentation that "Wrappers are only valid for the lifetime of the Holder that created them" should be enough.
Either that or you decide that ownership of a Data is transferred to the Wrapper when the Wrapper is created, and the Data is destroyed along with the Wrapper - but what if you want a to destroy a Wrapper without deleting the Data? You'd need a method to optionally relinquish ownership of the Data back to the Holder.
Whichever you choose, you need to decide what owns (ie: is responsible for the lifetime of) Data and when - once you've done that you can, if you want, use smart pointers to help with that management - but they won't make the design decision for you, and you can't simply say "oh I'll use smart pointers instead of thinking about it".
Remember, if you can't manage heap memory without smart pointers - you've got no business managing heap memory with them either!

To elaborate on what you have already listed as options,
As you suggested, shared_ptr<Data> is a good option. Unless performance is an issue, you should use it.
Never hold a pointer to Data in Wrapper. Store a handle that can be used to get a pointer to the appropriate Data object. Before Data is accessed through Wrapper, get a pointer the Data object. If the pointer is not valid, throw an exception. If the pointer is valid, proceed along the happy path.

Related

API return type -- Force a smart pointer onto the user?

Suppose I have an API-border function that creates and returns an object:
<?> createOneObject();
What would be the best way to return the object to the caller?
Return the raw MyObject* and let the caller handle the pointer herself
Return std::shared_ptr<MyObject>
Something else?
Depends.
In general, returning a raw pointer is bad, because the type itself does not communicate ownership. Should the user delete the pointer? If yes, then how? If not, then when does the library do it? How does the user know if the pointer is still valid? Those things have to be documented, and lazy programmers might skip reading the docs. But, if your API must be callable from C or other languages, then a raw pointer may be the only option.
Shared pointer might be useful in special cases, but they do have overhead. If you don't need shared ownership, then use of a shared pointer may be overkill.
Unique pointer is what I would use unless there is specific reason to use something else.
Although, that assumes that a pointer should be returned in the first place. Another great option could be to return an object.
To be a bit more concrete: I have a player class that just stores an id and just offers lots of methods. In this case you would prefer to return by value?
Most definitely.
You have 2 options (without any nasty drawbacks):
C Style
Have a matching void destroyOneObject(MyObject* object); that cleans up the resource. This is the proper choice when destroying the object isn't as simple as deleting it (e.g. needs to be unregistered at some manager class).
Smart pointer
When all that needs to happen is to delete the object, return a std::unique_ptr<MyObject>. This uses RAII to clean up the object, and fits modern C++ a lot better.
The bad solutions
Don't return a raw pointer and call delete on it. You cannot guarantee that the delete matches the new that was used to create the object. It also doesn't communicate very well if you are supposed to delete it in the first place.
Don't use shared pointers. They are complete overkill in almost any situation, and are usually the result of a lack of understanding of an application's structure.

Weak reference to a scoped_ptr?

Generally I follow the Google style guide, which I feel aligns nicely with the way I see things. I also, almost exclusively, use boost::scoped_ptr so that only a single manager has ownership of a particular object. I then pass around naked pointers, the idea being that my projects are structured such that the managers of said objects are always destroyed after the objects that use them are destroyed.
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Smart_Pointers
This is all great, however I was just bitten by a nasty little memory stomp bug where the owner just so happened to be deleted before objects that were using it were deleted.
Now, before everyone jumps up and down that I'm a fool for this pattern, why don't I just use shared_ptr ? etc., consider the point that I don't want to have undefined owner semantics. Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
What would have helped me would have been a weak pointer to a scoped pointer. In effect, a scoped pointer that has a list of weak references, that are nulled out when the scoped pointer destructs. This would allow single ownership semantics, but give the using objects a chance to catch the issue I ran into.
So at the expense of an extra 'weak_refs' pointer for the scoped_ptr and an extra pointer for the 'next_weak_ptr' in the weak_ptr, it would make a neat little single owner, multiple user structure.
It could maybe even just be a debug feature, so in 'release' the whole system just turns back into a normally sized scoped_ptr and a standard single pointer for the weak reference.
So..... my QUESTIONS after all of this are:
Is there such a pointer/patten already in stl/boost that I'm
missing, or should I just roll my own?
Is there a better way, that
still meets my single ownership goal?
Cheers,
Shane
2. Is there a better way, that still meets my single ownership goal?
Do use a shared_ptr, but as a class member so that it's part of the invariant of that class and the public interface only exposes a way to obtain a weak_ptr.
Of course, pathological code can then retain their own shared_ptr from that weak_ptr for as long as they want. I don't recommend trying to protect against Machiavelli here, only against Murphy (using Sutter's words). On the other hand, if your use case is asynchronous, then the fact that locking a weak_ptr returns a shared_ptr may be a feature!
Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
A shared_ptr doesn't mean "I don't know who owns this". It means "We own this." Just because one entity does not have exclusive ownership does not mean that anyone can own it.
The purpose of a shared_ptr is to ensure that the pointer cannot be destroyed until everyone who shares it is in agreement that it ought to be destroyed.
Is there such a pointer/patten already in stl/boost that I'm missing, or should I just roll my own?
You could use a shared_ptr exactly the same way you use a scoped_ptr. Just because it can be shared doesn't mean you have to share it. That would be the easiest way to work; just make single-ownership a convention rather than an API-established rule.
However, if you need a pointer that is single-owner and yet has weak pointers, there isn't one in Boost.
I'm not sure that weak pointers would help that much. Typically, if a
component X uses another component Y, X must be informed of Y's demise,
not just to nullify the pointer, but perhaps to remove it from a list,
or to change its mode of operation so that it no longer needs the
object. Many years ago, when I first started C++, there was a flurry of
activity trying to find a good generic solution. (The problem was
called relationship management back then.) As far as I know, no good
generic solution was every found; at least, every project I've worked on
has used a hand built solution based on the Observer pattern.
I did have a ManagedPtr on my site, when it was still up, which behaved
about like what you describe. In practice, except for the particular
case which led to it, I never found a real use for it, because
notification was always needed. It's not hard to implement, however;
the managed object derives from a ManagedObject class, and gets all of
the pointers (ManagedPtr, and not raw pointers) it hands out from it.
The pointer itself is registered with the ManagedObject class, and the
destructor of the ManagedObject class visits them all, and "disconnects"
them by setting the actual pointer to null. And of course, ManagedPtr
has an isValid function so that the client code can test before
dereferencing. This works well (regardless of how the object is
managed—most of my entity objects "own" themselves, and do a
delete this is response to some specific input), except that you tend
to leak invalid ManagedPtr (e.g. whenever the client keeps the pointer
in a container of some sort, because it may have more than one), and
clients still aren't notified if they need to take some action when your
object dies.
If you happen to be using QT, QPointer is basically a weak pointer to a QObject. It connects itself to the "I just got destroyed" event in the pointed-to value, and invalidates itself automatically as needed. That's a seriously hefty library to pull in for what amounts to bug tracking, though, if you aren't already jumping through QT's hoops.

The right way to expose a complex short-lived object

I am developing a Darwinian evolution simulation. For performance reasons, it's written in C++. The sim is represented by an instance of World. The animals are represented by instances of Animal, a fairly complicated object. World has two important methods:
animal_at(int i)
and
evolve(int n).
animal_at returns a non-null raw pointer to the instance of Animal that represents the i-th animal.
evolve advances the simulation possibly invalidating any pointer returned by animal_at.
I want to make the sim and the animals easily accessible from outside. I have python-specific bindings, but i'm considering learning CORBA or Ice to implement a more generic interface. The problem is how I should expose the animals. I have two ideas, none of which feels satisfactory:
1) Rewrite the code slightly to use shared_ptr instead of a raw ptr and use a middleware that understands shared_ptr semantics.
2) Generate a "deep lazy proxy" that has the same structure as Animal. Its members would be proxies to members of Animal, recursively. animal_at would be actually called in the last moment before referring to the actual data -- the pointer would be used and immediately tossed away. This would implement the "last moment" semantics.
I dislike 1) because I would have to introduce a "zombie" state of the object, which looks inelegant to me.
I dislike 2) because the sole purpose of the proxy is to implement the "last moment" semantics.
I am looking for an automatic non-intrusive way (using code-generation) to achieve this, because I don't want to obscure the meaning of the original code. Is there any "official" name for what I call "last moment" semantics?
There is an option of using boost::weak_ptr. You can pass these around, they have to use lock() to get the underlying shared_ptr so they can see if the object no longer exists.
The lifetime of the object is determined only by the places where a shared_ptr is held. There must therefore be at least one of these at all times during the lifetime of the object, and you need one before you can create a weak_ptr.
Firstly, you might consider xml-rpc instead of CORBA as middleware (more standard use nowadays).
Secondly, external users work with marshalled data. They send a marshalled reference over the middleware and expect marshalled information based on their request and reference, so I don't see how this works together with "last moment" sementics and pointers. You either keep all your temporary animals, so external users can query them or you probably have to send a lot exceptions for not found animals. (or you let them request the "latest" animal)
If your animal_at function returns a pointer that can be invalidated at any point in the future, you might try having it return a smart pointer (I'd suggest a shared_ptr over a weak_ptr here, but you can get away with the weak pointer if you are careful with the implementation). However, instead of pointing directly to an Animal object, have it point to a boost::optional<Animal> object. That way, when you invalidate the pointer, you can set the invalid flag on the pointer and any user of the newly invalidated object have the ability to check to see if they need to get a new pointer.
I don't get why you want to export a pointer instead of exporting an interface which seems exactly the "last moment" semantics that you want. Your animal_at could return an object id, not a pointer, and the interface would give all the required access to the object, like dosomething(id, params). The interface could also include lock/unlock stuff to enable atomic access.

Boost shared_ptr use_count function

My application problem is the following -
I have a large structure foo. Because these are large and for memory management reasons, we do not wish to delete them when processing on the data is complete.
We are storing them in std::vector<boost::shared_ptr<foo>>.
My question is related to knowing when all processing is complete. First decision is that we do not want any of the other application code to mark a complete flag in the structure because there are multiple execution paths in the program and we cannot predict which one is the last.
So in our implementation, once processing is complete, we delete all copies of boost::shared_ptr<foo>> except for the one in the vector. This will drop the reference counter in the shared_ptr to 1. Is it practical to use shared_ptr.use_count() to see if it is equal to 1 to know when all other parts of my app are done with the data.
One additional reason I'm asking the question is that the boost documentation on the shared pointer shared_ptr recommends not using "use_count" for production code.
Edit -
What I did not say is that when we need a new foo, we will scan the vector of foo pointers looking for a foo that is not currently in use and use that foo for the next round of processing. This is why I was thinking that having the reference counter of 1 would be a safe way to ensure that this particular foo object is no longer in use.
My immediate reaction (and I'll admit, it's no more than that) is that it sounds like you're trying to get the effect of a pool allocator of some sort. You might be better off overloading operator new and operator delete to get the effect you want a bit more directly. With something like that, you can probably just use a shared_ptr like normal, and the other work you want delayed, will be handled in operator delete for that class.
That leaves a more basic question: what are you really trying to accomplish with this? From a memory management viewpoint, one common wish is to allocate memory for a large number of objects at once, and after the entire block is empty, release the whole block at once. If you're trying to do something on that order, it's almost certainly easier to accomplish by overloading new and delete than by playing games with shared_ptr's use_count.
Edit: based on your comment, overloading new and delete for class sounds like the right thing to do. If anything, integration into your existing code will probably be easier; in fact, you can often do it completely transparently.
The general idea for the allocator is pretty much the same as you've outlined in your edited question: have a structure (bitmaps and linked lists are both common) to keep track of your free objects. When new needs to allocate an object, it can scan the bit vector or look at the head of the linked list of free objects, and return its address.
This is one case that linked lists can work out quite well -- you (usually) don't have to worry about memory usage, because you store your links right in the free object, and you (virtually) never have to walk the list, because when you need to allocate an object, you just grab the first item on the list.
This sort of thing is particularly common with small objects, so you might want to look at the Modern C++ Design chapter on its small object allocator (and an article or two since then by Andrei Alexandrescu about his newer ideas of how to do that sort of thing). There's also the Boost::pool allocator, which is generally at least somewhat similar.
If you want to know whether or not the use count is 1, use the unique() member function.
I would say your application should have some method that eliminates all references to the Foo from other parts of the app, and that method should be used instead of checking use_count(). Besides, if use_count() is greater than 1, what would your program do? You shouldn't be relying on shared_ptr's features to eliminate all references, your application architecture should be able to eliminate references. As a final check before removing it from the vector, you could assert(unique()) to verify it really is being released.
I think you can use shared_ptr's custom deleter functionality to call a particular function when the last copy has been released. That way, you're not using use_count at all.
You would need to hold something other than a copy of the shared_ptr in your vector so that the shared_ptr is only tracking the outstanding processing.
Boost has several examples of custom deleters in the shared_ptr docs.
I would suggest that instead of trying to use the shared_ptr's use_count to keep track, it might be better to implement your own usage counter. this way you will have full control over this rather than using the shared_ptr's one which, as you rightly suggest, is not recommended. You can also pre-set your own counter to allow for the number of threads you know will need to act on the data, rather than relying on them all being initialised at the beginning to get their copies of the structure.

C++ message passing doubts

I'm writing a piece of sotware that needs objects that exchange messages between each other.
The messages have to have the following contents:
Peer *srcPeer;
const char* msgText;
void* payload;
int payLoadLen;
now, Peer has to be a pointer as I have another class that manages Peers. For the rest I'm doubtful... for example I may copy the message text and payload (by allocating two new buffers) as the message is created, then putting the deletes in the destructor of the message. This has the great advantage of avoiding to forget the deletes in the consumer functions (not the mention to make those functions simpler) but it will result in many allocations & copies and can make everything slow. So I may just assing pointers and still have the destructor delete eveything... or ... well this is a common situation that in other programming languages is not even a dilemma as there is a GC. What are your suggestions and what are the most popular practices?
Edit:
I mean that I'd like to know what are the best practices to pass the contents... like having another object that keeps track of them, or maybe shared pointers... or what would you do...
You need clear ownership: when a message is passed between peers, is the ownership changed? If you switch ownership, just have the receiver do the clean-up.
If you just "lease" a message, make sure to have a "return to owner" procedure.
Is the message shared? Then you probably need some copying or have mutexes to protect access.
If all of the messages are similar, consider using a trash stack (http://library.gnome.org/devel/glib/stable/glib-Trash-Stacks.html) - this way, you can keep a stack of allocated-yet-uninitialized message structures that you can reuse without taking the constant malloc/free hit.
Consider using shared_ptr<>, available from Boost and also part of the std::tr1 library, rather than raw pointers. It's not the best thing to use in all cases, but it looks like you want to keep things simple and it's very good at that. It's local reference-count garbage collection.
There are no rules or even best practices for this in C++, except insofar as you make a design decision about pointer ownership and stick to it.
You can try to enforce a particular policy through use of smart pointers, or you can simply document the designed behavior and hope everyone reads the manual.
In C++, you can have reference counting too, like here:
http://library.gnome.org/devel/glibmm/stable/classGlib_1_1RefPtr.html
In this case, you can pass Glib::RefPtr objects. When the last of these RefPtr-s are destroyed associated to a pointer, the object is deleted itself.
If you do not want to use glibmm, you can implement it too, it's not too difficult. Also, probably STL and boost have something like these too.
Just watch out for circular references.
The simplest thing to do is to use objects to manage the buffers. For instance, you might use std::string for both the msgText and payload members and you could do away with the payLoadLen as it would be taken care of by the payload.size() method.
If and only if you measure the performance of this solution and the act of copying the msgText and payload were causing an unacceptable performance hit, you might choose to using a shared pointer to structure that was shared by copies of the message.
In (almost) no situation would I rely on remembering to call delete in a destructor or manually writing safe copy-assignment operator and copy constructor.
The simplest policy to deal with is copying the entire message (deep copy) and send that copy to the recipient. It will require more allocation, but it frees you from many problems related to concurrent access to data. If performance gets critical, there is still room for some optimizations (like avoiding copies if the object only has one owner who is willing to give up ownership of it when sending it, etc).
Each owner is then responsible for cleaning up the message object.