The right way to expose a complex short-lived object - c++

I am developing a Darwinian evolution simulation. For performance reasons, it's written in C++. The sim is represented by an instance of World. The animals are represented by instances of Animal, a fairly complicated object. World has two important methods:
animal_at(int i)
and
evolve(int n).
animal_at returns a non-null raw pointer to the instance of Animal that represents the i-th animal.
evolve advances the simulation possibly invalidating any pointer returned by animal_at.
I want to make the sim and the animals easily accessible from outside. I have python-specific bindings, but i'm considering learning CORBA or Ice to implement a more generic interface. The problem is how I should expose the animals. I have two ideas, none of which feels satisfactory:
1) Rewrite the code slightly to use shared_ptr instead of a raw ptr and use a middleware that understands shared_ptr semantics.
2) Generate a "deep lazy proxy" that has the same structure as Animal. Its members would be proxies to members of Animal, recursively. animal_at would be actually called in the last moment before referring to the actual data -- the pointer would be used and immediately tossed away. This would implement the "last moment" semantics.
I dislike 1) because I would have to introduce a "zombie" state of the object, which looks inelegant to me.
I dislike 2) because the sole purpose of the proxy is to implement the "last moment" semantics.
I am looking for an automatic non-intrusive way (using code-generation) to achieve this, because I don't want to obscure the meaning of the original code. Is there any "official" name for what I call "last moment" semantics?

There is an option of using boost::weak_ptr. You can pass these around, they have to use lock() to get the underlying shared_ptr so they can see if the object no longer exists.
The lifetime of the object is determined only by the places where a shared_ptr is held. There must therefore be at least one of these at all times during the lifetime of the object, and you need one before you can create a weak_ptr.

Firstly, you might consider xml-rpc instead of CORBA as middleware (more standard use nowadays).
Secondly, external users work with marshalled data. They send a marshalled reference over the middleware and expect marshalled information based on their request and reference, so I don't see how this works together with "last moment" sementics and pointers. You either keep all your temporary animals, so external users can query them or you probably have to send a lot exceptions for not found animals. (or you let them request the "latest" animal)

If your animal_at function returns a pointer that can be invalidated at any point in the future, you might try having it return a smart pointer (I'd suggest a shared_ptr over a weak_ptr here, but you can get away with the weak pointer if you are careful with the implementation). However, instead of pointing directly to an Animal object, have it point to a boost::optional<Animal> object. That way, when you invalidate the pointer, you can set the invalid flag on the pointer and any user of the newly invalidated object have the ability to check to see if they need to get a new pointer.

I don't get why you want to export a pointer instead of exporting an interface which seems exactly the "last moment" semantics that you want. Your animal_at could return an object id, not a pointer, and the interface would give all the required access to the object, like dosomething(id, params). The interface could also include lock/unlock stuff to enable atomic access.

Related

Is there a way to optimize shared_ptr for the case of permanent objects?

I've got some code that is using shared_ptr quite widely as the standard way to refer to a particular type of object (let's call it T) in my app. I've tried to be careful to use make_shared and std::move and const T& where I can for efficiency. Nevertheless, my code spends a great deal of time passing shared_ptrs around (the object I'm wrapping in shared_ptr is the central object of the whole caboodle). The kicker is that pretty often the shared_ptrs are pointing to an object that is used as a marker for "no value"; this object is a global instance of a particular T subclass, and it lives forever since its refcount never goes to zero.
Using a "no value" object is nice because it responds in nice ways to various methods that get sent to these objects, behaving in the way that I want "no value" to behave. However, performance metrics indicate that a huge amount of the time in my code is spent incrementing and decrementing the refcount of that global singleton object, making new shared_ptrs that refer to it and then destroying them. To wit: for a simple test case, the execution time went from 9.33 seconds to 7.35 seconds if I stuck nullptr inside the shared_ptrs to indicate "no value", instead of making them point to the global singleton T "no value" object. That's a hugely important difference; run on much larger problems, this code will soon be used to do multi-day runs on computing clusters. So I really need that speedup. But I'd really like to have my "no value" object, too, so that I don't have to put checks for nullptr all over my code, special-casing that possibility.
So. Is there a way to have my cake and eat it too? In particular, I'm imagining that I might somehow subclass shared_ptr to make a "shared_immortal_ptr" class that I could use with the "no value" object. The subclass would act just like a normal shared_ptr, but it would simply never increment or decrement its refcount, and would skip all related bookkeeping. Is such a thing possible?
I'm also considering making an inline function that would do a get() on my shared_ptrs and would substitute a pointer to the singleton immortal object if get() returned nullptr; if I used that everywhere in my code, and never used * or -> directly on my shared_ptrs, I would be insulated, I suppose.
Or is there another good solution for this situation that hasn't occurred to me?
Galik asked the central question that comes to mind regarding your containment strategy. I'll assume you've considered that and have reason to rely on shared_ptr as a communical containment strategy for which no alternative exists.
I have to suggestions which may seem controversial. What you've defined is that you need a type of shared_ptr that never has a nullptr, but std::shared_ptr doesn't do that, and I checked various versions of the STL to confirm that the customer deleter provided is not an entry point to a solution.
So, consider either making your own smart pointer, or adopting one that you change to suit your needs. The basic idea is to establish a kind of shared_ptr which can be instructed to point it's shadow pointer to a global object it doesn't own.
You have the source to std::shared_ptr. The code is uncomfortable to read. It may be difficult to work with. It is one avenue, but of course you'd copy the source, change the namespace and implement the behavior you desire.
However, one of the first things all of us did in the middle 90's when templates were first introduced to the compilers of the epoch was to begin fashioning containers and smart pointers. Smart pointers are remarkably easy to write. They're harder to design (or were), but then you have a design to model (which you've already used).
You can implement the basic interface of shared_ptr to create a drop in replacement. If you used typedefs well, there should be a limited few places you'd have to change, but at least a search and replace would work reasonably well.
These are the two means I'm suggesting, both ending up with the same feature. Either adopt shared_ptr from the library, or make one from scratch. You'd be surprised how quickly you can fashion a replacement.
If you adopt std::shared_ptr, the main theme would be to understand how shared_ptr determines it should decrement. In most implementations shared_ptr must reference a node, which in my version it calls a control block (_Ref). The node owns the object to be deleted when the reference count reaches zero, but naturally shared_ptr skips that if _Ref is null. However, operators like -> and *, or the get function, don't bother checking _Ref, they just return the shadow, or _Ptr in my version.
Now, _Ptr will be set to nullptr (or 0 in my source) when a reset is called. Reset is called when assigning to another object or pointer, so this works even if using assignment to nullptr. The point is, that for this new type of shared_ptr you need, you could simply change the behavior such that whenever that happens (a reset to nullptr), you set _Ptr, the shadow pointer in shared_ptr, to the "no value global" object's address.
All uses of *,get or -> will return the _Ptr of that no value object, and will correctly behave when used in another assignment, or reset is called again, because those functions don't rely upon the shadow pointer to act upon the node, and since in this special condition that node (or control block) will be nullptr, the rest of shared_ptr would behave as though it was pointing to nullptr correctly - that is, not deleting the global object.
Obviously this sounds crazy to alter std::pointer to such application specific behavior, but frankly that's what performance work tends to make us do; otherwise strange things, like abandoning C++ occasionally in order to obtain the more raw speed of C, or assembler.
Modifying std::shared_ptr source, taken as a copy for this special purpose, is not what I would choose (and, factually, I've faced other versions of your situation, so I have made this choice several times over decades).
To that end, I suggest you build a policy based smart pointer. I find it odd I suggested this earlier on another post today (or yesterday, it's 1:40am).
I refer to Alexandrescu's book from 2001 (I think it was Modern C++...and some words I don't recall). In that he presented loki, which included a policy based smart pointer design, which is still published and freely available on his website.
The idea should have been incorporated into shared_ptr, in my opinion.
Policy based design is implemented as the paradigm of a template class deriving from one or more of it's parameters, like this:
template< typename T, typename B >
class TopClass : public B {};
In this way, you can provide B, from which the object is built. Now, B may have the same construction, it may also be a policy level which derives from it's second parameter (or multiple derivations, however the design works).
Layers can be combined to implement unique behaviors in various categories.
For example:
std::shared_ptr and std::weak_ptrare separate classes which interact as a family with others (the nodes or control blocks) to provide smart pointer service. However, in a design I used several times, these two were built by the same top level template class. The difference between a shared_ptr and a weak_ptr in that design was the attachment policy offered in the second parameter to the template. If the type is instantiated with the weak attachment policy as the second parameter, it's a weak pointer. If it's given a strong attachment policy, it's a smart pointer.
Once you create a policy designed template, you can introduce layers not in the original design (expanding it), or to "intercept" behavior and specialize it like the one you currently require - without corrupting the original code or design.
The smart pointer library I developed had high performance requirements, along with a number of other options including custom memory allocation and automatic locking services to make writing to smart pointers thread safe (which std::shared_ptr doesn't provide). The interface and much of the code is shared, yet several different kinds of smart pointers could be fashioned simply by selecting different policies. To change behavior, a new policy could be inserted without altering the existing code. At present, I use both std::shared_ptr (which I used when it was in boost years ago) and the MetaPtr library I developed years ago, the latter when I need high performance or flexible options, like yours.
If std::shared_ptr had been a policy based design, as loki demonstrates, you'd be able to do this with shared_ptr WITHOUT having to copy the source and move it to a new namespace.
In any event, simply creating a shared pointer which points the shadow pointer to the global object on reset to nullptr, leaving the node pointing to null, provides the behavior you described.

C++ Is making objects depend on others a good design?

I have a basic design that consists of three classes : A Data class, A Holder class wich holds and manages multiple Data objects, and a Wrapper returned by the Holder wich contains a reference to a Data object.
The problem is that Wrapper must not outlive Holder, or it will contain a dangling reference, as Holder is responsible for deleting the Data objects. But as Wrapper is intended to have a very short lifetime (get it in a function, make some computation on its data, and let it go out of scope), this should not be a problem, but i'm not sure this is a good design.
Here are some solutions i thought about:
-Rely on the user reading the documentation, technically the same thing happens with STL iterators
-Using shared_ptr to make sure the data lasts long enought, but it feels like overkill
-Make Wrapper verify its Holder still exists each time you use it
-Any idea?
(I hope everyone can understand this, as english is not my native language)
Edit : If you want to have a less theoric approach, this all comes from a little algorithm i'm trying to write to solve Sudokus, the Holder is the grid, the Data is the content of each box (either a result or a temporary supposition), and the Wrapper is a Box class wich contains a reference to the Data, plus additional information like row and column.
I did not originally said it because i want to know what to do in a more general situation.
Only ever returning a Wrapper by value will help ensure the caller doesn't hold onto it beyond the calling scope. In conjunction with a comment or documentation that "Wrappers are only valid for the lifetime of the Holder that created them" should be enough.
Either that or you decide that ownership of a Data is transferred to the Wrapper when the Wrapper is created, and the Data is destroyed along with the Wrapper - but what if you want a to destroy a Wrapper without deleting the Data? You'd need a method to optionally relinquish ownership of the Data back to the Holder.
Whichever you choose, you need to decide what owns (ie: is responsible for the lifetime of) Data and when - once you've done that you can, if you want, use smart pointers to help with that management - but they won't make the design decision for you, and you can't simply say "oh I'll use smart pointers instead of thinking about it".
Remember, if you can't manage heap memory without smart pointers - you've got no business managing heap memory with them either!
To elaborate on what you have already listed as options,
As you suggested, shared_ptr<Data> is a good option. Unless performance is an issue, you should use it.
Never hold a pointer to Data in Wrapper. Store a handle that can be used to get a pointer to the appropriate Data object. Before Data is accessed through Wrapper, get a pointer the Data object. If the pointer is not valid, throw an exception. If the pointer is valid, proceed along the happy path.

Weak reference to a scoped_ptr?

Generally I follow the Google style guide, which I feel aligns nicely with the way I see things. I also, almost exclusively, use boost::scoped_ptr so that only a single manager has ownership of a particular object. I then pass around naked pointers, the idea being that my projects are structured such that the managers of said objects are always destroyed after the objects that use them are destroyed.
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Smart_Pointers
This is all great, however I was just bitten by a nasty little memory stomp bug where the owner just so happened to be deleted before objects that were using it were deleted.
Now, before everyone jumps up and down that I'm a fool for this pattern, why don't I just use shared_ptr ? etc., consider the point that I don't want to have undefined owner semantics. Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
What would have helped me would have been a weak pointer to a scoped pointer. In effect, a scoped pointer that has a list of weak references, that are nulled out when the scoped pointer destructs. This would allow single ownership semantics, but give the using objects a chance to catch the issue I ran into.
So at the expense of an extra 'weak_refs' pointer for the scoped_ptr and an extra pointer for the 'next_weak_ptr' in the weak_ptr, it would make a neat little single owner, multiple user structure.
It could maybe even just be a debug feature, so in 'release' the whole system just turns back into a normally sized scoped_ptr and a standard single pointer for the weak reference.
So..... my QUESTIONS after all of this are:
Is there such a pointer/patten already in stl/boost that I'm
missing, or should I just roll my own?
Is there a better way, that
still meets my single ownership goal?
Cheers,
Shane
2. Is there a better way, that still meets my single ownership goal?
Do use a shared_ptr, but as a class member so that it's part of the invariant of that class and the public interface only exposes a way to obtain a weak_ptr.
Of course, pathological code can then retain their own shared_ptr from that weak_ptr for as long as they want. I don't recommend trying to protect against Machiavelli here, only against Murphy (using Sutter's words). On the other hand, if your use case is asynchronous, then the fact that locking a weak_ptr returns a shared_ptr may be a feature!
Although shared_ptr would have caught this particular case, it sends the wrong message to users of the system. It says, "I don't know who owns this, it could be you!"
A shared_ptr doesn't mean "I don't know who owns this". It means "We own this." Just because one entity does not have exclusive ownership does not mean that anyone can own it.
The purpose of a shared_ptr is to ensure that the pointer cannot be destroyed until everyone who shares it is in agreement that it ought to be destroyed.
Is there such a pointer/patten already in stl/boost that I'm missing, or should I just roll my own?
You could use a shared_ptr exactly the same way you use a scoped_ptr. Just because it can be shared doesn't mean you have to share it. That would be the easiest way to work; just make single-ownership a convention rather than an API-established rule.
However, if you need a pointer that is single-owner and yet has weak pointers, there isn't one in Boost.
I'm not sure that weak pointers would help that much. Typically, if a
component X uses another component Y, X must be informed of Y's demise,
not just to nullify the pointer, but perhaps to remove it from a list,
or to change its mode of operation so that it no longer needs the
object. Many years ago, when I first started C++, there was a flurry of
activity trying to find a good generic solution. (The problem was
called relationship management back then.) As far as I know, no good
generic solution was every found; at least, every project I've worked on
has used a hand built solution based on the Observer pattern.
I did have a ManagedPtr on my site, when it was still up, which behaved
about like what you describe. In practice, except for the particular
case which led to it, I never found a real use for it, because
notification was always needed. It's not hard to implement, however;
the managed object derives from a ManagedObject class, and gets all of
the pointers (ManagedPtr, and not raw pointers) it hands out from it.
The pointer itself is registered with the ManagedObject class, and the
destructor of the ManagedObject class visits them all, and "disconnects"
them by setting the actual pointer to null. And of course, ManagedPtr
has an isValid function so that the client code can test before
dereferencing. This works well (regardless of how the object is
managed—most of my entity objects "own" themselves, and do a
delete this is response to some specific input), except that you tend
to leak invalid ManagedPtr (e.g. whenever the client keeps the pointer
in a container of some sort, because it may have more than one), and
clients still aren't notified if they need to take some action when your
object dies.
If you happen to be using QT, QPointer is basically a weak pointer to a QObject. It connects itself to the "I just got destroyed" event in the pointed-to value, and invalidates itself automatically as needed. That's a seriously hefty library to pull in for what amounts to bug tracking, though, if you aren't already jumping through QT's hoops.

How can I maintain a weak reference on a COM object in C++?

In my application, I'm hooking various functions for creating COM objects (such as CoCreateInstanceEx) to get notified whenever some object is created. I'm keeping track of all created objects in a std::list and I'm iterating over that list to do various things (like checking which OLE objects have been activated).
The issue with this is that right now, whenever adding an IUnknown pointer to my list, I call IUnknown::AddRef on it to make sure that it doesn't get destroyed while I'm tracking it. That's not what I really want though; the lifetime of the object should be as long (or short) as it is without my tracing code, so I'd rather like to maintain a weak reference on the objects. Whenever the last reference to some tracked COM object is removed (and thus the object gets destroyed), I'd like to get notified so that I can update my bookkeeping (e.g. by setting the pointer in my list to NULL).*
What's the best way to do this? Right now, I'm patching the (first) VTable of all created objects so that the calls to IUnknown::Release via the first vtable get notified. However, this won't work for COM interfaces which inherit from multiple interfaces (and thus have multiple vtables), but I'm not sure whether this is really a problem: given the Rules for Implementing QueryInterface, there should always be just one IUnknown returned by IUnknown::QueryInterface, right? So I could do that and then patch that vtable.
Furthermore, this approach is also a bit hairy since it involves creating thunks which generate some code. I only implemented this for 32bit so far. Not a big issue, but still.
I'm really wondering whether there isn't a more elegant way to have a weak reference to a COM object. Does anybody know?
*: The next thing I'll have to solve is making this work correctly in case I have active iterators (I'm using custom iterator objects) traversing the list of COM objects. I may need to keep track of the active iterators and once the last one finished, remove all null pointers from the list. Or something like that.
This isn't an answer as much as a set of issues why this is a really tricky thing to do - I'm putting it in as an answer since there's too much information here than fits in a comment :)
My understanding is that the concept of weak reference just doesn't exist in COM, period. You've got reference counting via IUnknown, and that's the sum total of how COM deals with object lifetime management. Anything beyond that is, strictly speaking, not COM.
(.Net does support the concept, but it's got an actual GC-based memory manager to provide appropriate support, and can treat WeakRef objects differently than regular references in memory. But that's not the case with the very simple world that COM assumes, which is a world of plain memory and pointers, and little more.)
COM specifies that reference counting is per-interface; any COM object is free to do ref counting per object as a convenience, but the upshot is that if you're wrapping an object, you have to assume the most restrictive case. So you cannot assume that any given IUnknown will be used for all addrefs/releases on that object: you'd really need to track each interface separately.
The canonical IUnknown - the one you get back by QI'ing for IUnknown - could be any interface at all - even a dedicated IUnknown that is used only for the purpose of acting as an identity! - so long as the same binary pointer value is returned each time. All other interfaces could be implemented any way; typically the same value is returned each time, but a COM object could legitimately return a new IFoo each time someone QI's for IFoo. Or even keep around a cache of IFoos and return one at random.
...and then you've got aggregation to deal with - basically, COM doesn't have a strong concept of object at all, it's all about interfaces. Objects, in COM, are just a collection of interfaces that happen to share the same canonical IUnknown: they might be implemented as a single C/C++ object behind the scenes, or as a family of related C/C++ objects presenting a facade of a 'single COM object'.
Having said all of that, given that:
I'm tracing the state of various components (including all COM objects) of this software for the sake of debugging
Here's an alternate approach that might produce some useful data to debug with.
The idea here is that many implementations of COM objects will return the ref count as the return value to Release() - so if they return 0, then that's a clue that the interface may have been released.
This is not guaranteed, however: as MSDN states:
The method returns the new reference count. This value is intended to be used only for test purposes.
(emphasis added.)
But that's apparently what you're doing here.
So one thing you could do, assuming you own the calling code, is to replace calls with Release() with an inline called MyRelease() or similar that will call release, and if it notices that the return value is 0, then notes that the interface pointer is now possibly freed - removes it from a table, logs it to a file, etc.
One major caveat: keep in mind that COM does not have a concept of weak ref, even if you try to hack something together. Using a COM interface pointer that has not been AddRef()'d is illegal as far as COM is concerned; so if you save away interface pointer values in any sort of list, the only thing you should so with those is treat them as opaque numbers for debugging purposes (eg. log them to a file so you can correlate creates with destroys, or keep track of how many you have outstanding), but do not attempt to use them as actual interface pointers.
Again, keep in mind that nothing requires a COM object to follow the convention of returning the refcount; so be aware that you could see something that looks like a bug but is actually just an implementation of Release just happens to always returns 0 (or rand(), if you're especially unlucky!)
First, you're right that QueryInterface for IUnknown should always return the same pointer; IUnknown is treated as the object's identity IIRC, so needs to be stable.
As for weak pointers, off the top of my head, maybe you could give CoMarshalInterThreadInterfaceInStream a whirl? It is meant to allow you to serialize a reference to a COM object into a stream, then create a new reference to the object on some other thread using the stream. However, if you serialise into a stream and retain the stream as a sort of weak pointer, then unmarshal later on to recover the pointer, you could check whether unmarshalling fails; If so, the object is gone.
With WinRT IWeakReference was added to enable weak refs to COM objects. Objects created with WRL's RuntimeClass support IWeakReference by default (can be disabled with an option).
you can use IWeakReference in your designs but it means you will need to use at least some WinRT concepts, IInspectable based interface.

Is it a good (correct) way to encapsulate a collection?

class MyContainedClass {
};
class MyClass {
public:
MyContainedClass * getElement() {
// ...
std::list<MyContainedClass>::iterator it = ... // retrieve somehow
return &(*it);
}
// other methods
private:
std::list<MyContainedClass> m_contained;
};
Though msdn says std::list should not perform relocations of elements on deletion or insertion, is it a good and common way to return pointer to a list element?
PS: I know that I can use collection of pointers (and will have to delete elements in destructor), collection of shared pointers (which I don't like), etc.
I don't see the use of encapsulating this, but that may be just me. In any case, returning a reference instead of a pointer makes a lot more sense to me.
In a general sort of way, if your "contained class" is truly contained in your "MyClass", then MyClass should not be allowing outsiders to touch its private contents.
So, MyClass should be providing methods to manipulate the contained class objects, not returning pointers to them. So, for example, a method such as "increment the value of the umpteenth contained object", rather than "here is a pointer to the umpteenth contained object, do with it as you wish".
It depends...
It depends on how much encapsulated you want your class to be, and what you want to hide, or show.
The code I see seems ok for me. You're right about the fact the std::list's data and iterators won't be invalidated in case of another data/iterator's modification/deletion.
Now, returning the pointer would hide the fact you're using a std::list as an internal container, and would not let the user to navigate its list. Returning the iterator would let more freedom to navigate this list for the users of the class, but they would "know" they are accessing a STL container.
It's your choice, there, I guess.
Note that if it == std::list<>.end(), then you'll have a problem with this code, but I guess you already know that, and that this is not the subject of this discussion.
Still, there are alternative I summarize below:
Using const will help...
The fact you return a non-const pointer lets the user of you object silently modify any MyContainedClass he/she can get his/her hands on, without telling your object.
Instead or returning a pointer, you could return a const pointer (and suffix your method with const) to stop the user from modifying the data inside the list without using an accessor approved by you (a kind of setElement ?).
const MyContainedClass * getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return &(*it);
}
This will increase somewhat the encapsulation.
What about a reference?
If your method cannot fail (i.e. it always return a valid pointer), then you should consider returning the reference instead of the pointer. Something like:
const MyContainedClass & getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return *it;
}
This has nothing to do with encapsulation, though..
:-p
Using an iterator?
Why not return the iterator instead of the pointer? If for you, navigating the list up and down is ok, then the iterator would be better than the pointer, and is used mostly the same way.
Make the iterator a const_iterator if you want to avoid the user modifying the data.
std::list<MyContainedClass>::const_iterator getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return it;
}
The good side would be that the user would be able to navigate the list. The bad side is that the user would know it is a std::list, so...
Scott Meyers in his book Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library says it's just not worth trying to encapsulate your containers since none of them are completely replaceable for another.
Think good and hard about what you really want MyClass for. I've noticed that some programmers write wrappers for their collections just as a matter of habit, regardless of whether they have any specific needs above and beyond those met by the standard STL collections. If that's your situation, then typedef std::list<MyContainedClass> MyClass and be done with it.
If you do have operations you intend to implement in MyClass, then the success of your encapsulation will depend more on the interface you provide for them than on how you provide access to the underlying list.
No offense meant, but... With the limited information you've provided, it smells like you're punting: exposing internal data because you can't figure out how to implement the operations your client code requires in MyClass... or possibly, because you don't even know yet what operations will be required by your client code. This is a classic problem with trying to write low-level code before the high-level code that requires it; you know what data you'll be working with, but haven't really nailed down exactly what you'll be doing with it yet, so you write a class structure that exposes the raw data all the way to the top. You'd do well to re-think your strategy here.
#cos:
Of course I'm encapsulating
MyContainedClass not just for the sake
of encapsulation. Let's take more
specific example:
Your example does little to allay my fear that you are writing your containers before you know what they'll be used for. Your example container wrapper - Document - has a total of three methods: NewParagraph(), DeleteParagraph(), and GetParagraph(), all of which operate on the contained collection (std::list), and all of which closely mirror operations that std::list provides "out of the box". Document encapsulates std::list in the sense that clients need not be aware of its use in the implementation... but realistically, it is little more than a facade - since you are providing clients raw pointers to the objects stored in the list, the client is still tied implicitly to the implementation.
If we put objects (not pointers) to
container they will be destroyed
automatically (which is good).
Good or bad depends on the needs of your system. What this implementation means is simple: the document owns the Paragraphs, and when a Paragraph is removed from the document any pointers to it immediately become invalid. Which means you must be very careful when implementing something like:
other objects than use collections of
paragraphs, but don't own them.
Now you have a problem. Your object, ParagraphSelectionDialog, has a list of pointers to Paragraph objects owned by the Document. If you are not careful to coordinate these two objects, the Document - or another client by way of the Document - could invalidate some or all of the pointers held by an instance of ParagraphSelectionDialog! There's no easy way to catch this - a pointer to a valid Paragraph looks the same as a pointer to a deallocated Paragraph, and may even end up pointing to a valid - but different - Paragraph instance! Since clients are allowed, and even expected, to retain and dereference these pointers, the Document loses control over them as soon as they are returned from a public method, even while it retains ownership of the Paragraph objects.
This... is bad. You've end up with an incomplete, superficial, encapsulation, a leaky abstraction, and in some ways it is worse than having no abstraction at all. Because you hide the implementation, your clients have no idea of the lifetime of the objects pointed to by your interface. You would probably get lucky most of the time, since most std::list operations do not invalidate references to items they don't modify. And all would be well... until the wrong Paragraph gets deleted, and you find yourself stuck with the task of tracing through the callstack looking for the client that kept that pointer around a little bit too long.
The fix is simple enough: return values or objects that can be stored for as long as they need to be, and verified prior to use. That could be something as simple as an ordinal or ID value that must be passed to the Document in exchange for a usable reference, or as complex as a reference-counted smart pointer or weak pointer... it really depends on the specific needs of your clients. Spec out the client code first, then write your Document to serve.
The Easy way
#cos, For the example you have shown, i would say the easiest way to create this system in C++ would be to not trouble with the reference counting. All you have to do would be to make sure that the program flow first destroys the objects (views) which holds the direct references to the objects (paragraphs) in the collection, before the root Document get destroyed.
The Tough Way
However if you still want to control the lifetimes by reference tracking, you might have to hold references deeper into the hierarchy such that Paragraph objects holds reverse references to the root Document object such that, only when the last paragraph object gets destroyed will the Document object get destructed.
Additionally the paragraph references when used inside the Views class and when passed to other classes, would also have to passed around as reference counted interfaces.
Toughness
This is too much overhead, compared to the simple scheme i listed in the beginning. It avoids all kinds of object counting overheads and more importantly someone who inherits your program does not get trapped in the reference dependency threads traps that criss cross your system.
Alternative Platforms
This kind-of tooling might be easier to perform in a platform that supports and promotes this style of programming like .NET or Java.
You still have to worry about memory
Even with a platform such as this you would still have to ensure your objects get de-referenced in a proper manner. Else outstanding references could eat up your memory in the blink of an eye. So you see, reference counting is not the panacea to good programming practices, though it helps avoid lots of error checks and cleanups, which when applied the whole system considerably eases the programmers task.
Recommendation
That said, coming back to your original question which gave raise to all the reference counting doubts - Is it ok to expose your objects directly from the collection?
Programs cannot exist where all classes / all parts of the program are truly interdependent of each other. No, that would be impossible, as a program is the running manifestation of how your classes / modules interact. The ideal design can only minimize the dependencies and not remove them totally.
So my opinion would be, yes it is not a bad practice to expose the references to the objects from your collection, to other objects that need to work with them, provided you do this in a sane manner
Ensure that only a few classes / parts of your program can get such references to ensure minimum interdependency.
Ensure that the references / pointers passed are interfaces and not concrete objects so that the interdependency is avoided between concrete classes.
Ensure that the references are not further passed along deeper into the program.
Ensure that the program logic takes care of destroying the dependent objects, before cleaning up the actual objects that satisfy those references.
I think the bigger problem is that you're hiding the type of collection so even if you use a collection that doesn't move elements you may change your mind in the future. Externally that's not visible so I'd say it's not a good idea to do this.
std::list will not invalidate any iterators, pointers or references when you add or remove things from the list (apart from any that point the item being removed, obviously), so using a list in this way isn't going to break.
As others have pointed out, you may want not want to be handing out direct access to the private bits of this class. So changing the function to:
const MyContainedClass * getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return &(*it);
}
may be better, or if you always return a valid MyContainedClass object then you could use
const MyContainedClass& getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return *it;
}
to avoid the calling code having to cope with NULL pointers.
STL will be more familiar to a future programmer than your custom encapsulation, so you should avoid doing this if you can. There will be edge cases that you havent thought about which will come up later in the app's lifetime, wheras STL is failry well reviewed and documented.
Additionally most containers support somewhat similar operations like begin end push etc. So it should be fairly trivial to change the container type in your code should you change the container. eg vector to deque or map to hash_map etc.
Assuming you still want to do this for a more deeper reason, i would say the correct way to do this is to implement all the methods and iterator classes that list implements. Forward the calls to the member list calls when you need no changes. Modify and forward or do some custom actions where you need to do something special (the reason why you decide to this in the first place)
It would be easier if STl classes where designed to be inherited from but for efficiency sake it was decided not to do so. Google for "inherit from STL classes" for more thoughts on this.