I made a cute generic (i.e. template) List class to handle lists in C++. The reason for that is that I found the std::list class terribly ugly for everyday use and since I constantly use lists, I needed a new one. The major improvement is that with my class, I can use [] to get items from it. Also, still to be implemented is an IComparer system to sort things.
I'm using this List class in OBJLoader, my class that loads Wavefront .obj files and converts them to meshes. OBJLoader contains lists of pointers to the following "types": 3D positions, 3D normals, uv texture coordinates, vertices, faces and meshes. The vertices list has objects that must be linked to some objects in all of the 3D positions, 3D normals and uv texture coordinates lists. Faces link to vertices and meshes link to faces. So they are all inter-connected.
For the sake of simplicity, let's consider that, in some context, there are just two lists of pointers: List<Person*> and List<Place*>. Person class contains, among others, the field List<Place*> placesVisited and the Place class contains the field List<Person*> peopleThatVisited. So we have the structure:
class Person
{
...
public:
Place* placeVisited;
...
};
class Place
{
...
public:
List<People*> peopleThatVisited;
};
Now we have the following code:
Person* psn1 = new Person();
Person* psn2 = new Person();
Place* plc1 = new Place();
Place* plc2 = new Place();
Place* plc2 = new Place();
// make some links between them here:
psn1->placesVisited.Add(plc1, plc2);
psn2->placesVisited.Add(plc2, plc3);
// add the links to the places as well
plc1->peopleThatVisited.Add(psn1);
plc2->peopleThatVisited.Add(psn1, psn2);
plc3->peopleThatVisited.Add(plc3);
// to make things worse:
List<Person*> allThePeopleAvailable;
allThePeopleAvailable.Add(psn1);
allThePeopleAvailable.Add(psn2);
List<Place*> allThePlacesAvailable;
allThePlacesAvailable.Add(plc1);
allThePlacesAvailable.Add(plc2);
allThePlacesAvailable.Add(plc3);
All done. What happens when we reach }? All the dtors are called and the program crashes because it tries to delete things two or more times.
The dtor of my list looks like this:
~List(void)
{
cursor = begin;
cursorPos = 0;
while(cursorPos < capacity - 1)
{
cursor = cursor->next;
cursorPos++;
delete cursor->prev;
}
delete cursor;
}
where Elem is:
struct Elem
{
public:
Elem* prev;
T value;
Elem* next;
};
and T is the generic List type.
Which brings us back to the question: What ways are there to safely delete my List classes? The elements inside may or may not be pointers and, if they are pointers, I would like to be able, when I delete my List, to specify whether I want to delete the elements inside or just the Elem wrappers around them.
Smart pointers could be an answer, but that would mean that I can't have a List<bubuType*>, but just List<smart_pointer_to_bubuType>. This could be ok, but again: declaring a List<bubuType*> would cause no error or warning and in some cases the smart pointers would cause some problems in the implementation: for example, I might want to declare a List<PSTR> for some WinAPI returns. I think getting those PSTR inside smart pointers would be an ugly job. Thus, the solution I'm looking for I think should be somehow related to the deallocation system of the List template.
Any ideas?
Without even looking at your code, I say: scrap it!
C++ has a list class template that's about as efficient as it gets, well-known to all C++ programmers, and comes bug-free with your compiler.
Learn to use the STL.1 Coming from other OO languages, the STL might seem strange, but there's an underlying reason for its strangeness, an alien beauty combining abstraction and performance - something considered impossible before Stepanov came and thought up the STL.
Rest assured that you are not the only one struggling with understanding the STL. When it came upon us, we all struggled to grasp its concepts, to learn its particularities, to understand how it ticks. The STL is a strange beast, but then it manages to combine two goals everybody thought could never be combined, so it's allowed to seem unfamiliar at first.
I bet writing your own linked list class used to be the second most popular indoor sport of C++ programmers - right after writing your own string class. Those of us who had been programming C++ 15 years ago nowadays enjoy ripping out those bug-ridden, inefficient, strange, and unknown string, list, and dictionary classes rotting in old code, and replacing it with something that is very efficient, well-known, and bug-free. Starting your own list class (other than for educational purposes) has to be one of the worst heresies.
If you program in C++, get used to one of the mightiest tools in its box as soon as possible.
1Note that the term "STL" names that part of the C++ standard library that stems from Stepanov's library (plus things like std::string which got an STL interface attached as an afterthought), not the whole standard library.
The best answer is that you must think on the lifetime of each one of the objects, and the responsibility of managing that lifetime.
In particular, in a relation from people and the sites they have visited, most probably neither of them should be naturally made responsible for the lifetime of the others: people can live independently from the sites that they have visited, and places exists regardless of whether they have been visited. This seems to hint that the lifetime of both people and sites is unrelated to the others, and that the pointers held are not related to resource management, but are rather references (not in the C++ sense).
Once you know who is responsible for managing the resource, that should be the code that should delete (or better hold the resource in a container or suitable smart pointer if it needs to be dynamically allocated), and you must ensure that the deletion does not happen before the other objects that refer to the same elements finish with them.
If at the end of the day, in your design ownership is not clear, you can fall back to using shared_ptr (either boost or std) being careful not to create circular dependencies that would produce memory leaks. Again to use the shared_ptrs correctly you have to go back and think, think on the lifetime of the objects...
Always, always use smart pointers if you are responsible for deallocating that memory. Do not ever use raw pointers unless you know that you're not responsible for deleting that memory. For WinAPI returns, wrap them into smart pointers. Of course, a list of raw pointers is not an error, because you may wish to have a list of objects whose memory you do not own. But avoiding smart pointers is most assuredly not a solution to any problem, because they're a completely essential tool.
And just use the Standard list. That's what it's for.
In the lines:
// make some links between them here:
psn1->placesVisited.Add(plc1, plc2);
psn2->placesVisited.Add(plc2, plc3);
// add the links to the places as well
plc1->peopleThatVisited.Add(psn1);
plc2->peopleThatVisited.Add(psn1, psn2);
plc3->peopleThatVisited.Add(plc3);
You have instances on the heap that contain pointers to each others. Not only one, but adding the same Place pointer to more than one person will also cause the problem (deleting more than one time the same object in memory).
Telling you to learn STL or to use shared_ptr (Boost) can be a good advice, however, as David Rodríguez said, you need to think of the lifetime of the objects. In other words, you need to redesign this scenario so that the objects do not contain pointers to each other.
Example: Is it really necessary to use pointers? - in this case, and if you require STL lists or vectors, you need to use shared_ptr, but again, if the objects make reference to each other, not even the best shared_ptr implementation will make it.
If this relationship between places and persons is required, design a class or use a container that will carry the references to each other instead of having the persons and places point at each other. Like a many-to-many table in a RDBMS. Then you will have a class/container that will take care of deleting the pointers at the end of the process. This way, no relations between Places and Persons will exist, only in the container.
Regards, J. Rivero
Without looking the exact code of the Add function, and the destructor of your list, it's hard to pin point the problem.
However, as said in the comments, the main problem with this code is that you don't use std::list or std::vector. There are proven efficient implementations, which fit what you need.
First of all, I would certainly use the STL (standard template library), but, I see that you are learning C++, so as an exercise it can be nice to write such a thing as a List template.
First of all, you should never expose data members (e.g. placeVisited and peopleThatVisited). This is a golden rule in object oriented programming. You should use getter and setter methods for that.
Regarding the problem around double deletion: the only solution is having a wrapper class around your pointers, that keeps track of outstanding references. Have a look at the boost::shared_ptr. (Boost is another magnificent well-crafted C++ library).
The program crashes because delete deallocates the memory of the pointer it is given it isn't removing the items from you list. You have the same pointer in at least two lists, so delete is getting called on the same block of memory multiple times this causes the crash.
First, smart pointers are not the answer here. All they will do is
guarantee that the objects never get deleted (since a double linked list
contains cycles, by definition).
Second, there's no way you can pass an argument to the destructor
telling it to delete contained pointers: this would have to be done
either via the list's type, one of its template arguments, partial
specialization or an argument to the constructor. (The latter would
probably require partial specialization as well, to avoid trying to
delete a non-pointer.)
Finally, the way to not delete an object twice is to not call delete on
it twice. I'm not quite sure what you thing is happening in your
destructor, but you never change cursor, so every time through, you're
deleting the same two elements. You probably need something more along
the lines of:
while ( cursor not at end ) {
Elem* next = cursor->next;
delete cursor;
cursor = next;
}
--
James Kanze
And you could easily implement [] around a normal list:
template <class Type>
class mystdlist : public std::list<Type> {
public:
Type& operator[](int index) {
list<Type>::iterator iter = this.begin();
for ( int i = 0; i < index; i++ ) {
iter++;
}
return *iter;
}
};
Why you would want to do this is strange, IMHO. If you want O(1) access, use a vector.
Related
I have been given some code to read which does some geometric operations on meshes.
A mesh data structure, by definition, should contain at least the information
regarding the coordinates of points, edge connectivity and face information.
So, the code given to me has classes to define vertex, edge and face data structure,
named respectively as Vertex, Edge and Face.
However the mesh class looks like this.
class basemesh
{
public:
/* Methods to operate on the protected data below.*/
protected:
/*! list of edges */
std::list<Edge*> m_edges;
/*! list of vertices */
std::list<Vertex*> m_verts;
/*! list of faces */
std::list<Face*> m_faces;
}
My question: Why does the mesh data structure store a list of pointers rather than a
list of the corresponding objects themselves.
e.g why not say directly std::list<Vertex>
I have seen this construct being used in a couple of other C++ codes
Does this have something to do with inheritance of classes? Or is it something to do
with performance with regards to iterating on the list?
This basemesh class is, as the name suggests, a base class from which
other specialized meshes are derived.
There is no performance reasons here. Its simply a case of ownership sharing. Remember this as a rule of thumb: Pointers in C++ are used to share/pass ownership of a resource, or to provide polymorphic behaviour through dynamic binding.
People is talking about performence because you avoid copying the things. Blah, blah, blah.
If you need to copy, you should copy. The only reason why its using pointers is because the author didn't want to copy the things when he/she copies the list of things, in other words, he/she wants to maintain the same things in two locations (lists): Ownership sharing, as I said before.
On the other hand, note that the class is called basemesh. So the real point of the pointers here could be to work with polymorphic vertices, edges, etc (Dynamic binding).
NOTE: If performance was the point here, I'm pretty sure the author would be using compact and aligned non-cache-miss-prone std::vector instead of std::list. In this case, the most presumable reason about the use of pointers is polymorphism, not performance. Anything related to pointers, dereferencing, and transversing linked lists will always have less performance than compact data, exactly what std::vector<Vertex> is, for example. Again, if the use of pointers is not for polymorphism, is for ownership related things, not performance.
Other note: Copying Yes, you are copying. But note what and how are copying. Vertices are, except of a very rare implementation, pairs of floats/ints. There is no gain at all about copying 64bits of floats vs 32/64bits of pointers.
Also note that, except you don't be so lucky, you are copying things stored at the same cache line, or almost at the cache.
A good rule about optimization nowadays is: Try to optimize memory accesses, not CPU cicles. I recommend this thread: What is "cache-friendly" code?, and this for a practical case: Why are elementwise additions much faster in separate loops than in a combined loop?. Finally, this thread contains good notes about optimizing using modern compilers.
My guess is that it's either made for a very unusual specific case, but more likely, it's written by a programmer who doesn't know how heap allocations or std::list actually work, and just blindly use pointers.
It seems very unlikely a std::list of pointers to single vertices was the best option performance- or designwise.
On a practical level if a method changes a point it does not need to reproduce the change in the other data structures. They will all point to the same thing.
But in terms of memory management it would be wise to use smart pointers,
At a guess I'd say it's so that these objects can have pointers to each other (e.g. an Edge can have pointers to two Vertices, each of which can have a pointer back to the Edge).
If all the Vertices lived in a std::list in basemesh, then pointers to them would not be reliable, although list::iterators might work well enough.
Using pointers is less efficient when retrieving inner data in general because you will have to dereference the value every time you access it.
But at the same time it will be more efficient when passing data around, since you are just passing pointers. I guess the solution chosen is related to the fact that data is shared between multiple objects by composition. Eg: multiple Edge instances could refer to same Vertex.
Now std::list guarantees that addresses to values contained are consistent until the element itself is removed so actually doing something like
Edge(const Vertex *v1, const Vertex *v2) { .. }
std::list<Vertex>::iterator it = std::advance(vertices.begin(), 3);
std::list<Vertex>::iterator it2 = std::advance(vertices.begin(), 5);
new Edge(&(*it), &(*it2));
Would work since addresses won't be invalidated so there is no real necessity to use pointers to store objects. Actually by using this solution you don't need to care about memory management of single objects since you won't need to delete them or wrap them into smart pointers.
It's using pointers for performance reasons and to reduce the chance of an error.
Imagine the alternative of not using pointers. Every insertion into class basemesh would cause a copy of the object to be created, and every time you access an object, if you aren't careful, you'll get a copy as well.
For example, imagine this statement:
Edge e = m_edges[0];
e.doSomethingThatModifiesState();
In this example, without pointers, you'll have a copy of the object, and any operations you perform on it will not affect the actual edge object stored in m_edges.
With pointers, you don't have this issue:
Edge* e = m_edges[0];
e->doSomethingThatModifiesState();
In this example, no copy of the object is made, and when you do something, you get the intended behavior.
As many others said the speed is the most obvious reason. Another reason is to get polymorphic behavior through pointers to the base class.
I am relatively new to pointers and have written this merge function. Is this effective use of pointers? and secondly the *two variable, it should not be deleted when they are merged right? that would be the client´s task, not the implementer?
VectorPQueue *VectorPQueue::merge(VectorPQueue *one, VectorPQueue *two) {
int twoSize = two->size();
if (one->size() != 0) {
for (int i = 0; i < twoSize;i++)
{
one->enqueue(two->extractMin());
}
}
return one;
}
The swap function is called like this
one->merge(one, two);
Passing it the these two objects to merge
PQueue *one = PQueue::createPQueue(PQueue::UnsortedVector);
PQueue *two = PQueue::createPQueue(PQueue::UnsortedVector);
In your case pointers are completely unnecessary. You can simply use references.
It is also unnecessary to pass in the argument on which the member function is called. You can get the object on which a member function is called with the this pointer.
/// Merge this with other.
void VectorPQueue::merge(VectorPQueue& other) {
// impl
}
In general: Implementing containers with inheritance is not really the preferred style. Have a look at the standard library and how it implements abstractions over sequences (iterators).
At first sight, I cannot see any pointer-related problems. Although I'd prefer to use references instead, and make merge a member function of VectorPQueue so I don't have to pass the first argument (as others already pointed out). One more thing which confuses me is the check for one->size() != 0 - what would be the problem if one is empty? The code below would still correctly insert two into one, as it depends only on two's size.
Regarding deletion of two:
that would be the client´s task, not the implementer
Well, it's up to you how you want do design your interface. But since the function only adds two's elements to one, I'd say it should not delete it. Btw, I think a better name for this method would be addAllFrom() or something like this.
Regarding pointers in general:
I strongly suggest you take a look into smart pointers. These are a common technique in C++ to reduce memory management effort. Using bare pointers and managing them manually via new/delete is very error-prone, hard to make strongly exception-safe, will almost guarantee you memory leaks etc. Smart pointers on the other hand automatically delete their contained pointers as soon as they are not needed any more. For illustrative purposes, the C++ std lib has auto_ptr (unique_ptr and shared_ptr if your compiler supports C++ 11). It's used like this:
{ // Beginning of scope
std::auto_ptr<PQueue> one(PQueue::createPQueue(PQueue::UnsortedVector));
// Do some work with one...:
one->someFunction();
// ...
} // End of scope - one will automatically be deleted
My personal rules of thumb: Only use pointers wrapped in smart pointers. Only use heap allocated objects at all, if:
they have to live longer than the scope in which they are created, and a copy would be too expensive (C++ 11 luckily has move semantics, which eliminate a lot of such cases)
I have to call virtual functions on them
In all other cases, I try to use stack allocated objects and STL containers as much as possible.
All this might seem a lot at first if you're starting with C++, and it's totally ok (maybe even necessary) to try to fully understand pointers before you venture into smart pointers etc.. but it saves a lot of time spend debugging later on. I'd also recommend reading a few books on C++ - I was actually thinking I understood most of C++, until I read my first book :)
My question is best illustrated with a code sample, so let's just start off with that:
class Game
{
// All this vector does is establish ownership over the Card objects
// It is initialized with data when Game is created and then is never
// changed.
vector<shared_ptr<Card> > m_cards;
// And then we have a bunch of pointers to the Cards.
// All these pointers point to Cards from m_cards.
// These could have been weak_ptrs, but at the moment, they aren't
vector<Card*> m_ptrs;
// Note: In my application, m_ptrs isn't there, instead there are
// pointers all over the place (in objects that are stored in member
// variables of Game.
// Also, in my application, each Card in m_cards will have a pointer
// in m_ptrs (or as I said, really just somewhere), while sometimes
// there is more than one pointer to a Card.
}
Now what I want to do is to make a deep copy of this Game class. I make a new vector with new shared_ptrs in it, which point to new Card objects which are copies of the original Card objects. That part is easy.
Then the trouble starts, the pointers of m_ptrs should be updated to point to the cards in m_cards, which is no simple task.
The only way I could think of to do this is to create a map and fill it during the copying of m_cards (with map[oldPtr] = newPtr) and then to use that to update m_ptrs. However, this is only O(m * log(n)) (m = m_ptrs.size(); n = m_cards.size()). As this is going to be a pretty regular operation* I would like to do this efficiently, and I have the feeling that it should be possible in O(m) using custom pointers. However, I can't seem to find an efficient way of doing this. Anybody who does?
*it's used to create a testbed for the AI, letting it "try out" different moves
Edit: I would like to add a bit on accepting an answer, as I haven't yet. I am waiting until I get back to this project (I got on a side track as I had worked too much on this project - if you do it for fun it's got to stay fun), so it may be a while longer before I accept an answer. Nevertheless, I will accept an answer some time, so don't worry :P
Edit nr 2: I still haven't gotten back to this project. Right now, I am thinking about just taking the O(m * log(n)) way and not complaining, then seeing later if it needs to be faster. However, as I have recently taken some time to learn my patterns, I am also thinking that I really need to refactor this project some time. Oh, and that I might just spend some time working on this problem with all the new knowledge I have under my belt. Since there isn't an answer that says "just stick with the hashmap and see later if it really needs to be faster" (and I would actually be pretty disappointed if there was, as it's not an answer to my question), I am postponing the picking of an answer yet a bit more till I do get back to this project.
Edit nr 3: I still didn't get back to this project. More precisely, it has been shelved indefinitely. I am pretty sure I just wouldn't get my head too bent over the O(m * log(n))right now, and then perhaps look at it later if it turned out to be a problem. However, that would just not have been a good answer to my question, as I explicitly asked for better performance. Not wanting to leave the answers unaccepted any longer, I chose the most helpful answer and accepted it.
Store the pointers as indexes.
As you say they all point to m_Cards which is a vector that can be indexed (is that correct English?).
Either you do that only for storing and convert them back to pointers at loading.
Or you may think of using indices generally instead of pointers.
What about keeping cards elements index instead of pointer:
vector<int> m_indexes;
...
Card* ptr = &m_cards[m_indexes[0]];
Vector with indexes can be copied without changes.
I recently encountered the very similar problem: cloning the class internal structure implemented by pointers and std::vector as an objects storage.
First of all (unrelated to the question though), I'd suggest either stick with smart pointers or with plain structures. In your case it means that it makes much more sense to use vector<weak_ptr<Card> > m_ptrs instead of raw pointers.
About the question itself - one more possible workaround is using pointer differences in the copy constructor. I will demonstrate it for vector of objects, but working with shared pointers will utilize the same principle, the only difference will be in copying of m_cards (you should not simply use assignment if you want objects clones but copy the m_cards element-by-element).
It is very important that the method works only for containers where the elements are guaranteed to be stored consequently (vector, array).
Another very important moment is that the m_ptrs elements should represent only internal Card structrure, i. e. they must point only to the internal m_cards elements.
// assume we store objects directly, not in shared pointers
// the only difference for shared pointers will be in
// m_cards assignment
// and using m_cards[...].get() instead of &m_cards[...]
vector<Card> m_cards;
vector<Card*> m_ptrs;
In that case your array of pointers can be easily computed by using pointers arithmetic taking linear time. In that case your copy constructor will look like this:
Game::Game(const Game &rhs) {
if (rhs.m_cards.empty())
return;
m_cards = rhs.m_cards;
// if we have vector of shared pointers
// and we need vector of pointers to objects clones
// something like this should be done
// for (auto p: rhs.m_cards) {
// // we must be certain here that dereferencing is safe,
// // i. e. object must exist. If not, additional check is required.
// // copy constructor will be called here:
// m_cards.push_back(std::make_shared<Card>(*p));
// }
Card *first = &rhs.m_cards[0];
for (auto p: rhs.m_ptrs) {
m_ptrs.push_back(&m_cards[p - first]);
}
}
Basically in this deepcopy method you will be still working with indexes, but you preserve the convenience of working with pointers in other class methods without storing your indexes separately.
Anyway, for using that kind of structure you should exactly know what you are doing with the class members and why, that requires much more manual control (for example, at least adding/removing elements to/from m_cards should be done consciously, in other case m_ptrs can easily become broken even without copying the object).
Learning C++, so be gentle :)...
I have been designing my application primarily using heap variables (coming from C), so I've designed structures like this:
QList<Criteria*> _Criteria;
// ...
Criteria *c = new Criteria(....);
_Criteria.append(c);
All through my program, I'm passing pointers to specific Criteria, or often the list. So, I have a function declared like this:
QList<Criteria*> Decision::addCriteria(int row,QString cname,QString ctype);
Criteria * Decision::getCriteria(int row,int col)
which inserts a Criteria into a list, and returns the list so my GUI can display it.
I'm wondering if I should have used references, somehow. Since I'm always wanting that exact Criteria back, should I have done:
QList<Criteria> _Criteria;
// ....
Criteria c(....);
_Criteria.append(c);
...
QList<Criteria>& Decision::addCriteria(int row,QString cname,QString ctype);
Criteria& Decision::getCriteria(int row,int col)
(not sure if the latter line is syntactically correct yet, but you get the drift).
All these items are specific, quasi-global items that are the core of my program.
So, the question is this: I can certainly allocate/free all my memory w/o an issue in the method I'm using now, but is there are more C++ way? Would references have been a better choice (it's not too late to change on my side).
TIA
Mike
I would return QList<Criteria> as a plain value. QList is one of Qt's shared classes, meaning that its internal representation is shared between multiple instances so long as none of them are modified.
If the Criteria class is fairly complex, such that an incidental copy made because one of the lists is modified at some point incurs noticable overhead, then I would use QSharedData in the implementation of Criteria so that it, too, is only copied as needed.
This approach has two downsides: one, the copying, if any, is implicit and may happen when you don't expect it to, and two, it doesn't allow for polymorphic use of Criteria. If you have Criteria as the root of a class hierarchy, then you must use pointers. In that case, I would use shared_ptr from Boost or C++ TR1 to avoid memory management hassles, or make Critera inherit publicly from QObject and make all Critera objects children of Decision.
I don't think references would be a better choice. If you are dynamically allocating these objects, you still need keep a copy of the pointer around to delete later. Plus, when passing around pointers you don't have to worry about copy constructors or an implicit sharing technique like QSharedData. You'll still get "that exact Criteria back".
My advice is: Unless you have a really good reason to make things more complex, keep it simple.
However, from a Qt standpoint you should generally not pass around pointers or references to Qt objects. These objects do use implicit sharing so they don't act like "normal" C++ objects. If you are still learning C++ I'd suggest leaving this technique out of your own code for now. But to use Qt effectively you need to understand how it works so I recommend reading more about it here:
http://qt.nokia.com/doc/4.6/implicit-sharing.html
Good luck!
EDIT:
One thing I forgot to mention. If you know you don't want you class to be copied, you can enforce this by declaring a private copy constructor and operator= overload:
class A
{
//Code goes here
private:
A(const A&);
A& operator=(const A&);
};
class MyContainedClass {
};
class MyClass {
public:
MyContainedClass * getElement() {
// ...
std::list<MyContainedClass>::iterator it = ... // retrieve somehow
return &(*it);
}
// other methods
private:
std::list<MyContainedClass> m_contained;
};
Though msdn says std::list should not perform relocations of elements on deletion or insertion, is it a good and common way to return pointer to a list element?
PS: I know that I can use collection of pointers (and will have to delete elements in destructor), collection of shared pointers (which I don't like), etc.
I don't see the use of encapsulating this, but that may be just me. In any case, returning a reference instead of a pointer makes a lot more sense to me.
In a general sort of way, if your "contained class" is truly contained in your "MyClass", then MyClass should not be allowing outsiders to touch its private contents.
So, MyClass should be providing methods to manipulate the contained class objects, not returning pointers to them. So, for example, a method such as "increment the value of the umpteenth contained object", rather than "here is a pointer to the umpteenth contained object, do with it as you wish".
It depends...
It depends on how much encapsulated you want your class to be, and what you want to hide, or show.
The code I see seems ok for me. You're right about the fact the std::list's data and iterators won't be invalidated in case of another data/iterator's modification/deletion.
Now, returning the pointer would hide the fact you're using a std::list as an internal container, and would not let the user to navigate its list. Returning the iterator would let more freedom to navigate this list for the users of the class, but they would "know" they are accessing a STL container.
It's your choice, there, I guess.
Note that if it == std::list<>.end(), then you'll have a problem with this code, but I guess you already know that, and that this is not the subject of this discussion.
Still, there are alternative I summarize below:
Using const will help...
The fact you return a non-const pointer lets the user of you object silently modify any MyContainedClass he/she can get his/her hands on, without telling your object.
Instead or returning a pointer, you could return a const pointer (and suffix your method with const) to stop the user from modifying the data inside the list without using an accessor approved by you (a kind of setElement ?).
const MyContainedClass * getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return &(*it);
}
This will increase somewhat the encapsulation.
What about a reference?
If your method cannot fail (i.e. it always return a valid pointer), then you should consider returning the reference instead of the pointer. Something like:
const MyContainedClass & getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return *it;
}
This has nothing to do with encapsulation, though..
:-p
Using an iterator?
Why not return the iterator instead of the pointer? If for you, navigating the list up and down is ok, then the iterator would be better than the pointer, and is used mostly the same way.
Make the iterator a const_iterator if you want to avoid the user modifying the data.
std::list<MyContainedClass>::const_iterator getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return it;
}
The good side would be that the user would be able to navigate the list. The bad side is that the user would know it is a std::list, so...
Scott Meyers in his book Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library says it's just not worth trying to encapsulate your containers since none of them are completely replaceable for another.
Think good and hard about what you really want MyClass for. I've noticed that some programmers write wrappers for their collections just as a matter of habit, regardless of whether they have any specific needs above and beyond those met by the standard STL collections. If that's your situation, then typedef std::list<MyContainedClass> MyClass and be done with it.
If you do have operations you intend to implement in MyClass, then the success of your encapsulation will depend more on the interface you provide for them than on how you provide access to the underlying list.
No offense meant, but... With the limited information you've provided, it smells like you're punting: exposing internal data because you can't figure out how to implement the operations your client code requires in MyClass... or possibly, because you don't even know yet what operations will be required by your client code. This is a classic problem with trying to write low-level code before the high-level code that requires it; you know what data you'll be working with, but haven't really nailed down exactly what you'll be doing with it yet, so you write a class structure that exposes the raw data all the way to the top. You'd do well to re-think your strategy here.
#cos:
Of course I'm encapsulating
MyContainedClass not just for the sake
of encapsulation. Let's take more
specific example:
Your example does little to allay my fear that you are writing your containers before you know what they'll be used for. Your example container wrapper - Document - has a total of three methods: NewParagraph(), DeleteParagraph(), and GetParagraph(), all of which operate on the contained collection (std::list), and all of which closely mirror operations that std::list provides "out of the box". Document encapsulates std::list in the sense that clients need not be aware of its use in the implementation... but realistically, it is little more than a facade - since you are providing clients raw pointers to the objects stored in the list, the client is still tied implicitly to the implementation.
If we put objects (not pointers) to
container they will be destroyed
automatically (which is good).
Good or bad depends on the needs of your system. What this implementation means is simple: the document owns the Paragraphs, and when a Paragraph is removed from the document any pointers to it immediately become invalid. Which means you must be very careful when implementing something like:
other objects than use collections of
paragraphs, but don't own them.
Now you have a problem. Your object, ParagraphSelectionDialog, has a list of pointers to Paragraph objects owned by the Document. If you are not careful to coordinate these two objects, the Document - or another client by way of the Document - could invalidate some or all of the pointers held by an instance of ParagraphSelectionDialog! There's no easy way to catch this - a pointer to a valid Paragraph looks the same as a pointer to a deallocated Paragraph, and may even end up pointing to a valid - but different - Paragraph instance! Since clients are allowed, and even expected, to retain and dereference these pointers, the Document loses control over them as soon as they are returned from a public method, even while it retains ownership of the Paragraph objects.
This... is bad. You've end up with an incomplete, superficial, encapsulation, a leaky abstraction, and in some ways it is worse than having no abstraction at all. Because you hide the implementation, your clients have no idea of the lifetime of the objects pointed to by your interface. You would probably get lucky most of the time, since most std::list operations do not invalidate references to items they don't modify. And all would be well... until the wrong Paragraph gets deleted, and you find yourself stuck with the task of tracing through the callstack looking for the client that kept that pointer around a little bit too long.
The fix is simple enough: return values or objects that can be stored for as long as they need to be, and verified prior to use. That could be something as simple as an ordinal or ID value that must be passed to the Document in exchange for a usable reference, or as complex as a reference-counted smart pointer or weak pointer... it really depends on the specific needs of your clients. Spec out the client code first, then write your Document to serve.
The Easy way
#cos, For the example you have shown, i would say the easiest way to create this system in C++ would be to not trouble with the reference counting. All you have to do would be to make sure that the program flow first destroys the objects (views) which holds the direct references to the objects (paragraphs) in the collection, before the root Document get destroyed.
The Tough Way
However if you still want to control the lifetimes by reference tracking, you might have to hold references deeper into the hierarchy such that Paragraph objects holds reverse references to the root Document object such that, only when the last paragraph object gets destroyed will the Document object get destructed.
Additionally the paragraph references when used inside the Views class and when passed to other classes, would also have to passed around as reference counted interfaces.
Toughness
This is too much overhead, compared to the simple scheme i listed in the beginning. It avoids all kinds of object counting overheads and more importantly someone who inherits your program does not get trapped in the reference dependency threads traps that criss cross your system.
Alternative Platforms
This kind-of tooling might be easier to perform in a platform that supports and promotes this style of programming like .NET or Java.
You still have to worry about memory
Even with a platform such as this you would still have to ensure your objects get de-referenced in a proper manner. Else outstanding references could eat up your memory in the blink of an eye. So you see, reference counting is not the panacea to good programming practices, though it helps avoid lots of error checks and cleanups, which when applied the whole system considerably eases the programmers task.
Recommendation
That said, coming back to your original question which gave raise to all the reference counting doubts - Is it ok to expose your objects directly from the collection?
Programs cannot exist where all classes / all parts of the program are truly interdependent of each other. No, that would be impossible, as a program is the running manifestation of how your classes / modules interact. The ideal design can only minimize the dependencies and not remove them totally.
So my opinion would be, yes it is not a bad practice to expose the references to the objects from your collection, to other objects that need to work with them, provided you do this in a sane manner
Ensure that only a few classes / parts of your program can get such references to ensure minimum interdependency.
Ensure that the references / pointers passed are interfaces and not concrete objects so that the interdependency is avoided between concrete classes.
Ensure that the references are not further passed along deeper into the program.
Ensure that the program logic takes care of destroying the dependent objects, before cleaning up the actual objects that satisfy those references.
I think the bigger problem is that you're hiding the type of collection so even if you use a collection that doesn't move elements you may change your mind in the future. Externally that's not visible so I'd say it's not a good idea to do this.
std::list will not invalidate any iterators, pointers or references when you add or remove things from the list (apart from any that point the item being removed, obviously), so using a list in this way isn't going to break.
As others have pointed out, you may want not want to be handing out direct access to the private bits of this class. So changing the function to:
const MyContainedClass * getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return &(*it);
}
may be better, or if you always return a valid MyContainedClass object then you could use
const MyContainedClass& getElement() const {
// ...
std::list<MyContainedClass>::const_iterator it = ... // retrieve somehow
return *it;
}
to avoid the calling code having to cope with NULL pointers.
STL will be more familiar to a future programmer than your custom encapsulation, so you should avoid doing this if you can. There will be edge cases that you havent thought about which will come up later in the app's lifetime, wheras STL is failry well reviewed and documented.
Additionally most containers support somewhat similar operations like begin end push etc. So it should be fairly trivial to change the container type in your code should you change the container. eg vector to deque or map to hash_map etc.
Assuming you still want to do this for a more deeper reason, i would say the correct way to do this is to implement all the methods and iterator classes that list implements. Forward the calls to the member list calls when you need no changes. Modify and forward or do some custom actions where you need to do something special (the reason why you decide to this in the first place)
It would be easier if STl classes where designed to be inherited from but for efficiency sake it was decided not to do so. Google for "inherit from STL classes" for more thoughts on this.