Structure for hierarchal Component storage - c++

I've been batting this problem around in my head for a few days now and haven't come to any satisfactory conclusions so I figured I would ask the SO crew for their opinion. For a game that I'm working on I'm using a Component Object Model as described here and here. It's actually going fairly well but my current storage solution is turning out to be limiting (I can only request components by their class name or an arbitrary "family" name). What I would like is the ability to request a given type and iterate through all components of that type or any type derived from it.
In considering this I've first implemented a simple RTTI scheme that stores the base class type through the derived type in that order. This means that the RTTI for, say, a sprite would be: component::renderable::sprite. This allows me to compare types easily to see if type A is derived from type B simply by comparing all the elements of B against the leading elements of A: i.e. component::renderable::sprite is derived from component::renderable but not component::timer. Simple, effective, and already implemented.
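For illustration, the prefix comparison described above could look roughly like this. It is only a sketch: it assumes the RTTI is stored as a vector of names from base to derived, and TypePath and isDerivedFrom are made-up names, not the ones from the actual code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical representation: type names ordered from base to most derived,
// e.g. {"component", "renderable", "sprite"}.
typedef std::vector<std::string> TypePath;

// A is derived from (or the same as) B when B's path is a prefix of A's path.
bool isDerivedFrom(const TypePath &a, const TypePath &b) {
    if (b.size() > a.size()) return false;   // a longer path can never be a prefix
    return std::equal(b.begin(), b.end(), a.begin());
}
```

With this, component::renderable::sprite compares as derived from component::renderable, while component::timer fails at the second element.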
What I want now is a way to store the components in a way that represents that hierarchy. The first thing that comes to mind is a tree using the types as nodes, like so:
          component
          /       \
      timer     renderable
       /         /     \
shotTimer    sprite   particle
At each node I would store a list of all components of that type. That way requesting the "component::renderable" node will give me access to all renderable components regardless of derived type. The rub is that I want to be able to access those components with an iterator, so that I could do something like this:
for_each(renderable.begin(), renderable.end(), renderFunc);
and have that iterate over the entire tree from renderable down. I have this pretty much working using a really ugly map/vector/tree node structure and a custom forward iterator that tracks a node stack of where I've been. All the while implementing, though, I felt that there must be a better, clearer way... I just can't think of one :(
So the question is: Am I over-complicating this needlessly? Is there some obvious simplification I'm missing, or pre-existing structure I should be using? Or is this just inherently a complex problem and I'm probably doing just fine already?
Thanks for any input you have!

You should think about how often you need to do the following:
traverse the tree
add/remove elements from the tree
how many objects do you need to keep track of
Whichever of these is most frequent will help determine the optimal solution.
Perhaps instead of making a complex tree, just have a list of all types and add a pointer to the object for each type it is derived from. Something like this:
map<string, set<component *> > myTypeList;
Then for an object that is of type component::renderable::sprite
myTypeList["component"].insert(&object);
myTypeList["renderable"].insert(&object);
myTypeList["sprite"].insert(&object);
By registering each object in multiple lists, it then becomes easy to do something to all objects of a given type and its subtypes:
for_each(myTypeList["renderable"].begin(),myTypeList["renderable"].end(),renderFunc);
Note that std::set and my std::map construct may not be the optimum choice, depending on how you will use it.
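To make the multiple-registration idea concrete, here is a minimal self-contained sketch. TypeRegistry, add, remove and all are hypothetical names introduced for the example, and the type path is assumed to be available as a list of names from base to derived.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct Component {
    virtual ~Component() {}
};

// Hypothetical registry: type name -> every live component of that type or a subtype.
class TypeRegistry {
public:
    // 'path' lists the component's type names from base to most derived,
    // e.g. {"component", "renderable", "sprite"}.
    void add(Component *c, const std::vector<std::string> &path) {
        for (const std::string &name : path) lists_[name].insert(c);
    }
    // Must be called with the same path before the component is destroyed.
    void remove(Component *c, const std::vector<std::string> &path) {
        for (const std::string &name : path) lists_[name].erase(c);
    }
    // All components registered under 'name' (an empty set if there are none).
    const std::set<Component *> &all(const std::string &name) const {
        static const std::set<Component *> empty;
        std::map<std::string, std::set<Component *> >::const_iterator it = lists_.find(name);
        return it == lists_.end() ? empty : it->second;
    }
private:
    std::map<std::string, std::set<Component *> > lists_;
};
```

The cost is one set entry per level of the hierarchy for every component, in exchange for plain begin()/end() iteration over any type and its subtypes.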
Or perhaps a hybrid approach storing only the class hierarchy in the tree
map<string, set<string> > myTypeList;
map<string, set<component *> > myObjectList;
myTypeList["component"].insert("component");
myTypeList["component"].insert("renderable");
myTypeList["component"].insert("sprite");
myTypeList["renderable"].insert("renderable");
myTypeList["renderable"].insert("sprite");
myTypeList["sprite"].insert("sprite");
// rough sketch: for each type name, apply func to every object stored under that exact type
template <typename UnaryFunction>
struct doForList {
UnaryFunction func;
doForList(UnaryFunction f) : func(f) {}
void operator()(const string &typeName) const {
for_each(myObjectList[typeName].begin(), myObjectList[typeName].end(), func);
}
};
// assuming myFunc is declared as void myFunc(component *)
for_each(myTypeList["renderable"].begin(), myTypeList["renderable"].end(), doForList<void (*)(component *)>(myFunc));

The answer depends on the order you need them in. You pretty much have a choice of preorder, postorder, and inorder. These have obvious analogues in breadth-first and depth-first search, and in general you'll have trouble beating them.
Now, if you constrain the problem a little, there are a number of old-fashioned algorithms for storing trees of arbitrary data as arrays. We used them a lot in the FORTRAN days. The key trick in one of them was to store the children of A, say A2 and A3, at index(A)*2 and index(A)*2+1. The problem is that if your tree is sparse you waste space, and the size of your tree is limited by the array size. But, if I remember this right, you get the elements in breadth-first order with a simple DO loop.
Have a look at Knuth Volume 3, there is a TON of that stuff in there.
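As a rough illustration of the array trick mentioned above, here is a small sketch. It assumes 1-based indexing (slot 0 unused) and at most two children per node; ArrayTree and its members are names made up for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Implicit binary tree stored in an array using 1-based indexing:
// the children of slot i live in slots 2*i and 2*i+1, and slot 0 is unused.
class ArrayTree {
public:
    explicit ArrayTree(std::size_t capacity)
        : slots_(capacity + 1), used_(capacity + 1, false) {}

    static std::size_t left(std::size_t i)  { return 2 * i; }
    static std::size_t right(std::size_t i) { return 2 * i + 1; }

    // Stores a value at index i; returns false if i is outside the fixed array.
    bool set(std::size_t i, const std::string &value) {
        if (i == 0 || i >= slots_.size()) return false;
        slots_[i] = value;
        used_[i] = true;
        return true;
    }

    // A plain loop over the array visits the nodes in breadth-first order,
    // skipping the holes that a sparse tree leaves behind.
    std::vector<std::string> breadthFirst() const {
        std::vector<std::string> out;
        for (std::size_t i = 1; i < slots_.size(); ++i)
            if (used_[i]) out.push_back(slots_[i]);
        return out;
    }
private:
    std::vector<std::string> slots_;
    std::vector<bool> used_;
};
```

Note that this only fits trees with at most two children per node, so a component hierarchy with wider fan-out would need a different encoding.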

If you want to see code for an existing implementation, the Game Programming Gems 5 article referenced in the Cowboy Programming page comes with a somewhat stripped down version of the code we used for our component system (I did a fair chunk of the design and implementation of the system described in that article).
I'd need to go back and recheck the code, which I can't do right now, but we didn't represent things in a hierarchy in the way you show. Although components lived in a class hierarchy in code, the runtime representation was a flat list. Components just declared a list of interfaces that they implemented. The user could query for interfaces or concrete types.
So, in your example, Sprite and Particle would declare that they implemented the RENDERABLE interface, and if we wanted to do something to all renderables, we'd just loop through the list of active components and check each one. Not terribly efficient on the face of it, but it was fine in practice. The main reason it wasn't an issue was that it actually turns out not to be a very common operation. Things like renderables, for example, added themselves to the render scene at creation, so the global scene manager maintained its own list of renderable objects and never needed to query the component system for them. Similarly with physics and collision components and that sort of thing.
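A flat list with declared interfaces might be sketched as follows. This is not the code from the article; InterfaceId, implements and query are hypothetical names, and the real system may well have declared interfaces differently.

```cpp
#include <vector>

// Hypothetical interface IDs; the article's real identifiers are not shown here.
enum InterfaceId { RENDERABLE, UPDATABLE };

struct Component {
    virtual ~Component() {}
    // Each component reports which interfaces it implements.
    virtual bool implements(InterfaceId id) const = 0;
};

struct Sprite : Component {
    bool implements(InterfaceId id) const override { return id == RENDERABLE; }
};

struct Timer : Component {
    bool implements(InterfaceId id) const override { return id == UPDATABLE; }
};

// Linear scan over the flat list of active components.
std::vector<Component *> query(const std::vector<Component *> &active, InterfaceId id) {
    std::vector<Component *> out;
    for (Component *c : active)
        if (c->implements(id)) out.push_back(c);
    return out;
}
```

The scan is O(n) per query, which matches the point above: it is acceptable when queries are rare and subsystems keep their own lists.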

visitor pattern adding new functionality

I've read this question about visitor patterns https://softwareengineering.stackexchange.com/questions/132403/should-i-use-friend-classes-in-c-to-allow-access-to-hidden-members. In one of the answers I've read
Visitor give you the ability to add functionality to a class without actually touching the class itself.
But in the visited object we have to add a new interface, so we actually "touch" the class (or at least, in some cases, add setters and getters, which also changes the class).
How exactly can I add functionality with a visitor without changing the visited class?
The visitor pattern indeed assumes that each class interface is general enough, so that, if you would know the actual type of the object, you would be able to perform the operation from outside the class. If this is not the starting point, visitor indeed might not apply.
(Note that this assumption is relatively weak - e.g., if each data member has a getter, then it is trivially achieved for any const operation.)
The focus of this pattern is different. If
this is the starting point, and
you need to support an increasing number of operations,
then what changes do you need to make to the classes' code in order to dispatch new operations applied to pointers (or references) to the base class?
To make this more concrete, take the classic visitor CAD example:
Consider the design of a 2D CAD system. At its core there are several types to represent basic geometric shapes like circles, lines and arcs. The entities are ordered into layers, and at the top of the type hierarchy is the drawing, which is simply a list of layers, plus some additional properties.
A fundamental operation on this type hierarchy is saving the drawing to the system's native file format. At first glance it may seem acceptable to add local save methods to all types in the hierarchy. But then we also want to be able to save drawings to other file formats, and adding more and more methods for saving into lots of different file formats soon clutters the relatively pure geometric data structure we started out with.
The starting point of the visitor pattern is that, say, a circle, has sufficient getters for its specifics, e.g., its radius. If that's not the case, then, indeed, there's a problem (in fact, it's probably a badly designed CAD code base anyway).
Starting from this point, though, when considering new operations, e.g., writing to file type A, there are two approaches:
1. implement a virtual method like write_to_file_type_a for each class and each operation
2. implement a virtual method accept_visitor for each class only, only once
The "without actually touching the class itself" in your question means, in point 2 just above, that this is all that's now needed to dispatch future visitors to the correct classes. It doesn't mean that the visitor will start writing getters, for example.
Once a visitor interface has been written for one purpose, you can visit the class in different ways. The different visiting does not require touching the class again, assuming you are visiting the same components.
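A minimal sketch of the CAD example may help here. The class names (Shape, Circle, Line, ShapeVisitor, TextWriter) are invented for illustration; the point is that each shape gets a single accept method and exposes getters, and every later operation is a new visitor.

```cpp
#include <sstream>
#include <string>

struct Circle;
struct Line;

// One visit overload per concrete shape; each new operation implements this interface.
struct ShapeVisitor {
    virtual ~ShapeVisitor() {}
    virtual void visit(const Circle &c) = 0;
    virtual void visit(const Line &l) = 0;
};

struct Shape {
    virtual ~Shape() {}
    // The only hook each class needs, written once.
    virtual void accept(ShapeVisitor &v) const = 0;
};

struct Circle : Shape {
    explicit Circle(double r) : radius_(r) {}
    double radius() const { return radius_; }   // getter the visitors rely on
    void accept(ShapeVisitor &v) const override { v.visit(*this); }
private:
    double radius_;
};

struct Line : Shape {
    explicit Line(double len) : length_(len) {}
    double length() const { return length_; }
    void accept(ShapeVisitor &v) const override { v.visit(*this); }
private:
    double length_;
};

// An operation added later without editing Circle or Line.
struct TextWriter : ShapeVisitor {
    std::ostringstream out;
    void visit(const Circle &c) override { out << "circle " << c.radius() << "\n"; }
    void visit(const Line &l) override { out << "line " << l.length() << "\n"; }
};
```

Supporting another file format would mean writing another ShapeVisitor subclass; Circle and Line stay as they are, provided their getters already expose what the new writer needs.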

List design (Object oriented) suggestion needed

I'm trying to implement a generic class for lists for an embedded device using C++. Such a class will provide methods to update the list, sort the list, filter the list based on some user-specified criteria, group the list based on some user-specified criteria, etc. But there are quite a few varieties of lists I want this generic class to support, and each of these varieties can have different display aspects. Example: one variety of list can have strings and floating point numbers in each of its elements. Another variety could have a bitmap, a string and a special character in each of its elements, etc.
I wrote down a class with the methods of interest (sort, group, etc). This class has an object of another class (say DisplayAspect) as its member. But the number of member variables and the type of each member variable of class DisplayAspect is unknown. What would be a better way to implement this?
Why not use std::list? C++ provides it, and it offers all the functionality you mentioned (it is a class template, so it supports any data type you can think of).
Also, there is no point reinventing the wheel, as the code you write will almost never be as efficient as std::list.
If you still want to reinvent this wheel, you should write a template list class.
First, you should probably use std::list as your list, as others have stated. It seems to me that you are having problems more with what to put in the list, however, so I'm focusing on that part of the question.
Since you also want to store multiple bits of information in each element of the list, you will need to create multiple classes, one to store each combination. You don't describe why you are storing multiple bits of information, but you'd want to use a logical name for each class. So if, for example, you were storing a name and a price (a string and a double), you could give the class some name like Product.
You mention creating a class called DisplayAspect.
If this is because you want to have one piece of code print all of these lists, then you should use inheritance and polymorphism to accomplish this goal. One way to accomplish that is to make your DisplayAspect class an abstract class with the needed functions (printItem() for example) pure virtual and have each of the classes you created for the combinations of data be subclasses of this DisplayAspect class.
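The inheritance route could be sketched like this. Product follows the example above; Label and renderAll are names invented here, and printItem returns a string rather than printing so that the sketch does not assume any particular display API on the device.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Abstract base: the display code only ever sees this interface.
struct DisplayAspect {
    virtual ~DisplayAspect() {}
    virtual std::string printItem() const = 0;
};

// Element variety holding a string and a floating point number.
struct Product : DisplayAspect {
    std::string name;
    double price;
    Product(const std::string &n, double p) : name(n), price(p) {}
    std::string printItem() const override {
        std::ostringstream os;
        os << name << ": " << price;
        return os.str();
    }
};

// Hypothetical element variety holding a string and a special character.
struct Label : DisplayAspect {
    std::string text;
    char marker;
    Label(const std::string &t, char m) : text(t), marker(m) {}
    std::string printItem() const override { return std::string(1, marker) + " " + text; }
};

// One piece of display code for every variety of list element.
std::string renderAll(const std::vector<const DisplayAspect *> &items) {
    std::string out;
    for (const DisplayAspect *item : items) out += item->printItem() + "\n";
    return out;
}
```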
If, on the other hand, you created the DisplayAspect class so that you could reuse your list code, you should look into template classes. std::list is an example of a template class and it will hold any type you'd like to put into it and in that case, you could drop your DisplayAspect class.
Others (e.g., @Als) have already given the obvious, direct answer to the question you asked. If you really want a linked list, they're undoubtedly correct: std::list is the obvious first choice.
I, however, am going to suggest that you probably don't want a linked list at all. A linked list is only rarely a useful data structure. Given what you've said you want (sorting, grouping), and especially your target (embedded system, so you probably don't have a lot of memory to waste) a linked list probably isn't a very good choice for what you're trying to do. At least right off, it sounds like something closer to an array probably makes a lot more sense.
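As one possible shape for an array-like alternative, here is a small sketch of a generic list backed by std::vector. ItemList and its member names are made up for the example, and whether std::vector suits a given embedded target is an assumption that would need checking.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical generic list backed by contiguous storage instead of linked nodes.
template <typename T>
class ItemList {
public:
    void add(const T &item) { items_.push_back(item); }

    // Sort using a caller-supplied "less than" criterion.
    template <typename Less>
    void sort(Less less) { std::sort(items_.begin(), items_.end(), less); }

    // Return a new list containing only the items that satisfy the predicate.
    template <typename Pred>
    ItemList filter(Pred keep) const {
        ItemList out;
        for (const T &item : items_)
            if (keep(item)) out.add(item);
        return out;
    }

    const std::vector<T> &items() const { return items_; }
private:
    std::vector<T> items_;
};
```

Because the element type is a template parameter, each variety of list element can be its own small struct, and the sorting and filtering criteria are passed in by the caller.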
If you end up (mistakenly) deciding that a linked list really is the right choice, there's a fair chance you only need a singly linked list though. For that, you might want to look at Boost Slist. While it's a little extra work to use (it's intrusive), this will generally have lower overhead, so it's at least not quite as poor a choice as many generic linked lists.

Compartmentalisation and Design of Classes in C++

In my spare time, I've been taking code I've written for various purposes and appropriating them into other languages just to have a look at what's out there. Currently I'm taking a genetic programming graph colouring algorithm, originally written in Java, and trying to coerce it into C++.
The arbitrary data structure I'm using for the task has a few classes. In Java, it wasn't so much of an issue for me because I had been exposed to it for a while. The graph structure was only created once, and a Colouring was assigned to that. The Colouring (specifically finding a mostly optimal one) was the real point of the code. I could have a Graph class with inner classes like Node and Edge, for instance, or I could have a package graph with classes Graph, Node, Edge, etc.
The first case above might lend itself well to my idea of C++. A main *.cpp file might have some classes Node, Graph, Edge, defined in it. But this seems to really be missing the point of C++, from what I can tell. I'm just taking what I wrote in Java and forcing it into C++, adding destructors where appropriate and turning object references to pointers. I'm not yet thinking in C++. Do these classes bear separating into separate *.cpp files? Should they be separated, and then compiled as a library to use in the main program? What I really need are some good resources or contrived examples (or even rules of thumb) to say, in C++ programming, what are the different options that exist and when is it a good idea to choose one over the other?
EDIT: I've been asked by @Pawel Zubrycki to provide some example code. I'm not going to do this, because each component is fairly trivial - It generally has a reference to the next thing, and some get/set methods. I will, however, describe it.
It's essentially an incidence list. There is some unnecessary use of classes termed ...Pointer - they were a product of a literal translation of a diagram first used to explain incidence lists to me.
There is a container class, VertexList, which contains a head element VertexPointer, and methods to add new VertexPointer objects (Adding it to the graph, but not connecting it to any other nodes, allowing searches to search non-connected graphs), naive search for indices on Vertex objects, etc. Every VertexPointer has a Vertex object, as well as a VertexPointer next;, and all those handy hasNext() methods that you might expect. A Vertex also has an associated ConnectionList
The same is duplicated for EdgeList, EdgePointer, and Edge, except that an Edge is associated with two Connection objects.
ConnectionList and Connection: ConnectionList mimicking VertexList or EdgeList, having a Connection head; and all those handy methods you might expect, like addConnection(). A Connection has an Edge associated with it, as well as some Connection next;
This allows us to easily get the connected components of any one point in the graph, and have an arbitrary number of connections.
It seems pretty over-the-top complicated, but the same functionality could be duplicated with some LinkedList of Vertex objects, a LinkedList of Edge objects, and a number of LinkedList of Connection objects. The LinkedList of Vertex objects allows us to iterate over all Vertices for exhaustive searches on Vertices, and the same applies for edges. The LinkedList objects of Connection allow us to quickly traverse to any connected Vertices and to arbitrarily add or remove connections in the graph. This step up in complexity was added to deal with the complexity of evaluating a certain colouring of a graph (weighted edges, quick traversal of local subgraphs, etc.)
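For comparison, the simplified structure described above could be sketched in C++ with standard containers roughly as follows. This is an assumption-laden sketch rather than a translation of the Java code: it uses std::vector and indices in place of the LinkedList objects, and the member names are invented.

```cpp
#include <cstddef>
#include <vector>

// Sketch of an incidence list using indices instead of hand-written linked nodes.
struct Edge {
    std::size_t a, b;   // indices of the two endpoint vertices
    double weight;
};

struct Vertex {
    int colour;                             // -1 means "not coloured yet"
    std::vector<std::size_t> connections;   // indices into Graph::edges
};

class Graph {
public:
    // Adds an unconnected vertex and returns its index.
    std::size_t addVertex() {
        vertices.push_back(Vertex{-1, {}});
        return vertices.size() - 1;
    }
    // Adds an edge and records it in the connection list of both endpoints.
    std::size_t addEdge(std::size_t a, std::size_t b, double weight) {
        edges.push_back(Edge{a, b, weight});
        std::size_t id = edges.size() - 1;
        vertices[a].connections.push_back(id);
        vertices[b].connections.push_back(id);
        return id;
    }
    // Vertices adjacent to v, found by following its connections.
    std::vector<std::size_t> neighbours(std::size_t v) const {
        std::vector<std::size_t> out;
        for (std::size_t id : vertices[v].connections)
            out.push_back(edges[id].a == v ? edges[id].b : edges[id].a);
        return out;
    }
    std::vector<Vertex> vertices;
    std::vector<Edge> edges;
};
```

There are no destructors or raw owning pointers here because the containers manage the memory, which is one of the larger differences from a literal Java-to-C++ translation.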
If you have classes like Node, Graph and Edge, and their implementation is not too large, it makes perfectly good sense to define them in one and the same .cpp file. After all, they are meant to be used together.
In C++, a package like this is called a component. Usually it makes more sense to think in components than classes, since C++ is not only an OOP language and classes are not always the preferred way to do things.
If you want to learn more about the preferred way to organize code in C++, I recommend Large Scale C++ Software Design.
BTW: Making a library out of these classes really seems overkill.

Shared single variable for list?

This problem is a little difficult to describe, so bear with me if it isn't clear.
I want to implement a doubly-linked list with a single, universally accessible [to the items inside] Head, End and Iter pointers - this would greatly reduce memory overhead and processing/accessing times...
Static almost fulfills this role - except, it's shared by all objects of the same class - which is what I don't want [as I might have multiple doubly-linked lists - I need one per list, not one per class]. So what I need is something similar to static, except it's localised to different declarations.
Head/Node methods become complicated (notably as it uses templates) and I want to avoid this at all costs. Head just ends up having duplicate functions of Node [so Node is accessible], which seems a waste and added complexity just to have three local-universal variables.
What I'd like is something similar to this:
class Test
{
private:
static Test *Head; //Single universal declaration!
static Test *End;
static Test *Iter;
//etc etc
};
Except...
Test A; //Set of 'static' variables 'unique' to A
Test B; //Set of 'static' variables 'unique' to B
I am willing to entertain any and all solutions to the problem, but please avoid complicated solutions - this is meant as an improvement and needs to be quick and simple to implement.
Additional Information [as requested]:
There isn't a 'problem' per se [aside in terms of avoiding overhead and design] - this is setting the frame-work/ground-work for several other classes/functions to build on. So the class needs to be able to handle multiple roles/variables/classes - for this, it has to be templated [although this isn't entirely relevant].
One [of many] of its main roles is storing individual characters [loaded from files] in separate Nodes. Given the size can vary, it has to be dynamic. However, as one of its roles involves loading from files, it can't be an array [as reading the file to work out number of arguments, characters etc causes harddrive/access bottlenecks]. So...
...Singly-linked lists would allow a character to be [easily] added [to the list] on each pass that gets a character [and counted at the same time - solving two problems in one]. The problem is singly-linked lists are very hard to [safely] delete, and navigation is one way. Which is a problem as this hinders search functionality, and notably, the intended multipurpose role...
...So the conclusion is it has to be a doubly-linked list. I don't like the STL or standard lists as I have no idea of their efficiency or safety, or indeed, compatibility with additional features the class has to support. So it has to be a custom built D-L-List...
...However I previously (some time ago) implemented a Head/Node method - it worked. However it became complex and difficult to debug as Head and Node shared functions. This time around I just want a simple, single [Readable! It's going to be shared!] class that somehow sidesteps the almost 'bureaucratic' nature of C++. That means no Head/Iter/End copying overhead (and all functions/variables/debugging required for it) and no Head system with its duplication...
...Static is the closest I get. Perhaps there is a way that somehow, you have Class A that stores the three variables, and a Class B that stores the list - and both of them are aware of each other and are able to communicate via some method/function (no pointer storage!)...
...Something along those lines. I am pretty sure there is some hierarchy or sub-class or inheritance trick that would pull this off, and I need someone who knows the finer arts better than I do to refine my raw idea or something.
If static variables are not suitable, you have only one possibility - use instance variables.
If you want to share the variables between the items, put them in the list itself and maintain a pointer to the list in each item as follows:
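class Item; // forward declaration so List can hold Item pointers
class List
{
Item* head;
Item* end;
Item* iter;
};
class Item
{
List* list; // back-pointer to the owning list
};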
Make a List class (as already shown by vitaut), but add a makeEntry() function to which a reference to the List can be passed. If List becomes more complicated, I would isolate these members in a ListInfo class, so the nodes only have access to them.
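A slightly fuller sketch of the List/Item arrangement, under the assumption that the list stores single characters as in the question. The append function and the count member are additions made up for this example; head, end and iter live once per List, and each Item carries one back-pointer.

```cpp
#include <cstddef>

class List;

// Each item keeps one pointer back to its owning list instead of its own head/end/iter.
struct Item {
    char value;
    Item *prev;
    Item *next;
    List *list;
};

class List {
public:
    List() : head(nullptr), end(nullptr), iter(nullptr), count(0) {}
    List(const List &) = delete;              // the list owns its nodes, so no copying
    List &operator=(const List &) = delete;
    ~List() {
        while (head) { Item *n = head->next; delete head; head = n; }
    }
    // Appends a new item that knows which list it belongs to, counting as it goes.
    Item *append(char value) {
        Item *item = new Item{value, end, nullptr, this};
        if (end) end->next = item; else head = item;
        end = item;
        ++count;
        return item;
    }
    Item *head;
    Item *end;
    Item *iter;
    std::size_t count;
};
```

Two List objects then have independent head, end and iter values, which is the "static, but per list" behaviour asked for, at the price of one extra pointer per item.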

Most efficient way to add data to an instance

I have a class, let's say Person, which is managed by another class/module, let's say PersonPool.
I have another module in my application, let's say module M, that wants to associate information with a person, in the most efficient way. I considered the following alternatives:
Add a data member to Person, which is accessed by the other part of the application. Advantage is that it is probably the fastest way. Disadvantage is that this is quite invasive. Person doesn't need to know anything about this extra data, and if I want to shield this data member from other modules, I need to make it private and make module M a friend, which I don't like.
Add a 'generic' property bag to Person, in which other modules can add additional properties. Advantage is that it's not invasive (besides having the property bag), and it's easy to add 'properties' by other modules as well. Disadvantage is that it is much slower than simply getting the value directly from Person.
Use a map/hashmap in module M, which maps the Person (pointer, id) to the value we want to store. This looks like the best solution in terms of separation of data, but again is much slower.
Give each person a unique number and make sure that no two persons ever get the same number during history (I don't even want to have these persons reuse a number, because then data of an old person may be mixed up with the data of a new person). Then the external module can simply use a vector to map the person's unique number to the specific data. Advantage is that we don't invade the Person class with data it doesn't need to know of (except its unique number), and that we have a quick way of getting the data specifically for module M from the vector. Disadvantage is that the vector may become really big if lots of persons are deleted and created (because we don't want to reuse the unique number).
In the last alternative, the problem could be solved by using a sparse vector, but I don't know if there are very efficient implementations of a sparse vector (faster than a map/hashmap).
Are there other ways of getting this done?
Or is there an efficient sparse vector that might solve the memory problem of the last alternative?
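For reference, the third alternative (a map kept inside module M) might look something like the sketch below. It borrows the never-reused id from the fourth alternative as the key; ModuleM, setScore, getScore and forget are hypothetical names, and the stored value is assumed to be a single double.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical Person with a stable, never-reused id (an assumption for this sketch).
struct Person {
    unsigned long id;
    std::string name;
};

// Module M keeps its own data keyed by person id; Person knows nothing about it.
class ModuleM {
public:
    void setScore(const Person &p, double score) { scores_[p.id] = score; }

    // Returns true and writes the score if module M has data for this person.
    bool getScore(const Person &p, double &score) const {
        std::unordered_map<unsigned long, double>::const_iterator it = scores_.find(p.id);
        if (it == scores_.end()) return false;
        score = it->second;
        return true;
    }

    // Call when a person is destroyed so the entry does not linger.
    void forget(const Person &p) { scores_.erase(p.id); }
private:
    std::unordered_map<unsigned long, double> scores_;
};
```

Memory grows with the number of persons that actually have data in M rather than with the highest id ever issued, which is the property the sparse-vector idea is after; whether the hash lookup is fast enough is something only a measurement can settle.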
I would time the solution with map/hashmap and go with it if it performs well enough. Otherwise you have no choice but to add those properties to the class, as this is the most efficient way.
Alternatively, you can create a subclass of Person, basically forward all the interface methods to the original class but add all the properties you want and just change original Person to your own modified one during some of the calls to M.
This way module M will see the subclass and all the properties it needs but all other modules would think of it as just an instance of Person class and will not be able to see your custom properties.
The first and third are reasonably common techniques. The second is how dynamic programming languages such as Python and Javascript implement member data for objects, so do not dismiss it out of hand as impossibly slow. The fourth is in the same ballpark as how relational databases work. It is possible, but difficult, to make relational databases run like the clappers.
In short, you've described 4 widely used techniques. The only way to rule any of them out is with details specific to your problem (required performance, number of Persons, number of properties, number of modules in your code that will want to do this, etc), and corresponding measurements.
Another possibility is for module M to define a class which inherits from Person, and adds extra data members. The principle here is that M's idea of a person differs from Person's idea of a person, so describe M's idea as a class. Of course this only works if all other modules operating on the same Person objects are doing so via polymorphism, and furthermore if M can be made responsible for creating the objects (perhaps via dependency injection of a factory). That's quite a big "if". An even bigger one, if nothing other than M needs to do anything life-cycle-ish with the objects, then you may be able to use composition or private inheritance in preference to public inheritance. But none of it is any use if module N is going to create a collection of Persons, and then module M wants to attach extra data to them.