Most efficient way to add data to an instance - c++

I have a class, let's say Person, which is managed by another class/module, let's say PersonPool.
I have another module in my application, let's say module M, that wants to associate information with a person, in the most efficient way. I considered the following alternatives:
Add a data member to Person, which is accessed by the other part of the application. Advantage is that it is probably the fastest way. Disadvantage is that this is quite invasive. Person doesn't need to know anything about this extra data, and if I want to shield this data member from other modules, I need to make it private and make module M a friend, which I don't like.
Add a 'generic' property bag to Person, in which other modules can add additional properties. Advantage is that it's not invasive (besides having the property bag), and it's easy to add 'properties' by other modules as well. Disadvantage is that it is much slower than simply getting the value directly from Person.
Use a map/hashmap in module M, which maps the Person (pointer, id) to the value we want to store. This looks like the best solution in terms of separation of data, but again is much slower.
Give each person a unique number and make sure that no two persons ever get the same number during history (I don't even want to have these persons reuse a number, because then data of an old person may be mixed up with the data of a new person). Then the external module can simply use a vector to map the person's unique number to the specific data. Advantage is that we don't invade the Person class with data it doesn't need to know of (except his unique nubmer), and that we have a quick way of getting the data specifically for module M from the vector. Disadvantage is that the vector may become really big if lots of persons are deleted and created (because we don't want to reuse the unique number).
In the last alternative, the problem could be solved by using a sparse vector, but I don't know if there are very efficient implementations of a sparse vector (faster than a map/hashmap).
Are there other ways of getting this done?
Or is there an efficient sparse vector that might solve the memory problem of the last alternative?

I would time the solution with map/hashmap and go with it if it performs good enough. Otherwise you have no choice but add those properties to the class as this is the most efficient way.
Alternatively, you can create a subclass of Person, basically forward all the interface methods to the original class but add all the properties you want and just change original Person to your own modified one during some of the calls to M.
This way module M will see the subclass and all the properties it needs but all other modules would think of it as just an instance of Person class and will not be able to see your custom properties.

The first and third are reasonably common techniques. The second is how dynamic programming languages such as Python and Javascript implement member data for objects, so do not dismiss it out of hand as impossibly slow. The fourth is in the same ballpark as how relational databases work. It is possible, but difficult, to make relational databases run the like the clappers.
In short, you've described 4 widely used techniques. The only way to rule any of them out is with details specific to your problem (required performance, number of Persons, number of properties, number of modules in your code that will want to do this, etc), and corresponding measurements.
Another possibility is for module M to define a class which inherits from Person, and adds extra data members. The principle here is that M's idea of a person differs from Person's idea of a person, so describe M's idea as a class. Of course this only works if all other modules operating on the same Person objects are doing so via polymorphism, and furthermore if M can be made responsible for creating the objects (perhaps via dependency injection of a factory). That's quite a big "if". An even bigger one, if nothing other than M needs to do anything life-cycle-ish with the objects, then you may be able to use composition or private inheritance in preference to public inheritance. But none of it is any use if module N is going to create a collection of Persons, and then module M wants to attach extra data to them.

Related

visitor pattern adding new functionality

I've read thes question about visitor patterns https://softwareengineering.stackexchange.com/questions/132403/should-i-use-friend-classes-in-c-to-allow-access-to-hidden-members. In one of the answers I've read
Visitor give you the ability to add functionality to a class without actually touching the class itself.
But in visited object we have to add either new interface, so we actualy "touch" the class (or at least in some cases to put setters and getters, also changing the class).
How exactly I will add functionality with visitor without changing visiting class?
The visitor pattern indeed assumes that each class interface is general enough, so that, if you would know the actual type of the object, you would be able to perform the operation from outside the class. If this is not the starting point, visitor indeed might not apply.
(Note that this assumption is relatively weak - e.g., if each data member has a getter, then it is trivially achieved for any const operation.)
The focus of this pattern is different. If
this is the starting point
you need to support an increasing number of operations
then what changes to the classs' code do you need to do in order to dispatch new operations applied to pointers (or references) to the base class.
To make this more concrete, take the classic visitor CAD example:
Consider the design of a 2D CAD system. At its core there are several types to represent basic geometric shapes like circles, lines and arcs. The entities are ordered into layers, and at the top of the type hierarchy is the drawing, which is simply a list of layers, plus some additional properties.
A fundamental operation on this type hierarchy is saving the drawing to the system's native file format. At first glance it may seem acceptable to add local save methods to all types in the hierarchy. But then we also want to be able to save drawings to other file formats, and adding more and more methods for saving into lots of different file formats soon clutters the relatively pure geometric data structure we started out with.
The starting point of the visitor pattern is that, say, a circle, has sufficient getters for its specifics, e.g., its radius. If that's not the case, then, indeed, there's a problem (in fact, it's probably a badly designed CAD code base anyway).
Starting from this point, though, when considering new operations, e.g., writing to file type A, there are two approaches:
implement a virtual method like write_to_file_type_a for each class and each operation
implement a virtual method accept_visitor for each class only, only once
The "without actually touching the class itself" in your question means, in point 2 just above, that this is all that's now needed to dispatch future visitors to the correct classes. It doesn't mean that the visitor will start writing getters, for example.
Once a visitor interface has been written for one purpose, you can visit the class in different ways. The different visiting does not require touching the class again, assuming you are visiting the same compontnts.

C++ member variable change listeners (100+ classes)

I am trying to make an architecture for a MMO game and I can't figure out how I can store as many variables as I need in GameObjects without having a lot of calls to send them on a wire at the same time I update them.
What I have now is:
Game::ChangePosition(Vector3 newPos) {
gameobject.ChangePosition(newPos);
SendOnWireNEWPOSITION(gameobject.id, newPos);
}
It makes the code rubbish, hard to maintain, understand, extend. So think of a Champion example:
I would have to make a lot of functions for each variable. And this is just the generalisation for this Champion, I might have have 1-2 other member variable for each Champion type/"class".
It would be perfect if I would be able to have OnPropertyChange from .NET or something similar. The architecture I am trying to guess would work nicely is if I had something similar to:
For HP: when I update it, automatically call SendFloatOnWire("HP", hp);
For Position: when I update it, automatically call SendVector3OnWire("Position", Position)
For Name: when I update it, automatically call SendSOnWire("Name", Name);
What are exactly SendFloatOnWire, SendVector3OnWire, SendSOnWire ? Functions that serialize those types in a char buffer.
OR METHOD 2 (Preffered), but might be expensive
Update Hp, Position normally and then every Network Thread tick scan all GameObject instances on the server for the changed variables and send those.
How would that be implemented on a high scale game server and what are my options? Any useful book for such cases?
Would macros turn out to be useful? I think I was explosed to some source code of something similar and I think it used macros.
Thank you in advance.
EDIT: I think I've found a solution, but I don't know how robust it actually is. I am going to have a go at it and see where I stand afterwards. https://developer.valvesoftware.com/wiki/Networking_Entities
On method 1:
Such an approach could be relatively "easy" to implement using a maps, that are accessed via getters/setters. The general idea would be something like:
class GameCharacter {
map<string, int> myints;
// same for doubles, floats, strings
public:
GameCharacter() {
myints["HP"]=100;
myints["FP"]=50;
}
int getInt(string fld) { return myints[fld]; };
void setInt(string fld, int val) { myints[fld]=val; sendIntOnWire(fld,val); }
};
Online demo
If you prefer to keep the properties in your class, you'd go for a map to pointers or member pointers instead of values. At construction you'd then initialize the map with the relevant pointers. If you decide to change the member variable you should however always go via the setter.
You could even go further and abstract your Champion by making it just a collection of properties and behaviors, that would be accessed via the map. This component architecture is exposed by Mike McShaffry in Game Coding Complete (a must read book for any game developer). There's a community site for the book with some source code to download. You may have a look at the actor.h and actor.cpp file. Nevertheless, I really recommend to read the full explanations in the book.
The advantage of componentization is that you could embed your network forwarding logic in the base class of all properties: this could simplify your code by an order of magnitude.
On method 2:
I think the base idea is perfectly suitable, except that a complete analysis (or worse, transmission) of all objects would be an overkill.
A nice alternative would be have a marker that is set when a change is done and is reset when the change is transmitted. If you transmit marked objects (and perhaps only marked properties of those), you would minimize workload of your synchronization thread, and reduce network overhead by pooling transmission of several changes affecting the same object.
Overall conclusion I arrived at: Having another call after I update the position, is not that bad. It is a line of code longer, but it is better for different motives:
It is explicit. You know exactly what's happening.
You don't slow down the code by making all kinds of hacks to get it working.
You don't use extra memory.
Methods I've tried:
Having maps for each type, as suggest by #Christophe. The major drawback of it was that it wasn't error prone. You could've had HP and Hp declared in the same map and it could've added another layer of problems and frustrations, such as declaring maps for each type and then preceding every variable with the mapname.
Using something SIMILAR to valve's engine: It created a separate class for each networking variable you wanted. Then, it used a template to wrap up the basic types you declared (int, float, bool) and also extended operators for that template. It used way too much memory and extra calls for basic functionality.
Using a data mapper that added pointers for each variable in the constructor, and then sent them with an offset. I left the project prematurely when I realised the code started to be confusing and hard to maintain.
Using a struct that is sent every time something changes, manually. This is easily done by using protobuf. Extending structs is also easy.
Every tick, generate a new struct with the data for the classes and send it. This keeps very important stuff always up to date, but eats a lot of bandwidth.
Use reflection with help from boost. It wasn't a great solution.
After all, I went with using a mix of 4, and 5. And now I am implementing it in my game. One huge advantage of protobuf is the capability of generating structs from a .proto file, while also offering serialisation for the struct for you. It is blazingly fast.
For those special named variables that appear in subclasses, I have another struct made. Alternatively, with help from protobuf I could have an array of properties that are as simple as: ENUM_KEY_BYTE VALUE. Where ENUM_KEY_BYTE is just a byte that references a enum to properties such as IS_FLYING, IS_UP, IS_POISONED, and VALUE is a string.
The most important thing I've learned from this is to have as much serialization as possible. It is better to use more CPU on both ends than to have more Input&Output.
If anyone has any questions, comment and I will do my best helping you out.
ioanb7

List design (Object oriented) suggestion needed

I'm trying to implement a generic class for lists for an embedded device using C++. Such a class will provide methods to update the list, sort the list, filter the list based on some user specified criteria, group the list based on some user specified criteria etc. But there are quite a few varieties of lists I want this generic class to support and each of these varieties can have different display aspects. Example: One variety of list can have strings and floating point numbers in each of its elements. Other variety could have a bitmap, string and special character in each of it's elements. etc.
I wrote down a class with the methods of interest (sort, group, etc). This class has an object of another class (say DisplayAspect) as its member. But the number of member variables and the type of each member variable of class DisplayAspect is unknown. What would be a better way to implement this?
Why not use the std::list, C++ provides that and it provides all the functionality you mentioned(It is templated class, So it supports all data types you can think of).
Also, there is no point reinventing the wheel as the code you write will almost will never be as efficient as std::list.
In case you still want to reinvent this wheel, You should write a template list class.
First, you should probably use std::list as your list, as others have stated. It seems to me that you are having problems more with what to put in the list, however, so I'm focusing on that part of the question.
Since you want to also store multiple bits of information in each element of the list, you will need to create multiple classes, one to store each combination. You don't describe why you are storing mutiple bits of information, but you'd want to use a logical name for each class. So if, for example, you were storing a name and a price (string and a double), you could give the class some name like Product.
You mention creating a class called DisplayAspect.
If this is because you want to have one piece of code print all of these lists, then you should use inheritance and polymorphism to accomplish this goal. One way to accomplish that is to make your DisplayAspect class an abstract class with the needed functions (printItem() for example) pure virtual and have each of the classes you created for the combinations of data be subclasses of this DisplayAspect class.
If, on the other hand, you created the DisplayAspect class so that you could reuse your list code, you should look into template classes. std::list is an example of a template class and it will hold any type you'd like to put into it and in that case, you could drop your DisplayAspect class.
Others (e.g., #Als) have already given the obvious, direct, answer to the question you asked. If you really want a linked list, they're undoubtedly correct: std::list is the obvious first choice.
I, however, am going to suggest that you probably don't want a linked list at all. A linked list is only rarely a useful data structure. Given what you've said you want (sorting, grouping), and especially your target (embedded system, so you probably don't have a lot of memory to waste) a linked list probably isn't a very good choice for what you're trying to do. At least right off, it sounds like something closer to an array probably makes a lot more sense.
If you end up (mistakenly) deciding that a linked list really is the right choice, there's a fair chance you only need a singly linked list though. For that, you might want to look at Boost Slist. While it's a little extra work to use (it's intrusive), this will generally have lower overhead, so it's at least not quite a poor of a choice as many generic linked lists.

Is it a good practice to write classes that typically have only one public method exposed?

The more I get into writing unit tests the more often I find myself writing smaller and smaller classes. The classes are so small now that many of them have only one public method on them that is tied to an interface. The tests then go directly against that public method and are fairly small (sometimes that public method will call out to internal private methods within the class). I then use an IOC container to manage the instantiation of these lightweight classes because there are so many of them.
Is this typical of trying to do things in a more of a TDD manner? I fear that I have now refactored a legacy 3,000 line class that had one method in it into something that is also difficult to maintain on the other side of the spectrum because there is now literally about 100 different class files.
Is what I am doing going too far? I am trying to follow the single responsibility principle with this approach but I may be treading into something that is an anemic class structure where I do not have very intelligent "business objects".
This multitude of small classes would drive me nuts. With this design style it becomes really hard to figure out where the real work gets done. I am not a fan of having a ton of interfaces each with a corresponding implementation class, either. Having lots of "IWidget" and "WidgetImpl" pairings is a code smell in my book.
Breaking up a 3,000 line class into smaller pieces is great and commendable. Remember the goal, though: it's to make the code easier to read and easier to work with. If you end up with 30 classes and interfaces you've likely just created a different type of monster. Now you have a really complicated class design. It takes a lot of mental effort to keep that many classes straight in your head. And with lots of small classes you lose the very useful ability to open up a couple of key files, pick out the most important methods, and get an idea of what the heck is going on.
For what it's worth, though, I'm not really sold on test-driven design. Writing tests early, that's sensible. But reorganizing and restructuring your class design so it can be more easily unit tested? No thanks. I'll make interfaces only if they make architectural sense, not because I need to be able to mock up some objects so I can test my classes. That's putting the cart before the horse.
You might have gone a bit too far if you are asking this question. Having only one public method in a class isn't bad as such, if that class has a clear responsibility/function and encapsulates all logic concerning that function, even if most of it is in private methods.
When refactoring such legacy code, I usually try to identify the components in play at a high level that can be assigned distinct roles/responsibilities and separate them into their own classes. I think about which functions should be which components's responsibility and move the methods into that class.
You write a class so that instances of the class maintain state. You put this state in a class because all the state in the class is related.You have function to managed this state so that invalid permutations of state can't be set (the infamous square that has members width and height, but if width doesn't equal height it's not really a square.)
If you don't have state, you don't need a class, you could just use free functions (or in Java, static functions).
So, the question isn't "should I have one function?" but rather "what state-ful entity does my class encapsulate?"
Maybe you have one function that sets all state -- and you should make it more granular, so that, e.g., instead of having void Rectangle::setWidthAndHeight( int x, int y) you should have a setWidth and a separate setHeight.
Perhaps you have a ctor that sets things up, and a single function that doesIt, whatever "it" is. Then you have a functor, and a single doIt might make sense. E.g., class Add implements Operation { Add( int howmuch); Operand doIt(Operand rhs);}
(But then you may find that you really want something like the Visitor Pattern -- a pure functor is more likely if you have purely value objects, Visitor if they're arranged in a tree and are related to each other.)
Even if having these many small objects, single-function is the correct level of granularity, you may want something like a facade Pattern, to compose out of primitive operations, often-used complex operations.
There's no one answer. If you really have a bunch of functors, it's cool. If you're really just making each free function a class, it's foolish.
The real answer lies in answering the question, "what state am I managing, and how well do my classes model my problem domain?"
I'd be speculating if I gave a definite answer without looking at the code.
However it sounds like you're concerned and that is a definite flag for reviewing the code. The short answer to your question comes back to the definition of Simple Design. Minimal number of classes and methods is one of them. If you feel like you can take away some elements without losing the other desirable attributes, go ahead and collapse/inline them.
Some pointers to help you decide:
Do you have a good check for "Single Responsibility" ? It's deceptively difficult to get it right but is a key skill (I still don't see it like the masters). It doesn't necessarily translate to one method-classes. A good yardstick is 5-7 public methods per class. Each class could have 0-5 collaborators. Also to validate against SRP, ask the question what can drive a change into this class ? If there are multiple unrelated answers (e.g. change in the packet structure (parsing) + change in the packet contents to action map (command dispatcher) ) , maybe the class needs to be split. On the other end, if you feel that a change in the packet structure, can affect 4 different classes - you've run off the other cliff; maybe you need to combine them into a cohesive class.
If you have trouble naming the concrete implementations, maybe you don't need the interface. e.g. XXXImpl classes implmenting XXX need to be looked at. I recently learned of a naming convention, where the interface describes a Role and the implementation is named by the technology used to implement the role (or falling back to what it does). e.g. XmppAuction implements Auction (or SniperNotifier implements AuctionEventListener)
Lastly are you finding it difficult to add / modify / test existing code (e.g. test setup is long or painful ) ? Those can be signs that you need to go refactoring.

Structure for hierarchal Component storage

I've been batting this problem around in my head for a few days now and haven't come to any satisfactory conclusions so I figured I would ask the SO crew for their opinion. For a game that I'm working on I'm using a Component Object Model as described here and here. It's actually going fairly well but my current storage solution is turning out to be limiting (I can only request components by their class name or an arbitrary "family" name). What I would like is the ability to request a given type and iterate through all components of that type or any type derived from it.
In considering this I've first implemented a simple RTTI scheme that stores the base class type through the derived type in that order. This means that the RTTI for, say, a sprite would be: component::renderable::sprite. This allows me to compare types easily to see if type A is derived from type B simply by comparing the all elements of B: i.e. component::renderable::sprite is derived from component::renderable but not component::timer. Simple, effective, and already implemented.
What I want now is a way to store the components in a way that represents that hierarchy. The first thing that comes to mind is a tree using the types as nodes, like so:
component
/ \
timer renderable
/ / \
shotTimer sprite particle
At each node I would store a list of all components of that type. That way requesting the "component::renderable" node will give me access to all renderable components regardless of derived type. The rub is that I want to be able to access those components with an iterator, so that I could do something like this:
for_each(renderable.begin(), renderable.end(), renderFunc);
and have that iterate over the entire tree from renderable down. I have this pretty much working using a really ugly map/vector/tree node structure and an custom forward iterator that tracks a node stack of where I've been. All the while implementing, though, I felt that there must be a better, clearer way... I just can't think of one :(
So the question is: Am I over-complicating this needlessly? Is there some obvious simplification I'm missing, or pre-existing structure I should be using? Or is this just inheritly a complex problem and I'm probably doing just fine already?
Thanks for any input you have!
You should think about how often you need to do the following:
traverse the tree
add/remove elements from the tree
how many objects do you need to keep track of
Which is more frequent will help determine the optimum solution
Perhaps instead of make a complex tree, just have a list of all types and add a pointer to the object for each type it is derived from. Something like this:
map<string,set<componenet *>> myTypeList
Then for an object that is of type component::renderable::sprite
myTypeList["component"].insert(&object);
myTypeList["renderable"].insert(&object);
myTypeList["sprite"].insert(&object);
By registering each obejct in multiple lists, it then becomes easy to do something to all object of a given type and subtypes
for_each(myTypeList["renderable"].begin(),myTypeList["renderable"].end(),renderFunc);
Note that std::set and my std::map construct may not be the optimum choice, depending on how you will use it.
Or perhaps a hybrid approach storing only the class heirarchy in the tree
map<string, set<string> > myTypeList;
map<string, set<component *> myObjectList;
myTypeList["component"].insert("component");
myTypeList["component"].insert("renderable");
myTypeList["component"].insert("sprite");
myTypeList["renderable"].insert("renderable");
myTypeList["renderable"].insert("sprite");
myTypeList["sprite"].insert("sprite");
// this isn't quite right, but you get the idea
struct doForList {
UnaryFunction f;
doForList(UnaryFunction f): func(f) {};
operator ()(string typename) {
for_each(myTypeList[typename].begin();myTypeList[typename].end(), func);
}
}
for_each(myTypeList["renderable"].begin(),myTypeList["renderable"].end(), doForList(myFunc))
The answer depends on the order you need them in. You pretty much have a choice of preorder, postorder, and inorder. Thus have obvious analogues in breadth first and depth first search, and in general you'll have trouble beating them.
Now, if you constraint the problem a litle, there are a number of old fashioned algorithms for storing trees of arbitrary data as arrays. We used them a lot in the FORTRAN days. One of them had the key trick being to store the children of A, say A2 and A3, at index(A)*2,index(A)*2+1. The problem is that if your tree is sparse you waste space, and the size of your tree is limited by the array size. But, if I remember this right, you get the elements in breadth-first order by simple DO loop.
Have a look at Knuth Volume 3, there is a TON of that stuff in there.
If you want to see code for an existing implementation, the Game Programming Gems 5 article referenced in the Cowboy Programming page comes with a somewhat stripped down version of the code we used for our component system (I did a fair chunk of the design and implementation of the system described in that article).
I'd need to go back and recheck the code, which I can't do right now, we didn't represent things in a hierarchy in the way you show. Although components lived in a class hierarchy in code, the runtime representation was a flat list. Components just declared a list of interfaces that they implemented. The user could query for interfaces or concrete types.
So, in your example, Sprite and Particle would declare that they implemented the RENDERABLE interface, and if we wanted to do something to all renderables, we'd just loop through the list of active components and check each one. Not terribly efficient on the face of it, but it was fine in practice. The main reason it wasn't an issue was that it actually turns out to not be a very common operation. Things like renderables, for example, added themselves to the render scene at creation, so the global scene manager maintained its own list of renderable objects and never needed to query the component system for them. Similarly with phyics and collision components and that sort of thing.