C++ Runtime class switching, relying on base class inheritance

I am not fully versed in C++, and with the current project I am working on I have hit a small snag. I have come across this: http://www.terrainformatica.com/2010/08/cpp-how-to-change-class-of-object-in-runtime/ but I am unsure whether this is the solution I am looking for or if there are better alternatives.
My intention is to switch turret types stored in memory, despite them being different instantiations of the base CTurret, and with as little performance impact as possible, since this is going to be implemented in a simple game.
Basically I have a base class CTurret, itself derived from the base class CEntity. Now I have several turrets (Basic, Fast, Harmless). Each turret derives from the base class CTurret. I want to reserve some memory to hold generic turrets which can then simply be swapped out for an actual turret type. Better visual below:
class CEntity...
class CTurret : public CEntity...
class CBasicTurret : public CTurret...
class CFastTurret : public CTurret...
Memory (5 generic turrets)
Array[CTurret, CTurret, CTurret, CTurret, CTurret] // Can not use generic!
User wants a basic turret, populate an available generic turret:
Array[CBasicTurret, CTurret, CTurret, CTurret, CTurret] // Ah, a basic turret I'll use that.
When no longer needed:
Array[CTurret, CTurret, CTurret, CTurret, CTurret] // Back to original.
So far I can see only two options to achieve what I want:
1) I could put all the turret types into the base class and change it from a base into... an actual class. But this would get messy fast.
2) I could reserve memory for each turret type, which would not be ideal since memory is precious and should not be wasted like that. Also, the user may request more than is already reserved, which could pose future problems.

You need to decide what the difference is between turret types.
The general choices are a polymorphic base class (with the derived classes overriding inherited virtual functions to specialise behaviour) or a single class with state (e.g. an enumerated type to describe operating modes, a float to describe speed, etc).
An array of smart pointers can be used if you decide on a polymorphic base. Simply pick which pointer (or index into the array) to use. The pointers (and objects they point to) will be in memory anyway.
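For example, a minimal sketch of that option, reusing the class names from the question (this assumes CTurret exposes its behaviour through virtual functions and has a virtual destructor):
#include <array>
#include <memory>
// Five slots; an empty slot plays the role of a "generic" turret.
std::array<std::unique_ptr<CTurret>, 5> turrets;
// User wants a basic turret: populate an available slot.
turrets[0] = std::make_unique<CBasicTurret>();
// Use it polymorphically through the base pointer, e.g. turrets[0]->update();
// No longer needed: release it, and the slot is "generic" (empty) again.
turrets[0].reset();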
With a single object, just change its state as required. So change values like height, width, operating mode, etc. This is the better option if you're really worried about memory usage (only one object in memory needed, although there is nothing stopping you having an array of objects, where each element has different state).
The placement new trick you linked to has a few gotchas that have been glossed over, and the particular example of morphing an object that has data into one that doesn't is problematic (e.g. it causes the caller to exhibit undefined behaviour).


A question about encapsulation and inheritance practices

I've heard people saying that having protected members kind of breaks the point of encapsulation and is not the best practice; one should design the program so that derived classes will not need access to private base class members.
An example situation
Now imagine the following scenario: a simple 8-bit game. We have a bunch of different objects, such as regular boxes that act as obstacles, spikes, coins, moving platforms, etc. The list can go on.
All of them have x and y coordinates, a rectangle that specifies the size of the object, a collision box, and a texture. They can also share functions like setting position, rendering, loading the texture, checking for collision, etc.
But some of them also need to modify base members, e.g. boxes can be pushed around so they might need a move function, some objects may move by themselves, or maybe some blocks change texture in-game.
Therefore a base class like Object can really come in handy, but that would either require a ton of getters/setters or making the private members protected instead. Either way, it compromises encapsulation.
Given the anecdotal context, which would be a better practice:
1. Have a common base class with shared functions and members, declared as protected. Be able to use common functions, and pass a reference to the base class to non-member functions that only need to access shared properties. But compromise encapsulation.
2. Have a separate class for each, declare the member variables as private and don't compromise encapsulation.
3. A better way that I couldn't think of.
I don't think encapsulation is absolutely vital here, and probably the way to go for that anecdote would be just having protected members, but my goal with this question is writing well-practiced, standard code rather than solving that specific problem.
Thanks in advance.
First off, I'm going to start by saying there is no one-size-fits-all answer to design. Different problems require different solutions; however, there are design patterns that are often more maintainable over time than others.
Indeed, a lot of design suggestions are aimed at making code better in a team environment -- but good practices are useful for solo projects as well, so that the code is easier to understand and change in the future.
Sometimes the person who needs to understand your code will be you, a year from now -- so keep that in mind 😊
I've heard people saying that having protected members kind of breaks the point of encapsulation
Like any tool, it can be misused; but there is nothing about protected access that inherently breaks encapsulation.
What defines the encapsulation of your object is the intended projected API surface area. Sometimes, that protected member is logically part of the surface-area -- and this is perfectly valid.
If misused, protected members can give clients access to mutable members that may break a class's intended invariants -- which would be bad. An example of this would be if you were able to derive from a class exposing a rectangle and set its width/height to a negative value. Functions in the base class, such as compute_area, could suddenly yield wrong values -- and cause cascading failures that should otherwise have been guarded against by better encapsulation.
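A minimal sketch of the difference (the class and member names here are made up purely for illustration): rather than exposing the raw members, a protected setter lets derived classes resize the shape while the invariant stays guarded in one place.
#include <cassert>
class Shape {
public:
    int compute_area() const { return width_ * height_; }
protected:
    // Derived classes may resize, but cannot break the invariant.
    void set_size(int w, int h) {
        assert(w >= 0 && h >= 0);
        width_ = w;
        height_ = h;
    }
private:
    int width_ = 0;
    int height_ = 0;
};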
As for the design of your example in question:
Base classes are not necessarily a bad thing, but can easily be overused and can lead to "god" classes that unintentionally expose too much functionality in an effort to share logic. Over time this can become a maintenance burden and just an overall confusing mess.
Your example sounds better suited to composition, with some smaller interfaces (a rough sketch follows this list):
Things like a point and a vector type would be base-types to produce higher-order compositions like rectangle.
This could then be composed together to create a model which handles general (logical) objects in 2D space that have collision.
Intersection/collision logic can be handled by an outside utility class.
Rendering can be handled via a renderable interface, where any class that needs to render implements this interface.
Intersection handling logic can be handled by an intersectable interface, which determines the behavior of an object on intersection (this effectively abstracts each of the game objects into raw behaviors).
etc
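A very rough sketch of that direction (every name here is invented for illustration, not taken from the question):
struct Point { float x = 0, y = 0; };
struct Rectangle { Point position; float width = 0, height = 0; };
// Small interfaces instead of one big base class:
struct Renderable {
    virtual ~Renderable() = default;
    virtual void render() const = 0;
};
struct Intersectable {
    virtual ~Intersectable() = default;
    virtual void on_intersect(const Rectangle& other) = 0;
};
// A game object composes exactly the pieces it needs:
class Coin : public Renderable, public Intersectable {
public:
    void render() const override { /* draw the coin sprite */ }
    void on_intersect(const Rectangle&) override { /* collect the coin */ }
private:
    Rectangle bounds_;   // has-a rectangle, rather than is-a giant base object
};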
Encapsulation is not a security thing, it's a neatness thing (and hence a supportability and readability thing). You have to assume that people deriving classes are basically sensible. They are, after all, either writing programs of their own using your base classes (so who cares), or writing in a team with you.
The primary purpose of "encapsulation" in object-oriented programming is to limit direct access to data in order to minimize dependencies, and where dependencies must exist, to express those in terms of functions not data.
This ties in with Design by Contract, where you allow "public" access to certain functions and reserve the right to modify others arbitrarily, at any time, for any reason, even to the point of removing them, by expressing those as "protected".
That is, you could have a game object like:
class Enemy {
public:
    int getHealth() const;
};
Where the getHealth() function returns an int value expressing the health. How does it derive this value? It's not for the caller to know or care. Maybe it's byte 9 of a binary packet you just received. Maybe it's a string from a JSON object. It doesn't matter.
Most importantly because it doesn't matter you're free to change how getHealth() works internally without breaking any code that's dependent on it.
However, if you expose a public int health member, that opens up a whole world of problems. What if it is manipulated incorrectly? What if it's set to an invalid value? How do you trap access to that property being manipulated?
It's much easier when you have setHealth(const int health), where you can do things like (see the sketch after this list):
clamp it to a particular range
trigger an event when it exceeds certain bounds
update a saved game state
transmit an update over the network
hook in other "observers" which might need to know when that value is manipulated
None of those things are easily implemented without encapsulation.
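As a hedged sketch of what that looks like in practice (the observer wiring and member names are mine, chosen only to mirror the bullet points above):
#include <algorithm>
#include <functional>
#include <vector>
class Enemy {
public:
    int getHealth() const { return health_; }
    void setHealth(int health) {
        health_ = std::clamp(health, 0, maxHealth_);   // clamp to a valid range
        for (auto& observer : observers_)              // notify anything watching
            observer(health_);
        if (health_ == 0) { /* trigger a death event, update saved state, ... */ }
    }
    void addObserver(std::function<void(int)> cb) { observers_.push_back(std::move(cb)); }
private:
    int health_ = 100;
    int maxHealth_ = 100;
    std::vector<std::function<void(int)>> observers_;
};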
protected is not just a "get off my lawn" thing, it's an important tool to ensure that your implementation is used correctly and as intended.

Most efficient way to get an integer type id in a family of common base types

The problem:
I have a family of objects with a common base, and I need to be able to identify the specific concrete type via an integer value.
There are two obvious approaches to doing that; however, both come with unacceptable overheads in terms of memory or CPU time. Since the project deals with billions of objects, the tiniest overhead ends up heavily pronounced, and I have tested this -- it is not a case of premature optimization. The operations involved in processing the objects are all trivial, and the overhead of virtual calls diminishes performance tremendously.
a pure virtual int type() function implemented for every type; unfortunately, that comes with the overhead of a virtual call for something as trivial as returning a static integer value
an int type member for every instance, set in the constructor, which introduces a 4-byte overhead for each of those billions of objects, wasting memory, polluting the cache and whatnot
I remember some time ago someone asking about "static virtual member variables", and naturally the answers boiled down to "no, that makes no sense", however being able to put a user variable in the vtable and having the ability to set its value for each specific type seems to be a very efficient solution to my problem.
This way both of the above-mentioned overheads are avoided, no virtual calls are necessary and there is no per-instance memory overhead either. The only overhead is the indirection to get the vtable, but considering the frequency of access of that data, it will most likely be kept in the CPU cache most of the time.
My current obvious option is to do "manual OOP" - do vtables manually in order to incorporate the necessary "meta" data into them as well, init the vtable pointer for every type and use awkward syntax to invoke pseudo "member" functions. Or even omit the use of a vtable pointer altogether, and store the id instead, and use that as an index for a table of vtables, which will be even more efficient, as it will avoid the indirection, and will shrink the size down, as I only need 2^14 distinct types.
It would be nice if I can avoid reinventing the wheel. I am not picky about the solution as long as it can give me the efficiency guarantees.
Maybe there is a way to have my type id integer in the vtable, or maybe there is another way altogether, which is highly possible since I don't keep up with the trends, and C++ got a lot of new features in the last few years.
Naturally, those ids would need to be uniform and consistent, rather than some arbitrary values of whatever the compiler cooks up internally. If that wasn't a requirement, I'd just use the vtable pointer values for an even more efficient solution that avoids indirection.
Any ideas?
If you have way more instances than you have types, then the most straightforward solution is to abstract at the level of a homogeneous container rather than a single instance.
Instead of:
{PolymorphicContainer}: Foo*, Bar*, Baz*, Foo*, Bar*, Bar*, Baz*, ...
... and having to store some type information (vtable, type field, etc) to distinguish each element while accessing memory in the most sporadic ways, you can have:
{FooContainer}: Foo, Foo, Foo, Foo, Foo, ...
{BarContainer}: Bar, Bar, Bar, Bar, Bar, ...
{BazContainer}: Baz, Baz, Baz, Baz, Baz, ...
{PolymorphicContainer}: FooContainer*, BarContainer*, BazContainer*
And you store the type information (vtable or what not) inside the containers. That does mean you need access patterns of a kind that tend to be more homogeneous, but often such an arrangement can be made in most problems I've encountered.
Gamedevs used to do things like sort polymorphic base pointers by subtype while using a custom allocator for each to store them contiguously. That combination of sorting by base pointer address and allocating each type from separate pools makes it so you then get the analogical equivalent of:
Foo*, Foo*, Foo*, Foo*, ..., Bar*, Bar*, Bar*, Bar*, ..., Baz*, Baz*, Baz*, ...
With most of them stored contiguously, because they each use a custom allocator which puts all the Foos into contiguous blocks separate from all the Bars. Then on top of spatial locality you also get temporal locality on the vtables if you access things in a sequential pattern.
But that's more painful to me than abstracting at the level of the container, and doing it that way still requires the overhead of two pointers (128-bits on 64-bit machines) per object (a vptr and a base pointer to the object itself). Instead of processing orcs, goblins, humans, etc, individually through a Creature* base pointer, it makes sense to me to store them in homogeneous containers, abstract that, and process Creatures* pointers which point to entire homogeneous collections. Instead of:
class Orc: public Creature {...};
... we do:
// vptr only stored once for all orcs in the entire game.
class Orcs: public Creatures
{
public:
// public interface consists predominantly of functions
// which process entire ranges of orcs at once (virtual
// dispatch only paid once possibly for a million orcs
// rather than a million times over per orc).
...
private:
struct OrcData {...};
std::vector<OrcData> orcs;
};
Instead of:
for each creature:
creature.do_something();
We do:
for each creatures:
creatures.do_something();
Using this strategy, if we need a million orcs in our video game, we'd cut the costs associated with virtual dispatch, vptrs, and base pointers to 1/1,000,000th of the original cost, not to mention you get very optimal locality of reference as well free of charge.
If in some cases we need to do something to a specific creature, you might be able to store a two-part index (it might fit in 32 bits, or maybe 48) holding the creature type index and then the relative creature index within that type's container, though this strategy is most beneficial when you don't have to call functions just to process one creature in your critical paths. Generally you can fit this into 32-bit indices, or possibly 48 bits if you set a limit of 2^16 elements per homogeneous container before it is considered "full" and you create another one for the same type. We don't have to store all the creatures of one type in one container if we want to keep our indices compact.
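A hypothetical packed handle along those lines might look like this (the name and field widths are mine, just to illustrate the idea):
#include <cstdint>
// 16 bits select the creature type (up to 2^16 types), 16 bits index into
// that type's homogeneous container (up to 2^16 creatures per container).
struct CreatureHandle {
    std::uint32_t type  : 16;
    std::uint32_t index : 16;
};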
I can't say for sure if this is applicable to your case because it depends on access patterns, but it is generally the first solution I consider when you have performance issues associated with polymorphism. The first way I look at it is that you're paying the costs like virtual dispatch, loss of contiguous access patterns, loss of temporal locality on vtables, memory overhead of vptr, etc. at too granular of a level. Make the design coarser (bigger objects, like objects representing a whole collection of things, not an individual object per thing) and the costs become negligible again.
Whatever the case may be, instead of thinking about this in terms of vtables and what not, think of it in terms of how you arrange data, just bits and bytes, so that you don't have to store a pointer or integer with every single little object. Draw things out just thinking about bits and bytes, not classes and vtables and virtual functions and nice public interfaces and so forth. Think about those later, after you settle on a memory representation/layout; start off just thinking about bits and bytes.
I find this so much easier to think about for data-oriented designs with performance-critical needs well-anticipated upfront than trying to think about language mechanisms and nice interface designs and all that. Instead I think in a C-like way first of just bits and bytes and communicate and sketch my ideas as structs and figure out where the bits and bytes should go. Then once you figure that out, you can figure out how to put a nice interface on top.
Anyway, for avoiding the overhead of type information per teeny object, that means grouping objects together somehow in memory and storing that analogical type field once per group instead of once per element in the group. Allocating elements of a particular type in a uniform way might also give you that information based on their pointer address or index. There are many ways to go about this, but just think about it in terms of data stored in memory as a general strategy.
The answer is somewhat embedded in your question topic:
Most efficient way to get an integer type id in a family of common base types [...]
You store the integer ID once per family or at least once per multiple objects in that family instead of once per object. That's the only way, however you approach it, to avoid storing it once per object unless the info is already available. The alternative is to deduce it from some other information available, like you might be able to deduce it from the object's index or pointer address, at which point storing the ID would just be redundant information.

Multiple object type container or dynamic casting for a game project?

I have a very specific... well, let's not call it a problem, let's rather call it a deadlock. I'm writing a simple 2D game using Allegro 5 along with C++, and have a specific problem I'd like to overcome.
Main problem:
Currently, for the game loop I'm using a list container which holds all of my objects (of type GameObject), and I'm iterating over it to do things like updating the objects' positions, rendering, and animating sprites.
From the class GameObject (which holds generic information used for updating, rendering and memory handling methods) inherits a Creature class, which should handle things like attacking methods.
The problem that comes up is that when iterating my main list of GameObjects (which includes Creatures as well) I cannot directly use the methods of my Creature class. Of course I understand why I cannot do that (encapsulation).
So far I've come up with a few possible solutions (which, in my humble opinion, are not perfect), but I would like to ask for help in finding an easy-to-implement and efficient solution:
- Using a container that could hold multiple object types.
- Using dynamic_cast at some point, to cast a Creature GameObject to the Creature class to temporarily use Creature methods and variables (is that even possible?)
- Setting up a second container for handling the Creature methods and variables (I would like to avoid that, as then I would need a single object to be in two containers at once - when adding new types of classes like 'buildings', 'obstacles' or 'arrows', their number will grow!)
I'm a very beginner programmer, and while I understand creating a game could be kind of overkill for my level of skill, I'm determined to push this game forward by any means necessary. (Especially since I've learned a lot so far.)
I hope I've explained the problem in detail - I'm not posting any code here, as it's more of a theoretical problem than a practical one; I'm just iterating over a GameObject list after all.
With regards,
As you've found out, containers can only hold one type of object at a time.
If that object type is a base class pointer, it can point to any object derived from the base class. However, you need to first cast the pointer to the appropriate type before you can use its specific abilities.
You answered your own question when you brought up dynamic_cast.
You can use dynamic_cast on the base pointer stored in your container to determine if the object is actually of a different type derived from your base class.
See the section on dynamic_cast here :
http://www.cplusplus.com/doc/tutorial/typecasting/
Example
Derived* d = dynamic_cast<Derived*>(ptr_base_class);
if (d) {
    // We now know that ptr_base_class holds an object of type Derived.
} else {
    // This object is not of the Derived class type.
}
However, if you had to iterate over your entire base class pointer list using dynamic_cast to determine if an object is of a specified type, it would be wasteful.
Here's where you answered your own question again. Keep a separate list of all Creature*s so you don't have to cast them. Yes, you will be using a little more memory, but not much. Being able to iterate over the Creature list without iterating the entity list improves your performance. To make things easier, make your own container that has a list of each type of object as well as a main list of all objects. If you don't care about their derived class, iterate the main list. If you care about what class they are, iterate their specific list.
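A rough sketch of that combined container, using the class names from your post (the update()/attack() calls, unique_ptr ownership, and a virtual destructor on GameObject are my assumptions):
#include <memory>
#include <vector>
class World {
public:
    void addCreature(std::unique_ptr<Creature> c) {
        creatures_.push_back(c.get());      // typed list: no casting needed later
        objects_.push_back(std::move(c));   // main list owns every object
    }
    void updateAll() {
        for (auto& obj : objects_) obj->update();   // generic pass over everything
    }
    void updateCombat() {
        for (auto* c : creatures_) c->attack();     // Creature-specific pass
    }
private:
    std::vector<std::unique_ptr<GameObject>> objects_;  // all entities
    std::vector<Creature*> creatures_;                  // non-owning view of creatures only
};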

Vector of base and inherited objects

How would one go about creating a vector that includes both the base class as well as any derived classes?
For example, in a chess engine, I currently have a Move class which stores a particular move and a few functions to help it. In order to save memory, as millions of these objects are going to be created, I also have a derived class CaptureMove that extends the Move class storing a bit more information about what and where the piece was captured.
From what I can gather, pointers to Move objects should work, but I'm not quite sure on how to go about it.
The question is quite broad. Here are some ideas:
Vectors of base pointers:
This works extremely well if your class is polymorphic (i.e. the relevant functions of the base class are virtual).
vector<Move*> mp;
mp.push_back (new Move); // attention, you have to delete it or memory will leak
mp.push_back (new CaptureMove);
It's the simplest way to proceed. However, you have to make sure that when you add an object, it's allocated properly (e.g. created with new), and that once you no longer need it, you delete it. This can be very cumbersome, especially if the vector was copied and some of its pointers are still in use.
This approach can be practical for example if you create and delete the objects in a centralised manner, so that the vector only uses pointers which are properly managed somewhere else.
Vector of shared base pointers:
vector<shared_ptr<Move>> m;
m.push_back(make_shared<Move>());
m.push_back(make_shared<CaptureMove>());
m.push_back(make_shared<Move>());
It extends the pointer solution, using smart pointers to take care of the release of unused objects.
Honestly, it adds a little overhead, but it's really worth it in order to have reliable code. This is the approach I would take personally if I had to do it.
Vector of compound object
You could also prefer to store the object itself instead of a pointer to the object. While the idea seems simple, it's more difficult to do, because different derived types could have different sizes. It also has serious drawbacks, because you'd need to know all possible base and derived types you may store in the vector, which makes this approach less flexible.
You could certainly manage this with a complex union, but the easiest way would be to use boost::variant.
vector<boost::variant<Move, CaptureMove>> m;
This approach is only worth considering if the number of derived classes is very limited but you have huge numbers of small objects of almost the same size (so that per-object memory allocation would become a real overhead).
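As a side note, if C++17 is available, the standard library's std::variant gives you the same approach without Boost (same style as above; the visit lambda is just a placeholder for whatever you do with a move):
vector<variant<Move, CaptureMove>> m;   // #include <variant>
m.push_back(Move{});
m.push_back(CaptureMove{});
// Dispatch on whichever alternative is currently stored:
visit([](auto& move) { /* use the move */ }, m.front());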

How to achieve cache coherency with an abstract class pointer vector in C++?

I'm making a little game in C++. I found answers on StackExchange sites about cache coherency, and I would like to use it in my game, but I'm using child classes of an abstract class, Entity.
I'm storing all entities in a std::vector so that I can access virtual functions in loops. Entity::update() is a virtual function of Entity overridden by subclasses like PlayerEntity.
In Game.hpp - Private Member Variables:
std::vector<Entity*> mEntities;
PlayerEntity* mPlayer;
In Game.cpp - Constructor:
mPlayer = new PlayerEntity();
mEntities.push_back(mPlayer);
Here's what my update function (in the main loop) looks like:
void Game::update() {
    for (Entity* entity : mEntities) {
        entity->update(mTimeStep, mGeneralClock.getElapsedTime().asMilliseconds());
    }
}
My question is:
How do I make my entity objects sit next to each other in memory, and thus achieve cache coherency?
I tried to simply make the vector of pointers a vector of objects and make the appropriate changes, but then I couldn't use polymorphism for obvious reasons.
Side question: what determines where an object in allocated in memory?
Am I doing the whole thing wrong? If so, how should I store my entities?
Note: I'm sorry if my english is bad, I'm not a native speaker.
Obviously, first measure which parts are even worth optimizing. Not all games are created equal, and not all code within a game is created equal. There is no use in completely restructuring the script that triggers the end boss's death animation to make it use 1 cache line instead of 2. That said...
If you are aiming for optimizing for cache, forget about inheritance and virtual functions. Or at least be critical of them. As you note, creating a contiguous array of polymorphic objects is somewhere between hard & error-prone and completely infeasible (depending on whether subclasses have different sizes).
You can attempt to create a pool, to have nearby entities (in the entities vector) more likely to be close to each other (in memory), but frankly I doubt you'll do much better than a state of the art general-purpose allocator, especially when the entities' size and lifetime varies significantly. A pool would only help if entities adjacent in the vector are allocated back-to-back. But in that case, any standard allocator gives the same locality advantages. It's not like tcmalloc and friends select a random cache line to allocate from just to annoy you.
You might be able to squeeze a bit of memory out of knowing your object types, but this is purely hypothetical and would have to be proven first to justify the effort of implementing it. Also note that a run-of-the-mill pool either assumes that all objects are the same size, or that you never deallocate individual objects. Allowing both puts you halfway towards a general-purpose allocator, which you're bound to do worse than.
You can segregate objects based on their types. That is, instead of a single vector of polymorphic Entity objects with virtual functions, have N vectors: vector<Bullet>, vector<Monster>, vector<Loot>, and so on. This is less insane than it sounds, for three reasons (a sketch follows the list):
Often, you can pull out the entire business of managing one such vector into a dedicated system. So in the end you might even have a vector<System *> where each System has a vector for one kind of thing, and updates all those things in a single virtual call (delegating to many statically-dispatched calls).
You don't need to represent everything ever in this abstraction. Not every little integer needs to be wrapped in its own type of entity.
If you go further down this route and take hints from entity component systems, you also gain an alternative to inheritance for code reuse (class Monster : Entity {}; class Skeleton : Monster {};) that plays nicer with the hard-earned cache friendliness.
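A bare-bones sketch of the first point (all names here are invented for illustration):
#include <memory>
#include <vector>
struct System {
    virtual ~System() = default;
    virtual void update(float dt) = 0;   // one virtual call per *type*, not per object
};
struct Bullet { float x, y, vx, vy; };
class BulletSystem : public System {
public:
    void update(float dt) override {
        for (auto& b : bullets_) {        // tight, contiguous loop over plain structs
            b.x += b.vx * dt;
            b.y += b.vy * dt;
        }
    }
private:
    std::vector<Bullet> bullets_;         // homogeneous, cache-friendly storage
};
// The game then owns one System per kind of thing:
std::vector<std::unique_ptr<System>> systems;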
It is not easy because polymorphism doesn't work well with cache coherency.
I think the best you can do is overload the base class's operator new to allocate memory from a pool. But to do this, you need to know the size of all derived classes, and after some allocating/deallocating you can have memory fragmentation which will lower the gain.
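A minimal sketch of what that could look like -- a simple bump pool that never frees individual objects; every name here is mine, and a real implementation would need considerably more care:
#include <cstddef>
#include <new>
#include <vector>
class EntityPool {
public:
    explicit EntityPool(std::size_t bytes) : storage_(bytes) {}
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned + size > storage_.size())
            throw std::bad_alloc{};
        offset_ = aligned + size;
        return storage_.data() + aligned;
    }
private:
    std::vector<std::byte> storage_;
    std::size_t offset_ = 0;
};
struct Entity {
    virtual ~Entity() = default;
    static EntityPool* pool;   // set once at startup
    // Route allocation of every derived type through the pool.
    void* operator new(std::size_t n) { return pool->allocate(n, alignof(std::max_align_t)); }
    void operator delete(void*) noexcept {}   // bump pool: no per-object free
};
EntityPool* Entity::pool = nullptr;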
Have a look at Cachegrind, it's a tool that simulates how your program interacts with a machine's cache hierarchy.