According to Wikipedia, the prototype pattern is:
The prototype pattern is a creational design pattern used in software development when the type of objects to create is determined by a prototypical instance, which is cloned to produce new objects. This pattern is used to:
Avoid subclasses of an object creator in the client application, like the abstract factory pattern does.
Avoid the inherent cost of creating a new object in the standard way (e.g., using the new keyword) when it is prohibitively expensive for a given application.
I have seen several demo codes of this pattern in C++, and all of them use the copy constructor.
Can anyone explain how point number two applies (in general as well as in the context of C++), given that we are using the copy constructor anyway in the clone function? If it can be done without the copy constructor, an example code snippet would be great.
You can copy without dynamic allocation. For example, here's a cloning that only happens in a local scope:
Foo prototype;

void local()
{
    Foo x = prototype; // first copy
    x.mutate();
    Foo y = x; // another copy
}
No dynamic allocation is used, ever.
It is true that return new Foo(*this); also makes a copy, but what's more important is that that object is allocated dynamically. That is the cost to which your article is alluding.
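For contrast, here is a minimal sketch of the classic clone() interface such demos use; the class names are illustrative:

#include <memory>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0; // the prototype interface
};

class Circle : public Shape {
public:
    std::unique_ptr<Shape> clone() const override {
        // The copy constructor runs here too, but the cost in question
        // is the dynamic allocation performed by make_unique.
        return std::make_unique<Circle>(*this);
    }
};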
In a game I've been making in Java, I ran into an interesting situation that fit the bill of a prototype pattern quite well. You see, I had this Animation object that stored a container of images to flip through, as well as some other data that tracked how long it had been since the last frame was rendered, which frame it was on, whether the animation was running or not, etc.
I found that having multiple characters use the same Animation object was causing problems. If two characters shared an animation, they would turn the animation on and off at conflicting times for each other. I would have guys standing still with walking animations, or moving with standing animations. Creating the animation objects was costly and time-consuming, what with creating the sprites, setting the amount of time they would display for, creating an interval queue of images, etc.
Instead, I made the Animation object a prototype object. When an Animation clones itself, it shares the original collection of frames with all other animations, since those frames are immutable but expensive to construct. The new objects share this immutable base while keeping their own information about which frame to draw and when.
Think of it like a projector. When it gets cloned, the new projector might have its own information on whether it's running, which frame it's on, etc., but it may be using the same piece of film as the original projector. The reason they don't trip each other up is that the film is immutable (and expensive to create).
In all honesty, using the prototype in this manner is a great way to implement a flyweight pattern: objects that share objects that are expensive to create. If you "clone" them, they are instantiated with their own new transient state, but still share those expensive base objects with their creator.
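Although the story above is about Java, a minimal C++ sketch of the same idea might look like this (all names are illustrative): clones share one immutable, expensive-to-build frame set but carry their own playback state.

#include <cstddef>
#include <memory>
#include <vector>

struct Frame { /* image data */ };

class Animation {
public:
    explicit Animation(std::shared_ptr<const std::vector<Frame>> frames)
        : mFrames(std::move(frames)) {}

    Animation clone() const {
        return Animation(mFrames); // shares the immutable frames;
                                   // playback state starts fresh
    }

private:
    std::shared_ptr<const std::vector<Frame>> mFrames; // shared, expensive to build
    std::size_t mCurrentFrame = 0;                     // per-clone state
    bool mRunning = false;                             // per-clone state
};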
Calling the copy constructor for an object that doesn't use dynamic memory internally is much faster than performing any allocation in dynamic memory via new, because a dynamic allocation can involve a system call.
Related
I'm making a little game in C++. I found answers on StackExchange sites about cache coherency, and I would like to use it in my game, but I'm using child classes of an abstract class, Entity.
I'm storing all entities in a std::vector so that I can access virtual functions in loops. Entity::update() is a virtual function of Entity overridden by subclasses like PlayerEntity.
In Game.hpp - Private Member Variables:
std::vector<Entity*> mEntities;
PlayerEntity* mPlayer;
In Game.cpp - Constructor:
mPlayer = new PlayerEntity();
mEntities.push_back(mPlayer);
Here's what my update function (in the main loop) looks like:
void Game::update() {
    for (Entity* entity : mEntities) {
        entity->update(mTimeStep, mGeneralClock.getElapsedTime().asMilliseconds());
    }
}
My question is:
How do I place my entity objects next to each other in memory, and thus achieve cache coherency?
I tried to simply make the vector of pointers a vector of objects and make the appropriate changes, but then I couldn't use polymorphism for obvious reasons.
Side question: what determines where an object is allocated in memory?
Am I doing the whole thing wrong? If so, how should I store my entities?
Note: I'm sorry if my english is bad, I'm not a native speaker.
Obviously, first measure which parts are even worth optimizing. Not all games are created equal, and not all code within a game is created equal. There is no use in completely restructuring the script that triggers the end boss's death animation to make it use 1 cache line instead of 2. That said...
If you are aiming for optimizing for cache, forget about inheritance and virtual functions. Or at least be critical of them. As you note, creating a contiguous array of polymorphic objects is somewhere between hard & error-prone and completely infeasible (depending on whether subclasses have different sizes).
You can attempt to create a pool, to make nearby entities (in the entities vector) more likely to be close to each other (in memory), but frankly I doubt you'll do much better than a state-of-the-art general-purpose allocator, especially when the entities' size and lifetime vary significantly. A pool only helps if entities adjacent in the vector are allocated back-to-back, but in that case any standard allocator gives the same locality advantages. It's not like tcmalloc and friends select a random cache line to allocate from just to annoy you.
You might be able to squeeze a bit out of knowing your object types, but this is purely hypothetical and would have to be proven first to justify the effort of implementing it. Also note that a run-of-the-mill pool either assumes that all objects are the same size, or that you never deallocate individual objects. Allowing both puts you halfway towards a general-purpose allocator, compared to which you're bound to do worse.
You can segregate objects based on their types. That is, instead of a single vector of polymorphic Entitys with virtual functions, have N vectors: vector<Bullet>, vector<Monster>, vector<Loot>, and so on. This is less insane than it sounds, for three reasons (a sketch follows the list):
Often, you can pull out the entire business of managing one such vector into a dedicated system. So in the end you might even have a vector<System *> where each System has a vector for one kind of thing, and updates all those things in a single virtual call (delegating to many statically-dispatched calls).
You don't need to represent everything ever in this abstraction. Not every little integer needs to be wrapped in its own type of entity.
If you go further down this route and take hints from entity component systems, you also gain an alternative to inheritance for code reuse (class Monster : Entity {}; class Skeleton : Monster {};) that plays nicer with the hard-earned cache friendliness.
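A minimal sketch of that per-type layout, with illustrative type names: each system owns a contiguous vector of one concrete type, and the per-frame loop makes one virtual call per system rather than per entity.

#include <memory>
#include <vector>

struct Bullet  { void update(float dt) { /* move, collide, ... */ } };
struct Monster { void update(float dt) { /* think, walk, ... */ } };

class System {
public:
    virtual ~System() = default;
    virtual void update(float dt) = 0; // one virtual call per system per frame
};

template <typename T>
class TypedSystem : public System {
public:
    std::vector<T> items; // contiguous storage, statically dispatched updates
    void update(float dt) override {
        for (T& item : items)
            item.update(dt);
    }
};

int main() {
    std::vector<std::unique_ptr<System>> systems;
    systems.push_back(std::make_unique<TypedSystem<Bullet>>());
    systems.push_back(std::make_unique<TypedSystem<Monster>>());
    for (auto& s : systems)
        s->update(1.0f / 60.0f);
}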
It is not easy because polymorphism doesn't work well with cache coherency.
I think the best you can do is overload the base class's new operator to allocate memory from a pool. But to do this, you need to know the size of all derived classes, and after some allocating/deallocating you can get memory fragmentation, which will lower the gain.
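For illustration, a minimal, single-threaded sketch of such a pool; the names and sizes are assumptions, and the block size must cover the largest derived class (and be at least sizeof(void*)):

#include <cstddef>
#include <new>

class Pool {
public:
    Pool(std::size_t blockSize, std::size_t blockCount)
        : mStorage(new char[blockSize * blockCount]) {
        for (std::size_t i = 0; i < blockCount; ++i)
            push(mStorage + i * blockSize); // thread a free list through the storage
    }
    ~Pool() { delete[] mStorage; }
    void* allocate() {
        if (!mFree) throw std::bad_alloc();
        void* p = mFree;
        mFree = *static_cast<void**>(p); // pop the free list
        return p;
    }
    void deallocate(void* p) { push(static_cast<char*>(p)); }
private:
    void push(char* p) { *reinterpret_cast<void**>(p) = mFree; mFree = p; }
    char* mStorage;
    void* mFree = nullptr;
};

Pool gEntityPool(128, 1024); // 128 bytes must fit the largest Entity subclass

class Entity {
public:
    virtual ~Entity() = default;
    // A real pool would also verify that the requested size fits a block.
    static void* operator new(std::size_t) { return gEntityPool.allocate(); }
    static void operator delete(void* p) noexcept { gEntityPool.deallocate(p); }
};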
Have a look at Cachegrind, it's a tool that simulates how your program interacts with a machine's cache hierarchy.
My questions deal specifically with dependency injection through the constructor. I understand the pros/cons of the service locator pattern, constructor/setter injection, and their flavors; however, there is something I can't seem to get past after choosing pure constructor injection. After reading many materials on testable design, including a thorough perusal of Miško Hevery's blog (specifically this post), I'm in the following situation:
Assume I'm writing a C++ program, and I have correctly injected my dependencies through their constructors. For readability I have given myself a high-level object which has a single Execute() function called from main:
int main(int argc, char* argv[]) {
    MyAwesomeProgramObject object(argc, argv);
    return object.Execute();
}
Execute()'s responsibility is simply to wire up all required objects and kick off the highest-level object. The highest-level object requires a couple of dependencies, those objects require a few objects of their own, and so on, implying a function that looks like this:
int MyAwesomeProgramObject::Execute() {
    DependencyOne one;
    DependencyTwo two;
    DependencyThree three;

    MidLevelOne mid_one(one);
    MidLevelTwo mid_two(two, three);
    // ...
    MidLevelN mid_n(mid_dependencyI, mid_dependencyJ, mid_dependencyK);
    // ...
    HighLevelObject1 high_one(mid_one, mid_n);
    HighLevelObject2 high_two(mid_two);

    ProgramObject object(high_one, high_two);
    return object.Go();
}
From what I take from Miško's blog (and I would ask him, but figured he wouldn't have time to get back to me), this is the only way to satisfy pure constructor injection of dependencies.
In the blog post mentioned, he states we should have factories at a per-object-lifetime level, but this is essentially what Execute is doing, making my code look identical to his example:
AuditRecord audit = new AuditRecord();
Database database = new Database(audit);
Captcha captcha = new Captcha();
Authenticator authenticator =
new Authenticator(database, captcha, audit);
LoginPage loginPage = new LoginPage(audit, authenticator);
Questions:
Is this the correct approach?
Is this a pattern that I'm not aware of (it seems similar to Spring's context.xml)?
For pure constructor injection, do I simply suffer the cost of "upfront" allocation?
Note that your different examples are contradictory. At first you show creating objects on the stack, while your last example allocates objects dynamically.
Objects on the stack are somewhat dangerous, but just fine in most cases. The main problem arises when you hand a pointer to a stack object to another object whose lifetime is longer than your function's: when that long-lived object tries to access the stack object after the function has returned, you have a problem. If all the objects have a stack lifetime, then you're fine.
Personally, I started using shared pointers and I find that to be the ultimate in easing the management of a large number of objects.
std::shared_ptr<foo> foo_object(new foo);
std::shared_ptr<blah> blah_object(new blah(foo_object));
In this way blah can hold a copy of the foo shared pointer forever and everything works as expected, even across function boundaries. Not only that, a default-constructed shared pointer is null, and the managed object is auto-deleted when the last shared pointer to it is destroyed. And you can use weak pointers in objects that do not need the pointer to always be set...
Otherwise, I think that what you are trying to do works to a certain extent. In my world, things often get created at a later time, so I need a setter. However, constructor injection is very useful for forcing the user to properly initialize your object (i.e., I often create read-only objects, with no setters, which are 100% initialized on construction: very practical, very safe!)
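As a small illustration of that read-only style (the class names are borrowed from the blog example quoted above; the bodies are stubs):

struct Database {};
struct Captcha {};

class Authenticator {
public:
    // Everything arrives through the constructor; there are no setters,
    // so an Authenticator can never exist half-initialized.
    Authenticator(const Database& db, const Captcha& captcha)
        : mDb(db), mCaptcha(captcha) {}
private:
    const Database& mDb;
    const Captcha& mCaptcha;
};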
I am developing a large, complex model (mostly simple math, primarily algebra, but lots of calculations). I did my initial bid assuming I'd have to do everything once, but now the scope has expanded such that I need to run the entire model multiple times (on the same set of underlying assumptions but with a different dataset).
Based on my initial needs, I created a bunch of classes, then created dynamic instances of those classes in my main function, passing them by reference to each function as I went. That way, at the end of the main function, I can do all the necessary reporting / output once all of the functions have run. My question is about how to now modify my main function to allow for multiple iterations. A few sample bits of code below, followed by my question(s):
// Sample declaration (in main)
vector<PropLevelFinancials> PLF;

// Sample function call (functions live in other header files)
ProcessPropFinancials(PLF); // signature: void ProcessPropFinancials(vector<PropLevelFinancials>& PLF)

// Reporting at the end of main (functions in other header files)
OutputSummaryReport(PLF, /* other objects */);

// What I need to do next:
// clear/delete PLF and the other objects, then iterate through the model again
I am quite happy with the speed and results of my current program, so I don't need a whole lot of input on that front (although suggestions are always welcome).
How should I implement the ability to cycle through multiple datasets (I obviously know how to do the loop, my question is about memory management)? Speed is critical. I want to essentially delete the existing instance of the class object I have created (PLF), and then run everything again on a new instance of the object(s). Is this the type of situation where I should use "new" and "delete" to manage the iterations? Would that change my function calls as outlined above? If I wanted to avoid using new and delete (stay on the stack), what are my options?
No. Do not, ever, use new and delete without a highly exceptional cause.
std::vector<T> offers a clear() member you can use. I suggest you implement a similar one for your own classes if that is what you need. Or you can simply do PLF = std::vector<T>();, which would work a bit better for your own UDTs without modification, assuming that you wrote them according to the most basic C++ guidelines.
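A minimal sketch of the resulting iteration loop, with stubbed-out types and a hypothetical dataset count, just to show where clear() fits:

#include <vector>

struct PropLevelFinancials { /* model outputs */ };

void ProcessPropFinancials(std::vector<PropLevelFinancials>& plf) { /* fill plf */ }
void OutputSummaryReport(const std::vector<PropLevelFinancials>& plf) { /* report */ }

int main() {
    std::vector<PropLevelFinancials> plf;
    const int kNumDatasets = 3; // hypothetical: one pass per dataset
    for (int i = 0; i < kNumDatasets; ++i) {
        plf.clear();                // destroys last run's elements, keeps capacity
        ProcessPropFinancials(plf); // rebuild from the next dataset
        OutputSummaryReport(plf);
    }
}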
Of course, I would like to know some magic fix for this, but I am open to restructuring.
So I have a class DeviceDependent, with the following constructor
DeviceDependent(Device& device);
which stores a reference to the device. The device can change state, which will necessitate a change in all DeviceDependent instances dependent on that device. (You guessed it this is my paltry attempt to ride the directX beast)
To handle this I have the functions DeviceDependent::createDeviceResources() and DeviceDependent::onDeviceLost().
I planned to register each DeviceDependent instance with the device specified in the DeviceDependent constructor. The Device would keep a std::vector<DeviceDependent*> of all DeviceDependent instances so registered. It would then iterate through that vector and call the above functions when appropriate.
This seemed simple enough, but what I especially liked about it was that I could have a std::vector<DeviceDependent (or child)> somewhere else in the code and iterate over it quickly. For instance, I have a class Renderable which, as the name suggests, represents a renderable object; I need to iterate over these at least once a frame, and because of that I did not want the objects scattered throughout memory.
Down to business, here is the problem:
When I create the solid objects I rely on move semantics. This was purely by instinct; I did not consider copying large objects like these to add them to the std::vector<DeviceDependent (or child)> collection (and I still abhor the idea).
However, with move semantics (and I have tested this, for those who don't believe it) the address of the object changes. What's more, it changes after the default constructor is called. That means my code inside the constructor of DeviceDependent calling device.registerDeviceDependent(this) compiles and runs fine, but the device accumulates a list of pointers that are invalidated as soon as the objects are moved into the vector.
I want to know if there is someway I can stick to this plan and make it work.
Things I thought of:
Making the 'real' vector a collection of shared pointers, so there is no issue with copying: the object presumably will not change address. I don't like this plan because I am afraid that leaving things out on the heap will harm iteration performance.
Calling register after the object has been moved. It's what I'm doing provisionally, but I don't like it because I feel the constructor is the proper place to do this: there should not exist an instance of DeviceDependent that is not on some device's manifest.
Writing my own move constructor or move assignment functions. This way I could remove the old address from the device and register the new one. I don't want to do this because I don't want to keep updating them as the class evolves.
This has nothing to do with move constructors. The issue is std::vector. When you add a new item to that vector, it may reallocate its memory, and that will cause all the DeviceDependent objects to be transferred to a new memory block internal to the vector. New versions of each item will be constructed there, and the old ones destroyed. Whether the construction is copy-construction or move-construction is irrelevant; the objects effectively change their address either way.
To make your code correct, DeviceDependent objects need to unregister themselves in their destructor, and register themselves in both the copy- and move-constructors. You should do this regardless of what else you decide about storage, if you have not deleted those constructors. Otherwise those constructors, if called, will do the wrong thing.
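A minimal sketch of that register/unregister discipline, assuming the registerDeviceDependent interface from the question (the unregister function is a hypothetical counterpart):

#include <algorithm>
#include <vector>

class DeviceDependent;

class Device {
public:
    void registerDeviceDependent(DeviceDependent* d) { mDependents.push_back(d); }
    void unregisterDeviceDependent(DeviceDependent* d) {
        mDependents.erase(std::remove(mDependents.begin(), mDependents.end(), d),
                          mDependents.end());
    }
private:
    std::vector<DeviceDependent*> mDependents;
};

class DeviceDependent {
public:
    explicit DeviceDependent(Device& device) : mDevice(&device) {
        mDevice->registerDeviceDependent(this);
    }
    DeviceDependent(const DeviceDependent& other) : mDevice(other.mDevice) {
        mDevice->registerDeviceDependent(this); // each copy registers itself
    }
    DeviceDependent(DeviceDependent&& other) noexcept : mDevice(other.mDevice) {
        mDevice->registerDeviceDependent(this); // the new address registers itself;
                                                // the moved-from object unregisters
                                                // its own address on destruction
    }
    ~DeviceDependent() { mDevice->unregisterDeviceDependent(this); }
private:
    Device* mDevice;
};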
One approach not on your list would be to prevent the vector from reallocating by calling reserve() with the maximum number of items you will store. This is only practical if you know a reasonable upper bound on the number of DeviceDependent objects. However, you may find that reserving an estimate, while not eliminating the vector reallocations entirely, makes them rare enough that the cost of unregistering and re-registering becomes insignificant.
It sounds like your goal is cache coherency for the DeviceDependent objects. You might find that using a std::deque as the main storage avoids the reallocations while still giving enough cache coherency. Or you could gain cache coherency by writing a custom allocator or operator new().
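Two small sketches of those alternatives (the element type and bound are illustrative):

#include <deque>
#include <vector>

struct Dependent { /* ... */ };

int main() {
    std::vector<Dependent> v;
    v.reserve(256);   // hypothetical upper bound: no reallocation (hence
    v.emplace_back(); // no moves) until the size exceeds it

    std::deque<Dependent> d;
    d.emplace_back(); // a deque never relocates existing elements on
    d.emplace_back(); // push_back, so registered pointers stay valid
}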
As an aside, it sounds like your design is being driven by performance costs that you are only guessing at. If you actually measure, you might find that using std::vector<std::unique_ptr<DeviceDependent>> is fine, and doesn't significantly increase the time it takes to iterate over them. (Note you don't need shared pointers here, since the vector is the only owner, so you can avoid the overheads of reference counting.)
Right now, I'm modelling some sort of little OpenGL library to fool around with graphics programming, etc. Therefore, I'm using classes to wrap specific OpenGL function calls like texture creation, shader creation, and so on. So far, so good.
My Problem:
All OpenGL calls must be made by the thread that owns the created OpenGL context (at least under Windows; from any other thread they will do nothing and create an OpenGL error). So, in order to get an OpenGL context, I first create an instance of a window class (just another wrapper around the Win API calls) and finally create an OpenGL context for that window. That sounded quite logical to me. (If there's already a flaw in my design that makes you scream, let me know...)
If I want to create a texture, or any other object that needs OpenGL calls for its creation, I basically do this (an example constructor of an OpenGL object):
opengl_object()
{
    // do necessary stuff for object initialisation
    // pass object to the OpenGL thread for final construction
    // wait until object is constructed by the OpenGL thread
}
So, in words, I create an object like any other object using
opengl_object obj;
which then, in its constructor, puts itself into a queue of OpenGL objects to be created by the OpenGL context thread. The OpenGL context thread then calls a virtual function, implemented in all OpenGL objects, that contains the necessary OpenGL calls to finally create the object.
I really thought this way of handling the problem would be nice. However, right now, I think I'm awfully wrong.
The thing is, even though the above approach works perfectly fine so far, I run into trouble as soon as the class hierarchy gets deeper. For example (which is not perfect, but it shows my problem):
Let's say I have a class called sprite, representing a Sprite, obviously. It has its own create function for the OpenGL thread, in which the vertices and texture coordinates are loaded into the graphics card's memory, and so on. That's no problem so far.
Let's further say I want to have two ways of rendering sprites: one instanced and one not. So I would end up with two classes, sprite_instanced and sprite_not_instanced. Both derive from the sprite class, as both are sprites that are only rendered differently. However, sprite_instanced and sprite_not_instanced need further OpenGL calls in their create functions.
My Solution so far (and I feel really awful about it!)
I have some understanding of how object construction works in C++ and how it affects virtual functions. So I decided to use the virtual create function of the sprite class only to load the vertex data and so on into graphics memory. The virtual create method of sprite_instanced will then do the preparation to render that sprite instanced.
So, if I write
sprite_instanced s;
first, the sprite constructor is called, and after some initialisation, the constructing thread passes the object to the OpenGL thread. At this point, the passed object is merely a normal sprite, so sprite::create will be called and the OpenGL thread will create a normal sprite. After that, the constructing thread calls the constructor of sprite_instanced, again does some initialisation, and passes the object to the OpenGL thread. This time, however, it's a sprite_instanced, and therefore sprite_instanced::create will be called.
So, if I'm right with the above assumption, everything happens exactly as it should, in my case at least. I spent the last hour reading about calling virtual functions from constructors and how the v-table is built, etc. I've run some tests to check my assumptions, but that behaviour might be compiler-specific, so I don't rely on them 100%. In addition, it just feels awful, like a terrible hack.
Another Solution
Another possibility would be implementing a factory method in the OpenGL thread class to take care of this, so I could do all the OpenGL calls inside the constructors of those objects. However, in that case I would need a lot of functions (or one template-based approach), and it feels like a possible loss of potential rendering time when the OpenGL thread has more to do than it needs to...
My Question
Is it OK to handle this the way I described above, or should I throw that stuff away and do something else?
You were already given some good advice. So I'll just spice it up a bit:
One important thing to understand about OpenGL is that it is a state machine, which doesn't need elaborate "initialization". You just use it, and that's about it. Buffer objects (textures, vertex buffer objects, pixel buffer objects) may make it look different, and most tutorials and real-world applications indeed fill buffer objects at application start.
However, it is perfectly fine to create them during regular program execution. In my 3D engine I use the free CPU time during the double-buffer swap for asynchronous uploads into buffer objects (for(b in buffers){glMapBuffer(b.target, GL_WRITE_ONLY);} start_buffer_filling_thread(); SwapBuffers(); wait_for_buffer_filling_thread(); for(b in buffers){glUnmapBuffer(b.target);}).
It's also important to understand that simple things like sprites should not each be given their own VBO. One normally groups large numbers of sprites into a single VBO. You don't have to draw them all together, since you can offset into the VBO and make partial drawing calls. But this common OpenGL pattern (geometrical objects sharing a buffer object) goes completely against the principle of your classes. So you'd need some buffer object manager that hands out slices of address space to consumers.
Using a class hierarchy with OpenGL in itself is not a bad idea, but it should sit some levels higher than OpenGL. If you just map OpenGL 1:1 to classes, you gain nothing but complexity and bloat. Whether I call OpenGL functions directly or through a class, I still have to do all the grunt work. So a texture class should not just map the concept of a texture object; it should also take care of interacting with pixel buffer objects (if used).
If you actually want to wrap OpenGL in classes, I strongly recommend not using virtual functions, but rather static (i.e., compilation-unit-level) classes with inline methods, so that they become syntactic sugar the compiler can optimize away rather than bloat.
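For illustration, a couple of thin, non-virtual wrappers in that spirit; the GL calls are real API functions, while the wrapper names are mine:

#include <GL/gl.h> // on Windows, <windows.h> must be included first

inline GLuint create_texture()
{
    GLuint tex = 0;
    glGenTextures(1, &tex); // generate a single texture name
    return tex;
}

inline void bind_texture_2d(GLuint tex)
{
    glBindTexture(GL_TEXTURE_2D, tex); // bind it to the 2D target
}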
The question is simplified by assuming a single context is current on a single thread; in reality there can be multiple OpenGL contexts, also on different threads (and while we're at it, we should consider context namespace sharing).
First of all, I think you should separate the OpenGL calls from the object constructor. Doing so allows you to set up an object without caring about which OpenGL context is current; subsequently, the object can be enqueued for creation on the main rendering thread.
An example. Suppose we have two queues: one holding Texture objects whose texture data is to be loaded from the filesystem, and one holding Texture objects whose texture data is to be uploaded to GPU memory (after the data has been loaded, of course).
Thread 1: The texture loader

{
    for (;;) {
        while (textureLoadQueue.Size() > 0) {
            Texture obj = textureLoadQueue.Dequeue();
            obj.Load();
            textureUploadQueue.Enqueue(obj);
        }
    }
}
Thread 2: The texture uploader code section, essentially the main rendering thread

{
    while (textureUploadQueue.Size() > 0) {
        Texture obj = textureUploadQueue.Dequeue();
        obj.Upload(ctx);
    }
}
The Texture object constructor should look like this:
Texture::Texture(const char *path)
{
    mImagePath = path;
    textureLoadQueue.Enqueue(this);
}
This is only an example. Of course each object has different requirements, but this solution is the most scalable.
My solution is essentially described by the interface IRenderObject (the documentation differs from the current implementation, since I'm refactoring a lot at the moment and development is at a very alpha stage). This solution is written for the C# language, which introduces additional complexity due to garbage-collection management, but the concepts are perfectly adaptable to C++.
Essentially, the interface IRenderObject defines a base OpenGL object:
It has a name (the one returned by the Gen routines)
It can be created using a current OpenGL context
It can be deleted using a current OpenGL context
It can be released asynchronously using an "OpenGL garbage collector"
The creation/deletion operations are very intuitive. They take a RenderContext abstracting the current context; using this object, it is possible to run checks that are useful for finding bugs in object creation/deletion:
The Create method checks whether the context is current, whether the context can create an object of that type, and so on...
The Delete method checks whether the context is current and, more importantly, whether the context passed as a parameter shares the object name space of the context that created the underlying IRenderObject.
Here is an example involving the Delete method. The code runs, but it doesn't do what you'd expect:
RenderContext ctx1 = new RenderContext(), ctx2 = new RenderContext();
Texture tex1, tex2;
ctx1.MakeCurrent(true);
tex1 = new Texture2D();
tex1.Load("example.bmp");
tex1.Create(ctx1); // In this case, we have texture object name = 1
ctx2.MakeCurrent(true);
tex2 = new Texture2D();
tex2.Load("example.bmp");
tex2.Create(ctx2); // Again texture object name = 1, the same as before, since the two contexts do not share the object name space
// Somewhere in the code
ctx1.MakeCurrent(true);
tex2.Delete(ctx1); // Works, but it actually delete the texture represented by tex1!!!
The asynchronous release operation aims to delete the object without having a current context (in fact, the method doesn't take any RenderContext parameter). It can happen that the object is disposed in a separate thread which doesn't have a current context; moreover, I cannot rely on the garbage collector (C++ doesn't have one), since it runs on a thread over which I have no control. Furthermore, it is desirable to implement the IDisposable interface, so application code can control the OpenGL object's lifetime.
The OpenGL garbage collector is executed on the thread that has the right context current.
It is always bad form to call any virtual function in a constructor. The call will not dispatch to a derived class's override as it would on a fully constructed object; it resolves to the version of the class currently being constructed.
Your data structures are very confused. You should investigate the concept of Factory objects. These are objects that you use to construct other objects. You should have a SpriteFactory, which gets pushed into some kind of queue or whatever. That SpriteFactory should be what creates the Sprite object itself. That way, you don't have this notion of a partially constructed object, where creating it pushes itself into a queue and so forth.
Indeed, anytime you start to write, "Objectname::Create", stop and think, "I really should be using a Factory object."
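A minimal sketch of that factory idea, with hypothetical names: all GL work happens inside create(), on the GL thread, so no partially constructed Sprite is ever visible elsewhere.

#include <memory>

class Sprite {
public:
    explicit Sprite(unsigned vbo) : mVbo(vbo) {} // fully formed on construction
private:
    unsigned mVbo;
};

class SpriteFactory {
public:
    std::unique_ptr<Sprite> create() { // called on the GL thread only
        unsigned vbo = 0;
        // ... glGenBuffers(1, &vbo), upload vertex data, etc.
        return std::make_unique<Sprite>(vbo);
    }
};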
OpenGL was designed for C, not C++. What I've learned works best is to write functions rather than classes to wrap around OpenGL functions, as OpenGL manages its own objects internally. Use classes for loading your data, then pass it to C-style functions which deal with OpenGL. You should be very careful generating/freeing OpenGL buffers in constructors/destructors!
I would avoid having your objects insert themselves into the GL thread's queue on construction. That should be an explicit step, e.g.
gfxObj_t thing(arg);      // read a file or something in the constructor
mWindow.addGfxObj(thing); // put the thing in mWindow's queue
This lets you do things like constructing a set of objects and then putting them all in the queue at once, and it guarantees that the constructor finishes before any virtual functions are called. Note that putting the enqueue at the end of the constructor does not guarantee this, because constructors run from the base class down. That means a base-class constructor would enqueue the object before the derived class's constructor has begun acting, so a virtual function could be called on an object whose derived parts are uninitialized: a race condition, and a nightmare to debug if you don't realize what you've done.
I think the problem here isn't RAII, or the fact that OpenGL is a C-style interface. It's that you're assuming sprite and sprite_instanced should both derive from a common base. These problems occur all the time with class hierarchies, and one of the first lessons I learned about object orientation, mostly through many errors, is that it's almost always better to encapsulate than to derive, EXCEPT when you derive through an abstract interface.
In other words, don't be fooled by the fact that both of these classes have "sprite" in their name. They are otherwise totally different in behaviour. For any common functionality they share, implement an abstract base that encapsulates that functionality.