I currently have a project using DirectX11 that generates random terrain based on the hill algorithm. I have a setup where you can change the inputs to the terrain (seed, number of hills, etc.), then reinitialize it and watch the terrain get generated hill by hill. The issue is that (as you would expect) there is a large FPS loss, and things like the camera stutter when you try to move it. Essentially, what I want to do is create a thread for the terrain hill step, so that the generation doesn't interfere with the frame time and therefore the camera (so the camera can still move seamlessly). I've looked at a few resources but I'm still not understanding threads properly.
Checking when to reinitialize the terrain, during the update method:
void CThrive::Update(float frameTime)
{
CKeyboardState kb = CKeyboardState::GetKeyboardState(mWindow->GetHWND());
mCamera->Update(kb, frameTime);
if (mGui->Reint())
{
SafeDelete(mTerrain);
mTerrain = new CTerrain(mGui->GetTerrSize(), mGui->GetTerrMin(), mGui->GetTerrMax(), mGui->GetTerrNumHills(), mGui->GetTerrSeed());
mNewTerrain = true;
mGui->SetReint(false);
}
}
Calling method to generate new hills, during the render method:
void CThrive::Render()
{
if (mNewTerrain)
{
reintTerrain();
}
MainPass();
}
Method used to add to the terrain:
void CThrive::reintTerrain()
{
if (!mTerrain->GenerationComplete())
{
mTerrain->GenerateStep(mGraphicsDevice->GetDevice());
}
else
{
mNewTerrain = false;
}
}
I assume I'd create a thread for reintTerrain, but I'm not entirely sure how to properly make this work within the class, as I require it to stop adding hills when it's finished.
Thank you for your help
Use std::thread for thread creation. Pass the thread's entry point to its constructor, either as a lambda or as a member function pointer (plus the object). Instances of std::thread can live in the private section of your class. Access to fields that are shared between threads should be synchronized (std::atomic<>, a mutex, or explicit fences such as std::atomic_thread_fence) in order to avoid data races and visibility problems.
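A minimal sketch of how that could look in your class (not drop-in code: the thread and flag members are assumptions, the other names come from your question, and D3D11 calls are only safe from a worker thread if you stick to the ID3D11Device and keep the immediate context on the main thread):
#include <atomic>
#include <thread>

class CThrive
{
    // ...
private:
    std::thread mTerrainThread;
    std::atomic<bool> mTerrainReady{ false };
};

void CThrive::StartTerrainGeneration()
{
    if (mTerrainThread.joinable())
        mTerrainThread.join();                 // make sure any previous run finished

    mTerrainReady = false;
    mTerrainThread = std::thread([this]
    {
        while (!mTerrain->GenerationComplete())
            mTerrain->GenerateStep(mGraphicsDevice->GetDevice());
        mTerrainReady = true;                  // main thread polls this flag
    });
}
Remember to join the thread in the destructor (or before reinitializing the terrain), otherwise std::thread's destructor will call std::terminate on a still-joinable thread.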
Ok, I am trying to switch my game engine to multithreading. I have done the research on how to use OpenGL in a multithreaded application. I have no problem with rendering or switching contexts. Let my piece of code explain the problem :) :
for (it = (*mIt).Renderables.begin(); it != (*mIt).Renderables.end(); it++)
{
//Set State Modeling matrix
CORE_RENDERER->state.ModelMatrix = (*it).matrix;
CORE_RENDERER->state.ActiveSubmesh = (*it).submesh;
//Internal Model Uniforms
THREAD_POOL->service.post([&]
{
for (unsigned int i = 0; i < CORE_RENDERER->state.ActiveShaderProgram->InternalModelUniforms.size(); i++)
{
CORE_RENDERER->state.ActiveShaderProgram->InternalModelUniforms[i]->Set( CORE_RENDERER->state.ModelMatrix);
}
CORE_RENDERER->state.ActiveSubmesh->_Render();
});
//Sleep(10);
}
I'll quickly explain the elements in the code to make my problem clearer. Renderables is a simple std::vector of elements with a _Render() function, which works perfectly. CORE_RENDERER->state is a struct holding information about the current render state, such as the current material properties, the current submesh, and the ModelMatrix. So the matrix and submesh are stored in the state struct (I KNOW THIS IS SLOW, I'll probably change that in time :) ). The next piece of code is sent to THREAD_POOL->service, which is actually a boost::asio::io_service with only one thread, so it acts like a queue of rendering commands. The idea is that the main thread provides information about what to render and does frustum culling and other tests, while an auxiliary thread does the actual rendering. This works fine, except there is a slight problem:
The code that is sent to the thread pool starts to execute, but before all the InternalModelUniforms are set and the submesh is rendered, the next iteration over Renderables runs and both ModelMatrix and ActiveSubmesh are changed. The program doesn't crash, but both pieces of state change mid-render: some meshes are rendered with the right matrices, others are not, which results in a flickering image. Objects appear in one frame and are gone the next. The problem is only fixed if I enable that Sleep(10) call, which makes sure the posted code is executed before the next iteration, but that obviously kills the idea of gaining performance. What is the best possible solution for this? How can I send commands to the queue, each with its own data built in? Maybe I need to implement my own command queue and a single thread, without io_service?
I will continue my research, as I know there is a way. The idea is right, because I get a performance boost: not a single if/else statement is processed by the rendering thread :) Any help or tips will really help!
Thanks!
Update:
After struggling for a few nights I have created a very primitive model of communication between the main thread and an aux thread. I created a class that represents a base command to be executed by the aux thread:
class _ThreadCommand
{
public:
_ThreadCommand() {}
virtual ~_ThreadCommand() {}
virtual void _Execute() = 0;
virtual _ThreadCommand* Clone() = 0;
};
These commands, which are children of this class, have an _Execute() function that does whatever operation needs to be done. During rendering, the main thread fills a boost::ptr_vector of these commands, while the aux thread keeps checking whether there are any commands to process. When commands are found, it copies the entire vector into its own vector inside _AuxThread and clears the original one. The commands are then executed by calling _Execute() on each:
void _AuxThread()
{
//List of Thread commands
boost::ptr_vector<_ThreadCommand> _cmd;
//Infinitive loop
while(CORE_ENGINE->isRunning())
{
boost::lock_guard<boost::mutex> _lock(_auxMutex);
if (CORE_ENGINE->_ThreadCommands.size() > 0)
{
boost::lock_guard<boost::mutex> _auxLock(_cmdMutex);
for (unsigned int i = 0; i < CORE_ENGINE->_ThreadCommands.size(); i++)
{
_cmd.push_back(CORE_ENGINE->_ThreadCommands[i].Clone());
}
//Clear commands
CORE_ENGINE->_ThreadCommands.clear();
//Execute Commands
for (unsigned int i = 0; i < _cmd.size(); i++)
{
//Execute
_cmd[i]._Execute();
}
//Empty _cmd
_cmd.clear();
}
}
//Notify main thread that we have finished
CORE_ENGINE->_ShutdownCondition->notify_one();
}
I know that this is a really bad way to do it. Performance is quite a bit worse, which I'm fairly sure is because of all the copying and mutex locks. But at least the renderer works. You can get the idea of what I want to achieve, but as I said I am very new to multithreading. What is the best solution for this scenario? Should I go back to the thread pool system with asio::io_service? How can I feed commands to the aux thread with all the values that must be sent to the renderer so that it performs the tasks correctly?
First, a warning. Your "slight problem" is not slight at all. It is a race condition, which is undefined behavior in C++, which in turn implies that anything could happen, including:
Everything renders fine
Image flickers
Nothing renders at all
It crashes on the last Saturday of every month. Or it works fine on your computer and crashes on everyone else's.
Seriously, do not ever rely on UB, especially when writing a library/framework/game engine.
Now about your question.
Let's leave aside any practical benefits of your approach and fix it first.
Actually, OpenGL implementations use something very similar under the hood: commands are executed asynchronously by a driver thread. I recommend reading about how that is implemented to get some ideas on how to improve your design.
What you need to do is somehow "capture" the state at the time you post a rendering command. The simplest possible thing is to copy CORE_RENDERER->state into the closure and use that copy to do the rendering. If the state is large, though, this can be costly.
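For illustration, a minimal sketch of that first option, assuming the state struct is copyable; RenderState here is just a stand-in name for its actual type:
RenderState snapshot = CORE_RENDERER->state;      // copy taken on the main thread
THREAD_POOL->service.post([snapshot]              // the lambda owns its own copy
{
    for (unsigned int i = 0; i < snapshot.ActiveShaderProgram->InternalModelUniforms.size(); i++)
        snapshot.ActiveShaderProgram->InternalModelUniforms[i]->Set(snapshot.ModelMatrix);
    snapshot.ActiveSubmesh->_Render();
});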
An alternative solution (and OpenGL goes this way) is to make every change to the state a command as well, so that
CORE_RENDERER->state.ModelMatrix = (*it).matrix;
CORE_RENDERER->state.ActiveSubmesh = (*it).submesh;
translates into
Matrix matrix = (*it).matrix;
Submesh submesh = (*it).submesh;
THREAD_POOL->service.post([&,matrix,submesh]{
CORE_RENDERER->state.ModelMatrix = matrix;
CORE_RENDERER->state.ActiveSubmesh = submesh;
});
Notice, however, that now you can't simply read CORE_RENDERER->state.ModelMatrix from your main thread, as it is being changed on a different thread. You must first ensure that the command queue is empty.
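One possible way to do that is to post a "fence" task and wait for it from the main thread. This is only a sketch, and it assumes the single worker thread behind THREAD_POOL->service executes posted handlers in the order they were posted (with a single thread this is the usual behaviour; a strand would make the ordering explicit):
#include <future>

std::promise<void> fence;
std::future<void> done = fence.get_future();
THREAD_POOL->service.post([&fence] { fence.set_value(); });
done.wait();  // every command posted before the fence has now executed
// CORE_RENDERER->state is now safe to read from the main thread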
I have a program where users can create Frames using a Lua command such as:
frm=Frame.new()
The above command shows a frame to the user. Behind the scenes the C++ wrapper is as follows:
Frame* Frame_new(lua_State* L)
{
int nargs=lua_gettop(L);
Frame* wb=0;
if(nargs==0){
//Omitted
wb=mainfrm->GetFrame();
lua_pushlightuserdata(L,(void*)(wb));
int key=luaL_ref(L, LUA_REGISTRYINDEX);
wb->SetLuaRegistryKey(key);
}
return wb;
}
Since the frame is shown to the user, the user can close the frame by just clicking on the close button provided by the operating system. This generates a close event and it is handled as follows:
void Frm::OnClose(wxCloseEvent& evt)
{
//Omitted for brevity
int LuaRegistryKey=GetFrame()->GetLuaRegistryKey();
lua_rawgeti(glbLuaState,LUA_REGISTRYINDEX,LuaRegistryKey);//userdata
Frame* wb1=(Frame*)lua_touserdata(glbLuaState,-1); //userdata
lua_pop(glbLuaState,1); //
lua_getglobal(glbLuaState,"_G"); //table
lua_pushnil(glbLuaState); //table key
while (lua_next(glbLuaState,-2)) {//table key value
const char* name = lua_tostring(glbLuaState,-2);//table
if(lua_type(glbLuaState,-1)==LUA_TUSERDATA){
Frame* wb2=(Frame*)lua_touserdata(glbLuaState,-1);
if(wb2==m_Frame){ //this part doesn't work
lua_pushnumber(glbLuaState,0);
lua_setglobal(glbLuaState,name);
lua_pop(glbLuaState,1);
break;
}
}
lua_pop(glbLuaState,1); //table key
} //table
lua_pop(glbLuaState,1); //
if(m_Frame==wb1) {delete m_Frame; m_Frame=0; wb1=0;}
if(wb1) {delete wb1; wb1=0;}
luaL_unref(glbLuaState,LUA_REGISTRYINDEX,LuaRegistryKey );
}
Now the goal is that when the user closes the frame, the variable created by frm=Frame.new() should become nil, so that the user cannot call one of its methods, such as frm:size(), which crashes the program.
In the above C++ code for handling the close event, wb1 and the current frame have the same memory address. To my understanding, all I need to do is search the global table for userdata of type Frame and compare the memory addresses, so that I know I am picking the right frame, and then set it to nil.
However, Frame* wb2=(Frame*)lua_touserdata(glbLuaState,-1); returns a completely different address from wb1, so I cannot tell which variable of type Frame I am referring to.
To my understanding wb2 has a different memory address due to one of 3 possible scenarios:
1) frm is a full userdata
2) frm is inside the global Lua table and therefore has a different address (although this doesn't make sense to me, as I pushed the address of the Frame from C++)
3) I am thinking about it completely the wrong way, or I can't see something simple
Now to my understanding all I need to do is search the global table for the userdata type Frame and compare the memory addresses so that I know I am choosing the right frame and then set it to nil.
Your understanding is wrong.
First, you did not return a full userdata to Lua. You returned a light userdata. That's different. The lua_type of a light userdata is LUA_TLIGHTUSERDATA.
Second, even if you fixed that problem, you're not iterating through tables inside the global table. So something as simple as this would confound you:
global_var = {}
global_var.frame = Frame.new()
Lua code should be able to store its data wherever it wants. And if it wants to store some userdata in a table, who are you to say no?
Third, even if you recursively iterated through every table accessible globally (with protection from infinite loops), that wouldn't stop this:
local frm = Frame.new()
function GlobalFunc(...)
frm:Stuff();
end
Because Lua has proper lexical scoping, GlobalFunc will store a reference to the frm local internally (as an upvalue). And since frm is a local variable, you cannot get at it just by iterating through globals.
Generally speaking, if you give a value to Lua, it now owns that value. It can do whatever it wants, and it's generally considered rude to break this contract.
Though it's not impossible. The way to handle it is by using an actual (full) userdata rather than a light userdata. Each full userdata is an object, a real allocation of memory. Inside that allocation you would store the Frame pointer. When it comes time for that Frame to be destroyed, all you have to do is set the Frame pointer inside the userdata to NULL.
Conceptually, it's like this in C++:
struct FramePtr
{
Frame *ptr;
};
Lua would be passing around a single allocation of FramePtr. So if you set that allocation's ptr to NULL, everyone sees it. No iterating through global tables or suchlike.
Of course, accessing the Frame from a FramePtr requires an extra indirection. However, by using full userdata instead of light userdata, you can also attach a proper metatable to it (light userdata doesn't get per-object metatables; every light userdata shares the same metatable).
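A rough sketch of what that can look like with the Lua C API follows. The metatable name "FRAME_MT" and the helper/function names are made up for illustration; only the API calls themselves are standard:
struct FramePtr
{
    Frame* ptr;
};

// Creating the object (e.g. inside Frame_new): a full userdata holding a FramePtr.
static FramePtr* PushFrame(lua_State* L, Frame* frame)
{
    FramePtr* ud = (FramePtr*)lua_newuserdata(L, sizeof(FramePtr));
    ud->ptr = frame;
    luaL_getmetatable(L, "FRAME_MT");  // metatable registered elsewhere with luaL_newmetatable
    lua_setmetatable(L, -2);
    return ud;
}

// Every method first checks whether the frame is still alive.
static int Frame_size(lua_State* L)
{
    FramePtr* ud = (FramePtr*)luaL_checkudata(L, 1, "FRAME_MT");
    if (ud->ptr == NULL)
        return luaL_error(L, "frame has been closed");
    // ... use ud->ptr ...
    return 0;
}

// In OnClose there is no need to walk _G; keep hold of the FramePtr (for example
// through the registry reference you already store) and just null it out:
//     ud->ptr = NULL;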
In UE4 I am working on a Puzzle Block game in my Graphics 2 class. Our professor and our class are learning about UE4 together. Our class as a whole is a little confused about one thing in the C++ code, and my professor said he would try to figure out the answer himself, but I figured I would jump-start our next class with information that I find here.
Okay so in the BlockGrid.cpp file this section of code is used to create the blocks.
void AMyProject2BlockGrid::BeginPlay()
{
Super::BeginPlay();
// Number of blocks
const int32 NumBlocks = Size * Size;
// Loop to spawn each block
for(int32 BlockIndex=0; BlockIndex<NumBlocks; BlockIndex++)
{
const float XOffset = (BlockIndex/Size) * BlockSpacing; // Divide by dimension
const float YOffset = (BlockIndex%Size) * BlockSpacing; // Modulo gives remainder
// Make position vector, offset from Grid location
const FVector BlockLocation = FVector(XOffset, YOffset, 0.f) + GetActorLocation();
// Spawn a block
AMyProject2Block* NewBlock = GetWorld()->SpawnActor<AMyProject2Block>(BlockLocation, FRotator(0,0,0));
// Tell the block about its owner
if(NewBlock != NULL)
{
NewBlock->OwningGrid = this;
}
}
}
The confusion starts with the following line in this function:
AMyProject2Block* NewBlock = GetWorld()->SpawnActor<AMyProject2Block>(BlockLocation, FRotator(0,0,0));
Each time, it looks like NewBlock is being rewritten for each new block in the puzzle. Our problem, for the game we are creating (a Lights Out game), is: if NewBlock is continually being overwritten, then how is anything keeping track of the addresses of the blocks that are being displayed on the screen? This could be fixed by simply creating an array to store the information, but if the information is already being kept somewhere, this would be inefficient. So how can we access the information for the blocks, given that NewBlock is overwritten on each loop iteration, without making an array that inefficiently duplicates the data?
THANKS!!!! :)
@Mathew's comment has the right idea: when you use SpawnActor, Unreal does a whole bunch of behind-the-scenes work, but essentially it creates your actor inside the world and manages its lifetime. For example, to remove your actor from the level, you would need to use:
MyActor->Destroy();
Rather than the C++ method:
delete PtrToActor;
This handles removing it from the scene, updating collision volumes etc. before actually deleting the actor.
As to your code: you are of course overwriting the pointer to your block, so the block itself is left untouched. You can use the TActorIterator<T> iterator to loop through all actors of a given type, which would save you having to store an array yourself. You would do something like this:
for (TActorIterator<AMyProject2Block> ActorItr(GetWorld()); ActorItr; ++ActorItr )
{
AMyProject2Block* PtrToActor = *ActorItr;
}
Here, PtrToActor will point to each instance in turn as the loop advances.
However, it isn't really inefficient (and in many cases it is actually more efficient) to store your own separate array of pointers. It is only a small memory cost (since it is just pointers), and it might be faster because you don't have to filter the actors to find the one you want. In either case it is much of a muchness, so you should choose whichever feels more logical.
In your case, I'd keep them in a 2D data structure so you can access them via their position, rather than just getting an unordered list of them from the engine.
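For example, a small sketch of what that might look like, using the names from the question; the Blocks member is made up, and indexing a flat TArray as Row * Size + Col acts as a simple 2D structure:
// In the grid's header (made-up member):
//     UPROPERTY()
//     TArray<AMyProject2Block*> Blocks;

// In BeginPlay, before the spawn loop:
Blocks.SetNum(NumBlocks);

// Inside the loop, right after spawning:
Blocks[BlockIndex] = NewBlock;

// Later, e.g. when toggling neighbours in a Lights Out grid:
AMyProject2Block* AMyProject2BlockGrid::GetBlock(int32 Row, int32 Col) const
{
    if (Row < 0 || Col < 0 || Row >= Size || Col >= Size)
        return nullptr;
    return Blocks[Row * Size + Col];
}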
I'm porting a game from Ruby to C++. There is a main render loop that updates and draws the content. Now let's say that during the game you want to select an item on another screen. The way it's done in the original code is to call Item item = getItemFromMenu(); getItemFromMenu is a function that opens the menu and has its own update/render loop, which means that the whole time the player has this other screen open, you are in a nested render loop. I feel like this is a bad approach but I'm not sure why. On the other hand it's very handy, because I can open the menu with just one function call and so the code stays localized.
Any idea if this is a bad design or not?
I hesitated to post it on gamedev, but since this is mostly a design issue I posted it here
edit : some pseudo-code to give you an idea:
The usual loop in the main part of the code:
while(open) {
UpdateGame();
DrawGame();
}
Now inside UpdateGame() I would do something like:
if(keyPressed == "I") {
Item& item = getItemFromInventory();
}
And getItemFromInventory():
while(true) {
UpdateInventory();
if(item_selected) return item;
DrawInventory();
}
A good way to handle something like this would be to replace the DrawInventory() call with something like InvalidateInventory(), which marks the current graphical state of the inventory as outdated and requests that it be redrawn during the next frame (which will happen pretty soon after, when the main loop gets to DrawGame()).
This way you keep running through the main loop, but the only parts of the screen that are considered for redrawing are the ones that have been invalidated. During normal gameplay you can invalidate your (2/3)D environment as a normal part of processing, but inside the inventory you only ever mark inventory assets as needing to be redrawn, which minimises overhead.
The other part of your inner loop, UpdateInventory(), can be a part of UpdateGame() if you use a flag to indicate the current game state, something like:
UpdateGame()
{
switch(gameState)
{
case INVENTORY:
UpdateInventory();
break;
case MAIN:
default:
UpdateMain();
break;
}
}
If you really wanted, you could also apply this to drawing:
DrawGame()
{
switch(gameState)
{
case INVENTORY:
DrawInventory();
break;
case MAIN:
default:
DrawMain();
break;
}
}
But I think drawing should be encapsulated and you should tell it which part of the screen, rather than which separate area of the game, needs to be drawn.
What you've created with your nested render loop is functionally a state machine (as most game render loops tend to be). The problem with the nested loop is that you'll often want to do the same sorts of things in your nested loop as in your outer loop (process input, handle IO, update debug info, etc.).
I've found that it's better to have one render loop and use a finite state machine (FSM) to represent your actual states. Your states might look like:
Main menu state
Options menu state
Inventory state
World view state
You hook up transitions between states to move between them. The player clicking a button might trigger a transition, which could play an animation or similar, then move to the new state. With an FSM your loop might look like:
while (!LeaveGame()) {
input = GetInput();
timeInfo = GetTimeInfo();
StateMachine.UpdateCurrentState(input, timeInfo);
StateMachine.Draw();
}
A full FSM can be a bit heavyweight for a small game, so you can try a simplified approach using a stack of game states. Every time the user performs an action that transitions to a new state, you push that state onto the stack; likewise, when they leave a state you pop it off. Only the top of the stack typically receives input, and the other items on the stack may or may not draw (depending on your preference). This is a common approach and has some upsides and downsides depending on who you talk to.
The simplest option of all is to just throw in a switch statement to pick which render function to use (similar to darvids0n's answer). If you're writing an arcade clone or a small puzzle game, that would do just fine.
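For what it's worth, a minimal sketch of the stack-of-states idea mentioned above; all of the class and function names here are illustrative, and Input is just a placeholder for however you pass input around:
#include <memory>
#include <vector>

struct Input;                                    // placeholder for your input type

struct GameState
{
    virtual ~GameState() {}
    virtual void Update(const Input& input, float dt) = 0;
    virtual void Draw() = 0;
};

class StateStack
{
public:
    void Push(std::unique_ptr<GameState> state) { mStates.push_back(std::move(state)); }
    void Pop()                                  { mStates.pop_back(); }

    void Update(const Input& input, float dt)
    {
        if (!mStates.empty())
            mStates.back()->Update(input, dt);   // only the top state receives input
    }

    void Draw()
    {
        for (auto& s : mStates)                  // draw bottom-up so, e.g., an
            s->Draw();                           // inventory overlays the world
    }

private:
    std::vector<std::unique_ptr<GameState>> mStates;
};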
I have the following action, which is executed when a certain button is pressed in a Qt application:
#include <shape.h>
void computeOperations()
{
polynomial_t p1("x^2-x*y+1"),p2("x^2+2*y-1");
BoundingBox bx(-4.01, 4.01,-6.01,6.01,-6.01,6.01);
Topology3d g(bx);
AlgebraicCurve* cv= new AlgebraicCurve(p1,p2);
g.push_back(cv);
g.run();
//Other operations on g.
}
Topology3d(...), AlgebraicCurve(...), BoundingBox(...), polynomial_t(...) are user-defined types defined in the corresponding header file.
Now for some values of p1 and p2, the method g.run() works perfectly. But for other values of p1 and p2, g.run() does not work any more: the method gets blocked somehow, the message "Application Not Responding" appears, and I have to kill the application.
I would like the following behavior: whenever g.run() takes too long or gets blocked for some particular values of p1 and p2, I want to display a warning box using QMessageBox::Warning.
I tried to do this with try{...} and catch{...}:
#include <shape.h>
class topologyException : public std::runtime_error
{
public:
topologyException(): std::runtime_error( "topology fails" ) {}
};
void computeOperations()
{
try
{
polynomial_t p1("x^2-x*y+1"),p2("x^2+2*y-1");
BoundingBox bx(-4.01, 4.01,-6.01,6.01,-6.01,6.01);
Topology3d g(bx);
AlgebraicCurve* cv= new AlgebraicCurve(p1,p2);
g.push_back(cv);
g.run();
//other operations on g
throw topologyException();
}
catch(topologyException& topException)
{
QMessageBox errorBox;
errorBox.setIcon(QMessageBox::Warning);
errorBox.setText("The parameters are incorrect.");
errorBox.setInformativeText("Please insert another polynomial.");
errorBox.exec();
}
}
This code compiles, but when it runs it does not really implement the required behavior.
For the polynomials for which g.run() gets blocked, the error message box code is never reached; and for the polynomials for which g.run() works well, the error message box code is still reached somehow and the box appears in the application.
I am new to handling exceptions, so any help is more than welcome.
I think the program gets blocked somewhere inside g.run(), so it never reaches the exception; still, I do not understand what really happens.
Still, I would want to throw this exception without going into the code of g.run(); this function is implemented as part of a bigger library which I just use in my code.
Can I get this behavior in my program without putting any try{...} catch{...} blocks inside the g.run() function?
You cannot achieve what you want with try-catch. If g.run() takes too much time or goes into an infinite loop, that doesn't mean an exception will be thrown.
What you can do is move the operations that take a lot of time into another thread. Start that thread in your event handler and wait for it to finish in your main thread for a fixed amount of time. If it does not finish, kill that thread and show your message box.
For further reference, read QThread, Qt Thread Support
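A rough sketch of that, just to illustrate the idea: TopologyThread is a made-up QThread subclass whose run() does the slow g.run() work. Note that QThread::terminate() is a last resort (the Qt documentation warns it can leave data in an undefined state), and that blocking the GUI thread in wait() still freezes the UI for up to the timeout; a QTimer plus the finished() signal would avoid that:
#include <QThread>
#include <QMessageBox>

TopologyThread* worker = new TopologyThread;  // run() performs the slow computation
worker->start();

if (!worker->wait(10000))                     // give it at most 10 seconds
{
    worker->terminate();                      // forcibly stop it (unsafe, last resort)
    worker->wait();                           // wait for the termination to complete

    QMessageBox errorBox;
    errorBox.setIcon(QMessageBox::Warning);
    errorBox.setText("The parameters are incorrect.");
    errorBox.setInformativeText("Please insert another polynomial.");
    errorBox.exec();
}
delete worker;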
Thanks for the suggestions.
So I see how I should create the thread, something like:
class myopThread : public QThread
{
public:
void run();
};
Then I override the run() function and put all the operations that take a lot of time in it:
void myopThread::run()
{
polynomial_t p1("x^2-x*y+1"),p2("x^2+2*y-1");
BoundingBox bx(-4.01, 4.01,-6.01,6.01,-6.01,6.01);
Topology3d g(bx);
AlgebraicCurve* cv= new AlgebraicCurve(p1,p2);
g.push_back(cv);
g.run();
//other operations on g
exec();
}
Okay, everything is clear so far, but I still do not see how to "Start that thread in your event handler and wait for it to finish in your main thread for a fixed amount of time. If it does not finish, kill that thread & show your messagebox."
I mean, starting the thread in the event handler presumably refers to using connect(...Signal, Slot...), but I still do not see how exactly this is done. I have never used QThread before, so this is all new to me.
Thank you very much for your help,
madalina
The most elegant way to solve this that I know of is with a future value. If you haven't run across these before, they can be quite handy in situations like this. Say you have a value that you'll need later on, but which you can begin calculating concurrently. The code might look something like this:
SomeValue getValue() {
... calculate the value ...
}
void foo() {
Future<SomeValue> future_value(getValue);
... other code that takes a long time ...
SomeValue v = future_value.get();
}
Upon calling the .get() method, the computed value is returned, either by calling the function then and there or by retrieving the cached value calculated in another thread that was started when the Future<T> was created. One nice thing is that, at least in a few libraries, you can pass a timeout parameter to the .get() method. That way, if your value is taking too long to compute, you can always unblock. Such elegance isn't usually achieved.
For a real-life library, you might try looking into the library documented here. As I recall it wasn't accepted as the official Boost futures library, but it certainly had promise. Good luck!
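For reference, standard C++11 offers the same idea through std::async and std::future; the timeout lives on wait_for() rather than get(). A minimal sketch, reusing the placeholder names SomeValue and getValue from above:
#include <chrono>
#include <future>

std::future<SomeValue> fut = std::async(std::launch::async, getValue);

// ... other code that takes a long time ...

if (fut.wait_for(std::chrono::seconds(5)) == std::future_status::ready)
{
    SomeValue v = fut.get();   // the value is ready, get() will not block
}
else
{
    // Still computing: report a timeout instead of blocking indefinitely.
    // Note the task itself keeps running; std::future cannot cancel it.
}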