I would like to achieve determinism in my game engine, in order to be able to save and replay input sequences and to make networking easier.
My engine currently uses a variable timestep: every frame I calculate the time it took to update/draw the last one and pass it to my entities' update method. This makes a game running at 1000 FPS appear as fast as one running at 30 FPS, but it introduces nondeterministic behavior.
A solution could be fixing the game to 60 FPS, but that would add input latency and give up the benefits of higher framerates.
So I've tried using a thread (which constantly calls update(1) and then sleeps for 16 ms) while drawing as fast as possible in the game loop. It kind of works, but it crashes often and my games become unplayable.
Is there a way to implement threading in my game loop to achieve determinism without having to rewrite all games that depend on the engine?
You should separate game frames from graphical frames. The graphical frames should only display the graphics, nothing else. For the replay it won't matter how many graphical frames your computer was able to execute, be it 30 per second or 1000 per second, the replaying computer will likely replay it with a different graphical frame rate.
But you should indeed fix the game frames, e.g. to 100 game frames per second. In a game frame the game logic is executed: everything that is relevant to your game (and the replay).
Your game loop should execute graphical frames whenever no game frame is due. If you fix your game to 100 game frames per second, that's 0.01 seconds per game frame; if your computer only needed 0.001 seconds to execute the logic in that game frame, the other 0.009 seconds are left for rendering graphical frames.
This is a small but incomplete and not 100% accurate example:
uint16_t const GAME_FRAMERATE = 100;
uint16_t const SKIP_TICKS = 1000 / GAME_FRAMERATE; // milliseconds per game frame

Timer sinceLoopStarted = Timer(); // millisecond timer starting at 0
unsigned long next_game_tick = sinceLoopStarted.getMilliseconds();

while (gameIsRunning)
{
    //! Game frames: catch up on all game logic that is due
    while (sinceLoopStarted.getMilliseconds() >= next_game_tick)
    {
        executeGamelogic();
        next_game_tick += SKIP_TICKS;
    }

    //! Graphical frames: render as often as time allows
    render();
}
The following link contains very good and complete information about creating an accurate gameloop:
http://www.koonsolo.com/news/dewitters-gameloop/
To be deterministic across a network, you need a single point of truth, commonly called "the server". There is a saying in the game community that goes "the client is in the hands of the enemy". That's true. You cannot trust anything that is calculated on the client for a fair game.
If, for example, your game gets easier when your thread updates only 59 times a second instead of 60, people will find out. Maybe at the start they won't even be malicious: they just had their machines under full load at the time and your process didn't get its 60 updates a second.
Once you have a server (maybe even in-process as a thread in single player) that does not care about graphics or update cycles and runs at its own speed, it's deterministic enough to at least get the same results for all players. It might still not be 100% deterministic, because the computer is not real-time: even if you tell it to update at a given frequency, it might not, due to other processes on the machine taking too much load.
The server and clients need to communicate, so the server needs to send a copy of its state (for performance, maybe a delta from the last copy) to each client. The client can draw this copy at the best speed available.
If your game is crashing with the thread, it might be an option to actually put "the server" out of process and communicate over the network. This way you will find out pretty quickly which variables would have needed locks, because once you move them into another project your client will no longer compile.
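For illustration, a minimal sketch of what such a per-tick state message could look like; the struct and field names here are made up, and a real game would add serialization and delta encoding:

#include <cstdint>
#include <vector>

struct PlayerState {
    uint32_t playerId;
    float    x, y;       // authoritative position computed by the server
    float    health;
};

struct Snapshot {
    uint64_t serverTick;              // which game frame this state belongs to
    std::vector<PlayerState> players; // state of every player at that tick
};

// Each server tick: build a Snapshot (or a delta against the last acknowledged one),
// send it to every client, and let the clients render it at whatever frame rate they like.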
Separate game logic and graphics into different threads. The game logic thread should run at a constant speed (say, it updates 60 times per second, or even higher if your logic isn't too complicated, to achieve smoother gameplay). Then your graphics thread should always draw the latest info provided by the logic thread, as fast as possible, to achieve high framerates.
In order to prevent partial data from being drawn, you should probably use some sort of double buffering, where the logic thread writes to one buffer, and the graphics thread reads from the other. Then switch the buffers every time the logic thread has done one update.
This should make sure you're always using the computer's graphics hardware to its fullest. Of course, this does mean you're putting constraints on the minimum CPU speed.
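As a rough illustration of the double-buffering idea, here is a sketch in C++; GameState, updateGameState and draw are placeholders, and the logic thread is assumed to be the only writer of frontIndex:

#include <mutex>

struct GameState { /* positions, animation frames, ... */ };

GameState buffers[2];
int frontIndex = 0;        // buffer the graphics thread reads from
std::mutex swapMutex;

void logicStep()           // called at a fixed rate by the logic thread
{
    int backIndex = 1 - frontIndex;       // safe: only this thread changes frontIndex
    updateGameState(buffers[backIndex]);  // write the new state into the back buffer
    std::lock_guard<std::mutex> lock(swapMutex);
    frontIndex = backIndex;               // publish the finished state
}

void renderFrame()         // called as fast as possible by the graphics thread
{
    GameState snapshot;
    {
        std::lock_guard<std::mutex> lock(swapMutex);
        snapshot = buffers[frontIndex];   // take a consistent copy under the lock
    }
    draw(snapshot);                       // draw without holding the lock
}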
I don't know if this will help but, if I remember correctly, Doom stored your input sequences and used them to generate the AI behaviour and some other things. A demo lump in Doom would be a series of numbers representing not the state of the game, but your input. From that input the game would be able to reconstruct what happened and, thus, achieve some kind of determinism ... Though I remember it going out of sync sometimes.
I'm writing a little game in C++ at the moment.
My game's while loop is always active, and in this loop
I have a condition checking whether the player is shooting.
Now I face the following problem:
after every shot fired there is a delay; this delay changes over time, and during the delay the player should still be able to move.
shoot
move
wait 700 ms
shoot again
At the moment I'm using Sleep(700). The problem is that I can't move during those 700 ms. I need something like a timer, so that the move command keeps being executed during the 700 ms instead of the whole game waiting for 700 ms.
This depends on how your hypothetical 'sleep' is implemented. There are a few things you should know, as this can be solved in a few ways.
You don't want to put your thread to sleep, because then everything halts, which is not what you want.
Plus, sleep may give you more time than you asked for. For example, if you sleep for 700 ms you may get more than that, which means that if you depend on accurate timing you can get burned by this.
1) The first way would be to record the raw time inside of the player. This is not the best approach, but it works for a simple toy program: record the result of std::chrono::high_resolution_clock::now() (from #include <chrono>) inside the class at the time you fire. To check whether you can fire again, compare the stored value to ...::now() and see if 700 ms have elapsed. You will have to read the documentation to work with it in milliseconds.
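A minimal sketch of that first approach, assuming a hypothetical Player class and a fixed 700 ms cooldown:

#include <chrono>

struct Player {
    std::chrono::high_resolution_clock::time_point lastShot{}; // epoch by default, so the first shot is allowed

    bool canShoot() const {
        using namespace std::chrono;
        return high_resolution_clock::now() - lastShot >= milliseconds(700);
    }

    void tryShoot() {
        if (!canShoot())
            return;                 // still inside the 700 ms delay; movement elsewhere is unaffected
        lastShot = std::chrono::high_resolution_clock::now();
        // spawn the projectile here ...
    }
};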
2) A better way would be to give your game a pulse via something called 'game ticks', which is the pulse to which your world moves forward. Then you can store the gametick that you fired on and do something similar to the above paragraph (except now you are just checking if currentGametick > lastFiredGametick + gametickUntilFiring).
For the gametick idea, you would make sure you do gametick++ every X milliseconds, and then run your world. A common value is somewhere between 10ms and 50ms.
Your game loop would then look like
while (!exit) {
    readInput();
    if (ticker.shouldTick()) {       // has enough time passed for the next gametick?
        ticker.tick();               // advance the gametick counter
        world.tick(ticker.gametick); // update the world for this gametick only
    }
    render();                        // keep rendering between gameticks
}
The above has the following advantages:
You only update the world every gametick
You keep rendering between gameticks, so you can have smooth animations since you will be rendering at a very high framerate
If you want to halt, just spin in a while loop until the amount of time has elapsed
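For illustration, one possible sketch of the ticker helper used in the loop above; the names shouldTick, tick and gametick come from the pseudocode, but this particular implementation is just an assumption:

#include <chrono>

struct Ticker {
    std::chrono::steady_clock::time_point nextTick = std::chrono::steady_clock::now();
    std::chrono::milliseconds tickLength{25};   // one gametick every 25 ms (40 ticks per second)
    unsigned long gametick = 0;

    bool shouldTick() const {
        return std::chrono::steady_clock::now() >= nextTick;   // is the next gametick due?
    }

    void tick() {
        ++gametick;
        nextTick += tickLength;   // schedule the following gametick relative to this one
    }
};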
Now, this has avoided a significant amount of discussion which you should definitely read about if you are thinking of going the gametick route.
With whatever route you take, you probably need to read this.
I'll try to explain this as easily as I can.
I basically have a game that hosts multiple matches.
All the matches are processed by the server; no player hosts a match on their own computer via port forwarding, it's all done by the server so they don't have to.
When a player requests to create a match, a match object with its own thread is made on the server. This works fine, and each match object has its own list of players. However, the game client runs at 30 FPS and needs to stay in sync with the server, so simply updating all the players in the thread loop won't do, since that loop doesn't run at 30 FPS.
What I'm doing right now is using a game engine, SFML, whose window loop goes through all the players on the server and runs their update code at 30 FPS.
This works, but once there is a large number of players it would be better for the players to be updated from the match threads, so that processing isn't slowed down by doing everything in one render loop.
What I want to know is: how would I simulate 30 FPS in a match's thread loop? Basically, have each player's update() function called on a 30 FPS schedule without using a rendering engine such as SFML to limit how fast the code runs. The code runs in the background and output is shown on a console; no rendering is needed on the server.
Or put simply, how do I limit a while loop code to run in 30FPS without a game engine?
This seems a bit like an XY problem: you are trying to sync players via a server "FPS" limit, but the server side doesn't really work like that.
Servers sync with clients in terms of time by passing the server's time to the client in each packet, or via a dedicated packet (in other words, all clients use only the server's time).
But regarding server-side implementations for games:
The problem is a bit more broad than what you mentioned. I'll post some guidelines which hopefully will help your research.
First of all, on the server you don't require rendering, so FPS is not relevant (30 FPS only matters for giving our eyes the sensation of fluidity). The server usually handles logic, for example various events (someone fires a rocket, or a new enemy spawns). As you can see, events don't require an FPS; they are triggered at arbitrary times.
Something else that is done on the server is physics (and other player-player or player-environment interactions). Collisions are handled with a fixed update step; physics is usually calculated at, say, 20 updates per second. Moving objects get capsule colliders in order to properly simulate interaction. In other words, servers, while not rendering anything, are responsible for movement and collisions (meaning that if you lose the connection to the server you won't move, or you'll walk through walls, etc., depending on the implementation).
Modern games also have prediction in order to reduce perceived lag (after all, any input you give your character needs to reach the server, get processed, and come back before it has any effect on the client). This means that when there is input on a client (take movement, for example), the client starts performing the action in anticipation, and when the server's response arrives (in our example, the new position) it is treated as a correction. That's why in a laggy game you sometimes perceive that you are moving in a certain direction and then all of a sudden you are somewhere completely different.
Regarding your question:
Inside your while loop, measure a deltaT, and if that deltaT is less than 33 milliseconds, sleep for the remaining 33 - deltaT milliseconds.
As you requested, here is a sample of the deltaT code:
while (gameIsRunning)
{
    DWORD frameStart = GetTickCount();     // milliseconds since system start
    UpdateGame();
    DWORD deltaT = GetTickCount() - frameStart;
    if (deltaT < 33)
    {
        Sleep(33 - deltaT);                // pad the frame out to ~33 ms (about 30 FPS)
    }
}
Where gameIsRunning is a global boolean, and UpdateGame is your game update function.
Please note that the code above works on Windows. For Linux you will need other functions in place of GetTickCount and Sleep.
So I was reading this article which contains 'Tips and Advice for Multithreaded Programming in SDL' - https://vilimpoc.org/research/portmonitorg/sdl-tips-and-tricks.html
It talks about SDL_PollEvent being inefficient as it can cause excessive CPU usage and so recommends using SDL_WaitEvent instead.
It shows an example of both loops, but I can't see how the SDL_WaitEvent version would work with a game loop. Is it the case that SDL_WaitEvent should only be used for things which don't require constant updates, whereas in a running game you would perform game logic every frame?
The only things I can think it could be used for are programs like a paint program where there is only action required on user input.
Am I correct in thinking I should continue to use SDL_PollEvent for generic game programming?
If your game only updates/repaints on user input, then you could use SDL_WaitEvent. However, most games have animation/physics going on even when there is no user input. So I think SDL_PollEvent would be best for most games.
One case in which SDL_WaitEvent might be useful is if you have it in one thread and your animation/logic on another thread. That way even if SDL_WaitEvent waits for a long time, your game will continue painting/updating. (EDIT: This may not actually work. See Henrik's comment below)
As for SDL_PollEvent using 100% CPU, as the article indicated, you could mitigate that by adding a sleep in your loop when you detect that your game is running faster than the required frame rate.
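A rough sketch of that mitigation; the 60 FPS budget and the updateGame/renderFrame calls are placeholders. Poll all pending events, update and render, then SDL_Delay for whatever is left of the frame budget:

#include <SDL.h>

const Uint32 FRAME_BUDGET_MS = 1000 / 60;   // aim for roughly 60 FPS

bool running = true;
while (running)
{
    Uint32 frameStart = SDL_GetTicks();

    SDL_Event event;
    while (SDL_PollEvent(&event))           // drain all pending events without blocking
    {
        if (event.type == SDL_QUIT)
            running = false;
    }

    updateGame();                           // placeholder game logic
    renderFrame();                          // placeholder rendering

    Uint32 elapsed = SDL_GetTicks() - frameStart;
    if (elapsed < FRAME_BUDGET_MS)
        SDL_Delay(FRAME_BUDGET_MS - elapsed);  // give the CPU back instead of spinning
}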
If you don't need sub-frame precision in your input, and your game is constantly animating, then SDL_PollEvent is appropriate.
Sub-frame precision can be important for, e.g., games where the player might want very small increments in movement. Quickly tapping and releasing a key has unpredictable behavior if you use the classic lazy method of treating keydown as "velocity = 1" and keyup as "velocity = 0" and then only updating position once per frame: if your tap happens to overlap with the frame render, you get one frame-duration of movement; if it does not, you get no movement. What you really want is an amount of movement smaller than the length of a frame, based on the timestamps at which the events occurred.
Unfortunately SDL's events don't include the actual event timestamps from the operating system, only the timestamp of the PumpEvents call, and WaitEvent effectively polls at 10ms intervals, so even with WaitEvent running in a separate thread, the most precision you'll get is 10ms (you could maybe approximate smaller by saying if you get a keydown and keyup in the same poll cycle then it's ~5ms).
So if you really want precision timing on your input, you might actually need to write your own version of SDL_WaitEventTimeout with a smaller SDL_Delay, and run that in a separate thread from your main game loop.
Further unfortunately, SDL_PumpEvents must be run on the thread that initialized the video subsystem (per https://wiki.libsdl.org/SDL_PumpEvents ), so the whole idea of running your input loop on another thread to get sub-frame timing is nixed by the SDL framework.
In conclusion, for SDL applications with animation there is no reason to use anything other than SDL_PollEvent. The best you can do for sub-frame input precision is this: if you have time to burn between frames, you have the option of being precise during that time, but then you'll get a render-duration window each frame where your input loses precision, so you end up with a different kind of inconsistency.
In general, you should use SDL_WaitEvent rather than SDL_PollEvent, so that the CPU is released to the operating system to handle other tasks, like processing user input. Hogging the CPU in a polling loop can manifest to your users as sluggish reaction to input, since it adds a delay between when they enter a command and when your application processes the event. By using SDL_WaitEvent instead, the OS can post events to your application more quickly, which improves the perceived performance.
As a side benefit, users on battery-powered systems, like laptops and portable devices, should see slightly lower battery usage, since the OS has the opportunity to reduce overall CPU usage: your game isn't using the CPU 100% of the time, only when an event actually occurs.
This is a very late response, I know. But this is the thread that tops a Google search on the topic, so it seems like the place to add an alternative suggestion for dealing with this that some might find useful.
You could write your code using SDL_WaitEvent, so that, when your application is not actively animating anything, it'll block and hand the CPU back to the OS.
But then you can send a user-defined message to the queue, from another thread (e.g. the game logic thread), to wake up the main rendering thread with that message. It then goes through the loop to render a frame, swaps buffers, and returns to SDL_WaitEvent again, where another of these user-defined messages may already be waiting to be picked up, telling it to loop once more.
This sort of structure might be good for an application (or game) where there's a "burst" of animation, but otherwise it's best for it to block and go idle (and save battery on laptops).
For example, a GUI where it animates when you open or close or move windows or hover over buttons, but it's otherwise static content most of the time.
(Or, for a game, though it's animating all the time in-game, it might not need to do that for the pause screen or the game menus. So, you could send the "SDL_ANIMATEEVENT" user-defined message during gameplay, but then, in the game menus and pause screen, just wait for mouse / keyboard events and actually allow the CPU to idle and cool down.)
Indeed, you could have self-triggering animation events. In that the rendering thread is woken up by a "SDL_ANIMATEEVENT" and then one more frame of animation is done. But because the animation is not complete, the rendering thread itself posts a "SDL_ANIMATEEVENT" to its own queue, that'll trigger it to wake up again, when it reaches SDL_WaitEvent.
And another idea there is that SDL events can carry data too. So you could supply, say, an animation ID in "data1" and a "current frame" counter in "data2" with the event. So that when the thread picks up the "SDL_ANIMATEEVENT", the event itself tells it which animation to do and what frame we're currently on.
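A small sketch of how such a user-defined event could be posted from another thread with SDL2; the "SDL_ANIMATEEVENT" name, the animation ID and the frame counter are made up for the example:

#include <SDL.h>
#include <cstdint>

static Uint32 SDL_ANIMATEEVENT = 0;   // assigned once at startup

// Call once after SDL_Init: reserves a user event type number.
void registerAnimateEvent()
{
    SDL_ANIMATEEVENT = SDL_RegisterEvents(1);
}

// Called from the logic/animation thread to wake up the rendering thread.
void postAnimateEvent(int animationId, int currentFrame)
{
    SDL_Event event;
    SDL_zero(event);
    event.type = SDL_ANIMATEEVENT;
    event.user.code = 0;
    event.user.data1 = reinterpret_cast<void*>(static_cast<intptr_t>(animationId));
    event.user.data2 = reinterpret_cast<void*>(static_cast<intptr_t>(currentFrame));
    SDL_PushEvent(&event);            // SDL_PushEvent is thread-safe
}

// In the main loop: SDL_WaitEvent(&event); if event.type == SDL_ANIMATEEVENT,
// render one frame of that animation and, if it isn't finished, post the next event.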
This is a "best of both worlds" solution, I feel. It can behave like SDL_WaitEvent or SDL_PollEvent at the application's discretion by just sending messages to itself.
For a game, this might not be worth it, as you're updating frames constantly, so there's no big advantage to this and maybe it's not worth bothering with (though even games could benefit from going to 0% CPU usage in the pause screen or in-game menus, to let the CPU cool down and use less laptop battery).
But for something like a GUI - which has more "burst-y" animation - then a mouse event can trigger an animation (e.g. opening a new window, which zooms or slides into view) that sends "SDL_ANIMATEEVENT" back to the queue. And it keeps doing that until the animation is complete, then falls back to normal SDL_WaitEvent behaviour again.
It's an idea that might fit what some people need, so I thought I'd float it here for general consumption.
You could actually initialise SDL and the window in the main thread and then create two more threads, one for updates (which just updates game state and variables as time passes) and one for rendering (which renders the surfaces accordingly).
Then, after all that is done, use SDL_WaitEvent in your main thread to manage SDL events. This way you can ensure that events are handled in the same thread that called SDL_Init.
I have been using this method for a long time to make my games work on Windows and Linux, and have been able to successfully run the three threads at the same time as mentioned above.
I had to use a mutex so that textures/surfaces can also be transformed/changed in the update thread, by pausing the render thread; the lock is only taken once every 60 frames, so it's not going to cause major performance issues.
This model works well for event-driven games, real-time games, or both.
I have a simulation that I am trying to convert to "real time". I say "real time" because its okay for performance to dip if needed (slowing down time for the observers/clients too). However, if there is a small number of objects, I want to limit the performance so that it runs at a steady frame rate (~100 FPS in this case).
I tried sleep() and Sleep() for Linux and Windows respectively, but they don't seem to be accurate enough, as the FPS really dips to a fraction of what I was aiming for. I suppose this scenario is common for games, especially online games, but I was not able to find any helpful material on the subject. What is the preferable way of frame limiting? Is there a sleep method that can guarantee it won't give up more time than was specified?
Note: I'm running this on 2 different clusters (linux and windows) and all nodes only have built-in video. So I have to implement limiting on both platforms and it shouldn't be video card based (if there is even such a thing). I also need to implement the limiting on just one thread/node because there is already synchronization between nodes and the others would automatically be limited if one thread is properly limited.
Edit: some pseudocode that shows how I implemented the current limiter:
while (ProcessControlMessages())
{
    uint64 tStart = _context.GetTimeMs64();
    SimulateFrame();
    uint64 newT = _context.GetTimeMs64();
    if (newT - tStart < DESIRED_FRAME_RATE_DURATION)
        this_thread::sleep_for(chrono::milliseconds(DESIRED_FRAME_RATE_DURATION - (newT - tStart)));
}
I was also thinking if I could do the limiting every N frames, where N is a fraction of the desired frame rate. I'll give it a try and report back.
For games, a frame limiter is usually inadequate. Instead, the methods that update the game state (in your case SimulateFrame()) are kept frame-rate independent. E.g. if you want to move an object, the actual offset is the object's speed multiplied by the last frame's duration. Similarly, you can do this for all kinds of calculations.
This approach has the advantage that the user gets the maximum frame rate while maintaining the real-timeness. However, you should watch out that the frame durations don't get too small (< 1 ms), as this could result in inaccurate calculations. In that case a small sleep with a fixed duration could help.
This is how games usually handle this problem. You have to check if your simulation is appropriate for this technique, too.
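A minimal sketch of that frame-rate-independent update, assuming a hypothetical object with position and velocity members and a simulationIsRunning flag:

#include <chrono>

auto lastFrame = std::chrono::steady_clock::now();

while (simulationIsRunning)
{
    auto now = std::chrono::steady_clock::now();
    double dt = std::chrono::duration<double>(now - lastFrame).count(); // seconds since the last frame
    lastFrame = now;

    // The offset scales with the frame duration, so the motion is the same at any frame rate.
    object.position += object.velocity * dt;

    render();
}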
Instead of having each frame try to sleep for long enough to be a full frame, have them sleep so that the frames average out. Keep a global/thread-owned time count: for each frame, compute a "desired earliest end time" from the previous desired earliest end time, rather than from the current time.
uint64 tGoalEndTime = _context.GetTimeMs64() + DESIRED_FRAME_RATE_DURATION;
while (ProcessControlMessages())
{
    SimulateFrame();
    uint64 end = _context.GetTimeMs64();
    if (end < tGoalEndTime) {
        this_thread::sleep_for(chrono::milliseconds(tGoalEndTime - end));
        tGoalEndTime += DESIRED_FRAME_RATE_DURATION;
    } else {
        tGoalEndTime = end; // we ran over; pretend we didn't and keep going
    }
}
Note: this uses your example's sleep_for because I wanted to show the minimum number of changes to enact it. sleep_until works better here.
The trick is that any frame that sleeps too long immediately causes the next few frames to rush to catch up.
Note: You cannot get any timing within 2ms (20% jitter on 100fps) on modern consumer OSs. The quantum for threads on most consumer OSs is around 100ms, so the instant you sleep, you may sleep for multiple quantums before it is your turn. sleep_until may use a OS specific technique to have less jitter, but you can't rely on it.
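For completeness, here is the same averaging loop with sleep_until instead of sleep_for, using std::chrono directly rather than the millisecond counter from the example above (FRAME_DURATION is assumed to be the 10 ms step for ~100 FPS):

#include <chrono>
#include <thread>

auto const FRAME_DURATION = std::chrono::milliseconds(10);   // ~100 FPS
auto goalEndTime = std::chrono::steady_clock::now() + FRAME_DURATION;

while (ProcessControlMessages())
{
    SimulateFrame();
    auto now = std::chrono::steady_clock::now();
    if (now < goalEndTime) {
        std::this_thread::sleep_until(goalEndTime);   // sleep to the absolute deadline
        goalEndTime += FRAME_DURATION;
    } else {
        goalEndTime = now;   // we ran over; reset the goal and keep going
    }
}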
I have two programs that are supposed to do the same thing with slight differences. Both have infinite game loops that run forever unless the user stops the game somehow. One program's game loop is fully implemented and renders something; the other's game loop is empty and does nothing (it just listens for the user to stop it).
When i opened the task manager to see resource usage, i have discovered that the program with the empty loop uses 14% CPU and the program that actually draws something to screen uses about 1-2%.
My guess on the subject is as follows:
I compared the code of both programs and looked for differences, and there were not many. Then it occurred to me that the loop that renders to the screen might be bound by other factors (like sending pixels to the screen, or the refresh rate?), so after the CPU does its part, the thread is put to sleep until the other work is completed. But since the other program does pretty much nothing, and doing nothing is really easy, the CPU never puts that thread to sleep and just keeps going. I lack the knowledge to confirm whether this is the reason, so I am asking you: is this why it happens? (Bonus question) And if so, why does the CPU usage stop at about 14% instead of going all the way up to 100%?
Thank you.
Hard to say for certain without seeing the code, but drawing to the screen will inevitably involve some waiting on I/O; how much depends on many factors, including sync and buffering options.
As for the 14% CPU usage: I'm guessing that your machine has 8 processing units (either cores, or cores * hyperthreading) and your code is single-threaded, i.e. it is maxing out one processing unit.