How to fix execution speed inconsistencies in C++

This is most noticeable on graphic files. Let's take as an example the OpenGL base program (a spinning triangle).
Whenever I run one normally, with no other apps open in the background, it will spin slowly, but when I run a game in the background, it starts spinning like mad. It seems as if the computer doesn't allocate enough memory for the programs to run at maximum speed, and paradoxically, doing resource-consuming stuff will accelerate it because it gets more memory.
The only way I have found to partially fix this is to pass a higher value to the Sleep function; however, this doesn't fix it completely, nor is it a consistent solution, as other problems may arise from it. Is there any good way to fix this and make the program run consistently?

This mostly happens because you are not capping your FPS, so nothing prevents your render loop from running as often as possible, and your logic (which controls the rotation) executes in the same loop.
Most GPUs have power management, so they keep their clock frequencies low when there is no demand. Opening a demanding game makes the GPU raise its clocks, which renders much faster and therefore runs your loop (and your rotation logic) many more times per second.
To prevent this (and to decouple your logic from rendering time in general) you must control the frame rate and use the elapsed time as an input to your rotation, something like this (sketched here with std::chrono):
#include <chrono>
#include <thread>
using namespace std::chrono;

const auto TIME_PER_FRAME = milliseconds(16);            // target ~60 FPS
auto last = steady_clock::now();
while (!exit) {
    render();
    auto delta = steady_clock::now() - last;             // time this frame actually took
    if (delta < TIME_PER_FRAME)
        std::this_thread::sleep_for(TIME_PER_FRAME - delta);
    last = steady_clock::now();
    updateLogic(delta);                                   // advance the rotation by elapsed time, not per call
}

For starters, you need to understand what's going on in your program. This has nothing to do with memory; there is no reason to think about memory allocation here.
Opening other programs could make your CPU clock up because of the extra load (doubtful, but clearly more likely than a memory-allocation effect).
The other programs could also be changing some setting.
If you're using sleep(), signals can interrupt the call (nobody ever looks at the return value of that function; there's a reason its signature is unsigned sleep(unsigned) and not void sleep(unsigned)).
If you can, don't use sleep. And if you do use it, check the result: sleep doesn't guarantee that the whole time has passed (bad design in my opinion, but I'm not a POSIX fan).
The usual approach would be to have your function called periodically as a callback. If you're going to do some sort of delay or sleep yourself, you should verify that the required time has actually passed.
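For example, POSIX sleep() returns the number of unslept seconds when it is interrupted, so a careful caller loops on it (a minimal sketch):

#include <unistd.h>

// sleep() returns 0 when the full interval elapsed, or the number of
// seconds remaining if it was interrupted by a signal.
unsigned remaining = 2;
while ((remaining = sleep(remaining)) > 0)
    ;   // interrupted early: go back to sleep for what is left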
Assuming you want your logic tied to the rendering and you use some function like sleep that can be interrupted (building on the other answer):
while (!exit) {
    auto startOfFrame = now();
    render();
    auto toDelay = startOfFrame + TIME_PER_FRAME - now();
    // delay() may return early (e.g. when interrupted by a signal),
    // so keep delaying until the whole frame time has really passed
    while (toDelay > 0) {
        delay(toDelay);
        toDelay = startOfFrame + TIME_PER_FRAME - now();
    }
    updateLogic();
}

Related

C++ Run only for a certain time

I'm writing a little game in c++ atm.
My game while loop is always active; inside this loop I have a condition for when the player is shooting.
Now I face the following problem:
after every shot fired there is a delay, this delay changes over time, and during the delay the player should still be able to move:
shoot
move
wait 700 ms
shoot again
At the moment I'm using Sleep(700); the problem is that I can't move during those 700 ms. I need something like a timer, so that movement keeps running during the 700 ms instead of the whole game waiting for 700 ms.
This depends on how your hypothetical 'sleep' is implemented. There are a few things you should know, as this can be solved in a few ways.
You don't want to put your thread to sleep, because then everything halts, which is not what you want.
Also, sleep may give you more time than you asked for: if you sleep for 700 ms you may get more than that, which will burn you if you depend on accurate timing.
1) The first way would be to record the raw time inside the player. This is not the best approach, but it works for a simple toy program: store the result of std::chrono::high_resolution_clock::now() (from #include <chrono>) inside the class at the moment you fire. To check whether you can fire again, compare the stored value against ...::now() and see whether 700 ms have elapsed. You will have to read the documentation to work with it in milliseconds.
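A minimal sketch of that idea (the Player type and member names here are hypothetical, not from your code):

#include <chrono>

struct Player {
    std::chrono::high_resolution_clock::time_point lastShot{};  // time of the last shot

    bool canShoot() const {
        auto elapsed = std::chrono::high_resolution_clock::now() - lastShot;
        return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count() >= 700;
    }

    void shoot() {
        if (!canShoot())
            return;                    // still on cooldown; movement etc. keeps running
        lastShot = std::chrono::high_resolution_clock::now();
        // ... spawn the projectile here ...
    }
};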
2) A better way would be to give your game a pulse via something called 'game ticks', which is the pulse to which your world moves forward. Then you can store the gametick that you fired on and do something similar to the above paragraph (except now you are just checking if currentGametick > lastFiredGametick + gametickUntilFiring).
For the gametick idea, you would make sure you do gametick++ every X milliseconds, and then run your world. A common value is somewhere between 10ms and 50ms.
Your game loop would then look like this (a sketch of the ticker itself follows after the list of advantages):
while (!exit) {
    readInput();
    if (ticker.shouldTick()) {
        ticker.tick();
        world.tick(ticker.gametick);
    }
    render();
}
The above has the following advantages:
You only update the world every gametick
You keep rendering between gameticks, so you can have smooth animations since you will be rendering at a very high framerate
If you want to halt, just spin in a while loop until the amount of time has elapsed
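For reference, here is one way the hypothetical ticker used in that loop could be sketched with std::chrono (shouldTick/tick/gametick simply mirror the names above; adjust the tick length to taste):

#include <chrono>

struct Ticker {
    using clock = std::chrono::steady_clock;
    static constexpr int TICK_MS = 20;              // 50 gameticks per second

    clock::time_point nextTick = clock::now();
    unsigned long gametick = 0;

    bool shouldTick() const { return clock::now() >= nextTick; }

    void tick() {
        ++gametick;
        nextTick += std::chrono::milliseconds(TICK_MS);
    }
};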
Now, this glosses over a significant amount of discussion, which you should definitely read up on if you are thinking of going the gametick route.
Whatever route you take, you should probably read more about game loop timing.

Effective frame limiting

I have a simulation that I am trying to convert to "real time". I say "real time" because it's okay for performance to dip if needed (slowing down time for the observers/clients too). However, if there is a small number of objects, I want to limit the performance so that it runs at a steady frame rate (~100 FPS in this case).
I tried sleep() and Sleep() for Linux and Windows respectively, but they don't seem to be accurate enough, as the FPS really dips to a fraction of what I was aiming for. I suppose this scenario is common for games, especially online games, but I was not able to find any helpful material on the subject. What is the preferable way of frame limiting? Is there a sleep method that can guarantee that it won't give up more time than what was specified?
Note: I'm running this on 2 different clusters (Linux and Windows) and all nodes only have built-in video. So I have to implement limiting on both platforms and it shouldn't be video card based (if there is even such a thing). I also need to implement the limiting on just one thread/node, because there is already synchronization between nodes and the others would automatically be limited if one thread is properly limited.
Edit: some pseudo code that shows how I implemented the current limiter:
while (ProcessControlMessages())
{
    uint64 tStart = _context.GetTimeMs64();
    SimulateFrame();
    uint64 newT = _context.GetTimeMs64();
    if (newT - tStart < DESIRED_FRAME_RATE_DURATION)
        this_thread::sleep_for(chrono::milliseconds(DESIRED_FRAME_RATE_DURATION - (newT - tStart)));
}
I was also thinking if I could do the limiting every N frames, where N is a fraction of the desired frame rate. I'll give it a try and report back.
For games a frame limiter is usually inadequate. Instead, the methods that update the game state (in your case SimulateFrame()) are kept frame-rate independent. E.g. if you want to move an object, the actual offset is the object's speed multiplied by the last frame's duration. Similarly, you can do this for all kinds of calculations.
This approach has the advantage that the user gets maximum frame rate while maintaining the real-timeness. However, you should watch out that the frame durations don't get too small ( < 1 ms). This could result in inaccurate calculations. In this case a small sleep with a fixed duration could help.
This is how games usually handle this problem. You have to check if your simulation is appropriate for this technique, too.
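As a minimal illustration of frame-rate-independent movement (the Object type here is hypothetical):

struct Object { double position = 0.0; double velocity = 5.0; };  // velocity in units per second

// The per-frame offset is speed * frame duration, so the object covers the
// same distance per second whether the simulation runs at 30 or 300 FPS.
void updateObject(Object& obj, double dtSeconds)
{
    obj.position += obj.velocity * dtSeconds;
}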
Instead of having each frame try to sleep for long enough to make up a full frame on its own, have them sleep so that they average out. Keep a global (or thread-owned) time value: each frame has a "desired earliest end time," calculated from the previous desired earliest end time rather than from the current time:
uint64 tGoalEndTime = _context.GetTimeMs64() + DESIRED_FRAME_RATE_DURATION;
while (ProcessControlMessages())
{
    SimulateFrame();
    uint64 end = _context.GetTimeMs64();
    if (end < tGoalEndTime) {
        this_thread::sleep_for(chrono::milliseconds(tGoalEndTime - end));
        tGoalEndTime += DESIRED_FRAME_RATE_DURATION;
    } else {
        tGoalEndTime = end; // we ran over, pretend we didn't and keep going
    }
}
Note: this uses your example's sleep_for because I wanted to show the minimum number of changes to enact it. sleep_until works better here.
The trick is that any frame that sleeps too long immediately causes the next few frames to rush to catch up.
Note: you cannot reliably get timing within 2 ms (20% jitter at 100 FPS) on modern consumer OSs. The scheduler time slice on most consumer OSs is on the order of tens of milliseconds, so the instant you sleep, you may sit out several time slices before it is your turn again. sleep_until may use an OS-specific technique to achieve less jitter, but you can't rely on it.
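For completeness, a sleep_until-based variant of the same idea might look like this (a sketch; FRAME_DURATION is an assumed constant here, and ProcessControlMessages/SimulateFrame come from your pseudo code):

#include <chrono>
#include <thread>

using clock_type = std::chrono::steady_clock;
const auto FRAME_DURATION = std::chrono::milliseconds(10);   // ~100 FPS

auto nextFrame = clock_type::now() + FRAME_DURATION;
while (ProcessControlMessages())
{
    SimulateFrame();
    std::this_thread::sleep_until(nextFrame);                // returns immediately if we are already late
    nextFrame += FRAME_DURATION;
    if (nextFrame < clock_type::now())                       // fell far behind: resynchronize
        nextFrame = clock_type::now() + FRAME_DURATION;
}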

Why is an empty while loop using more CPU?

I have two programs that are supposed to do the same thing, with slight differences. Both have infinite game loops that run forever unless the user stops the game somehow. One program's game loop is fully implemented and renders something; the other game loop is empty and does nothing (it just listens for the user to stop it).
When I opened the task manager to see resource usage, I discovered that the program with the empty loop uses 14% CPU while the program that actually draws something to the screen uses about 1-2%.
My guess on the subject is as follows:
I compared the code of both programs and looked for differences, and there were not many. Then it occurred to me that the loop that renders to the screen might be bound by other factors (like sending pixels to the screen, or the refresh rate), so after the CPU does its part, it puts that thread to sleep until the other work is completed. But since the other program does pretty much nothing, and doing nothing is really easy, the CPU never puts that thread to sleep and just keeps going. I lack the knowledge to confirm whether this is the reason, so I am asking you: is this the reason this is happening? (Bonus question) And if so, why does the CPU stop at about 14% and not go all the way up to 100%?
Thank you.
Hard to say for certain without seeing the code, but drawing to the screen will inevitably involve some wait on IO; how much depends on many factors, including sync and buffering options.
As for the 14% CPU usage: I'm guessing that your machine has 8 processing units (either cores, or cores * hyperthreading) and your code is single-threaded, i.e. it is maxing out one processing unit.
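To see the difference, compare a busy loop with one that actually blocks (quitRequested() here is just a hypothetical stand-in for your exit check):

#include <chrono>
#include <thread>

// Busy loop: the thread never blocks, so it maxes out one processing unit.
while (!quitRequested()) { }

// Sleeping loop: the thread blocks between checks, so the core can idle.
while (!quitRequested()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
}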

Windows Sleep inconsistency?

Having a bit of an issue with a game I'm making using OpenGL. The game will sometimes run at half speed and sometimes it will run normally.
I don't think it is the OpenGL causing the problem, since it runs at literally 14,000 FPS on my computer (even when it's running at half speed).
This has led me to believe that it is the "game timer" that's causing the problem. The game timer runs on a separate thread and is programmed to pause at the end of its "loop" with a Sleep(5) call. If I remove the Sleep(5) call, it runs so fast that I can barely see the sprites on the screen (predictable behavior).
I tried throwing a Sleep(16) at the end of the Render() thread (also on its own thread). This should limit the FPS to around 62. Remember that the app sometimes runs at its intended speed and sometimes at half speed (I have tried on both of the computers that I own and it persists).
When it runs at its intended speed, the FPS is 62 (good) and sometimes around 31 (bad). It never switches between half speed and full speed mid-execution, and the problem persists even after a reboot.
So it's not the rendering that's causing the slowness, it's the Sleep() function.
I guess what I'm saying is that the Sleep() function is inconsistent in the times that it actually sleeps. Is this a known issue? Is there a better Sleep() function that I could use?
A waitable timer (CreateWaitableTimer and WaitForSingleObject or friends) is much better for periodic wakeup.
However, in your case you probably should just enable VSYNC.
See the following discussion of the Sleep function, focusing on the bit about scheduling priorities:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686298(v=vs.85).aspx
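A rough sketch of the waitable-timer approach (the 16 ms period and the running/updateGameTimer names are just placeholders):

#include <windows.h>

HANDLE timer = CreateWaitableTimer(NULL, FALSE, NULL);        // auto-reset timer
LARGE_INTEGER dueTime;
dueTime.QuadPart = -160000LL;                                 // first wakeup in 16 ms (100 ns units, negative = relative)
SetWaitableTimer(timer, &dueTime, 16, NULL, NULL, FALSE);     // then fire every 16 ms

while (running) {
    WaitForSingleObject(timer, INFINITE);                     // wakes up once per period
    updateGameTimer();
}
CloseHandle(timer);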
Yes, the Sleep function is inconsistent; it is only useful when coarse timing is acceptable.
If you want consistent timing, use QueryPerformanceFrequency to get the frequency of the high-resolution counter, call QueryPerformanceCounter twice (at the start and at the end), and then (end - start) / frequency gives you the elapsed time. Note that on some older multi-core systems the start and end readings might come from different cores, so you may want to use SetThreadAffinityMask to keep your working thread on the same core.
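For example, timing a piece of work with the performance counter (runFrame() is just a placeholder for whatever you are measuring):

#include <windows.h>

LARGE_INTEGER frequency, start, end;
QueryPerformanceFrequency(&frequency);    // counts per second of the performance counter
QueryPerformanceCounter(&start);
runFrame();                               // the work being timed
QueryPerformanceCounter(&end);
double elapsedSeconds = double(end.QuadPart - start.QuadPart) / double(frequency.QuadPart);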
I had the same problem, so I made my own sleep logic (a busy-wait), and it worked for me:
#include <chrono>
using namespace std::chrono;

// Busy-wait until must_sleep_duration seconds have passed; accurate, but burns a full core while waiting.
high_resolution_clock::time_point sleep_start_time = high_resolution_clock::now();
while (duration_cast<duration<double>>(high_resolution_clock::now() - sleep_start_time).count() < must_sleep_duration) {}

Achieving game engine determinism with threading

I would like to achieve determinism in my game engine, in order to be able to save and replay input sequences and to make networking easier.
My engine currently uses a variable timestep: every frame I calculate the time it took to update/draw the last one and pass it to my entities' update method. This makes 1000 FPS games seem as fast as 30 FPS games, but introduces non-deterministic behavior.
A solution could be fixing the game to 60FPS, but it would make input more delayed and wouldn't get the benefits of higher framerates.
So I've tried using a thread (which constantly calls update(1) then sleeps for 16ms) and draw as fast as possible in the game loop. It kind of works, but it crashes often and my games become unplayable.
Is there a way to implement threading in my game loop to achieve determinism without having to rewrite all games that depend on the engine?
You should separate game frames from graphical frames. The graphical frames should only display the graphics, nothing else. For the replay it won't matter how many graphical frames your computer was able to execute, be it 30 per second or 1000 per second, the replaying computer will likely replay it with a different graphical frame rate.
But you should indeed fix the gameframes. E.g. to 100 gameframes per second. In the gameframe the game logic is executed: stuff that is relevant for your game (and the replay).
Your gameloop should execute graphical frames whenever there is no game frame necessary, so if you fix your game to 100 gameframes per second that's 0.01 seconds per gameframe. If your computer only needed 0.001 to execute that logic in the gameframe, the other 0.009 seconds are left for repeating graphical frames.
This is a small but incomplete and not 100% accurate example:
uint16_t const GAME_FRAMERATE = 100;
uint16_t const SKIP_TICKS = 1000 / GAME_FRAMERATE;

Timer sinceLoopStarted = Timer(); // Millisecond timer starting at 0
unsigned long next_game_tick = sinceLoopStarted.getMilliseconds();

while (gameIsRunning)
{
    //! Game Frames
    while (sinceLoopStarted.getMilliseconds() > next_game_tick)
    {
        executeGamelogic();
        next_game_tick += SKIP_TICKS;
    }

    //! Graphical Frames
    render();
}
The following link contains very good and complete information about creating an accurate gameloop:
http://www.koonsolo.com/news/dewitters-gameloop/
To be deterministic across a network, you need a single point of truth, commonly called "the server". There is a saying in the game community that goes "the client is in the hands of the enemy". That's true. You cannot trust anything that is calculated on the client for a fair game.
If, for example, your game gets easier when for some reason your thread only updates 59 times a second instead of 60, people will find out. Maybe at the start they won't even be malicious; they just had their machines under full load at the time and your process didn't get to run 60 times a second.
Once you have a server (maybe even in-process, as a thread, in single player) that does not care about graphics or update cycles and runs at its own speed, it is deterministic enough to at least get the same results for all players. It might still not be 100% deterministic, because the computer is not real-time: even if you tell it to update every $frequency, it might not, due to other processes on the computer taking too much load.
The server and clients need to communicate, so the server needs to send a copy of its state (for performance, maybe a delta from the last copy) to each client. The client can draw this copy at the best speed available.
If your game is crashing with the thread, it may be an option to actually put "the server" out of process and communicate via the network. This way you will find out pretty quickly which variables would have needed locks, because once you move them to another project, your client will no longer compile.
Separate game logic and graphics into different threads. The game logic thread should run at a constant speed (say, it updates 60 times per second, or even higher if your logic isn't too complicated, to achieve smoother gameplay). Then your graphics thread should always draw the latest info provided by the logic thread, as fast as possible, to achieve high framerates.
In order to prevent partial data from being drawn, you should probably use some sort of double buffering, where the logic thread writes to one buffer, and the graphics thread reads from the other. Then switch the buffers every time the logic thread has done one update.
This should make sure you're always using the computer's graphics hardware to its fullest. Of course, this does mean you're putting constraints on the minimum CPU speed.
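A minimal sketch of such a double buffer (GameState, publishFrame and snapshotForRender are hypothetical names; a mutex guards the buffer switch):

#include <mutex>

struct GameState { /* positions, animations, ... */ };

GameState buffers[2];
int readIndex = 0;                 // buffer the graphics thread currently reads
std::mutex swapMutex;

// Logic thread: fill the back buffer, then publish it by flipping the index.
void publishFrame(const GameState& newState) {
    int writeIndex = 1 - readIndex;          // only the logic thread changes readIndex
    buffers[writeIndex] = newState;
    std::lock_guard<std::mutex> lock(swapMutex);
    readIndex = writeIndex;
}

// Graphics thread: take a consistent snapshot of the front buffer and draw it.
GameState snapshotForRender() {
    std::lock_guard<std::mutex> lock(swapMutex);
    return buffers[readIndex];
}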
I don't know if this will help but, if I remember correctly, Doom stored your input sequences and used them to generate the AI behaviour and some other things. A demo lump in Doom would be a series of numbers representing not the state of the game, but your input. From that input the game would be able to reconstruct what happened and, thus, achieve some kind of determinism ... Though I remember it going out of sync sometimes.