I use clock_gettime() on Linux and QueryPerformanceCounter() on Windows to measure time, and while doing so I encountered an interesting case.
First, I calculate DeltaTime inside an infinite while loop that calls some update functions. Because the update functions are still empty, the program busy-waits for 40 milliseconds inside an Update function so that there is something for DeltaTime to measure.
In the program compiled as Win64-Debug, the DeltaTime I measure is approximately 0.040f, and it stays that way for as long as the program runs (Win64-Release behaves the same). It works correctly.
But in the program compiled as Linux64-Debug or Linux64-Release, there is a problem.
When the program starts running, everything is normal: DeltaTime is approximately 0.040f. But after a while, a DeltaTime of 0.12XXf or 0.132XXf is calculated, and immediately afterwards it is 0.040f again. And so on.
I thought I was using QueryPerformanceCounter() correctly and clock_gettime() incorrectly, so I tried the standard library's std::chrono::high_resolution_clock instead, but the result is the same. No change.
#include <chrono>
#include <cstdint>
#include <cstdio>

#define MICROSECONDS (1000*1000)

auto prev_time = std::chrono::high_resolution_clock::now();
decltype(prev_time) current_time;

while (1)
{
    current_time = std::chrono::high_resolution_clock::now();
    int64_t deltaTime = std::chrono::duration_cast<std::chrono::microseconds>(current_time - prev_time).count();
    printf("DeltaTime: %f\n", deltaTime / (float)MICROSECONDS);
    NetworkManager::instance().Update();
    prev_time = current_time;
}
void NetworkManager::Update()
{
    auto start = std::chrono::high_resolution_clock::now();
    decltype(start) end;
    while (1)
    {
        end = std::chrono::high_resolution_clock::now();
        int64_t y = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
        if (y / (float)MICROSECONDS >= 0.040f)  // busy-wait for 40 ms
            break;
    }
    return;
}
(Screenshots omitted: one labeled "Normal" showing the expected output, one labeled "Problem" showing the occasional 0.12XXf spikes.)
Possible causes:
Your clock_gettime() is not using the vDSO and is a real system call instead; this will be visible if the program is run under strace, and it can be configured on modern kernel versions.
Your thread gets preempted (taken off the CPU by the scheduler). To run a clean experiment, run your app with real-time priority and pinned to a specific CPU core; a sketch of that setup follows below.
Also, I would disable CPU frequency scaling when experimenting.
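As an illustration only, here is a minimal sketch of that setup, assuming Linux and a build with -pthread; the core index 0 and the SCHED_FIFO priority 10 are arbitrary example values, and raising the priority usually requires root or CAP_SYS_NICE:
#define _GNU_SOURCE 1     // for CPU_SET and pthread_setaffinity_np
#include <pthread.h>
#include <sched.h>
#include <cstdio>

int main()
{
    // Pin the calling thread to core 0 so it is not migrated between cores.
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        perror("pthread_setaffinity_np");

    // Request real-time (SCHED_FIFO) scheduling so ordinary processes cannot preempt the thread.
    sched_param sp{};
    sp.sched_priority = 10;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");   // typically fails without root/CAP_SYS_NICE

    // ... run the timing loop from the question here ...
    return 0;
}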
I want to measure the CPU time of a thread, not the elapsed time. For example, if a thread is waiting or sleeping, that shouldn't count as CPU time because the thread is not in a runnable state. So I found the following link about getting CPU time. However, based on my test below it seems to be capturing the elapsed time instead (I expect cpu_time_used to be close to 0, but it is actually 2). What am I missing?
https://www.gnu.org/software/libc/manual/html_node/CPU-Time.html
#include <time.h>
#include <chrono>
#include <thread>

clock_t start, end;
double cpu_time_used;

start = clock();
std::this_thread::sleep_for(std::chrono::seconds(2));   // sleeping, not computing
end = clock();
cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
Note that clock() measures in units of core time, i.e. CLOCKS_PER_SEC clock() ticks represent one second of computation on one processor, so it indicates the amount of time used by the process across all threads. So if there is another thread running during the sleep, it will still increase the clock count; with two threads in total, the count will indicate that 2 seconds have elapsed, as you have shown. If the sleeping thread is the only thread, you get a small amount of time, as @NateEldredge reports. On Linux you can instead query or set up a timer on CLOCK_THREAD_CPUTIME_ID, as @KamilCuk said.
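For illustration, here is a rough sketch (Linux-specific, not from the original post) of reading that per-thread clock with clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...); the two-second sleep mirrors the test above, and the printed CPU time should come out near zero:
#include <time.h>
#include <chrono>
#include <thread>
#include <cstdio>

int main()
{
    timespec a{}, b{};
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &a);
    std::this_thread::sleep_for(std::chrono::seconds(2));   // sleeping, not computing
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &b);

    double cpu_secs = (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    printf("thread CPU time: %f s\n", cpu_secs);            // expected to be close to 0
    return 0;
}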
I am currently coding a project that requires precise delay times across a number of computers. This is the code I am using; I found it on a forum.
{
    // 'ms' is the desired delay in milliseconds
    LONGLONG timerResolution;
    LONGLONG wantedTime;
    LONGLONG currentTime;

    QueryPerformanceFrequency((LARGE_INTEGER*)&timerResolution);
    timerResolution /= 1000;                       // ticks per millisecond

    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    wantedTime = currentTime / timerResolution + ms;

    currentTime = 0;
    while (currentTime < wantedTime)               // busy-wait until the deadline
    {
        QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
        currentTime /= timerResolution;
    }
}
Basically, the issue I am having is that this uses a lot of CPU, around 16-20%, when I start calling the function. The usual Sleep() uses zero CPU, but it is extremely inaccurate; from what I have read on multiple forums, that is the trade-off between accuracy and CPU usage. Still, I thought I had better ask before settling on this sleep method.
The reason it's using 15-20% CPU is likely that it's using 100% of one core, as there is nothing in the loop to slow it down.
In general this is a "hard" problem to solve, as PCs (more specifically, the OSes running on them) are generally not made for running real-time applications. If that is absolutely required, you should look into real-time kernels and OSes.
For this reason, the guarantee usually made about sleep times is that the system will sleep for at least the specified amount of time.
If you are running Linux you could try using nanosleep (http://man7.org/linux/man-pages/man2/nanosleep.2.html), though I don't have any experience with it.
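For reference, a minimal, untested sketch of what such a nanosleep call might look like (the 10 ms value is just an example):
#include <time.h>
#include <cerrno>

int main()
{
    timespec req{};
    req.tv_sec  = 0;
    req.tv_nsec = 10 * 1000 * 1000;   // 10 ms
    timespec rem{};

    // nanosleep may return early if interrupted by a signal; in that case
    // 'rem' holds the remaining time, so resume sleeping with it.
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;
    return 0;
}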
Alternatively you could go with a hybrid approach where you use sleeps for long delays, but switch to polling when it's almost time:
#include <thread>
#include <chrono>
using namespace std::chrono_literals;
...
wantedTime = currentTime / timerResolution + ms;
currentTime = 0;
while (currentTime < wantedTime)
{
    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    currentTime /= timerResolution;

    if (wantedTime - currentTime > 100) // if more than 100 ms of waiting remain
    {
        // Sleep for a value well below those 100 ms, to ensure we don't "oversleep"
        std::this_thread::sleep_for(50ms);
    }
}
Now this is a bit race-condition prone, as it assumes the OS will hand control back to the program within 50 ms after the sleep_for is done. To further combat this you could turn the sleep down (to, say, 1 ms).
You can set the Windows timer resolution to its minimum (usually 1 ms) to make Sleep() accurate to about 1 ms. By default it is only accurate to about 15 ms; see the Sleep() documentation.
Note that your execution can be delayed if other programs are consuming CPU time, but this could also happen if you were waiting with a timer.
#include <windows.h>
#include <timeapi.h>      // link against winmm.lib

// Sleep() takes ~15 ms (or whatever the default resolution is)
Sleep(1);

TIMECAPS caps_;
timeGetDevCaps(&caps_, sizeof(caps_));

timeBeginPeriod(caps_.wPeriodMin);
// Sleep() now takes ~1 ms
Sleep(1);
timeEndPeriod(caps_.wPeriodMin);
I was doing a microbenchmark. My code looks something like this:
while (some_condition) {
    struct timespec tps, tpe;

    clock_gettime(CLOCK_REALTIME, &tps);
    encrypt_data(some_data);
    clock_gettime(CLOCK_REALTIME, &tpe);

    // Note: only the nanosecond fields are subtracted, which assumes the call
    // takes well under a second and no second boundary is crossed.
    long time_diff = tpe.tv_nsec - tps.tv_nsec;

    usleep(1000);
}
However, the sleep time that I pass to usleep() actually affects the observed time_diff. Measuring with the skeleton above, the time I get varies from ~1.8 µs to ~7 µs for sleep times of 100 µs and 1000 µs respectively. Why would the measured time change with the sleep time, when the sleep is outside the instrumented block?
The reported times are averages over multiple runs. I am running this code on Ubuntu 14.04; for encryption I am using AES-GCM from OpenSSL.
I know that this is not the best way to microbenchmark, but that is not the problem here.
Did you disable CPU frequency scaling?
sudo cpupower frequency-set --governor performance
I'm currently making a small console game. At the end of the game loop there is another loop that doesn't exit until 1/100 s after the iteration's start time.
Of course that uses up a lot of CPU, so I placed
Sleep(1);
at the end to solve it. I thought everything was right until I ran the game on a 2005 XP laptop... and it was really slow.
When I removed the Sleep command, the game worked perfectly on both computers, but now I have the CPU usage problem.
Does anyone have a good solution for this?
So I found out that the problem was the sleep granularity on Windows NT (2000, XP, 2003), which was around 15 ms. If anyone else struggles with this type of problem, here's how to solve it:
timeBeginPeriod(1); //from windows.h
Call it once at the beginning of the main() function. This affects a few things, including Sleep(), so that it actually sleeps for close to an exact millisecond.
timeEndPeriod(1); //on exit
Of course, I had been developing the game on Windows 7 the whole time and thought everything was fine, so apparently Windows 6.0+ no longer has this problem... but it's still useful to know, considering that a lot of people still use XP.
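For completeness, here is a condensed sketch of that fix in context; this is my paraphrase rather than the original poster's code, and it assumes MSVC on Windows with winmm.lib linked in:
#include <windows.h>
#pragma comment(lib, "winmm.lib")   // timeBeginPeriod/timeEndPeriod live in winmm

int main()
{
    timeBeginPeriod(1);             // raise the system timer resolution to 1 ms

    bool running = true;
    while (running)
    {
        // ... update and render the game ...
        Sleep(1);                   // now sleeps close to 1 ms instead of ~15 ms
        running = false;            // placeholder so this sketch terminates
    }

    timeEndPeriod(1);               // restore the previous resolution on exit
    return 0;
}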
You should use std::this_thread::sleep_for from the <thread> header for this, along with the std::chrono facilities. Maybe something like this:
while (...)
{
    auto begin = std::chrono::steady_clock::now();
    // your code
    auto end = std::chrono::steady_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - begin);
    std::this_thread::sleep_for(std::chrono::milliseconds(10) - duration);
}
If your code doesn't consume much time during one iteration, or if each iteration takes a constant amount of time, you can skip the measuring and just put in some constant:
std::this_thread::sleep_for(std::chrono::milliseconds(8));
Sounds like the older laptop just takes more time to do all your processing, and then sleeps for 1 millisecond on top of that.
You should include a library that tells time:
get the current time at the start of the program / start of the loop, then at the end of the loop / program compare the difference between the starting time and the current time against the amount of time you want. If it's lower than the amount of time you want (let's say 8 milliseconds), tell it to sleep for minimumTime - (currentTime - recordedTime), where recordedTime is the variable you set at the start of the loop.
I've done this for my own game in SDL2. SDL_GetTicks() returns the number of milliseconds the program has been running, and "frametime" is the time recorded at the start of the main game loop. This is how I keep my game running at a maximum of 60 fps. This if statement should be adapted and placed at the bottom of your game loop.
if( SDL_GetTicks() - frametime < MINFRAMETIME )
{
SDL_Delay( MINFRAMETIME - ( SDL_GetTicks() - frametime ) );
}
I think the standard library equivalent would be something like the following, using std::chrono and std::this_thread::sleep_for (clock() is not a good fit here, since it measures CPU time rather than wall time, and sleep() takes whole seconds):
// lastCheck is a std::chrono::steady_clock::time_point taken at the start of the loop,
// and MIN_TIME is a std::chrono::milliseconds constant.
if( std::chrono::steady_clock::now() - lastCheck < MIN_TIME )
{
    std::this_thread::sleep_for( MIN_TIME - ( std::chrono::steady_clock::now() - lastCheck ) );
}
I am trying to measure a multi-threaded program's execution time. I've used this piece of code in the main program to calculate the time:
clock_t startTime = clock();
//do stuff
clock_t stopTime = clock();
float secsElapsed = (float)(stopTime - startTime)/CLOCKS_PER_SEC;
Now the problem I have is this: for example, when I run my program with 4 threads (each thread running on one core), the measured execution time is 21.39, but the system monitor shows at run time that the execution takes only about 5.3.
It seems that the actual execution time is multiplied by the number of threads.
What is the problem?
It is because you are measuring CPU time, which is the accumulated time the CPU spends executing your code, not wall time, which is the real-world time elapsed between your startTime and stopTime.
Indeed, clock() "returns the processor time consumed by the program."
If you do the maths: 5.3 * 4 = 21.2, which is roughly what you obtain, meaning that you have good multithreaded code with a speedup of about 4.
So to measure the wall time, you should instead use std::chrono::high_resolution_clock, for instance, and you should get back about 5.3. You can also use the classic gettimeofday().
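For instance, a minimal sketch of wall-clock timing with std::chrono (the names mirror the question's snippet; this is an illustration, not the asker's code):
#include <chrono>
#include <cstdio>

int main()
{
    auto startTime = std::chrono::high_resolution_clock::now();
    // do stuff (the multithreaded work)
    auto stopTime = std::chrono::high_resolution_clock::now();

    double secsElapsed = std::chrono::duration<double>(stopTime - startTime).count();
    printf("wall time: %f s\n", secsElapsed);   // ~5.3 in the scenario above
    return 0;
}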
If you use OpenMP for multithreading, you also have omp_get_wtime():
double startTime = omp_get_wtime();
// do stuff
double stopTime = omp_get_wtime();
double secsElapsed = stopTime - startTime; // that's all !