Need a better wait solution - c++

Recently I have been writing a program in C++ that pings three different websites and then, depending on pass or fail, waits either 5 minutes or 30 seconds before it tries again.
Currently I am using the ctime library and the following function to do my waiting. However, according to my CPU meter this is an unacceptable solution.
void wait (int seconds)
{
    clock_t endwait;
    endwait = clock () + seconds * CLOCKS_PER_SEC;
    while (clock () < endwait) {}
}
The reason this solution is unacceptable is that, according to my CPU meter, the program runs at 48% to 50% of my CPU while waiting. I have an Athlon 64 X2 1.2 GHz processor. There is no way my modest 130-line program should get anywhere near 50%.
How can I write my wait function better so that it is only using minimal resources?

To stay portable you could use Boost.Thread for sleeping:
#include <boost/thread/thread.hpp>

int main()
{
    // each call waits one second, so two seconds in total
    boost::this_thread::sleep( boost::posix_time::seconds(1) );
    boost::this_thread::sleep( boost::posix_time::milliseconds(1000) );
    return 0;
}

With the C++11 standard the following approach can be used:
std::this_thread::sleep_for(std::chrono::milliseconds(100));
std::this_thread::sleep_for(std::chrono::seconds(100));
Alternatively, sleep_until could be used.
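For example, the busy-waiting wait() from the question could be rewritten this way (a minimal sketch):

#include <chrono>
#include <thread>

// Suspends the calling thread; the OS wakes it up after the interval,
// so no CPU is burned while waiting.
void wait(int seconds)
{
    std::this_thread::sleep_for(std::chrono::seconds(seconds));
}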

Use sleep rather than an empty while loop.

Just to explain what's happening: when you call clock() your program retrieves the time again and again, as fast as it can, until it reaches the endtime. That leaves the CPU core running the program "spinning" through your loop, reading the time millions of times a second in the hope it will have rolled over to the endtime. You need to tell the operating system instead that you want to be woken up after an interval; then it can suspend your program and let other programs run (or the system idle). That's what the various sleep functions mentioned in the other answers are for.

There's Sleep in windows.h; on *nix there's sleep in unistd.h.
There's a more elegant solution at http://www.faqs.org/faqs/unix-faq/faq/part4/section-6.html
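That FAQ entry describes a select()-based sub-second sleep; roughly (a sketch, assuming a POSIX system):

#include <sys/select.h>

// Sleep for ms milliseconds by calling select() with no file descriptors;
// the kernel suspends the thread until the timeout expires.
void sleep_ms(int ms)
{
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    select(0, NULL, NULL, NULL, &tv);
}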

Related

Wait accurately for 20 milliseconds

I need to execute a function accurately 20 milliseconds after some event (for RTP packet sending). I have tried the following variants:
std::this_thread::sleep_for(std::chrono::milliseconds(20));
boost::this_thread::sleep_for(boost::chrono::milliseconds(20));
Sleep(20);
I have also tried hacks such as:
auto a= GetTickCount();
while ((GetTickCount() - a) < 20) continue;
I also tried the microsecond and nanosecond variants.
All these methods have an error in the range of -6 ms to +12 ms, which is not acceptable. How can I make this work correctly?
In my opinion ±1 ms is acceptable, but no more.
UPDATE1: to measure the time passed I use std::chrono::high_resolution_clock::now();
Briefly, because of how OS kernels manage time and threads, you won't get much better accuracy with that method. Also, you can't rely on sleep alone with a static interval, or your stream will quickly drift off your intended send clock rate, because the thread could be interrupted or scheduled again well after your sleep time. For this reason you should check the system clock at each iteration to know how much to sleep for (i.e. somewhere between 0 ms and 20 ms). Without going into too much detail, this is also why there's a jitter buffer in RTP streams: to account for variations in packet reception (due to network jitter or send jitter). Because of this, you likely won't need ±1 ms accuracy anyway.
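A sketch of that idea: sleep until an absolute deadline that advances by a fixed step each iteration, so scheduling jitter does not accumulate (send_packet is a hypothetical stand-in for the real send routine):

#include <chrono>
#include <thread>

void send_packet(); // placeholder for the actual RTP send

void send_loop()
{
    using namespace std::chrono;
    auto next = steady_clock::now();
    for (;;)
    {
        send_packet();
        next += milliseconds(20);            // absolute deadline: errors don't accumulate
        std::this_thread::sleep_until(next); // may wake a bit late, but never drifts
    }
}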
Using std::chrono::steady_clock, I got about 0.1 ms accuracy on Windows 7.
That is, simply:
auto a = std::chrono::steady_clock::now();
while ((std::chrono::steady_clock::now() - a) < WAIT_TIME) continue; // WAIT_TIME: your target duration, e.g. std::chrono::milliseconds(20)
This should give you accurate "waiting" (to about 0.1 ms, as I said), at least. We all know this kind of waiting is "ugly" and should be avoided, but it's a hack that might still do the trick just fine.
You could use high_resolution_clock, which might give even better accuracy on some systems, but it is not guaranteed not to be adjusted by the OS, and you don't want that. steady_clock is guaranteed not to be adjusted, and often has the same accuracy as high_resolution_clock.
As for sleep() functions that are very accurate, I don't know of any. Perhaps someone else knows more about that.
In C we have the nanosleep function in time.h.
The nanosleep() function causes the current thread to be suspended from execution until either the time interval specified by the rqtp argument has elapsed or a signal is delivered to the calling thread and its action is to invoke a signal-catching function or to terminate the process.
The program below sleeps for 20 milliseconds.
#include <stdio.h>
#include <time.h>

int main()
{
    struct timespec tim;
    tim.tv_sec = 0;
    tim.tv_nsec = 20000000; // 20 milliseconds expressed in nanoseconds
    if (nanosleep(&tim, NULL) < 0)
    {
        printf("Nano sleep system call failed\n");
        return -1;
    }
    printf("Nano sleep successful\n");
    return 0;
}

Does clock() measure the time of all threads? (C++, pthreads)

Let's say my code is made of main(), and in main I create 2 threads that run in parallel.
Let's say that main takes 5 seconds to finish, and each thread takes 10 seconds to finish.
Assuming the 2 threads run in parallel, the real (wall-clock) time the program takes is 15 seconds.
Now if I time it using clock(), will that give me 15 seconds or 25 seconds?
Although thread 1 and thread 2 ran in parallel, will clock() count every cycle used by thread 1 and thread 2 and return the total number of cycles used?
I use Windows, MinGW32, and pthreads.
example code:
#include <pthread.h>
#include <ctime>

pthread_t threads[2];
int data[2] = {0, 1};         // example data passed to each thread

void *thread_func(void *arg); // thread body, defined elsewhere

int main()
{
    clock_t begin_time = clock(); // was left uninitialized in the original
    for (unsigned int id = 0; id < 2; ++id)
    {
        pthread_create(&threads[id], NULL, thread_func, (void *) &data[id]);
    }
    for (unsigned int id = 0; id < 2; ++id)
    {
        pthread_join(threads[id], NULL);
    }
    double time = double( clock() - begin_time ) / CLOCKS_PER_SEC;
}
The function clock does different things in different implementations (in particular, in different OSs). The clock function in Windows gives the number of clock-ticks since your program started, regardless of the number of threads, and regardless of whether the machine is busy or not. [I believe this design decision stems from the ancient days when DOS and Windows 2.x were the fashionable things to use, and the OS didn't have a way of "not running" something.]
In Linux, it gives the CPU-time used, as is the case in all Unix-like operating systems, as far as I'm aware.
Edit to clarify: My Linux system says this:
In glibc 2.17 and earlier, clock() was implemented on top of times(2).
For improved precision, since glibc 2.18, it is implemented on top of
clock_gettime(2) (using the CLOCK_PROCESS_CPUTIME_ID clock).
In other words, the time is for the process, not for the current thread.
To get the actual CPU time used by your process on Windows, you can (and should) use GetProcessTimes.
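A minimal sketch of that call (FILETIME counts 100-nanosecond intervals, so the two 32-bit halves have to be combined):

#include <windows.h>
#include <stdio.h>

int main()
{
    FILETIME creation, exitTime, kernel, user;
    if (GetProcessTimes(GetCurrentProcess(), &creation, &exitTime, &kernel, &user))
    {
        // combine the two 32-bit halves into a 64-bit count of 100-ns units
        ULONGLONG user100ns = ((ULONGLONG)user.dwHighDateTime << 32) | user.dwLowDateTime;
        printf("user CPU time: %.3f s\n", user100ns / 1e7);
    }
    return 0;
}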

gettimeofday on uLinux weird behaviour

Recently I've been trying to create a wait function that waits for 25 ms using the wall clock as reference. I looked around and found gettimeofday, but I've been having problems with it. My code (simplified):
while (1)
{
    timeval start, end;
    double t_us;
    bool release = false;

    gettimeofday(&start, NULL);
    DoStuff();

    while (release == false)
    {
        gettimeofday(&end, NULL);
        t_us = ((end.tv_sec - start.tv_sec) * 1000 * 1000) + (end.tv_usec - start.tv_usec);
        if (t_us >= 25000) // 25 ms
        {
            release = true;
        }
    }
}
This code runs in a (POSIX) thread and, on its own, works fine: DoStuff() is called every 25 ms. It does, however, eat all the CPU it can (as you might expect), so obviously this isn't a good idea.
When I tried throttling it by adding a Sleep(1); in the wait loop after the if statement, the whole thing slowed by about 50% (that is, it called DoStuff() every 37 ms or so). This makes no sense to me: assuming DoStuff() and any other threads complete their work in under (25 - 1) ms, the call rate of DoStuff() shouldn't be affected (allowing a 1 ms error margin).
I also tried Sleep(0), usleep(1000) and usleep(0), but the behaviour is the same.
The same behaviour occurs whenever another higher-priority thread needs CPU time (without the sleep). It's as if the clock stops counting when the thread relinquishes its runtime.
I'm aware that gettimeofday is vulnerable to things like NTP updates, so I tried using clock_gettime, but linking with -lrt on my system causes problems, so I don't think that is an option.
Does anyone know what I'm doing wrong?
The part that's missing here is how the kernel schedules threads based on time slices. In rough numbers, if you sleep at the beginning of your time slice for 1 ms and scheduling is done on a 35 ms clock rate, your thread may not execute again for 35 ms. If you sleep for 40 ms, your thread may not execute again for 70 ms. You can't really change that without changing the scheduling, and that's not recommended because of the overall performance implications for the system. You could use a "high-resolution" timer, but often that's implemented as a tight cycle-wasting loop of "while it's not time yet, chew CPU", so that's not really desirable either.
If you used a high-resolution clock and queried it frequently inside of your DoStuff loop, you could potentially play some tricks like run for 30ms, then do a sleep(1) which could effectively relinquish your thread for the remainder of your timeslice (e.g. 5ms) to let other threads run. Kind of a cooperative/preemptive multitasking if you will. It's still possible you don't get back to work for an extended period of time though...
All variants of sleep()/usleep() involve yielding the CPU to other runnable tasks. Your program can then run only after it is rescheduled by the kernel, which seems to take about 37 ms in your case.
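If clock_gettime/clock_nanosleep were usable on the asker's target (they may need -lrt, which the asker reported trouble with, so treat this as a sketch of the idea), sleeping until an absolute deadline both yields the CPU and avoids the drift described above:

#include <time.h>

void DoStuff(); // the asker's work function, assumed defined elsewhere

void periodic_loop(void)
{
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;)
    {
        DoStuff();
        next.tv_nsec += 25 * 1000 * 1000; // advance the deadline by 25 ms
        if (next.tv_nsec >= 1000000000)   // normalize the timespec
        {
            next.tv_nsec -= 1000000000;
            next.tv_sec += 1;
        }
        // sleep until the absolute deadline; the thread yields the CPU meanwhile
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}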

Why am I getting excessive CPU usage with clock()?

I'm trying to create a simple timer using the clock() method. When the application is executed, my CPU usage jumps from 0% to 25%. For a simple program that does nothing but count from 60 to 0 in seconds, it's a bit excessive.
I was following this: http://www.cplusplus.com/reference/clibrary/ctime/clock/
Any reason for this? Are there any alternatives I could use?
See:
http://msdn.microsoft.com/en-us/library/ms686298%28v=vs.85%29.aspx
The code you reference:
while (clock() < endwait) {}
will clearly just chew CPU while waiting for the time to pass, hence the 25% usage (one core).
while (clock() < endwait) { Sleep(1); }
should solve your problem.
Use boost::this_thread::sleep
// sleep for one second
boost::this_thread::sleep(boost::posix_time::seconds(1));
My best guess is that your problem is not the clock function, but the wait function.
It loops until a certain time is reached. You should use a function that actually suspends your program, like the sleep function.
Most simple timing tests are better run with some pseudo-code like this:
start = get_time()
for 1 .. 10:
    do_the_task()
end = get_time()
diff = end - start
print "%d seconds elapsed", diff
On Unix-derived platforms, gettimeofday(2) returns a struct with seconds and microseconds since the Epoch, which makes for some pretty decent resolution timing. On other platforms, you'll have to hunt around for decent time sources.
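A sketch of that pattern in C++ with std::chrono (do_the_task stands in for whatever is being measured):

#include <chrono>
#include <iostream>

void do_the_task(); // placeholder for the code under test

int main()
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 10; ++i)
        do_the_task();
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double> diff = end - start; // seconds, as a double
    std::cout << diff.count() << " seconds elapsed\n";
    return 0;
}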

clock() vs GetSystemTime()

I developed a class for calculations on multiple threads, and only one instance of this class is used per thread. I also want to measure the duration of the calculations by iterating over a container of these objects from another thread. The application is Win32. The thing is, I have read that QueryPerformanceCounter is useful when comparing measurements on a single thread. Because I cannot use it for my problem, I am considering clock() or GetSystemTime(). It is sad that both methods have a 'resolution' of milliseconds (since CLOCKS_PER_SEC is 1000 on Win32). Which method should I use, or more generally, is there a better option for me?
As a rule I have to take the measurements outside the working thread.
Here is some code as an example.
unsigned long GetCounter()
{
    SYSTEMTIME ww;
    GetSystemTime(&ww);
    return ww.wMilliseconds + 1000 * ww.wSeconds;
    // or:
    return clock(); // unreachable; shown as the alternative
}
class WorkClass
{
    bool is_working;
    unsigned long counter;
    HANDLE threadHandle;
public:
    void DoWork()
    {
        threadHandle = GetCurrentThread();
        is_working = true;
        counter = GetCounter();
        // Do some work
        is_working = false;
    }
};

void CheckDurations() // will run on another thread
{
    for (size_t i = 0; i < vector_of_workClass.size(); ++i)
    {
        WorkClass & wc = vector_of_workClass[i];
        if (wc.is_working)
        {
            unsigned long dur = GetCounter() - wc.counter;
            ReportDuration(wc, dur);
            if (dur > someLimitValue)
                TerminateThread(wc.threadHandle, 0); // TerminateThread takes an exit code
        }
    }
}
QueryPerformanceCounter is fine for multithreaded applications. The processor instruction that may be used (rdtsc) can potentially provide invalid results when called on different processors.
I recommend reading "Game Timing and Multicore Processors".
For your specific application, the problem it appears you are trying to solve is using a timeout on some potentially long-running threads. The proper solution to this would be to use the WaitForMultipleObjects function with a timeout value. If the time expires, then you can terminate any threads that are still running - ideally by setting a flag that each thread checks, but TerminateThread may be suitable.
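A sketch of that pattern, assuming the worker handles came from CreateThread or _beginthreadex:

#include <windows.h>

// Wait up to timeoutMs for all workers; returns false if the time expired,
// at which point stragglers can be signalled (or, as a last resort, terminated).
bool WaitForWorkers(const HANDLE *handles, DWORD count, DWORD timeoutMs)
{
    DWORD result = WaitForMultipleObjects(count, handles, TRUE /* wait for all */, timeoutMs);
    return result != WAIT_TIMEOUT;
}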
both methods have a precision of milliseconds
They don't. They have a resolution of a millisecond, the precision is far worse. Most machines increment the value only at intervals of 15.625 msec. That's a heckofalot of CPU cycles, usually not good enough to get any reliable indicator of code efficiency.
QueryPerformanceCounter does much better, no idea why you couldn't use it. A profiler is the standard tool to measure code efficiency. Beats taking dependencies you don't want.
QueryPerformanceCounter should give you the best precision, but there are issues when the function is run on different processors (you can get a different result on each processor). So when running in a thread you will experience shifts when the thread switches processors. To solve this you can set processor affinity for the thread that measures time.
GetSystemTime gets an absolute time, clock() is a relative time, but both measure elapsed time, not the CPU time of the actual thread/process.
Of course clock() is more portable. Having said that, I use clock_gettime on Linux because I can get both elapsed and thread CPU time with that call.
Boost has some time functions that will run on multiple platforms if you want platform-independent code.
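For the clock_gettime approach mentioned above, a sketch that reads both kinds of time on Linux:

#include <stdio.h>
#include <time.h>

int main()
{
    struct timespec wall, cpu;
    clock_gettime(CLOCK_MONOTONIC, &wall);        // elapsed (wall-clock) time
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu); // CPU time of the calling thread
    printf("wall: %ld.%09ld s, thread CPU: %ld.%09ld s\n",
           (long)wall.tv_sec, wall.tv_nsec, (long)cpu.tv_sec, cpu.tv_nsec);
    return 0;
}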