I was trying to implement small time delays in multithreaded code using boost::this_thread::sleep.
Here is a code example:
{
    boost::timer::auto_cpu_timer t; // just to check the sleep interval
    boost::this_thread::sleep(boost::posix_time::milliseconds(25));
}
The output generated by auto_cpu_timer confused me a little bit:
0.025242s wall, 0.010000s user + 0.020000s system = 0.030000s CPU (118.9%)
Why is it 0.025242s and not 0.0025242s?
Because 25 milliseconds is 0.025 seconds; 0.0025 seconds would be 2.5 milliseconds.
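If you want to sanity-check the conversion in code, here is a minimal <chrono> sketch (my own addition, not part of the original question) that prints 25 ms as a number of seconds:

#include <chrono>
#include <iostream>

int main()
{
    // 25 ms expressed as a floating-point number of seconds: 25 / 1000 = 0.025
    std::chrono::duration<double> secs = std::chrono::milliseconds(25);
    std::cout << secs.count() << " s\n"; // prints 0.025
}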
I am currently coding a project that requires precise delay times across a number of computers. This is the code I am using; I found it on a forum:
{
    LONGLONG timerResolution;
    LONGLONG wantedTime;
    LONGLONG currentTime;

    // timerResolution becomes "ticks per millisecond"
    QueryPerformanceFrequency((LARGE_INTEGER*)&timerResolution);
    timerResolution /= 1000;

    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    // ms is the requested delay in milliseconds (parameter of the surrounding function)
    wantedTime = currentTime / timerResolution + ms;

    // Busy-wait (spin) until the target time is reached
    currentTime = 0;
    while (currentTime < wantedTime)
    {
        QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
        currentTime /= timerResolution;
    }
}
The issue I am having is that this uses a lot of CPU, around 16-20%, when I start calling the function. The usual Sleep() uses zero CPU but is extremely inaccurate. From what I have read on multiple forums, that is the trade-off between accuracy and CPU usage, but I thought I had better raise the question before I settle for this sleep method.
The reason it's using 15-20% CPU is likely that it's using 100% of one core, as there is nothing in the loop to slow it down.
In general, this is a "hard" problem to solve, as PCs (more specifically, the OSes running on them) are generally not made for running real-time applications. If that is absolutely required, you should look into real-time kernels and OSes.
For this reason, the guarantee usually made about sleep times is that the system will sleep for at least the specified amount of time.
If you are running Linux, you could try the nanosleep function (http://man7.org/linux/man-pages/man2/nanosleep.2.html), though I don't have any experience with it.
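For reference, a minimal nanosleep sketch might look like this (sleep_ms is a hypothetical helper of my own, based only on the linked man page, so treat it as an untested illustration):

#include <time.h>
#include <errno.h>

// Request a delay of 'ms' milliseconds; nanosleep may return early if
// interrupted by a signal, in which case 'rem' holds the remaining time.
void sleep_ms(long ms)
{
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    struct timespec rem;
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
    {
        req = rem; // resume sleeping for the remainder
    }
}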
Alternatively you could go with a hybrid approach where you use sleeps for long delays, but switch to polling when it's almost time:
#include <thread>
#include <chrono>
using namespace std::chrono_literals;
...
wantedTime = currentTime / timerResolution + ms;
currentTime = 0;
while (currentTime < wantedTime)
{
    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    currentTime /= timerResolution;
    if (wantedTime - currentTime > 100) // if more than 100 ms of waiting remains
    {
        // Sleep for a value significantly lower than the 100 ms, to ensure that we don't "oversleep"
        std::this_thread::sleep_for(50ms);
    }
}
Now, this is still a bit prone to timing races, as it assumes the OS will hand control back to the program within 50 ms of the sleep_for finishing. To combat this further you could shorten the sleep (to, say, 1 ms).
You can set the Windows timer resolution to the minimum (usually 1 ms) to make Sleep() accurate to within about 1 ms. By default it is only accurate to within about 15 ms; see the Sleep() documentation.
Note that your execution can be delayed if other programs are consuming CPU time, but this could also happen if you were waiting with a timer.
#include <windows.h>
#include <timeapi.h>  // link with winmm.lib

// Sleep() takes ~15 ms (or whatever the default timer resolution is)
Sleep(1);

TIMECAPS caps_;
timeGetDevCaps(&caps_, sizeof(caps_));
timeBeginPeriod(caps_.wPeriodMin);  // request the minimum timer period (usually 1 ms)

// Sleep() now takes ~1 ms
Sleep(1);

timeEndPeriod(caps_.wPeriodMin);    // restore the previous timer resolution
I used the following function to find the time taken by my code.
#include <sys/time.h>
#include <iostream>
using namespace std;

struct timeval start, end;
gettimeofday(&start, NULL);
//mycode
gettimeofday(&end, NULL);
cout << " time taken by my code: "
     << ((end.tv_sec - start.tv_sec) * 1000000 + end.tv_usec - start.tv_usec) / 1000.0
     << " msec" << endl;
I observed that even though my code runs for 2 hours, the time reported by the above function is 1213 milliseconds. I am not able to understand why this happens. Also, is there a way to record the time taken by my code in hours correctly?
My best guess is that time_t (the type of tv_sec) on your system is a signed 32-bit integer and that (end.tv_sec - start.tv_sec) * 1000000 overflows: two hours is roughly 7200 seconds, and 7200 * 1,000,000 = 7.2e9, far beyond the roughly 2.1e9 maximum of a signed 32-bit value.
You could test that theory by making sure you don't use 32-bit arithmetic for this computation:
(end.tv_sec - start.tv_sec) * 1000000LL
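In context, the corrected line from the question would read (same expression, with the LL suffix forcing 64-bit arithmetic):

cout << " time taken by my code: "
     << ((end.tv_sec - start.tv_sec) * 1000000LL + end.tv_usec - start.tv_usec) / 1000.0
     << " msec" << endl;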
That being said, I advise use of the C++11 <chrono> library instead:
#include <chrono>
#include <iostream>

auto t0 = std::chrono::system_clock::now();
//mycode
auto t1 = std::chrono::system_clock::now();
using milliseconds = std::chrono::duration<double, std::milli>;
milliseconds ms = t1 - t0;
std::cout << " time taken by my code: " << ms.count() << '\n';
The <chrono> library has an invariant that none of the "predefined" durations will overflow in less than +/- 292 years. In practice, only nanoseconds will overflow that quickly, and the other durations will have a much larger range. Each duration has static ::min() and ::max() functions you can use to query the range for each.
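For example, a quick sketch (my own, not from the original answer) that queries the range of the nanoseconds duration:

#include <chrono>
#include <iostream>

int main()
{
    using namespace std::chrono;
    // Largest value representable by the predefined nanoseconds duration,
    // converted to hours (roughly 292 years' worth).
    auto limit = duration_cast<hours>(nanoseconds::max());
    std::cout << "nanoseconds::max() is about " << limit.count() << " hours\n";
}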
The original proposal for <chrono> has a decent tutorial section that might be a helpful introduction. It is only slightly dated. What it calls monotonic_clock is now called steady_clock. I believe that is the only significant update it lacks.
On which platform are you doing this? If it's Linux/Unix-like, your easiest non-intrusive bet is simply using the time command from the command line. Is the code you're running single-threaded or not? Some of the functions in time.h (clock(), for example) return the number of ticks accumulated across cores, which may or may not be what you want. And the newer facilities in <chrono> may not be as precise as you would like (a while back I tried to measure time intervals in nanoseconds with <chrono>, but the smallest interval I got back then was 300 ns, which was much less precise than I had hoped).
This simple benchmarking approach may serve your purpose:
#include <time.h>
#include <cstdlib>
#include <iostream>
using namespace std;
...
...
float begin = (float)clock() / CLOCKS_PER_SEC;
...
// do your benchmarking work here
...
float end = (float)clock() / CLOCKS_PER_SEC;
float totalTime = end - begin;
cout << "Time required for the benchmarked code: " << totalTime << endl;
NOTE: This is a simple alternative to the <chrono> library, but keep in mind that clock() measures CPU time, not wall-clock time.
If you are on Linux and the code you want to time is essentially the whole program, you can time it by passing it as an argument to the time command and looking at the 'Elapsed (wall clock) time' row.
/usr/bin/time -v <your program's executable>
For example:
/usr/bin/time -v sleep 3
Command being timed: "sleep 3"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2176
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 165
Voluntary context switches: 2
Involuntary context switches: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
To do timing comparisons I wanted to use boost::timer. Here is a simple test case that performs some vector operations:
#include <vector>
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <boost/timer/timer.hpp>

std::vector<float> hv( 1000*1000 );
std::generate(hv.begin(), hv.end(), rand);
{
    boost::timer::auto_cpu_timer t;
    std::transform(hv.begin(), hv.end(), hv.begin(), sqrtf);
}
The confusing part is that boost::timer reports this:
0.011577s wall, 0.020000s user + 0.000000s system = 0.020000s CPU (172.8%)
How can my userspace time exceed wall time?
Most likely, if you use threads, it will display the CPU time spent by all threads in the process.
By adding more test code, the userspace time jumps to 0.03s and then to 0.04s.
So it looks like the userspace duration is only accurate to within 10 ms, which makes the CPU utilization calculation wrong.
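One way to see where that roughly 10 ms granularity might come from is to query the kernel's tick rate and the resolution of the per-process CPU clock (a small diagnostic sketch of my own, assuming a Linux system):

#include <unistd.h>
#include <time.h>
#include <cstdio>

int main()
{
    // User/system times are traditionally accounted in "clock ticks";
    // on many Linux systems this is 100 Hz, i.e. a 10 ms granularity.
    printf("ticks per second: %ld\n", sysconf(_SC_CLK_TCK));

    // The per-process CPU-time clock itself may have a much finer resolution.
    timespec res;
    clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res);
    printf("CPU-time clock resolution: %ld ns\n", res.tv_nsec);
    return 0;
}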
I'm trying to make my program time itself, and I know of two methods:
1) using getrusage
#include <sys/resource.h>

struct rusage startu;
struct rusage endu;

getrusage(RUSAGE_SELF, &startu);
//Do computation here
getrusage(RUSAGE_SELF, &endu);

double start_sec = startu.ru_utime.tv_sec + startu.ru_utime.tv_usec / 1000000.0;
double end_sec   = endu.ru_utime.tv_sec + endu.ru_utime.tv_usec / 1000000.0;
double duration  = end_sec - start_sec;
This fetches the user time of a program segment.
2) using clock(), which gets the processor time spent executing the program
#include <time.h>

double start_sec = (double)clock() / CLOCKS_PER_SEC;
//Do computation here
double end_sec = (double)clock() / CLOCKS_PER_SEC;
double duration = end_sec - start_sec;
This fetches the CPU time used by a program segment.
However, I get a really long sys time with both methods. The user time is also longer than it is without these timings, and the system time is sometimes even double the user time.
For example, I'm solving the Traveling Salesman Problem. For an input that normally runs in about 3 seconds of both user and real time, adding either of these timings pushes the user time over 5 seconds and the real time over 15 seconds, which means the sys time is around 10 seconds.
I would like to know whether there are ways to improve this, or other libraries that can shorten the sys time and user time. If I have to use other libraries, I want ones that can measure both user time and real time.
Thanks for any advice!
I suggest carefully reading the time(7) man page and also considering the clock_gettime(2) syscall.
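As a rough illustration (my own sketch, assuming a POSIX system), clock_gettime can report both real time and per-process CPU time with nanosecond-resolution timestamps:

#include <time.h>
#include <cstdio>

// Returns b - a in seconds.
static double diff_sec(const timespec& a, const timespec& b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main()
{
    timespec wall0, wall1, cpu0, cpu1;
    clock_gettime(CLOCK_MONOTONIC, &wall0);         // real (wall-clock) time
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu0); // CPU time of this process

    //Do computation here

    clock_gettime(CLOCK_MONOTONIC, &wall1);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu1);

    printf("real: %.6f s, cpu: %.6f s\n",
           diff_sec(wall0, wall1), diff_sec(cpu0, cpu1));
    return 0;
}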
I want to wait 1.5 seconds in a boost thread. Using boost::xtime I can wait an integer number of seconds:
// Block on the queue / wait for data for up to two seconds.
boost::xtime_get(&xt, boost::TIME_UTC);
xt.sec++;
xt.sec++;
....
_condition.timed_wait(_mutex, xt)
How can I wait 1.5 seconds instead?
Would the following not work? It uses both the seconds and nanoseconds fields, adding one second plus 0.5 billion nanoseconds, which is 1.5 seconds:
xt.sec++;
xt.nsec += 500000000;
if (xt.nsec >= 1000000000) { xt.sec++; xt.nsec -= 1000000000; } // keep nsec in range by carrying into seconds
_condition.timed_wait(_mutex, xt);
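Alternatively, here is a sketch of the same wait using Boost's posix_time types instead of xtime (my own illustration; it assumes _condition and _mutex are a boost::condition_variable and a boost::mutex, so adapt the names and types to your code):

#include <boost/thread/condition_variable.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/locks.hpp>
#include <boost/thread/thread_time.hpp>
#include <boost/date_time/posix_time/posix_time_types.hpp>

boost::condition_variable _condition;
boost::mutex _mutex;

void wait_up_to_1500ms()
{
    boost::unique_lock<boost::mutex> lock(_mutex);
    // Wait until notified or until 1.5 seconds have elapsed.
    _condition.timed_wait(lock,
        boost::get_system_time() + boost::posix_time::milliseconds(1500));
}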