I'm trying to measure getrusage resolution via simple program:
#include <cstdio>
#include <sys/time.h>
#include <sys/resource.h>
#include <cassert>
int main(int argc, const char *argv[]) {
    struct rusage u = {0};
    assert(!getrusage(RUSAGE_SELF, &u));
    size_t cnt = 0;
    while (true) {
        ++cnt;
        struct rusage uz = {0};
        assert(!getrusage(RUSAGE_SELF, &uz));
        if (u.ru_utime.tv_sec != uz.ru_utime.tv_sec || u.ru_utime.tv_usec != uz.ru_utime.tv_usec) {
            std::printf("u:%ld.%06ld\tuz:%ld.%06ld\tcnt:%zu\n",
                        u.ru_utime.tv_sec, u.ru_utime.tv_usec,
                        uz.ru_utime.tv_sec, uz.ru_utime.tv_usec,
                        cnt);
            break;
        }
    }
}
And when I run it, I usually get output similar to the following:
ema@scv:~/tmp/getrusage$ ./gt
u:0.000562 uz:0.000563 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000553 uz:0.000554 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000496 uz:0.000497 cnt:1
ema@scv:~/tmp/getrusage$ ./gt
u:0.000475 uz:0.000476 cnt:1
Which seems to hint that the resolution of getrusage is around 1 microsecond.
I thought it should be around 1 / getconf CLK_TCK (i.e. 100 Hz, hence 10 milliseconds).
What is the true getrusage resolution?
Am I doing anything wrong?
Ps. Running this on Ubuntu 20.04, Linux scv 5.13.0-52-generic #59~20.04.1-Ubuntu SMP Thu Jun 16 21:21:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux, 5950x.
The publicly defined tick interval is nothing more than a common reference point for the default time-slice that each process gets to run. When its tick expires the process loses its assigned CPU, which then begins executing some other task, which is given another tick-long time slice to run.
But that does not guarantee that a given process will run for its full tick. If a process attempts to read() an empty socket and has nothing to do in the middle of a tick, the kernel is not going to let the process's CPU sit idle for the rest of that tick; it will find something better for it to do instead. The kernel knows exactly how long the process ran, and there is no reason whatsoever why that actual running time cannot be recorded in the process's usage statistics, especially if the clock used to measure process execution time offers much finer granularity than the tick interval.
Finally, a modern Linux kernel can be configured not to use tick intervals at all in specific situations, so its advertised tick interval is mostly academic.
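As a quick check of this, here is a minimal sketch (my own illustration, not part of the original answer) that compares the advertised tick rate with the resolution the kernel reports for the per-process CPU-time clock; on a recent kernel the CPU-time accounting is far finer than the 10 ms tick:
#include <cstdio>
#include <ctime>
#include <unistd.h>

int main() {
    // Scheduler tick: usually 100 Hz, i.e. 10 ms per tick.
    long ticks_per_sec = sysconf(_SC_CLK_TCK);
    // Resolution the kernel advertises for per-process CPU-time accounting.
    struct timespec res = {0, 0};
    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) == 0) {
        std::printf("CLK_TCK: %ld Hz (%.3f ms per tick)\n",
                    ticks_per_sec, 1000.0 / ticks_per_sec);
        std::printf("CLOCK_PROCESS_CPUTIME_ID resolution: %ld ns\n", res.tv_nsec);
    }
    return 0;
}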
Related
I am making a program using the Sleep command via Windows.h, and am experiencing a frustrating difference between running my program on Windows 10 instead of Windows 7. I simplified my program to the program below which exhibits the same behavior as my more complicated program.
On Windows 7 this 5000-iteration loop, with the Sleep call set to 1 ms, takes about 5 seconds to complete.
On Windows 10 when I run the exact same program (exact same binary executable file), this program takes almost a minute to complete.
For my application this is completely unacceptable as I need to have the 1ms timing delay in order to interact with hardware I am using.
I also tried a suggestion from another post to use the select() command (via winsock2), but that did not delay by 1 ms either. I have tried this program on multiple Windows 7 and Windows 10 PCs, and the root cause of the issue always points to Windows 10 versus Windows 7. The program always runs within ~5 seconds on numerous Windows 7 PCs, while on the multiple Windows 10 PCs I have tested, the duration has been much longer, ~60 seconds.
I have been using Microsoft Visual Studio Express 2010 (C/C++) as well as Microsoft Visual Studio Express 2017 (C/C++) to compile the programs. The version of visual studio does not influence the results.
I have also switched the build configuration from 'Debug' to 'Release' and tried compiler optimizations, but this did not help either.
Any suggestions would be greatly appreciated.
#include <stdio.h>
#include <Windows.h>

#define LOOP_COUNT 5000

int main()
{
    int i = 0;
    for (i = 0; i < LOOP_COUNT; i++) {
        Sleep(1);
    }
    return 0;
}
I need to have the 1ms timing delay in order to interact with hardware I am using
Windows is the wrong tool for this job.
If you insist on using this wrong tool, you are going to have to make compromises (such as using a busy-wait and accepting the corresponding poor battery life).
You can make Sleep() more accurate using timeBeginPeriod(1) but depending on your hardware peripheral's limits on the "one millisecond" delay -- is that a minimum, maximum, or the middle of some range? -- it still will fail to meet your timing requirement with some non-zero probability.
The timeBeginPeriod function requests a minimum resolution for periodic timers.
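As a rough way to see the effect (my own sketch, not part of the answer), you can time Sleep(1) with QueryPerformanceCounter before and after requesting the 1 ms period; winmm.lib must be linked for timeBeginPeriod/timeEndPeriod:
#include <windows.h>
#include <stdio.h>
#pragma comment(lib, "winmm.lib")

static double sleep1_ms(void) {
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);
    Sleep(1);
    QueryPerformanceCounter(&t1);
    return 1000.0 * (t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart;
}

int main(void) {
    printf("Sleep(1) with the default timer period: %.3f ms\n", sleep1_ms());
    timeBeginPeriod(1);   // request 1 ms timer resolution
    printf("Sleep(1) after timeBeginPeriod(1):      %.3f ms\n", sleep1_ms());
    timeEndPeriod(1);     // always pair the calls
    return 0;
}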
The right solution for talking to hardware with tight timing tolerances is an embedded microcontroller which talks to the Windows PC through some very flexible interface such as UART or Ethernet, buffers data, and uses hardware timers to generate signals with very well-defined timing.
In some cases, you might be able to use embedded circuitry already existing within your Windows PC, such as "sound card" functionality.
@BenVoigt & @mzimmers, thank you for your responses and suggestions. I did find a workable solution to this question, and it was inspired by the post I have linked directly below.
Units of QueryPerformanceFrequency
In this post BrianP007 writes a function to see how long the Sleep(1000) call takes. However, while I was playing around I realized that Sleep() accepts 0. Therefore I used a structure similar to the linked post to find how long it takes to loop until reaching a delta t of 1 ms.
For my purposes I increased i by 100; it can be incremented by 10 or by 1 to get a more accurate estimate of what i should be.
Once you get a value for i, you can use it to get an approximate 1 ms delay on your machine. Running this function in a loop (I ran it 100 times), I got anywhere from i = 3000 to i = 6000, with my machine averaging out around 5500. This spread is probably due to jitter/clock frequency changes in the processor over time.
The processor_check() function below only determines the value to use for the for-loop count; the actual 'timer' just needs a for loop with Sleep(0) inside it to get a delay with ~1 ms resolution on the machine.
While this method is not perfect, it is much closer and works far better than using Sleep(1). I still have to test it more thoroughly, but please let me know if it works for you as well. Feel free to use the code below for your own applications. It can be copied and pasted directly into an empty C command-line project in Visual Studio without modification.
/* ZKR Sleep_ZR() */
#include <stdio.h>
#include <windows.h>

/* Gets the for-loop count */
int processor_check()
{
    double delta_time = 0;
    int i = 0;
    int n = 0;
    while (delta_time < 0.001) {
        LARGE_INTEGER sklick, eklick, cpu_khz;
        QueryPerformanceFrequency(&cpu_khz);
        QueryPerformanceCounter(&sklick);
        for (n = 0; n < i; n++) {
            Sleep(0);
        }
        QueryPerformanceCounter(&eklick);
        delta_time = (eklick.QuadPart - sklick.QuadPart) / (double)cpu_khz.QuadPart;
        i = i + 100;
    }
    return i;
}

/* Timer */
void Sleep_ZR(int cnt)
{
    int i = 0;
    for (i = 0; i < cnt; i++) {
        Sleep(0);
    }
}

/* Main */
int main(int argc, char** argv)
{
    double average = 0;
    int i = 0;
    /* Single use */
    int loop_count = processor_check();
    Sleep_ZR(loop_count);
    /* Average over many runs to get a more accurate Sleep_ZR */
    for (i = 0; i < 100; i++) {
        loop_count = processor_check();
        average = average + loop_count;
    }
    average = average / 100;
    printf("Average: %f\n", average);
    /* 10 second test */
    for (i = 0; i < 10000; i++) {
        Sleep_ZR((int)average);
    }
    return 0;
}
Currently I am coding a project that requires precise delay times across a number of computers. This is the code I am currently using; I found it on a forum:
{
    /* 'ms' is the requested delay in milliseconds (a parameter of the surrounding function) */
    LONGLONG timerResolution;
    LONGLONG wantedTime;
    LONGLONG currentTime;
    QueryPerformanceFrequency((LARGE_INTEGER*)&timerResolution);
    timerResolution /= 1000;
    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    wantedTime = currentTime / timerResolution + ms;
    currentTime = 0;
    while (currentTime < wantedTime)
    {
        QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
        currentTime /= timerResolution;
    }
}
Basically the issue I am having is that this uses a lot of CPU, around 16-20%, when I start to call the function. The usual Sleep() uses zero CPU, but it is extremely inaccurate. From what I have read on multiple forums, that's the trade-off between accuracy and CPU usage, but I thought I'd better raise the question before I settle for this sleep method.
The reason it's using 15-20% CPU is likely that it's running one core at 100%, as there is nothing in the loop to slow it down.
In general this is a hard problem to solve, as PCs (more specifically, the OSes running on those PCs) are not made for running real-time applications. If that is absolutely required, you should look into real-time kernels and OSes.
For this reason, the guarantee that is usually made about sleep times is that the system will sleep for at least the specified amount of time.
If you are running Linux you could try using the nanosleep function (http://man7.org/linux/man-pages/man2/nanosleep.2.html), though I don't have any experience with it.
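For reference, a minimal nanosleep sketch (my addition; assumes a POSIX system) would look like this; note it carries the same "at least this long" guarantee discussed above:
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec req = { 0, 1000000 };   /* 1 ms = 1,000,000 ns */
    struct timespec rem = { 0, 0 };
    if (nanosleep(&req, &rem) != 0) {
        /* Interrupted by a signal; 'rem' holds the unslept remainder. */
        perror("nanosleep");
    }
    return 0;
}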
Alternatively you could go with a hybrid approach where you use sleeps for long delays, but switch to polling when it's almost time:
#include <thread>
#include <chrono>

using namespace std::chrono_literals;

...

wantedTime = currentTime / timerResolution + ms;
currentTime = 0;
while (currentTime < wantedTime)
{
    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);
    currentTime /= timerResolution;
    if (wantedTime - currentTime > 100) // more than 100 ms of waiting left
    {
        // Sleep for significantly less than the remaining 100 ms, to ensure that we don't "oversleep"
        std::this_thread::sleep_for(50ms);
    }
}
Now this is a bit risky, as it assumes that the OS will hand control back to the program within 50 ms of the sleep_for finishing. To further guard against oversleeping you could turn the sleep down (to, say, 1 ms).
You can set the Windows timer resolution to its minimum (usually 1 ms) to make Sleep() accurate to about 1 ms. By default it is only accurate to about 15 ms. See the Sleep() documentation.
Note that your execution can be delayed if other programs are consuming CPU time, but this could also happen if you were waiting with a timer.
#include <windows.h>
#include <timeapi.h>              // timeGetDevCaps/timeBeginPeriod; link winmm.lib
#pragma comment(lib, "winmm.lib")

// Sleep() takes ~15 ms (or whatever the default period is)
Sleep(1);

TIMECAPS caps_;
timeGetDevCaps(&caps_, sizeof(caps_));
timeBeginPeriod(caps_.wPeriodMin);

// Sleep() now takes ~1 ms
Sleep(1);

timeEndPeriod(caps_.wPeriodMin);
I am trying to measure the time taken by processes in a C++ program on Linux and VxWorks. I have noticed that clock_gettime(CLOCK_REALTIME, ...) is accurate enough (resolution about 1 ns) to do the job on many OSes. For portability I am using this function and running it on both VxWorks 6.2 and Linux 3.7.
I've tried to measure the time taken by a simple print:
#include <time.h>
#include <stdint.h>
#include <iostream>

#define BILLION 1000000000L

int main(){
    struct timespec start, end;
    uint32_t diff;
    for(int i = 0; i < 1000; i++){
        clock_gettime(CLOCK_REALTIME, &start);
        std::cout << "Do stuff" << std::endl;
        clock_gettime(CLOCK_REALTIME, &end);
        diff = BILLION * (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec);
        std::cout << diff << std::endl;
    }
    return 0;
}
I compiled this on Linux and VxWorks. On Linux the results seemed logical (average 20 µs). But on VxWorks I got a lot of zeros, then 5000000 ns, then a lot of zeros again...
PS: for VxWorks, I ran this app on an ARM Cortex-A8, and the results seemed random.
Has anyone seen the same issue before?
In VxWorks, the clock resolution is defined by the system scheduler frequency. By default this is typically 60 Hz, but it may differ depending on the BSP, kernel configuration, or runtime configuration.
The VxWorks kernel configuration parameters SYS_CLK_RATE_MAX and SYS_CLK_RATE_MIN define the maximum and minimum values supported, and SYS_CLK_RATE defines the default rate, applied at boot.
The actual clock rate can be modified at runtime using sysClkRateSet, either within your code, or from the shell.
You can check the current rate by using sysClkRateGet.
Given that you are seeing either 0 or 5000000 ns (which is 5 ms), I would expect that your system clock rate is ~200 Hz.
To get greater resolution, you can increase the system clock rate. However, this may have undesired side effects, as this will increase the frequency of certain system operations.
A better method of timing code may be to use sysTimestamp which is typically driven from a high frequency timer, and can be used to perform high-res timing of short-lived activities.
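As a rough illustration only (not tested here; the exact functions available depend on your BSP providing the sysTimestampLib driver), timing a short activity with the timestamp timer might look something like this:
#include <vxWorks.h>
#include <sysLib.h>
#include <stdio.h>

void timeShortActivity(void)
{
    printf("system clock rate: %d Hz\n", sysClkRateGet());

    sysTimestampEnable();                 /* make sure the timestamp timer is running */
    UINT32 freq  = sysTimestampFreq();    /* timestamp timer frequency in Hz */
    UINT32 start = sysTimestamp();

    /* ... short-lived activity to be timed ... */

    UINT32 end = sysTimestamp();
    /* Ignores counter wrap-around; fine for activities shorter than one timer period. */
    printf("elapsed: %f us\n", (end - start) * 1000000.0 / freq);
}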
I think in VxWorks the default clock resolution is 16.66 ms, which you can get by calling the clock_getres() function. You can change the resolution by calling sysClkRateSet() (the maximum resolution supported is 200 µs, I think, by passing 5000 as the argument to sysClkRateSet). You can then calculate the difference between two timestamps using difftime().
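If you just want to confirm what resolution the OS is claiming, this small sketch (my addition, plain POSIX, so it should build on both Linux and VxWorks) reports it:
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec res;
    if (clock_getres(CLOCK_REALTIME, &res) == 0)
        printf("CLOCK_REALTIME resolution: %ld s %ld ns\n",
               (long)res.tv_sec, (long)res.tv_nsec);
    return 0;
}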
I am using the timed_wait from boost C++ library and I am getting a problem with leap seconds.
Here is a quick test:
#include <boost/thread.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main(){
    // Determine the absolute time for this timer.
    boost::system_time tAbsoluteTime = boost::get_system_time() + boost::posix_time::milliseconds(35000);

    bool done = false;
    boost::mutex m;
    boost::condition_variable cond;
    boost::unique_lock<boost::mutex> lk(m);

    while(!done)
    {
        if(!cond.timed_wait(lk, tAbsoluteTime))
        {
            done = true;
            std::cout << "timed out";
        }
    }

    return 1;
}
The timed_wait function is returning 24 seconds earlier than it should. 24 seconds is the current number of leap seconds in UTC.
So, boost is widely used but I could not find any info about this particular problem. Has anyone else experienced this problem? What are the possible causes and solutions?
Notes: I am using boost 1.38 on a linux system. I've heard that this problem doesn't happen on MacOS.
UPDATE: A little more info: This is happening on 2 redhat machines with kernel 2.6.9. I have executed the same code on an ubuntu machine with kernel 2.6.30 and the timer behaves as expected.
So, what I think is that this is probably being caused by the OS or by some mis-set configuration on the redhat machines.
I have coded a workaround that converts the time to UTC, gets the difference from this adjustment, and adds it to the original time. This seems like a bad idea to me because, if the code is executed on a machine without this problem, the timer might be 24 s AHEAD. I still could not find the reason for this.
On a Linux system, the system clock follows the POSIX standard, which mandates that leap seconds are NOT observed! If you expected otherwise, that's probably the source of the discrepancy you're seeing. This document has a great explanation of how UTC relates to other time scales, and of the problems one is likely to encounter when relying on the operating system's notion of timekeeping.
Is it possible that done is getting set prematurely and a spurious wakeup is causing the loop to exit sooner than you expected?
Ok, here is what I did. It's a workaround and I am not happy with it but it was the best I could come up with:
#include <boost/thread.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/date_time/c_local_adjustor.hpp>
#include <iostream>

int main(){
    typedef boost::date_time::c_local_adjustor<boost::system_time> local_adj;

    // Determine the absolute time for this timer.
    boost::system_time tAbsoluteTime = boost::get_system_time() + boost::posix_time::milliseconds(25000);

    /*
     * A leap second is a positive or negative one-second adjustment to the Coordinated
     * Universal Time (UTC) time scale that keeps it close to mean solar time.
     * UTC, which is used as the basis for official time-of-day radio broadcasts for civil time,
     * is maintained using extremely precise atomic clocks. To keep the UTC time scale close to
     * mean solar time, UTC is occasionally corrected by an adjustment, or "leap",
     * of one second.
     */
    boost::system_time tAbsoluteTimeUtc = local_adj::utc_to_local(tAbsoluteTime);

    // Calculate the local-to-UTC difference.
    boost::posix_time::time_duration tLocalUtcDiff = tAbsoluteTime - tAbsoluteTimeUtc;

    // Get only the seconds from the difference. These are the leap seconds.
    tAbsoluteTime += boost::posix_time::seconds(tLocalUtcDiff.seconds());

    bool done = false;
    boost::mutex m;
    boost::condition_variable cond;
    boost::unique_lock<boost::mutex> lk(m);

    while(!done)
    {
        if(!cond.timed_wait(lk, tAbsoluteTime))
        {
            done = true;
            std::cout << "timed out";
        }
    }

    return 1;
}
I've tested it on problematic and non-problematic machines and it worked as expected on both, so I'm keeping it until I can find a better solution.
Thank you all for your help.
Is there a way to get notified when there is update to the system time from a time-server or due to DST change? I am after an API/system call or equivalent.
It is part of my effort to optimise generating a value similar to SQL NOW(), at hour granularity, without using SQL.
You can use timerfd_create(2) to create a timer, then mark it with the TFD_TIMER_CANCEL_ON_SET option when setting it. Set it for an implausible time in the future and then block on it (with poll/select etc.) - if the system time changes then the timer will be cancelled, which you can detect.
(this is how systemd does it)
e.g.:
#include <sys/timerfd.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>

int main(void) {
    int fd = timerfd_create(CLOCK_REALTIME, 0);

    timerfd_settime(fd, TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET,
                    &(struct itimerspec){ .it_value = { .tv_sec = INT_MAX } },
                    NULL);

    printf("Waiting\n");

    char buffer[10];
    if (-1 == read(fd, &buffer, 10)) {
        if (errno == ECANCELED)
            printf("Timer cancelled - system clock changed\n");
        else
            perror("error");
    }

    close(fd);
    return 0;
}
I don't know if there is a way to be notified of a change in the system time, but:
The system time is stored as UTC, so there is never a change due to DST to be notified of.
If my memory is correct, the NTP daemon usually adjusts the clock by changing its speed (slewing it), so again there is no change to be notified of.
So the only time you would be notified is after an uncommon manipulation.
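On Linux you can at least inspect what the NTP daemon is doing to the clock; this is my own sketch (not part of the answer) using adjtimex(2) in read-only mode:
#include <sys/timex.h>
#include <stdio.h>

int main(void) {
    struct timex tx = {0};          /* modes = 0 means: query only, change nothing */
    int state = adjtimex(&tx);
    if (state == -1) {
        perror("adjtimex");
        return 1;
    }
    printf("clock state: %d (TIME_OK = %d)\n", state, TIME_OK);
    printf("frequency offset: %ld (scaled ppm), estimated error: %ld us\n",
           tx.freq, tx.esterror);
    return 0;
}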
clock_gettime on most recent Linux systems is incredibly fast, and usually pretty amazingly precise as well; you can find out the precision using clock_getres. But for hour level timestamps, gettimeofday might be more convenient since it can do the timezone adjustment for you.
Simply call the appropriate system call and do the division into hours each time you need a timestamp; all the other time adjustments from NTP or whatever will already have been done for you.
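As a small sketch of that (my own illustration, not part of the answer): truncate the current local time to the hour each time you need the value; NTP and DST are already folded in by the kernel clock plus the timezone conversion:
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);

    struct tm local;
    localtime_r(&ts.tv_sec, &local);   /* applies the current timezone/DST rules */
    local.tm_min = 0;
    local.tm_sec = 0;

    char buf[32];
    strftime(buf, sizeof buf, "%Y-%m-%d %H:00:00", &local);
    printf("hour-granularity timestamp: %s\n", buf);
    return 0;
}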