Any difference between clock_gettime( CLOCK_REALTIME .... ) and time()? - c++

A simple question: do time(...) and clock_gettime(CLOCK_REALTIME, ...) theoretically produce the same time (with respect to seconds only)?
Here's what I mean:
time_t epoch;
time( &epoch );
and
struct timespec spec;
clock_gettime( CLOCK_REALTIME, &spec );
Are these two supposed to return exactly the same result (with respect to seconds)?
I "tested" this with changing time and time zones and epoch and spec.tv_sec always show the same result, but the documentation of CLOCK_REATIME confuses me a bit and I'm not sure, that they will always be the same.
Real world situation: I have a piece of code, which uses time. Now I want to have the time in milliseconds (which can be taken from spec.tv_nsec, multiplied by 1000000). So I think about removing time and using directly clock_gettime, but I'm not sure if this will remain the same in all situations.
The question is somehow related to Measure time in Linux - time vs clock vs getrusage vs clock_gettime vs gettimeofday vs timespec_get? but the information there was not enough for me.. I think.

[Note: I used the git master branch and v4.7 for the reference links below, x86 only, as I'm lazy.]
time() is in fact an alias for the equally named syscall, implemented in kernel/time/time.c. That syscall uses the get_seconds function to return the UNIX timestamp, which is read from the core timekeeping struct, more precisely from its "Current CLOCK_REALTIME time in seconds" field (xtime_sec).
clock_gettime() is a glibc function in sysdeps/unix/clock_gettime.c, which simply calls gettimeofday if the supplied clock ID is CLOCK_REALTIME; that in turn is backed by the equally named syscall (its source is in the same time.c file as above). This one calls do_gettimeofday, which eventually ends up calling __getnstimeofday64, which queries... the very same xtime_sec field of the same struct as above.
Update:
As @MaximEgorushkin cleverly pointed out, a newer vDSO mechanism hijacks the clock_gettime call (a good sign that it is present is your binary depending on linux-vdso.so.*) and redirects it to __vdso_clock_gettime. This one uses a newer clock source management framework (gtod - Generic Time Of Day). It calls do_realtime, which reads from struct vsyscall_gtod_data's wall_time_sec field. That structure is maintained by update_vsyscall from the same timekeeper struct as above.
tl;dr
The answer is: yes, they get the time from the same clock source.
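For a quick sanity check of that conclusion, the two calls can be compared directly. This is only a sketch assuming a POSIX system (older glibc versions may need -lrt for clock_gettime):

#include <time.h>
#include <iostream>

int main()
{
    struct timespec spec;
    time_t epoch = time(nullptr);                 // seconds since the epoch
    clock_gettime(CLOCK_REALTIME, &spec);         // seconds + nanoseconds

    std::cout << "time():          " << epoch       << '\n'
              << "clock_gettime(): " << spec.tv_sec << '\n';
    // Apart from the rare race across a second boundary, the two values match.
    return 0;
}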

Related

Time taken between two points in code independent of system clock CPP Linux

I need to find the time taken to execute a piece of code, and the method should be independent of system time, i.e. chrono and the like wouldn't work.
My use case looks somewhat like this:
int main() {
    //start
    function();
    //end
    time_take = end - start;
}
I am working on an embedded platform that doesn't have the right time at start-up. In my case, the start of function() happens before the actual time is set from the NTP server and the end happens after the exact time is obtained. So any method that compares the system time at two points wouldn't work. Also, counting CPU ticks wouldn't work for me since my program won't necessarily be running actively throughout.
I tried the conventional methods and they didn't work for me.
On Linux clock_gettime() has an option to return the current CLOCK_MONOTONIC time, which is unaffected by system time changes. Measuring CLOCK_MONOTONIC at the beginning and at the end, and then doing your own math to subtract the two values, will measure the elapsed time while ignoring any system time changes.
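A minimal sketch of that C-level approach (function() stands in for the code being measured, as in the question):

#include <time.h>
#include <cstdio>

void function();   // the code under test, defined elsewhere

int main()
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    function();
    clock_gettime(CLOCK_MONOTONIC, &end);

    // Manual subtraction, normalising the nanosecond field.
    long sec  = end.tv_sec  - start.tv_sec;
    long nsec = end.tv_nsec - start.tv_nsec;
    if (nsec < 0) { --sec; nsec += 1000000000L; }

    std::printf("elapsed: %ld.%09ld s\n", sec, nsec);
    return 0;
}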
If you don't want to dip down to C-level abstractions, <chrono> has this covered for you with steady_clock:
int main() {
    //start
    auto t0 = std::chrono::steady_clock::now();
    function();
    auto t1 = std::chrono::steady_clock::now();
    //end
    auto time_take = t1 - t0;
}
steady_clock is generally a wrapper around clock_gettime used with CLOCK_MONOTONIC, except that it is portable across all platforms. That is, some platforms don't have clock_gettime, but do have an API for getting a monotonic clock time.
Above, the type of time_take will be steady_clock::duration. On all platforms I'm aware of, this type is an alias for nanoseconds. If you want an integral count of nanoseconds, you can:
using namespace std::literals;
int64_t i = time_take/1ns;
The above works on all platforms, even if steady_clock::duration is not nanoseconds.
The minor advantage of <chrono> over a C-level API is that you don't have to deal with computing timespec subtraction manually. And of course it is portable.

Intrinsic tick count - a fine-grained performance measurement with no need for an external API

Taking minimal steps to measure how a given (fast) piece of code performs, isn't that the smallest unit, the finest measurement?
#include <intrin.h>   // MSVC header providing __rdtsc
#include <iostream>

#pragma intrinsic(__rdtsc)

void work();          // the code under test

int main(void)
{
    unsigned long long t1, t2;
    t1 = __rdtsc();
    work();
    t2 = __rdtsc();
    std::cout << t2 - t1 << std::endl;
}
The man page, found at http://linux.die.net/man/3/clock_gettime, gives all the details.
You want to be calling the clock_gettime() function.
To get just the CPU time for your process, use:
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, struct timespec *);
or, for the current thread, use:
clock_gettime(CLOCK_THREAD_CPUTIME_ID, struct timespec *);
It returns 0 for success, or -1 for failure (in which case errno is set appropriately).
The struct timespec is defined as:
struct timespec
{
    time_t tv_sec;  /* seconds */
    long   tv_nsec; /* nanoseconds */
};
All of the above is declared in the header file <time.h>.
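A minimal usage sketch for the process CPU-time clock (the thread variant works the same way, just with the other clock ID):

#include <time.h>
#include <cstdio>

int main()
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) == -1) {
        std::perror("clock_gettime");
        return 1;
    }
    std::printf("CPU time consumed so far: %ld.%09ld s\n",
                (long)ts.tv_sec, ts.tv_nsec);
    return 0;
}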
It depends on what you want to measure. rdtsc does not necessarily give you any information about elapsed wall-clock time; that depends on the concrete x86 implementation. It gives you the number of "ticks" elapsed, with varying definitions of "tick": it might tick at a more or less constant maximum clock frequency, or at the frequency actually in use.
To make rdtsc usable for performance measurements of small code fragments, you also have to make sure that the OS does not preempt your thread or move it to another core, which might have a different TSC value. Use CPU binding and CPU shielding for your performance-measurement thread (see the sketch below). Also consider the difference between cold and warm performance testing, and choose wisely between the two depending on your use case.
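For illustration, CPU binding on Linux can be done with sched_setaffinity(); this is a sketch only, and the core number is an arbitrary example. CPU shielding, by contrast, is set up outside the program (e.g. with cset shield):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE              // sched_setaffinity is a GNU extension
#endif
#include <sched.h>
#include <cstdio>

int main()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);                                    // pin to core 2 (example)
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {  // 0 = calling thread
        std::perror("sched_setaffinity");
        return 1;
    }
    // ... run the __rdtsc() measurements here, now fixed to a single core ...
    return 0;
}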
In a team I worked in a few years ago, we used it that way, and it gave us good, rather stable and reproducible results, since we took care of all these other issues as well: warm tests with CPU binding and CPU shielding.

C++ Equivalent for GetLocalTime in Linux (with milliseconds!)

I have been searching for over an hour but I simply seem to not be able to find the solution!
I am looking for a function that gives me a similar struct as GetLocalTime on Windows does. The important thing for me is that this struct has hours, minutes, seconds and milliseconds.
localtime() does not include milliseconds and therefore I cannot use it!
I would appreciate a solution that uses the standard library or another very small library, since I am working on a Raspberry Pi and cannot use large libraries like Boost!
As mentioned above, there is no direct equivalent. If you can use C++11, the <chrono> header allows you to get the same result, but not in a single call. You can use high_resolution_clock to get the current Unix time in milliseconds, then use the localtime C function to get the time without milliseconds, and use the current Unix time in milliseconds to find the milliseconds count. It looks like you will have to write your own GetLocalTime implementation, but with C++11 it will not be complex.
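A sketch of that approach in C++11 (system_clock is used here because it is the clock tied to Unix time; localtime_r is POSIX):

#include <chrono>
#include <ctime>
#include <cstdio>

int main()
{
    using namespace std::chrono;

    auto now = system_clock::now();
    std::time_t secs = system_clock::to_time_t(now);   // whole seconds
    int millis = duration_cast<milliseconds>(now.time_since_epoch()).count() % 1000;

    std::tm local;
    localtime_r(&secs, &local);   // hours/minutes/seconds in the local timezone

    std::printf("%02d:%02d:%02d.%03d\n",
                local.tm_hour, local.tm_min, local.tm_sec, millis);
    return 0;
}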
GetLocalTime is not a usual Linux function.
Read time(7); you probably want clock_gettime(2), or (as commented by Joachim Pileborg) the older gettimeofday(2).
If you need a struct giving all of hours, minutes, seconds and milliseconds, you have to code that yourself using localtime(3) and explicitly computing the millisecond part.
Something like the code below prints the time with milliseconds:
struct timespec ts = {0, 0};
struct tm tm = {};
char timbuf[64];
if (clock_gettime(CLOCK_REALTIME, &ts))
    { perror("clock_gettime"); exit(EXIT_FAILURE); }
time_t tim = ts.tv_sec;
if (localtime_r(&tim, &tm) == NULL)
    { perror("localtime_r"); exit(EXIT_FAILURE); }
if (strftime(timbuf, sizeof(timbuf), "%D %T", &tm) == 0)
    { perror("strftime"); exit(EXIT_FAILURE); }
printf("%s.%03d\n", timbuf, (int)(ts.tv_nsec/1000000));
You can use a combination of:
clock_gettime(CLOCK_REALTIME, ...); returns the current time down to the nanosecond (of course constrained by the actual clock resolution); it does not care about local timezone information and returns UTC time. Just use the millisecond information (derived from the tv_nsec field).
time(); returns the current time to the second - no milliseconds - also as UTC. Its result (a time_t) is easy to convert to the final format.
then convert the time() result using localtime_r(); this fills a structure very similar to the Windows SYSTEMTIME; the result is accurate to the second and takes the local timezone information into account.
finally, set the millisecond field using the clock_gettime() result.
These routines are documented, not deprecated, and portable.
You may need to call tzset() once (this sets the timezone information - a C global variable - from the operating system environment; probably a heavy operation).
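Putting those pieces together might look like the sketch below; LocalTimeMs and GetLocalTimeMs are hypothetical names, not an existing API, and the code assumes POSIX:

#include <time.h>
#include <cstdio>

struct LocalTimeMs {                     // hypothetical SYSTEMTIME-like struct
    int hour, minute, second, millisecond;
};

LocalTimeMs GetLocalTimeMs()
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);  // UTC, seconds + nanoseconds

    time_t secs = ts.tv_sec;
    struct tm local;
    localtime_r(&secs, &local);          // apply the local timezone

    LocalTimeMs t;
    t.hour        = local.tm_hour;
    t.minute      = local.tm_min;
    t.second      = local.tm_sec;
    t.millisecond = static_cast<int>(ts.tv_nsec / 1000000);
    return t;
}

int main()
{
    LocalTimeMs t = GetLocalTimeMs();
    std::printf("%02d:%02d:%02d.%03d\n", t.hour, t.minute, t.second, t.millisecond);
    return 0;
}

Taking the seconds from the same clock_gettime() call (rather than a separate time() call) also avoids any race across a second boundary.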

extending the std::chrono functionality to deal with run-time (non compile-time) constant periods

I have been experimenting with all kinds of timers on Linux and OSX, and would like to try and wrap some of them with the same interface used by std::chrono.
That's easy to do for timers that have a well-defined "period" at compile time, e.g. the POSIX clock_gettime() family, the clock_get_time() family on OSX, or gettimeofday().
However, there are some useful timers for which the "period" - while constant - is only known at runtime.
For example:
- POSIX states the period of clock(), CLOCKS_PER_SEC, may be a variable on non-XSI systems
- on Linux, the period of times() is given at runtime by sysconf(_SC_CLK_TCK)
- on OSX, the period of mach_absolute_time() is given at runtime by mach_timebase_info()
- on recent Intel processors, the TSC register ticks at a constant rate, but of course that rate can only be determined at runtime
To wrap these timers in the std::chrono interface, one possibility would be to use a period of std::chrono::nanoseconds and convert the value of each timer to nanoseconds. Another approach could be to use a floating-point representation. However, both approaches would introduce a (very small) overhead to the now() function, and a (probably small) loss in precision.
The solution I'm trying to pursue is to define a set of classes to represent such "run-time constant" periods, built along the same lines as the std::ratio class.
However I expect that will require rewriting all the related template classes and functions (as they assume constexpr values).
How do I wrap these kinds of timers a la std::chrono?
Or use non-constexpr values for the time period of a clock?
Does anyone have any experience with wrapping these kinds of timers a la std::chrono?
Actually I do. And on OSX, one of your platforms of interest. :-)
You mention:
on OSX, the period of mach_absolute_time() is given at runtime by mach_timebase_info()
Absolutely correct. Also on OSX, the libc++ implementation of high_resolution_clock and steady_clock is actually based on mach_absolute_time. I'm the author of this code, which is open source with a generous license (do anything you want with it as long as you retain the copyright).
Here is the source for libc++'s steady_clock::now(). It is built pretty much the way you surmised. The run-time period is converted to nanoseconds prior to returning. On OS X the conversion factor is very often 1, and the code takes advantage of that fact with an optimization. However, the code is general enough to handle non-1 conversion factors.
On the first call to now() there's a small cost of querying the run-time conversion factor to nanoseconds. In the general case a floating-point conversion factor is computed. In the common case (conversion factor == 1) the subsequent cost is calling through a function pointer. I've found that the overhead is really quite reasonable.
On OS X the conversion factor, although not determined until run time, is still a constant (i.e. does not vary as the program executes), so it only needs to be computed once.
If you're in a situation where your period is actually varying dynamically, you'll need more infrastructure to handle this. Essentially you would need to integrate (calculus) the period vs time curve and then compute an average period between two points in time. That would require a constant monitoring of the period as it changes with time, and <chrono> isn't the right tool for that. Such tools are typically handled at the OS level.
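For the constant-but-run-time-determined case, a rough sketch of that scheme on OS X might look like this (not the actual libc++ code; mach_absolute_time and mach_timebase_info come from <mach/mach_time.h>):

#include <mach/mach_time.h>
#include <chrono>
#include <cstdint>
#include <iostream>

std::chrono::nanoseconds monotonic_now()
{
    // Query the timebase once; numer/denom are run-time constants.
    static const mach_timebase_info_data_t tb = [] {
        mach_timebase_info_data_t info;
        mach_timebase_info(&info);
        return info;
    }();
    // On most machines numer == denom == 1, so this is just the raw tick count.
    std::uint64_t ticks = mach_absolute_time();
    return std::chrono::nanoseconds(ticks * tb.numer / tb.denom);
}

int main()
{
    auto t0 = monotonic_now();
    auto t1 = monotonic_now();
    std::cout << "two back-to-back reads differ by "
              << (t1 - t0).count() << " ns\n";
    return 0;
}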
[Does anyone have any experience with] using non-constexpr values for the time period of a clock?
After reading through the standard (20.11.5, Class template duration), "period" is expected to be "a specialization of ratio":
Remarks: If Period is not a specialization of ratio, the program is ill-formed.
and all chrono templates rely heavily on constexpr functionality.
Does anyone have any experience with wrapping these kinds of timers a la std::chrono?
I've found here a suggestion to use a duration with period = 1 and boost::rational as rep, though without any concrete examples.
I have done a similar thing for my purposes, only for Linux though. You find the code here; feel free to use the code in whatever way you want.
The challenges my implementation addresses overlap partially with the ones mentioned in your question. Specifically:
The tick factor (required to convert from clock ticks to a time unit based on seconds) is retrieved at run time, but only the first time now() is used‡. If you are concerned about the small overhead this causes, you may call the now() function once at start-up before you measure any actual intervals. The tick factor is stored in a static variable, which means there is still some overhead as – on the lowest level – each call of the now() function implies checking whether the static variable has been initialized. However, this overhead will be the same in each call of now(), so it shouldn't impact measuring time intervals.
I do not convert to nanoseconds by default, because when measuring relatively long periods of time (e.g. a few seconds) this causes overflows very quickly. This is in fact the main reason why I don't use the boost implementation. Instead of converting to nanoseconds, I implement the base unit as a template parameter (called Precision in the code). I use std::ratio from C++11 as template arguments. So I can choose, for example, a clock<micro>, which implies that calling the now() function will internally convert to microseconds rather than nanoseconds, which means I can measure periods of many seconds or minutes without overflows and still with good precision. (This is independent of the unit used to produce output. You can have a clock<micro> and display the result in seconds, etc.)
My clock type, which is called combined_clock, combines user time, system time and wall-clock time. There is a boost clock type for this, too, but it's not compatible with the ratio types and units from std, whereas mine is.
‡ The tick factor is retrieved using the ::sysconf() call you suggest, and that is guaranteed to return one and the same value throughout the lifetime of the process.
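For illustration only (not the library's actual code), the tick-factor caching described above boils down to something like this:

#include <unistd.h>
#include <cstdio>

long ticks_per_second()
{
    // sysconf(_SC_CLK_TCK) is queried on the first call only and then cached;
    // it returns the same value for the whole lifetime of the process.
    static const long tps = ::sysconf(_SC_CLK_TCK);
    return tps;
}

int main()
{
    std::printf("clock ticks per second: %ld\n", ticks_per_second());
    return 0;
}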
So the way you use it is as follows:
#include "util/proctime.hpp"
#include <ratio>
#include <chrono>
#include <thread>
#include <utility>
#include <iostream>
int main()
{
using std::chrono::duration_cast;
using millisec = std::chrono::milliseconds;
using clock_type = rlxutil::combined_clock<std::micro>;
auto tp1 = clock_type::now();
/* Perform some random calculations. */
unsigned long step1 = 1;
unsigned long step2 = 1;
for (int i = 0 ; i < 50000000 ; ++i) {
unsigned long step3 = step1 + step2;
std::swap(step1,step2);
std::swap(step2,step3);
}
/* Sleep for a while (this adds to real time, but not CPU time). */
std::this_thread::sleep_for(millisec(1000));
auto tp2 = clock_type::now();
std::cout << "Elapsed time: "
<< duration_cast<millisec>(tp2 - tp1)
<< std::endl;
return 0;
}
The usage above involves a pretty-print function that generates output like this:
Elapsed time: [user 40, system 0, real 1070 millisec]

C++ fine granular time

The following piece of code gives 0 as the runtime of the function. Can anybody point out the error?
struct timeval start,end;
long seconds,useconds;
gettimeofday(&start, NULL);
int optimalpfs=optimal(n,ref,count);
gettimeofday(&end, NULL);
seconds = end.tv_sec - start.tv_sec;
useconds = end.tv_usec - start.tv_usec;
long opt_runtime = ((seconds) * 1000 + useconds/1000.0) + 0.5;
cout<<"\nOptimal Runtime is "<<opt_runtime<<"\n";
I get both start and end times as the same. I get the following output:
Optimal Runtime is 0
Tell me the error please.
POSIX 1003.1b-1993 specifies interfaces for clock_gettime() (and clock_getres()), and offers that with the MON option there can be a type of clock with a clockid_t value of CLOCK_MONOTONIC (so that your timer isn't affected by system time adjustments). If available on your system then these functions return a structure which has potential resolution down to one nanosecond, though the latter function will tell you exactly what resolution the clock has.
struct timespec {
    time_t tv_sec;  /* seconds */
    long   tv_nsec; /* and nanoseconds */
};
You may still need to run your test function in a loop many times for the clock to register any time elapsed beyond its resolution, and perhaps you'll want to run your loop enough times to last at least an order of magnitude more time than the clock's resolution.
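A sketch of that idea, using clock_getres() to see the resolution and a repetition count chosen so the total run dwarfs it (function_under_test is a stand-in, and CLOCK_MONOTONIC is just the example clock here; see the caveat below):

#include <time.h>
#include <cstdio>

void function_under_test() { /* stand-in for the real code being timed */ }

int main()
{
    struct timespec res;
    clock_getres(CLOCK_MONOTONIC, &res);
    std::printf("clock resolution: %ld ns\n", res.tv_nsec);

    const long iterations = 10000000;    // chosen so total time >> resolution
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; ++i)
        function_under_test();
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    std::printf("average per call: %.1f ns\n", elapsed * 1e9 / iterations);
    return 0;
}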
Note though that apparently the Linux folks mis-read the POSIX.1b specifications and/or didn't understand the definition of a monotonically increasing time clock, and their CLOCK_MONOTONIC clock is affected by system time adjustments, so you have to use their invented non-standard CLOCK_MONOTONIC_RAW clock to get a real monotonic time clock.
Alternately one could use the related POSIX.1 timer_settime() call to set a timer running, a signal handler to catch the signal delivered by the timer, and timer_getoverrun() to find out how much time elapsed between the queuing of the signal and its final delivery, and then set your loop to run until the timer goes off, counting the number of iterations in the time interval that was set, plus the overrun.
Of course on a preemptive multi-tasking system these clocks and timers will run even while your process is not running, so they are not really very useful for benchmarking.
Slightly more rare is the optional POSIX.1-1999 clockid_t value of CLOCK_PROCESS_CPUTIME_ID, indicated by the presence of the _POSIX_CPUTIME from <time.h>, which represents the CPU-time clock of the calling process, giving values representing the amount of execution time of the invoking process. (Even more rare is the TCT option of clockid_t of CLOCK_THREAD_CPUTIME_ID, indicated by the _POSIX_THREAD_CPUTIME macro, which represents the CPU time clock, giving values representing the amount of execution time of the invoking thread.)
Unfortunately POSIX makes no mention of whether these so-called CPUTIME clocks count just user time, or both user and system (and interrupt) time, accumulated by the process or thread, so if your code under profiling makes any system calls then the amount of time spent in kernel mode may, or may not, be represented.
Even worse, on multi-processor systems, the values of the CPUTIME clocks may be completely bogus if your process happens to migrate from one CPU to another during its execution. The timers implementing these CPUTIME clocks may also run at different speeds on different CPU cores, and at different times, further complicating what they mean. I.e. they may not mean anything related to real wall-clock time, but only be an indication of the number of CPU cycles (which may still be useful for benchmarking so long as relative times are always used and the user is aware that execution time may vary depending on external factors). Even worse it has been reported that on Linux CPU TimeStampCounter-based CPUTIME clocks may even report the time that a process has slept.
If your system has a good working getrusage() system call then it will hopefully be able to give you a struct timeval for each of the actual user and system times separately consumed by your process while it was running. However, since this puts you back to a microsecond clock at best, you'll need to run your test code enough times repeatedly to get a more accurate timing, calling getrusage() once before the loop, and again afterwards, and then calculating the differences between the times given. For simple algorithms this might mean running them millions of times, or more. Note also that on many systems the division between user time and system time is done somewhat arbitrarily, and if examined separately in a repeated loop, one or the other can even appear to run backwards. However, if your algorithm makes no system calls then summing the time deltas should still be a fair total time for your code execution.
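A sketch of that getrusage() pattern (function_under_test is a stand-in; <sys/time.h> is included for struct timeval):

#include <sys/resource.h>
#include <sys/time.h>
#include <cstdio>

void function_under_test() { /* stand-in for the real code being timed */ }

static double tv_to_sec(const struct timeval& tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main()
{
    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);
    for (long i = 0; i < 10000000; ++i)   // enough repetitions to beat the
        function_under_test();            // microsecond resolution
    getrusage(RUSAGE_SELF, &after);

    double user = tv_to_sec(after.ru_utime) - tv_to_sec(before.ru_utime);
    double sys  = tv_to_sec(after.ru_stime) - tv_to_sec(before.ru_stime);
    std::printf("user %.6f s, system %.6f s, total %.6f s\n",
                user, sys, user + sys);
    return 0;
}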
BTW, take care when comparing time values such that you don't overflow or end up with a negative value in a field, either as @Nim suggests, or perhaps like this (from NetBSD's <sys/time.h>):
#define timersub(tvp, uvp, vvp)                                 \
    do {                                                        \
        (vvp)->tv_sec  = (tvp)->tv_sec  - (uvp)->tv_sec;        \
        (vvp)->tv_usec = (tvp)->tv_usec - (uvp)->tv_usec;       \
        if ((vvp)->tv_usec < 0) {                               \
            (vvp)->tv_sec--;                                    \
            (vvp)->tv_usec += 1000000;                          \
        }                                                       \
    } while (0)
(you might even want to be more paranoid that tv_usec is in range)
One more important note about benchmarking: make sure your function is actually being called, ideally by examining the assembly output from your compiler. Compiling your function in a separate source module from the driver loop usually convinces the optimizer to keep the call. Another trick is to have it return a value that you assign inside the loop to a variable defined as volatile.
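The volatile-sink trick might look like this (a sketch; function_under_test is a stand-in that must return a value):

int function_under_test() { return 42; }   // stand-in for the real code

void benchmark_loop(long iterations)
{
    volatile int sink = 0;    // the compiler cannot assume this store is dead,
                              // so the call cannot be optimized away entirely
    for (long i = 0; i < iterations; ++i)
        sink = function_under_test();
    (void)sink;               // silence "set but not used" warnings
}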
You've got a weird mix of floats and ints here:
long opt_runtime = ((seconds) * 1000 + useconds/1000.0) + 0.5;
Try using:
long opt_runtime = (long)(seconds * 1000 + (float)useconds/1000);
This way you'll get your results in milliseconds.
The execution time of optimal(...) is less than the granularity of gettimeofday(...). This likely happens on Windows. On Windows the typical granularity is up to 20 ms. I've answered a related gettimeofday(...) question here.
For Linux I asked How is the microsecond time of linux gettimeofday() obtained and what is its accuracy? and got a good result.
More information on how to obtain accurate timing is described in this SO answer.
I normally do such a calculation as:
long long ss = start.tv_sec * 1000000LL + start.tv_usec;
long long es = end.tv_sec * 1000000LL + end.tv_usec;
Then do a difference
long long microsec_diff = es - ss;
Now convert as required:
double seconds = microsec_diff / 1000000.;
Normally, I don't bother with the last step, and do all timings in microseconds.