Let's say I want to measure the total time of a particular function. This function calls two other functions (f1 and f2), and I also want to measure the total time of f1 and f2 individually.
What I was expecting was f total time = f1 total time + f2 total time
void f() {
    struct timespec total_start, total_end;
    struct timespec f1_start, f1_end;
    struct timespec f2_start, f2_end;
    double f_total_time, f1_total_time, f2_total_time;

    clock_gettime(CLOCK_MONOTONIC, &total_start);

    clock_gettime(CLOCK_MONOTONIC, &f1_start);
    f1();
    clock_gettime(CLOCK_MONOTONIC, &f1_end);

    clock_gettime(CLOCK_MONOTONIC, &f2_start);
    f2();
    clock_gettime(CLOCK_MONOTONIC, &f2_end);

    clock_gettime(CLOCK_MONOTONIC, &total_end);

    f_total_time  = (total_end.tv_sec - total_start.tv_sec) + (total_end.tv_nsec - total_start.tv_nsec) / 1e9;
    f1_total_time = (f1_end.tv_sec - f1_start.tv_sec) + (f1_end.tv_nsec - f1_start.tv_nsec) / 1e9;
    f2_total_time = (f2_end.tv_sec - f2_start.tv_sec) + (f2_end.tv_nsec - f2_start.tv_nsec) / 1e9;
}
My question is: is this a correct way to measure the time of functions called inside another function?
Problem: the total time of f1 and f2 does not add up to the total time of f, i.e. f total time != f1 total time + f2 total time. What actually happens is f total time > f1 total time + f2 total time.
Am I doing something wrong?
Answer -
Yes. IMHO this appears to be a valid technique for measuring the duration of a function within a function.
The POSIX clock_gettime() reports seconds/nanoseconds from a fixed epoch, so each access is independent of any other.
From "man clock_gettime" :
All implementations support the system-wide real-time clock,
which is identified by CLOCK_REALTIME. Its time represents
seconds and nanoseconds since the Epoch. When its time is
changed, timers for a relative interval are unaffected, but
timers for an absolute point in time are affected.
I see nothing wrong with your approach.
Perhaps you need to compare the duration of the code you are measuring against the cost of the clock-read mechanism you are using.
On my Ubuntu 15.10, on an older Dell, using g++ 5.2.1, the Posix
call
clock_gettime(CLOCK_REALTIME, ...)
uses > 1,500 ns (avg over 3 seconds) (i.e. ~1.5 us)
To achieve some measure of repeatability, the duration you are
trying to measure (f1() and f2() and f1()+f2()) must be more than
this, probably by a factor of 10.
Your system will be different (than mine), so you must test it to
know how long these clock reads take.
There is also the interesting idea of knowing how fast
CLOCK_REALTIME increments. Even though the API indicates
nanoseconds, it might not be that fast.
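Here is a rough sketch (my own harness, not part of the measurement quoted above) of how you could test that on your machine: call clock_gettime() in a tight loop and divide the elapsed time by the iteration count. On older glibc you may need to link with -lrt.

#include <time.h>
#include <stdio.h>

int main(void)
{
    const long iterations = 1000000L;
    struct timespec t, start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; ++i)
        clock_gettime(CLOCK_REALTIME, &t);   /* the call whose cost we want */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9
                      + (end.tv_nsec - start.tv_nsec);
    printf("~%.1f ns per clock_gettime(CLOCK_REALTIME) call\n",
           elapsed_ns / iterations);
    return 0;
}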
An alternative I use is std::time(nullptr) with a cost of ~5 ns (on my system), 3 orders of magnitude less. FYI: ::time(0) measures the same.
A loop controlled by this API's return value simply exits at the end of a second, when the value returned has changed from the previous value. I usually accumulate 3 seconds of loops (i.e. a fixed test time) and compute the average event duration; a sketch of this calibration loop follows below.
Example measurement output:
751.1412070 M 'std::time(nullptr) duration' invocations in 3.999,788 sec (3999788 us)
187.7952549 M 'std::time(nullptr) duration' events per second
5.324948176 n seconds per 'std::time(nullptr) duration' event
If using this clock access, you can simply subtract 5.3 ns (on my
system) from each invocation when calculating the seconds per event for
your functions.
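For reference, here is a sketch of that calibration loop (my own reconstruction of the technique, not the exact code that produced the numbers above):

#include <ctime>
#include <cstdio>

int main()
{
    const std::time_t testSeconds = 3;            // fixed test time

    // align to a second boundary
    std::time_t t0 = std::time(nullptr);
    while (std::time(nullptr) == t0) { /* spin */ }

    // count invocations until 'testSeconds' whole seconds have passed
    std::time_t begin = std::time(nullptr);
    unsigned long long calls = 0;
    while (std::time(nullptr) < begin + testSeconds)
        ++calls;

    std::printf("%llu calls in %ld s  =>  ~%.2f ns per std::time(nullptr) call\n",
                calls, (long)testSeconds,
                1e9 * (double)testSeconds / (double)calls);
    return 0;
}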
Note: Any POSIX API is an interface to a system-provided function, not the function itself.
Being part of an API is not conclusive evidence about the function's implementation ... which may be in any language, even assembly for peak performance.
To time a C++ application, record the initial time in a variable, and declare a duration (in seconds):
#include "time.h"
clock_t t (clock ());
size_t duration (0);
during execution, duration is updated this way :
duration = (clock() - t)/CLOCKS_PER_SEC;
Related
The time command returns the time elapsed in execution of a command.
If I put a gettimeofday() at the start of the command call (using system()), another at the end of the call, and take the difference, it doesn't come out the same. (It's not a very small difference either.)
Can anybody explain what the exact difference between the two approaches is, and which is the best way to time the execution of a call?
Thanks.
The Unix time command measures the whole program execution time, including the time it takes for the system to load your binary and all its libraries, and the time it takes to clean up everything once your program is finished.
On the other hand, gettimeofday can only work inside your program, that is after it has finished loading (for the initial measurement), and before it is cleaned up (for the final measurement).
Which one is best? Depends on what you want to measure... ;)
It's all dependent on what you are timing. If you are trying to time something in seconds, then time() is probably your best bet. If you need higher resolution than that, then I would consider gettimeofday(), which gives up to microsecond resolution (1 / 1000000th of a second).
If you need even higher resolution than that, consider using clock() and CLOCKS_PER_SEC; just note that clock() does not measure elapsed wall-clock time, but rather the CPU time your process has used.
time() returns time since epoch in seconds.
gettimeofday() fills in a struct timeval:
struct timeval {
time_t tv_sec; /* seconds */
suseconds_t tv_usec; /* microseconds */
};
Each time function has different precision. In C++11 you would use std::chrono:
#include <chrono>

using namespace std::chrono;
auto start = high_resolution_clock::now();
/* do stuff */
auto end = high_resolution_clock::now();
float elapsedSeconds = duration_cast<duration<float>>(end - start).count();
How can I count the milliseconds a certain function (called repeatedly) takes?
I thought of:
CTime::GetCurrentTM() before,
CTime::GetCurrentTM() after,
And then put the result into CTimeSpan diff = after - before.
Finally, store that diff in a global member that sums all the diffs, since I want to know the total time this function spent.
But this gives the answer in seconds, not milliseconds.
MFC is C++, right?
If so, you can just use clock().
#include <ctime>
clock_t time1 = clock();
// do something heavy
clock_t time2 = clock();
clock_t timediff = time2 - time1;
float timediff_sec = ((float)timediff) / CLOCKS_PER_SEC;
This will usually give you millisecond precision.
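If you want the running total the question asks for, a minimal sketch along these lines might look like this (timedCall and doWork are placeholder names of my own):

#include <ctime>

static double g_totalMs = 0.0;     // total milliseconds spent in the function

void timedCall()
{
    clock_t before = clock();
    // doWork();                   // the function being measured (hypothetical)
    clock_t after = clock();
    g_totalMs += 1000.0 * (double)(after - before) / CLOCKS_PER_SEC;
}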
If you are using MFC, the nice way is to use the Win32 API. Since you only need to calculate a time difference, the function below might suit you perfectly.
GetTickCount64()
directly returns the number of milliseconds that have elapsed since the system was started.
If you don't need to cover uptimes longer than 49.7 days, the slightly faster GetTickCount() function will also do (its 32-bit counter wraps at that point).
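A sketch of accumulating a total with GetTickCount64() (assuming a Windows build with <windows.h>; TimedCall and DoWork are placeholder names of my own):

#include <windows.h>

static ULONGLONG g_totalMs = 0;    // total milliseconds spent in the function

void TimedCall()
{
    ULONGLONG before = GetTickCount64();
    // DoWork();                   // the function being measured (hypothetical)
    g_totalMs += GetTickCount64() - before;
}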
COleDateTime is known to work internally at millisecond granularity, because it stores its timestamp in its m_dt member, which is of the DATE type, so it has enough resolution for the intended purpose.
I suggest basing your measurement on
DATE now= (DATE) COleDateTime::GetCurrentTime();
and then doing the respective calculations.
The following piece of code gives 0 as the runtime of the function. Can anybody point out the error?
struct timeval start,end;
long seconds,useconds;
gettimeofday(&start, NULL);
int optimalpfs=optimal(n,ref,count);
gettimeofday(&end, NULL);
seconds = end.tv_sec - start.tv_sec;
useconds = end.tv_usec - start.tv_usec;
long opt_runtime = ((seconds) * 1000 + useconds/1000.0) + 0.5;
cout<<"\nOptimal Runtime is "<<opt_runtime<<"\n";
I get both the start and end time as the same. I get the following output:
Optimal Runtime is 0
Tell me the error please.
POSIX 1003.1b-1993 specifies interfaces for clock_gettime() (and clock_getres()), and offers that with the MON option there can be a type of clock with a clockid_t value of CLOCK_MONOTONIC (so that your timer isn't affected by system time adjustments). If available on your system then these functions return a structure which has potential resolution down to one nanosecond, though the latter function will tell you exactly what resolution the clock has.
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* and nanoseconds */
};
You may still need to run your test function in a loop many times for the clock to register any time elapsed beyond its resolution, and perhaps you'll want to run your loop enough times to last at least an order of magnitude more time than the clock's resolution.
Note though that apparently the Linux folks mis-read the POSIX.1b specifications and/or didn't understand the definition of a monotonically increasing time clock, and their CLOCK_MONOTONIC clock is affected by system time adjustments, so you have to use their invented non-standard CLOCK_MONOTONIC_RAW clock to get a real monotonic time clock.
Alternately one could use the related POSIX.1 timer_settime() call to set a timer running, a signal handler to catch the signal delivered by the timer, and timer_getoverrun() to find out how much time elapsed between the queuing of the signal and its final delivery, and then set your loop to run until the timer goes off, counting the number of iterations in the time interval that was set, plus the overrun.
Of course on a preemptive multi-tasking system these clocks and timers will run even while your process is not running, so they are not really very useful for benchmarking.
Slightly more rare is the optional POSIX.1-1999 clockid_t value of CLOCK_PROCESS_CPUTIME_ID, indicated by the presence of the _POSIX_CPUTIME from <time.h>, which represents the CPU-time clock of the calling process, giving values representing the amount of execution time of the invoking process. (Even more rare is the TCT option of clockid_t of CLOCK_THREAD_CPUTIME_ID, indicated by the _POSIX_THREAD_CPUTIME macro, which represents the CPU time clock, giving values representing the amount of execution time of the invoking thread.)
Unfortunately POSIX makes no mention of whether these so-called CPUTIME clocks count just user time, or both user and system (and interrupt) time, accumulated by the process or thread, so if your code under profiling makes any system calls then the amount of time spent in kernel mode may, or may not, be represented.
Even worse, on multi-processor systems, the values of the CPUTIME clocks may be completely bogus if your process happens to migrate from one CPU to another during its execution. The timers implementing these CPUTIME clocks may also run at different speeds on different CPU cores, and at different times, further complicating what they mean. I.e. they may not mean anything related to real wall-clock time, but only be an indication of the number of CPU cycles (which may still be useful for benchmarking so long as relative times are always used and the user is aware that execution time may vary depending on external factors). Even worse it has been reported that on Linux CPU TimeStampCounter-based CPUTIME clocks may even report the time that a process has slept.
If your system has a good working getrusage() system call then it will hopefully be able to give you a struct timeval for each of the actual user and system times separately consumed by your process while it was running. However, since this puts you back to a microsecond clock at best, you'll need to run your test code enough times repeatedly to get a more accurate timing, calling getrusage() once before the loop and again afterwards, and then calculating the differences between the times given. For simple algorithms this might mean running them millions of times, or more. Note also that on many systems the division between user time and system time is done somewhat arbitrarily, and if examined separately in a repeated loop one or the other can even appear to run backwards. However, if your algorithm makes no system calls then summing the time deltas should still be a fair total time for your code execution.
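A sketch of that getrusage() pattern, with a stand-in workload of my own, might look like:

#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    volatile double x = 0.0;
    for (long i = 0; i < 10000000L; ++i)   /* run the code under test many times */
        x += (double)i;

    getrusage(RUSAGE_SELF, &after);
    double user = (after.ru_utime.tv_sec - before.ru_utime.tv_sec)
                + (after.ru_utime.tv_usec - before.ru_utime.tv_usec) / 1e6;
    double sys  = (after.ru_stime.tv_sec - before.ru_stime.tv_sec)
                + (after.ru_stime.tv_usec - before.ru_stime.tv_usec) / 1e6;
    printf("user: %.6f s, system: %.6f s\n", user, sys);
    return 0;
}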
BTW, take care when comparing time values such that you don't overflow or end up with a negative value in a field, either as #Nim suggests, or perhaps like this (from NetBSD's <sys/time.h>):
#define timersub(tvp, uvp, vvp) \
do { \
(vvp)->tv_sec = (tvp)->tv_sec - (uvp)->tv_sec; \
(vvp)->tv_usec = (tvp)->tv_usec - (uvp)->tv_usec; \
if ((vvp)->tv_usec < 0) { \
(vvp)->tv_sec--; \
(vvp)->tv_usec += 1000000; \
} \
} while (0)
(you might even want to be more paranoid that tv_usec is in range)
One more important note about benchmarking: make sure your function is actually being called, ideally by examining the assembly output from your compiler. Compiling your function in a separate source module from the driver loop usually convinces the optimizer to keep the call. Another trick is to have it return a value that you assign inside the loop to a variable defined as volatile.
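For instance, a sketch of the volatile trick (expensive and the iteration count are placeholders of my own):

#include <ctime>
#include <cstdio>

int expensive(int x) { return x * x; }     // stand-in for the function under test

int main()
{
    const long iterations = 10000000L;
    volatile int sink = 0;                 // volatile keeps the calls from being optimized away

    clock_t begin = clock();
    for (long i = 0; i < iterations; ++i)
        sink = expensive((int)i);
    clock_t end = clock();

    std::printf("~%.1f ns of CPU time per call\n",
                1e9 * (double)(end - begin) / CLOCKS_PER_SEC / iterations);
    return (int)sink;
}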
You've got a weird mix of floats and ints here:
long opt_runtime = ((seconds) * 1000 + useconds/1000.0) + 0.5;
Try using:
long opt_runtime = (long)(seconds * 1000 + (float)useconds/1000);
This way you'll get your results in milliseconds.
The execution time of optimal(...) is less than the granularity of gettimeofday(...). This typically happens on Windows, where the granularity is up to 20 ms. I've answered a related gettimeofday(...) question here.
For Linux I asked How is the microsecond time of linux gettimeofday() obtained and what is its accuracy? and got a good result.
More information on how to obtain accurate timing is described in this SO answer.
I normally do such a calculation as:
long long ss = start.tv_sec * 1000000LL + start.tv_usec;
long long es = end.tv_sec * 1000000LL + end.tv_usec;
Then do a difference
long long microsec_diff = es - ss;
Now convert as required:
double seconds = microsec_diff / 1000000.;
Normally I don't bother with the last step and do all timings in microseconds.
I want to measure the runtime of my C++ code. Executing my code takes about 12 hours and I want to write this time at the end of execution of my code. How can I do it in my code?
Operating system: Linux
If you are using C++11 you can use system_clock::now():
auto start = std::chrono::system_clock::now();
/* do some work */
auto end = std::chrono::system_clock::now();
auto elapsed = end - start;
std::cout << elapsed.count() << '\n';
You can also specify the granularity to use for representing a duration:
// this constructs a duration object using milliseconds
auto elapsed =
std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
// this constructs a duration object using seconds
auto elapsed =
std::chrono::duration_cast<std::chrono::seconds>(end - start);
If you cannot use C++11, then have a look at chrono from Boost.
The best thing about using such standard libraries is that their portability is really high (e.g., they both work in Linux and Windows). So you do not need to worry too much if you decide to port your application afterwards.
These libraries follow a modern C++ design too, as opposed to C-like approaches.
EDIT: The example above can be used to measure wall-clock time. That is not, however, the only way to measure the execution time of a program. First, we can distinguish between user and system time:
User time: The time spent by the program running in user space.
System time: The time spent by the program running in system (or kernel) space. A program enters kernel space for instance when executing a system call.
Depending on the objectives it may be necessary or not to consider system time as part of the execution time of a program. For instance, if the aim is to just measure a compiler optimization on the user code then it is probably better to leave out system time. On the other hand, if the user wants to determine whether system calls are a significant overhead, then it is necessary to measure system time as well.
Moreover, since most modern systems are time-shared, different programs may compete for several computing resources (e.g., CPU). In such a case, another distinction can be made:
Wall-clock time: By using wall-clock time the execution of the program is measured in the same way as if we were using an external (wall) clock. This approach does not consider the interaction between programs.
CPU time: In this case we only count the time that a program is actually running on the CPU. If a program (P1) is co-scheduled with another one (P2), and we want to get the CPU time for P1, this approach does not include the time while P2 is running and P1 is waiting for the CPU (as opposed to the wall-clock time approach).
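To make the distinction concrete, here is an illustrative sketch (not part of the original answer) that measures the same stand-in workload with both a wall clock and the process CPU clock:

#include <chrono>
#include <ctime>
#include <iostream>

int main()
{
    auto wallStart = std::chrono::steady_clock::now();
    std::clock_t cpuStart = std::clock();

    volatile double x = 0.0;
    for (long i = 0; i < 50000000L; ++i)   // stand-in workload
        x += static_cast<double>(i);

    std::clock_t cpuEnd = std::clock();
    auto wallEnd = std::chrono::steady_clock::now();

    double wallSec = std::chrono::duration<double>(wallEnd - wallStart).count();
    double cpuSec  = static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    std::cout << "wall-clock: " << wallSec << " s, CPU: " << cpuSec << " s\n";
}

On an idle machine the two numbers come out close; under load the wall-clock figure grows while the CPU figure stays roughly the same.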
For measuring CPU time, Boost includes a set of extra clocks:
process_real_cpu_clock, captures wall clock CPU time spent by the current process.
process_user_cpu_clock, captures user-CPU time spent by the current process.
process_system_cpu_clock, captures system-CPU time spent by the current process.
A tuple-like class process_cpu_clock, that captures real, user-CPU, and system-CPU process times together.
A thread_clock thread steady clock giving the time spent by the current thread (when supported by a platform).
Unfortunately, C++11 does not have such clocks. But Boost is a widely used library and, probably, these extra clocks will be incorporated into C++1x at some point. So, if you use Boost you will be ready when the new C++ standard adds them.
Finally, if you want to measure the time a program takes to execute from the command line (as opposed to adding some code into your program), you may have a look at the time command, just as #BЈовић suggests. This approach, however, would not let you measure individual parts of your program (e.g., the time it takes to execute a function).
Use std::chrono::steady_clock and not std::chrono::system_clock for measuring run time in C++11. The reason is (quoting system_clock's documentation):
on most systems, the system time can be adjusted at any moment
while steady_clock is monotonic and is better suited for measuring intervals:
Class std::chrono::steady_clock represents a monotonic clock. The time
points of this clock cannot decrease as physical time moves forward.
This clock is not related to wall clock time, and is best suitable for
measuring intervals.
Here's an example:
auto start = std::chrono::steady_clock::now();
// do something
auto finish = std::chrono::steady_clock::now();
double elapsed_seconds = std::chrono::duration_cast<
std::chrono::duration<double> >(finish - start).count();
A small practical tip: if you are measuring run time and want to report seconds std::chrono::duration_cast<std::chrono::seconds> is rarely what you need because it gives you whole number of seconds. To get the time in seconds as a double use the example above.
You can use time to start your program. When the program ends, time prints nice statistics about the run. It is easy to configure what to print. By default, it prints the user and CPU times it took to execute the program.
EDIT: Note that any measurement taken from inside the code is not exact, because your application will get blocked by other programs, hence giving you wrong values*.
* By wrong values I mean that it is easy to get the time it took to execute the program, but that time varies depending on the CPU load during program execution. To get a relatively stable time measurement that doesn't depend on the CPU load, one can execute the application using time and use the reported CPU time as the measurement result.
I used something like this in one of my projects:
#include <sys/time.h>
struct timeval start, end;
gettimeofday(&start, NULL);
//Compute
gettimeofday(&end, NULL);
double elapsed = (end.tv_sec - start.tv_sec) * 1000.0
               + (end.tv_usec - start.tv_usec) / 1000.0;
This is for milliseconds and it works both for C and C++.
This is the code I use:
const auto start = std::chrono::steady_clock::now();
// Your code here.
const auto end = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = end - start;
std::cout << "Time in seconds: " << elapsed.count() << '\n';
You don't want to use std::chrono::system_clock because it is not monotonic! If the user changes the time in the middle of your code your result will be wrong - it might even be negative. std::chrono::high_resolution_clock might be implemented using std::chrono::system_clock so I wouldn't recommend that either.
This code also avoids ugly casts.
If you wish to print the measured time with printf(), you can use this:
auto start = std::chrono::system_clock::now();
/* measured work */
auto end = std::chrono::system_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
printf("Time = %lld ms\n", static_cast<long long int>(elapsed.count()));
You could also try some timer classes that start and stop automatically, and gather statistics on the average, maximum and minimum time spent in any block of code, as well as the number of calls. These cxx-rtimer classes are available on GitHub, and offer support for using std::chrono, clock_gettime(), or boost::posix_time as a back-end clock source.
With these timers, you can do something like:
void timeCriticalFunction() {
static rtimers::cxx11::DefaultTimer timer("expensive");
auto scopedStartStop = timer.scopedStart();
// Do something costly...
}
with timing stats written to std::cerr on program completion.
I have found some code on measuring execution time here
http://www.dreamincode.net/forums/index.php?showtopic=24685
However, it does not seem to work for calls to system(). I imagine this is because the execution jumps out of the current process.
clock_t begin=clock();
system(something);
clock_t end=clock();
cout<<"Execution time: "<<diffclock(end,begin)<<" s."<<endl;
Then
double diffclock(clock_t clock1,clock_t clock2)
{
double diffticks=clock1-clock2;
double diffms=(diffticks)/(CLOCKS_PER_SEC);
return diffms;
}
However this always returns 0 seconds... Is there another method that will work?
Also, this is in Linux.
Edit: Also, just to add, the execution time is in the order of hours. So accuracy is not really an issue.
Thanks!
Have you considered using gettimeofday?
struct timeval tv;
struct timeval start_tv;
gettimeofday(&start_tv, NULL);
system(something);
double elapsed = 0.0;
gettimeofday(&tv, NULL);
elapsed = (tv.tv_sec - start_tv.tv_sec) +
(tv.tv_usec - start_tv.tv_usec) / 1000000.0;
Unfortunately clock() only has one second resolution on Linux (even though it returns the time in units of microseconds).
Many people use gettimeofday() for benchmarking, but that measures elapsed time - not time used by this process/thread - so isn't ideal. Obviously if your system is more or less idle and your tests are quite long then you can average the results. Normally less of a problem, but still worth knowing about, is that the time returned by gettimeofday() is non-monotonic - it can jump around a bit, e.g. when your system first connects to an NTP time server.
The best thing to use for benchmarking is clock_gettime() with whichever option is most suitable for your task.
CLOCK_THREAD_CPUTIME_ID - Thread-specific CPU-time clock.
CLOCK_PROCESS_CPUTIME_ID - High-resolution per-process timer from the CPU.
CLOCK_MONOTONIC - Represents monotonic time since some unspecified starting point.
CLOCK_REALTIME - System-wide realtime clock.
NOTE though, that not all options are supported on all Linux platforms - except clock_gettime(CLOCK_REALTIME) which is equivalent to gettimeofday().
Useful link: Profiling Code Using clock_gettime
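As a quick illustration, here is a sketch of my own using CLOCK_PROCESS_CPUTIME_ID (assuming your platform supports it; older glibc versions need -lrt when linking):

#include <time.h>
#include <stdio.h>

int main(void)
{
    struct timespec start, end;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);

    volatile double x = 0.0;
    for (long i = 0; i < 50000000L; ++i)   /* stand-in workload */
        x += (double)i;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);
    double cpu_seconds = (end.tv_sec - start.tv_sec)
                       + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("process CPU time: %.6f s\n", cpu_seconds);
    return 0;
}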
Tuomas Pelkonen already presented the gettimeofday method that allows getting times with microsecond resolution.
In his example he goes on to convert to double. I personally have wrapped the timeval struct in a class of my own that keeps the counts of seconds and microseconds as integers and handles the add and subtract operations correctly.
I prefer to keep integers (with exact maths) rather than move to floating-point numbers and all their woes when I can.
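A rough sketch of that kind of wrapper (the names here are mine, not the actual class referred to above):

#include <sys/time.h>

struct TimeInterval {
    long long sec;
    long long usec;

    static TimeInterval now() {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        TimeInterval t = { (long long)tv.tv_sec, (long long)tv.tv_usec };
        return t;
    }

    TimeInterval operator-(const TimeInterval& rhs) const {
        TimeInterval d = { sec - rhs.sec, usec - rhs.usec };
        if (d.usec < 0) { --d.sec; d.usec += 1000000; }   // normalize, as timersub() does
        return d;
    }
};

Usage is then just TimeInterval elapsed = TimeInterval::now() - before; the counts stay exact integers until you decide how to print them.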