How to time an event in C++?

I'd like to be able to get the number of nanoseconds it takes to do something in my C++ program.
Object creation, time for a function to do its thing, etc.
In Java, we'd do something along the lines of:
long now = System.currentTimeMillis();
// stuff
long diff = (System.currentTimeMillis() - now);
How would you do the same in C++?

The <chrono> library in standard C++ is the best way to do this. It provides a standard, type-safe, generic API for clocks.
#include <chrono>
#include <iostream>
int main() {
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;
    typedef std::chrono::high_resolution_clock clock;

    auto start = clock::now();
    // stuff
    auto end = clock::now();
    std::cout << duration_cast<nanoseconds>(end - start).count() << "ns\n";
}
The actual resolution of the clock will vary between implementations, but this code will always show results in nanoseconds, as accurately as possible given the implementation's tick period.
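If you want to see what tick period the implementation actually uses, the clock's period member (a std::ratio) can be inspected directly; a minimal sketch:
#include <chrono>
#include <iostream>

int main() {
    using clock = std::chrono::high_resolution_clock;
    // period is a std::ratio giving the tick length in seconds,
    // e.g. 1/1000000000 for a nanosecond-resolution clock.
    std::cout << "tick period: " << clock::period::num << "/"
              << clock::period::den << " s\n";
    std::cout << "is_steady: " << std::boolalpha << clock::is_steady << '\n';
}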

In C++11 you can do it using the <chrono> library, where:
The class template std::chrono::duration represents a time interval.
It consists of a count of ticks of type Rep and a tick period, where the tick period is a compile-time rational constant representing the number of seconds from one tick to the next.
It is currently implemented in GCC 4.5.1 (not yet in VC++). See the sample code from cppreference.com, run on Ideone.com, which measures the execution time of a function call.
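To illustrate the Rep/Period idea, here is a minimal sketch (nothing beyond standard <chrono> is assumed):
#include <chrono>
#include <iostream>

int main() {
    // milliseconds is duration<signed integer type, std::milli>,
    // i.e. a count of ticks whose period is 1/1000 of a second.
    std::chrono::milliseconds ms(1500);

    // A duration with a floating-point Rep and a period of one second;
    // converting to it needs no duration_cast because no precision is lost.
    std::chrono::duration<double> secs = ms;
    std::cout << secs.count() << " s\n";  // prints 1.5
}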

Take a look at clock and clock_t. For the resolution you're talking about, I don't think there's native support in C++. To get meaningful values, you'll have to time multiple calls or constructions, or use a profiler (preferable).
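A rough sketch of the multiple-calls approach with clock() (the per-iteration cost is the total divided by the iteration count; the volatile sink is only there to keep the loop from being optimized away):
#include <ctime>
#include <iostream>

int main() {
    const long iterations = 10000000L;
    volatile long sink = 0;                 // prevents the loop from being removed
    std::clock_t start = std::clock();
    for (long i = 0; i < iterations; ++i)
        sink += i;
    std::clock_t end = std::clock();

    double total_ms = 1000.0 * (end - start) / CLOCKS_PER_SEC;
    std::cout << "total: " << total_ms << " ms, per iteration: "
              << total_ms * 1e6 / iterations << " ns\n";
}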

I asked this exact question earlier today. The best solution I have at the moment is to use SDL, and call:
Uint32 a_time = SDL_GetTicks(); // returns a Uint32 count of milliseconds since SDL_Init was called
Although this is probably going to give you lots of overhead, even if you just init SDL with the timer functionality (SDL_Init(SDL_INIT_TIMER)).
Hope this helps you - I settled for this as a solution because it is portable.
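A minimal sketch of that approach (assuming the SDL API with SDL_Init, SDL_GetTicks and SDL_Quit; note this only gives millisecond resolution):
#include <SDL.h>
#include <iostream>

int main() {
    if (SDL_Init(SDL_INIT_TIMER) != 0)      // init only the timer subsystem
        return 1;

    Uint32 start = SDL_GetTicks();          // milliseconds since SDL_Init
    // stuff to time
    Uint32 elapsed = SDL_GetTicks() - start;

    std::cout << elapsed << " ms\n";
    SDL_Quit();
}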

Asked and answered many times.
How do I do High Resolution Timing in C++ on Windows?
C++ obtaining milliseconds time on Linux — clock() doesn't seem to work properly
High Resolution Timing Part of Your Code
High resolution timer with C++ and Linux?
If you're using C++11 you can consider chrono.

Related

Time taken between two points in code independent of system clock CPP Linux

I need to find the time taken to execute a piece of code, and the method should be independent of system time, i.e. chrono and the like wouldn't work.
My use case looks somewhat like this:
int main() {
    // start
    function();
    // end
    time_take = end - start;
}
I am working on an embedded platform that doesn't have the right time at start-up. In my case, the start of the function happens before the actual time is set from the NTP server, and the end happens after the exact time is obtained. So any method that compares the system time at two points wouldn't work. Also, counting CPU ticks wouldn't work for me, since my program won't necessarily be running actively throughout.
I tried the conventional methods and they didn't work for me.
On Linux, clock_gettime() can read CLOCK_MONOTONIC, which is unaffected by system time changes. Reading CLOCK_MONOTONIC at the beginning and the end, and then doing your own math to subtract the two values, will measure the elapsed time while ignoring any system time changes.
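A C-level sketch of that approach, including the manual timespec subtraction:
#include <time.h>     // POSIX clock_gettime, CLOCK_MONOTONIC
#include <iostream>

int main() {
    timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    // function();
    clock_gettime(CLOCK_MONOTONIC, &end);

    long long ns = (end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    std::cout << ns << " ns\n";
}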
If you don't want to dip down to C-level abstractions, <chrono> has this covered for you with steady_clock:
int main() {
    // start
    auto t0 = std::chrono::steady_clock::now();
    function();
    auto t1 = std::chrono::steady_clock::now();
    // end
    auto time_take = t1 - t0;
}
steady_clock is generally a wrapper around clock_gettime used with CLOCK_MONOTONIC, except that it is portable across all platforms. I.e. some platforms don't have clock_gettime, but do have another API for getting a monotonic clock time.
Above, the type of time_take will be steady_clock::duration. On all platforms I'm aware of, this type is an alias for nanoseconds. If you want an integral count of nanoseconds you can:
using namespace std::literals;
int64_t i = time_take/1ns;
The above works on all platforms, even if steady_clock::duration is not nanoseconds.
The minor advantage of <chrono> over a C-level API is that you don't have to deal with computing timespec subtraction manually. And of course it is portable.

C++, Timer, Milliseconds

#include <iostream>
#include <conio.h>
#include <ctime>
using namespace std;

double diffclock(clock_t clock1, clock_t clock2)
{
    double diffticks = clock1 - clock2;
    double diffms = diffticks / (CLOCKS_PER_SEC / 1000);
    return diffms;
}

int main()
{
    clock_t start = clock();
    for (int i = 0;; i++)
    {
        if (i == 10000) break;
    }
    clock_t end = clock();
    cout << diffclock(start, end) << endl;
    getch();
    return 0;
}
So my problem comes down to it returning 0. To be straight, I want to check how much time my program takes to operate...
I found tons of stuff over the internet, but mostly it comes down to the same point of getting a 0, because the start and the end are the same.
This problem is in C++, remember :<
There are a few problems here. The first is that you obviously switched the start and stop times when passing them to the diffclock() function. The second problem is optimization. Any reasonably smart compiler with optimizations enabled would simply throw the entire loop away, as it does not have any side effects. But even if you fix the above problems, the program would most likely still print 0. If you consider that modern CPUs do billions of operations per second, with sophisticated out-of-order execution, branch prediction and tons of other technologies, even the CPU may effectively optimize your loop away. But even if it doesn't, you'd need a lot more than 10K iterations to make it run longer. You'd probably need your program to run for a second or two in order for clock() to reflect anything.
But the most important problem is clock() itself. That function is not suitable for any kind of performance measurement whatsoever. What it does is give you an approximation of the processor time used by the program. Aside from the vague nature of the approximation method that might be used by any given implementation (since the standard doesn't require anything specific of it), the POSIX standard also requires CLOCKS_PER_SEC to be equal to 1000000 independent of the actual resolution. In other words, it doesn't matter how precise the clock is, and it doesn't matter at what frequency your CPU is running. To put it simply, it is a totally useless number and therefore a totally useless function. The only reason it still exists is probably historical. So, please do not use it.
To achieve what you are looking for, people used to read the CPU time stamp counter, also known as "RDTSC" after the name of the corresponding CPU instruction used to read it. These days, however, this is also mostly useless because:
Modern operating systems can easily migrate the program from one CPU to another. You can imagine that reading the time stamp on one CPU after running for a second on another doesn't make a lot of sense. Only in the latest Intel CPUs is the counter synchronized across CPU cores. All in all, it is still possible to do this, but a lot of extra care must be taken (i.e. one can set up the affinity for the process, etc.).
Measuring CPU instructions of the program often does not give an accurate picture of how much time it is actually using. This is because in real programs there can be system calls where the work is performed by the OS kernel on behalf of the process. In that case, that time is not included.
It can also happen that the OS suspends execution of the process for a long time. And even though it took only a few instructions to execute, to the user it seemed like a second. So such a performance measurement may be useless.
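For reference only, a minimal sketch of reading the TSC through a compiler intrinsic (assuming x86/x86-64 and GCC or Clang's <x86intrin.h>; all of the caveats above still apply, and the raw tick count still has to be related to the TSC frequency to mean anything):
#include <x86intrin.h>   // __rdtsc on GCC/Clang; MSVC has it in <intrin.h>
#include <iostream>

int main() {
    unsigned long long t0 = __rdtsc();
    // code to measure
    unsigned long long t1 = __rdtsc();
    std::cout << (t1 - t0) << " TSC ticks\n";  // ticks, not seconds
}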
So what to do?
When it comes to profiling, a tool like perf should be used. It can track the number of CPU cycles, cache misses, branches taken, branches mispredicted, the number of times the process was moved from one CPU to another, and so on. It can be used as a standalone tool, or embedded into your application (with something like PAPI).
And if the question is about actual time spent, people use a wall clock. Preferably a high-precision one that is also not subject to NTP adjustments (i.e. monotonic). That shows exactly how much time elapsed, no matter what else was going on. For that purpose clock_gettime() can be used; it is part of SUSv2 and POSIX.1-2001. Given that you use getch() to keep the terminal open, I'd assume you are using Windows. There, unfortunately, you don't have clock_gettime(), and the closest thing is the performance counter API:
BOOL QueryPerformanceFrequency(LARGE_INTEGER *lpFrequency);
BOOL QueryPerformanceCounter(LARGE_INTEGER *lpPerformanceCount);
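A typical usage sketch of those two calls (Windows only):
#include <windows.h>
#include <iostream>

int main() {
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);   // counter ticks per second, fixed at boot
    QueryPerformanceCounter(&t0);
    // code to measure
    QueryPerformanceCounter(&t1);

    double us = (t1.QuadPart - t0.QuadPart) * 1000000.0 / freq.QuadPart;
    std::cout << us << " microseconds\n";
}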
For a portable solution, the best bet is std::chrono::high_resolution_clock. It was introduced in C++11, but is supported by most industrial-grade compilers (GCC, Clang, MSVC).
Below is an example of how to use it. Please note that since I know my CPU will do 10000 increments of an integer far faster than a millisecond, I have changed the output to microseconds. I've also declared the counter as volatile in the hope that the compiler won't optimize it away.
#include <ctime>
#include <chrono>
#include <iostream>

int main()
{
    volatile int i = 0; // "volatile" is to ask the compiler not to optimize the loop away.
    auto start = std::chrono::steady_clock::now();
    while (i < 10000) {
        ++i;
    }
    auto end = std::chrono::steady_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    std::cout << "It took me " << elapsed.count() << " microseconds." << std::endl;
}
When I compile and run it, it prints:
$ g++ -std=c++11 -Wall -o test ./test.cpp && ./test
It took me 23 microseconds.
Hope it helps. Good Luck!
At a glance, it seems like you are subtracting the larger value from the smaller value. You call:
diffclock( start, end );
But then diffclock is defined as:
double diffclock( clock_t clock1, clock_t clock2 ) {
double diffticks = clock1 - clock2;
double diffms = diffticks / ( CLOCKS_PER_SEC / 1000 );
return diffms;
}
Apart from that, it may have something to do with the way you are converting units; the way 1000 is used to convert to milliseconds differs from what is shown on this page:
http://en.cppreference.com/w/cpp/chrono/c/clock
The problem appears to be that the loop is just too short. I tried it on my system and it gave 0 ticks. I checked what diffticks was and it was 0. Increasing the loop size to 100000000 gave a noticeable time lag, and I got -290 as output (a bug: I think diffticks should be clock2 - clock1, so we should get 290 and not -290). I also tried changing "1000" to "1000.0" in the division, and that didn't help.
Compiling with optimization removes the loop, so you either have to disable optimization or make the loop "do something", e.g. increment a counter other than the loop counter in the loop body. At least that's what GCC does.
Note: this is available since C++11.
You can use the std::chrono library.
std::chrono has two distinct types: a time point and a duration. A time point represents a point in time, and a duration, as the term suggests, represents an interval or span of time.
The library allows us to subtract two time points to get the duration of the interval between them. So you can set a starting point and a stopping point. Using helper functions you can also convert the result into appropriate units.
Example using high_resolution_clock (which is one of the three clocks this library provides):
#include <chrono>
using namespace std::chrono;
//before running function
auto start = high_resolution_clock::now();
//after calling function
auto stop = high_resolution_clock::now();
Subtract stop and start timepoints and cast it into required units using the duration_cast() function. Predefined units are nanoseconds, microseconds, milliseconds, seconds, minutes, and hours.
auto duration = duration_cast<microseconds>(stop - start);
cout << duration.count() << endl;
First of all, you should subtract end - start, not vice versa.
The documentation says that clock() returns -1 if the value is not available; did you check for that?
What optimization level do you use when compiling your program? If optimization is enabled, the compiler can effectively eliminate your loop entirely.

extending the std::chrono functionality to deal with run-time (non compile-time) constant periods

I have been experimenting with all kind of timers on Linux and OSX, and would like to try and wrap some of them with the same interface used by std::chrono.
That's easy to do for timers that have a well-defined period at compile time, e.g. the POSIX clock_gettime() family, the clock_get_time() family on OSX, or gettimeofday().
However, there are some useful timers for which the "period" - while constant - is only known at runtime.
For example:
- POSIX states the period of clock(), CLOCKS_PER_SEC, may be a variable on non-XSI systems
- on Linux, the period of times() is given at runtime by sysconf(_SC_CLK_TCK)
- on OSX, the period of mach_absolute_time() is given at runtime by mach_timebase_info()
- on recent Intel processors, the TSC register ticks at a constant rate, but of course that rate can only be determined at runtime
To wrap these timers in the std::chrono interface, one possibility would be to use a period of std::chrono::nanoseconds and convert the value of each timer to nanoseconds. Another approach would be to use a floating-point representation. However, both approaches would introduce a (very small) overhead to the now() function, and a (probably small) loss in precision.
The solution I'm trying to pursue is to define a set of classes to represent such "run-time constant" periods, built along the same lines as the std::ratio class.
However I expect that will require rewriting all the related template classes and functions (as they assume constexpr values).
How do I wrap these kinds of timers a la std::chrono?
Does anyone have any experience with wrapping these kinds of timers a la std::chrono? Or with using non-constexpr values for the time period of a clock?
Actually I do. And on OSX, one of your platforms of interest. :-)
You mention:
on OSX, the period of mach_absolute_time() is given at runtime by mach_timebase_info()
Absolutely correct. Also on OSX, the libc++ implementation of high_resolution_clock and steady_clock is actually based on mach_absolute_time. I'm the author of this code, which is open source with a generous license (do anything you want with it as long as you retain the copyright).
Here is the source for libc++'s steady_clock::now(). It is built pretty much the way you surmised. The run time period is converted to nanoseconds prior to returning. On OS X the conversion factor is very often 1, and the code takes advantage of that fact with an optimization. However the code is general enough to handle non-1 conversion factors.
On the first call to now() there's a small cost of querying the run time conversion factor to nanoseconds. In the general case a floating point conversion factor is computed. In the common case (conversion factor == 1) the subsequent cost is calling through a function pointer. I've found that the overhead is really quite reasonable.
On OS X the conversion factor, although not determined until run time, is still a constant (i.e. does not vary as the program executes), so it only needs to be computed once.
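A simplified sketch of the idea (OS X only; the libc++ source linked above is the authoritative version, and this hypothetical mach_clock merely shows the mach_timebase_info conversion folded into a chrono-compatible now(), with overflow handling omitted):
#include <mach/mach_time.h>   // mach_absolute_time, mach_timebase_info
#include <chrono>
#include <cstdint>

struct mach_clock {
    using duration   = std::chrono::nanoseconds;
    using rep        = duration::rep;
    using period     = duration::period;
    using time_point = std::chrono::time_point<mach_clock>;
    static const bool is_steady = true;

    static time_point now() {
        // The numer/denom factor is a run-time constant: query it once.
        static const mach_timebase_info_data_t tb = [] {
            mach_timebase_info_data_t info;
            mach_timebase_info(&info);
            return info;
        }();
        std::uint64_t t = mach_absolute_time();
        // Overflow handling of t * numer is omitted for brevity.
        return time_point(duration(t * tb.numer / tb.denom));
    }
};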
If you're in a situation where your period is actually varying dynamically, you'll need more infrastructure to handle this. Essentially you would need to integrate (calculus) the period vs time curve and then compute an average period between two points in time. That would require a constant monitoring of the period as it changes with time, and <chrono> isn't the right tool for that. Such tools are typically handled at the OS level.
[Does anyone have any experience] Or with using non-constexpr values for the time period of a clock ?
After reading through the standard (20.11.5, Class template duration), "period" is expected to be "a specialization of ratio":
Remarks: If Period is not a specialization of ratio, the program is ill-formed.
and all chrono templates rely heavily on constexpr functionality.
Does anyone have any experience with wrapping these kinds of timers a la std::chrono?
I've found here a suggestion to use a duration with period = 1 and boost::rational as the rep, though without any concrete examples.
I have done a similar thing for my purposes, only for Linux though. You find the code here; feel free to use the code in whatever way you want.
The challenges my implementation addresses overlap partially with the ones mentioned in your question. Specifically:
The tick factor (required to convert from clock ticks to a time unit based on seconds) is retrieved at run time, but only the first time now() is used‡. If you are concerned about the small overhead this causes, you may call the now() function once at start-up before you measure any actual intervals. The tick factor is stored in a static variable, which means there is still some overhead as – on the lowest level – each call of the now() function implies checking whether the static variable has been initialized. However, this overhead will be the same in each call of now(), so it shouldn't impact measuring time intervals.
I do not convert to nanoseconds by default, because when measuring relatively long periods of time (e.g. a few seconds) this causes overflows very quickly. This is in fact the main reason why I don't use the boost implementation. Instead of converting to nanoseconds, I implement the base unit as a template parameter (called Precision in the code). I use std::ratio from C++11 as template arguments. So I can choose, for example, a clock<micro>, which implies that calling the now() function will internally convert to microseconds rather than nanoseconds, which means I can measure periods of many seconds or minutes without overflows and still with good precision. (This is independent of the unit used to produce output. You can have a clock<micro> and display the result in seconds, etc.)
My clock type, which is called combined_clock combines user time, system time and wall-clock time. There is a boost clock type for this, too, but it's not compatible with the ratio types and units from std, whereas mine is.
‡ The tick factor is retrieved using the ::sysconf() call you suggest, and that is guaranteed to return one and the same value throughout the lifetime of the process.
So the way you use it is as follows:
#include "util/proctime.hpp"
#include <ratio>
#include <chrono>
#include <thread>
#include <utility>
#include <iostream>
int main()
{
    using std::chrono::duration_cast;
    using millisec   = std::chrono::milliseconds;
    using clock_type = rlxutil::combined_clock<std::micro>;

    auto tp1 = clock_type::now();

    /* Perform some random calculations. */
    unsigned long step1 = 1;
    unsigned long step2 = 1;
    for (int i = 0; i < 50000000; ++i) {
        unsigned long step3 = step1 + step2;
        std::swap(step1, step2);
        std::swap(step2, step3);
    }

    /* Sleep for a while (this adds to real time, but not CPU time). */
    std::this_thread::sleep_for(millisec(1000));

    auto tp2 = clock_type::now();

    std::cout << "Elapsed time: "
              << duration_cast<millisec>(tp2 - tp1)
              << std::endl;

    return 0;
}
The usage above involves a pretty-print function that generates output like this:
Elapsed time: [user 40, system 0, real 1070 millisec]

Linux C++ time measurement library, fast printing library

I just started programming C++ on Linux. Can anyone recommend a good way to measure elapsed time in code, ideally with nanosecond precision, though millisecond precision will do as well.
And also a fast printing method; I am using std::cout at the moment, but I feel it's kind of slow.
Thanks.
You could use gettimeofday, or clock_gettime.
To get a time in nanoseconds, use clock_gettime(). To measure the elapsed time taken by the code, the CLOCK_MONOTONIC_RAW clock type must be used. Using other clock types is not really a solution because they are subject to NTP adjustments.
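A short sketch of that, with a small now_ns() helper (Linux-specific, since CLOCK_MONOTONIC_RAW is a Linux extension):
#include <time.h>    // clock_gettime, CLOCK_MONOTONIC_RAW (Linux)
#include <iostream>

static long long now_ns() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

int main() {
    long long start = now_ns();
    // code to measure
    long long elapsed = now_ns() - start;
    std::cout << elapsed << " ns\n";
}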
As for the printing part - define slow. "General" code that converts built-in data types into ASCII strings is always slow. There is also buffering going on (which is good in most cases). If you can make some good assumptions about your data, you can always throw in your own conversion to ASCII, which will beat general-purpose solutions and make it faster.
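As an illustration of the "good assumptions" point, a hand-rolled u64_to_ascii (shown only as a sketch: non-negative integers, no locale, no sign, no error handling) that skips the general-purpose formatting machinery:
#include <cstdio>

// Writes a non-negative integer into buf and returns the number of
// characters written (no terminating '\0'). Assumes buf is large enough.
static int u64_to_ascii(unsigned long long v, char* buf) {
    char tmp[20];
    int n = 0;
    do {
        tmp[n++] = char('0' + v % 10);
        v /= 10;
    } while (v != 0);
    for (int i = 0; i < n; ++i)        // digits were produced in reverse order
        buf[i] = tmp[n - 1 - i];
    return n;
}

int main() {
    char buf[32];
    int len = u64_to_ascii(1234567890ULL, buf);
    std::fwrite(buf, 1, len, stdout);  // write raw bytes, no formatting pass
    std::fputc('\n', stdout);
}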
EDIT:
See also an example of using clock_gettime() function and OS X specific mach_absolute_time() functions here:
stopwatch.h
stopwatch.c
stopwatch_example.c
For timing you can use the <chrono> standard library:
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    using Clock = std::chrono::high_resolution_clock;
    using std::chrono::milliseconds;
    using std::chrono::nanoseconds;
    using std::chrono::duration_cast;

    auto start = Clock::now();

    // code to time
    std::this_thread::sleep_for(milliseconds(500));

    auto end = Clock::now();
    std::cout << duration_cast<nanoseconds>(end - start).count() << " ns\n";
}
The actual clock resolution depends on the implementation, but this will always output the correct units.
The performance of std::cout depends on the implementation as well. IME, as long as you don't use std::endl everywhere its performance compares quite well with printf on Linux or OS X. Microsoft's implementation in VC++ seems to be much slower.
Printing things is normally slow because of the terminal you're watching it in, rather than because you're printing something in the first place. You can redirect output to a file, then you might see a significant speedup if you're printing a lot to the console.
I think you probably also want to have a look at the time [0] command, which reports the time taken by a specific program to complete execution.
[0] http://linux.about.com/library/cmd/blcmdl1_time.htm
Time measurement:
Boost.Chrono: http://www.boost.org/doc/libs/release/doc/html/chrono.html
// note that if you have a modern C++11 (used to be C++0x) compiler you already have this out of the box, since "Boost.Chrono aims to implement the new time facilities in C++0x, as proposed in N2661 - A Foundation to Sleep On."
Boost.Timer: http://www.boost.org/doc/libs/release/libs/timer/
Posix Time from Boost.Date_Time: http://www.boost.org/doc/libs/release/doc/html/date_time/posix_time.html
Fast printing:
FastFormat: http://www.fastformat.org/
Benchmarks: http://www.fastformat.org/performance.html
Regarding the performance of C++ streams, remember about std::ios_base::sync_with_stdio (a short example follows the links below); see:
http://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio
http://www.cplusplus.com/reference/iostream/ios_base/sync_with_stdio/
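A minimal illustration of that setting, plus the related habit of preferring '\n' over std::endl so each line doesn't force a flush:
#include <iostream>

int main() {
    std::ios_base::sync_with_stdio(false);  // drop synchronization with C stdio
    std::cin.tie(nullptr);                  // don't flush cout before every cin read

    for (int i = 0; i < 100000; ++i)
        std::cout << i << '\n';             // '\n' instead of std::endl: no flush per line
}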

What is the best, most accurate timer in C++?

What is the best, most accurate timer in C++?
In C++11 you can portably get to the highest resolution timer with:
#include <iostream>
#include <chrono>
#include "chrono_io"
int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    auto t1 = Clock::now();
    auto t2 = Clock::now();
    std::cout << t2 - t1 << '\n';
}
Example output:
74 nanoseconds
"chrono_io" is an extension to ease I/O issues with these new types and is freely available here.
There is also an implementation of <chrono> available in boost (might still be on tip-of-trunk, not sure it has been released).
The answer to this is platform-specific. The operating system is responsible for keeping track of timing and consequently, the C++ language itself provides no language constructs or built-in functions for doing this.
However, here are some resources for platform-dependent timers:
Windows API - SetTimer: http://msdn.microsoft.com/en-us/library/ms644906(v=vs.85).aspx
Unix - setitimer: http://linux.die.net/man/2/setitimer
A cross-platform solution might be boost::asio::deadline_timer.
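For example, a blocking wait with the Boost.Asio timer mentioned above (a minimal sketch, assuming a Boost version where io_service and deadline_timer are available; note this schedules an event rather than measuring elapsed time):
#include <boost/asio.hpp>
#include <iostream>

int main() {
    boost::asio::io_service io;
    boost::asio::deadline_timer timer(io, boost::posix_time::seconds(1));
    timer.wait();                       // blocks until the timer expires
    std::cout << "timer expired\n";
}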
Under Windows it would be QueryPerformanceCounter, though seeing as you didn't specify any conditions, it's also possible to use an external ultra-high-resolution timer that has a C++ interface for its driver.
The C++ standard doesn't say a whole lot about time. There are a few features inherited from C via the <ctime> header.
The function clock is the only way to get sub-second precision, but that precision may be as low as one second (the unit is defined by the macro CLOCKS_PER_SEC, while the actual granularity is implementation-defined). Also, it does not measure real time at all, but processor time.
The function time measures real time, but (usually) only to the nearest second.
To measure real time with subsecond precision, you need a nonstandard library.
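A small sketch contrasting the two (sleeping costs real time but almost no processor time; std::this_thread::sleep_for is only used here to stand in for work that mostly waits):
#include <ctime>
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    std::clock_t c0 = std::clock();
    std::time_t  t0 = std::time(nullptr);

    std::this_thread::sleep_for(std::chrono::seconds(2));

    double cpu_s  = double(std::clock() - c0) / CLOCKS_PER_SEC;
    double real_s = std::difftime(std::time(nullptr), t0);
    std::cout << "processor time: " << cpu_s  << " s\n"    // close to 0
              << "real time:      " << real_s << " s\n";   // about 2
}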