Time a function in C++

I'd like to time how long a function takes in C++ in milliseconds.
Here's what I have:
#include <iostream>
#include <chrono>

using timepoint = std::chrono::system_clock::time_point;

float elapsed_time[100];

// Run function and count time
for (int k = 0; k < 100; k++) {
    // Start timer
    const timepoint clock_start = std::chrono::system_clock::now();
    // Run function
    Recursive_Foo();
    // Stop timer
    const timepoint clock_stop = std::chrono::system_clock::now();
    // Calculate time in milliseconds
    std::chrono::duration<double, std::milli> timetaken = clock_stop - clock_start;
    elapsed_time[k] = timetaken.count();
}
for (int l = 0; l < 100; l++) {
    std::cout << "Array: " << l << " Time: " << elapsed_time[l] << " ms" << std::endl;
}
This compiles, but I think multithreading is preventing it from working properly. The output produces times at irregular intervals, e.g.:
Array: 0 Time: 0 ms
Array: 1 Time: 0 ms
Array: 2 Time: 15.6 ms
Array: 3 Time: 0 ms
Array: 4 Time: 0 ms
Array: 5 Time: 0 ms
Array: 6 Time: 15.6 ms
Array: 7 Time: 0 ms
Array: 8 Time: 0 ms
Do I need to use some kind of mutex lock? Or is there an easier way to time how many milliseconds a function took to execute?
EDIT
Several people have suggested using high_resolution_clock or steady_clock instead, but all three clocks produce the same irregular results.
This solution seems to produce real results (How to use QueryPerformanceCounter?), but it's not clear to me why. https://gamedev.stackexchange.com/questions/26759/best-way-to-get-elapsed-time-in-miliseconds-in-windows also works well. It seems to be a Windows implementation issue.

Microsoft has a nice, clean solution in microseconds, via MSDN:
#include <windows.h>

LONGLONG measure_activity_high_resolution_timing()
{
    LARGE_INTEGER StartingTime, EndingTime, ElapsedMicroseconds;
    LARGE_INTEGER Frequency;

    QueryPerformanceFrequency(&Frequency);
    QueryPerformanceCounter(&StartingTime);

    // Activity to be timed

    QueryPerformanceCounter(&EndingTime);
    ElapsedMicroseconds.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;

    //
    // We now have the elapsed number of ticks, along with the
    // number of ticks-per-second. We use these values
    // to convert to the number of elapsed microseconds.
    // To guard against loss-of-precision, we convert
    // to microseconds *before* dividing by ticks-per-second.
    //
    ElapsedMicroseconds.QuadPart *= 1000000;
    ElapsedMicroseconds.QuadPart /= Frequency.QuadPart;

    return ElapsedMicroseconds.QuadPart;
}
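For example, applied to the question's loop, the same pattern might look like this (a sketch only; Recursive_Foo is the function from the question and is assumed to be defined elsewhere):

#include <windows.h>
#include <iostream>

void Recursive_Foo(); // from the question; assumed defined elsewhere

int main()
{
    LARGE_INTEGER Frequency, StartingTime, EndingTime;
    QueryPerformanceFrequency(&Frequency); // ticks per second, fixed at boot

    QueryPerformanceCounter(&StartingTime);
    Recursive_Foo();                       // activity to be timed
    QueryPerformanceCounter(&EndingTime);

    // Convert ticks to microseconds before dividing, as above.
    LONGLONG elapsed_us =
        (EndingTime.QuadPart - StartingTime.QuadPart) * 1000000 / Frequency.QuadPart;
    std::cout << elapsed_us / 1000.0 << " ms\n";
    return 0;
}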

Profile code using a high-resolution timer, not the system clock, which, as you're seeing, has very limited granularity.
http://www.cplusplus.com/reference/chrono/high_resolution_clock/
typedef std::chrono::high_resolution_clock::time_point tp;

const tp start = std::chrono::high_resolution_clock::now();
// do stuff
const tp end = std::chrono::high_resolution_clock::now();
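Put together as a complete program, a minimal sketch might look like this (do_stuff is a hypothetical placeholder for the code being timed):

#include <chrono>
#include <iostream>

void do_stuff(); // hypothetical placeholder for the code being timed

int main()
{
    using hrc = std::chrono::high_resolution_clock;

    const hrc::time_point start = hrc::now();
    do_stuff();
    const hrc::time_point end = hrc::now();

    // duration<double, std::milli> converts the tick count to milliseconds.
    const std::chrono::duration<double, std::milli> elapsed = end - start;
    std::cout << elapsed.count() << " ms\n";
    return 0;
}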

If you suspect that some other process or thread in your app is taking too much CPU time, then use:
GetThreadTimes under Windows
or
clock_gettime with CLOCK_THREAD_CPUTIME_ID under Linux
to measure the CPU time of the thread your function was executing on. This excludes from your measurements the time during which other threads/processes were executing.
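A minimal Linux sketch of the second option (work is a hypothetical placeholder; on older glibc you may need to link with -lrt):

#include <cstdio>
#include <ctime>

void work(); // hypothetical placeholder for the function being profiled

int main()
{
    struct timespec t0, t1;

    // CLOCK_THREAD_CPUTIME_ID counts only the CPU time of the calling thread.
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t0);
    work();
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("thread CPU time: %f ms\n", ms);
    return 0;
}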

Related

Longer execution time of OpenCV function cv::remap() when program is put to sleep in between

I am doing some image processing using the OpenCV library, and I discovered that the time it takes to process an image depends on the amount of time I put my thread to sleep in between image processing. I measured the execution time of several parts of my program and discovered that the function cv::remap() seems to execute two times slower if I put my thread to sleep for more than a certain time period.
Below is the minimal code snippet which shows the strange behavior. I measure the time it takes to execute the cv::remap() function and then I send my thread to sleep for the number of milliseconds set in sleep_time.
#include <opencv2/imgproc.hpp>
#include <chrono>
#include <thread>
#include <iostream>

int main(int argc, char **argv) {
    cv::Mat src = ...     // Init
    cv::Mat dst = ...     // Init
    cv::Mat1f map_x = ... // Init
    cv::Mat1f map_y = ... // Init

    for (int i = 0; i < 5; i++) {
        auto t1 = std::chrono::system_clock::now();
        cv::remap(src, dst, map_x, map_y, cv::INTER_NEAREST, cv::BORDER_CONSTANT, 0);
        auto t2 = std::chrono::system_clock::now();

        std::chrono::duration<double> elapsed_time = t2 - t1;
        std::cout << "elapsed time = " << elapsed_time.count() * 1e3 << " ms" << std::endl;

        int sleep_time = 0;
        // int sleep_time = 20;
        // int sleep_time = 100;
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep_time));
    }
    return 0;
}
If sleep_time is set to 0 the processing takes about 5 ms. Here is the output.
elapsed time = 5.94945 ms
elapsed time = 5.7458 ms
elapsed time = 5.69947 ms
elapsed time = 5.68581 ms
elapsed time = 5.7218 ms
But if I set the sleep_time to 100, the processing is more than two times slower.
elapsed time = 6.09076 ms
elapsed time = 13.2568 ms
elapsed time = 13.4524 ms
elapsed time = 13.3631 ms
elapsed time = 13.3581 ms
I tried out many different values for sleep_time, and it seems that the execution time doubles when the sleep_time is roughly three times higher than the elapsed_time (sleep_time > 3 * elapsed_time). If I increase the complexity of the computation inside the function cv::remap() (e.g. increase the size of the processed image), then the sleep_time can also be set to higher values before the execution time starts to double.
I am running my program on an embedded device with an ARM iMX6 processor running Linux, but I was able to recreate the problem on my desktop running Ubuntu 16.04. I am using the compiler arm-angstrom-linux-gnueabi-gcc (GCC) 7.3.0 and OpenCV version 3.3.0.
Does anybody have an idea what is going on?
This is probably your CPU frequency scaling kicking in.
The default frequency governor on Linux is usually "ondemand", which means the clock speed is scaled down when the load on the CPU is low and scaled back up when the load increases. As this process takes some time, your short computation bursts fail to bring the clock speed up, and your process effectively runs on a slower CPU than you actually have.
I have tested this theory on my machine by executing
sudo cpupower frequency-set -g performance
and the effect immediately disappeared. To set the governor back, execute
sudo cpupower frequency-set -g ondemand
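To verify the theory on your own machine, a small sketch like the following (the sysfs path is standard on desktop Linux but may differ on some embedded kernels) can be called before and after each cv::remap() to watch the clock speed drop during the sleeps:

#include <fstream>
#include <iostream>

// Read cpu0's current clock speed in kHz from sysfs.
long read_cpu0_khz()
{
    std::ifstream f("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");
    long khz = 0;
    f >> khz;
    return khz;
}

// e.g. inside the measurement loop:
// std::cout << "cpu0 at " << read_cpu0_khz() << " kHz" << std::endl;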

IBM AIX std::clock()?

I execute the following code on IBM AIX.
#include <cstdio>
#include <ctime>
#include <boost/thread.hpp>
#include <boost/chrono.hpp>

int main(void)
{
    printf("start\n");

    double time1 = (double)clock();  /* get initial time */
    time1 = time1 / CLOCKS_PER_SEC;  /* in seconds */

    boost::this_thread::sleep_for(boost::chrono::seconds(5));

    /* call clock a second time */
    double time2 = (((double)clock()) / CLOCKS_PER_SEC);

    double timedif = time2 - time1;
    printf("The elapsed time is %lf seconds, time1:%lf time2:%lf CLOCKS_PER_SEC:%ld\n",
           timedif, time1, time2, (long)CLOCKS_PER_SEC);
    return 0;
}
The result is:
2018-04-07 09:58:37 start
2018-04-07 09:58:42 The elapsed time is 0.000180 seconds, time1:0.000000 time2:0.000181 CLOCKS_PER_SEC:1000000
I don't know why the elapsed time is 0.000180 seconds (why not 5)?
According to the manual:
Returns the processor time consumed by the program.
That is CPU time consumed by the program, not physical (wall-clock) time. A sleeping program does not consume CPU time, so in rough terms you measured the interval from main() until the sleep, plus the interval from after the sleep until the return.
If you want to get system/real time, look at the std::chrono::system_clock class.
#include <chrono>
using std::chrono::system_clock;
system_clock::time_point time_now = system_clock::now();
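A sketch contrasting the two clocks around the same 5-second sleep (std::this_thread is used here instead of the question's boost call, but the behavior is the same):

#include <chrono>
#include <cstdio>
#include <ctime>
#include <thread>

int main()
{
    const clock_t c1 = clock();
    const auto w1 = std::chrono::system_clock::now();

    std::this_thread::sleep_for(std::chrono::seconds(5));

    const clock_t c2 = clock();
    const auto w2 = std::chrono::system_clock::now();

    // clock() counts CPU time, so the sleep barely registers;
    // system_clock measures wall time and reports roughly 5 seconds.
    printf("cpu:  %f s\n", (double)(c2 - c1) / CLOCKS_PER_SEC);
    printf("wall: %f s\n", std::chrono::duration<double>(w2 - w1).count());
    return 0;
}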

Different values in measuring the elapsed time C++

I have some simple code, and I used clock() and the other suggested methods to measure the running time of the program. The problem is that I get different values when I run it from time to time.
Is there any way to measure the real execution time of the program?
Thanks in advance
One way of doing it uses #include <ctime>:
clock_t t = clock();      // take a start time
// ... do something
clock_t dt = clock() - t; // take elapsed time
cout << ((double)dt / CLOCKS_PER_SEC) * 1000; // duration in MILLIseconds
The other approach uses the high_resolution_clock of #include <chrono>:
chrono::high_resolution_clock::time_point t = chrono::high_resolution_clock::now();
//... do something
chrono::high_resolution_clock::time_point t2 = chrono::high_resolution_clock::now();
cout << chrono::duration_cast<chrono::duration<double>>(t2 - t).count();
// or if you prefer duration_cast<milliseconds>(t2 - t).count();
In any case, it's normal to find small variations. The first reason is the other programs running on your PC; the second is clock accuracy (for example, the famous ~15.6 milliseconds on Windows).
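One common way to tame those variations (a sketch, not the only approach) is to repeat the measurement several times and keep the minimum, which filters out interference from other programs:

#include <algorithm>
#include <chrono>
#include <iostream>

void work(); // hypothetical placeholder for the code under test

int main()
{
    using hrc = std::chrono::high_resolution_clock;

    double best_ms = 1e300;
    for (int run = 0; run < 10; run++) {
        const auto t0 = hrc::now();
        work();
        const auto t1 = hrc::now();
        const std::chrono::duration<double, std::milli> d = t1 - t0;
        best_ms = std::min(best_ms, d.count());
    }
    std::cout << "best of 10 runs: " << best_ms << " ms\n";
    return 0;
}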

Millisecond timing C++

I want to time the real-time performance of some C++ functions I have written. How do I get the timing in milliseconds scale?
I know how to get time in seconds via
start = clock();
diff = (clock() - start) / (double) CLOCKS_PER_SEC;
cout << diff;
I am using Ubuntu-Linux OS and g++ compiler.
In Linux, take a look at clock_gettime(). It can essentially give you the time elapsed since an arbitrary point, in nanoseconds (which should be good enough for you).
Note that it is specified by the POSIX standard, so you should be fine using it on Unix-derived systems.
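A minimal sketch, assuming CLOCK_MONOTONIC (which, unlike CLOCK_REALTIME, is not affected by system clock adjustments):

#include <cstdio>
#include <ctime>

int main()
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    // ... code to time ...
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ms = (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("%f ms\n", ms);
    return 0;
}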
Try diff = (clock() - start) * 1000.0 / CLOCKS_PER_SEC;
The idea is that you multiply the number of clocks by 1000, so that whereas before you might get 2 (seconds), you now get 2000 (milliseconds).
Notes:
On my Dell desktop, which is reasonably quick ...
ubuntu bogomips peak at 5210
time(0) takes about 80 nanoseconds per call (30 million calls in 2.4 seconds)
time(0) allows me to measure clock_gettime(), which takes about 1.3 microseconds per call (2.2 million calls in 3 seconds)
(I don't remember how many nanoseconds per time step.)
So typically, I use the following, with about 3 seconds of invocations.
// ////////////////////////////////////////////////////////////////////////////
void measuring_something_duration()
{
    ...
    uint64_t start_us = dtb::get_system_microsecond();

    do_something_for_about_3_seconds();

    uint64_t test_duration_us = dtb::get_system_microsecond() - start_us;
    uint64_t test_duration_ms = test_duration_us / 1000;
    ...
}
which use these functions
// ////////////////////////////////////////////////////////////////////////////
#include <cstdint>
#include <ctime>

static const uint64_t NSPUS = 1000ULL;       // nanoseconds per microsecond
static const uint64_t NSPS = 1000000000ULL;  // nanoseconds per second

uint64_t dtb::get_system_microsecond(void)
{
    uint64_t total_ns = dtb::get_system_nanosecond(); // see below
    uint64_t ret_val = total_ns / NSPUS;
    return ret_val;
}

// ////////////////////////////////////////////////////////////////////////////
uint64_t dtb::get_system_nanosecond(void)
{
    // struct timespec { __time_t tv_sec; long int tv_nsec; };
    struct timespec ts;

    // CLOCK_REALTIME - system-wide real-time clock
    int status = clock_gettime(CLOCK_REALTIME, &ts);
    dtb_assert(0 == status);

    // widen to 8-byte from 4-byte
    uint64_t uli_nsec = ts.tv_nsec;
    uint64_t uli_sec = ts.tv_sec;

    uint64_t total_ns = uli_nsec + (uli_sec * NSPS);
    return total_ns;
}
Remember to link -lrt

Calculating time length between operations in c++

The program is a middleware between a database and an application. For each database access I must calculate the elapsed time in milliseconds. The example below uses TDateTime from the Builder library. I must, as far as possible, use only standard C++ libraries.
AnsiString TimeInMilliseconds(TDateTime t) {
    Word Hour, Min, Sec, MSec;
    DecodeTime(t, Hour, Min, Sec, MSec);
    long ms = MSec + Sec * 1000 + Min * 1000 * 60 + Hour * 1000 * 60 * 60;
    return IntToStr(ms);
}

// computing times
TDateTime SelectStart = Now();
sql_manipulation_statement();
TDateTime SelectEnd = Now();
On both Windows and POSIX-compliant systems (Linux, OS X, etc.), you can measure elapsed time in timer ticks (units of 1/CLOCKS_PER_SEC seconds) using clock() from <ctime>. The return value is the number of timer ticks consumed since the program started running; divide by CLOCKS_PER_SEC to convert to seconds. Two calls to clock() can then be subtracted from each other to calculate the running time of a given block of code.
So for example:
#include <ctime>
#include <cstdio>

clock_t time_a = clock();
// ... run block of code
clock_t time_b = clock();

if (time_a == ((clock_t)-1) || time_b == ((clock_t)-1))
{
    perror("Unable to calculate elapsed time");
}
else
{
    unsigned int total_time_ticks = (unsigned int)(time_b - time_a);
    double elapsed_ms = total_time_ticks * 1000.0 / CLOCKS_PER_SEC;
    printf("elapsed: %f ms\n", elapsed_ms);
}
Edit: You are not going to be able to directly compare the timings from a POSIX-compliant platform to a Windows platform, because on Windows clock() measures wall-clock time, whereas on a POSIX system it measures elapsed CPU time. But it is a function in a standard C++ library, and for comparing performance between different blocks of code on the same platform it should fit your needs.
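If your compiler supports C++11, std::chrono gives portable wall-clock timing as well. A minimal sketch (steady_clock chosen because it cannot jump backwards; sql_manipulation_statement is the call from the question):

#include <chrono>
#include <iostream>

void sql_manipulation_statement(); // from the question; assumed defined elsewhere

int main()
{
    const auto SelectStart = std::chrono::steady_clock::now();
    sql_manipulation_statement();
    const auto SelectEnd = std::chrono::steady_clock::now();

    // duration_cast truncates the elapsed time to whole milliseconds.
    const auto ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(SelectEnd - SelectStart);
    std::cout << "query took " << ms.count() << " ms" << std::endl;
    return 0;
}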
On Windows you can use GetTickCount (MSDN), which gives the number of milliseconds that have elapsed since the system was started. Using it before and after the call, you get the number of milliseconds the call took.
DWORD start = GetTickCount();
//Do your stuff
DWORD end = GetTickCount();
cout << "the call took " << (end - start) << " ms";
Edit:
As Jason mentioned, clock() would be better because it is not tied to Windows only.