I would like to measure wallclock time taken by my algorithm in C++. Many articles point to this code.
clock_t begin_time, end_time;
begin_time = clock();
Algorithm();
end_time = clock();
cout << ((double)(end_time - begin_time)/CLOCKS_PER_SEC) << endl;
But this measures only the CPU time taken by my algorithm.
Some other article pointed out this code.
double getUnixTime(void)
{
struct timespec tv;
if(clock_gettime(CLOCK_REALTIME, &tv) != 0) return 0;
return (tv.tv_sec + (tv.tv_nsec / 1000000000.0));
}
double begin_time, end_time;
begin_time = getUnixTime();
Algorithm();
end_time = getUnixTime();
cout << (double) (end_time - begin_time) << endl;
I thought it would print the wall-clock time taken by my algorithm. But surprisingly, the time printed by this code is much lower than the CPU time printed by the previous code. So I am confused. Please provide code for printing wall-clock time.
Those times are probably down in the noise. To get a reasonable time measurement, try executing your algorithm many times in a loop:
const int loops = 1000000;
double begin_time, end_time;
begin_time = getUnixTime();
for (int i = 0; i < loops; ++i)
Algorithm();
end_time = getUnixTime();
cout << (double) (end_time - begin_time) / loops << endl;
I'm getting approximately the same times in a single threaded program:
#include <time.h>
#include <stdio.h>
__attribute((noinline)) void nop(void){}
void loop(unsigned long Cnt) { for(unsigned long i=0; i<Cnt;i++) nop(); }
int main()
{
clock_t t0,t1;
struct timespec ts0,ts1;
t0=clock();
clock_gettime(CLOCK_REALTIME,&ts0);
loop(1000000000);
t1=clock();
clock_gettime(CLOCK_REALTIME,&ts1);
printf("clock-diff: %lu\n", (unsigned long)((t1 - t0)/CLOCKS_PER_SEC));
printf("clock_gettime-diff: %lu\n", (unsigned long)((ts1.tv_sec - ts0.tv_sec)));
}
//prints 2 and 3 or 2 and 2 on my system
But the clock() manpage only describes it as returning an approximation. There's no indication that the approximation is comparable to what clock_gettime returns.
Where I get drastically different results is where I throw in multiple threads:
#include <time.h>
#include <stdio.h>
#include <pthread.h>
__attribute((noinline)) void nop(void){}
void loop(unsigned long Cnt) {
for(unsigned long i=0; i<Cnt;i++) nop();
}
void *busy(void *A){ (void)A; for(;;) nop(); }
int main()
{
pthread_t ptids[4];
for(size_t i=0; i<sizeof(ptids)/sizeof(ptids[0]); i++)
pthread_create(&ptids[i], 0, busy, 0);
clock_t t0,t1;
struct timespec ts0,ts1;
t0=clock();
clock_gettime(CLOCK_REALTIME,&ts0);
loop(1000000000);
t1=clock();
clock_gettime(CLOCK_REALTIME,&ts1);
printf("clock-diff: %lu\n", (unsigned long)((t1 - t0)/CLOCKS_PER_SEC));
printf("clock_gettime-diff: %lu\n", (unsigned long)((ts1.tv_sec - ts0.tv_sec)));
}
//prints 18 and 4 on my 4-core linux system
That's because both musl and glibc on Linux use clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) to implement clock(), and the clock_gettime manpage describes the CLOCK_PROCESS_CPUTIME_ID clock (which is not part of standard C) as returning the CPU time of all threads in the process together.
The following Binary Search program is returning a running time of 0 milliseconds using GetTickCount() no matter how big the search item is set in the given list of values.
Is there any other way to get the running time for comparison?
Here's the code :
#include <iostream>
#include <windows.h>
using namespace std;
int main(int argc, char **argv)
{
long int i = 1, max = 10000000;
long int *data = new long int[max];
long int initial = 1;
long int final = max, mid, loc = -5;
for(i = 1; i<=max; i++)
{
data[i] = i;
}
int range = final - initial + 1;
long int search_item = 8800000;
cout<<"Search Item :- "<<search_item<<"\n";
cout<<"-------------------Binary Search-------------------\n";
long int start = GetTickCount();
cout<<"Start Time : "<<start<<"\n";
while(initial<=final)
{
mid=(initial+final)/2;
if(data[mid]==search_item)
{
loc=mid;
break;
}
if(search_item<data[mid])
final=mid-1;
if(search_item>data[mid])
initial=mid+1;
}
long int end = GetTickCount();
cout<<"End Time : "<<end<<"\n";
cout << "time: " << double(end - start)<<" milliseconds \n";
if(loc==-5)
cout<<" Required number not found "<<endl;
else
cout<<" Required number is found at index "<<loc<<endl;
return 0;
}
Your code looks like this:
int main()
{
// Some code...
while (some_condition)
{
// Some more code...
// Print timing result
return 0;
}
}
That's why your code prints zero time: you only do one iteration of the loop and then exit the program.
Try clock() and the clock_t type from the time.h header:
clock_t START, END;
START = clock();
// YOUR CODE GOES HERE
END = clock();
double clocks = END - START;
cout << "running time : " << clocks/CLOCKS_PER_SEC << " seconds" << endl;
CLOCKS_PER_SEC is a macro that gives the number of clock ticks per second, so dividing by it converts ticks to seconds.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724408(v=vs.85).aspx
This article says that the result of GetTickCount will wrap around to zero if your system runs continuously for 49.7 days.
See "Easily measure elapsed time" for how to measure time in C++.
You can use the time.h header and do something like this in your code:
clock_t Start, Stop;
double sec;
Start = clock();
//call your BS function
Stop = clock();
sec = ((double) (Stop - Start) / CLOCKS_PER_SEC);
and then print sec.
I hope this helps you!
Binary search takes about log2(N) steps, which is about 23 for N = 10000000.
I think that is too little work to measure on a real-time scale, or even with clock().
In this case you should use unsigned long long __rdtsc(), which returns the number of processor ticks since the last reset. Call it before and after your binary search, and move cout << start; to after you obtain the end time. Otherwise the time spent on output would be included.
There is also memory corruption around the data array. Indices in C run from 0 to size - 1, so there is no data[max] element.
And call delete [] data; before returning.
I have a c++ program running in vs2010 as follows:
#include <windows.h>
#include <stdio.h>
#include <time.h>
long factorial(int n)
{
int counter;
long fact = 1;
for (int counter = 1; counter <= n; counter++)
{
fact = fact * counter;
}
Sleep(100);
return fact;
}
int main(void)
{
LARGE_INTEGER freq;
LARGE_INTEGER t0, tF, tDiff;
double elapsedTime;
double resolution;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&t0);
// code to be timed goes HERE
{
factorial(10);
}
QueryPerformanceCounter(&tF);
tDiff.QuadPart = tF.QuadPart - t0.QuadPart;
float deltaseconds = ((float)tDiff.QuadPart)/((float) freq.QuadPart);
elapsedTime = tDiff.QuadPart / (double) freq.QuadPart;
printf("Code under test took %lf sec\n", elapsedTime);
return 0;
}
This program displays the time taken by factorial. Now, when I build it, I see this in the build window:
1>Build succeeded.
1>Time Elapsed 00:00:01.36
Now I want my program to display this "Time Elapsed" information as well. Please suggest how I can fetch this information and display it, either from my program itself or through some other interface.
I have an array of booleans, each representing a number. I am printing each one that is true with a for loop:
for(unsigned long long l = 0; l<numt; l++) if(primes[l]) cout << l << endl;
numt is the size of the array and is over 1000000. The console window takes 30 seconds to print out all the values, but a timer I put in my program says 37 ms. How do I wait in my program for all the values to finish printing on the screen, so that I can include that in my time?
Try this:
#include <windows.h>
...
int main() {
//init code
double startTime = GetTickCount();
//your loop
double timeNeededinSec = (GetTickCount() - startTime) / 1000.0;
}
Just in defense of ctime, since it gives the same result as GetTickCount:
#include <ctime>
int main()
{
...
clock_t start = clock();
...
clock_t end = clock();
double timeNeededinSec = static_cast<double>(end - start) / CLOCKS_PER_SEC;
...
}
Update:
And here is one with time(), but in this case we can lose some precision (~1 sec) because the result is in whole seconds.
#include <ctime>
int main()
{
time_t start;
time_t end;
...
time(&start);
...
time(&end);
int timeNeededinSec = static_cast<int>(end-start);
}
Combining both of them in a simple example will show you the difference in the results. In my tests, the results differed only in the digits after the decimal point.
I'm trying to code a 'live' timer that runs during a calculation. It should continuously output the seconds since the start of the calculation - something like a progress bar. I did this so far:
#include <conio.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
clock_t t;
t = clock();
while (!_kbhit())
{
t = clock() - t;
printf("%f", ((float)t) / CLOCKS_PER_SEC);
system("cls");
}
return 0;
}
But there are a few problems:
It's flickering due to the call of system("cls").
The time is by far not correct due to the continuous call of printf()
Is there a rather easy way of doing this with C?
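One very simple, though not ideal, way is to print less often. Here is an untested example that prints the time once every INTERVAL clock ticks. It also keeps the start time in its own variable, because t = clock() - t overwrites the start time on every pass, and it uses \r to overwrite the line instead of calling system("cls"), which avoids the flicker.
#include <conio.h>
#include <stdio.h>
#include <time.h>
#define INTERVAL 1000 /* clock ticks between two updates */
int main(void)
{
clock_t start = clock();
clock_t next_print = start + INTERVAL;
while (!_kbhit())
{
clock_t now = clock();
if (now >= next_print){
next_print = now + INTERVAL;
/* \r moves the cursor back to the start of the line */
printf("\r%f", ((double)(now - start)) / CLOCKS_PER_SEC);
fflush(stdout);
}
}
return 0;
}
You can change INTERVAL to something smaller.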
I'm playing around with the new C++ standard. I wrote a test to observe the behavior of scheduling algorithms and see what's happening with threads. Considering context switch time, I expected the real waiting time for a specific thread to be a bit more than the value passed to std::this_thread::sleep_for(). But surprisingly, it's sometimes even less than the sleep time! I can't figure out why this happens, or what I'm doing wrong...
#include <iostream>
#include <thread>
#include <random>
#include <vector>
#include <functional>
#include <math.h>
#include <unistd.h>
#include <sys/time.h>
void heavy_job()
{
// here we're doing some kind of time-consuming job..
int j=0;
while(j<1000)
{
int* a=new int[100];
for(int i=0; i<100; ++i)
a[i] = i;
delete[] a;
for(double x=0;x<10000;x+=0.1)
sqrt(x);
++j;
}
std::cout << "heavy job finished" << std::endl;
}
void light_job(const std::vector<int>& wait)
{
struct timeval start, end;
long utime, seconds, useconds;
std::cout << std::showpos;
for(std::vector<int>::const_iterator i = wait.begin();
i!=wait.end();++i)
{
gettimeofday(&start, NULL);
std::this_thread::sleep_for(std::chrono::microseconds(*i));
gettimeofday(&end, NULL);
seconds = end.tv_sec - start.tv_sec;
useconds = end.tv_usec - start.tv_usec;
utime = ((seconds) * 1000 + useconds/1000.0);
double delay = *i - utime*1000;
std::cout << "delay: " << delay/1000.0 << std::endl;
}
}
int main()
{
std::vector<int> wait_times;
std::uniform_int_distribution<unsigned int> unif;
std::random_device rd;
std::mt19937 engine(rd());
std::function<unsigned int()> rnd = std::bind(unif, engine);
for(int i=0;i<1000;++i)
wait_times.push_back(rnd()%100000+1); // random sleep time between 1 and 100000 µs
std::thread heavy(heavy_job);
std::thread light(light_job,wait_times);
light.join();
heavy.join();
return 0;
}
Output on my Intel Core-i5 machine:
.....
delay: +0.713
delay: +0.509
delay: -0.008 // !
delay: -0.043 // !!
delay: +0.409
delay: +0.202
delay: +0.077
delay: -0.027 // ?
delay: +0.108
delay: +0.71
delay: +0.498
delay: +0.239
delay: +0.838
delay: -0.017 // also !
delay: +0.157
Your timing code is causing integral truncation.
utime = ((seconds) * 1000 + useconds/1000.0);
double delay = *i - utime*1000;
Suppose your wait time was 888888 microseconds and you sleep for exactly that amount. seconds will be 0 and useconds will be 888888. After dividing by 1000.0, you get 888.888. Then you add 0*1000, still yielding 888.888. That then gets assigned to a long, leaving you with 888, and an apparent delay of 888.888 - 888 = 0.888.
You should update utime to actually store microseconds so that you don't get the truncation, and also because the name implies that the unit is in microseconds, just like useconds. Something like:
long utime = seconds * 1000000 + useconds;
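You've also got your delay calculation backwards. Ignoring the effects of the truncation (that is, with utime still in milliseconds), it should be:
double delay = utime*1000 - *i;
std::cout << "delay: " << delay/1000.0 << std::endl;
Once utime holds microseconds, as suggested above, the first line becomes double delay = utime - *i; instead.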
The way you've got it, all the positive delays you're outputting are actually the result of the truncation, and the negative ones represent actual delays.